Friday, July 8, 2016

Bitcrack / Hashkiller contest write-up 2016

HashcatV3 & HashcatV2, MDXfind, Hashtopussy (fork of the hashtopus project), TeamLogic (hash management platform), Unified List Manager


                               Sustained    Peak
GPU (GH/s, SHA1)               150          190
CPU (cores) [bcrypt only]      130          300
CPU (cores) [other]            100          250

A constant combined compute power of 150 GH/s (measured on SHA1 brute force) was used throughout the contest. This figure peaked at about 190 GH/s, which is roughly the equivalent of 35 GTX 980 Ti cards. Around 130 CPU cores were reserved solely for GPU-unfriendly algorithms, bursting to a maximum of 300 cores for a short period. An additional 100 CPU cores were used for all other algorithms, peaking at 250 cores.
  • Free-for-all approach
  • Have fun
  • Utilize resources efficiently
  • Surprise the other teams

Before the contest
We redeveloped our hash management system and ensured it was fully functional prior to the contest. In addition, we had the pleasure of beta testing a personal project of one of our members: an improved distributed hashcat system dubbed Hashtopussy (a fork of the hashtopus project), with numerous improvements including a revamped interface, multi-user and user-rights-management support, optimized hash handling and, of course, support for Hashcat3. Keep an eye out for this project, as it will be released soon.

Hashtopussy instances were deployed, allowing the team to remotely manage and deploy tasks across clusters of compute nodes, voluntarily donate compute cycles and streamline the cracking process. As hashcat is now open source (big thanks to the hashcat developers), we were able to easily apply minor changes to ensure it played nicely in a distributed environment.

During the contest
We started off by probing all algorithms, looking for any signs of patterns, and tackled the bcrypts immediately by running extremely simple checks against common passwords. We recovered about 20 bcrypts within the first hour on our CPU cluster and were able to feed it enough test candidates to yield hits consistently.

MDXfind was used to quickly test algorithms which hashcat couldn’t initially handle, namely DCC, with Waffle quickly adding WBB support. Once we knew these hashes were valid, support for both algorithms was swiftly added to hashcat.

As there is already a write-up regarding the patterns for the generated hashes, we won’t go into them, other than to say that we spotted some, missed others and discovered some too late in the contest. Eleven hours into the contest we had hits for every algorithm except phpbb3_gen, which we didn’t waste too much time pursuing. This was a pretty good starting point and kept us busy for the remainder of the time.

To make it up to those who have complained that our large submission towards the end of the contest skewed any pretty graphs, we have decided to provide analytics gathered by our hash management system. The graphs should reflect the actual crack progression for each individual hashlist throughout the contest, and should provide some insight into how we tackled each hashlist.

Graphs for real hashlists

Graphs for generated hashlists

Interesting observations
As a portion of the hashes came from real environments, there is always a chance that hashes are mislabeled. We identified some DoubleMD5 hashes labelled as MD5; these were tackled by cracking the MD5 list as DoubleMD5 and then performing a single MD5 on the recovered password prior to submission. We also identified vBulletin <3.8.5 hashes which were mislabeled as MD5:pass, with the salt being the plain for this MD5; there was no possible way to submit these since they were technically solved.
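The DoubleMD5 workaround can be sketched as follows. This is a minimal illustration, assuming DoubleMD5 means md5(md5(password)) with the inner digest taken as lowercase hex; the function names are ours and are not part of any tool.

```python
import hashlib


def md5_hex(data: str) -> str:
    """Return the lowercase hex MD5 digest of a UTF-8 string."""
    return hashlib.md5(data.encode("utf-8")).hexdigest()


def double_md5_hex(password: str) -> str:
    """Assumed DoubleMD5 construction: MD5 of the hex digest of MD5(password)."""
    return md5_hex(md5_hex(password))


def plain_for_md5_list(password: str) -> str:
    """Plain to submit when a DoubleMD5 hash was labelled as plain MD5.

    The listed hash is md5(X) with X = md5_hex(password), so X itself is
    the "plain" that satisfies the MD5 label.
    """
    return md5_hex(password)


# The hash as it would appear in the list labelled "MD5".
listed_hash = double_md5_hex("password")
# The value submitted for it: a 32-character hex string, not the password.
submission = plain_for_md5_list("password")
assert md5_hex(submission) == listed_hash
```

The submitted plain is therefore the 32-character hex digest of the real password, which is why the extra MD5 pass is needed before submission.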

Once again, since these were real-world hashes, some may have been corrupted during extraction or transport. A feature of hashcat is that it does not match every bit of the hash, allowing it to essentially detect a mistyped hash. We encountered a small portion of these, which we assumed were most likely corrupted. As there wasn’t a large number of them, we simply ignored them.
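How a corrupted hash can still produce a hit can be illustrated with a partial comparison. This is only a sketch of the idea, not hashcat's actual code: the choice of SHA-1, the 128-bit prefix and the function names are all assumptions made for illustration.

```python
import hashlib

# Assumption for illustration: only the first 128 bits (32 hex chars)
# of the 160-bit SHA-1 digest take part in the comparison.
COMPARED_HEX_CHARS = 32


def sha1_hex(candidate: str) -> str:
    """Return the lowercase hex SHA-1 digest of a UTF-8 string."""
    return hashlib.sha1(candidate.encode("utf-8")).hexdigest()


def matches_partially(candidate: str, listed_hash: str) -> bool:
    """True if the compared prefix of the digest equals the listed hash's prefix."""
    digest = sha1_hex(candidate)
    return digest[:COMPARED_HEX_CHARS] == listed_hash[:COMPARED_HEX_CHARS].lower()


def matches_exactly(candidate: str, listed_hash: str) -> bool:
    """True only if every bit of the digest equals the listed hash."""
    return sha1_hex(candidate) == listed_hash.lower()


good = sha1_hex("password")
# Simulate a transcription error in the last hex character.
corrupted = good[:-1] + ("0" if good[-1] != "0" else "1")

assert matches_partially("password", corrupted)    # still reported as a hit
assert not matches_exactly("password", corrupted)  # full check exposes the damage
```

A hit that passes the partial comparison but fails a full re-hash is a strong hint that the listed hash was mistyped or corrupted.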
While GPUs are extremely powerful for parallel hash cracking, it was surprising to see that the top scorer in our team predominantly used CPUs.

Final remarks

A huge thanks to Bitcrack and Hashkiller for organizing an almost flawless contest. We had plenty of fun and very little sleep. We can only imagine the amount of time and effort put into arranging this contest to ensure it ran so smoothly. Congratulations to Team Hashcat on their second place; we’re glad we were finally able to beat our rivals. Congratulations to the FCHC, I’m in your Wifi, LeakedSource and all other teams who participated.

Thursday, July 7, 2016

Myspace hashes, length 10 and beyond

                     Total          Recovered      Percent
Usable data          359,005,905    355,886,686    99.13%
Salted hashes         68,494,253
Salted pairs          66,099,059     47,120,453    71.29%
non-user pass         14,412,299          5,831     0.04%
meaningful passes     51,686,760     47,114,622    91.15%

When we obtained the Myspace data, we didn’t think too much of it for several reasons. In addition to being a fairly old data-set, the passwords were also truncated to length ten and converted to lowercase prior to being hashed with the SHA-1 algorithm. This means that some of the passwords recovered would be ambiguous and incomplete. This is no longer the case for roughly 68M of the hashes.

The total data-set of 360,213,049 lines contained 359,005,905 usable hashes. This data was de-duplicated to 116,822,086 SHA-1 hashes. Roughly 97% of these hashes were recovered by our group, totaling 113M hashes. As the passwords were all pre-processed before hashing, the plain-texts which we recovered did not exceed length ten and were all lower-cased.

Since the plain-text passwords aren’t in their original form, they are less interesting, as they do not allow us to gather much useful information. Although truncated, they do give us a glimpse of some longer passwords we may not previously have been able to recover.

Interestingly, user ‘frekvent’ over at the forum made an amazing discovery. It appears that for some users there exists an additional salted SHA-1 hash that contains the password in its original form, without being truncated or lower-cased. This hash is generated by salting the password with the userid prior to hashing it with SHA-1.

Rather than directly recovering the salted SHA-1 hashes, we can take a shortcut. For all users who have this secondary salted SHA-1 hash, we can case-correct the plain-text we previously recovered against it. It also means we can derive the actual password for these users prior to length ten truncation.

A generated example

UserID: 65535
Password: Cynosureprime082!

First hash: 6fba0c905ded07590fdbc4b0fa6eb17e565dd814
(password is truncated to length 10 and lower cased)

Second hash: 20c25cbb791bc0b7fcce739f42b682376057eb9e
(userid is applied as a salt to the unmodified password)

Stored as:

Step 1: Recover 6fba0c905ded07590fdbc4b0fa6eb17e565dd814 as cynosurepr
Step 2: Perform case toggling and length extension (cynosureprA, cYnosureprBB, cyNosureprZZ, etc.) and test each candidate against 20c25cbb791bc0b7fcce739f42b682376057eb9e:65535
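How the two hashes in this example relate to each other can be sketched as below. This is a hedged illustration: the function names are ours, UTF-8 encoding is assumed, and because the text does not state whether the userid is placed before or after the password, the order is left as a parameter.

```python
import hashlib

TRUNCATE_LEN = 10  # passwords were cut to ten characters before hashing


def preprocess(password: str) -> str:
    """Truncate to ten characters and lower-case, as described for the first hash."""
    return password[:TRUNCATE_LEN].lower()


def truncated_hash(password: str) -> str:
    """First hash: SHA-1 of the truncated, lower-cased password."""
    return hashlib.sha1(preprocess(password).encode("utf-8")).hexdigest()


def salted_hash(password: str, userid: str, salt_first: bool = False) -> str:
    """Second hash: SHA-1 of the unmodified password salted with the userid.

    Assumption: the salt is simply concatenated; salt_first selects
    userid+password instead of password+userid.
    """
    data = userid + password if salt_first else password + userid
    return hashlib.sha1(data.encode("utf-8")).hexdigest()


userid = "65535"
password = "Cynosureprime082!"

print(preprocess(password))           # the plain recovered in Step 1
print(truncated_hash(password))       # 40 hex characters
print(salted_hash(password, userid))  # 40 hex characters, depends on salt order
```

Only the second hash depends on the characters beyond position ten and on letter case, which is what makes it usable for restoring the original password.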

Out of the entire data-set, about 68M users have the secondary salted SHA-1 password hash. Of these 68M users, we were able to pair 66M with a recovered password. This 66M list was then divided into two groups: ‘non-user pass’, users with system-generated passwords (14M), and ‘meaningful passes’, passwords chosen by the users themselves (51.6M). We were only able to pair 66M of the total 68M hashes because we have recovered only 97% of the SHA1 hashes, not all of them.
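The pairing step amounts to a lookup of each user's unsalted hash in the set of recovered plain-texts. The sketch below assumes a hypothetical record layout (userid, unsalted SHA-1, optional salted SHA-1); the real dump format and our actual tooling differ.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class UserRecord(NamedTuple):
    """Hypothetical layout of one parsed line of the data-set."""
    userid: str
    sha1: str                    # hash of the truncated, lower-cased password
    salted_sha1: Optional[str]   # secondary hash, present only for some users


class Pair(NamedTuple):
    """A salted hash together with the plain recovered from the unsalted hash."""
    userid: str
    salted_sha1: str
    base_plain: str


def pair_salted(records: Iterable[UserRecord],
                recovered: Dict[str, str]) -> List[Pair]:
    """Pair every salted hash with its recovered length-ten plain-text."""
    pairs = []
    for rec in records:
        if rec.salted_sha1 is None:
            continue  # user has no secondary hash
        plain = recovered.get(rec.sha1)
        if plain is None:
            continue  # unsalted hash is among those not yet recovered
        pairs.append(Pair(rec.userid, rec.salted_sha1, plain))
    return pairs
```

Users whose unsalted hash has not been recovered drop out at the second check, which is why fewer pairs than salted hashes are available.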

Using our tools, we performed a case toggle and/or length extension attack for each of the salted hash pairs. We have successfully verified over 45M plain-texts against their salted SHA-1 counterparts. The case toggle attack tries every upper/lower-case variant of each recovered plain-text of length ten or less against the salted SHA-1. The length extension attack involves cycling through all possible characters, appending them to the plain-text derived from the recovered normal SHA1 and checking the result against the salted SHA-1 hash.
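The two attacks can be combined in a few lines. This is a simplified sketch rather than our actual tooling: it assumes ASCII passwords, UTF-8 encoding, a printable-ASCII extension charset and a selectable salt order, and all names are hypothetical. Exhaustive extension is only practical for a few appended characters; longer tails need wordlists or masks.

```python
import hashlib
import itertools
import string
from typing import Iterator, Optional

TRUNCATE_LEN = 10
CHARSET = string.ascii_letters + string.digits + string.punctuation


def case_toggles(base: str) -> Iterator[str]:
    """Yield every upper/lower-case variant of base (2**letters candidates)."""
    options = [(c.lower(), c.upper()) if c.lower() != c.upper() else (c,)
               for c in base]
    for combo in itertools.product(*options):
        yield "".join(combo)


def recover_original(base: str, userid: str, target: str,
                     max_extra: int = 2, charset: str = CHARSET,
                     salt_first: bool = False) -> Optional[str]:
    """Find the original password behind a salted SHA-1.

    base is the lower-cased, truncated plain; target is the salted hash.
    Only a base of full length ten can have been truncated, so shorter
    bases get case toggling only.
    """
    extra = range(max_extra + 1) if len(base) == TRUNCATE_LEN else range(1)
    for toggled in case_toggles(base):
        for n in extra:
            for suffix in itertools.product(charset, repeat=n):
                candidate = toggled + "".join(suffix)
                data = userid + candidate if salt_first else candidate + userid
                if hashlib.sha1(data.encode("utf-8")).hexdigest() == target:
                    return candidate
    return None


# Small demonstration with a one-character extension.
target = hashlib.sha1(("Password12!" + "65535").encode("utf-8")).hexdigest()
recovered = recover_original("password12", "65535", target, max_extra=1)
assert recovered == "Password12!"
```

With ten letters there are at most 1,024 case variants per plain, so the toggle stage is cheap; the cost is dominated by the number of appended characters.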

Having both variations of the password hashes has made cracking the longer passwords quite easy since we can first recover the length 10 representation and use this in length extension attacks to obtain the full length password. It would appear that the Myspace data may have some usefulness after all.

Note: The salted hashes can be paired up with their corresponding plaintext data and arranged such that they can be recovered using off-the-shelf software. However, this won't work for case correction; you will also need to reparse the final output.