
LinkedIn Revisited - Full 2012 Hash Dump Analysis

May 18, 2016
Rick Redman
#forensics #passwords

As you may know, a “full” dump of email addresses and password hashes from the 2012 attack on LinkedIn.com has become available. Here at KoreLogic, we got our hands on the list of emails and the separate list of password hashes (but nothing linking the two together, which we don’t want or need). We started to gather some statistics on them using our Password Recovery Service (PRS). The following analysis assumes the lists are real; because the email addresses are valid and we were able to confirm some of our own accounts’ data from back then, we believe that the dump is real.

What we know so far:

It contains 164,590,819 unique email addresses.

It contains 177,500,189 unsalted SHA1 password hashes. Note that this is larger than the number of email addresses.

It contains 61,829,207 unique hashes. This means there are duplicates, and this is good for password researchers because it allows us to come up with statistics of how often certain passwords are used.
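A minimal sketch of how such frequency statistics can be gathered, assuming the hashes are available one per line as hex strings. The function name `top_hashes` and the tiny sample input are illustrative only and are not part of our tooling.

```python
from collections import Counter

def top_hashes(lines, n=20):
    """Return the n most common hashes as (hash, count) pairs."""
    # Normalize case and skip blank lines before counting.
    counts = Counter(line.strip().lower() for line in lines if line.strip())
    return counts.most_common(n)

# Made-up sample; a real run would iterate over the hash file instead.
sample = [
    "7c4a8d09ca3762af61e59520943dc26494f8941b",
    "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8",
    "7c4a8d09ca3762af61e59520943dc26494f8941b",
]

for digest, count in top_hashes(sample):
    print(f"{count:8d} {digest}")
```

Because the hashes are unsalted, identical passwords always produce identical hashes, which is what makes this kind of counting possible.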

As of Thursday May 19 14:09 EDT 2016, we’ve cracked 65% of the lists, after about two hours of work on our private distributed cracking grid. Approximately 41,500,000 plaintext passwords have been recovered so far. There are literally thousands of new cracks coming in every minute, so the numbers are a bit rough.

The most common password hashes are:

Number | Hash
1135936 7c4a8d09ca3762af61e59520943dc26494f8941b
 207488 7728240c80b6bfd450849405e8500d6d207783b6
 188380 5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8
 149916 f7c3bc1d808e04732adf679965ccc34ca7ae3441
 95854 7c222fb2927d828af22f592134e8932480637c0d
 85515 3d4f2bf07dc1be38b20cd6e46949a1071f9d0e3d
 75780 20eabe5d64b0e216796e834f52d61fd0b70332fc
 51969 dd5fef9c1c1da1394d6d34b248c51be2ad740840
 51870 b1b3773a05c0ed0176787a4f1574ff0075f7521e
 51535 8d6e34f987851aa599257d3831a1af040886842f
 49235 c984aed014aec7623a54f0591da07a85fd4b762d
 41449 6367c48dd193d56ea7b0baad25b19455e529f5ee
 35919 d8cd10b920dcbdb5163ca0185e402357bc27c265
 34440 1411678a0b9e25ee2f7c8b2f7ac92b6a74b3f9c5
 32879 601f1889667efaebb33b8c12572835da3f027f78
 32289 ff539c96a2ed9f72a47a5e1c7d59e143ba1fba94
 30972 019db0bfd5f85951cb46e4452e9642858c004155
 30923 01b307acba4f54f55aafc33bb06bbbf6ca803e9a
 28928 775bb961b81da1ca49217a48e533c832c337154a
 28705 17b9e1c64588c7fa6419b4d29dc1f4426279ba01

These values crack to:

Number | Hash | Plaintext
1135936 7c4a8d09ca3762af61e59520943dc26494f8941b 123456
 207488 7728240c80b6bfd450849405e8500d6d207783b6 linkedin
 188380 5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8 password
 149916 f7c3bc1d808e04732adf679965ccc34ca7ae3441 123456789
 95854 7c222fb2927d828af22f592134e8932480637c0d 12345678
 85515 3d4f2bf07dc1be38b20cd6e46949a1071f9d0e3d 111111
 75780 20eabe5d64b0e216796e834f52d61fd0b70332fc 1234567
 51969 dd5fef9c1c1da1394d6d34b248c51be2ad740840 654321
 51870 b1b3773a05c0ed0176787a4f1574ff0075f7521e qwerty
 51535 8d6e34f987851aa599257d3831a1af040886842f sunshine
 49235 c984aed014aec7623a54f0591da07a85fd4b762d 000000
 41449 6367c48dd193d56ea7b0baad25b19455e529f5ee abc123
 35919 d8cd10b920dcbdb5163ca0185e402357bc27c265 charlie
 34440 1411678a0b9e25ee2f7c8b2f7ac92b6a74b3f9c5 666666
 32879 601f1889667efaebb33b8c12572835da3f027f78 123123
 32289 ff539c96a2ed9f72a47a5e1c7d59e143ba1fba94 linked
 30972 019db0bfd5f85951cb46e4452e9642858c004155 maggie
 30923 01b307acba4f54f55aafc33bb06bbbf6ca803e9a 1234567890
 28928 775bb961b81da1ca49217a48e533c832c337154a princess
 28705 17b9e1c64588c7fa6419b4d29dc1f4426279ba01 michael
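Since the hashes are unsalted SHA1, any entry in the table above can be checked with a standard library hash call. The following is a minimal sketch of a dictionary check, not a description of how PRS works; the helper names `sha1_hex` and `crack` are hypothetical, and UTF-8 encoding of the candidate words is an assumption.

```python
import hashlib

def sha1_hex(password):
    """Unsalted SHA1 of a password, as lowercase hex (UTF-8 assumed)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()

def crack(target_hashes, candidates):
    """Return {hash: plaintext} for every candidate whose hash is a target."""
    targets = {h.lower() for h in target_hashes}
    found = {}
    for word in candidates:
        digest = sha1_hex(word)
        if digest in targets:
            found[digest] = word
    return found

targets = [
    "7c4a8d09ca3762af61e59520943dc26494f8941b",
    "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8",
]
for digest, word in crack(targets, ["letmein", "123456", "password"]).items():
    print(digest, word)
```

A real cracking run uses far larger wordlists plus rules and masks, but the core comparison is the same.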

The most common patterns used in the passwords are as follows: (Updated May 20 11:00 EDT 2016)

?d = Digit [0-9]
?s = “Special Character” +_)*(&^%$#@!~`-=[]{}|;':",./<>? …etc.
?l = Lower case letter [a-z]
?u = Upper case letter [A-Z]

Number | Pattern
2464707 ?l?l?l?l?l?l?l?l Example: linkedin
1776416 ?l?l?l?l?l?l?d?d Example: linked12
1663330 ?l?l?l?l?l?l?l?l?l Example: alinkedin
1587423 ?l?l?l?l?d?d?d?d Example: link2012
1528434 ?l?l?l?l?l?l?l Example: linkedi
1525784 ?l?l?l?l?l?l Example: linked
1348195 ?d?d?d?d?d?d?d?d
1172612 ?l?l?l?l?l?l?l?l?l?l
1074096 ?l?l?l?l?l?d?d?d?d
1042003 ?d?d?d?d?d?d?d?d?d?d
 984939 ?l?l?l?l?l?l?d?d?d?d
 936771 ?l?l?l?l?l?l?l?d?d
 819341 ?l?l?l?d?d?d?d
 781166 ?d?d?d?d?d?d?d
 723656 ?l?l?l?l?l?d?d
 713165 ?l?l?l?l?l?l?l?l?l?l?l
 692280 ?l?l?l?l?l?d?d?d
 690521 ?d?d?d?d?d?d
 670878 ?l?l?l?l?l?l?l?l?d?d
 653118 ?l?l?l?l?l?l?l?d
 539001 ?l?l?l?l?l?l?d?d?d
 494526 ?l?l?l?l?d?d
 491474 ?l?l?d?d?d?d
 462250 ?l?l?l?l?l?l?l?l?l?l?l?l
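The pattern for a password is obtained by replacing each character with its class token. Here is a minimal sketch of that mapping, using the legend above; the names `to_mask` and `top_masks` are illustrative, and treating every character outside the three ASCII classes as `?s` is an assumption.

```python
import string
from collections import Counter

def to_mask(password):
    """Map each character to ?d, ?l, ?u, or ?s."""
    parts = []
    for ch in password:
        if ch in string.digits:
            parts.append("?d")
        elif ch in string.ascii_lowercase:
            parts.append("?l")
        elif ch in string.ascii_uppercase:
            parts.append("?u")
        else:
            # Assumption: anything else (spaces, non-ASCII, punctuation) is "special".
            parts.append("?s")
    return "".join(parts)

def top_masks(passwords, n=25):
    """Return the n most common masks as (mask, count) pairs."""
    return Counter(to_mask(p) for p in passwords).most_common(n)

print(to_mask("linkedin"))
print(to_mask("link2012"))
```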

The most common “base words” used in the passwords are shown below. These are calculated by taking all the recovered passwords, removing all special characters and digits, and then sorting the results. This was the initial technique used by KoreLogic in 2012 to determine that the set of ~6.5 million hashes found on a Russian message board was in fact from LinkedIn.com (which now appears to have been only a subset of this larger leak).

Number | Base word
 29883 linkedin Examples: linkedin1 linkedin2012 linkedin!
 26194 link Examples: link2012 2012link !!link!!
 21731 love
 19721 ever
 15574 linked
 14156 life
 11674 alex
 10773 mike
 10566 pass
 9540 john
 9176 blue
 8937 june
 8338 jack
 8006 july
 7305 home
 7205 star
 7094 password
 7005 angel
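The base-word calculation described above can be sketched roughly as follows. This is an illustration under assumptions, not our production code: the names `base_word` and `top_base_words` are hypothetical, and "letters" here means whatever Python's `str.isalpha` accepts.

```python
from collections import Counter

def base_word(password):
    """Strip digits and special characters, keeping only letters."""
    return "".join(ch for ch in password if ch.isalpha())

def top_base_words(passwords, n=20):
    """Count base words, ignoring passwords that contain no letters at all."""
    words = (base_word(p) for p in passwords)
    return Counter(w for w in words if w).most_common(n)

for pw in ["linkedin2012", "2012link", "!!link!!"]:
    print(pw, "->", base_word(pw))
```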

Update: May 19 15:53 EDT 2016

Here is a list of the most common domains used by the accounts in the dump. No real surprises here.

Number | Domain Name
32865035 gmail.com
24018467 hotmail.com
20361246 yahoo.com
 4268015 aol.com
 1977483 comcast.net
 1427168 yahoo.co.in
 1333354 msn.com
 1039135 sbcglobal.net
 1036522 rediffmail.com
 992936 yahoo.fr
 913406 yahoo.co.uk
 843158 live.com
 839735 yahoo.com.br
 748001 hotmail.co.uk
 740473 verizon.net
 574117 hotmail.fr
 549022 yahoo.com
 528635 ymail.com
 528040 cox.net
 509047 bellsouth.net
 503271 libero.it
 478587 att.net
 428930 yahoo.es
 406492 btinternet.com

Update: May 19 17:00 EDT 2016

42,691,862 unique passwords recovered so far; 69% of the unique hashes have been cracked at this point.

Of the total 177,500,189 non-unique hashes leaked, 143,914,964 have been cracked and 33,585,225 remain. The cracked hashes represent 81.07% of all LinkedIn.com users in the dump.
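The share of users covered is higher than the share of unique hashes cracked because the most common passwords are both the easiest to crack and the most heavily duplicated. A small sketch of the two ratios, using a made-up toy distribution (the function name `coverage` and the toy data are illustrative), followed by the figures quoted in this post:

```python
def coverage(hash_counts, cracked):
    """Return (fraction of unique hashes cracked, fraction of all hashes cracked)."""
    total = sum(hash_counts.values())
    covered = sum(count for h, count in hash_counts.items() if h in cracked)
    unique_cracked = sum(1 for h in hash_counts if h in cracked)
    return unique_cracked / len(hash_counts), covered / total

# Toy data: one very common hash and two rarer ones; only the common one is cracked.
toy_counts = {"a": 6, "b": 3, "c": 1}
unique_fraction, account_fraction = coverage(toy_counts, {"a"})
print(f"unique: {unique_fraction:.2%}, accounts: {account_fraction:.2%}")

# The same two ratios computed from the numbers reported above.
post_unique_fraction = 42_691_862 / 61_829_207
post_account_fraction = 143_914_964 / 177_500_189
print(f"unique: {post_unique_fraction:.2%}, accounts: {post_account_fraction:.2%}")
```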

Update: May 20 10:00 EDT 2016

~48,520,000 unique passwords recovered so far; ~78% of the unique hashes have been cracked at this point. And we have recovered the passwords for ~86% of all LinkedIn.com users in the dump.

~13,360,000 unique hashes left to crack …

Update: May 20 11:00 EDT 2016

Here is a list of the most common local parts of the email addresses (the portion before the @, without the domain). No real surprises here.

Number | Local part
555249 info@
 64325 john@
 60845 david@
 55525 mike@
 52685 chris@
 52251 mail@
 50654 sales@
 50444 mark@
 48006 steve@
 45872 paul@
 39051 contact@
 37424 linkedin@
 36511 peter@
 35818 michael@
 35770 admin@
 30473 dave@
 30034 tom@
 29102 jim@
 26872 jeff@
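Both the domain tally and the local-part tally come from splitting each address at its last `@`. A minimal sketch, assuming one address per input item; the function name `tally_emails` and the sample addresses are made up, and lowercasing before counting is an assumption.

```python
from collections import Counter

def tally_emails(addresses):
    """Return (domain counts, local-part counts) for a list of addresses."""
    domains, local_parts = Counter(), Counter()
    for addr in addresses:
        local, sep, domain = addr.strip().lower().rpartition("@")
        if not sep or not local or not domain:
            continue  # skip anything that does not look like local@domain
        domains[domain] += 1
        local_parts[local + "@"] += 1
    return domains, local_parts

sample_addresses = [
    "info@example.com",
    "Info@example.org",
    "john@example.com",
    "not-an-address",
]
domains, local_parts = tally_emails(sample_addresses)
print(domains.most_common(2))
print(local_parts.most_common(2))
```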

Update: May 20 18:00 EDT 2016

Our grid was busy doing client work for about 24 hours, so not many new cracks today. But here are some updated stats and analysis.

~49,290,000 unique passwords recovered so far.

~12,520,000 unique hashes left to crack.

5,184,351 of the recovered passwords are 8+ characters and contain at least one uppercase letter, one lowercase letter, and one digit.

825,975 of the recovered passwords are 8+ characters and contain at least one uppercase letter, one lowercase letter, one digit, and one special character.
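The two complexity tests above can be expressed as a single predicate. This is a sketch under assumptions: the name `meets_policy` is hypothetical, and "special" is taken to mean any character that is not a letter or digit.

```python
def meets_policy(password, require_special=False):
    """True if the password is 8+ characters with upper, lower, and digit
    (and, optionally, at least one special character)."""
    if len(password) < 8:
        return False
    has_upper = any(ch.isupper() for ch in password)
    has_lower = any(ch.islower() for ch in password)
    has_digit = any(ch.isdigit() for ch in password)
    # Assumption: anything that is not alphanumeric counts as special.
    has_special = any(not ch.isalnum() for ch in password)
    if not (has_upper and has_lower and has_digit):
        return False
    return has_special or not require_special

print(meets_policy("Linkedin1"))
print(meets_policy("Linkedin1", require_special=True))
print(meets_policy("Linked!12", require_special=True))
```

Passing such a check says nothing about how guessable a password is, which is the point of the pattern list that follows.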

The pattern distribution of these passwords closely resembles the findings of our PathWell research: they are heavily biased towards a few universally common topologies.

Number | Pattern
 29742 ?u?l?l?l?l?l?s?d?d
 26640 ?u?l?l?l?l?l?d?d?s
 26287 ?u?l?l?l?l?s?d?d
 23830 ?u?l?l?l?l?l?s?d
 20296 ?u?l?l?l?l?l?d?s
 18365 ?u?l?l?l?l?d?d?s
 17390 ?u?l?l?l?s?d?d?d?d
 17085 ?u?l?l?l?l?l?l?d?s
 16723 ?u?l?l?l?l?l?l?s?d
 14989 ?u?l?l?l?l?l?l?s?d?d
 13565 ?u?l?l?l?l?s?d?d?d?d
 12986 ?u?l?l?l?l?l?l?d?d?s
 12590 ?u?l?l?s?d?d?d?d
 12305 ?u?l?l?l?l?s?d?d?d
 11280 ?u?l?l?l?l?l?l?l?d?s
 10991 ?u?l?l?l?d?d?d?d?s
 10822 ?u?l?l?l?l?l?s?d?d?d?d
 10796 ?u?l?l?l?s?d?d?d

The PACK output of the unique cracks so far (numbers rounded slightly):

[*] Length:
[+] 8: 29% (14,620,000)
[+] 9: 17% (8,430,000)
[+] 10: 14% (6,950,000)
[+] 7: 13% (6,660,000)
[+] 6: 10% (5,410,000)
[+] 11: 06% (3,270,000)
[+] 12: 03% (1,930,000)
[+] 13: 01% (921,000)
[+] 14: 01% (508,000)
[+] 15: 00% (263,000)
[+] 16: 00% (159,000)

[*] Character-set:
[+] loweralphanum: 48% (24,128,000)
[+] loweralpha: 20% (10,303,000)
[+] mixedalphanum: 10% (5,026,000)
[+] numeric: 08% (4,428,000)
[+] loweralphaspecialnum: 02% (1,377,000)
[+] upperalphanum: 01% (957,000)
[+] all: 01% (936,000)
[+] mixedalpha: 01% (852,000)
[+] loweralphaspecial: 01% (507,000)
[+] upperalpha: 00% (431,000)
[+] mixedalphaspecial: 00% (147,000)
[+] specialnum: 00% (84,000)
[+] upperalphaspecialnum: 00% (62,000)
[+] upperalphaspecial: 00% (19,000)
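For readers unfamiliar with the character-set labels, the following sketch shows one way such labels and the length histogram could be derived. It is not PACK's code and PACK's actual classification may differ; the names `charset_name` and `length_histogram` are hypothetical, and the label rules are inferred from the output above.

```python
import string
from collections import Counter

def charset_name(password):
    """Build a PACK-style label from the character classes present."""
    lower = any(ch in string.ascii_lowercase for ch in password)
    upper = any(ch in string.ascii_uppercase for ch in password)
    digit = any(ch in string.digits for ch in password)
    # Assumption: everything outside the three ASCII classes is "special".
    special = any(ch not in string.ascii_letters + string.digits for ch in password)

    if lower and upper:
        alpha = "mixedalpha"
    elif lower:
        alpha = "loweralpha"
    elif upper:
        alpha = "upperalpha"
    else:
        alpha = ""

    name = alpha + ("special" if special else "") + ("num" if digit else "")
    if name == "mixedalphaspecialnum":
        return "all"
    if name == "num":
        return "numeric"
    return name

def length_histogram(passwords):
    """Count passwords by length."""
    return Counter(len(p) for p in passwords)

for pw in ["linkedin", "link2012", "Linked!12", "123456"]:
    print(pw, charset_name(pw))
```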

Update: May 21 18:00 EDT 2016

~49,999,999 unique passwords recovered so far.

~11,863,000 unique hashes left to crack.

Update: May 25 15:40 EDT 2016

Our grid is mostly doing other things now. We have gotten a couple of requests about re-sharing the list, and/or about building some kind of online interface to look up individual credentials. We have no plans to do so.

For more of KoreLogic’s talks about password recovery, check out the following videos featuring Rick Redman, KoreLogic employee and founder of PRS:

Your Password Complexity Requirements are Worthless - OWASP AppSecUSA 2014
Cracking Corporate Passwords: Why Your Password Policy Sucks
