LinkedIn Data Breach: 117 Million Credentials in 2012, 700 Million Profiles Scraped in 2021, and a Decade of Disclosure Framing
Breach Summary
The LinkedIn breach pattern spans more than a decade and illustrates two distinct categories of risk: the 2012 password breach (117 million credentials exposed via SQL injection and weak hashing) and the 2021 scraping incident (700 million profiles aggregated and sold). The 2012 breach exposed passwords protected only by unsalted SHA-1 hashes, a weakness widely understood at the time. The 2021 incident showed how much sensitive data can be extracted from a service through legitimate API access without any technical compromise of the platform itself.
What Happened
In June 2012, attackers exfiltrated approximately 6.5 million LinkedIn password hashes and posted them to a Russian password-cracking forum. The actual scope was significantly larger: in May 2016, a hacker using the alias "Peace" offered 117 million LinkedIn email and password combinations from the same 2012 breach for sale on dark web markets. LinkedIn confirmed the expanded scope and forced password resets for all affected accounts.
In June 2021, a separate incident emerged when a user named "TomLiner" posted 700 million LinkedIn user profiles for sale on a hacker forum. The data (names, email addresses, phone numbers, professional information, location, gender, and other profile fields) was extracted by scraping LinkedIn's APIs rather than by breaching LinkedIn's systems. LinkedIn maintained that no breach had occurred because the data was publicly viewable; security researchers and regulators viewed that position skeptically.
Attack Vector Detail
The 2012 breach exploited SQL injection vulnerabilities in LinkedIn's website to access the user database. The exfiltrated password hashes used unsalted SHA-1, which was already known to be cryptographically weak; researchers cracked approximately 90% of the hashes within days. Without a salt, identical passwords produce identical hashes, so a single precomputed table or one cracking pass recovers every account that shares a password. Salting is a basic cryptographic practice, and LinkedIn's storage of password hashes without it was the foundational failure that turned credential exposure into mass credential compromise.
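To make the difference concrete, here is a minimal sketch contrasting an unsalted SHA-1 digest with a salted, deliberately slow hash. It uses scrypt from Python's standard library as one possible choice; the cost parameters and function names are illustrative assumptions, not a description of any system LinkedIn ran or runs.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Illustrative scrypt cost parameters (assumed for this sketch, tune per system).
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1


def unsalted_sha1(password: str) -> str:
    """The weak scheme: fast to compute, and equal passwords give equal hashes."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) using a fresh random salt for every password."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return hmac.compare_digest(digest, expected)
```

With the unsalted function, two users who pick the same password store the same 40-character value, so cracking one cracks both. With the salted function, the same password yields a different digest for each account, and each guess costs the attacker a full scrypt computation.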
Russian hacker Yevgeniy Nikulin was indicted in 2016, extradited from the Czech Republic in 2018, and convicted in 2020 for the LinkedIn breach as well as breaches of Dropbox and Formspring. He was sentenced to 88 months in federal prison.
The 2021 scraping incident exploited the absence of effective rate limiting and bulk-extraction detection on LinkedIn's APIs. The attacker used legitimate API access — the same access mechanism used by recruiters, sales tools, and integration partners — to extract profile data at scale. The incident is functionally a breach in the sense that data was aggregated and sold without user consent, even though it didn't require exploiting a vulnerability.
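As a rough illustration of the rate limiting this paragraph refers to, the following sketch implements a per-client sliding-window limiter. The class name, the limits, and the idea of keying on a `client_id` string are assumptions made for the example; it does not describe LinkedIn's actual controls.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False  # over the limit: reject without recording the attempt
        hits.append(now)
        return True
```

A caller would check `limiter.allow(api_key)` before serving each profile request and return an HTTP 429 response when it is false. In a multi-server deployment the counters would need to live in shared storage rather than process memory.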
Breach Pattern Timeline
June 2012
Initial LinkedIn breach disclosed. Approximately 6.5 million password hashes posted to Russian password-cracking forum. LinkedIn forces password resets for affected accounts. Investigation underway.
2012-2016
Stolen credentials circulate on underground forums and are used for credential-stuffing campaigns against other services. The unsalted SHA-1 hashes are progressively cracked at scale.
May 2016
Hacker "Peace" offers 117 million LinkedIn credentials from the 2012 breach for sale. True scope of 2012 breach revealed. LinkedIn forces additional password resets across the expanded affected user base.
October 2016
Russian national Yevgeniy Nikulin arrested in Prague at U.S. request. Charges include the LinkedIn, Dropbox, and Formspring breaches.
March 2018
Nikulin extradited to the United States to face federal charges in Northern District of California.
July 2020
Nikulin convicted of computer hacking, wire fraud, identity theft, and conspiracy charges related to the LinkedIn, Dropbox, and Formspring breaches.
September 2020
Nikulin sentenced to 88 months in federal prison.
April 2021
500 million LinkedIn user profiles posted on a hacker forum. LinkedIn states the data was scraped from public profiles, not breached. Privacy regulators in multiple jurisdictions open inquiries.
June 2021
700 million LinkedIn user profiles posted for sale by user "TomLiner." LinkedIn maintains the data was scraped, not breached. The expanded scope includes phone numbers and inferred personal data not on public profiles.
2022-2024
Privacy regulator investigations continue in the EU and other jurisdictions. LinkedIn implements additional rate limiting and bulk-extraction detection on its APIs. Civil litigation related to the scraping incidents is largely unsuccessful in establishing breach liability under current legal frameworks.
Total impact: 117 million credentials (2012) plus 700+ million profiles (2021), Yevgeniy Nikulin sentenced to 88 months in federal prison, ongoing regulatory scrutiny of API scraping practices, and sustained credential-stuffing campaigns enabled by exposed credentials.
Executive Lessons
The 2012 breach illustrates that foundational security practices (salted password hashing, in this case) are non-negotiable. The technical knowledge that unsalted password hashes were inadequate was already well-established in 2012. LinkedIn's failure was not a lack of technical sophistication; it was a choice not to implement standard cryptographic practice on credentials representing the keys to 100+ million accounts. The cost, including the 2018 EU GDPR-anticipated litigation, multiple class actions, and the sustained credential-stuffing campaigns enabled by exposed credentials, extended for years.
The 2021 scraping incident establishes that public APIs require their own threat model distinct from perimeter security. Companies operating public APIs that expose user data should implement rate limiting, behavioral monitoring for bulk extraction patterns, and authentication tiers that prevent commodity scraping from approximating breach-equivalent data extraction. The legal and regulatory categorization of scraping remains contested — LinkedIn's position that scraping is not a breach prevailed in some courts and not others — but the user-trust impact is real regardless.
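Behavioral monitoring for bulk extraction can be sketched as a breadth check: rather than counting requests, count how many distinct profiles each client has touched in a review period. The function name, event shape, and threshold below are hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def flag_bulk_extractors(
    events: Iterable[Tuple[str, str]],
    max_distinct_profiles: int,
) -> List[str]:
    """Return client IDs that fetched more distinct profiles than the threshold.

    events is a sequence of (client_id, profile_id) pairs drawn from one
    review period, for example a day of API access logs.
    """
    seen: Dict[str, Set[str]] = defaultdict(set)
    for client_id, profile_id in events:
        seen[client_id].add(profile_id)
    return sorted(
        client for client, profiles in seen.items()
        if len(profiles) > max_distinct_profiles
    )
```

A client that reloads the same handful of profiles many times stays under the threshold, while a client that walks through thousands of different profiles is flagged even if it paces its requests slowly. A real deployment would likely combine this signal with others, such as sequential ID access or unusually uniform request timing.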
Private Equity Implications
For PE sponsors evaluating consumer-facing technology and SaaS targets, the LinkedIn case establishes two distinct diligence dimensions. First, credential storage practices: are passwords hashed with current standards (bcrypt, scrypt, Argon2) with appropriate work factors? Second, public API exposure: is the target's API surface designed to prevent bulk extraction through rate limiting, authentication tiers, and behavioral monitoring? Targets where the answer to either question is unclear or unsatisfactory carry structural risk that doesn't appear in standard penetration testing.
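For the first diligence question, a reviewer given a sample of stored credential values can often infer the scheme from the format alone. The sketch below is one hedged way to do that: it recognizes the common bcrypt and Argon2 string formats and flags bare hexadecimal digests by length. The labels and the minimum bcrypt cost of 10 are assumptions for this example, not an industry standard.

```python
import re

# bcrypt: "$2a$", "$2b$" or "$2y$", a two-digit cost, then 53 characters of salt and hash.
_BCRYPT = re.compile(r"^\$2[aby]\$(\d{2})\$[./A-Za-z0-9]{53}$")
# Argon2 encoded strings begin with the variant name.
_ARGON2 = re.compile(r"^\$argon2(?:id|i|d)\$")
_HEX_DIGEST = re.compile(r"^[0-9a-fA-F]+$")

MIN_BCRYPT_COST = 10  # hypothetical review threshold
DIGEST_LENGTHS = {32: "MD5", 40: "SHA-1", 64: "SHA-256"}  # hex lengths of common digests


def classify_stored_hash(value: str) -> str:
    """Return a coarse label for how a stored credential value appears to be hashed."""
    if _ARGON2.match(value):
        return "argon2"
    match = _BCRYPT.match(value)
    if match:
        cost = int(match.group(1))
        return "bcrypt" if cost >= MIN_BCRYPT_COST else "bcrypt-low-cost"
    if _HEX_DIGEST.match(value) and len(value) in DIGEST_LENGTHS:
        # Length only suggests the algorithm; it cannot prove it.
        return "bare-digest:" + DIGEST_LENGTHS[len(value)]
    return "unknown"
```

A 40-character hexadecimal value is consistent with the unsalted SHA-1 storage seen in the 2012 breach, but it could also be a salted digest with the salt kept in another column, so a "bare-digest" result is a prompt for follow-up questions rather than a finding in itself.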
How Cloudskope Can Help
Cloudskope's Cyber Risk Assessment evaluates credential storage practices, password policies, and API exposure across the full application surface. Our Penetration Testing & Vulnerability Assessment specifically tests for the SQL injection patterns and credential-extraction vulnerabilities that produced the 2012 LinkedIn breach. Our M&A Cyber Due Diligence examines API governance for consumer-facing technology targets where bulk data extraction is a category of risk distinct from traditional breach scenarios.