30 Biggest Data Breaches of All Time

May 10, 2026

From Yahoo's 3 billion accounts to the SolarWinds supply-chain attack, this is a ranked list of the thirty largest data breaches in history, ordered primarily by records exposed and annotated with the regulatory, operational, and strategic context that determines which of them mattered most. The largest by record count is not always the most consequential: the SolarWinds compromise reached roughly 18,000 organizations, yet it reshaped U.S. federal cybersecurity policy more than any single 100-million-record consumer breach has. The ranking leads with records (the metric most readers expect), then weighs each incident against the composite-impact dimensions (regulatory precedent, strategic significance, and architectural consequence) that show which incidents shaped modern cybersecurity governance.

Data breaches have become a defining feature of the modern digital economy. The largest publicly disclosed incidents now exceed one billion records each, regulatory penalties have entered the billion-dollar range, and the operational impact of cyber attacks on critical infrastructure has produced fuel shortages, hospital diversions, and sector-wide disruption. This ranking covers the thirty largest and most consequential data breaches by combined exposure, regulatory significance, and operational impact.

How we ranked these breaches

The primary ranking criterion is the number of records or individuals exposed, the metric most readers expect from a "biggest breaches" list. Record count is a useful but not determinative measure: some 50-million-record breaches produced larger regulatory consequences than some 500-million-record breaches, and several incidents on this list are operationally consequential despite comparatively modest record counts. The full evaluation weighs five dimensions:

Records exposed — the raw count of affected individuals or accounts. The primary ranking input.
Operational disruption — for incidents like Colonial Pipeline, MGM, and CDK Global where the disruption itself exceeded the data exposure in consequence.
Direct financial and regulatory cost — regulatory penalties, civil settlements, remediation expenses, and ongoing compliance costs.
Strategic and intelligence significance — geopolitical, defense, and national-security implications, including attribution to state-sponsored actors and the operational uses of exfiltrated data.
Regulatory precedent and architectural consequence — the extent to which the breach catalyzed regulatory frameworks, enforcement standards, or industry practices that persist (the SEC Cybersecurity Disclosure Rules, the FTC's 20-year consent orders, the post-SolarWinds software supply chain executive orders).
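
The five dimensions above can be combined into a single composite score. The following is a minimal sketch of one way to do that, not the method used for this ranking: the weights, the 0-10 scale, and the two example incidents are all assumptions invented for illustration, since the article does not publish numeric weights.

```python
from dataclasses import dataclass

# Hypothetical weights for the five dimensions. The article does not publish
# numeric weights, so these values are illustrative only.
WEIGHTS = {
    "records": 0.40,
    "operational": 0.15,
    "financial": 0.15,
    "strategic": 0.15,
    "precedent": 0.15,
}


@dataclass
class Breach:
    name: str
    scores: dict  # dimension name -> score on an assumed 0-10 scale


def composite_score(breach: Breach) -> float:
    """Weighted sum of the five dimension scores."""
    return sum(WEIGHTS[dim] * breach.scores[dim] for dim in WEIGHTS)


def rank(breaches: list) -> list:
    """Return breach names ordered from highest to lowest composite score."""
    ordered = sorted(breaches, key=composite_score, reverse=True)
    return [b.name for b in ordered]


# Toy inputs with invented scores: a high-record consumer breach versus a
# low-record incident with heavy strategic and precedent weight.
consumer = Breach("large consumer breach", {
    "records": 9, "operational": 2, "financial": 5, "strategic": 2, "precedent": 3,
})
supply_chain = Breach("supply-chain compromise", {
    "records": 3, "operational": 6, "financial": 5, "strategic": 10, "precedent": 10,
})

print(rank([consumer, supply_chain]))
```

With these made-up inputs the supply-chain incident outranks the consumer breach despite its lower records score, which is the effect the composite framing is meant to capture.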

Record count alone does not capture how long an exposure persists. Genetic data (23andMe), passport numbers (Marriott-Starwood), Social Security numbers (Equifax, National Public Data), and biometric information cannot be rotated like passwords. The "permanent identity exposure" that these breaches produce will continue to enable identity fraud and targeted attacks for decades, not years.

The list excludes breach incidents where the record count cannot be confirmed. Incidents primarily characterized as ransomware or operational events without consumer data exposure are included where the operational consequence ranks them among the most significant of all time. Several incidents that do not make the records-ranked Top 30 — OPM, Anthem, Microsoft Exchange ProxyLogon, Kaseya VSA, Bybit — are included as honorable mentions at the end. Each entry links to a comprehensive analysis of the breach including attack vector, financial impact, regulatory response, and M&A and PE diligence implications.

The 30 biggest data breaches of all time

1. Yahoo (2013-2014) — 3 billion accounts

Yahoo's three-billion-account exposure remains the largest single-company breach in history. State-sponsored Russian intelligence (Alexey Belan and FSB officers Dmitry Dokuchaev and Igor Sushchin) compromised every Yahoo account that existed at the time. The breach went undisclosed during Yahoo's acquisition by Verizon — leading to a $350 million repricing of the deal, a $35 million SEC penalty for disclosure failures, and a $117.5 million class action settlement. The Yahoo case became the foundational precedent for the SEC's 2023 Cybersecurity Disclosure Rules and the regulatory framework that ultimately produced the SolarWinds CISO charges.

Read the full breakdown: Yahoo breach analysis

2. National Public Data (2024) — 2.9 billion records

A background-check data broker most consumers had never heard of, National Public Data exposed an estimated 2.9 billion records containing Social Security numbers, full names, addresses, and date-of-birth information for substantial portions of the U.S., Canadian, and U.K. populations. The data was reportedly sold on dark web forums for $3.5 million before being posted for free. The incident exposed the systemic risk created by data broker aggregation: companies most consumers don't know exist hold consolidated identity profiles that, when exposed, create permanent identity fraud exposure no remediation can undo. National Public Data filed for bankruptcy shortly after the disclosure.

Read the full breakdown: National Public Data breach analysis

3. LinkedIn (2021) — 800 million records

The 2021 LinkedIn data scrape exposed 800 million records (professional profile data, contact information, and inferred employment details) through abuse of the platform's public APIs. LinkedIn maintained that no breach of its systems occurred and that the data was scraped from publicly visible profile information; the dataset nonetheless represented one of the largest collections of professional and contact data ever assembled. The scrape illustrates the security limits of public-facing professional data: information that individual users never agreed to have aggregated at scale becomes a cheaply traded commercial product once it is.

Read the full breakdown: LinkedIn breach analysis

4. Marriott-Starwood (2014-2018) — 344 million guests

Chinese state-sponsored attackers operated inside the Starwood reservation database for four years — two years before Marriott's $13.6 billion acquisition closed in September 2016, and two more years inside what was then Marriott's network. The breach exposed 344 million guest records globally, including 5 million unencrypted passport numbers — the largest exposure of unencrypted travel documents in consumer cybersecurity history. The October 2024 FTC consent order requires Marriott to maintain zero trust architecture and forensic threat hunting in M&A through 2044 and established the canonical precedent for inherited M&A cyber risk.

Read the full breakdown: Marriott-Starwood breach analysis

5. Canvas/Instructure (2026) — 275 million users

The May 2026 breach of Canvas by Instructure — the dominant learning management system in U.S. higher education — exposed an estimated 275 million student, faculty, and administrator records across thousands of educational institutions. The breach landed in the middle of finals season, creating cascading operational disruption across K-12 districts and universities simultaneously. Canvas's status as critical education infrastructure used by the majority of U.S. universities made the breach a defining incident for EdTech security and accelerated K-12 and higher-ed cybersecurity regulatory attention.

Read the full breakdown: Canvas/Instructure breach analysis

6. T-Mobile (2021-2023) — 200+ million records

T-Mobile has experienced nine documented breaches across the 2014-2023 period, including a 76.6 million customer record exposure in August 2021 and a 37 million customer record exposure in January 2023. The cumulative exposure exceeds 200 million records and has produced $531.5 million in settlements — $500 million in class action settlements plus a $31.5 million FCC enforcement action that itself established the FCC consent decree framework now applied to telecom cybersecurity. The case is the operational reference for how serial-breach companies are penalized differently than first-time-breach companies.

Read the full breakdown: T-Mobile breach analysis

7. Equifax (2017) — 147.9 million Americans

The Equifax breach exposed Social Security numbers, dates of birth, addresses, driver's license numbers, and credit-card numbers for 147.9 million Americans — roughly 56% of all U.S. adults. Chinese state-sponsored attackers exploited an unpatched Apache Struts vulnerability that had been disclosed two months earlier and for which a patch had been available throughout the period. The breach produced the FTC's largest data-protection settlement at the time ($575 million-plus) and became the regulatory template for credit bureau cybersecurity. The pattern — known critical vulnerability, public patch, prolonged failure to apply — is among the most frequently cited examples of preventable enterprise breaches.

Read the full breakdown: Equifax breach analysis

8. Target (2013) — 110 million records

The 2013 Target breach exposed payment card data for 40 million customers and personal information for 70 million additional customers, compromised via a third-party HVAC vendor's credentials. The HVAC vendor — Fazio Mechanical Services — had remote access to Target's network for energy monitoring and was compromised through phishing. The lateral movement from a vendor with no obvious connection to payment systems into Target's POS infrastructure established the foundational template for third-party risk management. The breach produced a $202 million class action and consent settlement and is the canonical precedent for vendor risk management programs.

Read the full breakdown: Target breach analysis

9. Capital One (2019) — 106 million records

A former Amazon Web Services employee exploited a misconfigured web application firewall to access an AWS S3 bucket containing Capital One credit card application data for 106 million customers in the U.S. and Canada. The breach exposed Social Security numbers, bank account numbers, addresses, and credit scores. The case is the foundational cloud security incident: a configuration error in a perimeter control allowed access to data that was nonetheless inside Capital One's own AWS account. The $190 million class action settlement plus an $80 million OCC penalty established the cloud-specific regulatory framework that subsequent FFIEC guidance has built on.

Read the full breakdown: Capital One breach analysis

10. Change Healthcare (2024) — 100+ million patients

The February 2024 ransomware attack on Change Healthcare — UnitedHealth's payment processing subsidiary handling roughly one-third of U.S. healthcare claims — exposed protected health information for at least 100 million patients and disrupted prescription processing, claim submission, and clinical operations for healthcare providers nationwide. UnitedHealth disclosed a $22 million ransom payment to the BlackCat/ALPHV ransomware group. The breach produced the largest HIPAA breach notification in U.S. history and is now cited as the foundational case for healthcare supply-chain concentration risk.

Read the full breakdown: Change Healthcare breach analysis

💡 Key Insight

The largest breaches share a common pattern: a small number of foundational control failures multiplied across hundreds of millions of records.

11. Facebook / Cambridge Analytica (2014-2018) — 87 million profiles

Researcher Aleksandr Kogan harvested 87 million Facebook profiles via the platform's then-permitted Friends API and transferred them to Cambridge Analytica for political microtargeting in the 2016 U.S. election and U.K. Brexit referendum. The case established that consent-based data exposure — data collected through legitimate API access used outside what users would understand — can produce regulatory consequences equivalent to technical breach. The $5 billion FTC settlement was the largest consumer protection penalty in U.S. history at the time and accelerated GDPR enforcement, CCPA enactment, and the state-by-state privacy law cascade.

Read the full breakdown: Facebook / Cambridge Analytica breach analysis

12. JPMorgan Chase (2014) — 83 million records

Russian nationals (operating with apparent state acquiescence rather than direct sponsorship) compromised JPMorgan Chase through a single server that had not been updated to require two-factor authentication. The breach exposed contact information for 76 million households and 7 million small businesses. While no financial data was reportedly exfiltrated, the case established that the absence of two-factor authentication on even one network access path could produce eight-figure consumer exposure at a top-tier financial institution. The case is the foundational MFA-gap precedent in financial services cybersecurity.
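
The single-server gap described above is the kind of thing an inventory audit can surface. The sketch below assumes a hypothetical list of access paths with invented hostnames and fields; it is not a description of JPMorgan Chase's environment or tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPath:
    host: str              # hypothetical hostname
    internet_facing: bool  # reachable from outside the network
    mfa_required: bool     # second factor enforced at login


def mfa_gaps(paths):
    """Return internet-facing access paths that do not enforce MFA."""
    return [p for p in paths if p.internet_facing and not p.mfa_required]


# Invented inventory: three remote entry points, one of which was missed
# when two-factor authentication was rolled out.
inventory = [
    AccessPath("vpn-1.example.internal", True, True),
    AccessPath("vpn-2.example.internal", True, True),
    AccessPath("legacy-portal.example.internal", True, False),
    AccessPath("build-server.example.internal", False, False),
]

for path in mfa_gaps(inventory):
    print(f"MFA not enforced on {path.host}")
```

The point of the check is that coverage is measured per access path, so one exception is enough to fail it.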

Read the full breakdown: JPMorgan Chase breach analysis

13. AT&T (2024) — 73 million customers

AT&T disclosed in March 2024 that customer data for approximately 73 million current and former customers had appeared on the dark web. The data included Social Security numbers, dates of birth, and AT&T account information. AT&T initially characterized the data as old (from 2019 or earlier) before later acknowledging it included data from active customer accounts. A separate July 2024 disclosure revealed that call and text metadata for nearly all of AT&T's wireless customers — approximately 109 million customers — had been exposed through compromised Snowflake credentials. AT&T paid an estimated $370,000 ransom to delete the data.

Read the full breakdown: AT&T breach analysis

14. Dropbox (2012) — 68.7 million accounts

The 2012 Dropbox breach exposed credentials for 68.7 million accounts. The breach occurred when an employee's password (reused from LinkedIn, which had been breached the same year) was used to access an internal Dropbox project document containing user credentials. The credentials were hashed with SHA-1 and bcrypt — relatively strong protection — but the breach nonetheless created credential-stuffing exposure that propagated for years. The case is the canonical precedent for password reuse risk and the operational case for password managers and mandatory MFA on employee accounts with sensitive system access.
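
The difference between a fast hash such as SHA-1 and a deliberately slow, salted hash such as bcrypt can be sketched as follows. bcrypt is not part of the Python standard library, so this example substitutes PBKDF2-HMAC-SHA256 from `hashlib` as a stand-in slow hash; the iteration count is an assumed value, and nothing here reflects Dropbox's actual implementation.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor, chosen only for illustration


def fast_sha1(password: str, salt: bytes) -> bytes:
    """One SHA-1 pass: cheap for the server, and equally cheap for an attacker."""
    return hashlib.sha1(salt + password.encode()).digest()


def slow_hash(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Salted PBKDF2-HMAC-SHA256; returns (salt, derived_key)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per stored password
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, key


def verify(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    _, key = slow_hash(password, salt)
    return hmac.compare_digest(key, expected_key)


# Two accounts with the same password get different stored values because
# each one has its own salt.
salt_a, key_a = slow_hash("correct horse")
salt_b, key_b = slow_hash("correct horse")
print(key_a != key_b)
print(verify("correct horse", salt_a, key_a))
```

Each guess against the slow hash costs the attacker 200,000 inner hash operations instead of one, which is what makes offline cracking of a stolen table expensive. It does nothing for users who reuse the same password elsewhere.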

Read the full breakdown: Dropbox breach analysis

15. PowerSchool (2025) — 60+ million records

The PowerSchool breach exposed sensitive student information — including Social Security numbers in some districts — for an estimated 60 million-plus students and teachers across thousands of K-12 schools in the United States and Canada. PowerSchool's dominant position in the K-12 student information system market made the breach a single-vendor exposure across an entire educational sector. The case is the K-12 equivalent of Change Healthcare in terms of supply-chain concentration: when a critical SaaS vendor serves a majority of the sector, a single breach produces sector-wide exposure that no individual customer could have prevented.

Read the full breakdown: PowerSchool breach analysis

16. Comcast/Xfinity (2023) — 35.9 million customers

Comcast disclosed in December 2023 that attackers had exploited the Citrix Bleed vulnerability (CVE-2023-4966) to compromise customer authentication credentials and Social Security number information for 35.9 million Xfinity customers. The Citrix Bleed exploitation followed a pattern documented across multiple major enterprise breaches: a critical-severity vulnerability with available patch was exploited during the window between disclosure and patch deployment. Comcast's reported delay in applying the patch was the operational failure that converted a generic vulnerability into a 35.9 million customer breach.
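
The window between patch availability and patch deployment is simple date arithmetic, and tracking it is a common way to enforce patching deadlines. The sketch below uses invented dates and an assumed 7-day deadline; it does not reproduce the actual Citrix or Comcast timeline.

```python
from datetime import date


def exposure_window_days(patch_released: date, patch_applied: date) -> int:
    """Days a system stayed exploitable after a fix was available."""
    return max((patch_applied - patch_released).days, 0)


def overdue(patch_released: date, today: date, sla_days: int) -> bool:
    """True if a still-unpatched system has passed its patching deadline."""
    return (today - patch_released).days > sla_days


# Hypothetical dates for illustration only.
released = date(2024, 3, 1)
applied = date(2024, 3, 15)

print(exposure_window_days(released, applied))   # 14
print(overdue(released, date(2024, 3, 10), 7))   # True
```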

Read the full breakdown: Comcast/Xfinity breach analysis

17. LastPass (2022) — 30+ million vaults

The two-part LastPass breach in August and December 2022 ultimately compromised encrypted password vaults for an estimated 30 million-plus users. Attackers initially accessed development environments through a compromised employee laptop, then used the access to exfiltrate encrypted vault backups. The encryption protected vault contents from direct exposure but created a long-tail risk: vaults are subject to offline brute-force attacks for users with weak master passwords. The breach is the operational case study for why password manager companies face heightened scrutiny and why master password strength matters more than most users realize.
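
Why master password strength dominates the long-tail risk can be shown with a rough estimate. The sketch below models a password as uniformly random characters and assumes an attacker guess rate of 100,000 per second. Both are simplifications: the real rate depends on the vault's key-derivation settings, which this article does not state, and human-chosen passwords usually carry far less entropy than the uniform model suggests.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600


def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a uniformly random password: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


def expected_years_to_crack(bits: float, guesses_per_second: float) -> float:
    """Average offline search time: half the keyspace divided by the guess rate."""
    expected_guesses = 2 ** (bits - 1)
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


RATE = 1e5  # assumed guesses per second against a slow key-derivation function

weak = entropy_bits(8, 26)     # 8 random lowercase letters
strong = entropy_bits(16, 62)  # 16 random letters and digits

print(f"weak:   {weak:.1f} bits, {expected_years_to_crack(weak, RATE):.3f} years")
print(f"strong: {strong:.1f} bits, {expected_years_to_crack(strong, RATE):.2e} years")
```

Under these assumptions the short password falls in well under a year while the long one is out of reach, and a stolen vault backup gives the attacker unlimited time to try.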

Read the full breakdown: LastPass breach analysis

18. LoanDepot (2024) — 16.9 million customers

The January 2024 ransomware attack on mortgage lender LoanDepot exposed personal information including Social Security numbers, financial account information, and dates of birth for 16.9 million current and former customers. The attack disrupted LoanDepot's loan origination and servicing operations for weeks and was attributed to the ALPHV/BlackCat ransomware group. The case illustrates the operational concentration risk in mortgage lending: customer data, payment processing, loan servicing, and document management are typically consolidated in a small number of integrated platforms whose compromise can disable an institution's core operations.

Read the full breakdown: LoanDepot breach analysis

19. Adobe (2026) — 13+ million tickets

The Adobe customer support breach in 2026 exposed personal information and support ticket content for 13 million-plus customers. Support tickets are an underappreciated high-value target: they contain not just contact information but the specific technical contexts, screenshots, and product configurations that customers shared with support staff. The exposure of this data creates phishing exposure that is qualitatively different from generic credential exposure because attackers can reference real product contexts the customer would recognize. The case is the operational case study for support-system security across enterprise SaaS providers.

Read the full breakdown: Adobe breach analysis

20. Medibank (2022) — 9.7 million customers

The October 2022 breach of Australia's largest health insurer Medibank exposed personal and health claims data for 9.7 million current and former customers. Attackers — attributed to the REvil ransomware group — demanded a $10 million ransom; Medibank refused to pay, and the attackers progressively released data on the dark web including categorized 'good list' and 'bad list' files based on sensitive medical claim categories (mental health, abortion, addiction treatment). The disclosure was unusually harmful relative to the record count and produced an AUD$50 million Australian regulatory penalty plus the largest privacy-related class action in Australian history.

Read the full breakdown: Medibank breach analysis

3 Billion

Yahoo accounts compromised in the 2013–2016 breaches — every Yahoo account that existed at the time, and still the largest single corporate breach in history.

$20 Billion+

Cumulative direct financial cost of the 30 largest breaches in history, including regulatory penalties, customer settlements, and breach-remediation expense.

5 Causes

Recurring structural weaknesses — identity, vendor concentration, disclosure delay, unpatched infrastructure, and over-permissioned access — that drive nearly every major breach.

21. 23andMe (2023) — 6.9 million users

Attackers used credential stuffing — testing reused credentials from prior breaches — to compromise approximately 14,000 23andMe accounts, then exploited the platform's DNA Relatives feature to extract ancestry and relative information for 6.9 million additional users connected to the compromised accounts. The breach produced uniquely sensitive exposure: genetic information cannot be rotated, and the relative-discovery feature meant the breach affected users who had themselves taken no action to compromise their security. 23andMe filed for bankruptcy in 2025; the data's continuing exposure remains an unresolved consumer issue.
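
The amplification described above, where a small number of compromised accounts exposes a much larger set of connected users, can be sketched with a toy relationship graph. The account names and the `relatives` mapping are invented, and the function is a simplified stand-in for a relative-matching feature rather than a model of 23andMe's system.

```python
def exposed_users(compromised, relatives):
    """Users whose data is reachable through the compromised accounts.

    `relatives` maps an account to the set of accounts it can view through
    a relative-matching feature.
    """
    exposed = set()
    for account in compromised:
        exposed |= relatives.get(account, set())
    # Report only users who were not themselves compromised.
    return exposed - set(compromised)


# Toy graph with invented account names.
relatives = {
    "alice": {"bob", "carol", "dave"},
    "erin": {"carol", "frank"},
    "bob": {"alice"},
}
compromised = {"alice", "erin"}

indirect = exposed_users(compromised, relatives)
print(sorted(indirect))                  # ['bob', 'carol', 'dave', 'frank']
print(len(indirect) / len(compromised))  # 2.0
```

In the toy graph each compromised account exposes two other users on average. Using the figures in this entry, 6.9 million users reached through roughly 14,000 accounts works out to about 490 per compromised account.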

Read the full breakdown: 23andMe breach analysis

22. Ascension Health (2024) — 5.6 million patients

The May 2024 ransomware attack on Ascension Health — one of the largest U.S. nonprofit hospital systems with 140 hospitals across 19 states — exposed personal and health information for 5.6 million patients and disrupted clinical operations across the network for weeks. The attack required Ascension to revert to paper-based clinical workflows in many hospitals and produced documented patient-safety incidents from the operational disruption. The case is the operational reference for hospital cybersecurity and the patient-safety dimension of healthcare ransomware that the federal HHS Office for Civil Rights has subsequently emphasized.

Read the full breakdown: Ascension Health breach analysis

23. Blue Shield of California / Google Analytics (2025) — 4.7 million members

Blue Shield of California disclosed in 2025 that approximately 4.7 million members had been affected by Google Analytics configuration that had inadvertently shared protected health information with Google's advertising systems. The case is notable because no traditional breach occurred — the exposure was created by a configuration error in routine analytics tooling that transmitted data to Google in ways that violated HIPAA. The pattern — accidental disclosure through analytics, advertising pixels, and SaaS integrations — is now a primary HIPAA enforcement category that has produced settlements at multiple major U.S. health systems.
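
One common guard against this class of accidental disclosure is an allowlist applied to every analytics event before it leaves the site. The sketch below is an assumption-laden illustration: the field names, the allowlist, and the event shape are hypothetical and do not describe Blue Shield's or Google Analytics' actual configuration.

```python
# Hypothetical allowlist: only these keys may be sent to a third-party
# analytics service.
ALLOWED_KEYS = {"event", "page_category", "timestamp"}


def scrub_event(event: dict) -> dict:
    """Keep only allowlisted keys so identifiers and health details are dropped."""
    return {key: value for key, value in event.items() if key in ALLOWED_KEYS}


# Invented event as a site might assemble it before any filtering.
raw = {
    "event": "search",
    "page_category": "find-a-doctor",
    "timestamp": "2025-01-01T00:00:00Z",
    "member_id": "M-0000",              # invented identifier
    "search_term": "oncologist 94110",  # free text that may reveal health details
}

print(scrub_event(raw))
```

An allowlist fails closed: a field added later is dropped by default, whereas a blocklist would pass it through until someone noticed.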

Read the full breakdown: Blue Shield California breach analysis

24. Snowflake Customer Environments (2024) — Hundreds of millions

The 2024 Snowflake customer breach campaign — attributed to the ShinyHunters group — compromised an estimated 165 Snowflake customer environments through credential-stuffing attacks targeting accounts without multi-factor authentication. Affected customers included AT&T (109 million records), Ticketmaster (560 million records), Santander, Advance Auto Parts, and many others. The cumulative exposure across customer environments reached hundreds of millions of individuals. The campaign illustrates the cascading risk of shared cloud platforms when MFA is not enforced as a default and the operational case for mandatory MFA on cloud data platforms regardless of individual customer security posture.

Read the full breakdown: Snowflake customer breach analysis

25. MOVEit Transfer (2023) — Tens of millions

The May-June 2023 MOVEit Transfer breach campaign — attributed to the Cl0p ransomware group — exploited a zero-day SQL injection vulnerability (CVE-2023-34362) in Progress Software's MOVEit file transfer product to compromise hundreds of organizations using the product. Affected entities included the U.S. Department of Energy, multiple federal agencies, state government departments, healthcare systems, and consumer organizations. The downstream consumer impact has been estimated in the tens of millions of records across all affected organizations. The case is the supply-chain breach reference for file-transfer tooling and the operational basis for the post-2023 movement away from on-premises file-transfer products in regulated sectors.
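
SQL injection as a vulnerability class can be shown in a few lines. The sketch below uses an in-memory SQLite database with an invented `files` table; it is a generic illustration of the class and of the parameterized-query fix, not a reproduction of CVE-2023-34362 or of MOVEit's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (owner TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [("alice", "payroll.csv"), ("bob", "claims.csv")],
)


def list_files_unsafe(owner: str):
    """Vulnerable: user input is concatenated into the SQL text."""
    query = "SELECT name FROM files WHERE owner = '" + owner + "'"
    return [row[0] for row in conn.execute(query)]


def list_files_safe(owner: str):
    """Parameterized: the driver passes the input as data, never as SQL."""
    rows = conn.execute("SELECT name FROM files WHERE owner = ?", (owner,))
    return [row[0] for row in rows]


payload = "alice' OR '1'='1"
print(list_files_unsafe(payload))  # every row, including bob's file
print(list_files_safe(payload))    # []
```

The unsafe version turns the payload into an always-true condition and returns every owner's files. The parameterized version treats the same string as a literal owner name that matches nothing.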

Read the full breakdown: MOVEit breach analysis

26. SolarWinds (2020) — 18,000+ organizations

The SolarWinds Orion supply chain compromise — disclosed in December 2020 and attributed to Russian SVR (APT29/Cozy Bear) — used malicious code injected into legitimate SolarWinds Orion software updates to compromise approximately 18,000 customer organizations including U.S. federal agencies (Treasury, Commerce, DHS, State), Fortune 500 enterprises, and security companies including FireEye. The breach is the canonical supply chain attack and the foundational precedent for the 2023 SEC charges against SolarWinds' CISO — the first individual-officer cybersecurity enforcement action of its kind. The case has reshaped how software supply chains are evaluated in M&A and regulatory contexts.

Read the full breakdown: SolarWinds breach analysis

27. Colonial Pipeline (2021) — Operational disruption

The May 2021 ransomware attack on Colonial Pipeline by the DarkSide ransomware group forced the shutdown of the largest fuel pipeline on the U.S. East Coast for six days, producing fuel shortages across the eastern United States. Colonial paid a $4.4 million ransom; the FBI subsequently recovered approximately $2.3 million of the payment. The incident was the catalyst for President Biden's Executive Order 14028 (Improving the Nation's Cybersecurity) and the foundational precedent for critical-infrastructure cybersecurity regulation including the TSA's pipeline cybersecurity directives and CISA's evolving operational technology guidance.

Read the full breakdown: Colonial Pipeline breach analysis

28. MGM Resorts (2023) — Operational disruption

The September 2023 social engineering attack on MGM Resorts — attributed to the Scattered Spider/UNC3944 group working with the ALPHV/BlackCat ransomware affiliate — compromised MGM's network through a 10-minute phone call to the company's IT help desk. The attackers reset credentials for a senior employee whose information they had obtained from LinkedIn. The resulting ransomware deployment disabled hotel check-in, slot machines, room keys, ATMs, restaurant payment systems, and reservation systems across MGM's properties for ten days. MGM disclosed approximately $100 million in operational losses. The case is the operational reference for social-engineering-driven help-desk attacks.

Read the full breakdown: MGM Resorts breach analysis

29. CDK Global (2024) — 15,000+ auto dealers

The June 2024 ransomware attack on CDK Global — the dominant dealer management system serving approximately 15,000 U.S. auto dealerships — disabled the systems used for sales transactions, parts inventory, service scheduling, and financing applications across the affected dealerships for weeks. CDK reportedly paid a $25 million ransom. The incident is the operational reference for sector-concentration ransomware attacks: when a single SaaS vendor serves the dominant share of a sector, a single attack produces sector-wide operational disruption that no individual customer could have prevented through unilateral security investment.

Read the full breakdown: CDK Global breach analysis

30. Twitter (2020) — 130 high-profile accounts

The July 2020 Twitter breach — executed by a 17-year-old Florida resident and accomplices through social engineering of Twitter employees — compromised 130 high-profile accounts including those of Barack Obama, Joe Biden, Elon Musk, Bill Gates, and Apple. The attackers posted cryptocurrency scams from the compromised accounts and extracted approximately $118,000 in Bitcoin. The case is included not for the record count but for the operational lessons: social engineering of internal employees with administrative access can compromise the most prominent communications platform in the world, and the same techniques scale to attacks on political and financial actors with national-security implications.

Read the full breakdown: Twitter 2020 breach analysis

Honorable mentions: breaches outside the records-ranked Top 30

Several incidents rank near the top of the composite-impact framework without making the records-ranked Top 30 above. Each is included here because its omission would produce a meaningfully incomplete picture of the cybersecurity landscape that executives, boards, and PE sponsors should understand.

Office of Personnel Management (OPM, 2015) — 22.1 million federal records

Chinese state-sponsored attackers compromised security clearance background investigation records — SF-86 forms containing extensive biographical, financial, and personal information — for approximately 22 million current and former U.S. federal employees, contractors, and their relatives. The breach is the foundational case for understanding the strategic intelligence value of consolidated biographical databases. Director Katherine Archuleta resigned. The breach catalyzed substantial federal cybersecurity reorganization, including the eventual establishment of CISA.

Anthem (2015) — 78.8 million health records

Chinese state-sponsored attackers compromised health-insurance records for nearly 80 million Americans — names, dates of birth, Social Security numbers, employment information, and medical identifiers. The case is operationally inseparable from the China MSS biographical database campaign (OPM, Anthem, Marriott-Starwood) that defines the state-sponsored breach pattern of the 2014-2018 period. The breach established health-insurance industry cybersecurity expectations under HIPAA's Security Rule and prompted the HHS Office for Civil Rights to substantially increase its enforcement posture.

Microsoft Exchange ProxyLogon (2021) — 250,000+ organizations

Chinese state-sponsored actor HAFNIUM exploited four zero-day vulnerabilities in Microsoft Exchange Server (CVE-2021-26855, -26857, -26858, -27065) to deploy web shells on hundreds of thousands of Exchange servers globally. The campaign was so extensive that the FBI executed a court-authorized operation in April 2021 to remove web shells from affected servers without owner consent — the first such operation in U.S. cybersecurity enforcement history. ProxyLogon catalyzed the broader migration from on-premises Exchange to Microsoft 365 and substantially accelerated CISA's role as operational coordinator of national cybersecurity response.

Kaseya VSA (2021) — ~1,500 downstream organizations via MSP supply chain

The REvil ransomware group exploited a zero-day vulnerability in Kaseya's VSA remote management software to deploy ransomware through managed service providers to approximately 1,500 downstream organizations. The campaign is the canonical MSP-supply-chain ransomware case and foundational for understanding the asymmetric attack surface created by MSP-level access to many downstream networks.

Bybit (2025) — $1.4 billion cryptocurrency theft

The February 2025 compromise of Bybit's Ethereum cold wallet — attributed to North Korean state-sponsored actor Lazarus Group — is the largest single cryptocurrency theft in history. The case is foundational for understanding DPRK state-sponsored cryptocurrency theft as a sovereign revenue source and for the FATF-track international response to crypto-laundering networks.

What these breaches have in common

Foundational control failures, not novel exploits

Across the thirty breaches in this ranking, the attack vectors trace overwhelmingly to a small set of foundational control failures: missing multi-factor authentication (JPMorgan Chase, Snowflake customers, LastPass), unpatched known-critical vulnerabilities (Equifax, Comcast/Xfinity), inadequate access controls or excessive privilege (Capital One, Marriott-Starwood), credential reuse and password management failures (Dropbox, 23andMe), abuse of legitimately exposed API access (LinkedIn, Facebook / Cambridge Analytica), and social engineering of internal employees with administrative access (Twitter, MGM Resorts). Almost none of the breaches in this ranking required novel attacker capability; the MOVEit zero-day is one of the few exceptions. The attackers used techniques that have been documented and recommended against for over a decade.

Detection delays compound exposure

The largest breaches share a second common pattern: long detection delays, in several cases lasting years, that converted contained intrusions into catastrophic exposures. Yahoo's attackers operated for at least 18 months in the 2013 incident and four years for the cookie-forging operation. Marriott-Starwood's attackers operated for four years across the M&A transition. JPMorgan Chase's attackers operated for at least two months. The pattern indicates that the primary differentiator between contained incidents and headline breaches is detection capability inside the network, not perimeter strength.

Third-party and supply-chain risk dominates the post-2020 landscape

The breach pattern has shifted measurably since 2020 toward supply-chain and shared-infrastructure compromises. SolarWinds, MOVEit, the Snowflake customer campaign, Change Healthcare, PowerSchool, and CDK Global all share one structural feature: the breach was not of the affected organization directly but of a vendor or shared platform that the organization could not have independently secured. The shift has produced regulatory responses including Executive Order 14028, the SEC's 2023 Cybersecurity Disclosure Rules, the M&A provisions of the FTC's October 2024 Marriott consent order, and the developing case law on supply-chain cyber liability.

The regulatory penalty range has expanded by orders of magnitude

The Equifax $575 million settlement in 2019 was widely considered the upper bound of regulatory data-breach penalty exposure at the time. Other outcomes have since redrawn that range: Facebook's $5 billion FTC penalty (2019) and Texas's $1.4 billion settlement with Meta (2024) both exceeded the Equifax benchmark more than twofold, while T-Mobile's $531.5 million settlement (2022) and Marriott's $170 million-plus combined costs (through 2024) show nine-figure outcomes becoming routine. Collectively they have reset the regulatory pricing of major data-breach incidents into the hundreds of millions and, at the top end, billions of dollars. For boards and audit committees of consumer-data-rich businesses, the operational implication is that cyber risk is now a material financial-statement risk category, not a routine operational risk.

How breaches keep getting bigger

Three structural trends explain why breaches have continued to grow in size across the period covered by this ranking:

Data aggregation has continued to scale. National Public Data's 2.9 billion records were not collected by NPD itself — they were aggregated from public records, credit bureaus, and other data brokers across decades. The aggregation business model creates structurally large exposure: a single compromise can expose data the affected consumers never consented to having centralized.

Shared infrastructure concentration has continued to grow. Cloud platforms, SaaS vendors, and integrated industry-specific systems concentrate data and operations from many organizations in single platforms. Snowflake's customer environments, PowerSchool's K-12 footprint, Change Healthcare's payment processing role, and CDK Global's auto dealer footprint all illustrate the pattern: when one vendor serves a large share of a sector, one compromise produces sector-wide exposure.

Adversaries have professionalized. The ransomware-as-a-service model (REvil, ALPHV/BlackCat, LockBit), the market for stolen credentials that fuels credential stuffing, and the state-sponsored intelligence services targeting commercial and government data (Yahoo, Marriott-Starwood, Equifax, OPM, Anthem) have all matured into durable operational capabilities. That maturity puts continuous pressure on consumer-data-rich organizations at a scale that did not exist in earlier periods.

What boards and executives should take from this list

The recurring themes across thirty incidents have direct implications for cybersecurity governance:

Cyber-readiness is a board-level governance function. The FTC's Marriott consent order, the SEC's 2023 Cybersecurity Disclosure Rules, the Delaware Caremark-doctrine case law (Marchand, Boeing 737 MAX), and the developing state AG enforcement framework all establish that boards owe substantive oversight of cybersecurity risk. That duty is not a matter of general risk oversight but a specific governance function comparable to financial-statement audit oversight.

M&A diligence must include forensic threat hunting. The Marriott-Starwood inheritance pattern, the Yahoo-Verizon repricing precedent, and the Equifax post-IPO enforcement together establish that pre-acquisition diligence based on target self-disclosure cannot find breaches the target does not know about. Forensic threat hunting in M&A is now standard practice in consumer-data-rich sectors and is procedurally required under the Marriott FTC consent order.

Foundational controls are the highest-leverage investment. The thirty breaches in this ranking trace overwhelmingly to a small set of preventable control failures. Investment in MFA enforcement, patch management discipline, network segmentation, identity governance, and detection capability inside the network produces more risk reduction per dollar than investment in novel security technology.

The disclosure framework has changed. The SEC's requirement to disclose a cybersecurity incident within four business days of determining that it is material, the FTC's data minimization mandate, and the named-officer accountability framework that began with the Facebook 2019 settlement and continues through the SolarWinds CISO charges have collectively reset the disclosure and personal-accountability expectations for cybersecurity governance. Boards and audit committees should expect that future material cyber incidents will produce named-officer accountability questions in addition to corporate enforcement actions.
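
For incident-response planning, the four-business-day window can be turned into a concrete calendar date. The sketch below is illustrative only: the function name `disclosure_deadline` and its parameters are hypothetical, it counts Monday through Friday as business days, and it skips only the holidays the caller supplies. It is not legal advice, and counsel should confirm the actual filing date.

```python
from datetime import date, timedelta


def disclosure_deadline(
    determined_on: date,
    holidays: frozenset = frozenset(),
    business_days: int = 4,
) -> date:
    """Return the date `business_days` business days after `determined_on`.

    Assumption: a business day is Monday-Friday and not in `holidays`.
    The caller must supply the holiday calendar; none is built in.
    """
    current = determined_on
    remaining = business_days
    while remaining > 0:
        current += timedelta(days=1)
        # weekday() is 0 for Monday through 6 for Sunday
        if current.weekday() < 5 and current not in holidays:
            remaining -= 1
    return current


if __name__ == "__main__":
    # Materiality determined on Thursday 2024-02-01 -> Wednesday 2024-02-07
    print(disclosure_deadline(date(2024, 2, 1)))
```

The clock starts at the materiality determination, not at the initial intrusion or detection, which is why the function takes the determination date as its input.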

Frequently asked questions

What is the largest data breach of all time?

Yahoo's 2013 breach, which compromised every Yahoo account in existence at the time — approximately 3 billion accounts — remains the largest single-company data breach in history. The breach was attributed to state-sponsored Russian intelligence and was discovered and disclosed years after it occurred, leading to a $350 million repricing of Yahoo's acquisition by Verizon and a $35 million SEC penalty for disclosure failures.

How are data breaches ranked by size?

Data breaches are typically ranked by the number of records or individuals exposed. This ranking uses that primary criterion but also incorporates secondary factors including regulatory significance, persistence of the exposure (data that cannot be rotated, such as Social Security numbers, passport numbers, and genetic information, is weighted more heavily), and operational impact (for incidents like Colonial Pipeline where the operational disruption was the primary harm rather than the data exposure).
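
A composite ranking of this kind can be sketched in a few lines. The article does not publish a scoring formula, so everything here is an assumption made for illustration: the log-scaled record count, the 0-to-1 ratings for each secondary factor, the equal weights in `WEIGHTS`, and the placeholder incidents.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Breach:
    name: str
    records: int          # individuals or accounts exposed
    regulatory: float     # 0..1, regulatory significance (hypothetical rating)
    persistence: float    # 0..1, share of exposed data that cannot be rotated
    operational: float    # 0..1, operational disruption (hypothetical rating)


# Hypothetical weights; the article does not state its actual weighting.
WEIGHTS = {"regulatory": 1.0, "persistence": 1.0, "operational": 1.0}


def composite_score(breach: Breach) -> float:
    """Log-scaled record count plus weighted secondary factors."""
    base = math.log10(max(breach.records, 1))
    adjustment = sum(w * getattr(breach, key) for key, w in WEIGHTS.items())
    return base + adjustment


def rank(breaches: list) -> list:
    """Return breaches ordered from highest to lowest composite score."""
    return sorted(breaches, key=composite_score, reverse=True)


if __name__ == "__main__":
    # Placeholder incidents, not real data.
    incidents = [
        Breach("Incident A", 500_000_000, 0.0, 0.0, 0.0),
        Breach("Incident B", 50_000_000, 0.9, 0.8, 0.0),
    ]
    for b in rank(incidents):
        print(b.name, round(composite_score(b), 2))
```

Taking the logarithm of the record count keeps a tenfold difference in records worth one point, so a smaller breach with high regulatory significance and non-rotatable data can outrank a larger one, which matches the reasoning above.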

What is the most expensive data breach in history?

Facebook's $5 billion FTC settlement in 2019 over the Cambridge Analytica incident remains the largest single consumer-protection penalty in U.S. history. Meta's later settlement with Texas ($1.4 billion in 2024 over biometric privacy) and the cumulative regulatory and class-action costs of breaches like Equifax (over $1.4 billion), T-Mobile (over $500 million), and Marriott (over $170 million through 2024) have established that major consumer-data breaches now routinely produce nine- and ten-figure penalty exposure.

What kinds of data are most commonly exposed in breaches?

The most common categories of exposed data in the breaches on this list are names, email addresses, dates of birth, Social Security numbers, account credentials, payment card information, and home addresses. More sensitive categories that appear in specific incidents include passport numbers (Marriott-Starwood), genetic information (23andMe), protected health information (Change Healthcare, Ascension, Medibank, Blue Shield California), biometric data (Facebook BIPA cases), and location data (multiple advertising-technology cases).

What should I do if my data was exposed in one of these breaches?

Standard recommendations include placing a credit freeze with all three credit bureaus (Equifax, Experian, and TransUnion), enabling multi-factor authentication on all accounts, using unique passwords managed through a password manager, monitoring financial accounts and credit reports for unauthorized activity, and being skeptical of unsolicited communications that reference accurate personal information — a common pattern in post-breach phishing. For breaches involving Social Security numbers, considering an identity-theft monitoring service is reasonable. For breaches involving genetic or biometric information, the exposure cannot be remediated through credential rotation, and longer-term monitoring of the categories of harm those data can enable is appropriate.

Are data breaches getting bigger over time?

Yes. With the exception of Yahoo's 3 billion accounts, the largest breaches of the 2010s exposed records in the hundreds of millions; the largest breaches of the 2020s have exposed records in the billions. The trend reflects three structural shifts: continued data aggregation in data broker and analytics businesses, growing concentration of operations in shared cloud platforms and SaaS vendors, and the professionalization of adversary capabilities through ransomware-as-a-service and state-sponsored intelligence operations. These trends are unlikely to reverse, and consumer-data-rich businesses should plan for the continued growth of breach scale rather than treating individual incidents as outliers.

What is the difference between a data breach and a data leak?

The terms are often used interchangeably, but the substantive distinction matters in regulatory and incident-response contexts. A data breach typically involves unauthorized access by an external attacker who has bypassed security controls. A data leak typically involves accidental or negligent exposure (misconfigured cloud storage, accidental email sends, analytics tools that inadvertently transmit data to third parties, lost devices). The Blue Shield California Google Analytics incident is a leak; the Equifax breach is a breach. Both categories are regulated under most modern privacy frameworks and both can produce substantial regulatory and consumer harm.

The pattern that keeps repeating

Reviewing these thirty incidents together makes clear that the primary cause of large breaches is not adversary sophistication but defender failure on foundational controls. Multi-factor authentication, patch management discipline, network segmentation, least-privilege access controls, encryption at rest, and detection capability inside the network are the controls whose absence appears repeatedly across the cases. Organizations that invest in those controls do not become unbreachable, but they substantially reduce the probability that an incident will scale to a headline-class breach.

For boards and executives, the practical implication is that cybersecurity is now a financial-statement risk category that deserves the same governance attention as audit, financial reporting, and other mission-critical control areas. Breaches of the kind on this list will continue to grow in size, in regulatory cost, and in personal-accountability exposure for sitting executives. The framework for managing that risk is well understood; the operational discipline to execute the framework consistently is the differentiator between organizations that appear on lists like this one and those that do not.

Conclusion

The largest breaches are not feats of adversary brilliance. They are the predictable consequences of preventable control failures multiplied across hundreds of millions of records. Four themes recur across this ranking and the honorable mentions, and they should structure board-level cybersecurity governance. First, state-sponsored attribution dominates the top tier of composite impact: Yahoo (Russian FSB), Marriott-Starwood (Chinese MSS), SolarWinds (Russian SVR), Anthem and OPM (Chinese), Microsoft Exchange ProxyLogon (Chinese HAFNIUM), and Bybit (North Korean Lazarus Group). Second, detection delays of 12 to 48 months between compromise and disclosure are standard, not exceptional. Third, third-party and supply-chain vectors increasingly dominate: SolarWinds, MOVEit, Kaseya, the Snowflake customer campaign, ProxyLogon, and Change Healthcare all involved a vendor or shared platform. Fourth, regulatory consequence is increasingly individualized: the Yahoo executive forfeiture, the Equifax executive insider-trading prosecutions, the Uber CSO criminal conviction, the SolarWinds CISO SEC enforcement, and the FTC's named-officer certification framework collectively establish personal accountability for sitting officers as the standard, not the exception. One further point for acquirers: consent orders that run for 20 years outlast typical private-equity hold periods and must be evaluated as inherited liability in M&A diligence.
