Ben's Book of the Month: Hacks, Leaks, and Revelations: The Art of Analyzing Hacked and Leaked Data


Posted by Ben Rothke

Some records last a very long time. For example, Cy Young's record of 511 career baseball wins was set over 110 years ago. In football, the Tampa Bay Buccaneers' record of 26 consecutive losses was set 47 years ago.

Records for the biggest data breaches, however, tend to be much shorter-lived. The Yahoo data breach of 2013 compromised over 3 billion user accounts. In August 2022, the registration data of 1.3 billion phones in Indonesia was posted on Breached Forums. That was particularly devastating as the SIM registrations included national identity numbers, phone numbers, names of telecommunications providers, and more.

Very few people can access and analyze the breach data. But in Hacks, Leaks, and Revelations: The Art of Analyzing Hacked and Leaked Data (No Starch Press), author Micah Lee has written a fascinating guide on how to analyze the data from several significant data breaches over the last decade. What the Hacking Exposed series does for penetration testing and hacking, Hacks, Leaks, and Revelations does for researching large data sets from breached data. 

One of the many challenges in analyzing the data from these breaches is the vast size of the data sets. Standard search tools do not always work effectively on multi-terabyte data sets. Here, Lee shows the reader, in deep technical detail, how to search and analyze data at that scale.
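To give a feel for why scale changes the approach, here is a minimal Python sketch of a streaming keyword search over a directory tree. It is not taken from the book and is not presented as Lee's method; the function name search_tree and its parameters are hypothetical, chosen only for illustration. The idea is simply to read each file one line at a time rather than loading it into memory.

```python
import os


def search_tree(root, needle):
    """Yield (path, line_number, line) for every line containing needle.

    Hypothetical helper for illustration only. Each file is read one
    line at a time, so a line-oriented text file is never loaded whole.
    The match is case-insensitive.
    """
    needle = needle.lower()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going past undecodable bytes
            with open(path, "r", encoding="utf-8", errors="replace") as handle:
                for line_number, line in enumerate(handle, start=1):
                    if needle in line.lower():
                        yield path, line_number, line.rstrip("\n")


if __name__ == "__main__":
    # Assumed usage: scan the current directory for a sample term.
    for path, line_number, line in search_tree(".", "password"):
        print(f"{path}:{line_number}: {line}")
```

Because search_tree is a generator, results appear as soon as they are found instead of after the whole tree has been scanned. A plain scan like this is still slow on very large data sets, which is one reason indexing tools are worth learning.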

After a high-level overview, the book shows the reader how to use the tools needed to perform the analysis. The author guides the reader through tools and languages such as Python, Aleph, and more.

He then shows the reader how to analyze famous breached data sets such as BlueLeaks (25 years of data from law enforcement comprising 270 gigabytes of internal intelligence, memos, reports, emails, and more), Oath Keepers (anti-government militia), Heritage Foundation (conservative think tank), and more. 

In the book, Lee writes from firsthand experience. He is the former Director of Information Security for The Intercept, where investigative reporter Glenn Greenwald disclosed many of Edward Snowden's NSA documents.

He is also a co-founder of the Freedom of the Press Foundation, known for its SecureDrop platform that enables confidential and secure communication between journalists and their sources. This has proven to be a helpful platform for whistleblowers who want to securely and anonymously share their information.

The platform is needed as getting that information to journalists is not a trivial endeavor. As Glenn Greenwald writes in No Place to Hide: Edward Snowden, the NSA, and the US Surveillance State, one of the challenges in initially vetting Snowden was getting the NSA documents. 

This book is unique in that it blends deep technology, information security, privacy, politics, and more. Those who want to understand only the high-level issues can skip (not that they should) the chapters on using the command line interface, exploring datasets in the terminal, and working with data in Python. The rest of the chapters provide interesting insights into manipulating and reviewing hacked data, the volume of which will only increase in the future.

Lee is a rare author who combines deep technical knowledge with the ability to convey it in a readable manner. Hacks, Leaks, and Revelations is one of the more distinctive information security books in recent memory and a fascinating read.


Contributors
Ben Rothke

Senior Information Security Manager, Tapad


Blogs posted to the RSAConference.com website are intended for educational purposes only and do not replace independent professional judgment. Statements of fact and opinions expressed are those of the blog author individually and, unless expressly stated to the contrary, are not the opinion or position of RSA Conference™, or any other co-sponsors. RSA Conference does not endorse or approve, and assumes no responsibility for, the content, accuracy or completeness of the information presented in this blog.

