The Spamhaus Project

Between input and output: The enigma of being a Spamhaus threat investigator

Spamhaus processes millions of IPs and domains every day. Given the vast amount of incoming data, automation is a necessity. But is technology alone enough? Let’s find out. Meet one of our researchers, Jonas Arnold, as he sheds light on the threat investigators' role in Spamhaus and the fight against Internet abuse.

by The Spamhaus Team | April 03, 2024 | 6 minutes reading time

The data-driven world of Spamhaus

At Spamhaus, life as a threat investigator is all about data, preferably structured data. Data flows in from multiple sources: organizations around the world sharing data that contains no personally identifiable information (PII), our in-house spam traps, and law enforcement agencies sharing data they gain during or after investigations or takedowns.

Normalized, validated, enriched, and contextualized data becomes signal. Through careful investigation, "reputation" is attributed, which may lead to certain actions, such as listing domains or IP addresses. These listings are compiled and made available for consumption via one of the Spamhaus blocklists. Structured data goes in, and structured data comes out.

You could say that Spamhaus, like many other cybersecurity businesses, is a data business. After all, Spamhaus provides threat intelligence that must be actionable in real-time, highly accurate, and ready to be consumed by Spamhaus users (or rather, their infrastructure) within seconds of the signals being detected by Spamhaus systems.
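The article does not describe how user infrastructure actually consumes listings. One widely used convention for DNS-based blocklists (documented in RFC 5782) is to reverse the octets of an IPv4 address and look the result up under a blocklist zone. The sketch below only builds that query name and performs no network lookup. The function name `dnsbl_query_name` is hypothetical, the zone `zen.spamhaus.org` is used purely as an illustrative default, and IPv4-only handling is a simplifying assumption.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the DNS name a mail server would query for an IPv4 address.

    Hypothetical helper for illustration: it follows the general DNSBL
    convention (reversed octets + zone) and does not perform the lookup.
    """
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# 127.0.0.2 is the conventional DNSBL test address.
print(dnsbl_query_name("127.0.0.2"))
```

Whether a given name resolves, and what the answer means, depends on the blocklist being queried, so consult the operator's documentation before relying on any particular response.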

The sheer volume of incoming data makes robust automation a necessity. On average, Spamhaus processes 7.5 million IPs and 3 million domains every 24 hours – approximately 86 IPs and 34 domains per second. At this scale, non-targeted human investigations on raw data simply aren't feasible.
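The per-second figures follow from dividing the daily totals by the 86,400 seconds in a day. As a quick check, a minimal sketch (the helper name `per_second` is purely illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(daily_total: float) -> float:
    """Average number of items per second for a given 24-hour total."""
    return daily_total / SECONDS_PER_DAY


ips_per_second = per_second(7_500_000)      # daily IP volume from the text
domains_per_second = per_second(3_000_000)  # daily domain volume from the text

print(f"IPs per second:     {ips_per_second:.1f}")
print(f"Domains per second: {domains_per_second:.1f}")
```

The results are about 86.8 IPs and 34.7 domains per second, which the text states rounded down to 86 and 34.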

Technology that inspires?

Oversimplified, technology (particularly IT) is both today's greatest problem and its most promising solution. No matter what topic is hyped up, it appears to be the silver bullet we have all been waiting for - until, too often, we realize that this is not the case.

In light of this, it might appear that Spamhaus only needs a team to understand and develop big data, machine learning, and any other requirements for processing data at scale. But such an approach would miss the most critical part of the threat landscape: humans. While the vast majority of Internet abuse is generated automatically (and hence must be addressed automatically), the miscreants behind it are humans.

Weird investigators for a weird world

Dealing with humans is quirky. They make mistakes, they sleep (not always on a regular schedule), and their behavior can be erratic and unpredictable. So, can't we just let AI take care of Internet abuse? Although some Spamhaus systems make heavy use of machine learning, unfortunately, no. It does not work, at least not to the extent that manual investigation would become obsolete - been there, tried that, “We hope this message finds you well.”

This is where human investigations come into play. They are necessary to understand the socioeconomic circumstances that influence internet abuse, to track cybercrime organizations (often including names and faces), and to see how major events in the analogue or digital sphere incentivize shifts in the cyber threat landscape.

For example, a well-connected but less cybersecure country struggling with an economic crisis may become the next hotbed for hosting spam emission farms. After all, even criminal customers bring in the (urgently needed) money. Major geopolitical developments may both enable new abuse schemes and disrupt existing threats at the same time. As law enforcement in a particular region cracks down on more analogue crime, miscreants may seek alternative ways of making illicit revenue, often experimenting with cybercrime, or turning to it entirely.

Unsurprisingly, neither automated nor manual research at Spamhaus could function without the other. While the latter may seem detached from the actual data, signals, and listings, the two are closely interconnected, providing intelligence and investigation leads in both directions. The work is highly interdisciplinary, with team members from diverse backgrounds, locations, and cultures.

From a human perspective, threat intelligence often feels more like an art than a science. Investigators can spend many days (and nights) at a keyboard, aimlessly trawling through data. It is often a gut feeling that prompts a closer look. This could be triggered by a rather mundane detail, such as the district of a city appearing repeatedly, a weird infrastructure setup, or the recurrence of a characteristic behavior. The work in between can be surprisingly fuzzy and unstructured for a “structured data in, structured data out” organization.

Dropping sufficiently sized anvils from low orbit

In contrast to other threat intelligence vendors, Spamhaus's approach of making most of its threat intelligence available immediately is a double-edged sword. It provides both users and miscreants with information on specific types of internet abuse. While it may not be immediately clear to a cybercriminal how Spamhaus has identified their infrastructure, they know if they have been listed. This inevitably induces evolutionary pressure.

While human investigations tend to have a more strategic aim, dropping metaphorical anvils on internet infrastructure that poses a threat to users is only a fraction of the process. Monitoring changes afterward can be both educational and challenging - some threat actors adapt within days, while others need months to understand the situation. Operations security mistakes are likely to occur in certain phases of the cybercrime lifecycle, and it takes time and attention to spot them and conduct follow-up investigations.

Balancing great power with great responsibility

Assuming everyone has only one mailbox at their disposal, Spamhaus protects more than half of the world's population: over 4.5 billion mailboxes globally. Sitting in front of a screen that displays the Spider-Man quote, "With great power comes great responsibility," is somewhat surreal. Translating an odd “gut feeling” into concrete evidence of ongoing or future abuse can be challenging during investigations. Transparent and well-defined policies ensure a standardized approach, keeping emotions out of the process.

As the saying goes, "Not all heroes wear capes", so here’s the reality of life as a Spamhaus threat investigator. Imagine a bunch of people at home glued to their computers, attempting to protect Internet users from abuse, hunting for potential threats before they blow up, and dissecting the greatest hits and misses. For many Spamhausians, this is both a source of personal satisfaction and the motivation to keep going. But would we change it? Absolutely not.

Do you have the desire and experience to join our team of threat investigators? At Spamhaus, we all share the desire to make the Internet a safer, more accountable place. If you want to be part of this journey, please contact us to learn more.