But it can also reveal a treasure trove of information about the connected device and, by extension, sensitive data it might handle. Shodan stands out for highlighting this inadvertent exposure of information by device owners.

Launched by programmer John Matherly in 2009, Shodan is a search engine that enables users to scour the web for webcams, routers and other connectable smart products. It operates 24/7 with the help of web servers located around the world, providing 56 percent of Fortune 100 companies and over a thousand universities with the intelligence to discover and track their internet-connected devices. Those organizations can then use that information to perform empirical market research in an attempt to advance their brand and business.

Of course, Shodan has other uses besides helping enterprises gain a competitive edge. Researchers often use “the scariest search engine on the Internet” to locate potential security risks. For instance, Matherly and other Shodan users found “uncounted numbers” of open and potentially exploitable industrial control systems (ICS) back in 2012. Around that same time, Bob Radvanovsky and Jacob Brodsky of security consultancy InfraCritical surveyed the service and uncovered 500,000 computers overseeing nuclear power plants and other utilities. The United States’ Department of Homeland Security took that list, narrowed it down to 7,200 important targets and contacted the owners to impress upon them the importance of securing their web-connected devices.

Users have made some remarkable discoveries by searching through Shodan’s servers, including incidents involving the exposure of sensitive information. Here are some of the biggest revelations to make headlines in recent years.

Database of 560 Million Previously Compromised Credentials

During a regular security audit of Shodan, researchers at the Kromtech Security Center came across 313 large databases, each holding more than 1 gigabyte of data and in some cases several terabytes. One of those databases was a MongoDB instance running with its default configuration, which allowed the researchers to view its contents. When they peered inside, they unearthed more than 560 million email addresses and passwords collected from other sources. Kromtech Security Center made Troy Hunt of Have I Been Pwned aware of the database, which was hosted on a cloud-based IP at the time of discovery. By running a sample set against his service, Hunt identified 243,692,899 unique emails. Nearly all of them were already in Have I Been Pwned as a result of “mega-breaches” like LinkedIn and Dropbox. It’s unclear who owned the vulnerable database, though Kromtech says a name found in the database credentials suggests it belonged to someone named “Eddie.”

13 Million Users’ Account Credentials Potentially Exposed

Speaking of Kromtech, security researcher Chris Vickery queried Shodan for vulnerable MongoDB instances listening for incoming connections on port 27017, MongoDB’s default port. He then fed this information into MongoVue, a tool for browsing databases. In so doing, he came across a security issue on the web servers for MacKeeper, software developed by Kromtech. The weakness discovered by Vickery allowed anyone to view the information contained in the databases without any authentication. As reported by Krebs on Security, a look into the databases revealed 21 gigabytes of data including the names, passwords and other account information for 13 million MacKeeper users. After Vickery reported the issue to the technology company, Kromtech released a statement thanking Vickery for his discovery and explaining its data storage policies: “The only customer information we retain are name, products ordered, license information, public ip address and their user credentials such as product specific usernames, password hashes for the customer’s web admin account where they can manage subscriptions, support, and product licenses.” Kromtech also confirmed that it had secured the databases.
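For defenders who want to run the same kind of check against their own address space, a Shodan query is simply a string of filters. The sketch below is illustrative rather than a record of Vickery's actual search: it assumes port 27017 (MongoDB's default), uses the documentation-only range 203.0.113.0/24 as a stand-in for a network you own, and the helper names `build_query` and `build_search_url` are hypothetical. It only builds the query and the REST URL; it does not contact Shodan.

```python
from urllib.parse import urlencode

# Shodan's REST search endpoint; an API key is required to actually call it.
SHODAN_SEARCH_ENDPOINT = "https://api.shodan.io/shodan/host/search"


def build_query(product: str, port: int, network: str) -> str:
    """Combine Shodan search filters into a single query string."""
    return f"product:{product} port:{port} net:{network}"


def build_search_url(api_key: str, query: str) -> str:
    """Return the search URL for a query (no request is sent here)."""
    return f"{SHODAN_SEARCH_ENDPOINT}?{urlencode({'key': api_key, 'query': query})}"


# Placeholder values: replace the range with one you are authorized to audit.
query = build_query("MongoDB", 27017, "203.0.113.0/24")
url = build_search_url("YOUR_API_KEY", query)

print(query)
print(url)
```

Restricting the search with the `net:` filter keeps the audit scoped to assets the searcher is responsible for, which is the use case the organizations mentioned earlier rely on.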

750 MB from Thousands of etcd Servers Disclosed

Researcher Giovanni Collazo conducted a simple search of Shodan by querying for “etcd,” a type of database which stores passwords, configuration settings and other sensitive information across a cluster of machines. The search yielded 2,284 etcd servers that were open to the web because their authentication mechanism was disabled, as it is by default. That meant each server’s stored credentials were publicly viewable. The researcher didn’t test any of the credentials he found. But given the sheer number of credentials uncovered, Collazo suspects that at least some of them would have worked.
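To see why an unauthenticated etcd server is so revealing, it helps to look at what a key listing contains. The following is a minimal sketch for auditing your own cluster, assuming the JSON shape returned by etcd's v2 keys API (a `node` with nested `nodes`). The sample document, the hint words and the function name `flag_sensitive_keys` are all invented for illustration, and the function reports key names only, never values.

```python
import json

# Hypothetical substrings that suggest a key holds a credential.
SENSITIVE_HINTS = ("password", "secret", "token", "key")


def flag_sensitive_keys(node: dict) -> list[str]:
    """Recursively collect key names that look like stored credentials."""
    flagged = []
    for child in node.get("nodes", []):
        if child.get("dir"):
            # Directories hold further nodes, so descend into them.
            flagged.extend(flag_sensitive_keys(child))
        elif any(hint in child["key"].lower() for hint in SENSITIVE_HINTS):
            flagged.append(child["key"])
    return flagged


# Invented sample in the shape of an etcd v2 recursive key listing.
sample = json.loads("""
{
  "action": "get",
  "node": {
    "dir": true,
    "nodes": [
      {"key": "/app", "dir": true, "nodes": [
        {"key": "/app/db_password", "value": "example-only"},
        {"key": "/app/log_level", "value": "info"}
      ]},
      {"key": "/aws_secret_access_key", "value": "example-only"}
    ]
  }
}
""")

flagged = flag_sensitive_keys(sample["node"])
print(flagged)
```

If a listing like this is reachable from the internet without credentials, every flagged entry should be treated as exposed and rotated.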

Tens of Thousands of Computers Infected by DOUBLEPULSAR

In April 2017, the Shadow Brokers group published a dump of internal NSA documents containing exploits, hacking tools, and attack code. Among the leaked resources was DOUBLEPULSAR, a backdoor dropped by exploits like EternalBlue, EternalChampion, EternalSynergy, and EternalRomance onto vulnerable machines. The backdoor allows attackers to run additional malicious code on compromised machines. Various detection scripts written shortly after the Shadow Brokers’ data dump revealed that DOUBLEPULSAR was already active on as many as 50,000 machines. Matherly said at the time that the numbers could be much higher. “Shodan has currently indexed more than 2 million IPs running a public SMB service on port 445. 0.04 percent of SMB services that we’re observing in our data firehose are susceptible to DOUBLEPULSAR which results in a projection of ~100,000 devices on the Internet that are impacted,” Matherly wrote in an email, as quoted by CyberScoop. “Shodan has already indexed 45k confirmed [infections] so far.” A scan conducted by security firm Below0Day at around the same time detected 35,000 infections by DOUBLEPULSAR.

5.12 PETABYTES of Data Uncovered

An analysis conducted by Shodan uncovered nearly 4,500 servers running the Hadoop Distributed File System (HDFS). That’s far fewer than the 47,820 MongoDB servers detected online. But while the MongoDB instances exposed 25 terabytes of data, the HDFS servers exposed 5,120 terabytes. That’s 5.12 petabytes of information.
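The gap between the two technologies is easier to appreciate per server. The short calculation below is a rough sketch using the figures above; it assumes decimal units (1 petabyte = 1,000 terabytes, which is what the 5.12 figure implies) and treats “nearly 4,500” as exactly 4,500, so the averages are approximate.

```python
TB_PER_PB = 1000  # decimal units, consistent with 5,120 TB = 5.12 PB
GB_PER_TB = 1000

# Figures from the analysis; the HDFS server count is rounded ("nearly 4,500").
hdfs_servers, hdfs_tb = 4500, 5120
mongo_servers, mongo_tb = 47820, 25

hdfs_pb = hdfs_tb / TB_PER_PB
hdfs_avg_tb = hdfs_tb / hdfs_servers
mongo_avg_gb = mongo_tb * GB_PER_TB / mongo_servers

print(f"HDFS total: {hdfs_pb:.2f} PB")
print(f"HDFS average per server: {hdfs_avg_tb:.2f} TB")
print(f"MongoDB average per server: {mongo_avg_gb:.2f} GB")
```

Under those assumptions, the average exposed HDFS server held a little over a terabyte, while the average exposed MongoDB server held roughly half a gigabyte.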

The Dual Use of Shodan

Clearly, security researchers routinely use Shodan to spot potential sources of data exposure online. But they’re not the only ones searching the web for Internet-connected devices, nor are they alone in using Shodan to their advantage. For instance, bad actors have come up with scripts that scan the service for the IPs of vulnerable Memcached servers. Malefactors can then use those insecure assets to launch distributed denial-of-service (DDoS) attacks against a target. Also available are tools like AutoSploit, a marriage of Shodan and Metasploit which allows users to hack improperly secured Internet of Things (IoT) devices according to platform-specific search queries. Given these abuses, it’s important that security researchers who use Shodan notify device owners of their exposure. They can’t force organizations to secure their IoT products and other vulnerable assets. But they can raise awareness of those issues and in so doing promote best security practices for devices more generally.