In early October, the FBI announced the arrest of a man named Ross William Ulbricht; he was charged with narcotics trafficking. No ordinary drug pusher, Ulbricht was the founder and chief operator of the notorious online black market Silk Road. The arrest shone a light into one of the best-kept secrets in our increasingly connected society: the existence, and potential, of the Deep Web.
As SEOs, we spend every workday obsessing over search engines: what they can see, what they’ll praise or punish, and how to improve our clients’ rankings in the results pages. We think in links, in social signals, and in search phrases, because at the end of the day we are concerned with what happens when a web user types a query into a search engine. For many internet users, if not most, that’s how the Internet works: you search for something and the relevant results pop up. But the pages indexed by web crawlers are just a tiny fraction of all the pages on the World Wide Web, and exploring the unsearchable ones is a descent into largely unknown territory. And I find it fascinating.
For a web page to be indexed, it generally needs to be static and linked from other pages, so that a crawler can reach it. Deep web pages, in contrast, are not indexed by search engines, and thus never show up in the results. These pages store their content in searchable databases, but the pages themselves do not exist until a specific search calls up the data and generates a dynamic page on which it can be viewed. While most users don’t realize it, they’ve encountered the deep web at some point in their online travels; much of it consists of things like catalog search results, flight schedules, and research data, all of which adds up to an estimated 7,750 terabytes of information. It’s believed that the surface web, our bread and butter, makes up only 1% of the entire World Wide Web.
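To make the difference concrete, here is a minimal Python sketch of a toy site. Everything in it is hypothetical and invented for illustration: the page paths, the flight records, and the `crawl` and `render_results` helpers. Real crawlers are far more sophisticated, but the basic idea holds: a crawler that only follows links finds the static pages, while a results page exists only for the moment someone submits a query.

```python
from html import escape

# Hypothetical static site: each page path maps to the paths it links to.
STATIC_LINKS = {
    "/": ["/about", "/search"],
    "/about": ["/"],
    "/search": [],  # a search form; it links to no individual results
}

# Hypothetical database sitting behind the search form.
FLIGHTS = [
    {"origin": "YVR", "dest": "LHR", "departs": "18:05"},
    {"origin": "YVR", "dest": "NRT", "departs": "13:40"},
]


def render_results(dest):
    """Build a results page on demand for one query; nothing is stored."""
    rows = [f for f in FLIGHTS if f["dest"] == dest]
    items = "".join(
        f"<li>{escape(f['origin'])} to {escape(f['dest'])} at {escape(f['departs'])}</li>"
        for f in rows
    )
    return f"<html><body><ul>{items}</ul></body></html>"


def crawl(start):
    """Follow links from a start page and return every path reached."""
    seen, queue = set(), [start]
    while queue:
        path = queue.pop()
        if path in seen:
            continue
        seen.add(path)
        queue.extend(STATIC_LINKS.get(path, []))
    return seen


indexed = crawl("/")            # only the linked, static pages
page = render_results("LHR")    # created only because a query was made
print(sorted(indexed))
print(page)
```

The crawler ends up knowing about the home page, the about page, and the search form, but never about the flight results, because no link points to them. That is the gap between the surface web and the deep web in miniature.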
Of course, the most famous part of the deep web is the portion whose pages operate in almost complete anonymity, which has made it a haven for illegal activity and black markets such as Silk Road. These sites rely on software called Tor, which conceals their IP addresses by routing traffic through several servers, making the sites very difficult to locate. If you searched for them in Google, you’d come up with absolutely nothing, because as far as the search engines know, these sites simply do not exist.
This anonymity hasn’t only been used by pornographers and drug lords; the deep web has been hugely helpful in countries where the internet is strictly regulated, because it offers a place for activists to communicate and share information that would get them arrested or killed in real life. In a world of NSA tracking, where your data is a huge commodity, there’s definitely an appeal to the concept of being able to navigate the web without being traced or tracked.
Of course, web pages that specifically avoid being crawled by search engines aren’t of much use to SEOs. But I think it’s amazing to realize that there is a gigantic world beneath our virtual feet; it’s deeply humbling to remember that, at the end of the day, we’re mere drops in the ocean.
SEO news blog post by Mia Steinberg @ 10:54 am on November 6, 2013