Onion routing is a technique for anonymous communication over a computer network. In an onion network, messages are encapsulated in layers of encryption, analogous to layers of the vegetable onion. The encrypted data is transmitted through a series of network nodes called onion routers, each of which "peels" away a single layer, uncovering the data's next destination. When the final layer is decrypted, the message arrives at its destination. The sender remains anonymous because each intermediary knows only the location of the immediately preceding and following nodes
Development and implementation
Onion routing was developed in the mid-1990s at the U.S. Naval Research Laboratory by employees Paul Syverson, Michael Reed, and David Goldschlag to protect U.S. intelligence communications online. It was further developed by the Defense Advanced Research Projects Agency (DARPA) and patented by the Navy in 1998.
Computer scientists Roger Dingledine and Nick Mathewson joined Syverson in 2002 to develop what would become the largest and best known implementation of onion routing, Tor, then called The Onion Routing project or TOR project. After the Naval Research Laboratory released the code for Tor under a free license, Dingledine, Mathewson and five others founded The Tor Project as a non-profit organization in 2006, with the financial support of the Electronic Frontier Foundation and several other organizations.
Data structure
In this example onion, the source of the data sends the onion to Router A, which removes a layer of encryption to learn only where to send it next and where it came from (though it does not know if the sender is the origin or just another node). Router A sends it to Router B, which decrypts another layer to learn its next destination. Router B sends it to Router C, which removes the final layer of encryption and transmits the original message to its destination.
An onion is the data structure formed by "wrapping" a message with successive layers of encryption to be decrypted ("peeled" or "unwrapped") by as many intermediary computers as there are layers before arriving at its destination. The original message remains hidden as it is transferred from one node to the next, and no intermediary knows both the origin and final destination of the data, allowing the sender to remain anonymous.
Onion creation and transmission
To create and transmit an onion, the originator selects a set of nodes from a list provided by a "directory node". The chosen nodes are arranged into a path, called a "chain" or "circuit", through which the message will be transmitted. To preserve the anonymity of the sender, no node in the circuit is able to tell whether the node before it is the originator or another intermediary like itself. Likewise, no node in the circuit is able to tell how many other nodes are in the circuit and only the final node, the "exit node", is able to determine its own location in the chain.
Using asymmetric key cryptography, the originator obtains a public key from the directory node to send an encrypted message to the first ("entry") node, establishing a connection and a shared secret ("session key"). Using the established encrypted link to the entry node, the originator can then relay a message through the first node to a second node in the chain using encryption that only the second node, and not the first, can decrypt. When the second node receives the message, it establishes a connection with the first node. While this extends the encrypted link from the originator, the second node cannot determine whether the first node is the originator or just another node in the circuit. The originator can then send a message through the first and second nodes to a third node, encrypted such that only the third node is able to decrypt it. The third, as with the second, becomes linked to the originator but connects only with the second. This process can be repeated to build larger and larger chains, but is typically limited to preserve performance.
When the chain is complete, the originator can send data over the Internet anonymously. When the final recipient of the data sends data back, the intermediary nodes maintain the same link back to the originator, with data again layered, but in reverse such that the final node this time removes the first layer of encryption and the first node removes the last layer of encryption before sending the data, for example a web page, to the originator.
Weaknesses
See also: Tor (anonymity network) ยง Weaknesses
Timing analysis
See also: Traffic analysis
One of the reasons typical Internet connections are not considered anonymous is the ability of Internet service providers to trace and log connections between computers. For example, when a person accesses a particular website, the data itself may be secured through a connection like HTTPS such that your password, emails, or other content is not visible to an outside party, but there is a record of the connection itself, what time it occurred, and the amount of data transferred. Onion routing creates and obscures a path between two computers such that there's no discernible connection directly from a person to a website, but there still exist records of connections between computers. Traffic analysis searches those records of connections made by a potential originator and tries to match timing and data transfers to connections made to a potential recipient. For example, a person may be seen to have transferred exactly 51 kilobytes of data to an unknown computer just three seconds before a different unknown computer transferred exactly 51 kilobytes of data to a particular website. Factors that may facilitate traffic analysis include nodes failing or leaving the network and a compromised node keeping track of a session as it occurs when chains are periodically rebuilt.
Garlic routing is a variant of onion routing associated with the I2P network that encrypts multiple messages together to make it more difficult for attackers to perform traffic analysis.
Exit node vulnerability
Although the message being sent is transmitted inside several layers of encryption, the job of the exit node, as the final node in the chain, is to decrypt the final layer and deliver the message to the recipient. A compromised exit node is thus able to acquire the raw data being transmitted, potentially including passwords, private messages, bank account numbers, and other forms of personal information. Dan Egerstad, a Swedish researcher, used such an attack to collect the passwords of over 100 email accounts related to foreign embassies.
Exit node vulnerabilities are similar to those on unsecured wireless networks, where the data being transmitted by a user on the network may be intercepted by another user or by the router operator. Both issues are solved by using a secure end-to-end connection like SSL or secure HTTP (HTTPS). If there is end-to-end encryption between the sender and the recipient, then not even the last intermediary can view the original message.