The Network Architecture of TOP Network | Part 1 | Technical Spotlight
One of the most exciting and innovative parts of TOP Network is — perhaps unsurprisingly — the network architecture. The team has an abundance of experience in this area, having built distributed communications systems over the past 20 years. While a full discussion of the network design and topology would be too technical, we can still get a general sense of the architecture and how it ties together all the different components of TOP Network.
TOP's networking parallels the architecture of the Internet. Recall that the Internet is essentially one big network of interconnected networks. In fact, the name “Internet” comes from “internetworking,” the practice of interconnecting networks. TOP Network aims for a similar goal: to build one big overlay network consisting of many interconnected p2p networks.
Before we continue, a little background on existing networking models will help explain the network topology of TOP Network and why it is desirable.
The Client-Server Model Introduced Centralization into The Internet
Contrary to popular belief, the Internet is in large part decentralized. So why do we so often hear complaints about how centralized the Internet is? The reason is that almost all applications of the Internet are centralized, thanks to what is called the Client-Server network model.
The Client-Server model is just what it sounds like. There are clients — which are essentially the end users/devices — and a few powerful servers which exist to serve the clients. The servers do most of the heavy lifting required to run the application or website, while the clients rely on the servers to store, process and serve them data.
The World Wide Web, most mobile apps, email, and pretty much everything else employs the Client-Server network architecture. So, while the lower levels of the Internet remain decentralized, the Client-Server model has helped turn the application layer of the Internet into an archipelago of servers, making the Internet functionally centralized. To get a feel for why this is undesirable, let’s take a brief look at how routing works on today’s Internet.
Internet Routing and DNS
Devices can communicate across the various networks of the Internet using the TCP/IP protocol suite. Every device on the Internet is assigned a unique string of numbers called an IP address, which is comparable to a home address or phone number, but for the Internet. If you have the IP address of the device you want to connect with, TCP/IP will take care of routing your data packets across the Internet to the correct device or server.
However, when we browse the Web or use our mobile apps, we never concern ourselves with IP addresses. For instance, when we want to visit a website, we type a URL like www.topnetwork.org into our browser. So how does this URL get us to the server that is hosting the website we are trying to reach? The answer lies in something called the Domain Name System (DNS).
This is where things get centralized. The Internet has a relatively small number of DNS servers which act like big phonebooks, mapping domain names to IP addresses.
When you type in a URL, your device queries a DNS server, which returns the IP address associated with that domain name. This IP address can then be used to contact the server which hosts the website, which will serve the web page to your browser to be displayed on your screen.
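To make the lookup step concrete, here is a minimal sketch in Python of asking the operating system's resolver for the IP addresses behind a name. The helper name `resolve` is our own invention for illustration; `socket.getaddrinfo` is the standard library call that does the actual work.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the distinct IP addresses the system resolver reports for hostname.

    `resolve` is a hypothetical helper name used only for this illustration.
    """
    # getaddrinfo hands the name to the operating system's resolver, which
    # queries DNS for names it does not already know. A numeric IP address
    # is returned as-is, with no DNS query needed.
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)

    addresses = []
    # Each result is (family, type, proto, canonname, sockaddr), and the
    # first element of sockaddr is the IP address as text.
    for *_, sockaddr in results:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling `resolve("www.topnetwork.org")` on a connected machine would return whichever addresses DNS currently maps to that name. The result depends on the resolver and the moment of the query, so no particular output is shown here.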
As you can imagine, DNS servers are crucial. Without them, you would not be able to browse the Web without personally knowing or storing all the IP addresses of the many millions of registered websites. To illustrate how integral DNS servers are in today’s Internet: in October 2016, a distributed denial-of-service (DDoS) attack on the DNS provider Dyn left many of the Internet’s most popular sites and applications unreachable for hours for large numbers of users.
The Pitfalls of Centralized Servers
DNS servers do their job well in terms of making the Web more user friendly, but as a trade-off they introduce centralization, which comes with security risks. DNS servers are single points of failure which are vulnerable to attacks. If just a few are hacked or brought down, the consequences can be catastrophic. Not only that, but DNS servers make it much easier for governments, corporations, ISPs and hackers to see which websites users are visiting, and block certain websites.
DNS servers are just one of the many types of servers used today. Almost every application relies on centralized servers to route, process and store data. Facebook, Gmail, YouTube, Instagram, and messaging apps like WeChat and WhatsApp all use servers to route traffic and store your data. Unfortunately, large centralized servers all suffer from a few major issues:
- A single bug or hack can bring down the entire network/application.
- They act as honeypots for hackers, as all data is stored conveniently in one place.
- The owner of the server typically owns all application and user data by default, which leads to the data harvesting dilemma we are facing today.
- Blocking, censoring, or shutting down applications is possible by governments or other institutions.
What if we wanted to make an application without any servers? This is precisely the purpose of p2p networks, which are by definition serverless. With p2p networks, users can make connections with each other without needing to go through a central server.
Let’s take the example of messaging apps. With centralized messenger apps, when you want to send a message to a friend, you first need to contact the app’s servers. These chat servers essentially store a big phonebook of all the IP addresses associated with each username, so messages can be sent there and forwarded to your friends. Servers for messaging apps will also typically store chat histories so that past conversations, or messages sent while offline, can be re-displayed even after logging out and back in. Of course, it is all a bit more complicated and depends upon the application, but this is the general idea.
In a p2p network, there is no centralized server containing this big “phonebook” mapping accounts to IP addresses. So how is it possible for a p2p messenger app without any servers to work? This problem can be solved using something called a Distributed Hash Table (DHT), which is the basic building block of TOP’s network design. DHTs have many use cases, but in each case, there needs to be a routing algorithm which allows peers to find other peers in the network without using any servers.
In the case of a messenger app, instead of a few servers storing all the routing information of each user, the “phonebook” is divided amongst all users of the p2p network. Each node only stores a small subset of the phonebook so that the load is not too big.
However, if each node only stores a part of the phonebook, what happens when a node needs to contact a peer that is not stored in its own contact list, for instance to send a friend request? This is where DHT routing algorithms become important, and their specifics are what separate different implementations of DHTs.
Here’s how DHT routing algorithms generally work:
Each node is assigned a NodeID, and will store the contact information of a subset of peers in what’s called a routing table. This is like a contact list. When a node wants to contact a peer that it does not directly store in its contact list, it will ask a few of its peers in its routing table if they have information about the node in question. If they don’t, they will return the contact information of peers in their list who may have a better chance of knowing the node being searched for. The requesting node will then ask these peers if they have the contact info of the target node.
The process is iterative, and is designed to converge quickly on the target node without requiring each node to store the contact information of a huge number of peers in its own contact list.
What constitutes a “better chance of knowing” the target node depends upon the implementation, although it usually involves comparing the “closeness” of NodeIDs. TOP uses Kademlia, a DHT design in which a lookup provably needs to contact only a logarithmic number of nodes relative to the size of the network. This efficiency has led to its popular use in p2p applications such as BitTorrent.
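In Kademlia specifically, “closeness” is the bitwise XOR of two NodeIDs, read as a number. The sketch below shows how a node might rank the peers it knows by that distance. The function names and the tiny 4-bit NodeIDs are assumptions made for readability (Kademlia as originally described uses 160-bit IDs), and nothing here is taken from TOP's own code.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: the XOR of two NodeIDs, read as an integer."""
    return a ^ b


def closest_peers(target: int, peers: list[int], k: int = 2) -> list[int]:
    """Return the k known peers whose NodeIDs are closest to target."""
    return sorted(peers, key=lambda peer: xor_distance(peer, target))[:k]


# Hypothetical 4-bit NodeIDs, chosen only to keep the arithmetic easy to follow.
known_peers = [0b0001, 0b0100, 0b1100, 0b1110]
target = 0b1111

print(closest_peers(target, known_peers))  # [14, 12]
```

The peers 0b1110 and 0b1100 win because they share the longest leading run of bits with the target, so their XOR with it is smallest. These are the contacts a node would recommend when it does not know the target itself.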
Let’s take a look at a simplified version of how a lookup process might work:
Assume Alice is attempting to chat with Bob for the first time, but she does not have his contact info in her own routing table (contact list).
For the sake of simplicity, NodeIDs are represented as human readable names. In a real DHT, NodeIDs would look more like public keys.
Step 1: Alice asks Charlie, who is in her contact list, if he knows Bob. Charlie does not have Bob in his routing table, but responds with the contact info of Randy, someone he thinks may know Bob.
Step 2: Alice adds Randy to her routing table, and then contacts him as per the “recommendation” given by Charlie. Eureka! Randy has Bob in his routing table, and passes Bob’s information back to Alice.
Step 3: Alice adds Bob to her routing table, and can now send a friend request or chat with Bob if he accepts.
In a real DHT there could be tens of thousands of nodes, so there would likely be a few more iterations.
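The walk-through above can be sketched as a small simulation. Everything here is an assumption made for illustration: the 4-bit NodeIDs, the contents of each routing table, and the helper names `ask` and `lookup`. Closeness is measured with XOR as in Kademlia, but real Kademlia queries several peers in parallel and each reply carries several contacts, which this sketch leaves out.

```python
# Hypothetical 4-bit NodeIDs, picked so the walk matches the steps above.
NODE_IDS = {"Alice": 0b0001, "Charlie": 0b1000, "Randy": 0b1101, "Bob": 0b1111}

# Each node's routing table (contact list) holds only a few peers.
ROUTING_TABLES = {
    "Alice": {"Charlie"},
    "Charlie": {"Alice", "Randy"},
    "Randy": {"Charlie", "Bob"},
    "Bob": {"Randy"},
}


def distance(a: str, b: str) -> int:
    """XOR distance between the NodeIDs of two named nodes."""
    return NODE_IDS[a] ^ NODE_IDS[b]


def ask(peer: str, target: str) -> str:
    """Simulate asking `peer` about `target`.

    The peer returns the target if it is in its routing table, otherwise
    the contact in its table that is closest to the target.
    """
    table = ROUTING_TABLES[peer]
    if target in table:
        return target
    return min(table, key=lambda contact: distance(contact, target))


def lookup(start: str, target: str, max_hops: int = 10) -> list[str]:
    """Return the peers that `start` had to ask, in order, to find `target`."""
    known = set(ROUTING_TABLES[start])  # contacts learned so far (a copy)
    if target in known:
        return []
    asked: list[str] = []
    for _ in range(max_hops):
        # Only consider contacts we have not asked yet, and never ourselves.
        candidates = known - set(asked) - {start}
        if not candidates:
            break
        # Ask whichever known contact is closest to the target.
        peer = min(candidates, key=lambda c: distance(c, target))
        asked.append(peer)
        known.add(ask(peer, target))  # the start node grows its contact list
        if target in known:
            return asked
    raise LookupError(f"{target} not found from {start}")


print(lookup("Alice", "Bob"))  # ['Charlie', 'Randy']
```

Alice only ever talks to peers; at no point does she consult a server that knows the whole network.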
Notice how in this lookup process there was no need to contact a server. In general, DHTs can be used to store any (key, value) pair in a serverless manner. In the example of routing, the “key” is the NodeID, and the “value” is the IP address and port number. However, we can use DHTs for many other things as well, such as distributed file storage, p2p content distribution, instant messaging, and more.
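As a rough illustration of the (key, value) idea, the toy class below stores each pair on whichever node has the NodeID closest to the hash of the key, so every node holds only a slice of the data. The class name `ToyDHT`, the 16-bit ID space, the node IDs, and the example address are all assumptions for the sketch. For brevity the class can see every node at once, which a real DHT cannot do; a real node would locate the responsible peer by routing, as in the lookup described earlier.

```python
import hashlib

ID_BITS = 16  # hypothetical small ID space; real DHTs use far longer IDs


def key_id(key: str) -> int:
    """Hash a key into the same ID space as the NodeIDs."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()  # 160-bit digest
    return int.from_bytes(digest, "big") >> (160 - ID_BITS)


class ToyDHT:
    """Toy model: each node keeps its own small dict of (key, value) pairs."""

    def __init__(self, node_ids):
        self.storage = {node_id: {} for node_id in node_ids}

    def responsible_node(self, key: str) -> int:
        """Pick the node whose NodeID is XOR-closest to the hashed key."""
        target = key_id(key)
        return min(self.storage, key=lambda node_id: node_id ^ target)

    def put(self, key: str, value) -> None:
        self.storage[self.responsible_node(key)][key] = value

    def get(self, key: str):
        return self.storage[self.responsible_node(key)].get(key)


dht = ToyDHT([0x1111, 0x5555, 0x9999, 0xDDDD])
# The "value" here is an (IP address, port) pair from a documentation-only range.
dht.put("bob", ("203.0.113.7", 4000))
print(dht.get("bob"))  # ('203.0.113.7', 4000)
```

Swapping the value for a file chunk or a message would give the other use cases mentioned above with the same put and get operations.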
In part 2, we will see how TOP Network is made up of many DHTs which each help to facilitate a different function. With TOP’s hierarchical design, p2p networks can be infinitely layered into one big network of p2p networks. Stay tuned to find out more!