Distributed Lookup Services
Locating objects in a distributed setting
March 2009
Introduction
Distributed Lookup Services deal with the problem of locating data that is distributed among a collection of machines. In the general case, the machines are all peers (no dedicated servers) and there is no central state that needs to be maintained. The challenge is to locate a host that satisfies some property (contains some data, based on a search key) in a scalable, distributed manner.
We limit the problem to locating a node that contains information corresponding to some search key, such as a file name, rather than something more general, such as locating nodes whose content satisfies some property (e.g., locate all files that contain the word polypeptide). The key serves as a search index.
There are three basic approaches we can take to locating such data:
- centralized server
- flooding
- distributed hash tables
Centralized server
A central server is used to store the database of key → node mappings. The original Napster service is an example of this model. The central server holds the entire database of attributes and points to the content on other servers. In Napster's case, a server held an index of all music along with the machines that were hosting the content.
This is a simple model but it does not scale well because the central server can become a bottleneck. The central server is also a single point of failure: if it cannot be contacted, the entire service is dead.
Flooding
With the flooding approach, a node that needs to find the content corresponding to a certain key contacts its peers. If a peer has the needed content, it responds to the requestor. If not, it forwards the request to its own peers. If one of those peers has the needed content, it responds to whoever sent it the request, and the reply is passed back along the chain of requests. Otherwise, it too forwards the request. A time-to-live (TTL) count that is decremented with each hop limits the flood of queries through the network.
This flooding approach was the model gnutella used for distributed file sharing. The problem with the system was that peers were not always reliably up or fast (especially those connected via slow modems).
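To make the mechanism concrete, here is a minimal sketch of TTL-limited query flooding over an in-memory set of peers. The Peer class, the shared visited set used to suppress duplicate deliveries, and the return format are all invented for illustration; this is not gnutella's actual protocol, which uses message IDs and asynchronous messages rather than recursive calls.

```python
class Peer:
    """Illustrative peer: holds some content and knows its direct neighbors."""

    def __init__(self, name, content=None):
        self.name = name
        self.content = content or {}   # key -> value held locally
        self.neighbors = []            # directly connected peers

    def query(self, key, ttl=4, visited=None):
        """Return (value, owner) if the flood finds the key, else None."""
        visited = visited if visited is not None else set()
        if self.name in visited:       # already saw this query: drop it
            return None
        visited.add(self.name)
        if key in self.content:        # we hold the content: answer
            return (self.content[key], self.name)
        if ttl == 0:                   # TTL exhausted: stop the flood here
            return None
        for peer in self.neighbors:    # otherwise forward to our peers;
            result = peer.query(key, ttl - 1, visited)
            if result is not None:     # answers flow back along the chain
                return result
        return None

# Tiny example network: a -- b -- c, with c holding the file
a, b = Peer("a"), Peer("b")
c = Peer("c", {"song.mp3": "...bytes..."})
a.neighbors, b.neighbors, c.neighbors = [b], [a, c], [b]
print(a.query("song.mp3"))   # ('...bytes...', 'c')
```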
Distributed Hash Table
In a self-contained (non-distributed) system, a hash function maps a key (the thing you are searching for) to a number in the range 0 ... n-1. The content is then accessed by indexing into a hash table: looking up the value at table[hash(key)].
In a distributed implementation, known as a distributed hash table, or DHT, the hash table is distributed among a set of nodes, all of which use the same hash function. Looking up a key gives you the ID of the node that holds the data. The entire goal of a DHT is to allow anyone to find the node that corresponds to a given key. That node is responsible for holding the information associated with that key. A key difference between the DHT approach and the centralized or flooding approaches is that a specific node is responsible for holding the information relating to the key (even if it just sends a link to the content).
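As a bare-bones illustration of the idea (not any particular DHT), every node can run the same hash function and map a key to one of n numbered nodes; the SHA-1 choice and the node count are assumptions for illustration:

```python
import hashlib

NUM_NODES = 16   # assume a fixed set of n nodes, numbered 0..n-1

def node_for(key: str) -> int:
    """Every node computes the same hash, so they all agree on the owner."""
    digest = hashlib.sha1(key.encode()).hexdigest()
    return int(digest, 16) % NUM_NODES

print(node_for("polypeptide.txt"))   # the same node ID on every machine
```

Note that this simple mod-n mapping forces almost every key to move to a different node whenever n changes; ring-based schemes such as Chord, described next, limit that disruption.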
We'll now take a brief, cursory look at two approaches to DHTs:
Chord
Think of a sequence of numbers arranged in a logical ring: 0, 1, 2, ..., n-1, looping back to 0. Each node in the system occupies a position in this ring: the number you get by hashing its IP address and taking the value modulo the size of the ring [hash(IP) mod n].
A (key, value) pair is stored at the node that matches hash(key). If there is no node at that position, the next node ahead of that number is responsible for storing the data: the first node you hit if you traverse the ring clockwise starting from the hash(key) position. This node is called the successor node.
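A minimal sketch of this placement rule, assuming the set of node positions is already known to the caller and picking 2^16 as an arbitrary ring size:

```python
import hashlib
from bisect import bisect_left

RING = 2**16   # assumed ring size for illustration

def chord_hash(s: str) -> int:
    """Map a key or an IP address to a position on the ring."""
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % RING

def successor(node_positions, key_hash):
    """First node at or clockwise after key_hash, wrapping past 0."""
    nodes = sorted(node_positions)
    i = bisect_left(nodes, key_hash)
    return nodes[i % len(nodes)]   # i == len(nodes) wraps to the first node

nodes = [chord_hash(ip) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]
print(successor(nodes, chord_hash("song.mp3")))   # node that stores the key
```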
When a node joins the network at some position j, where j = hash(node's IP), the (key, value) data that now maps to the new node (keys whose hashes fall between the predecessor's position and j) has to migrate from the successor node to this new node.
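A sketch of that hand-off in terms of ring positions: the new node takes over the successor's entries whose hashes fall between its predecessor's position and its own. The store layout and the positions in the example are invented for illustration.

```python
def in_interval(h, pred, new):
    """True if position h lies in the clockwise ring interval (pred, new]."""
    if pred < new:
        return pred < h <= new
    return h > pred or h <= new   # the interval wraps past position 0

def keys_to_migrate(successor_store, pred_pos, new_pos):
    """successor_store maps key-hash -> value held at the successor node.
    Returns the entries that now belong to the node that joined at new_pos."""
    return {h: v for h, v in successor_store.items()
            if in_interval(h, pred_pos, new_pos)}

# A node joins at position 5000 with a predecessor at 2000, so it takes
# over the successor's entries whose hashes fall in (2000, 5000].
store = {100: "a", 4000: "b", 8500: "c"}
print(keys_to_migrate(store, pred_pos=2000, new_pos=5000))   # {4000: 'b'}
```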
For routing queries, a node only needs to know its successor node. Queries can be forwarded from successor to successor until the node that holds the value is found. To improve performance and avoid the worst case of traversing the entire ring, nodes may also maintain a finger table: additional routing information about other nodes at various distances around the ring. When a node needs to locate one that is farther away, it contacts the furthest node it knows about whose ID is less than hash(key) and asks it to locate the node for hash(key). The process can be repeated recursively to locate the node.
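A sketch of that forwarding rule: given whatever other nodes a peer happens to know about (the table contents below are invented), each hop jumps to the farthest known node that still precedes hash(key) on the ring, and that node repeats the process.

```python
RING = 2**16   # assumed ring size, matching the earlier sketch

def clockwise_dist(a, b):
    """Distance travelling clockwise around the ring from position a to b."""
    return (b - a) % RING

def next_hop(current, known_nodes, key_hash):
    """Farthest node current knows about that still precedes key_hash.
    Returns None if no known node is closer, i.e. the successor is next."""
    best, best_dist = None, 0
    target_dist = clockwise_dist(current, key_hash)
    for n in known_nodes:
        d = clockwise_dist(current, n)
        if 0 < d < target_dist and d > best_dist:
            best, best_dist = n, d
    return best

# A node at position 1000 knows a handful of other nodes; a key hashes to 60000.
print(next_hop(1000, [1200, 5000, 20000, 42000], 60000))   # 42000
```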
CAN (Content-Addressable Network) design
Think of a grid and two separate hash functions, hx(key) and hy(key). The key is hashed with both of them: a = hx(key) gives you the x coordinate and b = hy(key) gives you the y coordinate. The node responsible for the point (a, b) stores (key, value): the key and its value.
Each node is also mapped onto this logical grid and is responsible for managing the values that fall within a rectangular sub-grid: that is, some range (x0..x1, y0..y1).
A node only knows about its immediate neighbors. To look up a key and route the message to the node that holds the data, a node forwards the request to the neighbor that minimizes the remaining distance to the destination coordinates.
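A minimal sketch of the two-dimensional mapping and the greedy next-hop choice. The 256 x 256 wrap-around coordinate space and the trick of deriving hx and hy by salting a single hash are assumptions for illustration, as is representing each neighbor by the centre of its zone.

```python
import hashlib

GRID = 256   # assumed coordinate space: a 256 x 256 torus

def h(key: str, salt: str) -> int:
    """One hash function per dimension, derived by salting the key."""
    return int(hashlib.sha1((salt + key).encode()).hexdigest(), 16) % GRID

def coords(key: str):
    return (h(key, "x"), h(key, "y"))   # (a, b) = (hx(key), hy(key))

def dist(p, q):
    """Manhattan distance on the torus (both dimensions wrap around)."""
    dx = min(abs(p[0] - q[0]), GRID - abs(p[0] - q[0]))
    dy = min(abs(p[1] - q[1]), GRID - abs(p[1] - q[1]))
    return dx + dy

def next_hop(neighbor_centres, target):
    """Forward to the neighbor whose zone centre is closest to the target."""
    return min(neighbor_centres, key=lambda centre: dist(centre, target))

target = coords("song.mp3")
print(target, next_hop([(10, 10), (128, 40), (200, 250)], target))
```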
A new node is inserted by the following process:
- pick a random pair of values in the grid: (p,q).
- contact some node in the system and ask it to look up the node that is responsible for (p,q).
- Now negotiate with that node to split its zone in half. The new node will own half of the area (a sketch of this split follows the list).
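A sketch of the zone split, assuming zones are axis-aligned rectangles stored as (x0, x1, y0, y1) and that the wider dimension is the one halved; the representation is invented for illustration.

```python
def split_zone(zone):
    """zone = (x0, x1, y0, y1); returns (owner_half, new_node_half)."""
    x0, x1, y0, y1 = zone
    if (x1 - x0) >= (y1 - y0):            # split the wider dimension
        mid = (x0 + x1) // 2
        return (x0, mid, y0, y1), (mid, x1, y0, y1)
    mid = (y0 + y1) // 2
    return (x0, x1, y0, mid), (x0, x1, mid, y1)

# The node that owns the whole 0..256 square splits with a newcomer:
print(split_zone((0, 256, 0, 256)))   # ((0, 128, 0, 256), (128, 256, 0, 256))
```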
The advantage of this approach is that it has been shown to be highly scalable: with a uniform distribution of nodes, the number of neighbors per node is 2d, where d is the number of dimensions, regardless of the total number of nodes in the system.