Assignment 7

Due April 3, 2016 11:59 via sakai

Reference material

FAQ and addenda

Be sure to periodically check the FAQ for updates and additional details.

Objective

We covered DNS when we discussed the application layer and saw that it is a critical component in mapping user-friendly domain names to IP addresses. This assignment is to write a very simple DNS server. All it will do is accept queries for domain names and return the corresponding addresses. The server will be single-threaded, will not support DNS recursion (forwarding requests to other name servers), and will return only IP addresses (no name server information, mail servers, etc.) that correspond to the name.

Languages

You may do this assignment in Java, C, C++, or Python. Be aware that you will need to look at bit ranges and move bits around.

Groups

You may work on this assignment either individually or in a group of up to four members. Your work will be judged to a higher standard if you have more group members. With a group of four, for example, I expect flawless program execution, cleanly formatted code with good comments, and a report with detailed test cases explained that is so beautifully written that I will weep tears of joy upon seeing it.

Deadline

Be sure to allot sufficient time to do this assignment. If you are comfortable with programming, you should be able to finish this assignment within an hour. Do not assume that, however. You may find yourself spending a lot more time than you budgeted in parsing requests and responses, formatting a date, or wondering why your client is not connecting to your server.

Develop incrementally. Take a look at the recitation notes as an example of how you can add tiny amounts of code and then test it. Do not write a few hundred lines of code and then try to debug it. That is a recipe for disaster.

Be aware that there is an exam coming up soon as well as possible other projects and activities in your life. Start early and budget your time. Make sure that you allot enough time for testing and writing your report.

Background

First be sure that you are familiar with UDP sockets programming. If working in C, check out the TCP and UDP sockets tutorials and run the UDP sample code to make sure that you have no problems compiling or communicating. You can read the Oracle documentation for UDP programming with Java: All About Datagrams.

Operations

Here’s what your program should do:

1. Read a file of domain names and addresses. The format is plain text with an IP address at the start of a line followed by white space (spaces and/or tabs) followed by a domain name. You can download this sample hosts file.
```
# this is a comment
12.14.101.22    www.test.org
51.234.1.1  another.domain.blah
```
2. Wait for a UDP packet to arrive on a port you specify.
3. Parse the DNS header.
4. If it’s not a valid query, return a “format error” (code 1) or “not implemented error” (code 4).
5. Look up the given domain name in the list of names that you read in initially.
6. Create a reply message. If you find a match, add an answer record identifying the name to address mapping. If you don’t find a match then don’t add an answer record.
7. Send the response.
8. Go back to step 2 and wait for the next packet.

Step-by-step guidance

Step 0. Command names and command line processing

Your server will be named dns-server. You do not need to write a client since you can use existing DNS query clients such as dig.

To make it easy to select a port number, the server should accept a port number as an option on the command line with the -p parameter. To make it easy for me to test different host files, the server should accept a host file with the -f parameter. The command syntax is:

dns-server [-p port#] [-f hostfile]

No extra arguments should be allowed. Both the -p and -f arguments are optional and you can have hard-coded defaults. The -p and -f parameters can appear in any order; use getopt to parse them. If you’re not familiar with this, here’s a quick tutorial and demo.

Step 1. Read about the DNS protocol

A really good explanation of the protocol and structures is presented in The TCP/IP Guide. Also, some nice tables & explanations are presented here.

Step 2. Write a simple server that receives a UDP packet on a port

Make sure this part works! If you’re programming in C, you can use code from the UDP sockets tutorial. Basically, you do the same as with TCP but don’t listen or accept:

s = socket(): create a socket
bind(s, address): bind the socket to any valid address (INADDR_ANY) and a port number that you define. You will need to use this port number in the client to send messages.
size = recvfrom(s, buf, maxsize, 0, &remoteaddr, &addrlen): receive a message
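
The three calls above could be put together roughly like this in C. Treat it as a minimal sketch rather than the tutorial's code: the function names open_udp_socket and receive_packet are invented here, and error handling is reduced to returning -1.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Create a UDP socket bound to INADDR_ANY on the given port.
 * Passing port 0 lets the operating system pick any free port.
 * Returns the socket descriptor, or -1 on failure. */
int open_udp_socket(unsigned short port)
{
    struct sockaddr_in addr;
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(s);
        return -1;
    }
    return s;
}

/* Block until one datagram arrives and store the sender in *remote,
 * so that the reply can later be sent back to it.
 * Returns the number of bytes received, or -1 on error. */
ssize_t receive_packet(int s, unsigned char *buf, size_t maxsize,
                       struct sockaddr_in *remote)
{
    socklen_t addrlen = sizeof(*remote);

    return recvfrom(s, buf, maxsize, 0, (struct sockaddr *)remote, &addrlen);
}
```

The server's main loop would call receive_packet repeatedly on the same socket; there is no listen or accept step with UDP.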

Test it out via a DNS query program such as dig. For example, start your server in one window and in another run:

dig @localhost -p 1153 test.mydomain.abc

If you are running dig from another machine, replace localhost with the name of the machine that is running your server. Replace 1153 with the port number on which your server is receiving messages. A port number is a 16-bit value in the range 1 through 65535. Port numbers under 1024 are reserved for well-known services and require administrative privileges to bind, so pick a number in the range 1024 through 65535.

Don’t expect dig to show anything yet since your server is not returning any data; dig will eventually time out. Your server should, however, print debugging messages so you can see what arrived: the number of bytes received and, ideally, a dump of the bytes themselves.

Step 3. Parse the DNS message

The DNS messaging protocol was originally defined in RFC 883 but there have been a few RFCs that made updates to the core protocol. The current core specification is RFC 1035.

Headers are nicely illustrated in various pages in The TCP/IP Guide as well as in: Network Sorcery.

I am providing you with the C data structure for the three DNS headers that you need to handle. If you’re programming in Java, you’ll have to extract the bit ranges explicitly with shifts and masks since the language does not support bit fields.

Step 3a. Parse the DNS header

The DNS format is the same for requests and responses:

[ DNS header (fixed size) ]
[ zero or more question records ]
[ zero or more answer records ]
[ zero or more authority records (identifies the name servers for this zone) ]
[ zero or more additional records ]

The header is always present. Among other things, it identifies that the message is a DNS query and identifies which and how many other sections are present. The sections after the header are:

Question: The query that is issued to the name server (typically the domain name). You expect to get this.
Answer: The answer to a query. If you were able to find an address for a name, you will send back an answer record that contains the IP address. Your response will also contain the original query in the question section. If you cannot match the name, then return a response code (RCODE) of 3, which is the code for a name error (no such name), defined in RFC 1035.
Authority: This will contain a set of records that identify the zone your name server is responsible for and other name servers. You can ignore this.
Additional: This is a set of records that hold any additional information. You can ignore this as well.

Your server will receive DNS queries. You can check that a message is indeed a query by looking at the QR field of the header. A value of 0 means it is a query. The header also contains counts for each of the other sections (questions, answers, name servers, additional records).

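For debugging, print a few of the fields to make sure they make sense. For a dig query, the query/response (qr) field should be 0, recursion desired (rd) should be 1, question count (qd) should be 1, and the three other counts should normally be 0. Recent versions of dig add an EDNS OPT record to the query by default, in which case the additional record count will be 1; running dig with +noedns suppresses it.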
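
If you prefer not to rely on bit fields (or are mirroring what you would do in Java), the header can be pulled apart with shifts and masks. The sketch below assumes the 12-byte header layout from RFC 1035; struct dns_header_fields and the function names are made up for illustration and are not the structure provided with the assignment.

```c
#include <stddef.h>
#include <stdint.h>

#define DNS_HEADER_LEN 12

/* Hypothetical unpacked view of the header (not the course-provided struct). */
struct dns_header_fields {
    uint16_t id;
    unsigned int qr, opcode, aa, tc, rd, ra, rcode;
    uint16_t qdcount, ancount, nscount, arcount;
};

/* Read a 16-bit big endian (network order) value from two bytes. */
static uint16_t be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Unpack the fixed-size DNS header at the start of msg.
 * Returns 0 on success, -1 if the message is too short. */
int parse_dns_header(const unsigned char *msg, size_t len,
                     struct dns_header_fields *h)
{
    if (len < DNS_HEADER_LEN)
        return -1;
    h->id      = be16(msg);
    h->qr      = (msg[2] >> 7) & 0x1;   /* 0 = query, 1 = response */
    h->opcode  = (msg[2] >> 3) & 0xf;   /* 0 = standard query */
    h->aa      = (msg[2] >> 2) & 0x1;
    h->tc      = (msg[2] >> 1) & 0x1;
    h->rd      =  msg[2]       & 0x1;   /* recursion desired */
    h->ra      = (msg[3] >> 7) & 0x1;
    h->rcode   =  msg[3]       & 0xf;
    h->qdcount = be16(msg + 4);
    h->ancount = be16(msg + 6);
    h->nscount = be16(msg + 8);
    h->arcount = be16(msg + 10);
    return 0;
}
```

Because each field is assembled byte by byte, this version works the same on big endian and little endian machines and needs no ntohs calls.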

Watch out for network vs. host byte order

One thing to realize is that the structures are defined in “network byte order,” which is big endian and differs from the little endian format used by the Intel architecture (and most machines today). The Java Virtual Machine (JVM) stores its data in big endian format but uses the native machine’s byte order when operating on it. Python also uses the native machine’s byte order. You will, most likely, need to flip bytes around to ensure that multi-byte values are in the correct order.

With big endian, the most significant byte is in low memory. A 32-bit value

0x12345678

will be stored (from low to high memory) as

0x12, 0x34, 0x56, 0x78

On an Intel processor, the value will be stored with the most significant byte in high memory:

0x78, 0x56, 0x34, 0x12

Hence, if you’re reading a 16-bit or 32-bit value from a network format structure, or writing one into it, you will need to convert. Four macros do this:

unsigned int network_format = htonl(local_format);: convert a long value (32 bits) from the host format to the network format
unsigned short network_format = htons(local_format);: convert a short value (16 bits) from the host format to the network format
unsigned int local_format = ntohl(network_format);: convert a long value (32 bits) from the network format to the host format
unsigned short local_format = ntohs(network_format);: convert a short value (16 bits) from the network format to the host format
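
To see what the conversions do to the bytes in memory, here is a small sketch that stores values into a message buffer and reads them back. The helper names write_u16, read_u16, and write_u32 are invented for this example.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Store a 16-bit host-order value into buf in network byte order. */
void write_u16(unsigned char *buf, uint16_t host_value)
{
    uint16_t net = htons(host_value);

    memcpy(buf, &net, sizeof(net));   /* memcpy avoids alignment problems */
}

/* Read a 16-bit network-order value from buf and return it in host order. */
uint16_t read_u16(const unsigned char *buf)
{
    uint16_t net;

    memcpy(&net, buf, sizeof(net));
    return ntohs(net);
}

/* Store a 32-bit host-order value into buf in network byte order. */
void write_u32(unsigned char *buf, uint32_t host_value)
{
    uint32_t net = htonl(host_value);

    memcpy(buf, &net, sizeof(net));
}
```

After write_u32(buf, 0x12345678) the buffer holds 0x12, 0x34, 0x56, 0x78 regardless of the machine's own byte order, which is exactly the layout the DNS message format expects.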

Step 3b. Parse out the query string

Parse out the query in the query record. This is in a format of

<query name> <query type> <query class>

The query name is the name whose address is being requested. Instead of a domain name format such as

test.mydomain.abc

each component contains a byte count (occupying one byte) followed by the characters. So this is stored in the message as:

4, 't', 'e', 's', 't', 8, 'm', 'y', 'd', 'o', 'm', 'a', 'i', 'n', 3, 'a', 'b', 'c', 0

A zero signifies there are no more components.

The query type should be 1, which represents an address (A) query. The class should also be 1, which represents an Internet query.
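
Decoding the length-prefixed name into the familiar dotted form might look like the sketch below. It is an illustration under simplifying assumptions: the function name decode_qname is made up, and compressed names (length bytes above 63) are simply rejected.

```c
#include <stddef.h>
#include <string.h>

/* Decode the length-prefixed name that starts at msg[offset] into a
 * dotted, NUL-terminated string in out.
 * Returns the offset of the byte following the name's terminating zero
 * (where the query type begins), or -1 if the name is malformed, uses
 * compression, or does not fit in out. */
int decode_qname(const unsigned char *msg, size_t len, size_t offset,
                 char *out, size_t outsize)
{
    size_t pos = 0;

    if (outsize == 0)
        return -1;
    while (offset < len) {
        size_t count = msg[offset++];

        if (count == 0) {               /* zero: no more components */
            out[pos] = '\0';
            return (int)offset;
        }
        if (count > 63)                 /* not a plain label */
            return -1;
        if (offset + count > len)       /* label runs past the message */
            return -1;
        if (pos + count + 2 > outsize)  /* room for '.', label, and NUL */
            return -1;
        if (pos > 0)
            out[pos++] = '.';
        memcpy(out + pos, msg + offset, count);
        pos += count;
        offset += count;
    }
    return -1;                          /* ran off the end without a zero */
}
```

The two bytes at the returned offset hold the query type and the two bytes after that hold the query class, both in network byte order.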

Step 4. Create a response

The response will contain the same message ID and the same query string as the request. The easiest way to create it is to append zero or more answers to the message that was received. If you haven’t yet written the code to process the hosts file, you can just use a hard-coded IP address for debugging purposes.

You’ll need to change the following in the DNS header:

QR field: set to 1 to indicate a response
AA field: set to 1 to indicate an authoritative answer
TC field: set to 0 to indicate the reply is not truncated
RA field: set to 0 to indicate recursion is not available.
RCODE field: 0 for no error; 3 for a name error (domain name does not exist).

If there’s a single answer, set the answer count to 1 (set the value to htons(1) so that the bytes will go in the correct order).
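
The header changes listed above can also be made directly on the bytes of the received message. The following is a sketch that assumes the RFC 1035 header layout; the function name make_response_header is hypothetical.

```c
#include <stddef.h>

/* Turn the header of a received query into a response header, in place.
 * msg must hold at least the 12-byte DNS header.
 * rcode is 0 for no error or 3 for a name error; answers is the number
 * of answer records that will follow the question section. */
void make_response_header(unsigned char *msg, unsigned int rcode,
                          unsigned int answers)
{
    msg[2] |= 0x80;                       /* QR = 1: this is a response */
    msg[2] |= 0x04;                       /* AA = 1: authoritative answer */
    msg[2] &= 0xfd;                       /* TC = 0: not truncated */
    msg[3] &= 0x7f;                       /* RA = 0: no recursion available */
    msg[3] = (unsigned char)((msg[3] & 0xf0) | (rcode & 0x0f));
    msg[6] = (unsigned char)((answers >> 8) & 0xff);  /* answer count, */
    msg[7] = (unsigned char)(answers & 0xff);         /* high byte first */
}
```

Writing the answer count one byte at a time has the same effect as storing htons(1) into a 16-bit field. The message ID, opcode, and RD bit are left exactly as the client sent them.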

Suffix the answer to the message that you received from the client. An answer is a resource record and is in the format:

<name> <type> <class> <ttl> <rdlength> <rdata>

Where:

<name>: the query string just like in the query, with byte counts and components instead of dots.
<type>: 1, for an address query answer (16 bits)
<class>: 1, for the Internet (16 bits)
<ttl>: the time to live, a 32-bit count of the seconds for which the client may cache the answer
<rdlength>: 4, the length in bytes of a 32-bit IP address (16 bits)
<rdata>: this is the 4-byte address. Convert the format 1.2.3.4 to an unsigned int and then store it in network form via htonl(addr).

Write the response with sendto().
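
Building the answer record might look like the following sketch. It assumes you already have the encoded name from the question (byte counts, components, and the terminating zero) and the address as an unsigned 32-bit value in host byte order. The function name append_a_record and its parameters are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write an address (A) resource record into msg starting at offset len,
 * for example right after the end of the question section.
 * qname/qname_len: the encoded name, including its terminating zero.
 * addr: IPv4 address in host byte order; ttl: time to live in seconds.
 * Returns the new message length, or -1 if the record does not fit. */
int append_a_record(unsigned char *msg, size_t len, size_t maxsize,
                    const unsigned char *qname, size_t qname_len,
                    uint32_t addr, uint32_t ttl)
{
    unsigned char *p;

    /* name + type(2) + class(2) + ttl(4) + rdlength(2) + rdata(4) */
    if (len > maxsize || maxsize - len < qname_len + 14)
        return -1;
    p = msg + len;
    memmove(p, qname, qname_len);        /* <name> */
    p += qname_len;
    *p++ = 0; *p++ = 1;                  /* <type> = 1 (A) */
    *p++ = 0; *p++ = 1;                  /* <class> = 1 (Internet) */
    *p++ = (unsigned char)(ttl >> 24);   /* <ttl>, most significant first */
    *p++ = (unsigned char)(ttl >> 16);
    *p++ = (unsigned char)(ttl >> 8);
    *p++ = (unsigned char)ttl;
    *p++ = 0; *p++ = 4;                  /* <rdlength> = 4 */
    *p++ = (unsigned char)(addr >> 24);  /* <rdata>: the address, */
    *p++ = (unsigned char)(addr >> 16);  /* in network byte order */
    *p++ = (unsigned char)(addr >> 8);
    *p++ = (unsigned char)addr;
    return (int)(p - msg);
}
```

The returned length is what you would pass to sendto(), along with the remote address that recvfrom() filled in when the query arrived.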

Step 5. Process the hosts file

Now that you know the networking part works, you can turn to the easy part: reading the hosts file. You are provided with a sample hosts.txt file. It is a simplified format of a normal hosts file that most DNS servers use. It looks like this:

# a basic hosts file for DNS 
# ignore blank lines and lines containing only whitespace (spaces and/or tabs)
# format: <address> <whitespace> <domain_name>
# ignore anything after <domain_name>

# some sample entries below

127.0.0.1   localhost
1.0.0.1   test.name     # ignore any comments after the name

128.6.4.2   cs.rutgers.edu 
129.42.38.1   ibm.com 

123.45.67.8   a.domain.name.with.many.levels.of.hierarchy.net

You should process this file only when you start the server, not for each query. An easy storage format is to just have a simple linked list of <name, address> pairs. You’re not managing a lot of addresses so there is no need for more sophisticated storage structures.
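
Parsing one line of the hosts file and searching the resulting list could be sketched as below. The names struct host_entry, parse_hosts_line, and find_host are made up for illustration, and the lookup ignores case on the assumption that DNS names are compared case-insensitively.

```c
#include <stdio.h>
#include <stdint.h>
#include <strings.h>
#include <arpa/inet.h>

#define MAX_NAME 256

/* Hypothetical list node holding one <name, address> pair. */
struct host_entry {
    char name[MAX_NAME];
    uint32_t addr;                 /* IPv4 address in host byte order */
    struct host_entry *next;
};

/* Parse one line of the hosts file into *e.
 * Returns 1 if an entry was parsed, 0 for a blank or comment line,
 * and -1 if the line is malformed. Text after the name is ignored. */
int parse_hosts_line(const char *line, struct host_entry *e)
{
    char ip[64];
    struct in_addr in;
    int n = sscanf(line, "%63s %255s", ip, e->name);

    if (n < 1 || ip[0] == '#')          /* blank line or comment */
        return 0;
    if (n != 2 || e->name[0] == '#')    /* address without a name */
        return -1;
    if (inet_pton(AF_INET, ip, &in) != 1)
        return -1;
    e->addr = ntohl(in.s_addr);
    e->next = NULL;
    return 1;
}

/* Walk the list and return the entry matching name, or NULL if none. */
const struct host_entry *find_host(const struct host_entry *list,
                                   const char *name)
{
    for (; list != NULL; list = list->next)
        if (strcasecmp(list->name, name) == 0)
            return list;
    return NULL;
}
```

At startup you would read the file line by line with fgets(), call parse_hosts_line on each line, and link every successfully parsed entry into the list.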

Submission

Report

Your report can be brief and can be in plain text or pdf. Be sure that it contains the following:

Names of all group members
Precise and detailed instructions for compiling and running the program. I suggest that you test these out on your friends and make sure that they work exactly. I will not have the patience to fiddle with getting your program compiled and running and you will lose a huge percentage of the grade if I cannot compile and run it.
Any bugs or peculiarities.
A discussion of the tests that you ran.

Source code

Submit only the source code for your assignment. Do not include executables, C object files, or Java class files. Do not include any temporary files.

If submitting multiple files, make sure that every file has the names of the group members in it.

Make sure that the code looks clean. Indentations should be consistent; do not mix leading tabs and spaces. Use clear variable names. You do not need a lot of comments but be sure to have a comment that identifies each function or method in your program.

Make sure that your code does not have confusing debugging statements or blocks of commented-out code. Basic server-side messages, such as indications that you received a connection and what commands you are parsing, are fine.

You may submit a Makefile so that we can run make to compile your program. You need not do this if compiling your application is trivial.

Submit

You must submit everything we need to compile your program. This will, at minimum, be one file that implements the server. There is no need to submit a hosts file since we can create our own.

Do not submit object files, executables, or temporary editor files.

Before uploading, make sure that all the components are there. If we can’t compile any part, you will get no credit.

Hand the assignment in via sakai. Only one group member should submit the assignment. You will be penalized for duplicate submissions.