Project Assignments: 22C:178 & 055:134

Computer Communications Spring 1998

Projects give you hands-on practice with network administration, protocols, and programming. The grading method for each project is described along with the detailed instructions for the project.

Project 5

[due in class, Monday 4 May]

The purpose of this project is to demonstrate concurrent message processing and switching using sockets and datagrams. Your task is to write a program that forwards input datagrams to output ports. The program will have two input ports for data and a third input port for control. A diagram for the program follows.

              |             (*)            | 
------------> | (a)                    (b) | -------------->
              |                            | 
              |                            | 
------------> | (c)                    (d) | -------------->
              |                            | 

The three input ports in the diagram are (a), (c), and (*). Ports (a) and (c) are data ports -- data to be forwarded arrives on these two ports. Port (*) is a control port -- datagrams arriving at (*) control program behavior.

Initially, datagrams arriving at (a) are forwarded by sending the data on port (b), whereas datagrams arriving on port (c) are forwarded through port (d). Whenever the program receives a datagram on port (*), the forwarding is "switched" in the following sense. The first datagram received on (*) causes the program to switch its forwarding behavior: data received on (a) will now be forwarded through (d) and data received on (c) will now be forwarded through (b). This new forwarding behavior persists until another datagram arrives at (*). The second datagram received on (*) causes the data forwarding to revert to the original (a)->(b) and (c)->(d) mapping. In general, each datagram received on (*) toggles the mapping of input data ports to output ports. The program thus functions as a kind of crossbar switch.
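The toggling rule can be tracked with a single flag. The following is only a sketch of one way to do it; the names relay_state, output_for, and toggle are made up for illustration and do not come from any of the course programs.

```c
#include <assert.h>

/* Illustrative names only: input 0 stands for port (a), input 1 for port (c);
   output 0 stands for port (b), output 1 for port (d). */
struct relay_state {
    int crossed;   /* 0: (a)->(b), (c)->(d);  1: (a)->(d), (c)->(b) */
};

/* Return the output index (0 or 1) for a datagram that arrived on input 0 or 1. */
int output_for(const struct relay_state *st, int input)
{
    assert(input == 0 || input == 1);
    return input ^ st->crossed;
}

/* Call once for every datagram received on the control port (*). */
void toggle(struct relay_state *st)
{
    st->crossed = !st->crossed;
}
```

Starting from crossed = 0, the first call to toggle gives the (a)->(d), (c)->(b) mapping and the second call restores the original one.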

There is no fixed order of data arrival on the three input ports -- datagrams can arrive at any time on these ports. This is a problem because if a program attempts to read() or recvfrom() one of these three ports while there is no data available, then the program waits for data -- and the program can be stuck. For instance, suppose the program uses recvfrom() on port (a) and therefore waits for data to arrive on (a). While the program is waiting, a datagram arrives on port (c), but the program will not read this data because it remains waiting on port (a)! How can we resolve this? One technique would be to use fork() and devote a concurrent process to each input port. But we use a different -- and simpler -- strategy for this project. Unix provides a system call to determine whether or not your program would wait on a port before your program attempts to read() or recvfrom(). This system call is select(), documented in its man page (man select).

The Unix documentation for select() is terse and not much help for getting started, so it is best to see an example. The udpserv3.c program shows a version of the UDP server used in earlier homeworks and projects, but expanded to have five input ports. This program waits simultaneously on these five input ports and reads from one of them only when data is available. Therefore the program does not get "stuck" as in the example above. You should use the same select() technique (with FD_SET, FD_ZERO, and FD_ISSET) in your code for the project.
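For orientation, here is a minimal sketch of the FD_ZERO / FD_SET / select() / FD_ISSET pattern. It is not the udpserv3.c code; the helper name wait_for_input is invented for this illustration, and it works on any array of open descriptors (for relay these would be the three bound UDP sockets).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <unistd.h>

/* Block until at least one descriptor in fds[0..n-1] has data to read.
   Returns the index of the first ready descriptor, or -1 on error.
   (wait_for_input is a made-up helper name.) */
int wait_for_input(const int fds[], int n)
{
    fd_set readable;
    int i, maxfd = -1;

    assert(n > 0);
    FD_ZERO(&readable);                 /* start with an empty set */
    for (i = 0; i < n; i++) {
        FD_SET(fds[i], &readable);      /* add each input descriptor */
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    /* select() returns once some descriptor can be read without waiting */
    if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
        return -1;

    for (i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &readable))
            return i;                   /* caller can now recvfrom() fds[i] */
    return -1;
}
```

Note that the set must be rebuilt before every call, because select() overwrites it with the descriptors that are ready.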

Some Details

Call your program relay.c. It has three fixed port numbers for the input ports: 5012 is the toggle control input port and ports 5013, 5014 are the data input ports. The relay program should have four command line arguments (argv[1] through argv[4]) that specify the output hosts and their port numbers. An example of the command syntax is:

relay localhost 6822 5689
This example specifies that the two outputs from relay are directed to (1) the same machine running relay, but with port 6822, and (2) the ox machine on port 5689.
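The port arguments arrive as text and should be checked before use. The sketch below shows one way to convert and validate a port argument; parse_port is a made-up helper name, and which argv positions hold the ports depends on how you order your arguments.

```c
#include <assert.h>
#include <stdlib.h>

/* Convert a command-line port argument to a number.
   Returns the port (1..65535), or -1 if the text is not a valid port.
   (parse_port is a made-up helper name.) */
int parse_port(const char *text)
{
    char *end;
    long value;

    assert(text != NULL);
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0')    /* empty, or trailing junk */
        return -1;
    if (value < 1 || value > 65535)     /* outside the UDP port range */
        return -1;
    return (int)value;
}
```

A relay that gets -1 back could print a usage message and exit rather than try to send to a meaningless port.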

The maximum buffer size for all UDP datagrams should be 1025 bytes, but you are welcome to tune this constant to other values during debugging and experimentation with your program. To test your relay program, two other programs are available. The first program is called pump and it copies data from stdin to a specified UDP port destination. Two examples of the syntax for pump are

pump 0 localhost 5013 < mydata
pump 1 localhost 5014 < mydata
The first example specifies that UDP datagrams be sent to localhost on port 5013, and the second example specifies port 5014. In each example, the input data is a file called mydata. The first argument, 0 in the first example and 1 in the second, specifies a delay time between sending datagrams. Please consult the source of the program pump.c for further details.

The companion to pump is the sink command. The sink command copies UDP datagrams it receives to stdout. Two examples of syntax are

sink 6048
sink 5114 > nudata
The first example specifies that the input port for sink is 6048 and the output will be displayed in the window running the sink command. The second example specifies an input port 5114 and directs the output to a file named nudata.

You can exercise pump and sink without the relay program, as follows. Suppose you have a data file called edata. Then try the commands

sink 5003 > fdata &
pump 0 localhost 5003 < edata
This should copy the edata file to the fdata file. For debugging, it is often better to open two windows for such an example, for instance, one xterm can do the pump and another xterm does the sink. It is, of course, important to start the sink before the pump -- otherwise the pump will fail because it won't find the sink waiting and ready for the datagrams. The source code for sink.c is also available.

Using two concurrent pump programs and two concurrent sink commands, you can test the relay. However, to show the switching capability, you would need a third program -- one that sends a datagram to the toggle control port of relay. This should be easy to do using something like the udpcli.c program we have seen before, but customized to use the appropriate port number. This part is left as part of the development and debugging phase of your project.
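As a starting point, the core of such a toggle sender might look like the sketch below. This is not udpcli.c; send_toggle is an invented name, the payload text is arbitrary (relay only needs to notice that a datagram arrived), and the sketch assumes relay runs on the same machine.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Send one small datagram to 127.0.0.1 on the given UDP port, using an
   already-created UDP socket. Returns 0 on success, -1 on failure.
   (send_toggle is a made-up helper name; the payload is arbitrary.) */
int send_toggle(int sock, unsigned short port)
{
    struct sockaddr_in dest;
    const char msg[] = "toggle";

    assert(sock >= 0);
    memset(&dest, 0, sizeof dest);
    dest.sin_family = AF_INET;
    dest.sin_port = htons(port);                     /* network byte order */
    dest.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* same machine as relay */

    if (sendto(sock, msg, sizeof msg, 0,
               (struct sockaddr *)&dest, sizeof dest) < 0)
        return -1;
    return 0;
}
```

A main program would create the socket with socket(AF_INET, SOCK_DGRAM, 0) and call send_toggle(sock, 5012) each time the mapping should flip.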

You can start work on developing the relay program immediately. However, you will need to also do some experiments for this project using relay in various settings.

Testing Tools and Techniques

To make significant tests of relay you will need to have sources of input and ways of evaluating outputs. The first useful tool for generating a source of input is the supply.c program. It generates 1K blocks of data. The syntax for the supply program is

supply m c
where m is the amount of data you want to generate and c is the fill character for the data generated. For example, the command supply 8 M will produce 8KB of output consisting of the letter M repeated in 64-character lines (each line is actually 63 M characters followed by a newline byte). You can use supply in combination with pump as follows.
supply 38 a | pump 0 localhost 5792
This example generates 38KB of 'a' characters, and pump will take this 38KB as input, sending it via UDP datagrams to port 5792. Read about the unix pipe facility (the | operator) in man csh if you have never seen this technique before.

It is also useful to test the output from sink rather than seeing it all on the terminal window or storing it in a file. We can use existing unix commands to do this. This example counts the number of bytes produced by sink from reading port 5792:

sink 5792 | wc -c
(See man wc for an explanation of the wc command.) Suppose we want to count only the amount of data produced by sink with the 'j' character in each line. We can use grep to do this.
sink 5792 | grep j | wc -c
In this example, all the sink output is filtered by the grep command, which lets only lines containing 'j' pass on to the wc command, which in turn counts the number of bytes in all such lines.

It is useful to see all of this together in one example, such as

sink 5792 | wc -c &
supply 64 z | pump 0 localhost 5792
The first line starts the sink-wc combination, running it in the background (you can read about background mode in man csh). The second line then starts the supply-pump combination. If you try this example, you should see that pump and wc report the same number of bytes. If this example is unclear, then try running the two lines in separate xterm windows so that the output from each command appears in its own window.

Experiment I

This is basic experimentation with relay. The basic experiments involve no networking and can be done on one workstation, using pump, sink, relay, and udpcli communicating via UDP. The basic experimentation just confirms that your relay works as expected. Here are a few things to try:

Experiment II

The relay program probably does not terminate automatically if you write it following the instructions. For the second experiment, add a time-out facility to your relay program. If the relay does not receive a datagram for 30 seconds, then it should terminate. So, for instance, after you run a test with pump and sink, the relay will automatically quit if you do not use it again within 30 seconds.
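One common way to get a time-out is the last argument of select(), which limits how long the call waits. The sketch below assumes that approach; input_ready_within is a made-up helper name, not something from the course programs.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <unistd.h>

/* Wait up to `seconds` for any descriptor in fds[0..n-1] to become readable.
   Returns 1 if input is ready, 0 if the time limit passed with no input,
   and -1 on error. (input_ready_within is a made-up helper name.) */
int input_ready_within(const int fds[], int n, long seconds)
{
    fd_set readable;
    struct timeval limit;
    int i, maxfd = -1, ready;

    assert(n > 0 && seconds >= 0);
    FD_ZERO(&readable);
    for (i = 0; i < n; i++) {
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    /* Fill in the limit on every call: select() is allowed to modify it. */
    limit.tv_sec = seconds;
    limit.tv_usec = 0;

    ready = select(maxfd + 1, &readable, NULL, NULL, &limit);
    if (ready < 0)
        return -1;
    return ready > 0;
}
```

In the relay's main loop, a return value of 0 from input_ready_within(fds, 3, 30) would mean that 30 quiet seconds have passed and the program should exit.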

Experiment III

This experiment will test more intricate combinations of relay programs. Here are some things to try and questions answered by experiments.

Experiment IV

Experiment IV will do performance testing of the relay using the lab of PCs in 311. You will add some timing measurements to pump and sink programs, and test relay across the Ethernet with some large data sizes (perhaps 40MB or more).

The first experiment will be to use one pump, one sink, and the relay all running on different machines, and send a reasonably large amount of data, say 10-50MB. The idea is to measure the running time and establish a baseline for further experiments.

The next step is to reduce the flow control and repeat the same experiment. To reduce the flow control you will need to modify pump and sink so that, instead of sending an ACK for each datagram (and expecting an ACK for each datagram sent), these programs send and expect an ACK only once per K datagrams, where K is a parameter experimentally determined. Clearly, if K is extremely large the programs use very little flow control. However, if K is made large then your relay will need to have K buffers and memory is limited. The main programming challenge will be to have the relay manage buffers so that the pump can get ahead of the sink, but not to discard datagrams when there is a buffer shortage. Flow control is the answer, but just how much flow control you need is the question. Less flow control will mean fewer ACK messages and the total running time will be less. Document the results of your experiment.
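The "one ACK per K datagrams" rule comes down to a counter on the receiving side. The sketch below shows only that bookkeeping, under the assumption that the receiver decides when to acknowledge; the names ack_window, ack_window_init, and ack_window_received are invented here and are not taken from pump.c or sink.c.

```c
#include <assert.h>

/* Receiver-side bookkeeping: acknowledge once per k datagrams.
   (All names here are made up for illustration.) */
struct ack_window {
    int k;          /* datagrams per ACK (the experiment's parameter K) */
    int pending;    /* datagrams received since the last ACK */
};

void ack_window_init(struct ack_window *w, int k)
{
    assert(k > 0);
    w->k = k;
    w->pending = 0;
}

/* Record one received datagram. Returns 1 when an ACK should be sent now. */
int ack_window_received(struct ack_window *w)
{
    w->pending++;
    if (w->pending == w->k) {
        w->pending = 0;
        return 1;
    }
    return 0;
}
```

With K = 1 this reduces to the baseline behavior of one ACK per datagram. A real version also has to deal with the end of the data, when fewer than K datagrams may be left unacknowledged.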

Optionally, you may also want to test how the relay performs when there are two pump and two sink programs, each sending a large amount of data. If you perform this experiment, compare the results to your baseline experiment.

What to Turn in

Each experiment should be documented, on paper, describing the experiment and its results. Also please turn in the program listings for different versions of relay that you write. Please remember to include your student number.

Grading Criteria

The general guidelines are: 250 points total (this project counts 2.5 times any of the other projects). About 100 points of the total will be for the basic programming of relay, about 20 points for Experiment I, about 40 points for Experiment II, about 30 points for Experiment III, and 60 points for Experiment IV.

In the grading for each experiment, points are awarded for correctness, thorough documentation of how the experiment was performed and what the results were, and an explanation of the results. Some points may also be given for creative questions and variations on the experiments.

Project 4

[due by email, Tuesday 7 April]

This, the second programming project, is a simple illustration of the HTTP protocol. You will write a program that communicates with a web server, requests a web page, reads the web page, and counts the number of 'j' characters in that page.

The protocol for this assignment is TCP, and the network programming is basically an adaptation of the tcpcli.c program you have used previously. What you may not already know is how the HTTP protocol functions.

To see how the HTTP protocol works, try the following:

From a Unix shell, type the command
telnet 80
This command establishes a TCP connection between your shell and the CS Web Server at its well-known port, 80. The standard server port is 80 for the HTTP protocol. In response, you will see something like:
Connected to
Escape character is '^]'.
Now the web server expects an input stream. So enter
In response, you should see the CS Department's main web page. The connection will also break after returning the web page.
You can also see the course web page by repeating Step 1, but then entering the command
GET /~herman/22C178/index.html
(Warning! HTTP is fussy about spelling mistakes.)
As you can see, the HTTP protocol is quite simple. To get a web page, simply open a connection, request the page, and it is returned as a stream. Routines like putStream and getStream can be used to send a request and retrieve the response. Note that getStream should continue to read until the connection is broken -- that is how the end of page is detected.
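The "read until the connection is broken" rule can be sketched as a loop that stops when read() returns 0. The function below is only an illustration of that loop combined with the counting; count_until_eof is a made-up name, and this is not the getStream routine from the course code.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Read from fd until the other side closes the connection (read() returns 0)
   and count how many bytes equal `target`. Returns the count, or -1 on error.
   (count_until_eof is a made-up helper name.) */
long count_until_eof(int fd, char target)
{
    char buf[1024];
    ssize_t n;
    long count = 0;

    assert(fd >= 0);
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        ssize_t i;
        for (i = 0; i < n; i++)
            if (buf[i] == target)   /* case-sensitive: 'J' does not match 'j' */
                count++;
    }
    return n < 0 ? -1 : count;
}
```

Called on a connected TCP socket after the request has been sent, count_until_eof(sock, 'j') would return the number of lowercase 'j' bytes in everything the server sends back.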

The program you will write takes command-line arguments. For an example of how a program works with command-line arguments, compile and test the parms.c program using the commands

gcc -o parms parms.c
parms this is a test
A running example of the program you will write is here, which you can copy and run on the departmental HP machines (sorry, it won't run on the SGI machines). Here are some example tests:
% webcli
Number of j's = 2
% webcli
Number of j's = 1
% webcli "~jones/index.html"
Number of j's = 26
The webcli program is reading a web page, counting the number of 'j' characters in that page (it does not count any 'J' characters), and printing the total.

What to submit. On or before the due date, you should have written a program that behaves like the webcli program. When your program is ready, email it to me. The following guidelines are important:

Grading. The grading will be simple: we should be able to compile and test your program and see that it works.

Project 3

[due in class, 25 March]

This is the first programming project. Your task is to write the client program and communicate with a server process that has already been written and will be running until the due date of the project. The server program uses the following simple logic:

The Project. Using the UDP and TCP sample programs given in homework assignments, write a client program that communicates with the server using your student ID.

What to submit. On or before the due date, turn in a listing of your program. Do not email me your program! Make sure your name and the last four digits of your student ID number are given in the listing.

Grading. The grading will be figured from two items, the listing you submit and the server log. We can see if your program ran (or at least that some program correctly ran) from the server log. The server log may also be valuable for us to monitor your progress in completing the assignment.

Note: It is possible that the server can crash, either because it has a bug or because of some hardware problem. If you are in doubt about whether the server is properly running you can test it using the sample client program pr3cli which tests the server (sorry, this program only runs on HP machines).

Project 2

[due in class, 27 February]

We take another look at our friend ping in this project. Our goal is to estimate network bandwidth by trying different ping commands. For this project you will need to experiment and save results of those experiments, analyze the results of the experiments, and explain your results. To justify your conclusions, some background reading will be useful. Here are some online sources for material about the ping command.

Another useful research task for this project will be to see how ping is working on an IP datagram level. You can see this by following these instructions.
Login to one of the HP workstations (or use telnet to remotely log on).
Copy the tcpshow and tcpdump programs to your directory (remember, clicking on the correct mouse button will ask you to save the link as a file, which is how you should copy these two programs). If you are curious, you can see the man pages for tcpshow and tcpdump as well.
Change the tcpshow and tcpdump permissions to executable by the commands
chmod +x tcpshow
chmod +x tcpdump
Copy this tcpdump log file to your directory, calling it simp.log.
Test the programs by entering the command
tcpshow < simp.log | more
This should show you a playback of the Ethernet frames actually observed during one ping execution.

The Project. The experimental part of this project is to use ping with different packet sizes. On a Linux system, the packet size is specified using the -s option, whereas on HP systems, the size is just a positional parameter (given just after the target address in the command). By testing different target addresses with different sizes, you should be able to deduce something about the bandwidth of your network.

What to Submit. The result of this project is a paper, no more than three pages, documenting your experiments and explaining the measurements you collect. It should answer the following questions.

The ping output prints a round-trip time in milliseconds. What does this round-trip time actually measure? How accurate and precise is this measurement?
Frequently, the output from one instance of a ping command differs from another. Why is this? What are the factors influencing the ping measurements?
Choose one IP address on your local area network as the target for ping commands. For this address, experiment with different ping packet sizes. Plot the time versus size on a graph to show the relation between size and timings. Note: feel free to use machines in our lab to perform experiments on a network free from other user packets.
Based on your experiments, your readings, your examination of the tcpdump logs, and other tests with ping, estimate the bandwidth of your local area network. Is your estimate consistent with the claimed 10Mb rating of the Ethernet? If not, why not?
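One simple way to turn two points from the time-versus-size plot into a bandwidth figure is sketched below. It rests on assumptions you should examine in your paper: that the extra payload bytes cross the network twice (once in the request, once in the echo reply), and that every other delay is the same for both packet sizes. The function name estimate_bandwidth_bps is invented for this illustration.

```c
#include <assert.h>

/* Rough bandwidth estimate in bits per second from two ping measurements.
   Assumes the extra payload bytes cross the network twice (request and echo
   reply) and that all other delays are the same for both sizes. */
double estimate_bandwidth_bps(double size1_bytes, double rtt1_ms,
                              double size2_bytes, double rtt2_ms)
{
    double extra_bits = 2.0 * 8.0 * (size2_bytes - size1_bytes);
    double extra_seconds = (rtt2_ms - rtt1_ms) / 1000.0;

    assert(extra_seconds > 0.0);
    return extra_bits / extra_seconds;
}
```

For example, with invented (not measured) values of 1.0 ms for 100 bytes and 2.6 ms for 1100 bytes, the extra 1000 bytes account for 16000 bits in 1.6 ms, which works out to about 10 million bits per second. Real measurements are noisy, so a slope fitted through many points is more trustworthy than any single pair.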

Grading Criteria.

Your project grade will be based on thoroughness, accuracy, clarity of your explanations, and creativity. Roughly, the 100 points for the project will be:

40 points
Thoroughness: did you answer all the questions in the list above?
20 points
Accuracy: did you think about the factors that control ping timings and take some care to control your experiments to account for this timing? Did you report when and where you did the experiments (which network, were there other users of your LAN when you tried the experiments, etc.)? Are the points on your plot one measurement or many measurements?
20 points
Explanation: can you account for the results? For instance, if the plot of time versus size has some irregularities, can you explain these? If your conclusion seems at variance with what you understand to be the bandwidth of the hardware, why is this so?
20 points
Creativity: did you embellish the basic task to answer the fundamental question of estimating bandwidth using ping?

Project 1

[To be completed by 12 February]

This project is essentially an application of the information provided in Chapter 5 of the Linux Network Administrator's Guide. To complete this project you will use commands to make a network of three machines in the 311 MLH lab. Here is what you are expected to do:

Make an appointment with the TA, who will supervise and verify your completion of the project. This appointment will be a 15 minute period during which you are expected to complete all project tasks. Should you not complete all tasks, you will have a lower grade for the project or may reschedule another appointment (and this will also lower your grade on the project). Note: since there are about 64 students, we intend to schedule two students concurrently during a sequence of laboratory hours. In one hour, up to 8 students can complete the first project; ideally, eight hours would be sufficient for all students, however we will schedule more appointment times to allow for scheduling conflicts. To see the available appointment times, follow this link to a table of times; you'll then need to email the TA to request one of them. Don't delay too long in making an appointment!

When you make an appointment, the TA will email you three IP addresses. These three IP addresses will be the addresses for your network. You will use hostname, ifconfig, and route commands to create this network on three PCs chosen by the TA during your appointment. The names of the three machines will be aaa, bbb and ccc. If you have time during the appointment, you should also create a routing loop for the IP address so that any datagram sent to will loop from aaa to bbb to ccc to aaa, etc.


For this project you can get a maximum of 100 points. These 100 points will be awarded by the following criteria, judged by the TA at the end of your appointment. If you need a second appointment to complete the assignment, then your score will be lowered by 20% of whatever your total is in the second appointment.


You are welcome to practice in the 311 MLH laboratory. Instructions on how to use the laboratory will be given in the course FAQ (after someone asks me questions!). In addition, you may want to get some idea of what the /etc directory looks like in advance; an approximate image of this directory is available via this link. In addition, it may be useful to look at the Linux man pages for the following commands:

Ted Herman