General TCP Proxy
Posted 27 Sep 2000 by herman

For developing a Web Proxy, Web Server, mail proxy, or similar projects, this is a useful example.

A Working Proxy

There are hundreds of proxies, written in Java, C, C++, and other languages, with source code available, on the Internet. I found a reasonable TCP proxy on freshmeat (put "proxy" in the search field and you'll find it). I downloaded this TCP proxy, compiled it, and placed it in the proxy/proxy2 directory of user on alacran.cs.uiowa.edu. After I tried it and learned how to use it, I made a small modification in the code: the proxy program now prints everything it receives, from either direction, so you can see what the actual data looks like.

The Main Idea

Here's what the TCP proxy does. To illustrate, let's suppose the proxy is between a web server (www.uiowa.edu) and a browser (netscape running on alacran). The picture looks like this.

  +---------------+
  | www.uiowa.edu |
  +---------------+
         | (port 80 for http requests)
         |
         |
  +---------------+
  |  TCP proxy    |
  +---------------+
         | (port 7777 or some other specified port)
         |
    +------------+
    |  netscape  |
    +------------+

Here's what is happening. We ask netscape to open a "page" on localhost at port 7777, by asking for

http://localhost:7777
and this causes the browser to connect to alacran (where the browser is running) at port 7777. When the TCP proxy gets this connection request, it acts as a server. But instead of replying itself to the browser, the TCP proxy sends a new connection request to www.uiowa.edu at port 80. When the TCP proxy gets a reply from www.uiowa.edu, the TCP proxy saves this reply and sends it back to the browser, in response to the original connection request that started the whole interaction. Got it? The TCP proxy is just a "relay" of messages going between the browser and the web server.
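
The relay idea can be sketched in a few lines. What follows is a minimal illustration in Python, not the C program downloaded from freshmeat; the names pump, relay_once, and log are made up for this sketch. It accepts one connection, opens a second connection to the destination, and copies bytes in both directions while recording everything it sees.

```python
import socket
import threading


def pump(src, dst, label, log):
    """Copy bytes from src to dst until src closes, recording each chunk."""
    try:
        while True:
            data = src.recv(4096)
            if not data:              # empty read: the peer closed its side
                break
            log.append((label, data))
            dst.sendall(data)
    except OSError:
        pass                          # the connection was torn down under us
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)   # pass the end-of-stream along
        except OSError:
            pass


def relay_once(listener, dest_host, dest_port, log):
    """Accept one client on listener and relay it to dest_host:dest_port."""
    client, _ = listener.accept()                      # act as a server
    server = socket.create_connection((dest_host, dest_port))  # act as a client
    t = threading.Thread(target=pump,
                         args=(client, server, "client->server", log))
    t.start()
    pump(server, client, "server->client", log)
    t.join()
    client.close()
    server.close()
```

To mimic the setup in the picture, one would presumably bind the listening socket to port 7777 and call relay_once with the web server's address and port 80. A real proxy would also loop to accept many connections, which this sketch leaves out.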

How to Run the Proxy

To run the TCP proxy, first log in as user on alacran and then

cd proxy/proxy2
There you'll find the code (written in C) and the executable, called proxy. Running the proxy requires several parameters, as you can see by reading the code of proxy.c or by executing "proxy --help". One of the parameters needed by the program is the IP address of the destination, which we'll suppose to be the IP address of www.uiowa.edu. I executed the command ping www.uiowa.edu to get the IP address for this web server: it is 128.255.56.81. Then to run the proxy, I used the command
proxy -s 7777 -d 80 -D 128.255.56.81 --nodaemon
and then asked, in a different window, for netscape to open alacran.cs.uiowa.edu:7777. Oops! Suddenly the window where the proxy was running filled up with all sorts of strange characters. It seems that much of the output is binary data that doesn't display nicely in a terminal. So then I killed the proxy with Ctrl-C and instead ran it with
proxy -s 7777 -d 80 -D 128.255.56.81 --nodaemon > logfile
Now, when I tried netscape again to open the "page" at alacran's port 7777, I got the real web server page at www.uiowa.edu. Quickly, I again killed the proxy and looked at the file logfile, but this file was large and complex, not a simple example. What we really need, in order to learn about HTTP and HTML, is a very simple web page. I happen to have a very simple web page, namely http://weblog.cs.uiowa.edu. Can I get the proxy to work with this simple web page? Again, I started the proxy, this time with the command
proxy -s 7777 -d 80 -D 128.255.28.120 --nodaemon > logfile
(the IP address for weblog.cs.uiowa.edu is 128.255.28.120).
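
Redirecting to a file avoids the strange characters on the screen, but another option is to make the relayed bytes safe to print. Here is a small Python sketch of that idea; the function name printable is made up here and is not part of the proxy program. It keeps printable ASCII, tabs, and newlines, and shows every other byte as a hex escape.

```python
def printable(data: bytes) -> str:
    """Return data as text that is safe to show in a terminal."""
    out = []
    for b in data:
        if 32 <= b < 127 or b in (9, 10):   # printable ASCII, tab, newline
            out.append(chr(b))
        else:
            out.append("\\x%02x" % b)       # anything else becomes \xNN
    return "".join(out)
```

With a filter like this, text such as HTTP headers stays readable while image data and other raw bytes show up as harmless escapes.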

Note: I found I had to quit netscape each time I tried this, because netscape keeps a cache of the pages it fetches, and won't actually re-read a page under normal circumstances. Finally, though, after these experiments, my logfile contained a good example of the interaction (both request and reply) between netscape and a web server. It shows all the HTTP and HTML going on that most users never know about.
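
To read such a log it helps to know the general shape of an HTTP message: a start line, some header lines, a blank line, and then an optional body (the HTML, in the case of a reply). The Python sketch below splits a message along those lines. The sample request and reply are made up for illustration and are not taken from my logfile; the function name split_http_message is likewise hypothetical.

```python
def split_http_message(raw: bytes):
    """Split one HTTP message into (start_line, headers, body)."""
    # A blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Made-up sample data, NOT copied from the article's logfile.
sample_request = b"GET / HTTP/1.0\r\nHost: localhost:7777\r\n\r\n"
sample_reply = (b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n"
                b"<html><body>Hello</body></html>")
```

Note that in this made-up request the Host header names the proxy's address and port rather than the real web server, since that is what the browser was asked to open.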
