22C:118 Fall 2004 Week 15 Summary 22C:118 Fall 2004 Week 15 Summary

  1. Monday: Deadlocks in databases, prevention and detection, was the lecture topic. Broadly speaking, deadlocks can be communication deadlocks (we saw this with TCP applications and with parallel simulation) or resource deadlocks, as occur with locking for operating systems (even in Python threads) and database transactions.
    Deadlocks can be prevented in databases by aborting ÿoung" transactions whenever they block older ones, so that the older transactions never wait on locking. This idea requires some kind of timestamping mechanism for given each transaction a unique timestamp.
    Another way to prevent deadlock is to order all possible resources by some known ordering R and make sure that every transaction requests all its locks by some (sub-) order following R. Both these prevention ideas could also work in the case of distributed databases.
    Deadlocks are detected by having the transaction manager construct a wait-for graph (WFG). There exists a deadlock if and only if the WFG contains a directed cycle. The WFG is a dynamic object, changing over time as transactions lock and unlock items.
    In a distributed database, there can be deadlocks even though no site's transaction manager has a WFG with a cycle. That is because a particular transaction can have subtransactions running at the different sites. To solve the detection problem, one can use a new type of server called a "deadlock detector". As edges are added and removed from the local WFG at each site, messages are sent to the deadlock detector. The idea is for the deadlock detector to construct a "global" WFG. However, this is tricky to implement correctly: the deadlock detector is not synchronized exactly with each site's transaction manager, and therefore we have the danger of "phantom deadlock" detection.
  2. Wednesday began the new topic of Peer-to-Peer networks, which use the concept of Overlay Networks to join network users. An overlay networks is essentially a virtual network on top of the Internet in which nodes are programs and links are TCP connections (they could be UDP as well). In a peer to peer network, all peers are both clients and servers.
    The lecture's motivation is file-sharing. We looked at classical file sharing systems such as Napster and Gnutella. Napster requires a server that acts as a directory of all peers and their content. If you're looking for a file, ask the server and find out who has it. The actual copying of the file goes directly between peers. Napster suffers because the server is a single point of failure. By contrast, Gnutella has no server. Each request for a file simply gets broadcast to all peers currently connected to the Gnutella (overlay) network. But the Gnutella idea becomes inefficient when the network gets large.
  3. Friday covered the Chord style of peer-to-peer networking. It is a special-purpose network based on a circle of identifiers and "fingers" that are pointers within the circle to make find operations more efficient.