CS 553

Internet Services

Suggested projects

Project proposals should be 2 pages. They should describe what will be built or measured. In addition, they should show that you have done enough background research to demonstrate that the project does not duplicate previous work (2 or 3 references are fine).

Reliability of the Internet for users

Many of the works we have read model the reliability of the Internet by measuring reliability between random pairs of nodes. However, users don't care about reaching random pairs of nodes, but only about the path to their destination. Given that a few sites dominate the destination addresses of all traffic, the "all pairs" methodology may inflate the observed error rate. Indeed, recent work has shown the structure of the Internet to consist of a highly interconnected "core" surrounded by a fringe. The core may exhibit reliability properties much different from those of the fringe. In this project, you would sample the error rate and distribution of HTTP traffic to a few large sites and compare it with the error rates observed in previous work.
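As a starting point, the sampling step might look like the sketch below. It is only an illustration: the function names (probe, error_rate, sample_sites), the 5-second timeout, and the 95% normal-approximation interval are all assumptions, not part of the project description.

```python
import http.client
import math
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return True if one HTTP GET to `url` succeeds, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # DNS failures, refused connections, timeouts and HTTP error
        # statuses all count as one failed sample.
        return False


def error_rate(outcomes):
    """Return (failure fraction, 95% normal-approximation half-width)."""
    n = len(outcomes)
    if n == 0:
        raise ValueError("no samples")
    failures = sum(1 for ok in outcomes if not ok)
    p = failures / n
    half_width = 1.96 * math.sqrt(p * (1 - p) / n)
    return p, half_width


def sample_sites(urls, rounds, probe_fn=probe):
    """Probe each URL `rounds` times and summarize its error rate."""
    results = {}
    for url in urls:
        outcomes = [probe_fn(url) for _ in range(rounds)]
        results[url] = error_rate(outcomes)
    return results
```

A real study would spread the probes over days or weeks and from several vantage points rather than issuing them back to back. The probe_fn parameter is there so that the summary code can be exercised without touching the network.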

Black-box recording system for workstation clusters

A current project has measured the distribution of workstation reboots over time. However, much about the workstation state during the reboot is still unknown, for example the hardware configuration, the software configuration, and the operations on the box before and after the reboot. Also, the downtime of the workstations is not measured. In this study, you would create a software module that would monitor the state of the machine and save it to non-volatile memory, operating like a "black-box" recorder on an aircraft. The system should be sufficiently robust to allow reconstruction of much of the machine state after a crash. The ultimate goal is to have this software installed on many machines to classify the machines' availability and unavailability at the hardware, operating system and application levels.
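One possible shape for the recorder's core is an append-only log that is forced to disk after every record. The sketch below assumes a JSON-lines file format and a deliberately tiny snapshot; the names snapshot, record and replay, and the fields collected, are hypothetical.

```python
import json
import os
import platform
import time


def snapshot():
    """Collect a small, illustrative subset of machine state."""
    return {
        "time": time.time(),
        "host": platform.node(),
        "os": platform.platform(),
        "pid": os.getpid(),
    }


def record(path, state):
    """Append one JSON line and force it to stable storage before returning."""
    line = json.dumps(state, sort_keys=True) + "\n"
    with open(path, "a", encoding="utf-8") as log:
        log.write(line)
        log.flush()
        os.fsync(log.fileno())


def replay(path):
    """Read records back, stopping at the first line that does not parse."""
    records = []
    with open(path, encoding="utf-8") as log:
        for line in log:
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                # Most likely a record that was cut short by the crash.
                break
    return records
```

Calling record(path, snapshot()) on a timer gives a crude heartbeat: the gap between the last record before a reboot and the first one after it is an estimate of downtime.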

Bug classification of Java programs

While the types of bugs encountered in C programs have been studied, the types of common programming mistakes Java programmers make are poorly understood. Indeed, many of the errors made in C are impossible to make in Java, because most of the memory management is automated. Of course, a whole new set of error types may crop up. In this project, you would examine a number of Java programs and attempt to find patterns of common programming mistakes. I have a number of Java programs available for this project, as well as a large number of programming projects from undergraduate classes.

Resource usage warning indicator for Java programs

One hypothesis of why programs often fail or crash is that they do not take appropriate action when some external resource is exhausted. In this project, you would examine the standard Java class files to see where resources, such as disk space, memory, and network, are used. You would then attempt to construct new exceptions that are thrown as "warnings" that resources are low. For example, currently the Java runtime will throw an OutOfMemoryError when memory is exhausted. However, by this time it is often too late to recover. Instead, a set of controllable warnings may be more useful to the programmer. In addition to a prototype, you would have to catalog which warnings the runtime should support and show how they would be useful.
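The idea of a controllable low-resource warning can be illustrated independently of Java. The sketch below uses Python's warnings module as a stand-in for the proposed mechanism; LowResourceWarning, warn_if_low, check_disk and the 10% default threshold are all invented for illustration, and the actual project would target the Java runtime.

```python
import shutil
import warnings


class LowResourceWarning(UserWarning):
    """Hypothetical warning category issued before a resource runs out."""


def warn_if_low(name, free, total, min_free_fraction=0.10):
    """Issue a LowResourceWarning when free/total drops below the threshold."""
    if total <= 0:
        raise ValueError("total must be positive")
    fraction = free / total
    if fraction < min_free_fraction:
        warnings.warn(
            f"{name}: only {fraction:.1%} free",
            LowResourceWarning,
            stacklevel=2,
        )
        return True
    return False


def check_disk(path="/", min_free_fraction=0.10):
    """Apply the same check to the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return warn_if_low(f"disk {path}", usage.free, usage.total, min_free_fraction)
```

The "controllable" part comes from the warning filter: a program can ignore the category, log it, or call warnings.simplefilter("error", LowResourceWarning) to turn every such warning into an exception it must handle.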

Fault detection of Java programs using compile time analysis (suggested by Barbara Ryder)

Compile-time analysis can be used to judge the stability of a program before it is run (much like the tool lint). In this project, your tool would examine Java programs and report dangerous constructs. In particular, for a given package or set of class files, your tool should report uncaught exceptions, or exceptions whose catch clause contains no code or meaningless code (e.g. a sole System.out.print). This project could be extended to report other dangerous coding practices beyond poor exception handling as well.
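To make the "empty or meaningless catch clause" check concrete, here is a rough first pass. It is an assumption-laden sketch: it scans Java source text with regular expressions instead of analyzing class files, it only sees catch bodies without nested braces, and the function name suspicious_catches is made up. A real tool would work from a parser or from bytecode.

```python
import re

# Matches `catch (...) { body }` where the body contains no nested braces.
CATCH = re.compile(r"\bcatch\s*\(([^)]*)\)\s*\{([^{}]*)\}")
# Java line comments and block comments.
COMMENT = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)
# A body that is exactly one System.out/err print statement.
PRINT_ONLY = re.compile(r"System\.(out|err)\.print(ln)?\s*\([^;]*\)\s*;")


def suspicious_catches(source):
    """Return (line_number, exception_decl, reason) for weak catch clauses."""
    findings = []
    for match in CATCH.finditer(source):
        decl = " ".join(match.group(1).split())
        body = COMMENT.sub("", match.group(2)).strip()
        line = source.count("\n", 0, match.start()) + 1
        if not body:
            findings.append((line, decl, "empty"))
        elif PRINT_ONLY.fullmatch(body):
            findings.append((line, decl, "print only"))
    return findings
```

Each finding names the line of the catch keyword, the declared exception, and why the clause was flagged. A catch body that holds only a comment is reported as empty.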

Stability of a standard web service

This project would measure the stability of a web service via fault injection. You could use the Mendosus framework developed in the PANIC lab to observe how a clustered web service responds to a variety of network faults. You should run 2 or 3 different web cluster designs and characterize their availability under different fault scenarios. You could also develop some of your own fault-injection tools.

Build a fault-injection system for disks, memory and the filesystem (suggested by Thu Nguyen)

In this project, you would construct fault-injection software for disks, memory and the filesystem. Ideally, these would be plug-in modules for the operating system. Your modules should be able to handle either trace-fed faults or faults drawn from standard distributions. In addition, your modules should replicate the behavior of the faulty component as realistically as possible.
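The two fault sources mentioned above, traces and standard distributions, can share one interface. The sketch below is only about scheduling when faults fire, not about injecting them into a kernel; the function names and the choice of an exponential inter-arrival distribution are assumptions made for illustration.

```python
import random


def faults_from_trace(trace):
    """Yield (time, component) pairs from a recorded trace, in time order."""
    for when, component in sorted(trace):
        yield when, component


def faults_from_exponential(component, mean_interval, horizon, seed=None):
    """Yield faults whose inter-arrival times are exponentially distributed."""
    rng = random.Random(seed)
    now = rng.expovariate(1.0 / mean_interval)
    while now < horizon:
        yield now, component
        now += rng.expovariate(1.0 / mean_interval)


def merge_schedules(*schedules):
    """Combine several fault sources into one time-ordered list."""
    return sorted(event for schedule in schedules for event in schedule)
```

Passing a fixed seed makes a synthetic run repeatable, which matters when comparing how two system designs react to the same sequence of faults.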

Map the resource-dependence graph of an Internet Service (suggested by Thu Nguyen)

A complex Internet Service consists of hundreds or thousands of components. One key to understanding the reliability of a service is to map the resource dependence graph of the service. Ideally, this graph should be automatically generated as well as hierarchical. It should be automatically generated because the chance that a human will miss something is high. It should also be hierarchical because service designers must abstract away detail to balance accuracy against complexity. Your prototype could start with something simple, such as the resources used by a single Java servlet or program.
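A small sketch may help fix what "dependence graph" and "hierarchical" could mean here. It assumes the graph is a plain dictionary mapping each component to the resources it uses directly, and it models one level of hierarchy by collapsing named groups of nodes; the function names and this representation are illustrative choices, not a prescribed design.

```python
def transitive_dependencies(graph, component):
    """Return every resource `component` depends on, directly or indirectly."""
    seen = set()
    stack = list(graph.get(component, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def collapse(graph, groups):
    """Build a coarser graph in which each named group becomes one node."""
    owner = {member: name for name, members in groups.items() for member in members}
    coarse = {}
    for node, deps in graph.items():
        src = owner.get(node, node)
        targets = coarse.setdefault(src, set())
        for dep in deps:
            dst = owner.get(dep, dep)
            if dst != src:  # drop edges that fall inside a single group
                targets.add(dst)
    return coarse
```

For a servlet that uses a JVM and a database, which in turn use memory, disk and the network, transitive_dependencies lists everything whose failure could affect the servlet, while collapse with a "hardware" group gives the coarser view a designer might prefer.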

Remote attachments (suggested by Badri Nath)

For many small, hand-held devices, including attachments is problematic. For example, suppose you are on your cell phone and get a short text message asking for your next hot conference paper. Current technologies allow you to send 100 bytes of data or so in a short text message. The challenge is to figure out an easily usable scheme for sending attachments that fits into the small space allowed by text messages.
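One direction, offered only as an example of what such a scheme might look like, is to leave the document on a server and send a short reference to it in the text message. Everything in the sketch below is hypothetical: the SEND ... TO ... message format, the 8-character identifier derived from a SHA-256 digest, and the assumption that some server already stores the document under that identifier.

```python
import hashlib

MAX_MESSAGE_BYTES = 100  # limit taken from the text; real limits vary


def attachment_id(content, length=8):
    """Short hexadecimal name for a document, taken from its SHA-256 digest."""
    return hashlib.sha256(content).hexdigest()[:length]


def build_request(doc_id, recipient):
    """Encode a 'send this stored document to that address' request."""
    message = f"SEND {doc_id} TO {recipient}".encode("utf-8")
    if len(message) > MAX_MESSAGE_BYTES:
        raise ValueError(
            f"request is {len(message)} bytes, limit is {MAX_MESSAGE_BYTES}"
        )
    return message


def parse_request(message):
    """Inverse of build_request; returns (doc_id, recipient)."""
    verb, doc_id, keyword, recipient = message.decode("utf-8").split(" ", 3)
    if verb != "SEND" or keyword != "TO":
        raise ValueError("not a SEND request")
    return doc_id, recipient
```

The harder parts of the project, such as how a user picks the right document from a phone keypad and how the server authenticates the request, are left open by this sketch.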