My Published Articles: May 2004

Connected to: Hal Burch, Researcher, Internet Mapping Project
Living by the slogan “Computer Science: the only profession where you can make 50 billion dollars producing things that ‘mostly’ work”, this PhD student at Carnegie Mellon University has turned his love of solving computational problems into a coveted position as a professional coach at the USA Computing Olympiad. As a researcher in computer science, he has focused mainly on network security, Internet mapping and network visualization.
by Salman Siddiqui

Q. Why did you and your colleagues initiate the Internet Mapping Project?
A. It started as a necessary prerequisite for another project: determining the source of a denial of service (DoS) attack. We had developed a way to test whether or not a particular connection was being used in a DoS attack. If you knew how the network was connected, you could test connections and trace the attack back to its source. As we collected this data both for the Internet and Lucent’s network (we were at Bell Labs at the time), we found interesting stuff on both networks and began collecting data on a daily basis. Our original software was slow, so we only did about a ninth of the Internet per day. We have made many improvements in the last six years.

Q. Does your project truly map the entire Internet or is it also limited to a certain class of address spaces?
A. There are too many active IP addresses to traceroute to all of them daily. Fortunately, IP addresses on the same network have almost exactly the same path, so we do not need to traceroute to all of them. Thus, we pick an IP address from each network on the Internet and traceroute to those IP addresses. We may not get all the connections between computers in the same company, but we do get most of the connections in the Internet Service Providers (ISPs). Since the connections in an ISP are used by many people, these are the connections that are of more interest to us. We are also limited by what we can reach. Some firewalls block traceroutes. Again, however, this is mostly an issue of computers inside companies, not computers run by ISPs.

Q. How are you able to study the infrastructure changes in the event of a disaster in a specific geographical location such as in Yugoslavia?
A. We have done that on small scales only. Country-level information is relatively simple to get using domain names. For example, Yugoslavia owns the .yu domain, so anything in that domain is likely to be in Yugoslavia. The non-country-specific domains, such as .com, .edu, and .net, are more difficult. When my colleague Steve Branigan wanted to determine the networks in Yugoslavia, he started with the .yu domain and manually expanded it from there. Yugoslavia had a small enough network presence that this was not difficult.

Q. Wouldn’t a distributed approach to the mapping of the Internet give us a better picture of the Internet?
A. A distributed approach might give us more of the connections in the ISPs. Paul Barford and others at the University of Wisconsin estimated in 2001 that using a single source may miss as much as 40% of the connections. Currently, we are using multiple locations for our less frequent scans.

Q. What hardware and bandwidth are currently being used for your project?
A. It runs on a 750 MHz Pentium III computer running FreeBSD. We limit the bandwidth to 500 packets per second, which corresponds to 256 kilobits per second (equivalent to about five modems). The machine does not need as much power as it has: the CPU is 60-70% idle when the scan is running. We limit the speed to decrease the intrusiveness of the scan.

Q. Would you agree that the Opte project, which has been inspired by the Internet Mapping Project, is faster and more comprehensive than your own project?
A. From their website, it looks slower. We map the Internet daily in about two hours. Opte claims to want to map the Internet in one day, but it takes them much longer than that. I am not certain exactly how long; their website currently says that it took them 143 days to do their scan, but it appears to have really taken about seven days. I have no idea where their statement that it takes us six months to generate a single map came from. We do not produce the layouts necessary to generate a picture on a daily basis, as it takes about a day to do those calculations. We do not do it often enough for that to be a large problem. In terms of detail, Opte does more to measure the connections within corporations. Being focused on the core, we do not gather that level of detail.

Q. Exactly how much time does it take to map the entire Internet according to your project? Does it really take six months to complete a single map as reported?
A. It takes under two hours to perform the traceroutes. We spend about another two hours doing related activities, such as looking up domain names of the routers found. It takes about 18 hours to do a layout, but that is not normally done as part of the daily runs. Once the layout is done, generating the actual pictures takes seconds.

Q. What benefits does Internet mapping offer for a common web user?
A. I don’t know how a normal user of the Internet could use such a map, although the pictures are pretty to look at. The data is really meant for researchers to better understand the connectivity of the Internet so they can improve it. It is of interest to operators as well, who need to know the pieces of the Internet so they can know what to watch. Operators have a good notion of what their own network is, but little idea about other people’s networks.
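
For readers who want to see the sampling idea from the interview in concrete terms, here is a minimal Python sketch of choosing one traceroute target per network, along with the arithmetic behind the quoted scan rate. It is an illustration only, not the project's software: the function names (pick_targets, implied_packet_bytes), the random choice of host within each network, and the example prefixes (reserved documentation ranges) are all assumptions made here.

```python
import ipaddress
import random


def pick_targets(prefixes, seed=0):
    """Return one representative host address per network prefix.

    Hypothetical helper: the interview only says one address is picked
    per network; choosing it at random is an assumption of this sketch.
    """
    rng = random.Random(seed)
    targets = {}
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix)
        # Prefer usable host addresses, skipping network and broadcast.
        first = int(net.network_address) + 1
        last = int(net.broadcast_address) - 1
        if first > last:
            # Very small prefixes (/31, /32) have no room to skip anything.
            first = int(net.network_address)
            last = int(net.broadcast_address)
        targets[prefix] = str(ipaddress.ip_address(rng.randint(first, last)))
    return targets


def implied_packet_bytes(packets_per_second, kilobits_per_second):
    """Average packet size implied by a packet rate and a bit rate.

    Assumes 1 kilobit = 1000 bits.
    """
    return kilobits_per_second * 1000 / packets_per_second / 8


if __name__ == "__main__":
    # Example prefixes from the reserved documentation ranges.
    networks = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/28"]
    for prefix, target in pick_targets(networks).items():
        print(prefix, "->", target)
    # 500 packets per second at 256 kilobits per second.
    print(implied_packet_bytes(500, 256), "bytes per packet")
```

Under these assumptions, 500 packets per second at 256 kilobits per second works out to 64 bytes per packet, and 256 kilobits per second divided by an assumed 56 kilobits per second per dial-up modem is roughly 4.6, which fits the "about five modems" comparison.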

Site:
Hal's homepage
www-2.cs.cmu.edu/~hburch/


Thursday, December 15, 2005
