The Internet2 Observatory Data Collections
This document describes the Internet2 Observatory Data Collections. The data resides in a number of databases that are available online through several programmatic interfaces. Taken as a whole, they form a large correlated database for use by the research community.
- Internet2 Usage Data
- Internet2 NetFlow Data
- Internet2 Routing Data
- Internet2 Latency Data
- Internet2 Throughput Data
- Internet2 Router Data
- Internet2 Syslog Data
- Internet2 Topology Data
We are also interested in hearing from the research community about additional data attributes that should be collected. Please send mail to firstname.lastname@example.org if you have suggestions.
Internet2 Usage Data
The Internet2 NOC keeps usage statistics on the Internet2 network. Usage statistics are collected at 10-second intervals using the SNAPP tool developed at Indiana University. A list of the data being collected can be found at:
The files are RRD data files, and the graphs may be viewed individually by clicking on the appropriate links.
There is an interface to obtain the raw RRD data used to generate the graphs. It can be accessed here.
Internet2 NetFlow Data
The Internet2 NetFlow data is available using rsync from netflow.internet2.edu. Flows are stored in flow-tools format. Access to NetFlow data is by special arrangement. To obtain an rsync account to download the NetFlow data, please send mail to email@example.com. More information on how to obtain the data is available at Proposal Process. Note that the IP addresses in the data have their low-order eleven bits set to zero, so the finest granularity visible in the data is a /21 prefix.
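To make the effect of zeroing the low-order eleven bits concrete, the following is a minimal sketch using Python's standard ipaddress module. It assumes IPv4 addresses; the function name and the sample addresses are illustrative only and are not part of the Internet2 tooling.

```python
import ipaddress

# Number of low-order bits zeroed in the published data (from the text above).
ZEROED_BITS = 11
PREFIX_LEN = 32 - ZEROED_BITS  # 21, i.e. a /21 prefix


def mask_low_bits(address: str) -> str:
    """Return the IPv4 address with its low-order eleven bits set to zero.

    Illustrative helper (hypothetical name): shows what an address looks
    like after the masking described above.
    """
    network = ipaddress.ip_network(f"{address}/{PREFIX_LEN}", strict=False)
    return str(network.network_address)


if __name__ == "__main__":
    # Two hosts in the same /21 become indistinguishable after masking.
    for sample in ("10.1.15.200", "10.1.9.7", "192.0.2.123"):
        print(sample, "->", mask_low_bits(sample))
```

Any two hosts that share the same /21 map to the same masked address, so per-host analysis is not possible with this data.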
Data for each router is collected at five-minute intervals by local collectors in the Internet2 nodes. The data is regularly pulled back to the central storage device and, once a day, combined into a daily aggregated file per router. Logs for each day are available along with nightly summary reports.
For more detailed instructions on obtaining the NetFlow data using rsync, click here.
Internet2 Routing Data
The Internet2 routing data is organized by type: Exterior Gateway Protocols (BGP in the case of Internet2) and Interior Gateway Protocols (IS-IS). Internet2 has deployed a special version of Zebra that participates in the IS-IS and BGP routing protocols.
EGP (BGP) Data
Internet2 has deployed Zebra bgpd at each router location. A local machine in the Internet2 Network Observatory maintains an internal BGP session with the local core router and records all of the BGP routing information it learns from the routers. Data is transferred to a central point and is available here:
The "rib" file is a snapshot of the entire BGP RIB taken every 2 hours. Changes (adds, deletes, modifications) are logged to the "updates" file. Because there is a Zebra bgpd at each router location, each location has its own rib and updates files. Under each router location, the data is split into individual months. Each file is named in the format rib.YYYYMMDD.HHMM.gz or updates.YYYYMMDD.HHMM.gz. All time stamps are UTC.
The data is collected in MRT (Multi-threaded Routing Toolkit) format. The files are binary and can be converted to ASCII with Marco d'Itri's Perl dump parser scripts (see http://www.linux.it/~md/software/zebra-dump-parser.tgz).
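When selecting files for a time window, it can help to turn the file names into UTC timestamps. The sketch below parses the naming format described above; the function name and the sample file names are hypothetical and chosen only to match that format.

```python
import re
from datetime import datetime, timezone

# Matches rib.YYYYMMDD.HHMM.gz and updates.YYYYMMDD.HHMM.gz
DUMP_NAME = re.compile(r"^(rib|updates)\.(\d{8})\.(\d{4})\.gz$")


def parse_dump_name(filename: str):
    """Return (kind, UTC datetime) for a rib/updates dump file name.

    Raises ValueError if the name does not follow the documented format.
    """
    match = DUMP_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    kind, date_part, time_part = match.groups()
    stamp = datetime.strptime(date_part + time_part, "%Y%m%d%H%M")
    # The document states that all time stamps are UTC.
    return kind, stamp.replace(tzinfo=timezone.utc)


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the parser.
    for name in ("rib.20040808.1400.gz", "updates.20040808.1415.gz"):
        print(name, "->", parse_dump_name(name))
```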
IGP (IS-IS) Data
The IS-IS data collection system is implemented using ISISd (v 0.93b - see http://isisd.sourceforge.net/), along with Zhang Shu's IS-IS logging facility. ISISd is an implementation of the ISO IS-IS routing protocol integrated into the Zebra platform. The IS-IS logging facility is built into ISISd and is configured to collect IS-IS protocol information from its neighbor via a broadcast interface.
At each of Internet2's PoPs, a Unix server running ISISd supports local IGP data collection over an Ethernet interface to an Internet2 core T640 router. The IS-IS data is recorded in libpcap format, and each record includes a timestamp synchronized to a local (within-PoP) CDMA-provided clock, which is distributed via NTP to the FreeBSD data collector. The timestamp accuracy is measured to be within ten microseconds of UTC. See http://www.endruntechnologies.com/cdma.htm for details. The primary objective is to provide high-resolution measurements for comparing geographically distributed link-state protocol data.
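For readers who want the per-record timestamps without a packet analyzer, the following is a minimal sketch that walks a classic libpcap file using only the Python standard library. It assumes the classic pcap layout (24-byte global header, 16-byte record headers, microsecond resolution); whether the Internet2 files use exactly this variant is an assumption, and the function name is illustrative.

```python
import struct
from datetime import datetime, timedelta, timezone

PCAP_MAGIC_USEC = 0xA1B2C3D4  # classic pcap, microsecond timestamps
GLOBAL_HEADER_LEN = 24
RECORD_HEADER_LEN = 16


def read_pcap_timestamps(data: bytes):
    """Yield (UTC datetime, captured length) for each record in pcap bytes."""
    if len(data) < GLOBAL_HEADER_LEN:
        raise ValueError("data is shorter than a pcap global header")
    # The magic number tells us the byte order the file was written in.
    if struct.unpack("<I", data[:4])[0] == PCAP_MAGIC_USEC:
        endian = "<"
    elif struct.unpack(">I", data[:4])[0] == PCAP_MAGIC_USEC:
        endian = ">"
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    offset = GLOBAL_HEADER_LEN
    while offset + RECORD_HEADER_LEN <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + RECORD_HEADER_LEN]
        )
        stamp = datetime.fromtimestamp(ts_sec, tz=timezone.utc)
        yield stamp + timedelta(microseconds=ts_usec), incl_len
        # Skip the record header and the captured packet bytes.
        offset += RECORD_HEADER_LEN + incl_len


if __name__ == "__main__":
    # Synthetic one-record capture, built here only to exercise the reader.
    header = struct.pack("<IHHiIII", PCAP_MAGIC_USEC, 2, 4, 0, 0, 65535, 1)
    record = struct.pack("<IIII", 1091923200, 5, 3, 3) + b"abc"
    for stamp, length in read_pcap_timestamps(header + record):
        print(stamp.isoformat(), length)
```

For a real file, the bytes would be read from disk first (after decompression, if the file is compressed) and passed to the same function.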
The files are rolled on a daily basis beginning at GMT midnight and are arranged by month. See
To download a file, click on the month and year of interest, followed by the location/PoP of interest. The data files can then be retrieved by clicking on the day of interest. tcpdump and other network analysis tools (e.g. Ethereal) can be used to play back the recorded IS-IS data files. Below is a snippet using tcpdump to replay the log from August 8, 2004.
Internet2 Latency Data
Latency data is available through the One Way Active Measurement Project (OWAMP) at Internet2. OWAMP can also support measurements other than latency. Further information about OWAMP can be found at OWAMP Project.
A mesh of latency measurements is available at OWAMP Grid.
Internet2 Throughput Data
Throughput data is available through the Bandwidth Control Project (BWCTL) at Internet2. Further information about BWCTL can be found at BWCTL Project.
Internet2 Router Data
This data is collected as part of the Internet2 Network - Visible Backbone project of the Internet2 NOC. Every hour, a wide variety of data is gathered from all of the Internet2 backbone Juniper routers via XML. Shortly after the data is gathered, it is processed by a large number of Perl programs. These programs generate the data available through The Visible Network Toolset. Additional processed data will be made available on a regular basis.
There are two ways to access the raw XML data collected from the routers: a simple HTML browsing interface and a programmatic interface using SOAP and CGI. Either access method requires knowing how the data is stored. The directory structure is based on the date and time the data was collected. The top level is the year, followed by the month, day, hour, and minute of collection. For example, the files in /2003/4/21/13/10 were collected on April 21, 2003, at 1:10 pm. Each file is compressed with gzip.
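The directory layout can be derived from a collection time as in the sketch below. It assumes that no field is zero-padded, which the example path shows only for the month; the time zone of the directory names is not stated in this document, so naive datetimes are used. Both function names are hypothetical.

```python
from datetime import datetime


def collection_dir(collected_at: datetime) -> str:
    """Build the /year/month/day/hour/minute directory for a collection time.

    Assumption: fields are written without zero padding, as the month is in
    the documented example /2003/4/21/13/10.
    """
    return (
        f"/{collected_at.year}/{collected_at.month}/{collected_at.day}"
        f"/{collected_at.hour}/{collected_at.minute}"
    )


def parse_collection_dir(path: str) -> datetime:
    """Inverse of collection_dir: recover the collection time from a path."""
    year, month, day, hour, minute = (int(p) for p in path.strip("/").split("/"))
    return datetime(year, month, day, hour, minute)


if __name__ == "__main__":
    print(collection_dir(datetime(2003, 4, 21, 13, 10)))
    print(parse_collection_dir("/2003/4/21/13/10"))
```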
For further information about automating the process of obtaining the data using the SOAP interface, click here.
Internet2 Syslog Data
Internet2 Syslog data is
Internet2 Topology Data
Topology files describing the Internet2 Network are generated from the GlobalNOC Database, which is populated each day from the network device configuration files. The files are available in Network Description Language (NDL) and in the Open Grid Forum Network Management Working Group (OGF NMWG) schema.
A map of the topology is automatically generated from the topology schema using Graphviz and is located here.
There are also maps showing the Internet2 Network topology and IGP metrics between each site.