DataGrid Wide Area Network
Monitoring Infrastructure
(DWMI)
|
|
|
Connie Logg |
|
February 13-17, 2005 |
|
|
History
|
|
|
Originally developed for the SC2001 demo and
called IEPM-BW |
|
After SC2001, development continued |
|
FNAL picked up IEPM-BW and adapted it
to their site |
|
In Spring 2004 redesigned for
TeraPaths monitoring project |
|
Currently still called IEPM-BW, and
deployed at 4 sites |
|
|
Architecture - I
|
|
|
|
Use MySQL database |
|
All configuration is in the database so
the code can self configure |
|
Allows flexibility for adding new types
of data |
|
Written in perl |
|
Low impact probes (currently abwed, traced,
and pingd) have daemons that run independently |
|
High impact probes have a daemon (bw-synchd)
which ensures that high impact probes do not run simultaneously and that
there is a break between consecutive tests. |
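The serialization that bw-synchd provides can be illustrated with a minimal sketch. This is not the IEPM-BW code (which is written in perl); `run_serially`, its parameters, and the injectable `sleep`/`clock` arguments are hypothetical names chosen for illustration.

```python
import time

def run_serially(probes, gap_seconds, sleep=time.sleep, clock=time.monotonic):
    """Run each probe to completion before starting the next one.

    probes is a list of (name, callable) pairs. A pause of gap_seconds is
    inserted between consecutive probes, so two high impact tests never
    overlap and the network gets a break between them.
    """
    log = []
    for index, (name, probe) in enumerate(probes):
        if index > 0:
            sleep(gap_seconds)  # the break between consecutive tests
        started = clock()
        probe()                 # blocks until this probe has finished
        log.append((name, started, clock()))
    return log
```

Passing `sleep` and `clock` as arguments is only a convenience for trying the sketch without real waiting.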
Architecture - II
|
|
|
|
Results from all probes are written to a
data directory and loaded by the load-datad daemon, which ensures that the
database is not bombarded by hundreds of simultaneous writes. |
|
Analysis scripts run every hour or two
depending upon how long they take |
|
Plot data, traceroute reports, master
web page generation |
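The single-writer idea behind load-datad can be sketched as follows. Everything specific here is an assumption made for illustration: the `*.dat` file naming, the `probe,node,timestamp,value` line format, the `results` table, and the use of sqlite3 as a stand-in for MySQL.

```python
import sqlite3
from pathlib import Path

def load_pending(data_dir: Path, db: sqlite3.Connection) -> int:
    """Load every pending result file through one writer; return rows loaded."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(probe TEXT, node TEXT, ts INTEGER, value REAL)"
    )
    loaded = 0
    # sorted() materializes the listing, so renaming files below is safe.
    for path in sorted(data_dir.glob("*.dat")):
        rows = []
        for line in path.read_text().splitlines():
            if not line.strip():
                continue
            probe, node, ts, value = line.split(",")
            rows.append((probe, node, int(ts), float(value)))
        with db:  # one transaction per file instead of one write per probe
            db.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)
        # Rename so the same file is never loaded twice.
        path.rename(path.with_name(path.name + ".loaded"))
        loaded += len(rows)
    return loaded
```

Because the probes only write files, they never hold a database connection; the loader is the only process that writes to the database.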
MySQL Database Tables - I
|
|
|
NODES: each node has an entry with its
specs (latitude, longitude, contact, paths, etc.) |
|
MONHOST: active monitoring host(s)
information (web/cgi paths, data analysis specs, etc.) |
|
TOOLSPECS: probe specifications
(probe, probe options, frequency, testtype, etc.) |
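As a rough illustration of how code can configure itself from these tables, the sketch below builds probe command lines from NODES and TOOLSPECS. The column names, the pairing of every probe with every node, and the sqlite3 stand-in for MySQL are assumptions, not the actual IEPM-BW schema.

```python
import sqlite3

# Hypothetical, heavily reduced versions of two of the tables.
SCHEMA = """
CREATE TABLE NODES (nodename TEXT PRIMARY KEY, latitude REAL,
                    longitude REAL, contact TEXT);
CREATE TABLE TOOLSPECS (probe TEXT, options TEXT,
                        frequency_min INTEGER, testtype TEXT);
"""

def build_commands(db, testtype):
    """Return one command line per (probe spec, node) pair of this test type."""
    rows = db.execute(
        "SELECT t.probe, t.options, n.nodename "
        "FROM TOOLSPECS t CROSS JOIN NODES n "
        "WHERE t.testtype = ? ORDER BY t.probe, n.nodename",
        (testtype,),
    )
    return [f"{probe} {options} {node}" for probe, options, node in rows]
```

With this layout, adding a node or a probe type is a row insert rather than a code change.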
MySQL Database Tables - II
|
|
|
|
Many types of tests possible |
|
background: low impact tests which can run
concurrently (traceroute, ping, abwe) |
|
background-syn: tests which must be run one
at a time (iperf) |
|
on demand: to be implemented |
|
|
MySQL Database Tables - III
|
|
|
|
SCHEDULE |
|
The scheduler inserts probe requests into the
SCHEDULE table |
|
Daemons read SCHEDULE table for the
probes they are responsible for within the current timeframe, and run the
probes. |
|
All results are written to a data
directory and loaded by the data loading daemon |
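The scheduler/daemon split described above might look like the following sketch. The SCHEDULE columns (`run_at`, `done`), the function names, and the sqlite3 stand-in for MySQL are all hypothetical.

```python
import sqlite3

def make_schedule(db):
    """Create a reduced, hypothetical SCHEDULE table."""
    db.execute(
        "CREATE TABLE SCHEDULE (id INTEGER PRIMARY KEY, probe TEXT, "
        "node TEXT, run_at INTEGER, done INTEGER DEFAULT 0)"
    )

def schedule_probe(db, probe, node, first_run, frequency, count):
    """Scheduler side: insert `count` requests spaced `frequency` seconds apart."""
    with db:
        db.executemany(
            "INSERT INTO SCHEDULE (probe, node, run_at) VALUES (?, ?, ?)",
            [(probe, node, first_run + i * frequency) for i in range(count)],
        )

def claim_due(db, probes, now, window):
    """Daemon side: fetch this daemon's requests due in [now, now + window)."""
    marks = ",".join("?" * len(probes))
    with db:
        rows = db.execute(
            f"SELECT id, probe, node, run_at FROM SCHEDULE "
            f"WHERE done = 0 AND probe IN ({marks}) "
            f"AND run_at >= ? AND run_at < ? ORDER BY run_at",
            (*probes, now, now + window),
        ).fetchall()
        # Mark the requests as taken so they are not run a second time.
        db.executemany(
            "UPDATE SCHEDULE SET done = 1 WHERE id = ?", [(r[0],) for r in rows]
        )
    return [(r[1], r[2], r[3]) for r in rows]
```

Each daemon passes only the probe names it is responsible for, so pingd would never pick up an iperf request.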
|
|
APIs and other utilities
|
|
|
Fetch-ping-data |
|
Fetch-abwe-data |
|
Fetch-trace-data |
|
Fetch-bw-data (e.g., iperf) |
|
Etc. |
|
All take a nodename and timespan and
return a filename where the data is stored |
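The calling convention shared by these utilities (node name and time span in, file name out) can be sketched like this. The in-memory `records` argument, the CSV output format, and the name `fetch_data` are assumptions; the real utilities read from the database.

```python
import csv
import tempfile

def fetch_data(records, nodename, start, end):
    """Write records for `nodename` with start <= ts < end to a file.

    records is an iterable of (node, timestamp, value) tuples.
    Returns the name of the file holding the selected data.
    """
    out = tempfile.NamedTemporaryFile(
        "w", suffix=".csv", prefix=f"{nodename}-", delete=False, newline=""
    )
    with out:
        writer = csv.writer(out)
        writer.writerow(["timestamp", "value"])
        for node, ts, value in records:
            if node == nodename and start <= ts < end:
                writer.writerow([ts, value])
    return out.name
```

Returning a file name rather than the data itself lets analysis and plotting tools consume the result without sharing a process with the fetch utility.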
Data Analysis
|
|
|
Time series plots, group and individual |
|
Diurnal analysis & fitting |
|
Traceroute analysis |
|
Bandwidth Change Analysis will be
augmented by other methods currently being researched and developed |
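One simple form of diurnal analysis is to average measurements by hour of day. The sketch below shows only that step; it is an assumption about what such an analysis could look like, not the method used in IEPM-BW, and it does no fitting.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean

def diurnal_profile(samples):
    """Average values by hour of day (UTC).

    samples is an iterable of (unix_timestamp, value) pairs.
    Returns a dict mapping hour (0-23) to the mean value seen in that hour.
    """
    by_hour = defaultdict(list)
    for ts, value in samples:
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
        by_hour[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}
```

A profile like this gives a per-hour baseline against which a new measurement can be compared.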
CGI Utilities in
development
|
|
|
Add and update NODES |
|
Add and update TOOLSPECS |
|
Add and update MONHOST |
|
Interactive data analysis |
|
|
Informational Web Pages
|
|
|
Table of defined NODES |
|
Table of defined MONHOST |
|
Table of TOOLSPECS probe
specifications |
|
Description of database tables |
|
Report on data logging for past few
weeks |
|
PLM needs updating |
|
Others to come: every time I have to
look at something for validation, I create a web page |
Futures
|
|
|
Make data available via web services |
|
Interactive data analysis CGIs |
|
Add additional probe types |
|
Develop complete distribution kit;
complicated by differing locations and versions of perl, gnuplot, mysql,
graphics libs, ploticus, iperf, etc. |
|
Add additional anomaly detection
techniques |
Summary
|
|
|
The objective is to provide for regular
and reliable network probe testing and data collection from several locations
around the world |
|
Make the data available to the
community |
|
Provide a framework for the
incorporation of a variety of analysis tools |
Acknowledgements
|
|
|
Many people have contributed content to
this system over the years |
|
|
Questions &
Considerations
|
|
|
|
BWCTL |
|
Not installed everywhere, and it is one
more thing I would need to install as part of the distribution kit and
maintain |
|
Does not do multiple iperf streams |
|
May want other heavyweight tests that
bwctl does not provide for |
|
OWAMP: special NTP configuration |
|
|