MIRnet Administrative Data Analysis System (MADAS)

MIRnet Administrative Data Analysis System (MADAS). Greg Cole, Natasha Bulashova Friends & Partners NCSA. Description. System converts netflow data into structured data stored in a series of relational database tables


Presentation Transcript

  1. MIRnet Administrative Data Analysis System (MADAS) Greg Cole, Natasha Bulashova Friends & Partners NCSA

  2. Description
     • System converts netflow data into structured data stored in a series of relational database tables
     • System provides means of browsing summary statistics in graphic and table format
     • A work in progress since 1998; first version in summer of 1999, second in fall of 2000 (for HPIIS review), third in February 2001
     FOR MORE INFO... http://www.friends-partners.org/madasd/

  3. Description
     ||3130|3130|UDP-Other|55|6349|2|979067306|979067523
     ||3130|3130|UDP-Other|55|6569|2|979067306|979067523
     ||53|3271|UDP-DNS|1|482|1|979067480|979067480
     ||63499|80|TCP-WWW|2|96|1|979067547|979067550
     ||53|35432|UDP-DNS|2|634|1|979067717|979067721
     ||63500|80|TCP-WWW|2|96|1|979067547|979067550
     ||61492|80|TCP-WWW|2|96|1|979067677|979067680
     ||51270|21|TCP-FTP|6|360|3|979067720|979067781
     ||51271|21|TCP-FTP|6|360|3|979067720|979067781
     ||0|2048|ICMP|1|1500|1|979067753|979067753
     ||63501|80|TCP-WWW|2|96|1|979067547|979067550
     ||3143|3128|TCP-Other|5|1486|1|979067620|979067620
     ||3128|3143|TCP-Other|5|1043|1|979067620|979067620
     ||61493|80|TCP-WWW|2|96|1|979067677|979067680
     ||63502|80|TCP-WWW|2|96|1|979067547|979067550
     ||1024|53|UDP-DNS|1|71|1|979067714|979067714
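The pipe-delimited sample records on this slide could be parsed along the following lines. This is a minimal Python sketch, not the MADAS loader itself (the slides say that is written in Perl). The field order is an assumption inferred from the column names of the IPheaders table shown later in the deck, with the two leading empty fields taken to be the source and destination IP addresses that were blanked out in the transcript. The names `Flow` and `parse_flow` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Flow:
    ip_source: str          # empty in the transcript sample
    ip_destination: str     # empty in the transcript sample
    port_source: int
    port_destination: int
    protocol: str           # label such as "TCP-WWW" or "UDP-DNS"
    packets: int
    octets: int
    flows: int
    timestart: datetime
    timeend: datetime


def parse_flow(line: str) -> Flow:
    """Parse one pipe-delimited record (assumed 10 fields) into a Flow."""
    parts = line.strip().split("|")
    if len(parts) != 10:
        raise ValueError(f"expected 10 fields, got {len(parts)}: {line!r}")
    src, dst, sport, dport, proto, packets, octets, flows, start, end = parts
    return Flow(
        ip_source=src,
        ip_destination=dst,
        port_source=int(sport),
        port_destination=int(dport),
        protocol=proto,
        packets=int(packets),
        octets=int(octets),
        flows=int(flows),
        # The last two fields look like Unix epoch seconds (assumption).
        timestart=datetime.fromtimestamp(int(start), tz=timezone.utc),
        timeend=datetime.fromtimestamp(int(end), tz=timezone.utc),
    )


sample = parse_flow("||63499|80|TCP-WWW|2|96|1|979067547|979067550")
```

Read as epoch seconds, the timestamps in the sample fall in January 2001, which is consistent with the dates quoted elsewhere in the deck.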

  4. Process
     • Aggregate netflow data from Router
     • Load into primary database tables
     • Update summary tables
     • Update “heap” tables
     • Wait 10 minutes (and do it again)
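The cycle on this slide can be sketched as a simple loop. This is only an illustration in Python, assuming each step is a callable supplied by the caller; the function names `run_cycle` and `run_forever` and the step names in the usage note are hypothetical, and the real system's scheduling mechanism is not described in the slides.

```python
import time
from typing import Callable, List, Optional, Sequence, Tuple

Step = Tuple[str, Callable[[], None]]


def run_cycle(steps: Sequence[Step]) -> List[str]:
    """Run each step once, in order, and return the names that completed."""
    completed = []
    for name, step in steps:
        step()
        completed.append(name)
    return completed


def run_forever(
    steps: Sequence[Step],
    interval_seconds: int = 600,          # "Wait 10 minutes"
    sleep: Callable[[float], None] = time.sleep,
    max_cycles: Optional[int] = None,     # None means loop indefinitely
) -> int:
    """Repeat the cycle, pausing between runs; returns the cycle count."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        run_cycle(steps)
        cycles += 1
        sleep(interval_seconds)
    return cycles
```

A caller would pass the four steps in slide order, for example `[("aggregate", aggregate), ("load_primary", load_primary), ("update_summary", update_summary), ("update_heap", update_heap)]`. The `sleep` and `max_cycles` parameters exist only so the loop can be exercised without actually waiting.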

  5. Primary IPheaders table
     • All network flows must meet a minimum traffic threshold to be included in the live database (for MIRnet, this is set to 10K)
     • Loses 3% of total traffic volume but eliminates 95% of records
     • All data kept in archives
     • Currently maintains 17,000,000+ network flow records (June 1, 2001)

     *************************** 1. row ***************************
            ip_source:
       ip_destination:
          port_source: 40C-45C
     port_destination: 25
             protocol: TCP-SMTP
              packets: 199
               octets: 285413
                flows: 1
            timestart: 2000-08-28 22:50:21
              timeend: 1999-09-08 06:18:09
              channel: BE
          periodbegin: 1999-09-08 06:11:49
       periodduration: 600
                keyid: 2
        domain_source: 42
          domain_dest: 28
     *************************** 2. row ***************************
            ip_source:
       ip_destination:
          port_source: 80
     port_destination: 1K-2K
             protocol: TCP-WWW
              packets: 11
               octets: 11128
                flows: 1
            timestart: 2000-08-29 18:39:41
              timeend: 1999-09-08 06:20:52
              channel: BE
          periodbegin: 1999-09-08 06:11:49
       periodduration: 600
                keyid: 3
        domain_source: 9
          domain_dest: 125
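The threshold rule on this slide amounts to a filter plus two ratios. The Python sketch below assumes "10K" means 10,000 octets per flow record, which the slide does not state explicitly. The name `apply_threshold` and the dictionary keys are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def apply_threshold(
    flows: Iterable[Dict[str, int]],
    min_octets: int = 10_000,   # assumed reading of the slide's "10K"
) -> Tuple[List[Dict[str, int]], Dict[str, float]]:
    """Keep flows at or above the threshold and report what was lost."""
    flows = list(flows)
    kept = [f for f in flows if f["octets"] >= min_octets]
    total_octets = sum(f["octets"] for f in flows)
    kept_octets = sum(f["octets"] for f in kept)
    stats = {
        # Fraction of records removed from the live database.
        "records_dropped": (len(flows) - len(kept)) / len(flows) if flows else 0.0,
        # Fraction of traffic volume that those removed records carried.
        "volume_lost": (total_octets - kept_octets) / total_octets if total_octets else 0.0,
    }
    return kept, stats
```

The figures on the slide (3% of volume, 95% of records) correspond to `volume_lost` and `records_dropped` measured over the full MIRnet data set. Since the archives keep all data, a filter like this would apply only to the live tables.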

  6. Primary DNSdata table
     +----------------+---------------------------+----------------+----------------+-----------+
     | ip_address     | ip_name                   | createtime     | modifytime     | ip_domain |
     +----------------+---------------------------+----------------+----------------+-----------+
     |                | icpmac12.epfl.ch          | 20010110104036 | 00000000000000 |      6203 |
     |                | budm31.ar.wroc.pl         | 20010110104036 | 00000000000000 |      3232 |
     |                | ip134-tpas-1.ti.net.ge    | 20010110104032 | 00000000000000 |      6131 |
     |                | dyn081-146.stanmore.ac.uk | 20010110104029 | 00000000000000 |      9760 |
     |                | gosh-atm.ex.ac.uk         | 20010110104026 | 00000000000000 |      9488 |
     |                |                           | 20010110104025 | 00000000000000 |         2 |
     |                |                           | 20010110104025 | 00000000000000 |         2 |
     |                |                           | 20010110104024 | 00000000000000 |         2 |
     |                | paul.cvcp.ac.uk           | 20010110104024 | 00000000000000 |      9456 |
     |                |                           | 20010110104023 | 00000000000000 |         2 |
     |                | endo1.endoc.med.unipi.it  | 20010110104023 | 00000000000000 |      6214 |
     |                |                           | 20010110104022 | 00000000000000 |         2 |
     |                |                           | 20010110104022 | 00000000000000 |         2 |
     |                |                           | 20010110104022 | 00000000000000 |         2 |
     |                | imb.hope.ac.uk            | 20010110104021 | 00000000000000 |      9526 |
     +----------------+---------------------------+----------------+----------------+-----------+
     Currently maintains 806,431 DNSdata IP records (January 10, 2001)

  7. Primary Domains table
     • Heart and soul of MADAS system
     • Adding new “intelligence” to this database enables entirely new classes of analysis
     • Currently maintains 11,771 domain records (January 10, 2001)

     *************************** 1. row ***************************
        domainid: 715
      domainname: anl.gov
        latitude: 41.858
       longitude: -88.017
     domainlabel: Argonne Natl Lab
      createtime: 20010103224037
      modifytime: 20001227191828
          origin: US
      shortlabel: Argonne Natl Lab
        location:
       pdomainid: 715
       rdomainid: 715
         loccity: Chicago
        locstate: IL
      loccountry: United States
        orgclass: US Government,US Govt DOE
      worldclass: North America
     regionclass: USA Great Lakes
     *************************** 2. row ***************************
        domainid: 948
      domainname: doe.gov
        latitude: 38.892
       longitude: -77.017
     domainlabel: US Department of Energy
      createtime: 20001227170946
      modifytime: 20001227170946
          origin: US
      shortlabel: US-DOE
        location: Washington, DC
       pdomainid: 948
       rdomainid: 948
         loccity: Washington
        locstate: DC
      loccountry: United States
        orgclass: US Government,US Govt DOE
      worldclass: North America
     regionclass: USA Atlantic Central

  8. Other Primary Tables
     • IP Today (last 24 hours of ipheaders records)
     • Country Codes
     • Parent domains
     • Color mappings

     Country Codes
     +------+--------------------------+---------------+
     | code | country                  | worldclass    |
     +------+--------------------------+---------------+
     | ??   | Unknown                  | Unclassified  |
     | AC   | Ascension Island         | Other         |
     | AD   | Andorra                  | Europe        |
     | AE   | United Arab Emirates     | Middle East   |
     | AF   | Afghanistan(Islamic St.) | Middle East   |
     | AG   | Antigua and Barbuda      | North America |
     | AI   | Anguilla                 | Other         |
     | AL   | Albania                  | Europe        |
     | AM   | Armenia                  | Middle East   |
     | AN   | Netherland Antilles      | Other         |
     +------+--------------------------+---------------+

     Parent domains
     +----------+-------------+
     | parentid | parentname  |
     +----------+-------------+
     |     1308 | ac.jp       |
     |        3 | ac.ru       |
     |      959 | ac.uk       |
     |      986 | edu.tw      |
     |        6 | free.net    |
     |      735 | nasa.gov    |
     |       41 | nlanr.net   |
     |     4762 | ircache.net |
     |      100 | ras.ru      |
     +----------+-------------+

     Color mappings
     +-------+---------+
     | code  | value   |
     +-------+---------+
     | ??    | pink    |
     | CA    | lblue   |
     | CH    | purple  |
     | DE    | lbrown  |
     | DK    | green   |
     | EE    | dgray   |
     | FI    | white   |
     | FR    | cyan    |
     | IL    | gold    |
     | IT    | lred    |
     | JP    | dpink   |
     | NL    | lpurple |
     | NO    | gray    |
     | Other | lyellow |
     | PL    | orange  |
     | RU    | blue    |
     | SE    | lgray   |
     | TW    | yellow  |
     | UK    | marine  |
     | US    | lgreen  |
     +-------+---------+
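One plausible use of the parent domains table is deciding where a hostname's institutional domain begins. The slides do not describe the actual rule, so the Python sketch below is an assumption: when a hostname ends in a listed parent domain, the institution is taken to be one label to the left of that parent; otherwise the last two labels are used. The function name `domain_for_host` is hypothetical, and the parent list is copied from the sample rows on this slide.

```python
from typing import Iterable

# Sample parent domains shown on the slide (not the full table).
PARENT_DOMAINS = frozenset({
    "ac.jp", "ac.ru", "ac.uk", "edu.tw", "free.net",
    "nasa.gov", "nlanr.net", "ircache.net", "ras.ru",
})


def domain_for_host(hostname: str, parents: Iterable[str] = PARENT_DOMAINS) -> str:
    """Return the assumed institutional domain for a hostname."""
    parents = set(parents)
    labels = hostname.lower().rstrip(".").split(".")
    # Try the longest proper suffix first so the most specific parent wins.
    for start in range(1, len(labels)):
        suffix = ".".join(labels[start:])
        if suffix in parents:
            # Keep one extra label to the left of the parent domain.
            return ".".join(labels[start - 1:])
    # No known parent: fall back to the last two labels.
    return ".".join(labels[-2:])
```

Under this rule `paul.cvcp.ac.uk` from the DNSdata sample would group under `cvcp.ac.uk` rather than `ac.uk`, while `icpmac12.epfl.ch` would group under `epfl.ch`.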

  9. Capabilities
     • With these tables (updated every 10 minutes), we can provide all sorts of live (and historical) traffic analysis between world regions, countries, country regions, cities, institutions, organizations, network protocols by year, month, day, hour, minute . . .
     But . . .

  10. Need to use Indexed Summary Tables
      • Database “mirsum”
      • 8 tables updated live every 10 minutes
      • 2 “Heap” (RAM-based) tables used for most live queries
      • Pre-query “optimizer” selects best tables for current query

      Summary tables:
        Domain_date_proto
        Domain_date_proto_mm
        Domain_date
        Domain_date_mm
        Country_date_proto
        Country_date_proto_mm
        Country_date
        Country_date_mm
      Heap tables:
        Heap_domain_date_proto
        Heap_domain_date_proto_mm
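The pre-query "optimizer" is not described beyond this slide, but the table names suggest how such a selector might work. The Python sketch below is an assumption throughout: it builds a table name from the grouping level (domain or country) and whether a protocol breakdown is needed, and routes recent domain-level queries to the heap tables. The slides do not say what the `_mm` suffix means, so it is treated as an opaque variant flag. The function name `choose_summary_table` and its parameters are hypothetical; the lowercase table names follow the `table => "domain_date"` usage in the deck's Perl example.

```python
def choose_summary_table(
    group_by: str,
    need_protocol: bool,
    mm_variant: bool = False,    # selects the "_mm" tables; meaning not given in the slides
    recent_only: bool = False,   # query covers only data held in the heap tables (assumed)
) -> str:
    """Pick the smallest summary table that can answer the query."""
    if group_by not in ("domain", "country"):
        raise ValueError(f"unsupported grouping: {group_by!r}")

    if recent_only and group_by == "domain":
        # Only domain/date/protocol heap tables are listed on the slide; a
        # query without a protocol breakdown can still sum over protocols.
        return "heap_domain_date_proto" + ("_mm" if mm_variant else "")

    name = f"{group_by}_date"
    if need_protocol:
        name += "_proto"
    if mm_variant:
        name += "_mm"
    return name
```

Preferring the table without the protocol column when no protocol breakdown is requested keeps the row count per query as small as possible, which is the usual motivation for maintaining several pre-aggregated summary tables.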

  11. A word about technologies
      • No proprietary software
      • MySQL for database
      • PHP for query interface
      • Web/CGI for stats interface
      • Perl for code/CGI base
      • DBI for interaction with MySQL
      • GD::Graph graphics libraries

  12. Analysis that in original MADAS system took 400-500 lines of perl code, now looks like:

      Perl Code (object-oriented)

      #### 2 ##########
      # chart showing total volume with breakdown by top countries
      my $self = MADAS::Country->new(
          database    => "mirsum",
          table       => "domain_date",
          variable    => "origin_dest",
          imagemapcgi => "/cgi-bin/madas/printtable.pl",
          imagemap    => 0,
          percent     => 1,
          graphtype   => "bars",
          title1      => "Total MIRnet Traffic Flow by Destination Country",
          rh_input    => \%in);
      $self->set_title2("Period: <b>" . $self->get_timebegin . "</b> - <b>" . $self->get_timeend . "</b>");
      $self->doit();

  13. Demonstration

  14. World Regions (by country)

  15. Countries (by domain)

  16. US Regions Russian Regions

  17. DOE NASA US Government DOD

  18. Advantages
      • Higher-level analysis of network usage (“not just for engineers”)
      • System encourages “exploration”
      • Better understanding of ‘users’ and their applications
      • Immediate feedback on traffic problems/issues

  19. Future Plans
      • Evaluate shared use of Domains and DNSdata tables (perhaps via LDAP)
      • Standard monthly and quarterly reports of traffic utilization
      • “Monster” query
      • “Project” level accounting/analysis
      more . . .

  20. Future Plans (continued)
      • Create always-running “server” to maintain data, provide “instant stats”, manage web site/interface
      • Provide statistical analysis routines
      • Create database to maintain all “global” settings
      • Port-level analysis (looking for “napster”, etc.)
      more . . .

  21. Future Plans (continued)
      • Explore integration/sharing with HPIIS projects (others?)
      • Develop data maintenance applications for Domains database
      • Develop ‘world-map’ graphics applications
      more . . .

  22. Future Plans (continued)
      • Develop “partnerships” analyses (looking at domain-domain and machine-machine partnerships)
      • Add additional “organizational” classes (e.g., “US Govt DOE”, “University”)
      • Add state-level analyses
      • Clean-up/refine Domains database
      more . . .

  23. Future Plans (continued)
      • Add “science” classifiers and “project” identifiers to regular traffic flows
      • Integrate this with database describing high performance network science applications
      • Integrate back-end reporting with front-end reservation system

  24. Future Plans (continued)
      • Authentication system for machine-level inquiry/analysis
      • Device independent display of usage (for text-only, email, WAP devices)
      • Handle IP address cache expiration problem
      • Etc. . . .
