CS519 BGP Project Report Kai-Wen Chung (kc279) San-Yiu Cheng (sc345)
How to Proceed with BGP Analysis • Collect raw data • Import into database • Query database and analyze data
Collect Raw Data • MAE-EAST (1998.1 ~ 1998.11) • http://archive.routeviews.org/ (2003.1 ~ 2003.3)
Database Schema • Original Schema
Database Schema (cont.) • Record Size • Message: 94 bytes/record • MsgPath: 18 bytes/record • Number of Records • Message: 104,841,405 (98.1 ~ 98.11) • MsgPath: 251,442,478 (98.1 ~ 98.11)
Database Schema (cont.) • Database space allocation: 20GB • About 12 hours to import the raw data for 1 month (about 10,000,000 messages and 20,000,000 paths) • Data volume will soon reach the space limit
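The import figures above imply a fairly low row throughput. A quick back-of-the-envelope check, assuming the quoted monthly volumes are typical of every month of 1998 (an assumption, since the slides give only approximate figures):

```python
# Figures quoted on the slide (approximate).
MESSAGES_PER_MONTH = 10_000_000
PATHS_PER_MONTH = 20_000_000
IMPORT_HOURS_PER_MONTH = 12

rows_per_month = MESSAGES_PER_MONTH + PATHS_PER_MONTH
rows_per_second = rows_per_month / (IMPORT_HOURS_PER_MONTH * 3600)

# Assumption: all 11 months of 1998 data import at the same rate.
hours_for_1998 = 11 * IMPORT_HOURS_PER_MONTH

print(f"Import rate: about {rows_per_second:.0f} rows/second")
print(f"Importing 1998.1 ~ 1998.11: about {hours_for_1998} hours")
```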
Our Solution • Allocate larger space • Move Database from SQLServer -> Sparrow • Total 70GB • Modify data schema to reduce record size
Data Schema Modification • Record Size • Message: 52 bytes/record • MsgPath: 14 bytes/record • Size Reduction • Message: 46.9% • MsgPath: 22.2% • Faster data import
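To see what the smaller records mean in absolute terms, the sketch below multiplies the 1998 record counts by the per-record sizes quoted on the slides. It covers raw row payload only; indexes, page overhead, and logs are not included, so the space actually used in the database will be larger.

```python
# Record counts for 1998.1 ~ 1998.11 and per-record sizes, as quoted on the slides.
RECORDS = {"Message": 104_841_405, "MsgPath": 251_442_478}
OLD_BYTES = {"Message": 94, "MsgPath": 18}
NEW_BYTES = {"Message": 52, "MsgPath": 14}

def payload_gb(bytes_per_record):
    """Raw row payload per table in GB (10^9 bytes), ignoring indexes and overhead."""
    return {t: RECORDS[t] * bytes_per_record[t] / 1e9 for t in RECORDS}

old = payload_gb(OLD_BYTES)
new = payload_gb(NEW_BYTES)

for table in RECORDS:
    print(f"{table}: {old[table]:.2f} GB -> {new[table]:.2f} GB")
print(f"Total: {sum(old.values()):.2f} GB -> {sum(new.values()):.2f} GB")
```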
Current Status • Database • P3-500 with 128MB RAM, running Windows 2000 Server and SQL Server 2000 • Imported Data • 1998.1 ~ 1998.11: about 21GB in DB • 2003.3: about 34GB in DB
Current Database Issues • SQL Server Performance • A single query can take several hours to run • Space problem • 70GB is only enough for 1 ~ 2 months of 2003 data • We need a “terabyte” database to accommodate all data of 2002 and 2003
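The "terabyte" estimate can be checked roughly from the imported data. This sketch assumes every month of 2002 and 2003 is as large as 2003.3 (about 34GB in the database); that is an assumption for illustration, and the real monthly volume will vary.

```python
GB_PER_MONTH = 34        # from the slides: 2003.3 occupies about 34GB in the DB
ALLOCATED_GB = 70        # from the slides: total space on the new server
MONTHS_2002_2003 = 24    # assumption: all 12 months of both years are imported

projected_gb = GB_PER_MONTH * MONTHS_2002_2003
# Upper bound on whole months that fit; the 1998 data (about 21GB) also shares this space.
months_that_fit = ALLOCATED_GB // GB_PER_MONTH

print(f"Projected size for 2002 and 2003: about {projected_gb} GB")
print(f"Months of 2003 data that fit in {ALLOCATED_GB} GB: at most {months_that_fit}")
```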
Summary of Data • Total space used: • ~55GB (1998 and 03/2003) • Number of Messages: • ~220.5 million (1998 and 03/2003) • Number of DataSets: • ~30,000 (1998 and 03/2003)
Summary of Data (cont.) • A small number of IP addresses dominate the routing table • 15 source IP addresses account for about 68% of the PeerIp values in the Messages • 15 destination IP addresses account for about 47% of the NextHop values in the Messages
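A share like "the top 15 addresses account for N% of messages" can be computed with one grouped query. The sketch below is not the project's actual query: it uses Python's built-in sqlite3 as a stand-in for SQL Server, a toy Message table with only the PeerIp and NextHop columns named on the slide, and made-up rows. On SQL Server the `LIMIT` clause would be written with `TOP` instead.

```python
import sqlite3

def top_n_share(conn, column, n=15):
    """Fraction of Message rows whose `column` is one of the n most frequent values."""
    # Column names cannot be bound as SQL parameters, so only allow known ones.
    if column not in ("PeerIp", "NextHop"):
        raise ValueError(f"unexpected column: {column}")
    total = conn.execute("SELECT COUNT(*) FROM Message").fetchone()[0]
    top = conn.execute(
        f"SELECT COALESCE(SUM(cnt), 0) FROM ("
        f"SELECT COUNT(*) AS cnt FROM Message GROUP BY {column} "
        f"ORDER BY cnt DESC LIMIT ?)",
        (n,),
    ).fetchone()[0]
    return top / total

# Hypothetical toy data: three peers sending 6, 3, and 1 messages.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Message (PeerIp TEXT, NextHop TEXT)")
rows = (
    [("10.0.0.1", "10.1.0.1")] * 6
    + [("10.0.0.2", "10.1.0.2")] * 3
    + [("10.0.0.3", "10.1.0.3")]
)
conn.executemany("INSERT INTO Message VALUES (?, ?)", rows)

share = top_n_share(conn, "PeerIp", n=2)
print(f"Top 2 peers account for {share:.0%} of messages")
```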
Summary of Data (cont.) • Advertisement vs. Withdrawal Messages • There are about 220 million Messages • ~31.5% of all Messages are Withdrawal Messages • ~68.5% of all Messages are Advertisement Messages
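In absolute numbers, the percentages above translate roughly as follows. The total of ~220.5 million is the figure from the data summary; all values are approximate.

```python
TOTAL_MESSAGES = 220_500_000   # ~220.5 million, from the data summary
WITHDRAWAL_SHARE = 0.315
ADVERTISEMENT_SHARE = 0.685

withdrawals = round(TOTAL_MESSAGES * WITHDRAWAL_SHARE)
advertisements = round(TOTAL_MESSAGES * ADVERTISEMENT_SHARE)

print(f"Withdrawals:    about {withdrawals / 1e6:.1f} million")
print(f"Advertisements: about {advertisements / 1e6:.1f} million")
```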
Some Advice • Optimize your queries • Some queries are going to take several hours to execute • Test on bgpbaby first • This is a smaller version of bgpdata (~1GB) • Don’t try to execute all your queries on the last day • The SQL Server database is going to be overwhelmed
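One way to follow the "test on the small database first" advice is to time each query on the small copy before running it on the full data. A minimal sketch, using Python's built-in sqlite3 and an in-memory toy table as a stand-in for bgpbaby; `timed_query` is a hypothetical helper, not part of any project tooling.

```python
import sqlite3
import time

def timed_query(conn, sql, params=()):
    """Run a query, returning (rows, elapsed_seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start

# Toy stand-in for the small test database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Message (PeerIp TEXT)")
conn.executemany(
    "INSERT INTO Message VALUES (?)",
    [("10.0.0.1",), ("10.0.0.1",), ("10.0.0.2",)],
)

rows, elapsed = timed_query(conn, "SELECT COUNT(*) FROM Message")
print(f"{rows[0][0]} rows counted in {elapsed:.6f} s")
```

If a query is already slow on the small copy, it is a candidate for optimization before it goes anywhere near the full database.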