Data Extraction, Visualization and Processing
Minor Project Presentation
Santosh Ghimire – 066 BCT 533
Subit Raj Pokharel – 066 BCT 538
Sudip Kafle – 066 BCT 539
Data Extraction
• Election data
• GIS data for the coordinates of districts
• District population by ethnicity
• Data for district-level indicators
• The different data sets are available in different file formats.
Extraction
• The parser extracts data from each file and saves it to the database.
Parsing
• An XML file has a tree-node structure.
• The required data lies between opening and closing tags.
• PDF has no standard format for storing data.
• A PDF file is first converted to plain text.
• HTML has a DOM structure.
• Unlike XML, the data in HTML may not be represented structurally.
• For PDF and HTML, data is extracted using regular expressions.
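The regular-expression step can be sketched as follows. This is a minimal illustration, not the project's actual parser (which the slides say was written in C#): the row layout, the function name `extractRows`, and the sample values are all assumptions made for the example.

```javascript
// Hypothetical sketch: pull (district, votes) pairs out of HTML table rows
// with a regular expression. The two-cell row layout is assumed.
function extractRows(html) {
  const rowPattern = /<tr>\s*<td>([^<]+)<\/td>\s*<td>([\d,]+)<\/td>\s*<\/tr>/gi;
  const rows = [];
  let match;
  // exec() with the "g" flag resumes after the previous match on each call.
  while ((match = rowPattern.exec(html)) !== null) {
    rows.push({
      district: match[1].trim(),
      // Strip thousands separators before converting to a number.
      votes: parseInt(match[2].replace(/,/g, ""), 10),
    });
  }
  return rows;
}

// Made-up sample input, for illustration only.
const sampleHtml =
  "<table>" +
  "<tr><td>Kathmandu</td><td>12,345</td></tr>" +
  "<tr><td> Kaski </td><td>6789</td></tr>" +
  "</table>";

console.log(extractRows(sampleHtml));
```

Each extracted object could then be written to the database as one record. A pattern like this is brittle if the markup varies, which is the trade-off of using regular expressions on loosely structured HTML.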
Data Management
• The admin must log in to manage data.
• The admin can add, update, and delete data by searching on various criteria.
• Only an existing admin can register a new admin.
Visualization on Map
• Uses the Google Maps API.
• JavaScript is used on the client side.
• jQuery and JSON are used to implement AJAX.
• Request flow: the user sets new criteria for the map; the web server acknowledges the request and sends the map data in JSON format; the new map is shown on the web page.
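The client-side handling of the JSON reply might look like the sketch below. The payload shape (`{ districts: [{ name, lat, lng, value }] }`), the function name `toMarkerOptions`, and the sample values are assumptions; the slides do not show the actual format.

```javascript
// Hypothetical sketch: convert the server's JSON reply into marker options.
// The payload shape is assumed, not taken from the project.
function toMarkerOptions(jsonText) {
  const payload = JSON.parse(jsonText);
  return payload.districts.map((d) => ({
    position: { lat: d.lat, lng: d.lng },
    title: d.name + ": " + d.value,
  }));
}

// Made-up reply with approximate coordinates, for illustration only.
const reply = JSON.stringify({
  districts: [
    { name: "Kathmandu", lat: 27.7, lng: 85.3, value: 100 },
    { name: "Kaski", lat: 28.2, lng: 84.0, value: 40 },
  ],
});

const markers = toMarkerOptions(reply);
console.log(markers);
// In a browser with the Maps JavaScript API loaded, each entry could be
// passed to `new google.maps.Marker({ ...options, map })`.
```

Keeping the conversion in a plain function, separate from the map calls, lets it be checked without loading the map library.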
Visualization with Tag Cloud
• Shows an overview of how the data is distributed.
• One dimension is represented by the text displayed (e.g. the name of a district).
• The other dimension is represented by the weight (font size and color) of the text.
• Implemented using CSS.
• The weight of each tag is calculated statistically from population.
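One way to derive a tag weight from population is sketched below. The slides do not say which statistic the project used, so the linear min-max scaling, the 12 to 36 px range, and the name `tagWeights` are assumptions.

```javascript
// Hypothetical sketch: map each tag's population to a CSS font size by
// linear min-max scaling. The scaling rule and pixel range are assumed.
function tagWeights(tags, minPx = 12, maxPx = 36) {
  const pops = tags.map((t) => t.population);
  const lo = Math.min(...pops);
  const hi = Math.max(...pops);
  const span = hi - lo;
  return tags.map((t) => {
    // If every population is equal, place all tags at the midpoint.
    const ratio = span === 0 ? 0.5 : (t.population - lo) / span;
    return {
      name: t.name,
      fontSize: Math.round(minPx + ratio * (maxPx - minPx)),
    };
  });
}

// Made-up tags, for illustration only.
const sizes = tagWeights([
  { name: "A", population: 100 },
  { name: "B", population: 200 },
  { name: "C", population: 300 },
]);

console.log(sizes);
```

Each `fontSize` could then be applied as an inline CSS `font-size` on the tag's element. A logarithmic scale is a common alternative when a few populations are much larger than the rest.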
[Figure: states labeled State No. 1 through State No. 8]
Analyzing Feasibility of Federal States
• Districts can be selected to form a new state.
• Aggregate data for each state is obtained from the database.
• The data can include the top caste, the top parties in the election, and the development index.
• The coefficient of variation is used to assess whether the proposed states are feasible.
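The coefficient of variation is the standard deviation divided by the mean. A minimal sketch follows; it assumes the population (not sample) standard deviation, and the function name and input values are illustrative. The slides do not state which variant or what threshold the project used.

```javascript
// Hypothetical sketch: coefficient of variation (CV) of one indicator
// across proposed states, using the population standard deviation.
function coefficientOfVariation(values) {
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / n;
  return Math.sqrt(variance) / mean;
}

// Made-up indicator values, one per proposed state.
const cv = coefficientOfVariation([2, 4, 4, 4, 5, 5, 7, 9]);
console.log(cv);
```

A lower value means the indicator is spread more evenly across the proposed states. The result is undefined when the mean is zero, so a real implementation would need to guard against that case.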
Facts Finder
• Informative facts are extracted from the raw data in the database.
• The user can choose from multiple criteria.
• Nested SQL queries are used.
Methodology
• Programming languages: C# with ASP.NET, JavaScript, jQuery
• Database engine: MS SQL Server 2008
• Web technologies: JSON, AJAX
• Google Maps API
Project Management
• Each phase was divided into small chunks.
• The chunks were assigned to team members.
• An online repository was created on BitBucket.org.
• The Mercurial-based TortoiseHg client was used.
• Work was synchronized among all members.
• Weekly discussions were held with a senior developer at YIPL Nepal.