Matt Bernier, Joey Murphy, David Coleman
Linked Data Visualization
Needs Analysis
• Allow users to view data sets graphically using intuitive and efficient controls
  • Specifically, to view links among data points
• Contemporary methods include diagrams, graphs, and lists
• Enable users to perform analysis on their data
Market Analysis
• Linked data is present in several environments:
  • Search engines (page ranking)
  • Social networks (recreational, academic, professional)
  • Other database-driven sites
  • Computer networks
Market Analysis: Market Users
• Website owners (50-100 million active domains, multiple sites per domain)
• Enterprise internal site managers
• Social network operators and users (more than 200 sites online)
• Network administrators
Background
• Web sites that support large numbers of users are very popular
• Finding common usage statistics can be very beneficial
  • Purchasing similar products
  • Participating in common discussions
  • Common browsing habits
Background
• Showing web links
• How websites link together
• Visualizing the web
• Visualizing any linked data sets
An Existing Application Create Random Nodes
Goals and Objectives
• Overall goal: create an intuitive web-based tool that allows users to see links within their data
• Allow users to analyze and infer information from the links
• Make it easy for web programmers to implement the graph on their site using a PHP class structure
Tools
• HTML, CSS (data presentation)
• PHP (data objects, processing)
• JavaScript (graph creation, interaction)
• JSViz (framework for dynamic views; its force-directed algorithm creates an aesthetically pleasing graph)
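The force-directed idea mentioned for JSViz can be sketched in a few lines of plain JavaScript. This is a minimal illustration and not JSViz's actual code: every pair of nodes repels, every link acts as a spring, and positions are nudged along the net force on each pass. The function name `forceLayout` and all option names and default values are assumptions chosen for the example.

```javascript
// Minimal force-directed layout sketch; not JSViz's actual implementation.
// nodes: [{ id, x, y }] with distinct starting positions (mutated in place).
// links: [[sourceId, targetId], ...].
// All option names and defaults below are illustrative assumptions.
function forceLayout(nodes, links, options = {}) {
  const {
    iterations = 200, // number of relaxation passes
    repulsion = 1000, // strength of node-to-node repulsion
    stiffness = 0.05, // spring constant of a link
    restLength = 50,  // preferred link length
    step = 0.1,       // fraction of the net force applied per pass
  } = options;
  const byId = new Map(nodes.map((n) => [n.id, n]));

  for (let it = 0; it < iterations; it++) {
    const force = new Map(nodes.map((n) => [n.id, { x: 0, y: 0 }]));

    // Repulsion: every pair of nodes pushes apart (inverse-square).
    for (let i = 0; i < nodes.length; i++) {
      for (let j = i + 1; j < nodes.length; j++) {
        const a = nodes[i];
        const b = nodes[j];
        const dx = a.x - b.x;
        const dy = a.y - b.y;
        const distSq = Math.max(dx * dx + dy * dy, 0.01);
        const dist = Math.sqrt(distSq);
        const f = repulsion / distSq;
        force.get(a.id).x += (dx / dist) * f;
        force.get(a.id).y += (dy / dist) * f;
        force.get(b.id).x -= (dx / dist) * f;
        force.get(b.id).y -= (dy / dist) * f;
      }
    }

    // Attraction: each link is a spring pulling toward restLength.
    for (const [s, t] of links) {
      const a = byId.get(s);
      const b = byId.get(t);
      if (!a || !b) continue; // skip links to unknown nodes
      const dx = b.x - a.x;
      const dy = b.y - a.y;
      const dist = Math.max(Math.hypot(dx, dy), 0.01);
      const f = stiffness * (dist - restLength);
      force.get(a.id).x += (dx / dist) * f;
      force.get(a.id).y += (dy / dist) * f;
      force.get(b.id).x -= (dx / dist) * f;
      force.get(b.id).y -= (dy / dist) * f;
    }

    // Move each node a small step along its net force.
    for (const n of nodes) {
      const f = force.get(n.id);
      n.x += step * f.x;
      n.y += step * f.y;
    }
  }
  return nodes;
}
```

The pairwise repulsion loop is quadratic in the number of nodes, which is one reason a layout of this kind slows down as the data set grows.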
Literature Review: General Ideas
• Building graphs from data sets
• Displaying data
• Data analysis
• Examining and inferring relationships
• Prediction
• Application to the real world
Literature Review
• Presenting data to users
• Tree structures, Data -> Information
• “Inducing the chosen mental model in the mind of the observer”
  • Easy to understand
  • Allows for more information to be absorbed by observers
• Source: Aaron Kershenbaum and Keitha Murray. In Journal of Circuit Systems and Computers
Literature Review
• Many theories and techniques exist for graph analysis, but not for graph construction
• Choice of nodes and links
  • What is represented by a node?
  • What is represented by a link?
• These choices greatly influence meaning in a linked data display
  • e.g. hyperlinks, Enron email dataset
• Sources: A. Badia and M. Kantardzic. In Proceedings of the 3rd international workshop on Link discovery, LinkKDD '05; J. Shetty and J. Adibi. In KDD ’05
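To make the node and link choice concrete, here is a minimal sketch that builds a graph from email-like records under one possible choice: a node is a person, and a link means that two people exchanged at least one message in either direction. The record shape `{ from, to }`, the function name `buildGraph`, and the sample names in the test data are assumptions for illustration only and are not taken from the Enron dataset.

```javascript
// Sketch: build an undirected, weighted graph from message records.
// Assumed choices: node = person, link = at least one message exchanged,
// weight = number of messages between the pair (either direction).
function buildGraph(messages) {
  const nodes = new Set();
  const weights = new Map(); // JSON key of the sorted pair -> message count

  for (const { from, to } of messages) {
    nodes.add(from);
    nodes.add(to);
    if (from === to) continue; // ignore self-addressed messages
    // Sort the pair so a->b and b->a map to the same link.
    const key = JSON.stringify([from, to].sort());
    weights.set(key, (weights.get(key) || 0) + 1);
  }

  const links = [...weights].map(([key, weight]) => {
    const [source, target] = JSON.parse(key);
    return { source, target, weight };
  });
  return { nodes: [...nodes], links };
}
```

Choosing differently, for example one node per message with links for replies, would produce a different graph from the same records, which is the point the slide makes about meaning.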
Literature Review
• Link mining: analyzing links
• Makes use of descriptive and predictive modeling (data mining)
  • e.g. determining webpage relevance based on anchor text and surrounding text of incoming hyperlinks
  • e.g. segregating website users into groups based on common behaviours
• Source: L. Getoor. In ACM SIGKDD Explorations Newsletter, Vol. 5, Issue 1, 2003
Literature Review
• Link prediction
• Uses node proximity
• “Information about future interactions can be extracted from network topology alone”
• Predicting links that represent online social interaction can help to determine the feasibility of adding new interaction features to a site
• Source: D. Liben-Nowell and J. Kleinberg. In CIKM '03
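One simple proximity measure of the kind used for link prediction is the number of common neighbours: two unlinked nodes that share many neighbours are ranked as likely future links. The sketch below is an illustration under that assumption; the function name `predictLinks` and the `topK` parameter are hypothetical, and nothing here reproduces the cited paper's experiments.

```javascript
// Sketch: rank unlinked node pairs by their number of common neighbours.
// links: [[a, b], ...] treated as undirected. Returns up to topK candidates
// as { pair: [a, b], score }, highest score first.
function predictLinks(links, topK = 3) {
  const neighbours = new Map();
  const connect = (a, b) => {
    if (!neighbours.has(a)) neighbours.set(a, new Set());
    neighbours.get(a).add(b);
  };
  for (const [a, b] of links) {
    connect(a, b);
    connect(b, a);
  }

  const ids = [...neighbours.keys()];
  const candidates = [];
  for (let i = 0; i < ids.length; i++) {
    for (let j = i + 1; j < ids.length; j++) {
      const a = ids[i];
      const b = ids[j];
      if (neighbours.get(a).has(b)) continue; // already linked
      let score = 0;
      for (const n of neighbours.get(a)) {
        if (neighbours.get(b).has(n)) score++;
      }
      if (score > 0) candidates.push({ pair: [a, b], score });
    }
  }

  candidates.sort((x, y) => y.score - x.score);
  return candidates.slice(0, topK);
}
```

Only the graph's topology is consulted, in line with the quoted claim; node attributes such as user profiles play no part in the score.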
Patent Analysis
• Computer-implemented system and method for handling linked data views, Patent number 7,068,267, held by SAS Institute Inc.
• A first view and a second view are used to display at least a portion of the data observations contained in the data model. Conditional data that is associated with the second view specifies how the second view's display is modified based upon a selection of a data observation within the first view.
Advantages
• Design allows for customization
  • Custom data objects
  • Almost all visual aspects of the graph are easily changed or left as default settings
Disadvantages
• Requires a network connection and a browser
  • Or an Apache and PHP installation on a local machine
• As the dataset grows larger, application performance may degrade
• Possible browser compatibility issues
  • These are typical web issues with HTML, JavaScript, and CSS rendering
Requirements Analysis
• Functionality (performance)
• Flexibility
  • Allow users and developers to customize and deploy application as they see fit
• Reliability
  • Provide an accurate data representation
• Quality
  • Provide a meaningful, visual representation of data
Requirements Analysis: Operating Environment
• Scripts:
  • Run on a web server with a PHP (4.0+) installation
  • Can interface with databases
• Users:
  • Cross-system
  • Cross-browser
Requirements Analysis
• Interfaces
  • A PHP class is provided, and the data to be visualized is added by the user.
• Performance Requirements
  • Time required to produce display varies with size of dataset
  • 1-10 seconds
  • Restrict size of datasets to prevent browser/computer from suffering
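One way to restrict dataset size before rendering is to keep only the best-connected nodes. The sketch below assumes that policy purely for illustration; the slides do not say how the size limit is enforced, and the function name `capGraph` and the `maxNodes` parameter are hypothetical.

```javascript
// Sketch: cap a graph at maxNodes by keeping the highest-degree nodes
// and only the links whose two endpoints both survive.
// nodes: [id, ...]; links: [[a, b], ...].
function capGraph(nodes, links, maxNodes) {
  if (nodes.length <= maxNodes) {
    return { nodes: [...nodes], links: [...links] };
  }

  // Count how many links touch each node.
  const degree = new Map(nodes.map((id) => [id, 0]));
  for (const [a, b] of links) {
    degree.set(a, (degree.get(a) || 0) + 1);
    degree.set(b, (degree.get(b) || 0) + 1);
  }

  // Keep the maxNodes best-connected nodes.
  const kept = new Set(
    [...nodes].sort((x, y) => degree.get(y) - degree.get(x)).slice(0, maxNodes)
  );

  return {
    nodes: [...kept],
    links: links.filter(([a, b]) => kept.has(a) && kept.has(b)),
  };
}
```

A suitable `maxNodes` would have to be found by measuring render time in the target browsers against the 1-10 second budget.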
Requirements Analysis
• Resources
  • Design was conceived prior to undertaking the project; 10 man-hours to refine the design
  • Coding: 20 man-hours
  • Testing: 15 man-hours
Demo • Example
Future
• More complex display
• Hyperlinks and/or pictures as nodes
• Re-centering graph by clicking a node
• Mouse-over events for more detail
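Re-centering on a clicked node could start from each node's hop distance to that node, for example to place nearer nodes on inner rings. The sketch below computes those distances with a breadth-first search; it is one possible building block under that assumption and not a planned design from the slides. The function name `hopDistances` is hypothetical.

```javascript
// Sketch: breadth-first search from the clicked node.
// links: [[a, b], ...] treated as undirected.
// Returns a Map of nodeId -> number of hops from centerId;
// nodes that cannot be reached are absent from the map.
function hopDistances(links, centerId) {
  const adjacency = new Map();
  const connect = (a, b) => {
    if (!adjacency.has(a)) adjacency.set(a, []);
    adjacency.get(a).push(b);
  };
  for (const [a, b] of links) {
    connect(a, b);
    connect(b, a);
  }

  const distance = new Map([[centerId, 0]]);
  const queue = [centerId];
  for (let head = 0; head < queue.length; head++) {
    const current = queue[head];
    for (const next of adjacency.get(current) || []) {
      if (!distance.has(next)) {
        distance.set(next, distance.get(current) + 1);
        queue.push(next);
      }
    }
  }
  return distance;
}
```

A click handler could pass the clicked node's id as `centerId` and use the returned distances to decide each node's ring or draw order.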