Visualization of AAG Paper Abstracts. André Skupin Dept. of Geography University of New Orleans AAG Pittsburgh, April 5, 2000. AAG Conference Abstracts. Web Search Engine Interface. Research Motivation I Methodology. Geography’s role in information visualization geographic concepts
Visualization of AAG Paper Abstracts André Skupin Dept. of Geography University of New Orleans AAG Pittsburgh, April 5, 2000
Research Motivation I: Methodology • Geography’s role in information visualization • geographic concepts • regions • scale • cartographic techniques • generalization • labeling • GIS technology • data integration
Research Motivation II: Application • Developments in Academic Geography • based on geography’s written output • generalizable for any corpus of documents
Data Capture & Pre-Processing • Source Data: • abstracts submitted to AAG 1999 Hawaii • complete abstracts as text file • 2220 abstracts • Pre-Processing: • Separation into three parts: • author information • abstract text • keywords chosen by authors
Keyword Component Indexing • (1) extract keywords chosen by authors • (2) break keywords into components • (3) match components against content of all abstracts • result: • all abstracts indexed • overall richer than author-chosen keywords alone • vector-space model with 2220 docs & 741 terms
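The three indexing steps above can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation: the function names `tokenize` and `build_index`, the toy abstracts, the use of raw term counts as weights, and the absence of stemming or stop-word removal are all assumptions.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(abstracts, author_keywords):
    """Return (terms, matrix), where matrix[d][t] counts term t in abstract d.

    abstracts: list of abstract texts
    author_keywords: list of keyword lists, one list per abstract
    """
    # Steps (1) and (2): collect author keywords and break them into
    # single-word components (assumed here to mean individual words).
    components = set()
    for keywords in author_keywords:
        for keyword in keywords:
            components.update(tokenize(keyword))
    terms = sorted(components)
    column = {term: i for i, term in enumerate(terms)}

    # Step (3): match every component against the text of every abstract.
    matrix = []
    for text in abstracts:
        row = [0] * len(terms)
        for token in tokenize(text):
            if token in column:
                row[column[token]] += 1
        matrix.append(row)
    return terms, matrix

# Hypothetical toy corpus, for illustration only.
abstracts = [
    "Urban growth is mapped with GIS and remote sensing.",
    "A visualization of urban land use change.",
]
author_keywords = [["GIS", "urban growth"], ["visualization", "land use"]]
terms, matrix = build_index(abstracts, author_keywords)
```

In this toy corpus the second abstract is indexed under "urban" even though its author did not choose that keyword, which is the sense in which the resulting index is richer than the author-chosen keywords alone.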
Spatialization • projection of elements of a high-dimensional information space into a low-dimensional representation (Skupin & Buttenfield 1997) • > project document/keyword matrix into 2D • Technique: Self-Organizing Map (SOM) • input: raw document/keyword matrix • output: two-dimensional grid of neurons with weight for each keyword
Base Map Creation • Implementation: SOM_PAK & C++ • 1. Choose SOM Dimensions • e.g. 85 x 115 neurons • 2. Train Grid of Neurons • each neuron gets weight for each keyword • preservation of high-dim. document topology • 3. Apply SOM to Data Set • documents assigned to single neurons • 4. Assign unique locations to documents
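The four base-map steps can be illustrated with a small pure-Python self-organizing map. The slides name SOM_PAK and C++ as the actual implementation; the sketch below is a simplified stand-in under stated assumptions: a rectangular grid, a Gaussian neighborhood, linearly decaying learning rate and radius, and random placement inside the winning neuron's cell as one possible way to give documents unique locations. All function names and parameter values are hypothetical.

```python
import math
import random

def best_matching_unit(weights, vec):
    """Return (row, col) of the neuron whose weight vector is closest to vec."""
    best, best_dist = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((a - b) ** 2 for a, b in zip(vec, w))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best

def train_som(data, rows, cols, epochs=20, seed=0):
    """Steps 1 and 2: create a rows x cols grid and train it on the data."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Every neuron starts with a random weight for each keyword.
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    max_radius = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)                      # assumed decay schedule
        radius = max(1.0, max_radius * (1.0 - frac))   # assumed decay schedule
        for vec in rng.sample(data, len(data)):        # shuffled pass over documents
            br, bc = best_matching_unit(weights, vec)
            for r in range(rows):
                for c in range(cols):
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    influence = math.exp(-grid_d2 / (2.0 * radius * radius))
                    w = weights[r][c]
                    for k in range(dim):
                        # Pull the neuron toward the document vector.
                        w[k] += rate * influence * (vec[k] - w[k])
    return weights

def place_documents(weights, data, seed=0):
    """Steps 3 and 4: assign each document to a neuron, then to a point in its cell."""
    rng = random.Random(seed)
    points = []
    for vec in data:
        r, c = best_matching_unit(weights, vec)
        # Assumption: a random offset inside the cell separates documents
        # that share a neuron.
        points.append((c + rng.random(), r + rng.random()))
    return points

# Hypothetical toy matrix: 4 documents x 3 keywords, on a 4 x 5 grid.
data = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = train_som(data, rows=4, cols=5)
points = place_documents(weights, data)
```

Identical document vectors always land on the same neuron, so the offset in `place_documents` is what keeps their map positions distinct.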
Base Map of AAG Abstracts • Complexity • > Generalization ? • > Scale ? • Labeling • > Weighted Index ? • Visualization • > GIS Software ?
High-Dimensional Clusters Projected onto Map • Hierarchical • Coarse SOM • K-Means
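As one illustration of the idea on this slide, the sketch below runs a plain k-means clustering on high-dimensional document vectors and then attaches each cluster label to the document's 2D map position. It is a sketch under assumptions, not the procedure used for the original maps: the slide does not say which vectors were clustered, and the function names, toy vectors, and map coordinates are hypothetical.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(vectors, k, iterations=50, seed=0):
    """Cluster vectors in their original (high-dimensional) space; return labels."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest center.
        labels = [min(range(k), key=lambda j: squared_distance(v, centers[j]))
                  for v in vectors]
        # Move every center to the mean of its members (empty clusters stay put).
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def project_clusters(points, labels):
    """Attach each document's cluster label to its 2D map position."""
    return [(x, y, label) for (x, y), label in zip(points, labels)]

# Hypothetical document vectors and the map positions they were given.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
points = [(0.5, 0.5), (0.7, 0.4), (4.2, 3.1), (4.6, 3.3)]
labels = k_means(vectors, k=2)
mapped = project_clusters(points, labels)
```

Because the clusters are computed before projection, documents with the same label are not guaranteed to be contiguous on the map.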
Map Design for 2D Spatialization: Visual Hierarchies • Geographic Space • Information Space
Research Directions I: Applications • visualize trends in geography • author trajectories through time • subject emergence • geography of geography
Research Directions II: Techniques • Cluster Solutions • U-matrix (-> contiguous clusters in 2D) • AutoClass (-> with optimized cluster numbers) • quantify performance of cluster solutions • Visualization • multi-band thematic visualization
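The U-matrix idea mentioned above can be sketched as follows: for each neuron, average the distance between its weight vector and those of its grid neighbors. High values form ridges between regions of similar neurons, which is why the result yields clusters that are contiguous in 2D. The rectangular 4-neighbor topology, the function name `u_matrix`, and the toy grid are assumptions for illustration.

```python
import math

def u_matrix(weights):
    """Mean Euclidean distance from each neuron to its 4-connected grid neighbors."""
    rows, cols = len(weights), len(weights[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            result[r][c] = sum(dists) / len(dists) if dists else 0.0
    return result

# Toy 1 x 4 grid: two similar neurons, a jump, then two similar neurons.
weights = [[[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]]]
ridge = u_matrix(weights)
```

In the toy grid the two middle neurons receive the highest values, marking the boundary between the two groups.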
Color Composite • “GIS” “urban” “visualization”: Full Extent
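A color composite of this kind can be sketched by treating the neuron weights of three keywords as three image bands. The example below is a minimal sketch under assumptions: the assignment of "GIS", "urban", and "visualization" to red, green, and blue, the linear min-max stretch to 0..255, and the names `color_composite` and `term_index` are all hypothetical and not taken from the slides.

```python
def color_composite(weights, term_index, bands):
    """Return a rows x cols grid of (R, G, B) tuples with values in 0..255.

    weights: rows x cols grid of per-keyword weight vectors
    term_index: maps a keyword to its position in the weight vector
    bands: three keywords, assigned to red, green, and blue in that order
    """
    rows, cols = len(weights), len(weights[0])
    channels = []
    for term in bands:
        k = term_index[term]
        layer = [[weights[r][c][k] for c in range(cols)] for r in range(rows)]
        lo = min(min(row) for row in layer)
        hi = max(max(row) for row in layer)
        span = hi - lo
        # Linear stretch of each band; a constant band is rendered as 0.
        channels.append([[round(255 * (v - lo) / span) if span else 0
                          for v in row] for row in layer])
    return [[tuple(ch[r][c] for ch in channels) for c in range(cols)]
            for r in range(rows)]

# Hypothetical 1 x 2 grid with three keyword weights per neuron.
term_index = {"gis": 0, "urban": 1, "visualization": 2}
weights = [[[0.0, 0.2, 0.5], [1.0, 0.2, 0.0]]]
image = color_composite(weights, term_index, ("gis", "urban", "visualization"))
```

Neurons where two or three of the keywords carry high weights appear in mixed colors, so overlap between themes is visible in a single map.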