Bringing Data Science, Xinformatics and Semantic eScience into the Graduate Curriculum (solicited)
EGU2012-11224 (EOS 6/ ESSI2.3)
April 25, 2012, Vienna
Peter Fox (RPI) pfox@cs.rpi.edu
Tetherless World Constellation
tw.rpi.edu
Themes
Hendler: • Future Web • Web Science • Policy • Social
Fox: • Xinformatics • Data Science • Semantic eScience • Data Frameworks
McGuinness: • Semantic Foundations • Knowledge Provenance • Ontology Engineering Environments • Inference, Trust
Multiple depts/schools/programs
~35 (Post-doc, Staff, Grad, Ugrad)
Application Themes
Govt. Data (Hendler/Erickson): • Open • Linked • Apps
Env. Informatics (Fox): • Ecosystems • Sea Ice • Ocean imagery • Carbon
Health Care/Life Sciences (McGuinness/Luciano): • Population Science • Translational Med • Health Records
Platforms: • Bio-nano tech center • Exp. Media and Perf. Arts Ctr. • Comp. Ctr. Nano. Innov. • Data Intensive
http://tw.rpi.edu/web/Courses
[Figure: courses placed on a Data, Information, Knowledge diagram. Axis labels: Context, Experience. Stage labels: Creation, Gathering, Presentation, Organization, Integration, Conversation. Courses shown: Data Science, Xinformatics, Semantic eScience, Web Science.]
Also at RPI
• Data Science Research Center and Data Science Education Center
• http://www.rpi.edu/about/inside/issue/v4n17/datacenter.html
• Over 35 research faculty, 5 post-docs, ? grad students
• Data is one of the Rensselaer Plan's five thrusts
• Other key faculty
  • Fran Berman (VPR)
  • Jim Myers (Director CCNI)
Curriculum
• Web Science and IT – undergrad, and MSc. and PhD. (with science concentrations)
• Environmental Science with Geoinformatics concentration
• Bio, geo, chem, astro, materials - informatics
• GIS for Science
• Master of Science – Data Science (pending)
• Multi-disciplinary science program (2012) PhD in Data and Web Science
E.g. IT with Env. Sci.
• ERTH-1200 Geology II (4 credits) - spring
• CHEM-2250 Organic Chemistry I (4 credits) - spring
• ERTH-2210 Field Methods (2 credits) - fall
• IENV-1920 Environmental Seminar (2 credits) - spring
• BIOL-2120 Intro. to Cell and Molecular Biology (4 credits) - spring
• IENV-4500 Global Environmental Change (4 credits) - fall
• ERTH-4180 Environmental Geology (4 credits) - spring
• ERTH-4963 Xinformatics (4 credits) - spring
• IENV-4700 One Mile of the Hudson River (4 credits) - fall
Geoinformatics concentration
• CSCI1000 - Computer Science I
• CSCI1200 - Data Structures
• CSCI2300 - Introduction to Algorithms or ERTH 4750 - Geographic Information Systems in the Sciences
• CSCI4380 - Databases
• CSCI4961 - Data Science
• CSCI4960 - Xinformatics
• ERTH 4980 - Senior Thesis
Web Science Learning Objectives
• Students will demonstrate knowledge and be able to explain the three different "named" generations of the web (a/k/a Web 1.0, Web 2.0, and Web 3.0) from mathematical, engineering, and social perspectives.
• Students will demonstrate the ability to use the dynamic programming language Python to develop programs relating to Web applications and the analysis of Web data.
• Students will be able to understand and analyze key Web applications including search engines and social networking sites.
• Students will be able to understand and explain the key aspects of Web architecture and why these are important to the continued functioning of the World Wide Web.
• Students will be able to analyze and explain how technical changes affect the social aspects of Web-based computing.
• Students will be able to develop "linked data" applications using Semantic Web technologies.
Data Science Objectives
• To instruct future scientists how to sustainably generate/collect and use data for their own research as well as for others: data science.
• To instruct future technologists how to understand and support essential data and information needs of a wide variety of producers and consumers
• For both to know the tools and requirements needed to properly handle data and information
• Students will learn and be evaluated on the full life-cycle of data and relevant methods, technologies and best practices.
Learning Objectives
• Develop and demonstrate skill in data collection and management
• Know how to develop and apply data models and metadata models
• Demonstrate knowledge of data standards
• Develop and demonstrate the application of skill in data science tool use and evaluation
• Demonstrate the application of data life-cycle principles and data stewardship
• Demonstrate proficiency in data and information product generation
Xinformatics Objectives
• To instruct future information architects how to sustainably generate information models, designs and architectures
• To instruct future technologists how to understand and support essential data and information needs of a wide variety of producers and consumers
• For both to know the tools and requirements needed to properly handle data and information
• Students will learn and be evaluated on the underpinnings of informatics, including theoretical methods, technologies and best practices.
Learning Objectives
• Through class lectures, practical sessions, written and oral presentation assignments and projects, students should:
  • Develop and demonstrate skill in development and management of multi-skilled teams in the application of informatics
  • Demonstrate ability to develop conceptual and logical information models and explain them to non-experts
  • Demonstrate knowledge and application of informatics standards
  • Demonstrate skill in informatics tool use and evaluation
Modern informatics enables a new scale-free framework approach
Semantic eScience Objectives
• Ontology Development, Merging and Validation
• Semantic Language and Tool Use and Evaluation
• Use Case Development and Elaboration
• Semantic eScience Implementation and Evaluation via Use Cases
• Semantic Application Development and Demonstration
• Group Project and Team Development, Use Case Implementation and Evaluation
Discussion…
• Science and interdisciplinary from the start!
• Not a question of: do we train scientists to be technical/data people, or do we train technical people to learn the science
• It's a skill/course level approach that is needed
• Education and research semi-coupled
• We must teach methodology and principles over technology *
• Data science must be a skill, as natural as using instruments or writing/using code
• Team/collaboration aspects are key **
• Foundations and theory must be taught ***
Progression after progression
Informatics Requirements
• Example:
  • CI = OPeNDAP server running over HTTP/HTTPS
  • Cyberinformatics = Data (product) and service ontologies, triple store
  • Core informatics = Reasoning engine (Pellet), OWL
  • Science (X) informatics = Use cases, science domain terms, concepts in an ontology
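
The layering in this example can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: it is not Pellet, OWL, or a real triple store, and every `ex:` name (including `ex:servedBy` and `ex:opendapServer`) is hypothetical. A set of triples stands in for the triple store, a few `ex:` classes stand in for the science domain terms, and one hand-written subclass rule stands in for the reasoning engine.

```python
# Illustrative sketch only: a toy triple store plus one hand-written
# subclass rule. It stands in for a real triple store and an OWL reasoner
# such as Pellet; every ex: name below is hypothetical.

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


class TripleStore:
    """Holds (subject, predicate, object) triples in a set."""

    def __init__(self):
        self.triples = set()

    def add(self, s, p, o):
        self.triples.add((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return matching triples, sorted; None acts as a wildcard."""
        return sorted(
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


def infer_subclass_closure(store):
    """Repeat until nothing new is derived:
    (a, rdf:type or rdfs:subClassOf, b) and (b, rdfs:subClassOf, c)
    together imply (a, same predicate, c)."""
    while True:
        derived = set()
        for (a, p, b) in store.triples:
            if p not in (RDF_TYPE, SUBCLASS_OF):
                continue
            for (_, _, c) in store.match(s=b, p=SUBCLASS_OF):
                derived.add((a, p, c))
        new = derived - store.triples
        if not new:
            return
        store.triples |= new


store = TripleStore()
# Science (X) informatics: domain terms arranged in a small class hierarchy.
store.add("ex:SeaIceExtent", SUBCLASS_OF, "ex:CryosphereProduct")
store.add("ex:CryosphereProduct", SUBCLASS_OF, "ex:DataProduct")
# Cyberinformatics: a data product instance and the service that delivers it.
store.add("ex:granule42", RDF_TYPE, "ex:SeaIceExtent")
store.add("ex:granule42", "ex:servedBy", "ex:opendapServer")

# Core informatics: derive the facts the asserted triples only imply.
infer_subclass_closure(store)

# A query phrased in general terms now finds the specific granule.
products = [s for (s, _, _) in store.match(p=RDF_TYPE, o="ex:DataProduct")]
print(products)
```

The point of the sketch is the division of labour: the query asks for any `ex:DataProduct`, and the inference step, not the query author, supplies the link from the specific sea-ice class to the general one.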