Columbia’s Institute for Data Sciences and Engineering: An Applied Sciences Innovation Hub
A Broad Institute • Nine Schools • SEAS (School of Engineering and Applied Science) • Arts and Sciences • Journalism • Business • Architecture, Planning and Preservation • International and Public Affairs • Medical School • Public Health • Law • Led by the School of Engineering and Applied Science • Current profile, approximately: • 180 faculty • 690 PhD students • 1,650 MS students • 1,500 undergraduate students
Institute Plans • 48 founding Institute faculty • ~130 affiliated faculty members, University-wide • Initial plans to hire 30 new faculty and recruit 150 doctoral students; 45 additional faculty will be hired, at 5 a year, over the next 15 years • 44,000 sq. feet of new academic space will be ready by 2016
Degree Programs • Certificate in Data Science • Fall 2013 • Four courses • MS in Data Science • Fall 2014 • Core in fundamentals of data science • Tracks in application areas corresponding to centers
The Centers of the Columbia Institute for Data Sciences and Engineering • New Media • Smart Cities • Health Analytics • Cybersecurity • Financial Analytics • Foundations of Data Science
Center for Smart Cities • Chair: Peter Kinget, EE • SEAS – APAM, CCLS, CS, ChemE, EE, EEE, MechE; Partners – Graduate School of Arts and Sciences; Business School; Journalism School • 7 committee members, 23 affiliates
Research in Smart Cities • Integrating the digital city with the physical city • Monitoring building energy consumption in New York • Improving the power supply through smart grid technology • Deploying sensing devices to facilitate everyday activities in a crowded urban environment
Infrastructure Monitoring • Monitoring large suspension bridge vibrations (figure label: Fixed Reference)
Urban Visualization • Visualizing and interacting in 3D with georeferenced urban data
Center for New Media • Chair: Mark Hansen, Journalism • Co-chair: Owen Rambow, CCLS • SEAS – CCLS, CS, EE, IEOR; Partners – Tow Center for Digital Journalism; Business School; Graduate School of Architecture, Planning, and Preservation • 10 committee members, 19 affiliates
Research in New Media • Creating new forms of digital media • Analyzing and creating social media • Creating visualizations • Acquiring information • From language – speech analysis, machine translation, identifying emotions • From images and video – extracting information from images, e.g., Leafsnap • Harnessing information • Applications for journalism, business, and social media • Presenting data so it is usable
Center for Health Analytics • Chair: Noemie Elhadad, Biomedical Informatics • SEAS – APAM, BME, ChemE, CS, IEOR, MechE; Partners – Medical School, Grossman Center, Public Health, Biology, and Statistics • 10 committee members, 15 affiliates
Research in Health Analytics • Analyzing big data: • Patient data • Genomic databases • Public health records • Using electronic health records • To discover patterns of diseases, effective drugs, treatments, and therapies • Sequencing genomics • Showing associations between single mutations and genetically associated diseases • Functional genomics • Finding how cancer genes disrupt healthy function • Systems biology • DNA sequencing on a chip
Health Analytics Center • Individual, Population • Clinical, Healthcare • Molecular, Cellular
EHR and time series analysis – Glucose predictability (Albers et al., 2009)
Center for Financial Analytics • Chair: David Yao, Industrial Engineering and Operations Research • SEAS – ChemE, CS, IEOR; Partners – Business School, Economics, International and Public Affairs, Statistics • 6 committee members, 24 affiliates
Research in Financial Analytics • Big data for better financial services and solutions • Use predictive analytics to optimize financial decisions • Understand and regulate high-frequency trading • Predict and manage systemic risks • Real-time analysis of unstructured data/information, e.g., corporate and government actions, commentary, social media • Help banks identify and contain fraud
Systemic Risk • Complex • Very high dimensional • “Edges”: lending, assets, derivatives • “Social” network of financial institutions • Minoiu and Reyes (2011 IMF report)
The Contagion Effect • A vicious cycle during crisis time, leading to contagion • Approach: stochastic network using publicly available data
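The vicious cycle on this slide can be illustrated with a toy default cascade. The sketch below is a deterministic threshold model, not the stochastic network approach the slide refers to, and every name and number in it (`cascade`, the banks `A` to `D`, their capital and exposures) is hypothetical.

```python
def cascade(capital, exposures, initial_defaults):
    """Propagate defaults through a lending network.

    capital[bank]              -- loss a bank can absorb before failing (assumed)
    exposures[lender][borrower] -- amount the lender loses if the borrower fails
    """
    defaulted = set(initial_defaults)
    frontier = list(initial_defaults)
    losses = {bank: 0.0 for bank in capital}
    while frontier:
        failed = frontier.pop()
        for lender, book in exposures.items():
            if lender in defaulted or failed not in book:
                continue
            # The lender writes off its exposure to the failed bank.
            losses[lender] += book[failed]
            if losses[lender] >= capital[lender]:
                defaulted.add(lender)
                frontier.append(lender)
    return defaulted


# Hypothetical four-bank example: A fails first.
capital = {"A": 10, "B": 5, "C": 8, "D": 20}
exposures = {"B": {"A": 6}, "C": {"B": 5, "A": 4}, "D": {"C": 3}}
result = cascade(capital, exposures, ["A"])
print(sorted(result))
```

In this example the failure of A takes down B directly, and C fails only after absorbing losses from both A and B, which is the contagion pattern; D survives because its exposure is small relative to its capital.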
Foundations of Data Science • Chair: Tony Jebara, CS • Co-chair: Tian Zheng, Statistics • SEAS – APAM, CCLS, CS, IEOR, EE, EEE, MechE; Partners – Biomedical Informatics, Biology, Business, CIESIN, Lenfest Center, Statistics • 6 committee members, 42 affiliates
Foundations of Data Science • Machine learning • Computational learning theory • Statistical prediction • Algorithms and optimization • Software and hardware infrastructure for computation with big data
Graph & Network Algorithms • Matching nodes into a network • New students show up to school • Have a matrix of their profile vectors • At graduation, observe formed network • Predict network for next year’s freshmen?
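The prediction question above can be made concrete with a very simple baseline. This sketch is an assumption for illustration, not the matching algorithm the slide has in mind: it scores each pair of students by the cosine similarity of their profile vectors, picks the cutoff that best reproduces the network observed at graduation, and applies that cutoff to the incoming class. The function names and the toy profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two profile vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fit_threshold(profiles, edges):
    """Pick the similarity cutoff that best reproduces the observed edges."""
    edge_set = {frozenset(e) for e in edges}
    pairs = list(combinations(range(len(profiles)), 2))
    scores = {p: cosine(profiles[p[0]], profiles[p[1]]) for p in pairs}
    best_t, best_correct = 0.0, -1
    for t in sorted(set(scores.values())):
        # Count pairs whose predicted link status matches the observed one.
        correct = sum((scores[p] >= t) == (frozenset(p) in edge_set) for p in pairs)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def predict_edges(profiles, threshold):
    """Link every pair whose similarity reaches the fitted cutoff."""
    return [(i, j) for i, j in combinations(range(len(profiles)), 2)
            if cosine(profiles[i], profiles[j]) >= threshold]


# Hypothetical graduating class: profile matrix and the network that formed.
graduates = [[1, 0], [1, 0.1], [0, 1], [0.1, 1]]
observed = [(0, 1), (2, 3)]
threshold = fit_threshold(graduates, observed)

# Hypothetical incoming class: predict which pairs will connect.
freshmen = [[1, 0.05], [0.9, 0], [0, 1]]
predicted = predict_edges(freshmen, threshold)
print(predicted)
```

A single global cutoff ignores structural constraints such as how many connections each student can sustain, which is one reason matching-based formulations are of interest for this problem.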
Center for Cybersecurity • SEAS – APAM, CCLS, ChemE, CS, EE, EEE, IEOR, MechE; Partners – Biology, Business School, International and Public Affairs, Law
Research in Cybersecurity • Essential to critical infrastructure • Government, financial transactions, electronic commerce, and personal computing • Security and survivability of large-scale, heterogeneous cybersystems • Systems security engineering • Threat mitigation • Threat detection and analysis • Cyberattack reaction and recovery • Cyberattack tolerance • Large-scale distributed (re)action
Breaking commodity devices to learn how to fix them using Symbiotes • Cisco IP phone vulnerability • HP printer firmware update vulnerability