Evacuating the Comfort Zone: (Via Curriculum Reform…)
Comfy Topics
• Logical data models and languages
• Query optimization/execution
• Consistency models and mechanisms
• Storage architectures
• Enterprise IT applications
Moving Outside the Zone
• Rethinking system architectures
  • Deep memory hierarchies, componentization, adaptive algorithms, extreme scales (nano to global)
• Embracing probabilistic reasoning
  • In data analysis, adaptive algorithms (again!), online user interactions, data modeling and integration, lossy compression
Course 1: Data Systems
• Not so radical: infect the OS course with the RSS
  • Traditional OS material (scheduling, protection, resource management)
  • File & Record storage
  • Transactions, Concurrency, Recovery
  • Storage Hierarchies
  • Dataflow architectures: query plans, NW support
• Big pedagogical benefit to merging this material
  • Two design targets (OS/FS vs. DBMS)
  • Leads to instructive architectural tradeoffs
  • Illustrates 2 design philosophies (bottom-up vs. top-down)
Course 2: Modeling & Analysis
Relational + IR + Statistics + Information Theory
• Review of basic math
  • 1st-order logic
  • Central Limit Theorem, Chernoff/Hoeffding bounds
  • Simple information theory: entropy, error metrics
• Data Modeling
  • Logical: Relational normalization, ontologies, IR bag-of-words
  • Probabilistic: simple graphical models (Bayes nets), IR vector space
• Data Analysis
  • Relational-style query optimization/execution, OLAP
  • Sampling and summarization
  • Boolean IR, ranked retrieval, link analysis, info extraction
  • Predictive analyses: classification, clustering
• Data Visualization
• Pragmatics, Exercises:
  • Decision-support systems & tasks: queries, mining tools, etc.
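As a rough illustration of how the "basic math" items above might feed the "Sampling and summarization" topic, here is a minimal Python sketch. It is not taken from the slides: the function names `shannon_entropy` and `hoeffding_sample_size` are hypothetical, and the second function assumes the standard two-sided Hoeffding bound for values in [0, 1], P(|mean - mu| >= eps) <= 2·exp(-2·n·eps²).

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def hoeffding_sample_size(eps, delta):
    """Smallest n with 2 * exp(-2 * n * eps**2) <= delta.

    Assumes i.i.d. samples of a quantity bounded in [0, 1], e.g. an
    indicator of whether a row satisfies a predicate.
    """
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))


if __name__ == "__main__":
    # Entropy of a (hypothetical) two-valued column with a 50/50 split.
    print(shannon_entropy(["a", "b", "a", "b"]))
    # Rows to sample to estimate a fraction within +/-0.05, 95% of the time.
    print(hoeffding_sample_size(0.05, 0.05))
```

Under these assumptions, estimating a fraction to within 0.05 with 95% confidence needs 738 sampled rows regardless of table size, which is the kind of result that ties the probability review to approximate query answering.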
Assertions
• Course 1 is a Good Idea
  • Taught for 4 years at the grad level at Berkeley; it works great
• Course 2 is The Future
  • It’s in demand
  • Think of what Business, BioEng, etc. really want!
  • Do our systems students even know how to manage and exploit experimental data?
• A curriculum, or a research agenda?
  • KDD is a piece of this
  • But fragmentary
  • Opportunity here!
• The DB textbook market is saturated :-)