Storage Systems CSE 598D, Spring 2007

pgoodwin


  1. Storage Systems CSE 598D, Spring 2007 • Lecture 1: Introduction and Overview • January 25, 2007

  2. How this course will work • Class meetings twice a week • Tue, Thu: 5.30 - 6.45 pm, 223B • Lectures by me in most classes • Some student presentations • To be determined as the course progresses • Everyone should participate in discussions • Part of your grade for participation! • Scribe notes to record lectures and discussions • 2-3 assignments • May involve some simple system building: details to be decided • Some written homeworks • Online resources • Class URL via my Web page • Slides/scribe notes/assignments on Angel • Please make sure your Angel email works

  3. How this course will work • Background expected • Operating systems (411-level) • Basic knowledge of file systems, I/O subsystem, DMA, device drivers, … • Distributed systems • Consistency semantics, replication, caching, synchronization, … • Algorithms and data structures (undergraduate-level) • Analysis of algorithms, basic data structures • Will cover background material whenever needed • Your feedback important in deciding what to cover

  4. How this course will work • No text-book • I will use some chapters from a set of books • If needed, photocopies of these will be made available to you • Syllabus consists of material presented in class • Most of it based on research papers made available on the course page • Not up yet but will be soon • Additional reading material: for background or to delve deeper • What you need to do • Read assigned papers BEFORE each class • During the class • Ask questions, express your opinions, argue! • Goal: Learn about storage systems • Also learn • How to read a research paper? • How to write a good systems paper? • What separates good (systems) research from bad?

  5. How this course will work • Grading • Scribe notes: 10% • Detailed notes that one can go back to and find everything that was presented and discussed in the class • And that you can use for revision before the exam! • Participation in class: 10% • Mid-term exam: 20% • Presentation: 10% • Assignments (2-3): 20% • Survey or Project: 30%

  6. How this course will work • Survey • A 10-15 page comprehensive exploration/synthesis of an area related to storage systems at the end of the semester • Project • Groups of up to 2 students • Identify a problem and motivate the need to solve it • Show convincingly where existing research falls short • Develop and evaluate your solution • Present it in a paper-style write-up at the end of the semester

  7. Today • Some background/history on storage systems • Overview of course content • A superset of topics we will study

  8. Introduction

  9. Why Applications Need Storage • Memory is • Volatile: Durability is needed • Not enough: High Capacity is needed • Not easy to share/move: Portability is needed • Expensive • Non-volatile, cheap, long-lasting, reliable, abundant storage is needed for numerous applications • Personal/individual applications • Scientific applications • Enterprise applications • Internet scale applications • Emerging sensor networks, highly distributed systems such as some P2P systems

  10. Personal Applications • Email, Contacts, Schedules, … • Financial data, personal files, … • Media files • Gaming

  11. Scientific Applications • Manipulate large data sets: either explicitly (files) or implicitly (VM) • Sanger Institute: sequencing facility to add 100 TB each yr. • CERN Particle Collider • NASA EOSDIS

  12. Enterprise Applications • File and Email servers • OLTP • OLAP • Other Database applications • SAP • Financial workloads • …

  13. Internet Scale Applications • Data Grids

  14. Sensor Networks

  15. IBM 305 RAMAC - 1956 • Random Access Method of Accounting and Control • 5 MB capacity, 50 disks each 24” diameter, 2000 bits/sq-inch density • First computer with magnetic hard disk • Replaced the “magnetic drum” • Could store roughly 2000 pages of text!
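A rough sanity check on the RAMAC figures, as a minimal Python sketch. The constants come from the slide. Three assumptions are not in the slide: decimal megabytes, two recording surfaces per disk, and the unrealistic case where the whole platter face is recordable, which makes the second number an upper bound only.

```python
import math

# Figures quoted on the slide.
CAPACITY_BYTES = 5 * 10**6          # 5 MB (decimal megabytes assumed)
PAGES_OF_TEXT = 2000
DISKS = 50
DIAMETER_INCHES = 24
DENSITY_BITS_PER_SQ_INCH = 2000

# Assumption, not stated on the slide: both faces of each disk hold data.
SURFACES_PER_DISK = 2


def bytes_per_page(capacity_bytes, pages):
    """Page size implied by 'roughly 2000 pages of text' in 5 MB."""
    return capacity_bytes / pages


def surface_upper_bound_bytes(disks, diameter_inches, density, surfaces_per_disk):
    """Capacity if every square inch of every surface were recordable.

    Real drives leave the hub area and track gaps unused, so this is
    only an upper bound on what the quoted density allows.
    """
    area_per_surface = math.pi * (diameter_inches / 2) ** 2
    total_bits = disks * surfaces_per_disk * area_per_surface * density
    return total_bits / 8


page_size = bytes_per_page(CAPACITY_BYTES, PAGES_OF_TEXT)
bound = surface_upper_bound_bytes(
    DISKS, DIAMETER_INCHES, DENSITY_BITS_PER_SQ_INCH, SURFACES_PER_DISK
)

print(f"Implied page size: {page_size:.0f} bytes")           # 2500 bytes
print(f"Upper bound from density: {bound / 1e6:.1f} MB")     # about 11.3 MB
```

Under these assumptions the quoted 5 MB sits below the density-derived ceiling, so the slide's capacity, disk count, and density figures are mutually consistent.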

  16. Seagate Savvio 10K.1 - 2004 • 10K RPM, 73.4 GBytes • Can read and write the complete works of Shakespeare 15 times each second!
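The "15 times each second" claim can be turned into a transfer rate with a little arithmetic. The sketch below is illustrative only: the size of the complete works (about 5 MB of plain text) is an assumption, not a figure from the slides, and decimal units are assumed throughout. The 5 MB RAMAC capacity is taken from the 1956 slide for comparison.

```python
# Assumption (not from the slides): the complete works of Shakespeare
# occupy roughly 5 MB as plain text.
SHAKESPEARE_BYTES = 5 * 10**6

# Figures quoted on the slides (decimal units assumed).
TRANSFERS_PER_SECOND = 15
SAVVIO_BYTES = 73_400_000_000       # 73.4 GBytes
RAMAC_BYTES = 5 * 10**6             # 5 MB, IBM 305 RAMAC


def implied_throughput(works_bytes, transfers_per_second):
    """Sustained transfer rate implied by the Shakespeare claim, in bytes/s."""
    return works_bytes * transfers_per_second


def full_scan_seconds(capacity_bytes, throughput_bytes_per_s):
    """Time to stream the entire disk once at that rate."""
    return capacity_bytes / throughput_bytes_per_s


throughput = implied_throughput(SHAKESPEARE_BYTES, TRANSFERS_PER_SECOND)
scan_time = full_scan_seconds(SAVVIO_BYTES, throughput)
capacity_ratio = SAVVIO_BYTES / RAMAC_BYTES

print(f"Implied throughput: {throughput / 1e6:.0f} MB/s")        # 75 MB/s
print(f"Full-disk scan: {scan_time / 60:.1f} minutes")
print(f"Capacity vs. RAMAC: {capacity_ratio:,.0f}x")              # 14,680x
```

Under the 5 MB assumption, the claim corresponds to roughly 75 MB/s, and capacity has grown by about four orders of magnitude since the RAMAC.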

  17. Seagate Savvio 15K - 2007 • 15K RPM, 73.4 GBytes • World’s fastest disk?

  18. Storage Devices/Hardware • Storage Area Networks • RAID Arrays • Tape Archives

  19. Overview of Course Content

  20. Overview of Course • What goes on inside a disk? • Hardware • Modeling the disk • Performance optimizations • Disk scheduling • Rearranging data blocks • How do you improve bandwidth to/from disks? • RAID arrays • Reduce data transferred from disks (Active Disks) • Storage Area Networks to allow concurrent transfers to/from several hosts • Shared Storage Model
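To give a flavor of what "disk scheduling" refers to in the list above, here is a minimal Python sketch comparing two textbook policies by total head movement. It is an illustration, not course material: the policies chosen (first-come-first-served and shortest-seek-time-first), the function names, and the request queue are all assumptions for the example, and seek cost is modeled simply as the number of tracks crossed.

```python
def fcfs_seek_distance(start, requests):
    """Total tracks crossed when requests are served in arrival order."""
    total, head = 0, start
    for track in requests:
        total += abs(track - head)
        head = track
    return total


def sstf_seek_distance(start, requests):
    """Total tracks crossed when the nearest pending request goes next."""
    pending = list(requests)
    total, head = 0, start
    while pending:
        # Pick the pending track closest to the current head position.
        nearest = min(pending, key=lambda track: abs(track - head))
        total += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
    return total


# Hypothetical request queue (track numbers) and starting head position.
queue = [98, 183, 37, 122, 14, 124, 65, 67]
head_start = 53

print(fcfs_seek_distance(head_start, queue))  # 640
print(sstf_seek_distance(head_start, queue))  # 236
```

Reordering requests cuts head movement substantially in this example, which is the kind of performance optimization the topic covers. The greedy policy can starve distant requests, one reason real schedulers use more careful variants.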

  21. How can software take advantage of these enhancements? • Review of the OS I/O subsystem • How sys-admins manage storage • File Systems for NAS/SAN • Caching and Pre-fetching • Theory of storage • Which problems are hard? • Important data structures • With shared storage, and a very complicated storage system, how do we manage this hierarchy? • Storage Provisioning • QoS Control/Virtualization • Security • Case-studies of enterprise storage systems (e.g., EMC, Veritas)

  22. Requirements are becoming more stringent - we need to guarantee availability, and store data for a long time (archival storage). How do we achieve this? • Dependability/Availability issues • Disaster management • Data lifetime • Power and thermal management of storage systems • Storage in highly distributed systems • Storage in P2P systems • Sensor storage • Grid-like infrastructure based storage: e.g., OceanStore • Storage in search, information retrieval • Google File System • Are disks going to be the norm in the future? • Future of magnetic storage • MEMS • Flash storage • Windows Vista for laptops

  23. Part of the material will be from these books • “Storage Networks Explained” (Wiley), Troppens, Erkens, and Muller • “The Holy Grail of Storage Management” by Toigo • “Storage Area Network Essentials” (Wiley) by Barker and Massiglia

  24. Next time • Hard disk • Certain aspects of I/O subsystem • Spanning hardware and OS

  25. I/O System View (diagram) • CPU: iL1, dL1, L2 • Memory Bus (e.g. PC133) • Main Memory • Software Stack: Appln., File System, Buffer Manager, Device Driver (e.g. SCSI) • I/O Bus (e.g. PCI) • Disk Ctrller • Disk: Controller (ASIC), Device Firmware, Cache, DMA engine, Platters, Actuator, Motors, Electronics