Priority Research Direction (I/O Models, Abstractions and Software)

Key challenges
• Programming and abstraction: how is I/O viewed from 1M+ processes?
  - The file I/O abstraction is neither adequate nor scalable
  - Make I/O independent of the number of processes, with predictable performance

Summary of research direction
• What will you do to address the challenges?
  - Develop newer I/O models and higher-level abstractions (dataset-based techniques that exploit specialized applications)
  - Purpose-driven and customizable I/O (e.g., checkpointing, analytics, external communication (workflow))
  - Incorporate I/O into programming models and languages
  - Utilize I/O delegation for offloading I/O within user space, caching, data reorganization
  - Integrate online analytics and data management

Potential impact on software component
• What capabilities will result?
  - Higher-level abstraction (e.g., datasets, specialized data management)
  - Purpose-driven I/O (e.g., checkpointing, analytics, external communication in a workflow)
  - Customizable I/O
  - I/O delegation and Active Storage with I/O and processing as a service

Potential impact on usability, capability, and breadth of community
• How will this impact the range of applications that may benefit from exascale systems?
  - More control and significantly reduced complexity in I/O (3-5 years)
  - Portability of applications with respect to I/O (3-5 years)
  - Predictable performance (5+ years)
  - Maximize use of data while available (3-5 years)
  - Real-time knowledge discovery and insights (10+ years)

4.x I/O Models, Abstractions and Software
• Technology drivers
  - File systems with traditional semantics are not scalable
  - Treating the I/O architecture as an independent and separate component does not scale
• Alternative R&D strategies
  - Extend current file systems
  - Develop newer layers on top of current file systems
  - Develop newer I/O models and higher-level abstractions (dataset-based techniques that exploit application domains)
  - Purpose-driven and customizable I/O (e.g., checkpointing, analytics, external communication (workflow))
  - Develop techniques to concurrently exploit the data and perform analytics when it is created; that is, embed online analytics
  - Incorporate I/O into programming models and languages
  - Use databases
  - I/O delegation and Active Storage with I/O and processing as a service
• Recommended research agenda
  - Develop newer I/O models and higher-level abstractions (dataset-based techniques that exploit specialized applications)
  - Purpose-driven and customizable I/O (e.g., checkpointing, analytics, external communication (workflow))
  - Incorporate I/O into programming models and languages
  - Active Storage with I/O and processing as a service
  - Utilize I/O delegation for offloading I/O within user space, caching, data reorganization, etc.
  - Develop techniques to concurrently exploit the data and perform analytics when it is created; that is, integrate data analytics, online analysis and data management
• Crosscutting considerations
  - Programming models and languages
  - Architectures

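As an illustration of the I/O delegation item in the agenda above, here is a minimal sketch in Python. It is not taken from the slides: the names `delegate_writer` and `run_delegated_write` are hypothetical, threads stand in for compute processes, and a single in-process queue stands in for whatever transport a real delegate would use. Compute workers hand (offset, data) requests to one delegate, which reorganizes them by offset and performs the file writes on their behalf.

```python
import os
import queue
import tempfile
import threading


def delegate_writer(path, requests, total_size):
    """Collect (offset, payload) requests, reorder them, and write in one pass."""
    pending = []
    while True:
        item = requests.get()
        if item is None:  # sentinel: every worker has finished submitting
            break
        pending.append(item)
    # Data reorganization step: sort so the file is written in offset order.
    pending.sort(key=lambda request: request[0])
    with open(path, "wb") as f:
        f.truncate(total_size)
        for offset, payload in pending:
            f.seek(offset)
            f.write(payload)


def run_delegated_write(num_workers=4, block_size=8):
    """Workers never touch the file; they only enqueue requests for the delegate."""
    requests = queue.Queue()
    fd, path = tempfile.mkstemp()
    os.close(fd)
    delegate = threading.Thread(
        target=delegate_writer,
        args=(path, requests, num_workers * block_size),
    )
    delegate.start()

    def worker(rank):
        # Each worker owns one block, filled with a letter derived from its rank.
        payload = bytes([ord("A") + rank]) * block_size
        requests.put((rank * block_size, payload))

    # Start workers in reverse rank order so requests arrive unsorted.
    workers = [threading.Thread(target=worker, args=(r,))
               for r in reversed(range(num_workers))]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    requests.put(None)
    delegate.join()

    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data
```

The point of the sketch is that the number of file writers (one delegate) is decoupled from the number of producers, which is one way to read "make I/O independent of the number of processes".
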
Priority Research Direction (Newer Storage Devices (SCM/SSD) and I/O Hierarchies)

Key challenges
• Brief overview of the barriers and gaps
  - Performance, energy footprint and scalability of current storage devices are limiting
  - Incorporation of newer storage devices such as SCM, SSD
  - Optimizations for managing newer hierarchies

Summary of research direction
• What will you do to address the challenges?
  - Develop balanced architectures with newer devices embedded within the system
  - Develop new I/O models, software, runtime systems and libraries to exploit these hierarchies
  - Develop new file systems or special-purpose data management layers
  - Intelligent and proactive caching mechanisms

Potential impact on software component
• What capabilities will result?
  - Orders of magnitude faster I/O and performance
  - Significant potential for power optimizations in the I/O subsystem
• What new methods and components will be developed?
  - Software layers for managing newer devices and memory hierarchy

Potential impact on usability, capability, and breadth of community
• How will this impact the range of applications that may benefit from exascale systems?*
  - Much faster I/O and highly optimized sustained performance (3 years)
  - Significant reduction in the cost of checkpointing (3 years)
  - Real-time knowledge discovery and insights (6+ years)
  - Much simpler data management (5 years)
* This timeline is relative to the time these devices are incorporated into the architectures

4.x Newer Devices and Hierarchies
• Technology drivers
  - Disk-based storage systems are not scalable
  - Newer storage devices such as SCM and SSD have the potential to significantly improve performance and reduce power consumption by orders of magnitude
• Alternative R&D strategies
  - Build balanced architectures with newer devices embedded within the system
  - Develop new I/O models, software, runtime systems and libraries to exploit these hierarchies
  - Develop new file systems or special-purpose data management layers
  - O/S manages the new memory hierarchy (for I/O purposes)
  - Intelligent and proactive caching mechanisms
• Recommended research agenda
  - Develop balanced architectures with newer devices embedded within the system
  - Develop new I/O models, software, runtime systems and libraries to exploit these hierarchies
  - Develop new file systems or special-purpose data management layers
  - Intelligent and proactive caching mechanisms
• Crosscutting considerations
  - Power optimizations
  - Potential to significantly enhance resiliency
  - Architectures
  - Operating system

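One way to picture the "intelligent and proactive caching" item above is a small fast tier in front of a slow tier, with sequential read-ahead. The Python sketch below is an assumption-laden illustration, not something described in the slides: `TieredStore`, its parameters, and the simple next-block prefetch policy are all hypothetical, and plain dictionaries stand in for an SCM/SSD tier and a disk tier.

```python
from collections import OrderedDict


class TieredStore:
    """Fast tier (LRU cache) in front of a slow tier, with sequential prefetch."""

    def __init__(self, backing, fast_capacity, prefetch=1):
        self.backing = backing        # slow tier: block_id -> bytes
        self.fast = OrderedDict()     # fast tier, ordered from least to most recent
        self.capacity = fast_capacity
        self.prefetch = prefetch      # how many following blocks to read ahead
        self.slow_reads = 0           # accesses that had to go to the slow tier

    def _load(self, block_id):
        # Fetch one block from the slow tier into the fast tier.
        self.slow_reads += 1
        self.fast[block_id] = self.backing[block_id]
        self.fast.move_to_end(block_id)
        while len(self.fast) > self.capacity:
            self.fast.popitem(last=False)  # evict the least recently used block

    def read(self, block_id):
        """Return (data, hit), where hit says whether the fast tier served it."""
        hit = block_id in self.fast
        if hit:
            self.fast.move_to_end(block_id)
        else:
            self._load(block_id)
        data = self.fast[block_id]
        # Proactive step: pull the next blocks in before they are requested.
        for nxt in range(block_id + 1, block_id + 1 + self.prefetch):
            if nxt in self.backing and nxt not in self.fast:
                self._load(nxt)
        return data, hit
```

For a sequential scan, only the first read misses in this sketch; every later block has already been staged by the read-ahead. A real runtime would need a policy that adapts to non-sequential access, which is where the "intelligent" part of the agenda item comes in.
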
4.x <External Communication>
• Technology drivers
  - Data movement from/to systems is sequential (single node based) even with multiple streams
  - Protocol conversion
• Alternative R&D strategies
  - Develop parallel data movement software and tools
  - Special purpose network protocols for parallelism
  - Scalable scheduling
  - Integration of external networks with local file systems
• Recommended research agenda
  - Develop parallel data movement software and tools
  - Special purpose network protocols for parallelism
  - Scalable scheduling
  - Integration of external networks with local file systems
• Crosscutting considerations
  - Scheduler

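The "parallel data movement" item above can be sketched, under heavy simplification, as splitting a file into byte ranges and moving each range on its own stream. In the Python sketch below, `copy_range`, `parallel_copy` and `roundtrip_demo` are hypothetical names, local files stand in for the two endpoints, and threads stand in for independent transfer streams; none of this comes from the slides.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def copy_range(src, dst, offset, length):
    """Move one byte range; each stream opens its own handles."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))
    return length


def parallel_copy(src, dst, streams=4):
    """Copy src to dst using up to `streams` concurrent byte-range transfers."""
    size = os.path.getsize(src)
    with open(dst, "wb") as f:
        f.truncate(size)                   # pre-size the destination file
    chunk = max(1, -(-size // streams))    # ceiling division
    ranges = [(off, min(chunk, size - off)) for off in range(0, size, chunk)]
    with ThreadPoolExecutor(max_workers=streams) as pool:
        moved = sum(pool.map(lambda r: copy_range(src, dst, *r), ranges))
    return moved


def roundtrip_demo(payload, streams=4):
    """Write payload to a temporary file, copy it in parallel, return the copy."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as f:
            f.write(payload)
        parallel_copy(src, dst, streams)
        with open(dst, "rb") as f:
            return f.read()
```

Because every range is written at its own offset, the streams do not need to coordinate with each other. In a multi-node setting the same partitioning could be spread across nodes, which is the scaling problem the technology drivers point at.
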
I/O, Storage and Data Management < I/O Models and Abstractions >
[Roadmap figure; time axis 2010 to 2019. Milestones shown:]
• Integrated with newer Programming Models and Languages
• SDM for Peta/Exa-bytes
• Real-time Knowledge Discovery and Insights
• Accelerated Scientific insights from Petabytes of Data
• Purpose driven I/O and Active Storage, Integration of Analytics and I/O
• Power optimized, Customizable I/O
• I/O Runtime systems for SCM/SSD devices, Newer I/O abstractions
• I/O delegation