
Introduction to HDF5, Session Two: Data Model Comparison, HDF5 File Format

Learn about HDF5 and its capabilities, how it can address data management challenges, and how it compares with filesystems, XML, and relational databases.

brownjoel

Presentation Transcript


  1. Introduction to HDF5, Session Two: Data Model Comparison, HDF5 File Format

  2. Our Purpose Today 1) Familiarize you with HDF5 and its capabilities. 2) Help you understand how HDF5 might be applied to your data management challenges.

  3. HDF5 Data Model • HDF5 Objects: File, Link, Dataset, Group, Datatype, Dataspace, Attribute
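To make the objects named on this slide concrete, here is a minimal sketch using the h5py Python bindings. The choice of h5py is an assumption, since the slides do not name a language interface, and the group, dataset, and attribute names are hypothetical.

```python
import numpy as np
import h5py

# In-memory HDF5 file: the "core" driver with backing_store=False writes nothing to disk.
with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
    # Group: a container for other objects, similar to a directory.
    building = f.create_group("building")

    # Dataset: a multi-dimensional array; here 3 sensors x 4 readings (made-up values).
    temps = building.create_dataset(
        "temperature",
        data=np.array([[20.5, 21.0, 21.4, 20.9],
                       [19.8, 20.1, 20.6, 20.2],
                       [22.3, 22.8, 23.1, 22.7]], dtype="float64"),
    )

    # Attribute: small named metadata attached to an object.
    temps.attrs["units"] = "degC"

    # Link: a second name that resolves to the same dataset.
    f["latest"] = h5py.SoftLink("/building/temperature")

    dtype_name = str(temps.dtype)   # Datatype of each element
    shape = temps.shape             # Dataspace: the extent of the array
    units = temps.attrs["units"]
    via_link = f["latest"][0, 1]    # read one element through the link
```

The file itself is the seventh object in the list: it is the container that holds the root group and everything linked from it.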

  4. Developing a Project Data Model [Diagram labels: HDF5 Data Model; Relational; A Relational Database; HDF5 File]

  5. Logical Data Models X X

  6. HDF5 / Directories and Files • Both support hierarchies for organizing information (and to some degree, directed graphs)

  7. HDF5 / XML • Both support rich metadata and allow new types to be defined • HDF5 objects designed for numeric data; XML objects designed for text

  8. HDF5 / Relational Databases • HDF5 supports multi-dimensional arrays with common datatypes in the cells; locate by offset • RDBs support rows with different datatypes in fields; locate by primary key
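The contrast between "locate by offset" and "locate by primary key" can be sketched with the Python standard library alone. This is an illustration of the two access styles, not of HDF5 itself; the table name, column names, and values are hypothetical.

```python
import sqlite3
from array import array

# Array-style storage: a 2 x 3 grid of same-typed values in one flat buffer.
n_rows, n_cols = 2, 3
grid = array("d", [1.0, 2.0, 3.0,
                   4.0, 5.0, 6.0])

def cell(row, col):
    # Row-major offset: the position is computed, not searched for.
    return grid[row * n_cols + col]

# Relational storage: each row mixes datatypes and is found by its primary key.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sensor (id INTEGER PRIMARY KEY, room TEXT, temp REAL)")
con.executemany("INSERT INTO sensor VALUES (?, ?, ?)",
                [(101, "lobby", 20.5), (102, "lab", 22.3)])
room, temp = con.execute(
    "SELECT room, temp FROM sensor WHERE id = ?", (102,)
).fetchone()
con.close()
```

Here `cell(1, 2)` reaches the last element by arithmetic on its indices, while the query reaches the "lab" row through the key 102.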

  9. HDF5 Technology Platform • HDF5 data model: the “building blocks” for data organization and specification • HDF5 software: library, language interfaces, tools • HDF5 file format: bit-level organization of an HDF5 file • Let’s look at… Recall…

  10. HDF5 File Format • Defined by the HDF5 File Format Specification • Specifies the bit-level organization of an HDF5 file on storage media • Maps the data model objects to a linear address space • Other representations of the data model objects are also possible, but those are not the HDF5 format • Self-describing • All the information necessary to read and reconstruct the data model objects is specified by the format • Designed to work well with other technologies • Designed for speed and storage efficiency • Binary format
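One small, visible consequence of a bit-level specification is that a file can be recognized from its first bytes. The sketch below assumes, based on my reading of the HDF5 File Format Specification, that the superblock begins with the 8-byte signature `\x89HDF\r\n\x1a\n` located at byte offset 0 or, when a user block precedes it, at 512, 1024, 2048, and so on. The function name and the sample byte streams are hypothetical.

```python
import io

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

def find_superblock(stream, max_offset=1 << 20):
    """Return the byte offset of the HDF5 signature, or None if not found."""
    offset = 0
    while offset <= max_offset:
        stream.seek(offset)
        if stream.read(8) == HDF5_SIGNATURE:
            return offset
        # Candidate offsets are 0, 512, 1024, 2048, ... (doubling each time).
        offset = 512 if offset == 0 else offset * 2
    return None

# Stand-ins for real files: only the leading bytes matter for this check.
plain = io.BytesIO(HDF5_SIGNATURE + b"\x00" * 64)
with_user_block = io.BytesIO(b"#" * 512 + HDF5_SIGNATURE + b"\x00" * 64)
not_hdf5 = io.BytesIO(b"just some text")
```

A real reader would go on to parse the superblock that follows the signature; this check only answers whether the bytes claim to be HDF5.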

  11. HDF5 File Format Specification Introduction You can have the power of the format without worrying about the details of the specification.

  12. Developing a Project Data Model [Diagram labels: HDF5 Data Model; Relational; A Relational Database; HDF5 File]

  13. Physical Instantiations Format

  14. HDF5 / Filesystem • Both allow traversal of objects in the hierarchy • Both include internal metadata for fast access to subsets of the data • Both can handle a variety of data • An HDF5 file can be easily migrated or shared

  15. HDF5 / “Binary Flat File” • “Binary Flat File” = A sequence of bytes representing (primarily) numeric data. Often written by scientific and engineering applications to save results from simulations or experiments. • A binary flat file usually represents the fastest way to write numeric data. Read performance varies depending on access patterns. • Unlike HDF5, binary flat files are not self-describing or portable across architectures.
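The portability point can be shown with a short sketch using Python's `struct` module. The values are made up; the example only demonstrates that raw bytes carry no record of their own type or byte order.

```python
import struct

values = [20.5, 21.0, 21.4]

# A flat file holds only the raw bytes: no type, shape, or byte-order information.
little_endian_bytes = struct.pack("<3d", *values)

# A reader that already knows the layout recovers the values.
correct = struct.unpack("<3d", little_endian_bytes)

# A reader that assumes the other byte order gets different numbers, and no error.
wrong = struct.unpack(">3d", little_endian_bytes)
```

Nothing in the 24 bytes tells the second reader that its assumption was wrong, which is the gap a self-describing format closes by recording the datatype alongside the data.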

  16. HDF5/XML • Both HDF5 and XML are self-describing and portable • XML is text-based and requires contents to be accessed sequentially • HDF5 is binary and supports random access and subsetting
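The difference between sequential and random access can be sketched with the standard library. Note the assumption: the binary half uses a plain packed buffer as a stand-in, not an actual HDF5 file, and the element name `t` and the readings are hypothetical.

```python
import io
import struct
import xml.etree.ElementTree as ET

readings = [20.5, 21.0, 21.4, 20.9]

# Text-based: the parser walks the document from the start to reach the third value.
xml_doc = "<readings>" + "".join(f"<t>{v}</t>" for v in readings) + "</readings>"
elements_seen = 0
third_from_xml = None
for _, elem in ET.iterparse(io.BytesIO(xml_doc.encode("utf-8"))):
    if elem.tag == "t":
        elements_seen += 1
        if elements_seen == 3:
            third_from_xml = float(elem.text)
            break

# Binary with fixed-size cells: seek straight to element index 2.
binary = io.BytesIO(struct.pack("<4d", *readings))
binary.seek(2 * struct.calcsize("<d"))
(third_from_binary,) = struct.unpack("<d", binary.read(8))
```

Both paths return the same value, but the XML path had to parse the two preceding elements first, while the binary path computed a byte position and read eight bytes.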

  17. HDF5/PDF • Both HDF5 and PDF formats are published and open • Both can include heterogeneous types of information • PDF focused on documents • HDF5 focused on collections of different types, with strong support for multi-dimensional arrays of numeric data • Both are portable across architectures

  18. HDF5 / Relational Databases • RDB provides access control features; HDF5 does not • RDB transaction based; HDF5 is not • Transactions / Logging introduce overhead that may not be needed • HDF5 not designed for many writers to ‘random’ locations • RDB provides built-in indices to values • HDF5 provides navigation to datasets / subsets within datasets • HDF5 files portable across platforms
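As an illustration of what "transaction based" means on the relational side, here is a small sketch using SQLite through Python's `sqlite3` module. SQLite is chosen only because it ships with Python; the table and values are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE reading (id INTEGER PRIMARY KEY, temp REAL)")
con.execute("INSERT INTO reading VALUES (1, 20.5)")
con.commit()

try:
    # Used as a context manager, the connection commits on success
    # and rolls back if an exception escapes the block.
    with con:
        con.execute("INSERT INTO reading VALUES (2, 21.0)")
        con.execute("INSERT INTO reading VALUES (1, 99.9)")  # duplicate key: fails
except sqlite3.IntegrityError:
    pass

row_count = con.execute("SELECT COUNT(*) FROM reading").fetchone()[0]
con.close()
```

The first insert inside the block succeeded on its own, yet it is undone along with the failed one, so only the original row remains. That all-or-nothing guarantee is the feature the slide refers to, and the bookkeeping behind it is the overhead the slide says may not be needed.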

  19. Discussion • How could daily temperature measurements made at various locations throughout a building be modeled in different formats? Filesystem, Binary Flat File, XML, PDF, Relational Database • What are some pros/cons of each?

  20. Review • HDF5 consists of a file format (self-describing, with many internal structures to support high performance), software, and a data model (file, dataset, datatype, dataspace, attribute, group, link) • HDF5 is designed to support management of high-volume, complex data, as well as data sharing and preservation

  21. HDF5 Data Model Example: ENSIGHT Automotive Crash Simulation

  22. Automotive Crash Simulation

  23. Automotive Crash Simulation

  24. Automotive Crash Simulation

  25. Solid Modeling

  26. Solid Modeling

  27. Modeled in HDF5

  28. Mesh Example in HDFView

  29. Stretch Break
