
Data Management




  1. Data Management David Nathan & Peter Austin & Robert Munro

  2. This section • Data management • Properties of data • Relational data model • XML • Example

  3. Workflows - description vs documentation • Description: something happened → something inscribed → cleaned up, selected, analysed (you applied knowledge, made decisions; NOT OF INTEREST!) → representations, lists, summaries, analyses → archived, presented, published • Documentation: something happened → recording → representations, eg transcription, annotation (recapitulates; you applied knowledge, techniques, made decisions, applied linguistic knowledge; FOCUS OF INTEREST!) → archived & ... ??

  4. Choosing values/priorities • Standards & compliance • Adeptness with tools • Modelling of phenomena, architecture of data • Dissemination/publishing • Preserving • Ethics, responsibility, protocol • Range, comprehensiveness • Intellectual rigour • Which are priorities? • Which are dispensable?

  5. Data should be: • explicit • consistent • robust • meaningful • conventional • adaptable, convertible, machine readable etc • useful!

  6. “Portability” • Bird and Simons 2003: language documentation data needs to have integrity, flexibility, longevity

  7. “Portability” • complete • explicit • documented • preservable • transferable • accessible • adaptable • not technology-specific • (also appropriate, accurate, useful etc!!)

  8. Data management • the way that data is structured is also information, that may be complex • properly structured data allows: • usage including manipulation, conversion, derivation • preservation • machine readability

  9. Data management systems • a data management system is a system you design for storing data and metadata: • information about content and structures • relationship between units of information • it is not necessarily tied to any particular software, or even a computer

  10. Naive management using filenames • a (too) simple management system: • information about a recording is captured in the filenames: 1st_int_john_5Aug.wav, market_conv_mj.wav, … • what does ‘int’ mean? • what information about the recording is missing?
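
The questions on this slide can be made concrete with a small sketch. It splits the two filenames from the slide into their parts and compares them with an explicit metadata record. The field names (`event_type`, `date`, `location`, `recordist`) are assumptions chosen for illustration, not part of the original slides; values that cannot be recovered from the filename are left as `None` rather than guessed.

```python
# Filenames are taken from the slide; all field names below are hypothetical.
FILENAMES = ["1st_int_john_5Aug.wav", "market_conv_mj.wav"]


def split_filename(filename):
    """Return the underscore-separated parts of a filename, without the extension."""
    stem = filename.rsplit(".", 1)[0]
    return stem.split("_")


# An explicit record names every field, so gaps become visible instead of hidden.
explicit_metadata = {
    "filename": "1st_int_john_5Aug.wav",
    "speaker": "john",
    "event_type": None,  # the filename only says 'int'; its meaning is not recorded
    "date": None,        # '5Aug' has no year, so the full date is unknown
    "location": None,
    "recordist": None,
}

missing = [key for key, value in explicit_metadata.items() if value is None]

for name in FILENAMES:
    print(name, "->", split_filename(name))
print("missing fields:", missing)
```

The filename yields only unlabeled fragments, while the explicit record shows exactly which information still has to be collected.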

  11. Data modelling • World/universe • Domain • Relevant: entities, properties, relationships • We also need formal ways to represent these

  12. Data modeling • data modelling is the process of designing your data management system: • what information do you need to record? • what are the units of information? • what are their properties (attributes)? • what are the relationships between the units of information? • how is the information etc likely to change in the future? • how can all this be represented?
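
As a minimal sketch of these questions, the units of information can be written down as entities with named properties and an explicit relationship between them. The entity names (`Speaker`, `Recording`), their fields, and the sample values are assumptions for illustration only; a real project would choose its own.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Speaker:
    """Entity: a person who appears in recordings."""
    speaker_id: str   # property that identifies the entity
    name: str         # property (attribute)


@dataclass
class Recording:
    """Entity: one audio file."""
    recording_id: str
    filename: str
    # Relationship: a recording refers to its speakers by id.
    speaker_ids: List[str] = field(default_factory=list)


# Hypothetical sample data.
speakers: Dict[str, Speaker] = {
    "s1": Speaker("s1", "Speaker A"),
    "s2": Speaker("s2", "Speaker B"),
}
recordings = [
    Recording("r1", "example_001.wav", ["s1"]),
    Recording("r2", "example_002.wav", ["s1", "s2"]),
]


def speaker_names(recording: Recording) -> List[str]:
    """Follow the relationship from a recording to the names of its speakers."""
    return [speakers[sid].name for sid in recording.speaker_ids]


for rec in recordings:
    print(rec.filename, speaker_names(rec))
```

Keeping speakers as their own entity means a change to a speaker's details is made once, which helps with the question of how the information is likely to change in the future.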

  13. Data management • two well-known formats for structured data: • relational database • eXtensible Markup Language (XML) • these are methods, not software or hardware • any other system for well-structured data could be OK, but generally: • smaller community of users, so fewer tools and less support • ... so errors more likely
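
To illustrate the XML option, here is a small sketch using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`recording`, `id`, `filename`, `format`) are invented for this example and do not follow any particular metadata standard.

```python
import xml.etree.ElementTree as ET

# Build one structured record; element names are hypothetical.
recording = ET.Element("recording", attrib={"id": "r1"})
ET.SubElement(recording, "filename").text = "market_conv_mj.wav"
ET.SubElement(recording, "format").text = "wav"

# Serialise to text, then parse it back to show the structure survives the round trip.
xml_text = ET.tostring(recording, encoding="unicode")
parsed = ET.fromstring(xml_text)

print(xml_text)
print(parsed.get("id"), parsed.find("filename").text)
```

Because the structure is carried by the markup itself, the record is plain text that can be read by any XML tool, not only by the program that wrote it.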

  14. Databases • Note that database has 3 senses: • a body of related information • type of software (eg Oracle, Access, Filemaker) • a model for the domain of information (ie. formulation of entities and relationships)

  15. Relational format • Uses tables • Table rows represent entities in a domain • Table columns represent properties/attributes of entities • Each cell represents one atomic unit of data • The order of rows and columns has no significance
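
These properties can be tried out with Python's built-in `sqlite3` module. The sketch below is only an illustration: the table name, column names, and durations are assumptions. It loads the same rows in two different orders and shows that the stored information is the same, and that columns are addressed by name rather than by position.

```python
import sqlite3

# Hypothetical rows: (filename, duration in seconds).
ROWS = [("1st_int_john_5Aug.wav", 312.0), ("market_conv_mj.wav", 95.5)]


def load(rows):
    """Create an in-memory table and insert the given rows in the given order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recording (filename TEXT PRIMARY KEY, duration_seconds REAL)"
    )
    conn.executemany("INSERT INTO recording VALUES (?, ?)", rows)
    return conn


a = load(ROWS)
b = load(list(reversed(ROWS)))

# Row order carries no meaning: any order we want is requested explicitly.
query = "SELECT filename, duration_seconds FROM recording ORDER BY filename"
same_rows = a.execute(query).fetchall() == b.execute(query).fetchall()

# Column order carries no meaning either: columns are selected by name.
by_name = a.execute(
    "SELECT duration_seconds, filename FROM recording WHERE filename = ?",
    ("market_conv_mj.wav",),
).fetchone()

print(same_rows, by_name)
```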

  16. Representing a relational design • simplest example: a table (TABLE NAME) with a single field (field name)

  17. Representing a relational design • less trivial entity: a table (TABLE NAME) with two fields (field 1, field 2)

  18. Representing a relational design • less trivial domain = one to many: CONTINENT (name), COUNTRY (name)

  19. Non-trivial domains • non-trivial domains have many-to-many relationships: AUTHOR (name, .....), SUBJECT (name, .....)
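
One common way to implement a many-to-many relationship in a relational database is a third "linking" table that holds pairs of ids. The sketch below uses Python's `sqlite3` module; the linking table name `author_subject`, the `id` columns, and all sample values are assumptions for illustration and are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE author  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE subject (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Linking table: one row per (author, subject) pair.
    CREATE TABLE author_subject (
        author_id  INTEGER NOT NULL REFERENCES author(id),
        subject_id INTEGER NOT NULL REFERENCES subject(id),
        PRIMARY KEY (author_id, subject_id)
    );
    INSERT INTO author  (id, name) VALUES (1, 'Author A'), (2, 'Author B');
    INSERT INTO subject (id, name) VALUES (1, 'Phonology'), (2, 'Syntax');
    INSERT INTO author_subject (author_id, subject_id) VALUES (1, 1), (1, 2), (2, 2);
    """
)


def subjects_of(author_name):
    """All subjects linked to one author."""
    rows = conn.execute(
        """
        SELECT subject.name FROM subject
        JOIN author_subject ON subject.id = author_subject.subject_id
        JOIN author ON author.id = author_subject.author_id
        WHERE author.name = ? ORDER BY subject.name
        """,
        (author_name,),
    ).fetchall()
    return [row[0] for row in rows]


def authors_of(subject_name):
    """All authors linked to one subject."""
    rows = conn.execute(
        """
        SELECT author.name FROM author
        JOIN author_subject ON author.id = author_subject.author_id
        JOIN subject ON subject.id = author_subject.subject_id
        WHERE subject.name = ? ORDER BY author.name
        """,
        (subject_name,),
    ).fetchall()
    return [row[0] for row in rows]


print(subjects_of("Author A"), authors_of("Syntax"))
```

An author can have many subjects and a subject many authors, yet each cell still holds one atomic value.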

  20. From model to implementation • implementing table relationships: CONTINENT (id, name), COUNTRY (id, name, continent_id)
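
The CONTINENT/COUNTRY design on this slide could be implemented roughly as follows with Python's `sqlite3` module. The table and field names come from the slide; the column types, the foreign-key constraint, and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript(
    """
    CREATE TABLE continent (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE country (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        -- one-to-many: each country points at exactly one continent
        continent_id INTEGER NOT NULL REFERENCES continent(id)
    );
    INSERT INTO continent (id, name) VALUES (1, 'Africa'), (2, 'Asia');
    INSERT INTO country (name, continent_id)
        VALUES ('Kenya', 1), ('Nigeria', 1), ('Japan', 2);
    """
)


def countries_in(continent_name):
    """Follow the relationship from a continent to its countries."""
    rows = conn.execute(
        """
        SELECT country.name FROM country
        JOIN continent ON continent.id = country.continent_id
        WHERE continent.name = ? ORDER BY country.name
        """,
        (continent_name,),
    ).fetchall()
    return [row[0] for row in rows]


def orphan_rejected():
    """Return True if a country pointing at a non-existent continent is refused."""
    try:
        conn.execute("INSERT INTO country (name, continent_id) VALUES ('Nowhere', 999)")
    except sqlite3.IntegrityError:
        return True
    return False


print(countries_in("Africa"), orphan_rejected())
```

Storing `continent_id` in the COUNTRY table, rather than repeating the continent's name, is what lets one continent have many countries while each continent is described only once.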

  21. Designing a database • Determine the domain, entities and relationships • Experiment with scenarios • Any non-trivial model will evolve as it is thought out and tested • Normalisation is the process of refining models
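
One typical refinement step in normalisation is removing repeated information by splitting a flat table into two related ones. The sketch below does this in plain Python; the flat records, field names, and values are all hypothetical.

```python
# A flat, un-normalised list: speaker details are repeated on every row.
flat = [
    {"filename": "a.wav", "speaker": "Speaker A", "speaker_birth_year": 1950},
    {"filename": "b.wav", "speaker": "Speaker A", "speaker_birth_year": 1950},
    {"filename": "c.wav", "speaker": "Speaker B", "speaker_birth_year": 1972},
]

speakers = {}    # speaker name -> one record per speaker
recordings = []  # one record per file, pointing at a speaker by id

for row in flat:
    name = row["speaker"]
    if name not in speakers:
        # First time we see this speaker: store the details once and assign an id.
        speakers[name] = {
            "id": len(speakers) + 1,
            "name": name,
            "birth_year": row["speaker_birth_year"],
        }
    recordings.append(
        {"filename": row["filename"], "speaker_id": speakers[name]["id"]}
    )

print(list(speakers.values()))
print(recordings)
```

After the split, a correction to a speaker's birth year is made in one place instead of on every recording row.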

  22. Practical example • Create a database model for some audio metadata
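
One possible starting point for this exercise is sketched below with Python's `sqlite3` module. It is not the model from the presentation: every table name, field name, and sample value is an assumption, and a real model would evolve as it is tested against scenarios.

```python
import sqlite3

SCHEMA = """
CREATE TABLE speaker (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE recording (
    id               INTEGER PRIMARY KEY,
    filename         TEXT NOT NULL UNIQUE,
    event_type       TEXT,   -- spelled out in full, not abbreviated
    date_recorded    TEXT,   -- ISO 8601 date, NULL when unknown
    duration_seconds REAL
);
-- A recording can have many speakers and a speaker many recordings.
CREATE TABLE recording_speaker (
    recording_id INTEGER NOT NULL REFERENCES recording(id),
    speaker_id   INTEGER NOT NULL REFERENCES speaker(id),
    PRIMARY KEY (recording_id, speaker_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical sample data.
conn.execute("INSERT INTO speaker (id, name) VALUES (1, 'Speaker A')")
conn.execute(
    "INSERT INTO recording (id, filename, event_type, date_recorded) "
    "VALUES (1, 'example_001.wav', 'interview', '2005-08-05')"
)
conn.execute("INSERT INTO recording_speaker (recording_id, speaker_id) VALUES (1, 1)")


def recordings_with(speaker_name):
    """List (filename, event_type) for every recording a speaker appears in."""
    return conn.execute(
        """
        SELECT recording.filename, recording.event_type FROM recording
        JOIN recording_speaker ON recording.id = recording_speaker.recording_id
        JOIN speaker ON speaker.id = recording_speaker.speaker_id
        WHERE speaker.name = ? ORDER BY recording.filename
        """,
        (speaker_name,),
    ).fetchall()


print(recordings_with("Speaker A"))
```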

  23. What does all this achieve? • conceptual/intellectual validity • scalable, searchable, modular • machine readable • in fact, portable: • complete • explicit • documented • preservable • transferable • accessible • adaptable • not technology-specific

  24. Stop here!
