Discover Data Service Models focusing on access, dissemination, and microdata. Hands-on experience with various data retrieval systems and tools. Review and discuss practical approaches for managing data in library settings.
DLI Training Workshop Hosted by the University of Regina Library December 1999 Chuck Humphrey
Day 1 : A.M. • Review data service models within the framework of: • access and dissemination • aggregate data and microdata • statistics versus data
Day 1 : A.M. • Hands-on work with aggregate data • CANSIM • E-STAT • Census ‘96 • Health Indicators Database
Day 1 : P.M. • Microdata retrieval systems • LANDRU (UC) • introduction with hands-on experience • ISLAND (UBC) • introduction with hands-on experience
Day 2 : A.M. • Update to DLI since 1997 • experiences with new web services • Spatial Data Retrieval: GEODE
Day 2 : P.M. • Other data extractors • Discussion about possible COPPUL data access projects • Roundtable discussion about introducing data to our reference colleagues
Access and Dissemination Tools for Aggregate and Microdata
Data Service Models • Begin by discussing data service models within the framework of three topics: • access and dissemination • aggregate data and microdata • statistics versus data
Data Service Models • Models were presented as a continuum during the 1997 DLI workshop: • Treat as a Collection and Provide Reference • Install Data and Provide Access • “Order & Pass-through” Service
Data Service Models • Choose a model that matches your staff and computing resources
Acquisition: Fill a Request
• Locate data
• Order data & documentation
Collection Development
• Select & Locate data
• Order data & documentation
• Catalogue data & documentation
• Install & Store (data & documentation)
Reference
• Search for data
• Interpret documentation
• Retrieve or download data
• Process data: change formats, subset cases or variables, aggregate cases, merge files, analyze data
Find a referral partner on campus
1. Access and Dissemination
The Inventory Model • In the traditional inventory model, roughly half of the support goes to putting items on the shelf, while the other half goes to finding and getting the items off the shelf. Source: Darlene Fichter
The Access Model • With the access model, support is split between getting information into a deliverable state and finding appropriate ways of retrieving and disseminating the information.
Access/Dissemination Issues • managing vendor licenses • are the license conditions realistic? • what type of identification or authentication is required?
Access/Dissemination Issues • matching products with technology • is the product dependent on a specific operating system? • is the product software dependent?
Access/Dissemination Issues • determining access methods • stand-alone, LAN or WAN? • what are the finding tools?
Access/Dissemination Issues • determining dissemination options • what are the output formats? • does the output require special storage considerations?
The Access Model • These issues and others about access and dissemination will underlie our discussions over the next two days.
2. Aggregate and Microdata
Data Types • In the 1997 DLI workshop, time was spent discussing the differences between aggregate data and microdata. • Each type has an impact on data access models.
Aggregate Data • Aggregate data consist of statistical summaries derived from original data collections and organized in tables according to the following properties: • socio-economic phenomena • spatial representation • time
Aggregate Data • Statistical summaries • these summaries take the form of counts, totals, sums, averages or percentages
[Example table] Spatial representation and time are fixed; age and sex are displayed; cells contain counts.
[Example table] Spatial representation and age are fixed; year and sex are displayed.
[Example table] Age and time are fixed; geography and sex are displayed.
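A minimal sketch of how a table of this kind can be derived: two properties are held fixed and the remaining ones are cross-tabulated, with each cell holding a count. The records, place names and the `age_by_sex_table` helper are hypothetical and serve only as illustration.

```python
from collections import Counter

# Hypothetical unit records: (region, year, age_group, sex).
records = [
    ("Regina", 1996, "15-24", "F"),
    ("Regina", 1996, "15-24", "M"),
    ("Regina", 1996, "25-44", "F"),
    ("Regina", 1996, "25-44", "F"),
    ("Saskatoon", 1996, "15-24", "M"),
    ("Regina", 1991, "15-24", "F"),
]

def age_by_sex_table(rows, region, year):
    """Fix geography and time, then count cases in each age-by-sex cell."""
    return Counter(
        (age, sex)
        for (r, y, age, sex) in rows
        if r == region and y == year
    )

table = age_by_sex_table(records, "Regina", 1996)
for (age, sex), count in sorted(table.items()):
    print(age, sex, count)
```

Fixing a different pair of properties (for example age and time) and displaying the others would give the other table layouts in the same way.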
Aggregate Data • Aggregate data products • usually stored as a series of related tables in some type of database structure requiring special retrieval software (examples from STC include C86, C91, CBP, CANSIM, etc.)
Microdata • Microdata are • usually anonymised records of actual respondents from a survey • unsummarized, i.e., observations in the form in which the data were collected • in a raw format requiring some form of processing, typically a flat ASCII file
Microdata: Cases 3 & 4 from the GSS 2 Main File
Case 3: 000031214110011982001212222221002098200121222222401121111241112121112205020197111971021212222225211026121204300140955720411313022111999901978787879702221411271412400315000616611232222222221111172626162212222666666636212000000020320222224222000022204141101101102111111122111000000210000000002100000000010000000000200000423300200200100000100200
Case 4: 000041100110011101102122222221002009200212222222021111111231212111211208120193811938044122222221111052201203901007504721031191012233520406058787870304221303420708300400001420007111222122211721575656565555555666666656565000555500210222111111110000001111100001101112212122111011010110000110101100000000000000000000000000000000000000000000000000
Microdata: First 14 Cases from the GSS 2 Episode File
000041144504000800024010000000012518733
000041144308000900006011222220012518733
000041141709000930003031222220012518733
000041141709301100009031222220012518733
000041141211001330015011222220012518733
000041149113301630018011222220012518733
000041141216301800009011222220012518733
000041143018002000012031222220012518733
000041147920002015001541222220012518733
000041143720152130007531222220012518733
000041147921302145001542221220012518733
000041144321452200001512221220012518733
000041147522002300006012221220012518733
000041144523002800030010000000012518733
Impact on Data Access • Aggregate data have been processed and organized in a database structure • must locate the table with desired data • must deal with each database structure • must deal with accompanying retrieval software
Impact on Data Access • Microdata must be processed or subset for subsequent analysis • must identify desired variables and cases (data documentation) • must deal with the raw data file structure • must address the issue of desired formats
Impact on Data Access • These and other differences between aggregate data and microdata will also be part of our discussions about data access.
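As a rough illustration of dealing with a raw file structure, the sketch below slices three of the GSS 2 episode records shown earlier into named fields. The column positions and field names are assumptions inferred only from the sample records (the digits read as start and end times are consistent with the digits read as a duration in minutes); the actual layout has to come from the survey's data documentation.

```python
# Hypothetical column layout (0-based start, end), inferred from the sample
# records only. The real layout must be taken from the survey codebook.
LAYOUT = {
    "case_id": (0, 5),
    "activity": (7, 10),
    "start": (10, 14),
    "end": (14, 18),
    "duration": (18, 22),
}

def parse_record(line):
    """Cut one fixed-width record into a dict of field name -> raw string."""
    return {name: line[a:b] for name, (a, b) in LAYOUT.items()}

def minutes(hhmm):
    """Convert an HHMM string to minutes (hours may run past 24)."""
    return int(hhmm[:2]) * 60 + int(hhmm[2:])

records = [
    "000041144504000800024010000000012518733",
    "000041144308000900006011222220012518733",
    "000041141709000930003031222220012518733",
]

parsed = [parse_record(r) for r in records]

# Subset cases: keep only the episodes that belong to one respondent.
subset = [p for p in parsed if p["case_id"] == "00004"]

for p in subset:
    print(p["activity"], p["start"], p["end"], int(p["duration"]))
```

Selecting variables then amounts to choosing which entries of the layout to keep, and changing formats amounts to writing the parsed fields out in whatever form the user's software expects.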
3. Statistics versus Data
Statistics versus Data • The term statistics is commonly used to describe the numeric summaries, such as counts, totals, sums and averages, that people use to make a point in a study or report.
Statistics versus Data • The term data refers to numeric files containing a collection of raw information with many observations that can be analyzed from a variety of perspectives.
Statistics versus Data • Typically, generalizations are drawn from analyses of a data file. • For example, the information provided by all of the individuals in a survey is considered to be data, while the percent of respondents in a survey with a university degree is a statistic.
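A minimal sketch of the distinction, using made-up survey responses (the records and field names are hypothetical): the list of responses is the data, and the single percentage computed from it is a statistic.

```python
# Hypothetical survey responses: this collection of observations is the data.
respondents = [
    {"id": 1, "university_degree": True},
    {"id": 2, "university_degree": False},
    {"id": 3, "university_degree": False},
    {"id": 4, "university_degree": False},
]

# The statistic: percent of respondents with a university degree.
with_degree = sum(1 for r in respondents if r["university_degree"])
percent_with_degree = 100 * with_degree / len(respondents)

print(f"{percent_with_degree:.1f}% of respondents have a university degree")
```

The same data file could yield many other statistics, which is why it can be analyzed from a variety of perspectives while a published figure cannot.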
Blurring Statistics and Data • In the print world, statistical information is usually found in statistical abstracts, census monographs and serial publications by government agencies.
Blurring Statistics and Data • In the digital world this numeric information is now appearing with electronic table access on CD-ROM, the Internet, or in electronic journals. • Many aggregate data products now fall in this category.
Blurring Statistics and Data • In other instances, the responses in the microdata file of a survey may provide the answer to a statistical question. • For example, the percentage of the population in Canada with high blood pressure may be determined from the National Population Health Survey.
Impact on Data Access • The use of aggregate data products and microdata files to answer statistical questions will also contribute to our discussions about data access.
Context for Aggregate Data • Simplifying access to aggregate data is partially driven by a desire to use these products to answer general statistics questions. • The demand for facts and figures at the reference desk remains constant or steadily increases.
Aggregate Data Challenges • The challenges of creating access to aggregate data were summarized earlier: • finding a table with the desired statistics • dealing with each database structure • coping with a variety of retrieval software