BLS Metadata Repository – Issues and Progress
Daniel Gillman
US Bureau of Labor Statistics

Outline
• BLS Programs
• Time Series
• Data Dissemination
• Metadata
• Model
• BLS Repository
Wolfram Data Summit

BLS Programs
• 8 Major Program Areas
  • Inflation & Prices
  • Employment
  • Unemployment
  • Pay & Benefits
  • Spending & Time Use
  • Productivity
  • Workplace Injuries
  • International

Time Series
• Measure or index over time
  • Index: number relative to fixed point
• 30 series types
• Subset by
  • Industry
  • Occupation
  • Geography (state, county, MSA, etc.)
• Tables
  • Generated from time series data

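The "index" bullet above can be made concrete with a small sketch. An index expresses each observation relative to the value at a fixed base period, conventionally scaled so the base equals 100. The function name `to_index` and the sample figures are made up for illustration; they are not BLS data or a BLS method.

```python
def to_index(values, base_position=0, scale=100.0):
    """Express each value relative to the value at the base period.

    Hypothetical helper for illustration only: the base period is the
    "fixed point", and it maps to `scale` (100 by convention).
    """
    base = values[base_position]
    if base == 0:
        raise ValueError("base-period value must be non-zero")
    return [scale * v / base for v in values]


# Made-up measurements for three consecutive periods.
prices = [50.0, 55.0, 60.0]
print(to_index(prices))  # [100.0, 110.0, 120.0]
```

Read this way, an index value of 110 means the measure is 10% above its base-period level.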
Data Dissemination
• Web site: http://www.bls.gov
• 8 major numbers (m = monthly, q = quarterly)
  • Unemployment rate (m)
  • Consumer price index (m)
  • Producer price index (m)
  • Employment cost index (q)
  • Average hourly earnings (m)
  • Payroll employment (m)
  • Productivity (q)
  • Import price index (m)
• All time series
• Tables

Data Dissemination

Data Dissemination
• Organized by programs
  • Time series in ASCII files by FTP
  • Some tables
  • Crude database search
• Little metadata
  • Web site itself
  • Hidden in FTP directories
  • Handbook of Methods
  • Seasonal adjustment

Data Dissemination
• Requires knowing
  • Organization of BLS
  • Specific surveys or programs
  • Specific series
  • Terms & technical meaning
    • E.g., earnings
• Relies on “Series ID”
  • Brittle scheme for identifying series
  • Known by power users

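To show why a positional Series ID is brittle, here is a minimal sketch that splits a CPI-style ID into fields by character position. The field layout (2-character survey prefix, 1-character seasonal code, 1-character periodicity code, 4-character area code, then item code) is assumed here for illustration and should be checked against the series ID formats BLS publishes. The names `CpiSeriesId` and `parse_cpi_series_id` are hypothetical.

```python
from typing import NamedTuple


class CpiSeriesId(NamedTuple):
    survey: str
    seasonal: str
    periodicity: str
    area: str
    item: str


def parse_cpi_series_id(series_id: str) -> CpiSeriesId:
    """Split an ID by fixed character positions (assumed layout).

    Every field depends on its exact offset, so the parser only works
    for IDs that follow this one survey's layout.
    """
    if len(series_id) < 9 or not series_id.startswith("CU"):
        raise ValueError(f"not a CPI-style series ID: {series_id!r}")
    return CpiSeriesId(
        survey=series_id[0:2],
        seasonal=series_id[2],
        periodicity=series_id[3],
        area=series_id[4:8],
        item=series_id[8:],
    )


print(parse_cpi_series_id("CUUR0000SA0"))
```

An ID from another program, with a different prefix and field widths, needs its own parser. A user has to know both the program and its layout before the ID means anything, which is the dependence on power-user knowledge the slide describes.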
Metadata
• Supports
  • Dissemination
  • Data.gov
  • Time series and tables
• Does not support
  • Internal processing
  • Describing survey life-cycle
  • Microdata (respondent level)

Metadata
• Hard to collect
• Need “simple” model
  • Maybe not so easy
• Basic metadata already on FTP sites
• Support finding data by
  • Traditional means
    • Series ID, BLS structure
  • New means
    • Subject matter

Metadata
• Previous BLS focus group study
  • Users find data by
    • Time
    • Place
    • Subject (title or keywords)
  • Structure of agencies not known
  • Technical terms not known
• Metadata must support this

Model
• Model –
  • Time Series
  • Data Element
  • Classification
  • Concept
  • Naming Convention

Model

BLS Repository
• Under development
• Requirement – fast response
• Testing –
  • Flat single table design
  • Using Apache Lucene Solr
    • Open source enterprise search
  • Various interface approaches
    • Visual Basic
    • Java

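As a rough sketch of what a flat single-table design queried through Solr might look like, the example below describes one series as a single denormalized record and builds a request URL for Solr's standard select handler. The host, the core name `bls_series`, and every field name are assumptions made for illustration; the slides do not describe the actual schema. The parameters `q`, `fq`, `rows`, and `wt` are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host and core name are placeholders.
SOLR_SELECT_URL = "http://localhost:8983/solr/bls_series/select"

# One flat record per time series: no joins, every searchable
# attribute sits on the same row (all field names are assumed).
flat_record = {
    "series_id": "CUUR0000SA0",
    "title": "Consumer price index",
    "program": "Inflation & Prices",
    "geography": "U.S. city average",
    "periodicity": "monthly",
}


def build_query_url(text, filters=None, rows=10):
    """Build a select URL: free text in q, one fq filter per facet."""
    params = [("q", text), ("rows", str(rows)), ("wt", "json")]
    for field, value in sorted((filters or {}).items()):
        params.append(("fq", f'{field}:"{value}"'))
    return SOLR_SELECT_URL + "?" + urlencode(params)


print(build_query_url("price index", {"periodicity": "monthly"}))
```

Keeping each series in one flat record trades some redundancy for lookups that need no joins, which fits the fast-response requirement stated on the slide.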
BLS Repository
• Need term map
  • Common terms to technical terms
  • Definitions for technical terms
• Concept-based management
  • Link terms to relevant data
  • Manage multi-faceted search
• Development schedule
  • Still a research project
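The term map idea can be sketched as a lookup from everyday words to the technical terms used in series titles. Words with no mapping pass through unchanged. The entries in `TERM_MAP` are illustrative guesses, not BLS's actual vocabulary, and `expand_query` is a hypothetical helper.

```python
# Illustrative mapping only: common word -> technical terms.
TERM_MAP = {
    "inflation": ["consumer price index", "producer price index"],
    "jobless": ["unemployment rate"],
    "pay": ["average hourly earnings", "employment cost index"],
}


def expand_query(query):
    """Replace common words with their technical terms, keeping order.

    Unmapped words (for example a place name) are kept as they are,
    and duplicates are dropped.
    """
    expanded = []
    for word in query.lower().split():
        for term in TERM_MAP.get(word, [word]):
            if term not in expanded:
                expanded.append(term)
    return expanded


print(expand_query("Pay Texas"))
# ['average hourly earnings', 'employment cost index', 'texas']
```

The expanded terms could then be passed to the search engine as the subject part of a query, alongside separate filters for time and place.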