Automated Data Analysis
Nishan Ahmed
Data Management Training
Cairo, Egypt
April 21 - 25, 2013
National Center for Immunization & Respiratory Diseases
Influenza Division
Objectives
• Why Automated Data Analysis (ADA)
• What does the ADA process involve
• Preparation Steps
• Which Applications can be used
• Software Examples to set up ADA
• Software Considerations
• Basic requirements to develop ADA
• Benefits of a well-developed ADA system
Why Automate Data Analysis
• To simplify and ease routine data analysis
• To standardize analysis output
  • Uniform reports
  • Replicable results
• To make reporting efficient
• To make sharing of information easier
• To make data management processes easier and more effective
  • Enable frequent runs for data checks
  • Catch errors quickly
  • Up-to-date editing
  • Quick intervention when an error pattern appears
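The data checks mentioned on this slide can be scripted once and rerun after every batch of data entry. The following is a minimal sketch in Python, not a prescribed method: the records, the field names (id, age, onset_date) and the 0 to 120 age range are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical surveillance records; field names are illustrative only.
records = [
    {"id": "001", "age": "34", "onset_date": "2013-04-02"},
    {"id": "002", "age": "", "onset_date": "2013-04-03"},
    {"id": "002", "age": "207", "onset_date": "2013-04-05"},
]


def check_records(rows):
    """Return (record id, problem) pairs found by a few simple routine checks."""
    problems = []
    id_counts = Counter(row["id"] for row in rows)
    for row in rows:
        # Every row that shares its id with another row is reported.
        if id_counts[row["id"]] > 1:
            problems.append((row["id"], "duplicate id"))
        if row["age"] == "":
            problems.append((row["id"], "missing age"))
        # Assumed plausible range of 0 to 120 years.
        elif not row["age"].isdigit() or not 0 <= int(row["age"]) <= 120:
            problems.append((row["id"], "age out of range"))
    return problems


issues = check_records(records)
for record_id, problem in issues:
    print(record_id, problem)
```

Because the checks live in one function, the same rules are applied on every run, which is what makes frequent runs cheap and error patterns visible early.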
What does the ADA process involve
• Plan analysis needs in advance of data collection
• Identify data needed for each process
• Plan and create export or import table templates (e.g., merged tables)
• Verify desired data outputs
• Create standardized routine reports
• Create queries in advance
• Write code in advance
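As one possible illustration of a merged table template and a standardized report written in advance, the Python sketch below joins two small tables on a shared key and counts records by site and result. The table names, field names and the "pending" label are assumptions for the example, not part of the original training material.

```python
from collections import Counter

# Hypothetical tables that share an "id" field.
enrollment = [
    {"id": "001", "site": "A"},
    {"id": "002", "site": "B"},
    {"id": "003", "site": "A"},
]
lab_results = [
    {"id": "001", "result": "positive"},
    {"id": "002", "result": "negative"},
]


def merge_tables(left, right, key="id"):
    """Left join: keep every left row and add matching right fields if present."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


def site_report(rows):
    """Count records per (site, result); a missing result is labelled 'pending'."""
    return Counter((row["site"], row.get("result", "pending")) for row in rows)


merged = merge_tables(enrollment, lab_results)
report = site_report(merged)
for (site, result), n in sorted(report.items()):
    print(f"{site}\t{result}\t{n}")
```

Writing the merge and the report as functions before data collection starts means the output layout is fixed and can be verified against the desired outputs on test data.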
Preparation Steps
• Understand the study objectives
• Identify routine data outputs
• Identify required graphs and summary tables
• Understand the purpose of each report
• Identify core information & variables required
• Outline the data management objectives
  • Evaluation of data quality
  • Enhancement of data quality
  • Tracking data input activities
  • Identification of emerging odd patterns
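Tracking data input activities can be as simple as counting how many records were entered each day and listing the days with none. The Python sketch below assumes a hypothetical log of entry dates; the dates and the choice of "zero entries" as the odd pattern to flag are illustrative only.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical entry log: the date on which each record was keyed in.
entry_dates = [
    date(2013, 4, 21), date(2013, 4, 21), date(2013, 4, 22),
    date(2013, 4, 24), date(2013, 4, 24), date(2013, 4, 24),
]


def daily_entry_counts(dates, start, end):
    """Return {day: records entered} for every day from start to end inclusive."""
    counts = Counter(dates)
    n_days = (end - start).days + 1
    all_days = [start + timedelta(days=i) for i in range(n_days)]
    return {day: counts.get(day, 0) for day in all_days}


counts = daily_entry_counts(entry_dates, date(2013, 4, 21), date(2013, 4, 25))
# Days with no data entry are one simple "odd pattern" worth reviewing.
gaps = [day for day, n in counts.items() if n == 0]
for day in gaps:
    print("No records entered on", day.isoformat())
```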
Software Examples to set up ADA
• Routine and Basic ADA
  • Database Software:
    • Excel
    • Access
    • Epi Info
• Basic to Advanced ADA
  • Statistical Analysis Software:
    • STATA
    • SAS
    • R statistics
Which Applications can be used
• Database software capabilities:
  • Capture information
  • Queries
  • Advanced Code
  • Reporting
  • Summary Tables
  • Graphs
• Statistical software focuses on:
  • Data acquisition and sharing
  • Basic and advanced statistical analysis
  • Output and reporting
Software Considerations - Excel
• Spreadsheet format has limited application
• Simple tables can be created easily
• Has built-in formulas that make it easy to perform simple calculations
• Time-consuming to set up table links
• Links have limitations
• Macros are available
• Limited number of columns/rows allowed
• Better for smaller databases
Software Considerations - ACCESS
• Database application
• Includes an easy-to-use guided setup for queries and reports
• Advanced query capability available
• Built-in formulas make it easy to perform simple calculations
• VBA coding: created behind the scenes for tables, queries, reports
• User can write custom macros to perform specific tasks
• Limited scope for data manipulation
• Limited number of columns within table format
Software Considerations - Statistical
• Can perform basic functions
• Good table merging capabilities
• Import/Export functions compatible with most applications
• Code available to create various report outputs
• Able to analyse large data sets
• Automated functions may be set up for reporting, including sending an automated email
• Simple and advanced macros can be written
• Training is essential but support websites are available
• Unlimited scope for advancements
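The slide names STATA, SAS and R; the same idea of a scripted report plus an automated email can be sketched in any scripting language. The Python example below, using only the standard library, writes a summary table to CSV text and builds (but does not send) an email with the table attached. The summary values, addresses and file name are hypothetical, and actually sending the message would need an SMTP server, which is left out here.

```python
import csv
import io
from email.message import EmailMessage

# Hypothetical summary rows produced by an earlier analysis step:
# (site, number enrolled, number positive).
summary = [("A", 12, 3), ("B", 8, 1)]


def summary_to_csv(rows):
    """Render the summary table as CSV text with a fixed header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["site", "enrolled", "positive"])
    writer.writerows(rows)
    return buffer.getvalue()


def build_report_email(csv_text, sender, recipient):
    """Build an email message with the CSV report attached (not sent here)."""
    msg = EmailMessage()
    msg["Subject"] = "Weekly summary report"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("The weekly summary table is attached.")
    msg.add_attachment(
        csv_text.encode("utf-8"),
        maintype="text",
        subtype="csv",
        filename="weekly_summary.csv",
    )
    return msg


csv_text = summary_to_csv(summary)
message = build_report_email(csv_text, "analyst@example.org", "team@example.org")
print(csv_text)
```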
Basic Requirements to Develop ADA
• Problem solver and constant learner
  • Search websites for formulas and guidelines
  • Willingness to ask for help when it's needed
  • Practice
• Logical insight and attention to detail
  • Creating queries
  • Creating & understanding desired output
• Additional helpful skills
  • Statistical background
    • Descriptive Stats
    • Summary Statistics
  • Coding skills (e.g., SQL)
  • Coding translation ability and adaptation
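To show how the SQL and descriptive statistics skills listed above fit together, here is a small Python sketch that runs a query written in advance against an in-memory SQLite database and summarizes age by site. The table name (cases), its columns and the example values are assumptions for illustration only.

```python
import sqlite3
import statistics

# Query written once, in advance, and reused on every run.
AGE_BY_SITE_SQL = (
    "SELECT site, age FROM cases WHERE age IS NOT NULL ORDER BY site"
)


def load_example_db():
    """Create an in-memory database with a few hypothetical case records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cases (id TEXT, site TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO cases VALUES (?, ?, ?)",
        [
            ("001", "A", 34),
            ("002", "A", 50),
            ("003", "B", 27),
            ("004", "B", None),  # missing age, excluded by the query
            ("005", "A", 42),
        ],
    )
    return conn


def describe_ages(conn):
    """Return {site: {"n", "mean", "median"}} for non-missing ages."""
    ages = {}
    for site, age in conn.execute(AGE_BY_SITE_SQL):
        ages.setdefault(site, []).append(age)
    return {
        site: {
            "n": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
        for site, values in ages.items()
    }


conn = load_example_db()
stats = describe_ages(conn)
conn.close()
for site, row in stats.items():
    print(site, row)
```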
Benefits
• Less time spent
  • Recreating queries to find relevant data
  • Recreating reports
• Reduced effort in reporting data to collaborators and stakeholders
• Detect data inconsistencies more quickly
• Perform more consistent analysis
• Better understanding of your data
• Increase data re-use
• Opportunity to standardize code for advanced analysis