1. Introduction to Statistical Computing in Clinical Research Biostatistics 212
Lecture 1
Good afternoon, and welcome! This is the first day of class for Biostatistics 212, and, I believe, your first time together as Advanced Clinical Research Scholars (including both ATCR and Masters Program scholars).
2. Today... Course overview
Course objectives
Course details: grading, homework, etc
Schedule, lecture overview
Where does Stata fit in?
Basic data analysis with Stata
Stata demos
Lab
3. Course Objectives Introduce you to using STATA and Excel for
Data management
Basic statistical and epidemiologic analysis
Turning raw data into presentable tables, figures and other research products
Prepare you for Fall courses
Start analyzing your own data
Objective 1 – you may have your own data; we supply STATA; here’s how to start using it
Objective 2 – a response to complaints about the Glantz course
2 options – spend more time teaching you how to do his homework
- or teach you what we think you’ll need to know
Objective 3 – The idea here is to teach you how to do clinical research. What does that mean? Use data to make products! Knowing how to type the right STATA command is a necessary skill, but you also need to know how to manage data, clean it, document your analyses, export data, and create products that you want other people to understand – i.e. publish. In the future, you may not be the one to do the data analysis, but even so, it’s important for you to understand how it works, to make you a self-sufficient clinical researcher.
4. Course details Introduction to Statistical Computing - 1 unit
Schedule – 7 lectures, 7 lab sessions, on 7 Tuesdays in a row
Dates: August 4 – September 15
Lectures 1:15-2:45
Labs 3:00-4:00
All in China Basin, CBL 6702 (6704 for lab)
Final Project Due 9/22/09
5. Course details Introduction to Statistical Computing
Grading: Satisfactory/Unsatisfactory
Requirements:
-Hand in all six Labs (even if late)
-Satisfactory Final Project
-80% of total points
Reading: Optional
6. Course details, cont Course Director
Mark Pletcher
Teaching Assistants
Justin Parekh – Section 1
Elena Flowers – Section 2 (Mac)
Tamara Castillo
Maurice Garcia
Lecturers
Andy Choi
Jennifer Cocohoba
Lab Instructor
Mandana Khalili
7. Overview of lecture topics 1- Introduction to STATA
2- Do files, log files, and workflow in STATA
3- Generating variables and manipulating data with STATA
4- Using Excel
5- Basic epidemiologic analysis with STATA
6- Making a figure with STATA
7- Organizing a project, making a table
8. Overview of labs Lab 1 – Load a dataset and analyze it
Lab 2 – Learn how to use do and log files
Lab 3* – Import data from Excel, generate new variables and manipulate data, document everything with do and log files.
Lab 4 – Using and creating Excel spreadsheets
Lab 5* – Epidemiologic analysis using Stata
Lab 6 – Making a figure with Stata
Last lab session will be dedicated to working on the Final Project
* - Labs 3 and 5 are significantly longer and harder than the others
9. Overview of labs, cont Official Lab time is 3:00-4:00, but we will start right after lecture, and you can leave when you are done.
10. Overview of labs, cont Labs are due the following week prior to lecture. Labs turned in late (less than 1 week) will receive only half credit; after that, no points will be awarded. However, ALL labs must be turned in to pass the class (even if no points are awarded).
Lab 1 is paper
Labs 2-6 are electronic files, and should be emailed to your section leader’s course email address: biostat212_section1@yahoo.com (Justin) or biostat212_section2@yahoo.com (Elena)
11. Final Project Create a Table and a Figure using your own data, document analysis using Stata.
Due 1 week after the last lab session; 20 points docked for each day late.
12. Course Materials Course Overview
Final Project
Lectures and Labs (just in time)
Other handouts
Books
13. Getting started with STATA Session 1
14. Types of software packages used in clinical research
Statistical analysis packages
Spreadsheets
Database programs
Custom applications
Cost-effectiveness analysis (TreeAge, etc)
Survey analysis (SUDAAN, etc)
15. Software packages for analyzing data STATA
SAS
S-plus, and R
SPSS
SUDAAN
Epi-Info
JMP
MATLAB
StatXact
16. Why use STATA? Quick start, user friendly
Immediate results, response
You can look at the data
Menu-driven option
Good graphics
Log and do files
Good manuals, help menu
17. Why NOT use STATA? SAS is used more often?
SAS does some things STATA does not
Programming easier with S-plus and R?
R is free
Complicated data structure and manipulation easier with SAS?
Epi-info (free) is even easier than STATA?
18. STATA – Basic functionality Holds data for you
Stata holds 1 “flat” file dataset only (.dta file)
Listens to what you want
Type a command, press enter
Does stuff
Statistics, data manipulation, etc
Shows you the results
Results window
Do a quick demo – open STATA, load a dataset, look at the data, run summarize on age.
19. Demo #1 Open the program
Load some data
Look at it
Run a command
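The four steps of Demo #1 can be sketched at the command line as follows. This is a minimal sketch, not the dataset used in class: it assumes the example dataset nlsw88 that ships with Stata (chosen here only because it has an age variable).

```stata
* Load an example dataset shipped with Stata (an assumption; the
* in-class demo dataset is not named in these slides)
sysuse nlsw88, clear

* Look at the data: variable names and types, then the first few rows.
* In the windowed version you could also type "browse".
describe
list age wage in 1/5

* Run a command: summary statistics for age
summarize age
```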
20. STATA - Windows Two basic windows
Command
Results
Optional windows
Variable list
History of commands
Other functions
Data browser/editor
Do file editor
Viewer (for log, help files, etc)
21. STATA - Buttons The usual – open, save, print
Log-file open/suspend/close
Do-file editor
Browse and Edit
Break
22. STATA - Menus Almost every command can be accessed via menu
23. Demo #2 Enter in some data
Look at it
Run a couple of commands
Three nonsense variables – 2 numeric, 1 string – Save as….
Describe, Tabulate, Summarize, List
Use menus for Tabulate
Review commands
Look at variable list
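Demo #2 can also be done entirely from the command line rather than the data editor. The sketch below is illustrative only: the variable names (id, score, color), the values, and the file name nonsense_demo are all made up for this example.

```stata
* Start with an empty dataset and type in three nonsense variables:
* two numeric (id, score) and one string (color)
clear
input id score str10 color
1 4.5  "red"
2 7.25 "blue"
3 6    "red"
end

* Look at the data and run a few commands
describe
tabulate color
summarize score
list

* Save as a Stata dataset (.dta) in the current working directory
save nonsense_demo, replace
```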
24. Menu vs. Command line Menu advantages
Look for commands you don’t know about
See the options for each command
Complex commands easier – learn syntax
Command line advantages
Faster (if you know the command!)
“Closer” to the program
Only way to write “do” files
Document and repeat analyses
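To illustrate the last two command-line advantages, here is a minimal sketch of a do file that records its own output in a log file. The file name lab1_sketch and the nlsw88 example dataset are assumptions made for illustration; do and log files are covered properly in the next lecture.

```stata
* lab1_sketch.do (hypothetical file name)
* Close any log left open by an earlier run, then start a new text log
capture log close
log using lab1_sketch, replace text

* The analysis itself: every command and its output goes into the log
sysuse nlsw88, clear
summarize age
tabulate race

log close
```

Running the file again (for example with "do lab1_sketch") repeats exactly the same analysis and overwrites the log.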
25. STATA commands: Describing your data
describe [varlist]
Displays variable names, types, labels
list [varlist]
Displays the values of all observations
codebook [varlist]
Displays labels and codes for all variables
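A short sketch of these three commands in use, assuming the nlsw88 example dataset that ships with Stata (not the course dataset):

```stata
sysuse nlsw88, clear

* Variable names, storage types, and labels for selected variables
describe age race wage

* Values of the first five observations only; "list" with no
* restriction would print every observation
list age race wage in 1/5

* Labels and codes for one categorical variable
codebook race
```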
26. STATA commands: Descriptive statistics – continuous data
summarize [varlist] [, detail]
# obs, mean, SD, range
“, detail” gets you more detail (median, etc)
ci [varlist]
Mean, standard error of mean, and confidence intervals
Actually works for dichotomous variables, too.
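For example, again assuming the nlsw88 example dataset. One caveat: the slide shows the ci syntax of the Stata release used in this course. Newer releases (14.1 and later) expect a subcommand such as "ci means", which is what the sketch below uses; on an older release, "ci wage" is the equivalent.

```stata
sysuse nlsw88, clear

* Number of observations, mean, SD, min, and max
summarize age wage

* Adds percentiles (including the median), skewness, and kurtosis
summarize wage, detail

* Mean, standard error, and 95% confidence interval
* (newer syntax; on older releases type: ci wage)
ci means wage

* A dichotomous (0/1) variable: confidence interval for a proportion
* (newer syntax; on older releases type: ci collgrad, binomial)
ci proportions collgrad
```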
27. STATA commands: Graphical exploration – continuous data
histogram varname
Simple histogram of your variable
graph box varlist
Box plot of your variable
qnorm varname
Quantile plot of your variable to check normality
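A sketch of the three plots for one continuous variable, assuming the nlsw88 example dataset. The name() option is added here only so that each graph stays in memory instead of replacing the previous one; the graph names are arbitrary.

```stata
sysuse nlsw88, clear

* Simple histogram of hourly wage
histogram wage, name(wage_hist, replace)

* Box plot of the same variable
graph box wage, name(wage_box, replace)

* Quantiles of wage against quantiles of a normal distribution;
* points far from the diagonal suggest non-normality
qnorm wage, name(wage_qnorm, replace)
```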
28. STATA commands: Descriptive statistics – categorical data
tabulate [varname]
Counts and percentages
(see also, table - this is very different!)
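For instance, with the nlsw88 example dataset (an assumption, not the course data):

```stata
sysuse nlsw88, clear

* Counts, percentages, and cumulative percentages for each category
tabulate race
```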
29. STATA commands: Analytic statistics – 2 categorical variables
30. STATA commands: Analytic statistics – 2 categorical variables
tabulate [var1] [var2]
“Cross-tab”
Descriptive options
, row (row percentages)
, col (column percentages)
Statistics options
, chi2 (chi-squared test)
, exact (Fisher’s exact test)
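The options can be combined in a single command. A sketch using two dichotomous variables from the nlsw88 example dataset (an assumption; the in-class demo uses different data):

```stata
sysuse nlsw88, clear

* Cross-tab of college graduate status by union membership, with
* row and column percentages, a chi-squared test, and Fisher's exact test
tabulate collgrad union, row col chi2 exact
```

Observations with a missing value in either variable are left out of the table unless the missing option is added.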
31. Getting help Try to find the command on the pull-down menus
Help menu
If you don’t know the command - Search...
If you know the command - Stata command...
Try the manuals
more detail, theoretical underpinnings, etc
32. STATA commands: Analytic statistics – 1 categorical, 1 continuous
33. STATA commands: Analytic statistics – 1 categorical, 1 continuous
bysort catvar: summarize [contvar]
Mean, SD, range of the continuous variable within each subgroup
ttest [contvar], by(catvar)
t-test
oneway [contvar] [catvar]
ANOVA
table [catvar] [, contents(mean [contvar] …)]
Table of statistics
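A sketch of the first three commands, assuming the nlsw88 example dataset with wage as the continuous variable. The table command is left out of this sketch because its options differ between Stata releases.

```stata
sysuse nlsw88, clear

* Mean, SD, and range of wage within each union-membership group
* (observations with missing union status form their own group)
bysort union: summarize wage

* Two-sample t-test: the by() variable must have exactly two groups
ttest wage, by(union)

* One-way ANOVA: the grouping variable may have more than two categories
oneway wage race
```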
34. STATA commands: Analytic statistics – 2 continuous
35. STATA commands: Analytic statistics – 2 continuous
scatter [var1] [var2]
Scatterplot of the two variables
pwcorr [varlist] [, sig]
Pairwise correlations between variables
“sig” option gives p-values
spearman [varlist] [, stats(rho p)]
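A sketch of all three commands, assuming the nlsw88 example dataset; the graph name wage_tenure is arbitrary.

```stata
sysuse nlsw88, clear

* Scatterplot: the first variable goes on the y-axis, the second on the x-axis
scatter wage tenure, name(wage_tenure, replace)

* Pairwise Pearson correlations, with a p-value under each coefficient
pwcorr wage tenure age, sig

* Spearman rank correlation, showing rho and its p-value
spearman wage tenure, stats(rho p)
```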
36. Demo #3 Load a STATA dataset
Explore the data
Describe the data
Answer some simple research questions
Gender and HTN, age and HTN
37. In Lab Today… Familiarize yourself with Stata
Load a dataset
Use Stata commands to analyze data and fill in the blanks
38. Next week Do files, log files, and workflow in Stata
Find a dataset!
39. Website addresses Course website
http://www.epibiostat.ucsf.edu/courses/schedule/biostat212.html
Computing information
http://www.epibiostat.ucsf.edu/courses/ChinaBasinLocation.html#computing
Download RDP for Macs (for Stata 10 Server)
http://www.microsoft.com/mac/otherproducts/otherproducts.aspx?pid=remotedesktopclient
Citrix Web Server
http://apps.epi-ucsf.org/