Statistics on American TV Habits. Kevin Englebert, Jordan Goodreau, Don Li
Description of topic • We wanted to determine whether the number of viewers of a show and its critic rating are related • We also wanted to find out whether the number of viewers of a show is related to the number of Google hits that show gets • Finally, we wanted to find out whether the network a show is on affects its number of viewers
Background • Google • Founded September 4, 1998 • Larry Page • Worth $23 billion • Internet search, e-mail, online mapping, office productivity, video sharing, social networking • Nielsen Ratings • Audience measurement systems • Gather data through extensive use of surveys • TV.com • Critic ratings • Staff writers • Scale of 1-10
Procedure of Data Collection • For the two linear regression tests (Tests 1 and 2), we took a stratified random sample: we randomized the lists from the Nielsen site, then randomly selected two shows per list • We then found the critic rating (on a scale of 1-10 on tv.com) and the number of Google hits (the number of results returned by searching “name of show” TV show) for each of the shows • We performed a test on each set of data and then constructed a confidence interval for the slope of the population regression line
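The sampling step described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the list names and show names below are hypothetical placeholders (the slides do not reproduce the Nielsen lists), and `stratified_sample` is a helper name made up for this sketch.

```python
import random

# Hypothetical strata: placeholder list and show names, NOT the real Nielsen lists.
nielsen_lists = {
    "list_a": ["show_a1", "show_a2", "show_a3", "show_a4"],
    "list_b": ["show_b1", "show_b2", "show_b3"],
    "list_c": ["show_c1", "show_c2", "show_c3", "show_c4", "show_c5"],
}


def stratified_sample(strata, per_stratum, rng):
    """Pick `per_stratum` items at random, without replacement, from every stratum."""
    return {name: rng.sample(shows, per_stratum) for name, shows in strata.items()}


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
sample = stratified_sample(nielsen_lists, per_stratum=2, rng=rng)
for name, shows in sample.items():
    print(name, shows)
```

Every list contributes exactly two shows, so no list is over- or under-represented in the regression data.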
Procedure of Data Collection • For the chi-squared test, we randomly selected 10 shows from each of the three major networks with more than 30 shows (ABC, CBS, NBC) • We collected the number of viewers, in millions, for each of the shows from the Nielsen ratings site • Finally, we performed the chi-squared test on the data set
Scatter plot Test Data
Show - Viewers in Millions - Rating - Google Hits
As The World Turns - 2.744 - 8.2 - 291,000
BIG BANG THEORY, THE - 10.200 - 9.0 - 432,000
THE BIGGEST LOSER - 11.923 - 8.1 - 397,000
COLD CASE - 11.509 - 8.9 - 534,000
CSI - 15.778 - 9.1 - 677,000
DESPERATE HOUSEWIVES - 6.907 - 8.8 - 2,290,000
Eleventh Hour - 5.514 - 7.8 - 361,000
FAMILY GUY - 5.970 - 9.1 - 3,380,000
FRIENDS - 3.574 - 9.2 - 19,800,000
FRINGE - 4.011 - 8.5 - 816,000
GHOST WHISPERER - 10.644 - 8.7 - 697,000
Grey’s Anatomy - 12.9 - 9.0 - 1,750,000
HEROES - 6.614 - 9.2 - 7,800,000
HOUSE - 6.032 - 9.2 - 34,300,000
HOW I MET YOUR MOTHER - 6.595 - 9.1 - 1,010,000
HOWIE DO IT - 7.777 - 4.7 - 520,000
ICARLY - 3.901 - 8.6 - 263,000
JUDGE JUDY - 6.071 - 7.8 - 112,000
KNIGHT RIDER - 5.736 - 8.0 - 934,000
Late Night with Conan O’Brien - 1.033 - 9.1 - 194,000
Late Show with David Letterman - 1.355 - 8.3 - 1,750,000
LAW AND ORDER - 10.259 - 8.9 - 658,000
LIFE - 4.745 - 9.0 - 52,600,000
MOMMA’S BOYS - 4.746 - 6.4 - 98,600
ONE TREE HILL - 1.813 - 9.0 - 1,660,000
PRISON BREAK - 4.574 - 9.1 - 5,980,000
PRIVATE PRACTICE - 6.077 - 8.7 - 623,000
PUSHING DAISIES - 4.323 - 8.9 - 711,000
SCRUBS - 4.169 - 9.2 - 534,000
SPONGEBOB - 4.070 - 8.6 - 1,190,000
STYLISTA - 1.323 - 6.9 - 172,000
The Office - 8.346 - 9.1 - 7,170,000
TWO AND A HALF MEN - 4.686 - 9.0 - 542,000
WITHOUT A TRACE - 14.062 - 9.0 - 478,000
r² = .015 (only 1.5% of the variation in viewers can be explained by the linear relationship with critic rating) r = .123 (weak correlation)
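As a rough check on these two numbers, the correlation can be recomputed from the (viewers, rating) pairs on the data slide. The sketch below assumes the values were transcribed correctly from that slide; `pearson_r` is a helper name introduced here, not something from the original project.

```python
import math

# (viewers in millions, critic rating), in the order the shows appear on the data slide.
data = [
    (2.744, 8.2), (10.200, 9.0), (11.923, 8.1), (11.509, 8.9), (15.778, 9.1),
    (6.907, 8.8), (5.514, 7.8), (5.970, 9.1), (3.574, 9.2), (4.011, 8.5),
    (10.644, 8.7), (12.9, 9.0), (6.614, 9.2), (6.032, 9.2), (6.595, 9.1),
    (7.777, 4.7), (3.901, 8.6), (6.071, 7.8), (5.736, 8.0), (1.033, 9.1),
    (1.355, 8.3), (10.259, 8.9), (4.745, 9.0), (4.746, 6.4), (1.813, 9.0),
    (4.574, 9.1), (6.077, 8.7), (4.323, 8.9), (4.169, 9.2), (4.070, 8.6),
    (1.323, 6.9), (8.346, 9.1), (4.686, 9.0), (14.062, 9.0),
]


def pearson_r(xs, ys):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


viewers = [v for v, _ in data]
ratings = [rating for _, rating in data]
r = pearson_r(ratings, viewers)
r_squared = r ** 2
print(f"n = {len(data)}, r = {r:.3f}, r^2 = {r_squared:.3f}")
```

The result should land close to the slide's r = .123 and r² = .015; small differences would point to rounding or a transcription slip rather than a different method.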
Assumption • Two independent SRS: assumed for the purpose of the analysis • True relationship is linear: assumed for the purpose of the analysis
Test • Ho: β=0 • Ha: β>0 • t = b/SEb = .701 • P(t > .701 | df = 32) = .2443 • We fail to reject Ho because the p-value is greater than .05. We do not have sufficient evidence that the slope of the population regression line is greater than zero, so we cannot conclude that the number of viewers increases as the rating increases. We therefore found no evidence of a positive linear relationship between critic rating and number of viewers for a show.
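The test statistic and p-value can be reproduced approximately from the reported r = .123 and n = 34. This is a sketch, not the authors' calculator steps: it uses the identity t = r·√(n−2)/√(1−r²), which equals b/SEb in simple linear regression, and it swaps in a plain Simpson's-rule integration of the t density in place of a statistics library. The function names are made up for this example.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


n = 34
r = 0.123  # correlation reported on the previous slide
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
p_value = t_upper_tail(t_stat, df=n - 2)
print(f"t = {t_stat:.3f}, one-sided p = {p_value:.4f}")
```

Because the alternative is one-sided (β > 0), only the upper tail is used; a two-sided test would double this p-value.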
Confidence interval for β • b ± t*·SEb = (-.9321, 1.9097) • We are 95% confident that the slope of the population regression line lies between -.9321 and 1.9097. This interval contains zero, so a slope of zero (no change in viewers as the critic rating increases) is plausible, and we have no evidence of a linear relationship between viewers and critic rating.
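The slide gives only the interval endpoints, but the slope estimate and its standard error can be backed out of them as a consistency check against the t statistic of .701. The sketch assumes t* ≈ 2.037, the usual table value for 95% confidence with 32 degrees of freedom; that value is not stated on the slides.

```python
lower, upper = -0.9321, 1.9097  # interval endpoints from the slide
t_star = 2.037  # assumption: 95% critical value for df = 32 from a t table

b = (lower + upper) / 2        # point estimate of the slope (interval midpoint)
margin = (upper - lower) / 2   # margin of error, t* times SEb
se_b = margin / t_star         # implied standard error of the slope
t_stat = b / se_b              # should agree with the test slide

print(f"b = {b:.4f}, SEb = {se_b:.4f}, t = {t_stat:.3f}")
```

The recovered t is about .70, which agrees with the test slide, so the interval and the test appear to come from the same fit.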
Graph of Viewers vs Hits • r² = .012: only about 1% of the variation in Google hits can be explained by the linear relationship with the number of viewers of the show • r = -.109: very weak association
Viewers vs. Google Hits • Ho: β=0 • Ha: β>0 • n = 34 • ŷ = a + bx → ŷ = 6611920.29 - 347302.62x (x = viewers in millions, ŷ = predicted Google hits) • [Figure: example of what we call a Google hit]
Assumption • Two independent SRS: assumed for the purpose of the analysis • True relationship is linear: assumed for the purpose of the analysis
Viewers vs. Google Hits • We fail to reject Ho • Insufficient evidence that the slope of the population regression line β is greater than 0 • We cannot conclude that the number of Google hits increases as the number of viewers increases • Therefore we found no evidence of a positive linear relationship between viewers and Google hits
Confidence Interval • 95% confidence interval for the slope = (-1,500,000, 791,428) • We are 95% confident that the slope of the population regression line lies between -1,500,000 and 791,428. Since this interval contains zero, we have no evidence of a linear relationship between viewers and Google hits: a slope of 0 on the population regression line would mean that Google hits do not change as viewers change.
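IQR Test • 1.5 x IQR Test • IQR = Q3 - Q1 = 1,318,000 • 1.5 x 1,318,000 = 1,977,000 • Upper fence: Q3 + 1.5 x IQR = 1,750,000 + 1,977,000 = 3,727,000 • Shows with more than 3,727,000 Google hits are outliers: 6 outliers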
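The outlier rule on this slide can be rerun on the Google-hit counts from the data slide. The sketch assumes the counts were transcribed correctly and that quartiles are taken as the medians of the lower and upper halves of the sorted data (the convention common on classroom calculators); other quartile conventions can give slightly different fences.

```python
# Google hits, in the order the shows appear on the data slide.
hits = [
    291000, 432000, 397000, 534000, 677000, 2290000, 361000, 3380000,
    19800000, 816000, 697000, 1750000, 7800000, 34300000, 1010000, 520000,
    263000, 112000, 934000, 194000, 1750000, 658000, 52600000, 98600,
    1660000, 5980000, 623000, 711000, 534000, 1190000, 172000, 7170000,
    542000, 478000,
]


def median(sorted_vals):
    """Median of an already sorted list."""
    n = len(sorted_vals)
    mid = n // 2
    if n % 2:
        return sorted_vals[mid]
    return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2


ordered = sorted(hits)
half = len(ordered) // 2
q1 = median(ordered[:half])    # median of the lower half
q3 = median(ordered[-half:])   # median of the upper half
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr
lower_fence = q1 - 1.5 * iqr   # negative here, so no low outliers are possible

outliers = [h for h in hits if h > upper_fence or h < lower_fence]
kept = [h for h in hits if lower_fence <= h <= upper_fence]
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}, upper fence = {upper_fence}")
print(f"{len(outliers)} outliers removed, {len(kept)} shows kept")
```

Dropping the flagged shows leaves 28 of the 34, which matches the sample size used for the "New Data" regression.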
Viewers vs Hits (New Data) • r² = 2.58 x 10^-5: less than a hundredth of a percent of the variation in Google hits can be explained by the linear relationship with the number of viewers of the show • r = -.0051: scattered
Viewers vs. Google Hits (New Data) • Ho: β=0 • Ha: β>0 • n = 28 • ŷ = a + bx → ŷ = 830682.87 - 1028.44x
Assumption • Two independent SRS: assumed for the purpose of the analysis • True relationship is linear: assumed for the purpose of the analysis
Viewers vs. Google Hits (New Data) • We fail to reject Ho • Insufficient evidence that the slope of the population regression line β is greater than 0 • We cannot conclude that the number of Google hits increases as the number of viewers increases • Therefore we found no evidence of a positive linear relationship between viewers and Google hits
Confidence Interval (New Data) We are 95% confident that the slope of the population regression line lies between -82,612 and 80,555. Since this interval contains zero, we have no evidence of a linear relationship between viewers and Google hits: a slope of 0 on the population regression line would mean that Google hits do not change as viewers change.
Assumption • SRS: assumed for the purpose of the analysis • Sample size large enough so that all expected counts are ≥ 5: assumed (sure, why not….. Catch 22 for this one)
Chi Squared Test • H0: There is no association between the network that a show is on and the number of viewers of that show. • HA: There is an association between the network that a show is on and the number of viewers of that show. • $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \frac{(2 - 2.333)^2}{2.333} + \frac{(5 - 2.333)^2}{2.333} + \dots = 9.7$ • $P(\chi^2 > 9.7 \mid df = 4) = .0457$ • We reject H0 in favor of HA because our P value of .0457 is less than α = .05. We have sufficient evidence that there is an association between the network that a show is on and the number of viewers of that show.
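The two terms shown and the p-value can be checked by hand. The sketch below is not the authors' procedure: it uses the closed-form upper tail of the chi-squared distribution that holds for even degrees of freedom, so no statistics library is needed, and the function name is made up for this example. The full table of observed counts is not on the slides, so only the two displayed terms are recomputed.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-squared variable with an even number of degrees of freedom.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form only applies to positive even df")
    half = x / 2
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# The two contributions written out on the slide (expected count 2.333).
term_a = (2 - 2.333) ** 2 / 2.333
term_b = (5 - 2.333) ** 2 / 2.333

p_value = chi2_upper_tail_even_df(9.7, df=4)
print(f"terms: {term_a:.4f}, {term_b:.4f}; p = {p_value:.4f}")
```

The p-value comes out near .046, in line with the slide's .0457; the small gap is consistent with the statistic having been rounded to 9.7. An expected count of 2.333 is below 5, which is the failed condition the assumption slide jokes about.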
Personal opinion These tests were very interesting. We learned many things about television, including how critic ratings, viewers, and Google hits relate to each other. I liked our tests and feel like I wouldn’t know some of these interesting stats if we hadn’t done them. This has also shown us that many people watch dumb shows for no reason other than that they like them.
Bias/Error • Population of TV shows too small • Data available only for the more popular shows • Unable to find data on all TV shows • Ratings not available for all TV shows • Data could be skewed • Name of show could result in more Google hits • “Life”, “House”, “Friends”, etc. showed outlier numbers of hits; perhaps not all results are about the show • Shows that have been running longer may also have gotten a boost simply from their duration
Application to Population • We found no evidence of a linear relationship between viewers and rating • Critics’ best shows are not necessarily what the public is watching • Sophistication vs Amusement
Application to Population • We found no evidence of a linear relationship between the number of Google hits and the number of viewers of the show • There is a possible confounding variable: hits for show names that are common words, unrelated to the TV show • Should not be extrapolated to the population • Perhaps there may be more there than a linear relationship, but we are unsure
Application to Population • There is a relationship between viewers of a show and the network it runs on • This may suggest that viewers tune in to a network for one show and stay for others • Because some of the assumptions failed, however, this should not be definitively extrapolated to the population
Conclusion • We found no evidence of a linear relationship between viewers and rating or Google hits, but there is a relationship between viewers and network • More loyal to networks than shows? • Cannot be extrapolated