
Chapter 1 – Stats start here

Learn about statistics, the science of data, and how it helps make sense of the world by finding patterns and relationships. Discover the value of data and how it is used in various scenarios, such as personalized advertising and analyzing driver behavior. Explore the importance of context in interpreting data and the different types of variables: identifiers, categorical, and quantitative.

miltonj

Presentation Transcript


  1. Chapter 1 – Stats start here

  2. What is Statistics and data? • Statistics: the science of data • Data: a collection of numbers, characters, images, or other items, along with their context, that provides information about something

  3. Examples • Facebook: If you have a Facebook account, you have probably noticed that the ads you see online tend to match your interests and activities. Much of your personal information has been sold to marketing or tracking companies. Your data are valuable! A company can find out your age, sex, education level, job, hobbies, and activities.

  4. Examples • Target stores build customer profiles by collecting data about people who pay with credit cards. Patterns the company discovers across similar customer profiles enable it to send you advertising and coupons that promote items you may be interested in purchasing.

  5. Examples • How dangerous is texting while driving? • Researchers compared the reaction times of sober drivers, drunk drivers, and texting drivers. The results were striking: the texting drivers responded more slowly and were more dangerous than drivers who were above the legal limit for alcohol.

  6. Statistics is about variation • Data vary because we don’t see everything and because even what we do see and measure, we measure imperfectly. • Example: Ask different people the same question and you will get lots of different answers. • Statistics helps us make sense of the world by seeing past the underlying variation to find patterns and relationships.

  7. What are Data? • Let’s start with an example: Amazon.com • Background: Amazon started as a bookstore in 1995. By 1997 Amazon had sold 2.5 million books to more than 1.5 million customers in 150 countries. In 2010, sales reached $34.2 billion, and the company now sells basically everything, including a $400,000 necklace, yak cheese from Tibet, and the largest book in the world.

  8. What are Data? • So how did they do it? How do they track their customers? • The answer is data!

  9. What are Data? • Numbers only? The amount of your last purchase is a number. • Your name and address? Yes, these are data too, but they are not numbers. • Zip code? This is a number, but would you use it in an analysis such as an average?

  10. What are Data? • Think of some data points that Amazon may collect: • Try to guess what each column represents.

  11. What are Data? • Why is this hard? • Because there is no context. If we don’t know what values are measured and what is measured about them, the values are meaningless. • We can make the meaning clear if we organize them in a data table:
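The idea of a data table can be sketched in a few lines of Python. This is only an illustration: the column names and every value below are invented for the example and are not Amazon's actual data. The same values that mean nothing as a bare list become readable once each one is attached to a named column.

```python
# All names and values here are hypothetical, made up for illustration.
columns = ["order_number", "customer_name", "state", "price", "item"]

rows = [
    ("10675489", "Katherine H.", "Ohio", 10.99, "Paperback novel"),
    ("10783489", "Samuel P.", "Illinois", 16.99, "Cookbook"),
]

# Without context: just a jumble of values.
print(rows[0])

# With context: each value is paired with the name of the variable it measures.
table = [dict(zip(columns, row)) for row in rows]
for record in table:
    print(record)
```

Each row is one case (the Who) and each column name is one variable (the What).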

  12. Context • Data must have context to be meaningful. • Without context data cannot be interpreted. • What information provides good context? • Who • What • Where • Why • When • How

  13. 17, 21, 44, 76 • Are the numbers listed above data? • Data must have context to be meaningful. The numbers listed above could be test scores, ages of a group of golfers, or the uniform numbers of the starting backfield on the football team. • Without context data cannot be interpreted.

  14. The How • How the data are collected can make the difference between insight and nonsense. For example, data that come from a voluntary survey on the Internet are almost always worthless. • The When • Time frame: data recorded in 1803 mean something very different from data recorded now. • The Where • Place: data measured in India may be different from data measured in Mexico. • More specific: indoors/outdoors, house/office

  15. The Who • In general, the rows of a data table correspond to the individual cases about whom (or about which) the data were collected, but cases go by different names depending on the situation: • Individuals who answer a survey are called respondents • People on whom we experiment are called subjects or participants • In a database, the rows are called records • Otherwise we call them what they are: customers, economic quarters, companies, etc.

  16. The What • Characteristics recorded about each individual are called variables; they are usually the columns of the data table. • Variables can be broken into three categories: • Identifiers • Categorical • Quantitative

  17. The What • Identifiers are useful but not typically used for analysis. • Each case has its own unique identifier, which keeps cases from being confused with one another, but the identifiers themselves do not need to be analyzed. • Examples: Student ID numbers, driver license numbers, social security numbers

  18. The What • Categorical Variables: Tell the group/category each individual belongs to. • Usually text values, not numbers. Any descriptive responses are usually categorical. • Examples: Male/Female, pierced/not, eye color, state, country • Numerical examples: zip code, area code
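A short Python sketch can show why a zip code is categorical even though it looks like a number. The zip codes below are made up for illustration. Counting how many cases fall in each zip code is meaningful; averaging them runs without error but the result describes nothing.

```python
from collections import Counter
from statistics import mean

# Hypothetical zip codes, stored as text because they are labels, not amounts.
zip_codes = ["30301", "30301", "10001", "94105"]

# Meaningful for a categorical variable: how many cases fall in each category?
counts = Counter(zip_codes)
most_common_zip, n = counts.most_common(1)[0]
print(most_common_zip, n)

# Possible to compute, but meaningless: an "average zip code" is not a place
# and says nothing about these customers.
meaningless_average = mean(int(z) for z in zip_codes)
print(meaningless_average)
```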

  19. The What • Quantitative Variables: Contain measured numerical values, usually with units, for which it makes sense to find an average. • The units give the values meaning and also a scale, so we know how far apart two values are. • Examples: Cost, life span, distance, degrees

  20. The What • Either/or: Some variables with numeric values can be either categorical or quantitative, depending on what we want to know. • Example: Age • Quantitative: Amazon wants to know the average age of the customers who visit its site after 3 am. • Categorical: When deciding which album to feature when you visit the site, Amazon may use the categories child, teen, adult, senior.
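The age example can be sketched in Python as follows. The ages and the cutoffs between child, teen, adult, and senior are assumptions chosen for illustration; the slides do not say where the categories begin and end.

```python
from collections import Counter
from statistics import mean

# Hypothetical ages (in years) of customers visiting the site.
ages = [8, 15, 17, 34, 52, 70]

def age_group(age):
    """Turn a numeric age into a category. The cutoffs are assumptions."""
    if age < 13:
        return "child"
    if age < 20:
        return "teen"
    if age < 65:
        return "adult"
    return "senior"

# Age treated as quantitative: an average, in years, makes sense.
average_age = mean(ages)
print(round(average_age, 1))

# Age treated as categorical: count how many customers fall in each group.
group_counts = Counter(age_group(a) for a in ages)
print(group_counts)
```

The same column of numbers supports both treatments; which one is used depends on the question being asked (the Why).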

  21. The What • Example: Identify each variable as categorical or quantitative. • A Consumer Reports article about 25 tablet computers lists each tablet’s manufacturer, cost, battery life (hours), operating system (iOS/Android), and overall performance score. • Manufacturer: Categorical • Cost: Quantitative • Battery life: Quantitative • Operating system: Categorical • Performance score: Either

  22. Example • Suppose a Consumer Reports article (published in June 2005) on energy bars gave the brand name, flavor, price, number of calories, and grams of protein and fat. Identify the following: • Who: • What: • When: • Where: • How: • Why: • Categorical variables: • Quantitative variables (with units):

  23. Exit Slip • Popular magazines and websites rank colleges and universities on their “academic quality” in serving undergraduate students. Describe two categorical variables and two quantitative variables that you might record for each institution.
