Most Prominent Methods of How to Find Outliers in Statistics

An efficient way to find outliers is to use the interquartile range (IQR). The IQR covers the middle bulk of the data, so outliers can be identified quickly once the IQR is known.


Presentation Transcript


  1. MOST PROMINENT METHODS OF HOW TO FIND OUTLIERS IN STATISTICS WWW.STATANALYTICA.COM

  2. Today's Discussion: What are outliers in statistics?; Examples of outliers in statistics; How to find outliers in statistics using the Interquartile Range (IQR)?; How to find the outliers in statistics using the Tukey method?; Specifications; Conclusion

  3. What are outliers in statistics? An outlier is a data point that lies an unusually long way from the other values in a data set. In other words, it is a value that falls outside the range of the rest of the data. If Pinocchio were in a class of teenagers, the length of his nose would be an outlier compared with the other children.

  4. Examples of outliers in statistics In the following set of values, 5 and 199 are outliers: 5, 94, 95, 96, 99, 104, 105, 199. Here “5” is an extremely low value, whereas “199” is an extremely high value. But outliers are not always this easy to spot. Suppose someone received the following weekly paychecks last month: $220, $245, $20, $230. The average paycheck is $178.75. But the small paycheck ($20) may be there because that person went on holiday, so a weekly average of $178.75 is not a true reflection of what they earn. Their average is closer to $232 if the outlier ($20) is excluded from the data. That is why finding outliers is not always as simple as it seems, as the data set on the next slide shows.
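
The effect of one extreme value on the mean can be checked directly. The following is a minimal sketch using the four paycheck amounts from slide 4; the variable names are illustrative and not part of the presentation.

```python
from statistics import mean

# Weekly paychecks from the example on slide 4
paychecks = [220, 245, 20, 230]

# Mean with every paycheck included
mean_all = mean(paychecks)

# Mean after dropping the suspected outlier ($20)
mean_without = mean(p for p in paychecks if p != 20)

print(mean_all)             # 178.75
print(round(mean_without))  # 232
```

A single low paycheck pulls the mean down by more than $50, which is why it matters whether a value is treated as an outlier.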

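  5. Examples of outliers in statistics 60, 9, 31, 18, 21, 28, 35, 13, 48, 2. One might guess that 2 is an outlier, and possibly 60. But a guess made by eye is not reliable: by the 1.5 * IQR rule explained on the following slides, neither value actually qualifies as an outlier (Q1 = 13, Q3 = 35, so the limits are -20 and 68). Box-and-whisker plots often display outliers: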

  6. Examples of outliers in statistics However, one might not have access to a box-and-whisker plot. And even with one, some boxplots do not show outliers. For instance, a chart may have whiskers that stretch far enough to include the outliers. That is why one should not rely on a box-and-whisker plot alone to find outliers. Such plots are a valuable way to present the data after the outliers have been determined. An efficient way to find all outliers is to use the interquartile range (IQR). The IQR covers the middle bulk of the data, so outliers can be found quickly once the IQR is known.

  7. How to find outliers in statistics using the Interquartile Range (IQR)? An outlier is defined as a data point that lies more than 1.5 IQRs below the first quartile (Q1) or above the third quartile (Q3) of a data set. Low = Q1 – 1.5 * IQR High = Q3 + 1.5 * IQR Sample Problem: Find all of the outliers in the following data set: 10, 20, 30, 40, 50, 60, 70, 80, 90, 100. Step 1: Find Q1 (25th percentile), Q3 (75th percentile), and the interquartile range. Q1 (25th percentile) = 30 Q2 (50th percentile) = 55 Q3 (75th percentile) = 80 IQR = 50 How to calculate the IQR of this data set: Put all the values in order and draw a line through the middle to split them into a lower half and an upper half: [lower half: (10, 20, 30, 40, 50) | upper half: (60, 70, 80, 90, 100)]. Find the median of each half, which gives Q1 = 30 and Q3 = 80. Subtract Q1 from Q3: 80 – 30 = 50, so IQR = 50.

  8. How to find outliers in statistics using the Interquartile Range (IQR)? Step 2: Multiply the IQR obtained in Step 1 by 1.5: IQR * 1.5 = 50 * 1.5 = 75. Step 3: Add the number from Step 2 to Q3 (calculated in Step 1): 75 + 80 = 155. This is the upper limit. Set this number aside for a moment. Step 4: Subtract the number from Step 2 from Q1 (calculated in Step 1): 30 – 75 = -45. This is the lower limit. Set this number aside as well. Step 5: Put the values from the data set in order: 10, 20, 30, 40, 50, 60, 70, 80, 90, 100. Step 6: Insert the low and high limits into the ordered data set: -45, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 155. Step 7: Highlight any value that lies below the lower limit or above the upper limit inserted in Step 6: -45, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 155. Every value in this example lies between -45 and 155, so this data set has no outliers.
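
The seven steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definitive implementation: the function names `quartiles_by_halves`, `iqr_limits`, and `find_outliers` are hypothetical, and the quartiles are taken as the medians of the lower and upper halves, as in the worked example. Other quartile conventions exist and can give slightly different limits.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd number of values, the middle value is
    left out of both halves.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]
    return median(lower), median(upper)

def iqr_limits(values, k=1.5):
    """Return (low, high) = (Q1 - k * IQR, Q3 + k * IQR)."""
    q1, q3 = quartiles_by_halves(values)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, k=1.5):
    """Return the values that fall outside the limits."""
    low, high = iqr_limits(values, k)
    return [v for v in values if v < low or v > high]

data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(quartiles_by_halves(data))  # (30, 80)
print(iqr_limits(data))           # (-45.0, 155.0)
print(find_outliers(data))        # []
```

For the sample data the limits are 30 – 75 = -45 and 80 + 75 = 155, and no value falls outside them.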

  9. How to find the outliers in statistics using the Tukey method? The Tukey method also uses the interquartile range to separate out very small or very large numbers. It is equivalent to the method above, but the formulas are written slightly differently and the terminology is slightly different: the Tukey method uses the idea of “fences.” The specifications are: Upper fence = Q3 + 1.5(Q3 – Q1) = Q3 + 1.5(IQR) Lower fence = Q1 – 1.5(Q3 – Q1) = Q1 – 1.5(IQR) Where: Q1 = first quartile Q2 = middle quartile Q3 = third quartile IQR = interquartile range These equations give two values, which act as fences that separate the outliers from the bulk of the data. Now, let’s check how to find outliers with this method.

  10. How to find the outliers in statistics using the Tukey method? Sample Problem: Use Tukey’s method to find the outliers in the following data: 3, 4, 6, 8, 9, 11, 14, 17, 20, 21, 42. Step 1: Calculate the interquartile range (follow the same procedure as in the IQR example above), which gives Q1 = 6 Q3 = 20 IQR = 14 Step 2: Calculate 1.5 * IQR: 1.5 * 14 = 21 Step 3: Subtract this value from Q1 to obtain the lower fence: 6 – 21 = -15 Step 4: Add this value to Q3 to obtain the upper fence: 20 + 21 = 41. Step 5: Insert these fences into the ordered data to identify the outliers: -15, 3, 4, 6, 8, 9, 11, 14, 17, 20, 21, 41, 42. Anything outside the fences is considered an outlier. For this data set, 42 is the only outlier.
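
As a cross-check, Tukey's fences for this sample can be computed with Python's standard library. This is a sketch rather than the presentation's own procedure: `tukey_fences` is a hypothetical helper name, and it relies on `statistics.quantiles` with its default "exclusive" method, which for this particular data set returns the same quartiles as the median-of-halves approach (Q1 = 6, Q3 = 20) but can return different quartiles for other data sets.

```python
from statistics import quantiles

def tukey_fences(values, k=1.5):
    """Return (lower fence, upper fence) using library quartiles."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3
    q1, _q2, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

data = [3, 4, 6, 8, 9, 11, 14, 17, 20, 21, 42]
lower, upper = tukey_fences(data)

# Anything outside the fences is flagged as an outlier
outliers = [v for v in data if v < lower or v > upper]

print(lower, upper)  # -15.0 41.0
print(outliers)      # [42]
```

Because quartile conventions differ between textbooks and libraries, it is worth confirming which one a tool uses before comparing its fences with hand calculations.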

  11. Conclusion Many students find it difficult to identify outliers in statistics, which is why we have described two methods for finding them. Besides these, there are more advanced methods for detecting outliers, such as Dixon’s Q Test, Generalized ESD, and many more. Use the IQR and Tukey methods described above to solve problems involving outliers.

  12. Follow us on social media: Facebook @statanalytica, Twitter @statanalytica, Pinterest @statanalytica

  13. Contact Us Website: www.statanalytica.com Email: info@statanalytica.com
