
Final Exam Review


nixie

Presentation Transcript


  1. Final Exam Review

  2. The following is a list of items that you should review in preparation for the exam. Note that not every item in the following slides may be on the exam, and there may be items on the exam that are not on these slides.
  3. Overview of three techniques: Decision Tree, Clustering, Association Rule.
  4. What is classification? Determining to what group a data element belongs, or the “attributes” of that “entity”. Examples: determining whether a customer should be given a loan; flagging a credit card transaction as a fraudulent charge; categorizing a news story as finance, entertainment, or sports.
  5. What is Cluster Analysis? Grouping data so that elements in a group will be similar (or related) to one another and different (or unrelated) from elements in other groups. Distance within clusters is minimized; distance between clusters is maximized. http://www.baseball.bornbybits.com/blog/uploaded_images/Takashi_Saito-703616.gif
  6. Association Mining
  7. Match Scenario with Data Mining Technique. Which data mining technique (Decision Trees, Clustering, or Association Rules) would be most appropriate to answer each question below? What products are bought at the same time as Coke? What is the probability that a 57-year-old female in a low-income family will die of cancer? How many types of customers visit fresh grocery?
  8. Interpret your model. You should be able to interpret your model from two aspects: first, whether it is a good model; second, how you can use your model to help you answer a question or make a decision.
  9. Basic Statistical Information. Be able to understand the basics of your data by looking at the Explore window with its descriptive statistics (distribution, average, range, etc.) and what those numbers can tell you.
  10. What can you tell from this histogram? Do most people spend a lot or not?
  11. Decision Tree. Whether it is a good model: use the Subtree Assessment Plot to find the Average Square Error and/or Misclassification Rate. A lower average square error and a lower misclassification rate suggest a better model. Think about why these numbers can give you the optimal number of leaves. How to use your model: follow the tree path that matches the descriptions in your question.
  12. Why is the optimal number of leaves 13?
  13. What is the likelihood of a 52-year-old man with an affluence of 5 buying an organic product?
  14. Cluster and Segment. Whether it is a good model: you want higher cohesion within each cluster and higher separation between clusters. A higher Root Mean Square Standard Deviation suggests lower cohesion. A higher distance to the nearest cluster suggests higher separation. How to use your model: be able to tell how each cluster differs from your overall result.
  15. Which model is better in terms of cluster cohesion? For each model, which cluster has the highest cohesion? How might the maximum number of clusters in your model affect cohesion and separation?
  16. Are the sales of stretch jeans in cluster 2 better than the average sales of stretch jeans across the entire population?
  17. Association Rule. Whether it is a good model: Confidence is the chance that Y is bought when X has been bought. Support is the chance that X and Y are bought together. Lift is the ratio of confidence to the chance that Y is bought regardless of X; equivalently, it compares how often X and Y are actually bought together with how often they would be bought together purely by coincidence. How to use your model: be able to give suggestions based on your analysis.
  18. Is Coke often bought with beer or Pepsi? Why? Can you suggest two products that should be placed close to each other? Can you suggest two products that should not be placed together? Why?
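For the descriptive statistics on slides 9 and 10, a small Python sketch may help show how the average, median, and range relate to the shape of a histogram. The spending values below are made up for illustration and are not taken from the course data.

```python
import statistics

# Hypothetical spending amounts (illustration only, not course data).
spending = [12.0, 15.5, 9.0, 22.0, 18.5, 250.0, 14.0]

mean = statistics.mean(spending)            # pulled upward by the one large value
median = statistics.median(spending)        # middle value of the sorted data
value_range = max(spending) - min(spending)

# A mean well above the median hints at a right-skewed distribution:
# most people spend little and a few spend a lot.
right_skewed = mean > median

print(f"mean={mean:.2f} median={median} range={value_range} right_skewed={right_skewed}")
```

When the mean sits far above the median, as it does here, the histogram has a long right tail, which is one way to answer "Do most people spend a lot or not?".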
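The two decision tree measures on slide 11 can be sketched in Python as follows. This assumes a binary target coded 0/1 and a 0.5 probability cutoff. The function names, the sample predictions, and the per-subtree error values are all hypothetical, and the course tool may compute these statistics somewhat differently.

```python
def misclassification_rate(actual, predicted_prob, cutoff=0.5):
    """Fraction of cases whose predicted class differs from the actual class."""
    wrong = sum(
        1 for y, p in zip(actual, predicted_prob) if (p >= cutoff) != (y == 1)
    )
    return wrong / len(actual)


def average_squared_error(actual, predicted_prob):
    """Mean of (actual - predicted probability) squared."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted_prob)) / len(actual)


# Hypothetical validation cases: actual outcome and predicted probability of 1.
actual = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]

rate = misclassification_rate(actual, probs)
ase = average_squared_error(actual, probs)

# Hypothetical validation error for subtrees of different sizes (leaves -> ASE).
# The error falls as leaves are added, then rises once the tree overfits.
validation_ase = {2: 0.180, 4: 0.155, 8: 0.141, 16: 0.149}
best_leaves = min(validation_ase, key=validation_ase.get)

print(rate, round(ase, 4), best_leaves)
```

Picking the subtree with the lowest validation error is the idea behind reading the optimal number of leaves off the Subtree Assessment Plot.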
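The cohesion and separation measures on slide 14 can be illustrated with a short Python sketch. It uses one common definition of Root Mean Square Standard Deviation (the pooled standard deviation over all variables in a cluster) and takes the distance to the nearest cluster as the Euclidean distance between centroids. The three clusters are made-up two-dimensional points, and the helper names are hypothetical.

```python
import math


def centroid(points):
    """Mean of the points in each dimension."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def rms_std(points):
    """Pooled standard deviation across all dimensions of one cluster."""
    c = centroid(points)
    dims = len(c)
    ss = sum((p[d] - c[d]) ** 2 for p in points for d in range(dims))
    return math.sqrt(ss / (dims * (len(points) - 1)))


def nearest_cluster_distance(name, centroids):
    """Smallest centroid-to-centroid distance from the named cluster to any other."""
    return min(
        math.dist(centroids[name], other)
        for key, other in centroids.items()
        if key != name
    )


# Hypothetical clusters (illustration only).
clusters = {
    "A": [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0)],
    "B": [(10.0, 10.0), (10.0, 12.0), (12.0, 10.0), (12.0, 12.0)],
    "C": [(0.0, 20.0), (0.0, 28.0), (8.0, 20.0), (8.0, 28.0)],
}
centroids = {name: centroid(pts) for name, pts in clusters.items()}

for name, pts in clusters.items():
    print(name, round(rms_std(pts), 3), round(nearest_cluster_distance(name, centroids), 3))
```

In this made-up data, cluster C is the most spread out, so it has the highest RMS standard deviation and therefore the lowest cohesion.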
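The three association rule measures on slide 17 can be computed from a list of market baskets. The sketch below uses a handful of made-up transactions and hypothetical function names, and evaluates the rule "coke implies chips".

```python
# Hypothetical market baskets (illustration only, not course data).
transactions = [
    {"coke", "chips"},
    {"coke", "chips", "beer"},
    {"coke", "pepsi"},
    {"beer", "chips"},
    {"coke", "chips"},
]


def support(items):
    """Share of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(x, y):
    """Chance that y is bought given that x has been bought."""
    return support({x, y}) / support({x})


def lift(x, y):
    """Confidence of x -> y relative to how often y is bought overall."""
    return confidence(x, y) / support({y})


rule_support = support({"coke", "chips"})
rule_confidence = confidence("coke", "chips")
rule_lift = lift("coke", "chips")

print(round(rule_support, 4), round(rule_confidence, 4), round(rule_lift, 4))
```

A lift above 1 suggests the two products are bought together more often than coincidence would explain, which supports placing them near each other. A lift below 1, as in this made-up data, suggests the opposite.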