
Information technology in business and society

Information technology in business and society. Session 17 – Advanced SQL + Data Mining. Sean J. Taylor. Administrivia. Assignment 3: New drop for any updates related to A3. Assignment 4: Due Sunday 4/1 (this is an extension). Class participation grading. Midterm review process.


Presentation Transcript


  1. Information technology in business and society Session 17 – Advanced SQL + Data Mining Sean J. Taylor

  2. Administrivia • Assignment 3: New drop for any updates related to A3 • Assignment 4: Due Sunday 4/1 (this is an extension) • Class participation grading.

  3. Midterm Review process • Consult the solutions (posted to BB). • Photocopy the page(s) of your exam that you wish to dispute. • Write why you think you deserve points. • Submit to my mailbox on the 8th floor by Thursday 3/29 (or after class).

  4. Learning objectives • Be able to write more advanced queries. • Learn about the data-driven organization and the data revolution in management. • Know the basic problems data mining attempts to solve.

  5. Review: SQL
     SELECT ISBN, BookName, Price, Publisher
     FROM Book
     WHERE BookName LIKE '*Information Systems*'
       AND PubDate > #1/1/2002#
       AND Price < 100
     ORDER BY Price
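The review query appears to use Microsoft Access conventions: `*` as the LIKE wildcard and `#...#` around date literals. As a minimal sketch, assuming SQLite through Python's built-in sqlite3 module, the same filter can be written with the standard `%` wildcard and ISO-formatted date strings. The sample rows and the choice to store PubDate as text are assumptions made for illustration; only the table and column names come from the slide.

```python
import sqlite3

# In-memory database; the rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Book (ISBN TEXT, BookName TEXT, PubDate TEXT, Price REAL, Publisher TEXT)"
)
conn.executemany(
    "INSERT INTO Book VALUES (?, ?, ?, ?, ?)",
    [
        ("1", "Information Systems Today", "2003-05-01", 90.0, "Pearson"),
        ("2", "Intro to Information Systems", "2001-03-01", 60.0, "Wiley"),
        ("3", "Management Information Systems", "2004-02-01", 75.0, "Wiley"),
        ("4", "Information Systems Strategy", "2005-01-01", 130.0, "Pearson"),
        ("5", "Databases", "2006-01-01", 40.0, "Wiley"),
    ],
)

# Standard SQL: % is the wildcard, and ISO dates stored as text compare correctly as strings.
review_rows = conn.execute(
    """
    SELECT ISBN, BookName, Price, Publisher
    FROM Book
    WHERE BookName LIKE '%Information Systems%'
      AND PubDate > '2002-01-01'
      AND Price < 100
    ORDER BY Price
    """
).fetchall()

for row in review_rows:
    print(row)
```

Book 2 is excluded by the date filter, book 4 by the price filter, and book 5 by the name filter, so only books 3 and 1 remain, cheapest first.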

  6. Review: Group By … Having
     • Use the “Having” clause to filter the aggregation result:
       SELECT Publisher, COUNT(*)
       FROM Book
       GROUP BY Publisher
       HAVING COUNT(*) > 2
     • Use the “Where” clause to filter the records to be aggregated:
       SELECT Publisher, COUNT(*) AS total
       FROM Book
       WHERE Price < 100
       GROUP BY Publisher
       HAVING COUNT(*) > 10
       ORDER BY COUNT(*)
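The difference between the two clauses is easiest to see on a small table. The sketch below, assuming SQLite via Python's sqlite3 module and hypothetical sample rows, runs one query with HAVING alone and one with WHERE plus HAVING. The threshold of 2 is an assumption chosen to fit the tiny sample; the slide uses 2 and 10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Book (ISBN TEXT, BookName TEXT, Publisher TEXT, Price REAL)")
# Hypothetical sample data: three Wiley books, two Pearson books.
conn.executemany(
    "INSERT INTO Book VALUES (?, ?, ?, ?)",
    [
        ("1", "A", "Wiley", 50.0),
        ("2", "B", "Wiley", 80.0),
        ("3", "C", "Wiley", 150.0),
        ("4", "D", "Pearson", 40.0),
        ("5", "E", "Pearson", 120.0),
    ],
)

# HAVING alone: every book is counted, then groups with fewer than 2 books are dropped.
having_only = conn.execute(
    """
    SELECT Publisher, COUNT(*) AS total
    FROM Book
    GROUP BY Publisher
    HAVING COUNT(*) >= 2
    ORDER BY Publisher
    """
).fetchall()

# WHERE first: books priced at 100 or more are removed before counting,
# so the per-publisher totals shrink before HAVING is applied.
where_then_having = conn.execute(
    """
    SELECT Publisher, COUNT(*) AS total
    FROM Book
    WHERE Price < 100
    GROUP BY Publisher
    HAVING COUNT(*) >= 2
    ORDER BY Publisher
    """
).fetchall()

print(having_only)
print(where_then_having)
```

With this data Pearson passes the first query with two books but drops out of the second, because only one of its books costs less than 100.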

  7. Multiple Group By Fields
     SELECT Publisher, Author, AVG(Price) AS AvgPrice
     FROM Book
     GROUP BY Publisher, Author;

  8. Grouping With a Join
     SELECT Publisher, COUNT(*) AS NumOrders
     FROM Book, Orders
     WHERE Book.ISBN = Orders.ISBN
     GROUP BY Publisher;

  9. Grouping with a Join 2
     SELECT Publisher, Orders.CustomerID, SUM(Price) AS TotalPaid
     FROM Book, Orders, Customer
     WHERE Book.ISBN = Orders.ISBN
       AND Orders.CustomerID = Customer.CustomerID
     GROUP BY Publisher, Orders.CustomerID;
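The two join slides list tables in FROM and match them in WHERE. For inner joins this is equivalent to the explicit JOIN ... ON form, which the sketch below checks on hypothetical sample data using SQLite via Python's sqlite3 module. The table layouts beyond the columns named on the slide (for example OrderID and Name) are assumptions, and an ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Book (ISBN TEXT, Publisher TEXT, Price REAL);
    CREATE TABLE Customer (CustomerID INTEGER, Name TEXT);
    CREATE TABLE Orders (OrderID INTEGER, ISBN TEXT, CustomerID INTEGER);
    """
)
# Hypothetical sample data.
conn.executemany("INSERT INTO Book VALUES (?, ?, ?)",
                 [("1", "Wiley", 50.0), ("2", "Wiley", 80.0), ("3", "Pearson", 40.0)])
conn.executemany("INSERT INTO Customer VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)",
                 [(1, "1", 1), (2, "2", 1), (3, "3", 1), (4, "1", 2)])

# Implicit join, as written on the slide.
implicit_join = conn.execute(
    """
    SELECT Publisher, Orders.CustomerID, SUM(Price) AS TotalPaid
    FROM Book, Orders, Customer
    WHERE Book.ISBN = Orders.ISBN
      AND Orders.CustomerID = Customer.CustomerID
    GROUP BY Publisher, Orders.CustomerID
    ORDER BY Publisher, Orders.CustomerID
    """
).fetchall()

# Explicit JOIN ... ON form of the same query.
explicit_join = conn.execute(
    """
    SELECT Publisher, Orders.CustomerID, SUM(Price) AS TotalPaid
    FROM Book
    JOIN Orders ON Book.ISBN = Orders.ISBN
    JOIN Customer ON Orders.CustomerID = Customer.CustomerID
    GROUP BY Publisher, Orders.CustomerID
    ORDER BY Publisher, Orders.CustomerID
    """
).fetchall()

print(implicit_join)
print(implicit_join == explicit_join)
```

Each output row is one (publisher, customer) pair with the total that customer paid for that publisher's books.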

  10. Multiple Joins with Where and Group By
      SELECT FavoriteMovie, COUNT(*)
      FROM Profiles, FavoriteBooks, FavoriteMovies
      WHERE FavoriteMovies.ProfileId = Profiles.ProfileId
        AND FavoriteBooks.ProfileID = Profiles.ProfileID
        AND FavoriteBook = "The Great Gatsby"
      GROUP BY FavoriteMovie
      ORDER BY COUNT(*) DESC;

  11. Proportions Using Sub-Selects
      SELECT FavoriteMovie, COUNT(*) / (SELECT COUNT(*) FROM Profiles)
      FROM Profiles, FavoriteMovies
      WHERE FavoriteMovies.ProfileId = Profiles.ProfileId
      GROUP BY FavoriteMovie
      ORDER BY COUNT(*) DESC;
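A scalar sub-select supplies the denominator, here the total number of profiles, so each group's count becomes a share. The sketch below runs the idea in SQLite via Python's sqlite3 module on hypothetical sample data. One assumption to flag: SQLite divides two integers as integers, so the count is multiplied by 1.0 to force a fractional result; that adjustment is not on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Profiles (ProfileId INTEGER, Sex TEXT);
    CREATE TABLE FavoriteMovies (ProfileId INTEGER, FavoriteMovie TEXT);
    """
)
# Hypothetical sample data: four profiles, some with more than one favorite.
conn.executemany("INSERT INTO Profiles VALUES (?, ?)",
                 [(1, "F"), (2, "F"), (3, "M"), (4, "M")])
conn.executemany("INSERT INTO FavoriteMovies VALUES (?, ?)",
                 [(1, "Up"), (2, "Up"), (3, "Up"), (1, "Jaws"), (4, "Jaws")])

# Share of all profiles that list each movie as a favorite.
# The "* 1.0" avoids SQLite's integer division.
movie_shares = conn.execute(
    """
    SELECT FavoriteMovie,
           COUNT(*) * 1.0 / (SELECT COUNT(*) FROM Profiles) AS share
    FROM Profiles, FavoriteMovies
    WHERE FavoriteMovies.ProfileId = Profiles.ProfileId
    GROUP BY FavoriteMovie
    ORDER BY COUNT(*) DESC
    """
).fetchall()

print(movie_shares)
```

Three of the four profiles list "Up" and two list "Jaws", so the shares are 0.75 and 0.5. The shares need not sum to 1 because a profile can have several favorites.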

  12. Proportions Using Sub-Selects II
      SELECT FavoriteMovie, Profiles.Sex, COUNT(*) / AVG(Q.total)
      FROM Profiles, FavoriteMovies,
           (SELECT Sex, COUNT(*) AS total FROM Profiles GROUP BY Sex) AS Q
      WHERE FavoriteMovies.ProfileId = Profiles.ProfileId
        AND Q.Sex = Profiles.Sex
      GROUP BY Profiles.Sex, FavoriteMovie
      ORDER BY FavoriteMovie, Profiles.Sex;
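Here the denominator differs by group, so the sub-select becomes a derived table Q with one total per Sex, joined back to the profiles. Q.total is constant within each (Sex, FavoriteMovie) group, and wrapping it in AVG simply returns that constant while satisfying the rule that selected columns be grouped or aggregated. A minimal sketch on hypothetical sample data, assuming SQLite via Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Profiles (ProfileId INTEGER, Sex TEXT);
    CREATE TABLE FavoriteMovies (ProfileId INTEGER, FavoriteMovie TEXT);
    """
)
# Hypothetical sample data: four "F" profiles and two "M" profiles.
conn.executemany("INSERT INTO Profiles VALUES (?, ?)",
                 [(1, "F"), (2, "F"), (3, "F"), (4, "F"), (5, "M"), (6, "M")])
conn.executemany("INSERT INTO FavoriteMovies VALUES (?, ?)",
                 [(1, "Up"), (2, "Up"), (5, "Up"), (3, "Jaws"), (6, "Jaws")])

# Share of each sex that lists each movie. AVG returns a floating-point value
# in SQLite, so the division is fractional without further adjustment.
shares_by_sex = conn.execute(
    """
    SELECT FavoriteMovie, Profiles.Sex, COUNT(*) / AVG(Q.total) AS share
    FROM Profiles, FavoriteMovies,
         (SELECT Sex, COUNT(*) AS total FROM Profiles GROUP BY Sex) AS Q
    WHERE FavoriteMovies.ProfileId = Profiles.ProfileId
      AND Q.Sex = Profiles.Sex
    GROUP BY Profiles.Sex, FavoriteMovie
    ORDER BY FavoriteMovie, Profiles.Sex
    """
).fetchall()

for row in shares_by_sex:
    print(row)
```

Two of the four "F" profiles list "Up", giving 0.5, while one of the two "M" profiles does, also 0.5; dividing by the overall total of six would hide that the rates are equal.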

  13. The Data-Driven Firm

  14. Gary Loveman • Zero executive experience • Zero background in casinos • But an MIT PhD who knows how to make numbers talk • Results: • Transformed Harrah’s from second tier to number one gaming company in the world • Completed a $30.7 billion LBO • Introduced a culture of pervasive field experimentation • “There are two ways to get fired from Harrah’s…”

  15. The Data-Driven Firm • Why do we see these changes now? • Collect: easier to collect, store information about consumers, technologies, markets • Respond: Fast internal communication means that firms are agile enough to respond to external information • Process: Firms can process large volumes of data to make intelligent decisions

  16. Data-Driven Firms are Winning • Data-driven decision makers: • 4% higher productivity • 6% greater profitability • 50% higher market value from IT • (Brynjolfsson and Kim, 2011)

  17. What Wal-Mart Knows http://www.nytimes.com/2004/11/14/business/yourmoney/14wal.html

  18. Data-Driven Challenges • Measurement: What should be measured and how? • Incentives: How can we design incentives around these measures without creating adverse consequences? • Infrastructure: Do we have the right infrastructure (servers, software, etc.) in place to measure and analyze the data we have? • Skills: Do we have the skills we need to accomplish these tasks?

  19. A new kind of R&D

  20. What is Data Mining? • Automated search for patterns in data • Automated (or computer assisted) statistical modeling • A process for using IT to extract useful, actionable knowledge from large bodies of data

  21. “Big Data” http://online.wsj.com/video/2012-the-year-of-big-data/D4237159-C9A9-4A09-9701-F03EF7FB8040.html

  22. Big Names with Big Data

  23. CEOs • “We have come out on top in the casino wars by mining our customer data deeply, running marketing experiments and using the results to develop and implement finely tuned marketing and services strategies that keep our customers coming back.” • Gary Loveman, Harrah’s CEO • “For every leader in the company, not just for me, there are decisions that can be made by analysis. These are the best kinds of decisions. They’re fact-based decisions.” • Jeff Bezos, Amazon CEO • “It’s all about collecting information on 200 million people you’d never meet, and on the basis of that information, making a series of very critical long-term decisions about lending them money and hoping they would pay you back.” • Rich Fairbank, founder and CEO of Capital One

  24. Why Now? • Firms are collecting massive amounts of data on operations, customers, and the competitive landscape. • But there is far too much data for manual analysis. • Amazon: > 50M active customers • Phone companies: 100M+ accounts, thousands of txns each • Google: 11B “objects” • RFID tags

  25. Types of Data Mining

  26. Our Roadmap • Visualization • Basic Data Mining Process • Classification Example • Clustering Example

  27. Next Class: Data Mining II • Work on A4
