5/4: Final Agenda…
• 3:15-3:20 Raspberry bars
  • In lieu of Google IPO shares. Homework 3 returned; questions on the final?
• 3:15-3:40 Demos of student projects
  • Steve Commisso; Wes Dyer; Jianchun Fan
• 3:40-4:20 Interactive Semester Review
• 4:20-4:30 Parting words
Course Outcomes
• After this course, you should be able to answer:
  • How do search engines work, and why are some better than others?
  • Can the web be seen as a collection of (semi)structured databases?
    • If so, can we adapt database technology to the Web?
  • Can useful patterns be mined from the pages/data of the web?
What did you think these were going to be??
REVIEW
Main Topics
• Approximately three halves plus a bit:
  • Information retrieval
  • Information integration/aggregation
  • Information mining
  • Other topics as permitted by time
Adapting old disciplines for Web-age
• Information (text) retrieval
  • Scale of the web
  • Hypertext/link structure
  • Authority/hub computations
• Databases
  • Multiple databases
  • Heterogeneous, access limited, partially overlapping
  • Network (un)reliability
• Datamining [Machine Learning/Statistics/Databases]
  • Learning patterns from large scale data
From my perspective, we did it all!
• 1/20; 1/22: Intro; vector space
• 1/27; 1/29: Vector space
• 2/3; 2/5: Correlation; LSI
• 2/10; 2/12: Search engine tech: A/H
• 2/16; 2/20: Search engine tech: PageRank; practical considerations
• 2/24; 2/26: Google; crawling; clustering
• 3/2; 3/4: Clustering 2; collaborative filtering
• 3/9: Midterm
• 3/11: Midterm discussion; content-based filtering; start of classification learning
• SPRING BREAK
• 3/23: Naive Bayes classification and text classification; spam filtering (3/25)
• 3/30: DB refresher
• 4/1: XML
• 4/6: XQuery use cases; semantic web
• 4/8: Information integration start
• 4/13; 4/15: Overview of info. vs. data integration; issues in DI (vs. DB and DDB); issues in DB/IR; Project 2 discussion; DI models (GAV vs. LAV)
• 4/20; 4/22: GAV/LAV models; query optimization in data integration
• 4/27; 4/29: Bibfinder; DB/IR
• 5/4: Interactive Semester Review
Interactive Semester Review
494 students
Okay, folks, Google can be improved with LSI. We need data integration, clustering, which Google doesn’t do much, we need DB/IR integration...
Blah blah Google blah blah blah blah. Blah blah blah blah blah blah, blah blah blah Google blah blah blah blah blah blah blah blah blah...
A Farside treasury…
Schindler: I could've got more... I could've got more, if I'd just... I could've got more...
Stern: Oskar, there are eleven hundred people who are alive because of you. Look at them.
Schindler: If I'd made more money... I threw away so much money, you have no idea. If I'd just...
Stern: There will be generations because of what you did.
Schindler: I didn't do enough.
Stern: You did so much.
Schindler: This car. Goeth would've bought this car. Why did I keep the car? Ten people, right there. Ten people, ten more people... (He rips the swastika pin from his lapel) This pin, two people. This is gold. Two more people. He would've given me two for it. At least one. He would've given me one. One more. One more person. A person, Stern. For this. I could've gotten one more person and I didn't.

• Top few things I would have done if I had more time
  • Information extraction; automated annotation
  • Record/ontology/schema matching issues
  • Customized portal generation
  • P2P mediation
  • Services, and service standards
  • Security issues..
• Be less demanding more often (or even once…)

Adieu with an Oskar Schindler Routine.

Rao: I could've taught more... I could've taught more, if I'd just... I could've taught more...
T&U: Rao, there are twenty people who are mad at you because you taught too much. Look at them.
Rao: If I'd made more time... I wasted so much time, you have no idea. If I'd just...
T&U: There will be generations (of bitter people) because of what you did.
Rao: I didn't do enough.
T&U: You did so much.
Rao: This slide. We could’ve removed this slide. Why did I keep the slide? Two minutes, right there. Two minutes, two more minutes.. This music, a bit on P2P. This review. Two points on custom portals. I could easily have made two for it. At least one. I could’ve gotten one more point across. One more. One more point. A point, Sree. For this. I could've gotten one more point across and I didn't.