Introduction LING 575 Fei Xia 01/04/2011
Today • General course information • Enron Email Dataset • Social Network Analysis • Email foldering • Hw1
Course objectives • Research question: how can we use computational linguistics (CL) to process human communication data (e.g., emails, chat rooms, Facebook)? • Approach: Use a publicly available data set, the Enron Email Dataset, as a case study • Reading: Become familiar with some of the existing work • Working on a project: • Choose a topic • Review existing work: identify strengths and weaknesses • Outline your proposal • Implement your system • Write a report • Presenting: • Improve presentation skills • Get feedback • Provide feedback and suggestions to your classmates
Reading • Two sets of papers: • Set 1: selected and presented by Fei • Set 2: selected and presented by students • You need to read both sets of papers • No need to understand all the technical details • Read the papers before class
Working on a project • We will go over a few topics in class • You can choose one of the topics or a new topic • You can work on the project alone or with a teammate
Topics covered by Fei • Social network analysis • Email foldering • Personal vs. business emails • Email zoning • Deception detection • Information extraction (??)
Timeline for your project • Week 2: select a topic and 3+ papers (hw1) • Week 3: present the papers • Week 6 or 7: your approach and initial results • Week 10: final results • Week 11: final report • Weekly update in hw or in class (as short presentations)
Presentations • Fei (2 weeks): Week 1 and 2 • Students (3 weeks): Week 3, 6 or 7, and 10 • Fei or both (5 weeks): Week 4, 5, 6 or 7, 8, and 9 • Remember to upload your slides to the Adobe Meeting Room in advance.
Feedback and suggestions • For your project, you need to • provide a list of papers • make a reading assignment (about three questions per paper) and provide (some) answers afterward • present the papers and your proposal • run experiments • write the final report • For other people’s projects, you need to • read their papers • finish the reading assignments • provide suggestions (e.g., ideas, questions, references) in class or on GoPost
Grades for LING575 • No midterm or final exams. • Reading assignments: 10-20% • Project and presentation: 70-80% • Class participation: 10-20%
Prerequisites • LING 570 • LING 572 is a big plus, but it is not required • Programming in C/C++, Java, Perl, Python, or Ruby • Basic Unix/Linux commands • Be able to • attend class live (in person or remotely) • spend 10-20 hours per week on the assignments • If you don’t meet all the prerequisites, you need to email me by 6pm tomorrow.
Office hours • Email: • Email address: fxia@uw.edu • Subject line should include “ling575” • The 36-hour rule: it works both ways • Office hour: • Time: Thurs 10:45-11:45am • Location: Padelford A-210G
Meeting room and recording • The URL for the meeting room is http://uweoconnect.extn.washington.edu/ling575/ • In-person students: • Need to attend in person • A laptop is allowed ONLY IF you use it for course-related activities • Online students: • Need to attend live • Bring your mic, and log in a few minutes before class starts to test it • For your presentation, remember to upload the slides to the meeting room before class starts • Speak up, as the mic might not be very sensitive. • The links to the recordings will be on GoPost.
URL, GoPost, Email • Course URL: http://courses.washington.edu/ling575x • Syllabus (incl. slides, assignments, and papers): • GoPost: • CollectIt: • GoPost: Most course-related questions should go to GoPost; the URLs of the recordings are posted there as well. • Email: use it ONLY for confidential subjects. • Please check your email and GoPost at least once per day.
GoPost • It will have discussion areas similar to the ones in LING 570. • Each team will own a discussion area, which includes • a list of papers you have found • a list of questions you want others to answer • any questions you have for the group • any ideas that you want to share with the group • …
Patas • If you need a patas account, email linghelp@uw.edu right away to request one. • The directory for LING575 on patas: • ~/dropbox/10-11/575x/ • Data directory: /corpora/enron_email_dataset/ • For jobs that run more than 5 minutes, use the cluster submission commands: see the “Condor submission” slides.
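As a first look at the data directory, a minimal Python sketch along the following lines could count messages per sender. It rests on assumptions not stated on the slide: that each file under the data directory holds one message with standard RFC 822 headers, and that messages sit in nested per-user folders. The function name count_senders is hypothetical.

```python
import os
from collections import Counter
from email import message_from_file

# Path taken from the slide; the layout underneath it is an assumption.
DATA_DIR = "/corpora/enron_email_dataset/"


def count_senders(root):
    """Count messages per From: address under root.

    Assumes one message per file; files without a From: header are skipped.
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # latin-1 can decode any byte sequence, so odd encodings
            # do not abort the scan.
            with open(path, encoding="latin-1") as handle:
                message = message_from_file(handle)
            sender = message.get("From")
            if sender:
                counts[sender.strip().lower()] += 1
    return counts


if __name__ == "__main__":
    # Print the ten most frequent senders, if the directory is present.
    for sender, total in count_senders(DATA_DIR).most_common(10):
        print(total, sender)
```

A scan of the full data set may well run longer than 5 minutes, in which case it belongs on the cluster rather than the head node. Pointing the function at a single user's folder first is a cheap way to check the layout assumption.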
Extensions and incompletes • Extensions and incompletes are given only under extremely unusual circumstances (e.g., health issues, family emergency). • The following are NOT acceptable reasons for an extension: • My code does not quite work. • I have a deadline at work. • I am going to be out of town for a few days. • …