Feedback – Lab 2, 9 Sept 2014
Your learning experience in this course • Active listening: video lectures • The underlying question is “How am I going to use this concept later on?” • Consolidating new knowledge via quizzes and surveys • Direct questions will help you memorize the concepts • Activating your knowledge by reflecting and reasoning (your cognitive effort!) • “How shall I solve this problem with the knowledge that I have acquired so far?” – Lab classes
Lab Sessions: Text Comprehension & Task Interpretation • (Always: point out inaccuracies) • Use case 1 • I do not understand the text: • Go back to the video lecture; you have probably not built the background context required for completing the task.
… Continued. Use case 2: • Oh my god, what am I supposed to do here? • Read the text several times and identify the key points in the text structure: - Description - The purpose - Tasks - Pre-processing: feature transformation - Identify the best features by applying your knowledge about empirical error - Interpret the results based on your knowledge about empirical error and your common-sense knowledge or historical research.
My expectations on your learning experience • Students should be able to interpret the text and the tasks (diversified interpretations are allowed and welcome) • Students should be able to show a critical mind by working out one or more plausible interpretations and motivating their choice(s).
About instructions and time… • I am not sure that the instructions were unclear. • The core task is the representation of bin0 and bin1 in order to apply the formulae. • This was the cognitive effort of this lab. • You could work in groups, and groups could exchange info between them… and for several hours… And you made it!
Pre-processing: feature transformation • Categorical features → Binary features • Each feature should assume a value 0 or a value 1, following the instructions under the heading “Preprocessing” (search & replace; IF formulae; whatever…)
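The transformation on this slide can be sketched in Python. This is only a minimal sketch: the column names (`sex`, `embarked`), the category values, and the choice of which category becomes 1 are assumptions made for illustration. The actual mapping is the one given under the “Preprocessing” heading of the lab instructions.

```python
# Hypothetical rows; column names and category values are assumptions,
# not the lab's actual data.
rows = [
    {"sex": "female", "embarked": "S"},
    {"sex": "male", "embarked": "C"},
    {"sex": "male", "embarked": "S"},
]


def to_binary(value, one_value):
    """Return 1 if the categorical value equals one_value, otherwise 0."""
    return 1 if value == one_value else 0


# Assumed encoding: which category maps to 1 is an arbitrary choice here.
binary_rows = [
    {
        "sex": to_binary(row["sex"], "female"),
        "embarked": to_binary(row["embarked"], "S"),
    }
    for row in rows
]
```

A spreadsheet search & replace or an IF formula does the same job; what matters is that every cell of a feature ends up as either 0 or 1.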
The task was about empirical error (Lect 6, min 7:44) • Empirical error: how well the chosen hypothesis classifies the training data. • How do you assess a hypothesis? • Systematic counting of the correct guesses and wrong guesses made by the hypothesis with respect to the correct labels • This means that you must compare the predictions of the hypothesis with the actual labels
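The counting described on this slide can be written down as a short Python sketch. The function name `empirical_error` and the example values are assumptions for illustration; the idea is simply to compare each prediction with the actual label and report the fraction of wrong guesses.

```python
def empirical_error(predictions, labels):
    """Fraction of training examples where the prediction differs from the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute the error of an empty data set")
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)


# Hypothetical values: 1 = survived, 0 = died.
labels = [1, 0, 0, 1, 0]
predictions = [1, 0, 1, 1, 1]

# Two of the five predictions disagree with the labels.
error = empirical_error(predictions, labels)
accuracy = 1 - error
```

Accuracy and error are two views of the same count: the correct guesses divided by the total, and the wrong guesses divided by the total.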
Lab Task • Our hypotheses were the different features. • We have to assess each feature with respect to classification (survived vs died)
1) For each feature, calculate the empirical error • LEARN TO PREDICT THE FIRST COLUMN • (a) For each of the features, calculate (and write down) the training error if you used only that feature to classify the data. To do this you will need to do the following for each feature: • Split the data based on that feature. Call bin0 all examples that have 0 for that feature and bin1 all examples that have 1 for that feature. • Calculate the majority count for the label in each bin, i.e. for bin0, majority(bin0) = max(count(bin0 = survive), count(bin0 = notsurvive))
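The bin0/bin1 procedure can be sketched in Python as follows. This is a sketch under stated assumptions, not the lab's reference solution: the function name, the feature names `feature_a` and `feature_b`, and all data values are made up, and the final step (treating every example outside a bin's majority as an error) is the usual reading of the majority count rather than something spelled out on the slide.

```python
from collections import Counter


def single_feature_error(feature_values, labels):
    """Training error when classifying with one binary feature.

    Each bin predicts its majority label, so the correct guesses in a bin
    equal the majority count; everything else in that bin is an error.
    """
    total = len(labels)
    if total == 0 or len(feature_values) != total:
        raise ValueError("need non-empty inputs of equal length")
    if any(x not in (0, 1) for x in feature_values):
        raise ValueError("feature must be binarized (0 or 1) first")
    correct = 0
    for bin_value in (0, 1):
        bin_labels = [y for x, y in zip(feature_values, labels) if x == bin_value]
        if bin_labels:  # an empty bin contributes nothing
            correct += max(Counter(bin_labels).values())
    return (total - correct) / total


# Hypothetical data: 1 = survived, 0 = died. Not the lab's data set.
labels = [1, 0, 0, 1, 0, 0]
features = {
    "feature_a": [1, 0, 0, 1, 0, 1],
    "feature_b": [1, 1, 0, 0, 1, 0],
}

errors = {name: single_feature_error(vals, labels) for name, vals in features.items()}
best_feature = min(errors, key=errors.get)
```

On this made-up data, `feature_a` has bin0 labels 0, 0, 0 and bin1 labels 1, 1, 0, so 5 of 6 examples follow the majority; `feature_b` gets only 4 of 6 right. The result says nothing about which feature wins on the lab data.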
Accuracy/Error • A possible representation… WATCH OUT! THE AGE FEATURE IS TRICKY HERE!
Which feature would be best to use? • EMBARKED… if we trust this sample and our calculations… (the error rate on this feature is the lowest) • Basically this means that many of those who started their trip from Southampton did not survive. • However, the difference between the features was very small!
Many interesting interpretations! Nobody really believed that Embarked was a good feature • “this could depend on the small dataset” • “embarked feature gave the lowest error […] Intuitively the first class feature should have the strongest relationship with the chance of surviving” • “If we calculate accuracy with more features […], we get more interesting results” • “The Embarked would be the best to use because it has the lowest error rate. In reality it is very unlikely that the city has any correlation with their chance of survival, unless they received some special training before boarding or shared a rough upbringing in the city” • Etc.
Missing values • Good that you noticed that there were missing values, i.e. cells without any value! • Some of you have removed them • Some of you have converted them to >25 • In practice, missing values require “more investigation” • Missing values are not considered to be “noise” in the sense that was explained during the video lecture.
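The two ways of handling missing values mentioned on this slide can be compared in a small Python sketch. The age values are made up, and reading “converted to >25” as “put the missing cell into the >25 bin” is an assumption about what was done in the reports.

```python
# Hypothetical age column; None marks a cell without any value.
ages = [22, None, 38, None, 4, 30]

# First step in any case: find out where the missing cells are.
missing_positions = [i for i, age in enumerate(ages) if age is None]

# Option 1: drop the examples with a missing age.
kept_ages = [age for age in ages if age is not None]

# Option 2: binarize and force missing cells into the ">25" bin.
# This silently treats "unknown" as "older than 25", which is an assumption.
age_over_25 = [1 if age is None or age > 25 else 0 for age in ages]
```

Option 1 shrinks the training set, while option 2 changes the contents of bin1, so the two choices can give different empirical errors for the same feature. That is one reason why missing values call for more investigation.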
Technical troubles • If you experience problems with a computer (configuration problems, weird behaviour, etc.), just change computer and report the trouble (Per?)
Next… • Those who have miscalculated the empirical error should recalculate it in the correct way, as presented. • Those who want can have some additional training with an optional task that is on the website. It contains the solution. You do not need to submit anything. It is just for you! • All those who have submitted the report have completed this lab task. Well done!