Discover how Nikhil Malhotra is leveraging the full potential of Natural Language Processing (NLP) to transform businesses.
Nikhil Malhotra: HARNESSING THE POWER OF NLP
• If the last decade was the decade of vision-based AI, this decade will be the decade of conversational AI. Conversation, and the ability to communicate through a diverse set of languages, is one of the traits that distinguish humans from the rest of the animal world. We are seeing an avatar, previously unknown and unfathomed. When ChatGPT came alive, people saw the power of generative AI and modelling for the first time.
• Natural language is a verbose, sometimes unformed, ungrammatical, and often non-specific language used for general communication amongst human beings.
• We use this format to communicate with others, but we also use it to formulate thoughts in our heads. Formal languages, in contrast, are strict, unchanging languages, like those we have all studied in school, viz. chemistry, mathematics, etc. A simple comparison between the two is drawn here:
• Formal Languages
• Strict, unchanging rules
• Application-specific, like maths and chemistry
• No ambiguity
• Can be parsed by regular expressions
• Inflexible: no new terms
• Natural Languages
• Flexible, evolving language
• Unspecific and used in many domains and applications
• Redundant and verbose
• Expressive and ambiguous
• Difficult to parse
• Very flexible
• Understanding language has long been a quest of the computing community. One of the first real attempts was to treat a language as a formulation of symbols, where the symbols convey certain things that enable thoughts. An example is the term “CRAB”. Almost instantaneously, we know what it means. The symbol crab is a combination of a sound and the visual imagery that you get when you hear the word.
If I say the term “Kavouras”, most non-Greek speakers would have no idea what I am referring to. It is as if the neurons had suddenly stopped firing, because the language is unknown to the listener. “Kavouras” is a crab in Greek, and since the listener's brain has not seen or heard that symbol before, it appears to be a word out of context. This was the primary reason why early practitioners considered language a means of recognizing symbols, a line of thought that had to be discarded for natural language processing to evolve.
Machines do not understand text as humans do, but they do understand numbers. NLP therefore started with encoding text in a numerical format that a machine can work with.
We can understand this better using a sentence. Let us say the sentence that the machine is trying to understand is: “The animal crossed the street.”
The sentence above is broken down using words as tokens. Let's take the first word, which is “The”. The process is to find “The” in our list or database of vocabulary and formulate a vector for it. Let us assume the word “The” comes at position 4137 and the total number of unique words in our vocabulary is 30,000, so the final vector array would look like:
[Position: 4137] [0, 0, ..., 1, 0, ...]
By now, most of you will have realized that position 4137 holds a 1 and the remaining 29,999 elements of the vector are 0, which gives this the name one-hot encoding vector. This technique gives each word a distinct representation, with the 1 placed at the word's position in the vocabulary.
It was soon found that although the one-hot encoding vector works, there are challenges when someone wishes to find differences or similarities between similar kinds of text. I will use a pair of sentences as an example here.
“I wish to go to the hotel” and “I wish to go to the motel”
They are very similar in their semantic meaning, as the intent delivered by each sentence is a person trying to go to a location to stay. The challenge is that with one-hot encoding vectors, it is difficult to capture this semantic meaning.
When these two sentences are compared, they appear markedly different. This gave rise to a mechanism for encoding words as vectors that provide meaning and context to the word and sentence, leading to the creation of the word2vec, or word-to-vector, technique.
Word2vec takes a large corpus of text as its input and produces a vector
space, which usually has several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space.
Once these word vectors and sentence vectors are made, it is very easy for the system to do analogical reasoning on a question like “King is to queen as father is to?” The answer is “mother”, because the resulting vector lies closest to the vector for “mother”.
Once formed, the vectors enable us to do a lot of mathematical analysis on them. We use these vectors to determine the semantic closeness of a question asked by a user and to provide answers to the user.
Early uses of Word2Vec were FAQ chatbots. Some examples are as follows:
• FAQ-based chatbots
• Document parsing and search
• IVR-based automated response
• Music2Vec: A newer concept has been to convert music-based tones into a vectorized format akin to word2vec. This enables users within applications like Spotify to find the nearest song based on a note of music.
The word2vec approach told us that words are no longer an island, and that they carry context from where they occur. In fact, the basic principle of Word2Vec taught us that words of a feather flock together. For example, medical journals would have words that are closely related to each other and quite unrelated to other topics.
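The contrast between one-hot vectors and dense word vectors described above can be sketched in a few lines of Python. This is a minimal illustration, not a real word2vec model: the seven-word vocabulary, the four-dimensional `toy_vectors`, and the helper names `one_hot`, `cosine`, and `analogy` are all assumptions made up for this example, and the dense values are hand-picked rather than trained on a corpus.

```python
import math


def one_hot(word, vocab):
    """Return a list with 1 at the word's vocabulary index and 0 elsewhere."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' with the word nearest to b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the three query words so the answer is a new word.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Tiny illustrative vocabulary (the text above assumes about 30,000 words).
vocab = ["i", "wish", "to", "go", "the", "hotel", "motel"]
one_hot_similarity = cosine(one_hot("hotel", vocab), one_hot("motel", vocab))

# Hypothetical dense vectors, hand-picked for this sketch (not trained).
# Dimensions, loosely: royalty, male, female, lodging.
toy_vectors = {
    "king":   [1.0, 1.0, 0.0, 0.0],
    "queen":  [1.0, 0.0, 1.0, 0.0],
    "father": [0.0, 1.0, 0.0, 0.0],
    "mother": [0.0, 0.0, 1.0, 0.0],
    "hotel":  [0.1, 0.0, 0.0, 1.0],
    "motel":  [0.0, 0.0, 0.0, 0.9],
}
dense_similarity = cosine(toy_vectors["hotel"], toy_vectors["motel"])

print(one_hot_similarity)  # 0.0: one-hot vectors of different words never overlap
print(dense_similarity)    # close to 1 for these hand-picked vectors
print(analogy("king", "queen", "father", toy_vectors))  # mother
```

With one-hot vectors, “hotel” and “motel” are exactly as dissimilar as any other pair of words, whereas dense vectors can place them close together. A trained model would learn such vectors from a large corpus instead of having them written by hand.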
The next step within natural language understanding was understanding word sequences. Sequences are the core of sentence formation and, hence, the core of communication. Sequences allow you to do the following:
• Machine translation: Je suis content -> I am happy
• Named entity recognition: Harry Potter and Hermione Granger are good friends.
• Generating music from a single note
• Generating words: something like what ChatGPT does
• A special class of algorithms was needed for this. This was done using the recurrent neural network (RNN), as shown below. RNNs had a large advantage over vanilla neural networks because they could model time in a sequence of words and so understand the sequence better. The issue that RNNs faced was retaining context when sentences became longer. Consider the following paragraph.
• An animal came to a city. The city was large and cacophonic. The animal was very scared as it saw a large number of cars, buildings, and other features in the city. The city was so large that it found that the forest appeared smaller.
• The Age of Transformers
• Then came the age of transformers, which is the current age. ChatGPT, as of today, has become the latest surge and the poster child of what natural language generation can do.
• Positives of ChatGPT:
• NLG: For the first time, natural language generation, which means the generation of sequences of words that are grammatically correct, has been possible using some model of AI.
• Foundational AI: It's a foundational model that not only does conversational AI but also generates pieces of code. By definition,
foundational AI models have the capacity to do more than one specific task.
• Limited Transparency: I would not go out on a limb and say that it is truly ethical AI, but it tries to do its part, as the creators seem to have learned from mistakes made by other models in the past.
• Limitations of ChatGPT:
• It doesn't have the context of an enterprise-based system. For it to gain that context, enterprise data has to be sent to it. This in itself raises the question of whether enterprises would be willing to send their conversations and context to a cloud-based system.
• ChatGPT is also confident when it is wrong. This can be a major barrier for enterprise-based applications.
• If you insist on an idea, ChatGPT will eventually go along with it. As an example, let's try writing a statement to ChatGPT.
• ME: “I saw a pink elephant flying”
• ChatGPT: “It's possible that you experienced a hallucination. Seeing a pink elephant is often used as a symbol for being drunk or having consumed other substances that can cause hallucinations. However, if you are concerned about your health, it's always a good idea to consult with a doctor.”
• ME: “What are you saying. I have seen pink elephants flying I am sure”
• ChatGPT: “If you're certain that you saw a pink elephant flying, it's possible that you experienced a vivid imagination or a dream. Seeing things that are not there can also be a symptom of certain mental or neurological conditions…”
• As you can see, if I, as a human being, assert a wrong idea more forcefully, ChatGPT backs off from its pushback. Now I tried this again:
• ME: “I saw a pink elephant flying”
• ChatGPT: “It's not uncommon for people to have unusual or imaginative experiences, and seeing a pink elephant flying could be an example of this…”
The response is different, which gives the illusion that you are talking to a real person, which you are not. If all of you tried the same sentence as I have, you would see ChatGPT alternate between the two answers, which indicates that the model, although very good, has some defined data from which it generates its values.
In conclusion, I would say that we are just on the cusp of what artificial intelligence can do. ChatGPT is just an initial manifestation. The world ahead appears to be a seamless handoff between humans and AI. We have to ensure, however, that we humans have the kill switch in our hands.