240 likes | 256 Views
Implementing an unsupervised method to detect sequential and quantitative anomalies in unstructured logs, essential for efficient service management.
E N D
LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs Weibin Meng, Ying Liu, YichenZhu,ShenglinZhang,DanPei YuqingLiu,YihaoChen,RuizhiZhang,ShiminTao,PeiSunandRongZhou 2019/8/15 1 WeibinMeng
InternetServices 396 • The numberofservicesis growing rapidly Stabilityofservicesarebecoming moreimportant 319 • Internetprovide varioustypes of services 254 400 201 156 300 122 Traffic will increase more than three times 200 100 Source:CiscoVNIGlobalIP 0 2017 2018 2019 2020 2021 2022 2019/8/15 2 WeibinMeng
AnomalyDetection • Anomalies will impact revenue and user experience. • Anomaly detection plays an important role in servicemanagement. 2019/8/15 3 WeibinMeng
LogsforAnomalyDetection • Logs are one of the most valuable data for anomaly detection • Diverse • Logsrecord a vast range of runtime information • Every serviceand device generates logs • General Unstructured logs 2019/8/15 4 WeibinMeng
LogsforAnomalyDetection scenario pattern detection Asingle log can reflect ananomaly. e.g., “ powerdown” • Singlelog • anomaly • Keywords& • Regularexpressions The number of multiple logs changes can reflectanomalies. e.g.,num(down)!=num(up) • Logs • Quantitative Anomalies • Basedon • logsequence The sequence of multiple logs changes can reflect anomalies. e.g.,OSPFfailedtostart Sequential Anomalies Ourworkfocusonlogsequenceanomalydetection 2019/8/15 5 WeibinMeng
ManualDetection • Automatically detect anomalies based on unstructured logs WorkflowofOSPF(anetworkprotocol)startup : Down → Attempt → Init → Two-way→ Exstart → Exchange → Loading → Full QuantitativerelationshipofInterfaceflapping: num(interfacedown)=num(interfaceup) Runtimelogs: Line protocol on Interface ae3, changed state to down Interface ae3, changed state to down Interface ae3, changed state to up Runtimelogs: OSPFADJCHG, Nbr 1.1.1.1 on FastEthernet0/0 from Attempt to Init OSPFADJCHG, Nbr 1.1.1.1 on FastEthernet0/0 from Init to Two-way OSPFADJCHG, Nbr 1.1.1.1 on FastEthernet0/0 from Two-way to Exstart OSPFADJCHG, Nbr 1.1.1.1 on FastEthernet0/0 from Two-way to Exstart Aninterfacedowneventoccurs Everylogisnormal, butOSPF failed to start 2019/8/15 6 WeibinMeng
Templates(logkeys): T1.Interface *, changed state to down T2.Vlan-interface *, changed state to down T3.Interface *, changed state to up T4.Vlan-interface *, changed state to up Logs->Templateindexes: L1->T1,L2->T2,L3->T3 L4->T1,L5->T4,L6->T3 Logtemplateindexsequence: T1,T2,T3,T1,T4,T3 Previousstudies • Existing log anomaly detection: • Quantitativepatternbased methods • Sequentialpatternbased methods • Quantitative anomaliesdetectionmethods Logs: L1.Interface ae3, changed state to down L2.Vlan-interface vlan22, changed state to down L3.Interface ae3, changed state to up. L4.Interface ae1, changed state to down L5.Vlan-interface vlan22, changed state to up L6.Interface ae1, changed state to up Sliding/sessionwindows CountMatrix T1,T2,T3,T1,T4 v1v2v3v4 Cj1110 Cj+11110 Cj+21011 Cj+31011 PCA(SOSP’09) PreFix(SIGMETRIS’18) sequencenext • Onlycomparingtemplate indexeslosestheinformationhidden intemplate semantics T1,T2,T3,T1,T4 v1 [v1v2v3] [v2v3v1] [v3v1v4] v4 LogCluster(ICSE’16) IM(ATC’10) T1,T2,T3,T1,T4 v3 Sliding/sessionwindows T1,T2,T3,T1,T4 T1,T2,T3,T1,T4 DeepLog(CCS’17) T1,T2,T3,T1,T4 Sequentialanomaliesdetectionmethods 2019/8/15 7 WeibinMeng
Challenges • Valuable information could be lost if only log template index is used. • Services can generate new log templates between • two re-trainings • Some templates are similar in semantics but different in indexes • Existing methods cannot detect sequential and quantitative anomalies simultaneously. • Existing approaches cannot address this problem 2019/8/15 8 WeibinMeng
OverviewofLogAnomaly Historical logs … Match Templatesequence Template Vector Sequence LSTM Vector sequence Attention Synonyms& Antonyms Extract Count Vector LSTM An anomalydetectionsystem based on unstructuredlogs template2Vec … Word Vectors Template Vectors Templates Model template2Vec template2Vec Classification Offlinelearning Onlinedetection Existing Vectors Temporary Templates Temporary Vectors Update Real-time logs Comparison Vector sequence Output Templatesequence Match 2019/8/15 9 WeibinMeng
TemplateRepresentation Historical logs … Match Templatesequence Template Vector Sequence LSTM Vector sequence Attention Synonyms& Antonyms Extract Count Vector LSTM Addressthefirstchallengeandsavetemplatesemantics. template2Vec … Word Vectors Template Vectors Templates Model template2Vec template2Vec Classification Offlinelearning Onlinedetection Existing Vectors Temporary Templates Temporary Vectors Update Real-time logs Comparison Vector sequence Output Templatesequence Match 2019/8/15 10 WeibinMeng
TemplateRepresentations • Some existing templates have similar semantics • Some logs containing antonyms look similar but have opposite semantics • Convertlogtemplatesto“soft” representations • Takes antonyms and synonymsinto consideration Logs: 1.Interface ae3, changed state to down 2.Vlan-interface vlan22, changed state to down 3.Interface ae3, changed state to up 4.Vlan-interface vlan22, changed state to up 5.Interface ae1, changed state to down 6.Vlan-interface vlan20, changed state to down 7.Interface ae1, changed state to up 8.Vlan-interface vlan20, changed state to up Templates: 1.Interface *, changed state to down 2.Vlan-interface *, changed state to down 3.Interface *, changed state to up 4.Vlan-interface *, changed state to up Logs>Templates: L1->T1L2->T2L3->T3L4->T4 L5->T1L6->T2L7->T3L8->T4 2019/8/15 11 WeibinMeng
Template2Vec • template2Vec: (templaterepresentationmethod) • Construct the set of synonyms and antonyms • CombinedomainknowledgeandWordNet • Generate word vectorsbyusingdLCE[1]algorithm • dLCE is a distributional lexical-contrast embedding model • Calculate template vectors. Syns&Ants Wordvectors (2) (3) (1) Templatevectors Templates (3) [1]Kim Anh Nguyen, Sabine Schulte, and Ngoc Thang Vu. Integrating distributional lexical contrast into word embeddings for antonym-synonym distinction. arXiv preprint arXiv:1605.07766, 2016. 2019/8/15 12 WeibinMeng
Template Approximation Historical logs … Match Templatesequence Template Vector Sequence LSTM Vector sequence Attention Synonyms& Antonyms Extract Count Vector LSTM A mechanism to address new templates at runtime template2Vec … Word Vectors Template Vectors Templates Model template2Vec template2Vec Classification Offlinelearning Onlinedetection Existing Vectors Temporary Templates Temporary Vectors Update Real-time logs Similarity comparison Vector sequence Output Templatesequence Match 2019/8/15 13 WeibinMeng
Betweentwore-trainings Template Approximation • Extract atemporarytemplate for the log ofanew type • Map the temporary template vectorinto one of the existing vector Word Vectors Template Vectors Templates offline online Existing Vectors Temporary Templates Temporary Vectors Real-time logs Template Approximation Between Two Consecutive Trainings 2019/8/15 14 WeibinMeng
AnomalyDetection Historical logs … Match Templatesequence Template Vector Sequence Addressthethirdchallengeanddetecttwoanomaliessimultaneously. LSTM Vector sequence Attention Synonyms& Antonyms Extract Count Vector LSTM template2Vec … Word Vectors Template Vectors Templates Model template2Vec template2Vec Classification Offlinelearning Onlinedetection Existing Vectors Temporary Templates Temporary Vectors Update Real-time logs Similarity comparison Vector sequence Output Templatesequence Match 2019/8/15 15 WeibinMeng
Anomalydetection Sequentialpattern(e.g,OSPFstarting) Quantitative pattern(e.g.,up=down) Logs: L1Interface ae3, changed state to down L2Vlan-interface v2, changed state to down L3Interface ae3, changed state to up. L4Interface ae1, changed state to down L5Vlan-interface v2, changed state to up L6Interface ae1, changed state to up Templates(logkeys): T1Interface *, changed state to down T2Vlan-interface *, changed state to down T3Interface *, changed state to up T4Vlan-interface *, changed state to up Templatesindexsequence: T1T2T3T1T4T3 Templatesvectorsequence: v1v2v3v1v4v3 Slidingwindows 2019/8/15 16 WeibinMeng
AnomalyDetection … • Sortprobabilities: • For a log sequence, we sort the possible next template vector based on their probabilities(ofappearinthenextlog). • Topkcandidates : • If the observed next template vector is included in the top k candidates (or similar enough with them), we regard it as normal. Template Vector Sequence LSTM Attention Combine sequential and quantitative relationship Count Vector LSTM … Vector sequence Similarity comparison Alarm 2019/8/15 17 WeibinMeng
Evaluation Datasets&Baselines Baselines: Datasets: • LogCluster (ICSE’16) • Invariants Mining (ATC’10) • PCA (SOSP’09) • Deeplog (CCS’17) • BGL: • Generated by the Blue Gene/L supercomputer. • HDFS: • Collected from more than 200 Amazon nodes. 2019/8/15 18 WeibinMeng
Evaluation of LogAnomaly BGLdataset HDFSdataset LogAnomalyachievesthebestperformance 2019/8/15 19 WeibinMeng
Case Study Dataset Anomalydescription Results • The traffic forwarded by this switch dropped from 15:00, Oct 13 • The services provided by this switch were impacted from 22:15, Oct 13 • The switch recovered at 1:16, Oct 14. • Logs form an aggregation switch deployed in a top cloud service provider. • All of LogAnomaly’s alarms were during 15:59 ~ 1:16 • LogAnomalysuccessfully detected anomaliesand generated no false alarm. Oct1315:59 Oct1401:16 Oct1322:15 Oct1008:25 Oct1900:00 Oct1315:00 Sep2500:00 Service recovered Servicewere impacted IM alarmed DeepLog alarmed Traffic dropped LogAnomaly alarmed Beginning End 2019/8/15 20 WeibinMeng
Conclusion 04 • LogAnomaly • Ananomalydetectionsystembasedonunstructuredlogs. 03 • template2Vec 01 • Represent template without losing semantic information. 02 • Template Approximation • Merge templates of new types automatically • Evaluation • Bestresultsonpublic datasets andreal-world switch logs 2019/8/15 21 WeibinMeng
Thanks mwb16@mails.tsinghua.edu.cn 2019/8/15 22 WeibinMeng