
Big data requirement engineering

Explore the aspects of information requirements analysis in big data for improving production system efficiency. Discover new methodologies and tools for data mining to uncover data-information opportunities.



Presentation Transcript


  1. Big data requirement engineering. Jānis Zemnickis, janis.zemnickis@gmail.com. Supervisor: Laila Niedrīte, Dr.sc.comp.

  2. Use cases for big data analysis • Log analytics • Recommendation engines • Market research • Precision medicine • Customer service • Fraud detection

  3. Problems • Currently, most manufacturing companies do not make good use of the data they generate and collect to improve production system efficiency and, in turn, to increase their competitiveness (Dean 2014). • Big data can uncover data-information opportunities. • Implementing a business intelligence (BI) system is a costly, resource-intensive and complex undertaking (Yeoh, W., & Popovič, A., 2016).

  4. Data mining to discover useful information

  5. Review of existing methods/tools • N. Kozmina, L. Niedrite, J. Zemnickis, Information Requirements for Big Data Projects: A Review of State-of-the-Art Approaches. In: Lupeikiene A., Vasilecas O., Dzemyda G. (eds) DB&IS 2018. Springer, Cham, CCIS, vol. 838, pp. 73-89, 2018. • N. Kozmina, L. Niedrite, J. Zemnickis, Perspectives of Information Requirements Analysis in Big Data Projects. In: Volume 315: Databases and Information Systems X, 10.3233/978-1-61499-941-6-109, 2018.

  6. Research conclusion • The review followed the guidelines given by Kitchenham and Charters (Guidelines for Performing Systematic Literature Reviews in Software Engineering, 2007) • The goal was to explore the aspects of information requirements analysis in the context of Big data • 242 papers were found; 26 of them were analyzed in full • Big data RE is usually done by setting goals, creating scenarios or using some other solution-oriented approach • On average, there is a medium ability to generate the information requirements in a Big data project by processing the existing data in a (semi-)automatic way

  7. Results: Is it feasible to generate the information requirements in a Big data project by processing the existing data in a (semi-)automatic way? [Chart; surviving category labels: Not specified, High]

  8. New methodology [Diagram; recoverable labels] • Sources: S1, S2, S3, Database, Database • Inputs: calls, customer on-site consultations, free-text feedback, e-mails, comments • Pictures: social media pictures, ads pictures • Media types: audio, text • Processing steps: Big data analysis (NLP), entity consolidation, attribute grouping algorithm, Big data requirement recognition algorithm • Outputs: named entities and relations, attribute list, new requirements

  9. Big data analysis - NLP • Main technology for unstructured data analysis: NLP • Analyze which tool is best: • CoreNLP from the Stanford group • NLTK, the most widely mentioned NLP library for Python • TextBlob, a user-friendly and intuitive NLTK interface • Gensim, a library for document similarity analysis • spaCy, an industrial-strength NLP library built for performance *https://towardsdatascience.com/5-heroic-tools-for-natural-language-processing-7f3c1f8fc9f0

  10. Named entity repository

  11. Named entity source type examples • External databases • Public web pages (Twitter, Wikipedia, ads) • Public databases (open data – governance, statistics) • Internal sources • Operational databases • E-mails • Calls • Free-text customer feedback

  12. Named entity file type examples • Structured • XML • JSON • DB files • CSV • Unstructured • Text • Sound to text • Picture • Video
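A minimal sketch of how the structured file types on slide 12 (JSON, CSV, XML) might be normalized into one record shape before entity extraction. It is not the authors' implementation: the function names, the field names and the sample values for `color` and `fuel` are assumptions made for illustration; only the plate LE-7398 and the year 2000 come from the slides.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def records_from_json(text):
    """Parse a JSON object or array of objects into a list of dicts."""
    data = json.loads(text)
    return data if isinstance(data, list) else [data]


def records_from_csv(text):
    """Parse CSV text with a header row into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def records_from_xml(text, record_tag):
    """Turn each <record_tag> element into a dict of its child tags."""
    root = ET.fromstring(text)
    return [
        {child.tag: (child.text or "").strip() for child in elem}
        for elem in root.iter(record_tag)
    ]


# Hypothetical sample inputs; attribute values are invented for the sketch.
json_src = '[{"plate": "LE-7398", "color": "red"}]'
csv_src = "plate,year\nLE-7398,2000\n"
xml_src = "<cars><car><plate>LE-7398</plate><fuel>diesel</fuel></car></cars>"

records = (
    records_from_json(json_src)
    + records_from_csv(csv_src)
    + records_from_xml(xml_src, "car")
)
print(records)
```

Every source ends up as a plain list of dictionaries, so later steps do not need to know which file type a record came from. Note that CSV and XML values arrive as strings and would need type conversion in a real pipeline.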

  13. Named entity relation type, examples • Parent • Entity – entity attribute (sub-tag) • Same entity (consolidation) • Two sources describe one entity • Related (business-specific relation) • Collateral contract – asset • Leasing agreement – car
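The three relation types on slide 13 could be modeled as a small data structure, with "same entity" links used to consolidate identifiers that come from different sources. This is a sketch under assumptions: the class names, the identifier format (`source:plate`) and the union-find grouping are illustrative choices, not something the slides specify.

```python
from dataclasses import dataclass
from enum import Enum


class RelationType(Enum):
    PARENT = "parent"          # entity -> entity attribute (sub-tag)
    SAME_ENTITY = "same"       # two sources describe one entity
    RELATED = "related"        # business-specific relation


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    kind: RelationType


# Hypothetical identifiers built from examples on the slides.
relations = [
    Relation("car:LE-7398", "car:LE-7398/color", RelationType.PARENT),
    Relation("ss.lv:LE-7398", "coredb:LE-7398", RelationType.SAME_ENTITY),
    Relation("leasing agreement", "car:LE-7398", RelationType.RELATED),
]


def consolidate(relations):
    """Group identifiers connected by SAME_ENTITY relations (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for r in relations:
        if r.kind is RelationType.SAME_ENTITY:
            parent[find(r.source)] = find(r.target)

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())


print(consolidate(relations))
```

Only `SAME_ENTITY` links merge identifiers; parent and business-specific relations are kept as edges between distinct entities.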

  14. Named entity type, examples • Number • Date • Organization • Person • Identifier
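For the pattern-like entity types on slide 14 (number, date, identifier), a rule-based tagger is enough to illustrate the idea. Organizations and persons generally need a trained NER model from one of the NLP libraries named on slide 9, so they are left out here. The regular expressions, the ISO date format, the plate-style identifier pattern and the sample sentence are all assumptions made for this sketch.

```python
import re

# Illustrative patterns only; order matters, more specific types come first.
PATTERNS = [
    ("Date", re.compile(r"\b\d{4}-\d{2}-\d{2}\b")),
    ("Identifier", re.compile(r"\b[A-Z]{2}-\d{1,4}\b")),
    ("Number", re.compile(r"\b\d+(?:\.\d+)?\b")),
]


def tag_entities(text):
    """Return (type, surface form) pairs in text order.

    A span already claimed by an earlier pattern is not tagged again,
    so the digits inside a date or identifier are not reported as numbers.
    """
    taken = []   # character spans already assigned to a type
    found = []
    for label, pattern in PATTERNS:
        for m in pattern.finditer(text):
            if any(m.start() < end and start < m.end() for start, end in taken):
                continue
            taken.append(m.span())
            found.append((m.start(), label, m.group()))
    return [(label, value) for _, label, value in sorted(found)]


# Hypothetical sentence; only the plate LE-7398 appears in the slides.
text = "Car LE-7398 was inspected on 2018-07-01 and costs 3500 EUR."
print(tag_entities(text))
```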

  15. Big data requirement recognition algorithm • Primary entities should be taken from the organization's core databases • Find additional attributes for existing entities • Find additional entity relationships • A new entity should be validated
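The four steps on slide 15 can be sketched as a single pass over extracted items. This is one possible reading of the bullets, not the authors' algorithm: the function name, the dictionary layout of `core_entities` and `extracted`, and the unknown plate `XX-0000` are hypothetical.

```python
def recognise_requirements(core_entities, extracted):
    """Compare extracted items against entities from the core databases.

    core_entities: {entity_id: {attribute: value}}
    extracted: list of dicts with keys 'entity', 'attrs', 'related', 'source'
    """
    result = {"new_attributes": [], "new_relationships": [], "to_validate": []}
    for item in extracted:
        entity_id = item["entity"]
        if entity_id not in core_entities:
            # New entity: not accepted automatically, queued for validation.
            result["to_validate"].append((entity_id, item["source"]))
            continue
        known = core_entities[entity_id]
        for attr, value in item.get("attrs", {}).items():
            if attr not in known:
                result["new_attributes"].append(
                    (entity_id, attr, value, item["source"])
                )
        for other in item.get("related", []):
            result["new_relationships"].append(
                (entity_id, other, item["source"])
            )
    return result


# Primary entity taken from the (assumed) core database.
core = {"car:LE-7398": {"plate": "LE-7398"}}

extracted = [
    {"entity": "car:LE-7398", "attrs": {"manufacture_year": 2000},
     "related": [], "source": "ss.lv"},
    {"entity": "car:LE-7398", "attrs": {},
     "related": ["person:Karlis Berziņs"],
     "source": "internal organization database"},
    # Hypothetical unknown entity, used to show the validation queue.
    {"entity": "car:XX-0000", "attrs": {}, "related": [],
     "source": "facebook.com"},
]

result = recognise_requirements(core, extracted)
print(result)
```

Unknown entities are only queued, never merged, which mirrors the slide's point that a new entity should be validated before it becomes a requirement.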

  16. Big data requirement recognition algorithm • As a result: • Additional attributes for existing entities • Additional relationships between entities • New entity discovery • New requirement example: • R1: Entity car, LE-7398 • Attr 1: color, attr 2: manufacture year 2000, attr 3: fuel, source: ss.lv • Comment: «Car looks bad», source: facebook.com • Voice record: related entity «Karlis Berziņs» interested in insurance for entity car, LE-7398, source: internal organization database
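The example requirement R1 could be stored as a record that keeps each observation together with its source. The class and field names below are assumptions for illustration. The slide gives no values for color and fuel, so they are left as `None` rather than guessed.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Observation:
    kind: str      # "attribute", "comment" or "voice_record"
    content: dict
    source: str


@dataclass
class Requirement:
    req_id: str
    entity_type: str
    entity_id: str
    observations: list = field(default_factory=list)

    def sources(self):
        """Distinct sources backing this requirement, in first-seen order."""
        return list(dict.fromkeys(o.source for o in self.observations))


r1 = Requirement("R1", "car", "LE-7398", [
    Observation(
        "attribute",
        {"color": None, "manufacture_year": 2000, "fuel": None},
        "ss.lv",
    ),
    Observation("comment", {"text": "Car looks bad"}, "facebook.com"),
    Observation(
        "voice_record",
        {"related_entity": "Karlis Berziņs", "interest": "insurance"},
        "internal organization database",
    ),
])

# Serialize for storage or review; ensure_ascii=False keeps "ņ" readable.
print(json.dumps(asdict(r1), ensure_ascii=False, indent=2))
```

Keeping the source on every observation makes it possible to trace each part of a generated requirement back to where it was found.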

  17. Future work • Comparison of existing NLP algorithms for the current use case • Practical implementation • Big data ecosystem • Big data analysis – NLP • Entity consolidation • Attribute grouping algorithm • Big data requirement recognition algorithm

  18. Thank you!
