
Identifying Data Set for Dissertation

Data set collection is the process of collecting and validating data on variables of interest. It enables us to answer research questions, test hypotheses, and assess outcomes. Data collection is one of the most significant processes in conducting research. Even the best research design in the world is of little use if the required data cannot be collected; completing the dissertation then becomes a difficult task. Data collection is a challenging task that requires systematic planning, patience, hard work, and perseverance to complete the project successfully.

Website: www.tutorsindia.com
Email: info@tutorsindia.com
WhatsApp: 91-8754446690

tutorsindia

Presentation Transcript


  1. IDENTIFYING DATA SET FOR DISSERTATION: An academic presentation by Dr. Nancy Agnes, Head, Technical Operations, Tutors India Group. www.tutorsindia.com Email: info@tutorsindia.com

  2. Today's Discussion. OUTLINE: Abstract; Introduction; Categories of Data Set; Decentralized Data Set Search; Some recent topics in dataset; Conclusion

  3. Abstract: An essential step in validating a proposed system is to evaluate it over an appropriate dataset. Generating value from data requires the ability to find, access, and make sense of datasets. Many efforts are under way to encourage data sharing and reuse, from publishers asking authors to submit data along with their dissertation, to open data portals, data marketplaces, and data communities.

  4. Google recently released a service for identifying datasets, which lets users find data stored in different online repositories through keyword queries. These developments point to an emerging research field, dataset identification or data retrieval, which broadly covers the frameworks, tools, and methods that help match a user's data need against a collection of datasets. [1]

  5. Introduction: Data set collection is the process of collecting and validating data on variables of interest. It enables us to answer research questions, test hypotheses, and assess outcomes. Data collection is one of the most significant processes in conducting research.

  6. Even the best research design in the world is of little use if the required data cannot be collected; completing the dissertation then becomes a difficult task. Data collection is a challenging task that requires systematic planning, patience, hard work, and perseverance to complete the project successfully. [2]

  7. Categories of Data Set: Data are classified into two major categories: qualitative and quantitative. I. QUALITATIVE DATA: Qualitative data are usually descriptive and non-numerical in nature. Qualitative data collection methods play a significant role in impact evaluation by providing data that help to understand the processes behind observed outcomes.

  8. Moreover, these methods can improve the quality of survey-based quantitative evaluations by helping to generate evaluation hypotheses and by strengthening the design of survey questionnaires. They also help in expanding or clarifying the findings of quantitative evaluation. II. QUANTITATIVE DATA: Quantitative data are numerical and can be computed mathematically. Quantitative methods use various scales of measurement: nominal, ordinal, interval, and ratio.
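
The four scales named in slide 8 differ in which summaries are meaningful. The sketch below is only an illustration: the variable names and survey values are hypothetical and do not come from the presentation.

```python
from statistics import mean, median, mode

# Hypothetical survey responses, one list per scale of measurement.
department = ["physics", "biology", "physics", "history"]   # nominal: labels only
satisfaction = ["low", "high", "medium", "high", "high"]    # ordinal: ordered labels
temperature_c = [18.0, 21.5, 19.5]                          # interval: no true zero
study_hours = [2.0, 4.0, 8.0]                               # ratio: true zero

# Nominal data: counting and the mode are the meaningful summaries.
most_common_department = mode(department)

# Ordinal data: map labels to ranks so that the median becomes meaningful.
rank = {"low": 1, "medium": 2, "high": 3}
label = {value: key for key, value in rank.items()}
median_satisfaction = label[median(rank[s] for s in satisfaction)]

# Interval data: differences and means are meaningful, ratios are not.
mean_temperature = mean(temperature_c)

# Ratio data: ratios are meaningful because zero means "none".
longest_to_shortest = max(study_hours) / min(study_hours)

print(most_common_department, median_satisfaction, longest_to_shortest)
```

Each scale permits the summaries of the scales before it plus one more, which is why the ratio scale is the most permissive of the four.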

  9. These approaches are inexpensive to implement and, because they are standardized, their results can be easily compared. However, they are limited in their ability to explore and explain similarities and differences. The outcomes obtained from these methods can be summarized, compared, and generalized very easily. [3]

  10. Types of Data Set Searches

  11. Source: https://link.springer.com/article/10.1007/s00778-019-00564-x 1. HIDDEN SEARCH: Hidden or deep search refers to content that can only be found behind web forms, usually written in HTML. There are two main approaches for finding data on the deep web. The first is the traditional method of building vertical search engines, where semantic mappings are created between each website and a centralized third party customized to a specific domain.

  12. Structured queries are created on the third party and redirected to the web forms using the mappings. A second approach tries to produce the HTML result pages that emerge from web form searches. Google has proposed an approach for finding data in deep web content by automatically approximating inputs to several million HTML forms, written in multiple languages and spanning hundreds of domains, and adding the resulting HTML pages to its search engine index. When users click on a search result, they are directed to the result of the newly submitted form.

  13. 2. ENTITY-CENTRIC SEARCH: In this type of search, information is organized and accessed through entities of interest, together with their relationships and attributes. 3. TABULAR SEARCH: In tabular search, users access data stored in one or more tables. The main aim is to identify specific data, such as attribute names, or to extend tables with new attributes. [4]
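
The two aims of tabular search mentioned in slide 13, finding tables by attribute name and extending a table with a new attribute, can be sketched in a few lines. This is a toy illustration, not a method from the cited article: the table names, column names, and values are hypothetical, and tables are modeled simply as lists of dictionaries.

```python
def find_tables_with_attribute(tables, attribute):
    """Return the names of tables whose first row has the given column."""
    return [name for name, rows in tables.items() if rows and attribute in rows[0]]


def extend_table(base, other, key, new_attribute):
    """Return a copy of `base` with `new_attribute` looked up in `other` by `key`.

    Rows without a match get None for the new attribute.
    """
    lookup = {row[key]: row[new_attribute] for row in other}
    return [{**row, new_attribute: lookup.get(row[key])} for row in base]


# Hypothetical tables.
tables = {
    "cities": [
        {"city": "A", "country": "X"},
        {"city": "B", "country": "Y"},
    ],
    "populations": [
        {"city": "A", "population": 100},
    ],
}

matching = find_tables_with_attribute(tables, "population")
extended = extend_table(tables["cities"], tables["populations"], "city", "population")
print(matching)
print(extended)
```

Real tabular search systems must also cope with mismatched column names and values, which this sketch ignores by assuming an exact shared key.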

  14. Decentralized Data Set Search. 4. GOOGLE DATASET SEARCH: Google proposed a vertical search engine created to identify datasets on the web. This system utilizes schema.org and DCAT. Building on the Google web crawl, it crawls the web for all datasets described using schema.org or DCAT and gathers the associated metadata.

  15. The system additionally links the metadata to other sources, identifies replicas, and generates an index of enhanced metadata for each dataset. The metadata is reconciled with Google's knowledge graph, and the search capabilities are built on top of this metadata. The indexed datasets can be identified through keywords and CQL expressions. 5. DOMAIN-SPECIFIC SEARCH: In this type of search, services focus on datasets from specific domains. They propose modified metadata schemas to describe the datasets, and crawlers are implemented to discover them automatically. [5]
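
To make the idea of metadata-driven dataset search concrete, the sketch below builds schema.org `Dataset` descriptions as JSON-LD and runs keyword queries over a small index of them. It is not Google's implementation: the helper names, the example records, the `example.org` URLs, and the choice to require every query keyword to match are all assumptions made for illustration, and DCAT and CQL are not covered.

```python
import json
import re
from collections import defaultdict


def dataset_jsonld(name, description, url, keywords):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
    }


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map each token to the positions of the records that mention it."""
    index = defaultdict(set)
    for position, record in enumerate(records):
        text = " ".join([record["name"], record["description"], *record["keywords"]])
        for token in tokenize(text):
            index[token].add(position)
    return index


def search(index, records, query):
    """Return names of records matching every keyword (an assumed policy)."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return [records[position]["name"] for position in sorted(hits)]


# Hypothetical dataset descriptions.
records = [
    dataset_jsonld("Survey responses", "Student survey on study habits",
                   "https://example.org/datasets/survey", ["survey", "education"]),
    dataset_jsonld("Rainfall measurements", "Daily rainfall readings",
                   "https://example.org/datasets/rainfall", ["weather", "rainfall"]),
]
index = build_index(records)

print(json.dumps(records[0], indent=2))
print(search(index, records, "survey education"))
```

A publisher would embed such a JSON-LD description in the dataset's web page so that a crawler can collect it; the in-memory index here stands in for the enhanced metadata index described in the slides.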

  16. SOME RECENT TOPICS IN DATASET: 1. Dimensionality reduction approaches for large-scale data. 2. Training/inference in noisy environments and with incomplete data. 3. Handling uncertainty in big data processing. 4. Anomaly detection in very large scale systems. 5. Scalable privacy preservation on big data. 6. Lightweight big data analytics as a service. 7. Approaches to make models learn from fewer data samples.

  17. PROBLEM OF IDENTITY: The identity of a database row is expressed by its primary key. Thus each tuple in a database table represents a unique record. At the object level this is not always the case. Two different objects that have the same feature values can be recognized as equal by the equals() function. Likewise, objects can be compared using the == operator.
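
The `equals()` function and `==` operator in slide 17 read like Java, where `equals()` compares values and `==` compares object references. The sketch below shows the analogous distinction in Python, where `==` compares values (through `__eq__`) and `is` compares identity; the `Author` class and its field values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Author:
    # A dataclass generates __eq__, so == compares the field values.
    name: str
    affiliation: str


a = Author("N. Example", "Example University")
b = Author("N. Example", "Example University")
c = a  # a second name for the same object

same_values = a == b        # value equality: the fields match
same_object = a is b        # identity: two separate objects
alias_is_same = a is c      # identity: one object, two names

# Database-style identity: an explicit primary key keeps the records
# apart even though their feature values are equal.
rows = {1: a, 2: b}

print(same_values, same_object, alias_is_same, len(rows))
```

This is why deduplicating objects by value can merge records that a database would keep separate under different primary keys.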

  18. Conclusion: There are several ways to extend the system's functionality. Academic search engines extract metadata in the form of a list of authors, publication year, venue, and so on. The first step would be to provide this data for each paper in the system. Beyond this, it would be useful to apply this information to datasets: to find the authors and venues that have used a dataset, and to determine how its usage has changed over time.

  19. Users of the system can also help improve the quality of search by giving feedback on the extracted links, flagging errors, and pointing out datasets within papers that the classifier cannot identify. With such participation, the system's accuracy, recall, and coverage can be enhanced further. Several future improvements must be carried out to get better results. One of them is to study global data collection patterns once the system is fully deployed and used by many users in real-life situations. This can be achieved mainly by applying machine learning algorithms, such as clustering algorithms like k-means, to split the users into several clusters. [6]
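
As an illustration of splitting users into clusters with k-means, here is a minimal pure-Python sketch of Lloyd's algorithm. The user features (searches and downloads per week), the sample values, and the choice of the first k points as starting centroids are assumptions made for a deterministic example; they are not taken from the presentation, and a production system would more likely use a library implementation.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster points with Lloyd's algorithm.

    The first k points serve as initial centroids, which keeps the
    example deterministic (an assumption, not a recommended strategy).
    """
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: the centroids stopped moving
        centroids = new_centroids
    return assignment, centroids


# Hypothetical users described by (searches per week, downloads per week).
users = [(1, 0), (2, 1), (1, 1), (20, 15), (22, 14), (21, 16)]
assignment, centroids = kmeans(users, k=2)
print(assignment)
print(centroids)
```

With this sample data the occasional users and the heavy users end up in separate clusters, which is the kind of grouping the slide suggests studying once real usage data exists.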

  20. CONTACT US. UNITED KINGDOM: +44-1143520021. INDIA: +91-4448137070. EMAIL: info@tutorsindia.com
