A Snapshot of Public Web Services
Prof: Dr. Jainguo Lu
03-60-569
Presenting Group: Aktar-uz-zaman, Mohit Sud
Objective
• Find out the number of public web services
• Complexity
• Composability
• Meaningful documentation
• Future research trends
Introduction
• Research directions in web services often conflict because they rest on differing views of:
- the current status of web services
- their future evolution
• To judge the relative relevance of current research, the authors took a snapshot of public web services, describe the results of the study, and discuss their implications.
• For example, will the primary applications be:
- on the public web, or
- intra-corporate?
How
• Describe how the authors crawled web services from a large number of registries, removed duplicates, and validated the services.
• Describe the variety of automated and manual analyses performed on the resulting web services.
• Describe the implications and lessons of these analyses for research.
Overview of Current Research Directions in Web Services
Web services are software services distributed over the internet. Standards formalize web services at several levels:
• SOAP (Simple Object Access Protocol) for message communication
• WSDL (Web Services Description Language) for description
• BPEL4WS (Business Process Execution Language for Web Services) for composition
• OWL-S (Web Ontology Language for Services) for describing web services in an unambiguous, computer-interpretable form
• UDDI (Universal Description, Discovery and Integration) for publishing and discovering web services
Discovery and Composition
Two approaches:
1. Promote the syntax of WSDL and use BPEL4WS for composition.
Underlying problem: search is mostly keyword search over English text descriptions, which are not machine-interpretable.
Research possibility: extract a higher-level description from WSDL.
2. Use a language such as OWL-S to add more semantics to web services, so that their meaning and functionality are unambiguous and machine-interpretable.
Relevant Approach
Which approach is relevant depends on what type of application web services will support in the near future. Two ideas:
• Intra-corporate scenario: services are annotated by the service provider using a consistent ontology.
• Public web: a consistent ontology is a dream and less feasible.
Snapshot of Current Web Services
What public web services are available?
• UDDI registries are not a good source:
- a large portion of entries are “hello-world” style
- many entries do not have a valid WSDL file URL
Therefore, the authors:
• First crawled the registries
• Processed the collected data to remove invalid and duplicate entries
• Analyzed the services' text descriptions according to their properties and functionalities
Crawling the Registry
• A crawler, also known as a "spider" or a "bot", is a program that visits web sites and reads their pages and other information in order to create entries for a search engine index.
• Crawlers are typically programmed to visit sites that have been submitted by their owners as new or updated. Entire sites or specific pages can be selectively visited and indexed. Crawlers apparently gained the name because they crawl through a site a page at a time, following the links to other pages on the site until all pages have been read.
• The crawler for the AltaVista search engine and its web site is called Scooter.
Crawling the Registry (cont.)
The following registries were crawled to collect the information:
• www.bindingpoint.com
• www.salcentral.com
• www.xmethod.com
• www.webservicex.com
• www.webservicelist.com
Crawling the Registry (cont.)
The crawler found:
• 2432 registry entries in total
After filtering out invalid entries:
• 1544 entries remained
Crawling the Registry (cont.)
The following information was saved in a local database:
• Service name
• Provider
• Text description
• Content of the WSDL file
Crawling the Registry (cont.)
Invalid entries:
• The WSDL is not well-formed or does not conform to the WSDL standard
• Duplicate registrations
Removing invalid entries:
• Parsed every fetched WSDL file to check that it is a valid XML document
• Performed a simple check against the WSDL standard by testing for the existence of several necessary tags
Removing duplicates:
• Used the combination of service name and provider name as a key
Next: automated clustering to classify the collected web services in terms of their functionalities.
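The validation and de-duplication steps above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' code: the slides do not say which tags were treated as necessary, so `REQUIRED_TAGS` is a guess, the entry fields `name`, `provider` and `wsdl` are hypothetical, and lower-casing the key before comparison is an added choice.

```python
import xml.etree.ElementTree as ET

# WSDL 1.1 namespace.
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
# Assumption: the slides do not list the "necessary tags"; these are illustrative.
REQUIRED_TAGS = ("portType", "binding", "service")


def is_valid_wsdl(wsdl_text):
    """Return True if the text is well-formed XML with a WSDL root and the required tags."""
    try:
        root = ET.fromstring(wsdl_text)
    except ET.ParseError:
        return False  # not well-formed XML
    if root.tag != f"{{{WSDL_NS}}}definitions":
        return False  # root element is not wsdl:definitions
    # Each required tag must appear as a direct child of the root.
    return all(root.find(f"{{{WSDL_NS}}}{tag}") is not None for tag in REQUIRED_TAGS)


def clean_entries(entries):
    """Drop invalid entries, then keep the first entry per (name, provider) key."""
    seen = set()
    kept = []
    for entry in entries:
        if not is_valid_wsdl(entry["wsdl"]):
            continue
        # Assumption: names are compared case-insensitively and without outer spaces.
        key = (entry["name"].strip().lower(), entry["provider"].strip().lower())
        if key in seen:
            continue
        seen.add(key)
        kept.append(entry)
    return kept
```

A tag-existence check like this is cheaper than full schema validation, which matches the "simple check" the slide describes, but it will accept some files that a strict validator would reject.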
Clustering the Services (cont.)
Why clustering?
• It would help the retrieval of services
• The hypothesis was that automatically generated clusters would be able to suggest similar services
How? Text-based clustering techniques applied to three parts of the service description:
• The text description given when the service is registered
• The documentation field of the service in its WSDL file
• The documentation field of the individual operations of the service in its WSDL file
Automated Clustering (cont.)
Two techniques used:
• Hierarchical Agglomerative Clustering (HAC) as the algorithm
• Jaccard similarity as the distance measure
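A minimal Python sketch of HAC over Jaccard similarity may help make the combination concrete. The slides do not state the linkage rule, the tokenization, or the stopping criterion, so average linkage, whitespace tokens, and a similarity `threshold` are all assumptions here, and the function names are hypothetical.

```python
def tokenize(text):
    """Lower-case the text and split on whitespace into a set of tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0  # two empty descriptions carry no evidence of similarity
    return len(a & b) / len(a | b)


def hac(docs, threshold):
    """Merge the two most similar clusters until no pair reaches `threshold`.

    Cluster similarity is the average pairwise Jaccard similarity
    (average linkage, an assumption). Returns clusters as sorted lists
    of document indices.
    """
    tokens = [tokenize(d) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def cluster_sim(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(jaccard(tokens[i], tokens[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = cluster_sim(clusters[x], clusters[y])
                if sim > best_sim:
                    best_sim, best_pair = sim, (x, y)
        if best_sim < threshold:
            break  # no remaining pair is similar enough to merge
        x, y = best_pair
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters
```

In this sketch a service with an empty description has zero similarity to every other service, so it never merges into a cluster.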
Noise in Clustering
• Noise arises when a service does not have enough information to differentiate itself from others during clustering
• Many services do not have any documentation in their WSDL files
Complexity of the Web Service
How many individual operations are involved in an individual web service? (640 services)
Complexity of the Web Service
Manual analysis:
• 77% of services have fewer than 5 operations
• 36% have only one operation
• Most operations are related to each other
• No more than two operations are compatible among the services
Result and Motivation
• At the current stage, there is no large number of public web services that are both very complicated and have the potential to be composed with other services.
• The research motivation for composing complicated web services comes from intra-corporate scenarios.
Complexity of Service Composition
Quality of WSDL service descriptions:
• Are services ready to use or compose?
• Are service providers seriously using WSDL files as the way to convey the correct interpretation to the developers who will use them?
Analysis of Text Description Length for 640 Services
• More than 80% have fewer than 50 words
• More than 52% have fewer than 20 words
Analysis of Operation Documentation Length
• 80% have fewer than 10 words
• 50% have no documentation at all
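Length statistics of this kind reduce to counting words per description and reporting the share below a cutoff. The following Python sketch assumes whitespace-delimited words, which the slides do not specify, and `share_below` is a hypothetical helper name.

```python
def share_below(texts, limit):
    """Fraction of texts whose whitespace-delimited word count is below `limit`."""
    if not texts:
        return 0.0
    counts = [len(t.split()) for t in texts]
    return sum(1 for c in counts if c < limit) / len(counts)
```

Calling `share_below(docs, 1)` gives the share of entries with no documentation at all, since an empty string splits into zero words.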
Population, Distribution and Structure
• 67% of registered web services are not valid, according to 6 months of data collection in another survey
Population, Distribution and Structure
• 63% of web services are hosted in the US
SOAP Message Size
• SOAP message size = HTTP header + essential tags + payload tags
• SOAP messages are larger than current web objects
• 92% of SOAP messages are < 2 KB; only 45% of existing web objects are < 2 KB
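The size decomposition above can be illustrated with a small Python sketch that measures each component in bytes. It assumes a SOAP 1.1 envelope with only a `Body` element as the "essential tags"; the survey's exact accounting is not given in the slides, and `soap_message_size` is a hypothetical helper.

```python
# SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def soap_message_size(http_header, payload_xml):
    """Return (total_bytes, parts) for HTTP header + essential tags + payload."""
    # Assumption: the essential tags are just the Envelope and Body wrappers.
    envelope_open = f'<soap:Envelope xmlns:soap="{SOAP_ENV_NS}"><soap:Body>'
    envelope_close = "</soap:Body></soap:Envelope>"
    parts = {
        "http_header": len(http_header.encode("utf-8")),
        "essential_tags": len((envelope_open + envelope_close).encode("utf-8")),
        "payload": len(payload_xml.encode("utf-8")),
    }
    return sum(parts.values()), parts
```

For a short request, the header and wrapper tags can outweigh the payload itself, which is one way to see why fixed overhead matters when most messages are small.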
Analysis
• Since WSDL and registration information are the only sources from which a user can understand the functionality of a service, it is questionable whether currently available public web services are ready for composition.
• Type: most public web services are simply data sources that use SOAP.
Analysis
• Retrieval: to meet the quality and performance challenges of retrieval and discovery as web services evolve, registries need more advanced systems that structure their entries and make retrieval and discovery easier.
• Composition: there are very few ways of composing web services because of the lack of services and of relations among them. Given proper XML descriptions in WSDL files, composition is not a pressing problem.
Conclusion
• The hope is that this analysis provides useful information about fruitful future research directions in web service technology, including modeling, specification, discovery, composition, and verification.
• There is more opportunity to research and conduct similar studies on intra-corporate web services, where machine-interpretable annotation may well be feasible for more complex composition and conversion frameworks.
Reference
1. Jianchun Fan and Subbarao Kambhampati, Department of Computer Science and Engineering, Arizona State University
2. Su Myeon Kim, KAIST EECS Dept., Korea, and Marcel-Catalin Rosu, IBM T.J. Watson Research Center, USA