N-CSSC: Net-Centric Software & Systems Consortium Planning Meeting, February 20 - 22, 2008
SILC: Service Identification in Legacy Codes
Gopal Gupta, Department of Computer Science, The University of Texas at Dallas
Problem Description
• What?
-- Build complete infrastructure for identification and utilization of (web-)services
• Why?
-- Service-oriented computing is gaining acceptance
-- Industry is moving towards Service-Oriented Architecture (SOA)
-- Software is being made available as services on the Web
-- Need infrastructure for (web-)services
(Web) service
• An executable program, accessible (over the web), that effects some action or change in the world (i.e., causes some side-effect)
• Example: Flight Reservation Service
[Figure: Flight Reservation Service. Labels: CustomerName, FlightNumber, StartDate, StartCity, DestinationCity, ReservationCode; side-effect: creates Flight Reservation.]
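A minimal sketch of how the flight reservation example could be written down as data. It assumes one reading of the flattened diagram: ReservationCode is the output and the other five labels are inputs. The `ServiceSignature` class and `can_invoke` helper are hypothetical names introduced for illustration; this is not USDL or WSDL syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSignature:
    """Hypothetical record of what a service consumes, produces, and changes."""
    name: str
    inputs: frozenset
    outputs: frozenset
    side_effect: str


# Assumed reading of the slide: five inputs, one output, one side-effect.
flight_reservation = ServiceSignature(
    name="FlightReservationService",
    inputs=frozenset({"CustomerName", "FlightNumber", "StartDate",
                      "StartCity", "DestinationCity"}),
    outputs=frozenset({"ReservationCode"}),
    side_effect="creates Flight Reservation",
)


def can_invoke(service, available):
    """True when every input the service needs is among the available items."""
    return service.inputs <= frozenset(available)
```

A purely name-based record like this is syntactic; it says nothing about what the terms mean, which is the gap a semantic description is meant to close.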
Infrastructure needed
• Mark-up to describe web services (currently WSDL) --- should be semantics-based (our work: USDL)
• Tools to automatically carve out services from existing legacy codes --- focus of this project
• Engines for automated discovery & composition --- much work by us and others
Web Service Discovery & Composition
[Figure: a web service request is matched against a directory of registered services. Discovery: matching service found. Composition: services S1, S2, S3 composed into the requested service.]
• We need semantic descriptions of Web services
Our work: USDL
• Currently services are described using WSDL
-- purely syntactic in nature
• USDL: Universal Service-Semantics Description Language
-- semantic descriptions of Web services
-- provides a framework to model complex real-world concepts (uses Prop. Logic)
-- uses OWL WordNet Ontology to provide a common representation of real-world concepts
[Callouts: OWL is a language used to build ontologies; an ontology is a representation of the semantics of terms and their relationships.]
Part 2: Service Identification in Legacy Codes
• Aim: (Semi-)automatically generate USDL descriptions from legacy codes
• Analyze a large software system and identify its independent, meaningful service components
• Automatically generate USDL descriptions of the identified components
• Current approaches: largely manual
Approach
• Use a combination of:
-- data-flow analysis of the code, and
-- natural language analysis of the documentation
to infer service semantics
• Focus on SAP code bases
• Initially focus on simple database-type procedures
-- Examine data flow, control flow, and documentation to infer what the procedure accomplishes
Approach (continued)
• Next, focus on full ABAP to infer service semantics:
-- Use knowledge of ABAP features and built-ins
-- Use knowledge of the object hierarchy
-- Analysis of documentation
• Intuition: strongly connected components in call graphs and data-flow graphs likely constitute an independent service
• Basis of identification is in
-- the code’s data-flow analysis, and
-- the documentation’s natural language analysis
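The intuition about strongly connected components can be sketched as follows. This is an illustration only, not the SILC tool: it runs Tarjan's algorithm on a small call graph and keeps components with at least two procedures as candidate services. The procedure names, the `candidate_services` helper, and the `min_size` threshold are all assumptions made for the example.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. `graph` maps a node to an iterable of its successors."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # `node` is the root of a component: pop its members off the stack.
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)
    for node in sorted(nodes):
        if node not in index:
            visit(node)
    return components


def candidate_services(call_graph, min_size=2):
    """Keep only components where procedures call each other cyclically."""
    return [c for c in strongly_connected_components(call_graph)
            if len(c) >= min_size]


# Hypothetical call graph: caller -> callees.
call_graph = {
    "create_order": ["check_stock", "price_order"],
    "check_stock": ["reserve_stock"],
    "reserve_stock": ["check_stock"],
    "price_order": ["apply_discount"],
    "apply_discount": ["price_order"],
    "print_report": ["create_order"],
}
candidates = candidate_services(call_graph)
```

Here the stock-handling pair and the pricing pair each form a component, while `create_order` and `print_report` stay outside. A real analysis would also need the data-flow graph and the documentation, as the slide notes. The recursive form is fine for a sketch, but very deep call chains would need an iterative version.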
Part 3: Service Discovery & Composition
• Carved-out services are put in a repository, and discovery/composition is used for building applications
[Figure: a service request is matched against the repository of identified services. Discovery: matching service found. Composition: services S1, S2, S3 composed into the requested service.]
Discovery and Composition
• Discovery Problem: Given a repository of (Web) services, and a query with requirements of the requested service, find a service from the repository that matches these requirements.
• Composition Problem: Given a repository of (Web) services, and a query with requirements of the requested service, find a set of services that can be put together in correct order of execution to obtain the desired service.
[Callout spanning both bullets: One Problem]
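The two problem statements can be made concrete with a small sketch. It assumes a purely syntactic model in which a service is just a set of input names and a set of output names, and it uses greedy forward chaining for composition; the actual engines work on semantic descriptions, so this is a simplification. `Service`, `discover`, `compose`, and the `LookupFlight` service are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    inputs: frozenset
    outputs: frozenset


def discover(repository, query_inputs, query_outputs):
    """Discovery: return one service that satisfies the query alone, or None."""
    have, want = frozenset(query_inputs), frozenset(query_outputs)
    for service in repository:
        if service.inputs <= have and want <= service.outputs:
            return service
    return None


def compose(repository, query_inputs, query_outputs):
    """Composition: return services in an executable order, or None.

    Greedy forward chaining: repeatedly run any service whose inputs are
    already known and that produces something new. The plan is not pruned,
    so it may contain services the query did not strictly need.
    """
    known, want = set(query_inputs), set(query_outputs)
    remaining, plan = list(repository), []
    while not want <= known:
        step = next((s for s in remaining
                     if s.inputs <= known and not s.outputs <= known), None)
        if step is None:
            return None  # nothing runnable adds new information
        plan.append(step)
        known |= step.outputs
        remaining.remove(step)
    return plan


# Hypothetical repository for illustration.
repository = [
    Service("ReserveFlight",
            frozenset({"CustomerName", "FlightNumber"}),
            frozenset({"ReservationCode"})),
    Service("LookupFlight",
            frozenset({"StartCity", "DestinationCity", "StartDate"}),
            frozenset({"FlightNumber"})),
]
query_in = {"CustomerName", "StartCity", "DestinationCity", "StartDate"}
query_out = {"ReservationCode"}

single = discover(repository, query_in, query_out)
plan = compose(repository, query_in, query_out)
```

In this example no single service answers the query, so `discover` finds nothing, while `compose` orders `LookupFlight` before `ReserveFlight`. Discovery is the special case of composition in which the plan has length one, which is one way to read the "One Problem" callout.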
Representation of Composite Service
• Composite service as a directed acyclic graph (DAG)
• Composite service description = workflow
-- Sequential composition
-- Non-sequential composition
-- Conditional service composition
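A minimal sketch of a composite service held as a DAG, using Python's standard `graphlib` module (Python 3.9+) to derive an execution order. The service names and their dependencies are illustrative assumptions, not taken from any diagram in the deck, and conditional branches are not modeled here.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each service maps to the services that must finish first.
workflow = {
    "ReserveFlight": set(),
    "ReserveHotel": set(),
    "ProcessVisa": {"ReserveFlight", "ReserveHotel"},
    "ReserveCar": {"ProcessVisa"},
}


def execution_stages(dag):
    """Group services into stages; services within a stage can run in parallel."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


stages = execution_stages(workflow)
```

A sequential composition is the case where every stage holds exactly one service; a non-sequential composition has at least one stage with several independent services.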
Non-Sequential Conditional Composition Example
[Figure: SERVICE TO MAKE INTERNATIONAL TRAVEL ARRANGEMENTS. Nodes: Query Inputs, ReserveFlight, ReserveHotel, ProcessVisa, Visa Approved?, ConfirmFlight, ConfirmHotel, ReserveCar, CancelFlight, CancelHotel, Query-Outputs. The "Visa Approved?" decision has YES and NO branches.]
Putting it all together
• Given a legacy system, carve services out of it and place them in a repository
• Services are described using USDL
• Applications/workflows can be developed automatically via the composition engine
• Provides a problem-solving framework
• Initial workflow can be manually refined
Application: Automatic Workflow Generation in Bioinformatics
• Use MyGrid Repository of Bioinformatics & eScience Services
• Input: Gene Sequence
• Output: Evolution Tree
[Figure: PHYLOGENETIC INFERENCE TASK (Sequential Composition). Services: BLAST, CLUSTAL, PAUP, BlastNexus, NexusClustal, ClustalNexus, NexusPaup.]
Application: Automatic Workflow Generation in Bioinformatics
[Figure: PHYLOGENETIC INFERENCE TASK (Non-Sequential Composition). Nodes: Query Inputs (GeneInput), Create-MobyData, MIPSBlastBetterE13, Extract-BestHit, MOBYSHoundGetGenBank-GeneSequence, Extract-Accession, QueryOutputs (GeneIdentity, AGI, AccessionNum).]
Application: Automatic Workflow Generation in Bioinformatics
[Figure: PHYLOGENETIC INFERENCE TASK (Non-Sequential Conditional Composition). Labels, in extraction order: MEGA BLAST, Format(Aligned SequenceSet)?, PAUP, Query Inputs, CLUSTAL, BLASTNexus, Query-Outputs, PHYLIP, Nexus.]
Industry Member Benefits
• The company can convert its legacy enterprise code into a collection of services.
• The company can unlock the value in its monolithic enterprise code by moving towards SOA through automated means.
• The company can put services on the web, a potential source of additional revenue.
Proposed Work and Plan
• Deliverables:
-- Refined USDL design
-- Refined discovery & composition engines
-- SILC tool set
• Budget:
-- Project scientist/post doc: $75K
-- Faculty time: $16K
-- Travel: $2K (conference)
-- Total cost per year: $93K
PI’s prior commercialization experience
• Developed several industrial-strength software systems (publicly distributed + in active use at UTD)
• Two software product development companies spun out of past research
-- 1st funded by multiple SBIR grants (currently in product marketing stage)
-- 2nd spun out in Dec ’07 from UTD (Interoperate LLC)
   - software for interoperability (stealing market share)
   - Borland a partner for GUI testing area
   - software in product marketing stage
• Both products solve problems for which no industry solution exists
-- multiple earlier attempts have failed