AHRT: The Automated Human Resources Tool
By Roi Ceren and Muthukumaran Chandrasekaran
Outline
• Problem Domain
• Background
  • Web Services
  • Ontologies
• Approach
  • Web Services Architecture
  • Web Services
  • Parsing
  • Ontologies
• AHRT Demo!
Problem Domain
• Many companies use existing web-based systems such as Taleo as their job application interface
• Some systems let applicants upload a resume and parse it to populate the application fields automatically
• However, these systems populate the fields inaccurately and sometimes require extensive user correction
• They often store the information only in flat files or databases
Our Objective
• Effective parsing rules for automating data collection
• Ontologies for knowledge representation
• Exposure via RESTful web services for platform independence
Background: Web Services
• Web services
  • Applications can be published as services on a web server, such as Apache
  • A container or framework, such as Tomcat or Axis, can be used to host and execute these applications
  • Standardized data formats on top of HTTP, such as JSON, can be used to further promote uniform communication
  • We use encoded form data
• In this way, disparate platforms can interoperate
• To exemplify this, we use REST servers on different ports
  • PHP server handles parsing
  • JAX-RS server handles the ontology and instances
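As a rough illustration of the "encoded form data" point, the sketch below round-trips a few resume fields through URL form encoding using Python's standard library. The field names and values are hypothetical; the slides do not list the fields AHRT actually sends.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical resume fields (assumed for illustration only).
resume_fields = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "skills": "Java, PHP & OWL",
}

# Client side: encode the fields as an application/x-www-form-urlencoded body.
body = urlencode(resume_fields)

# Server side: any platform can decode the same body back into fields.
# parse_qs returns a list of values per key; each key appears once here.
decoded = {key: values[0] for key, values in parse_qs(body).items()}
```

Because the body is plain text with reserved characters escaped, a PHP server and a Java server can both read it without sharing any code.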
Background: Ontologies
• Formal representation of knowledge as
  • A set of concepts in a domain, and
  • The relationships between them
• Advantages of using ontologies:
  • More expressive and searchable
  • Can be visually examined
  • Relationships can be expressed between attributes
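A minimal sketch of "concepts plus relationships", assuming a toy resume domain. The concept and relationship names are invented for illustration and are not taken from the AHRT ontology.

```python
# Concepts (classes) in a hypothetical resume domain.
concepts = {"Applicant", "Resume", "Skill"}

# Relationships between concepts as (subject, predicate, object) triples.
relationships = [
    ("Applicant", "hasResume", "Resume"),
    ("Resume", "listsSkill", "Skill"),
]


def related_to(concept):
    """Return every (predicate, object) pair that starts at the given concept."""
    return [(pred, obj) for subj, pred, obj in relationships if subj == concept]
```

Storing relationships explicitly is what makes the data searchable by relationship, for example asking what a Resume links to with `related_to("Resume")`.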
Approach (Web Service Architecture)
• Expose the functionality of our program to the world
  • Text categorization algorithm
  • Getters and setters for resume instances
• Publish access to these functions via web services
  • REST with Java (JAX-RS) using Jersey: for the persistence layer
  • PHP RESTful web server: for parsing
• Build a web interface that uses these web services
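The idea of publishing getters and setters as REST operations can be sketched as a small dispatcher. This is not the Jersey code from the project; the `/resumes/<id>` path, the in-memory dictionary, and the status codes chosen are all assumptions for illustration.

```python
# In-memory stand-in for the persistence layer (assumption, not AHRT's store).
resumes = {}


def handle(method, path, fields=None):
    """Dispatch a request to a getter or setter; returns (status, body)."""
    parts = path.strip("/").split("/")
    if len(parts) != 2 or parts[0] != "resumes":
        return 404, {}
    resume_id = parts[1]
    if method == "PUT":
        # Setter: create or replace the resume instance.
        resumes[resume_id] = dict(fields or {})
        return 200, resumes[resume_id]
    if method == "GET":
        # Getter: return the stored instance if it exists.
        if resume_id in resumes:
            return 200, resumes[resume_id]
        return 404, {}
    return 405, {}
```

Mapping the setter to PUT and the getter to GET keeps the interface uniform, so a web front end only needs HTTP to use it.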
Approach (Web Service Architecture)
[Diagram: communication protocol between the browser and the AHRT system. The browser sends the resume to the PHP server on port 80, receives the processed resume, and sends the edited resume to the JAX-RS server on port 8080.]
• Users interact with the web interface to categorize their resume
  • User uploads their resume to the PHP REST server
  • Server returns the parsed resume in a form
  • User may alter or add to the categorized resume
  • Server stores the resume as an instance of its resume ontology using the JAX-RS server
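The two-step exchange above might look like the following from the client side. The sketch only builds the requests and does not send them. The host, the `/parse` and `/resumes` paths, and the field names are hypothetical; only the ports (80 and 8080) come from the slides.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoints; only the port numbers are taken from the slides.
PARSER_URL = "http://localhost:80/parse"
STORE_URL = "http://localhost:8080/resumes"


def build_post(url, fields):
    """Build (but do not send) a form-encoded POST request."""
    data = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=data, headers=headers, method="POST")


# Step 1: the raw resume goes to the parsing server.
upload = build_post(PARSER_URL, {"resume": "Jane Doe"})

# Step 2: after the user reviews the parsed form, the edited fields
# go to the server that stores ontology instances.
store = build_post(STORE_URL, {"name": "Jane Doe", "skills": "Java"})
```

Passing either request to `urllib.request.urlopen` would send it, assuming servers were listening at those addresses.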
Approach (Web Services)
[Diagram: communication protocol. The browser sends an OWL file request to the JAX-RS server in the AHRT system and receives the OWL file.]
• Admins can then access the ontology file of the entire database
• Note: for ease of demo, uploads automatically navigate to the download page
Approach (Web Services)
• A PHP server was built to handle parsing
  • Can exist independently of the database server
  • Uploaded resumes are processed by the server and returned in an upload form
  • User can correct the resume categorization before uploading to the server’s ontology database
• Apache Tomcat was configured on the AHRT server for instance handling
  • Serves as the container for the JAX-RS REST server
  • Final submissions are then added to the database
  • The database serves as an instance pool for the ontology, which has the logic for the domain built in
Approach (Parsing)
• PHP used in this version
  • Other web services that parse resumes exist (they cost upwards of $600!); the parser can be swapped out in the source code
• The preg_match and preg_match_all functions are used with various regular expressions to identify and classify resume categories
  • preg_match finds the first match of a regex in a string
  • preg_match_all collects every match of a regex into arrays
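The difference between the two functions can be sketched with Python's `re` module, where `re.search` plays roughly the role of preg_match and `re.findall` roughly that of preg_match_all. The sample resume, the field labels, and the patterns are assumptions for illustration; the actual AHRT rules are not shown in the slides.

```python
import re

# Hypothetical resume text (assumed layout with "Label: value" lines).
resume_text = """Jane Doe
Email: jane@example.com
Phone: 555-123-4567
Skills: Java, PHP, OWL
"""


def parse_resume(text):
    """Classify a few resume fields with regular expressions."""
    # First-match lookups, similar in spirit to preg_match.
    email = re.search(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)
    phone = re.search(r"\d{3}-\d{3}-\d{4}", text)
    skills_line = re.search(r"^Skills:\s*(.+)$", text, re.MULTILINE)

    # All-matches lookup, similar in spirit to preg_match_all.
    skills = re.findall(r"\w+", skills_line.group(1)) if skills_line else []

    return {
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
        "skills": skills,
    }
```

Returning `None` or an empty list for a missing field leaves a blank the user can fill in when reviewing the parsed form.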
Approach (Ontologies)
• JENA Ontology API
  • Loads a predefined OWL schema
  • API used to create instances (individuals)
  • Can be used for two different types of properties
    • Datatype properties
    • Object properties
• We use the database to pull instance data
  • JENA then populates the ontology with this data
• We will open the created .owl file in Protégé to view the ontology’s instances
• Jambalaya plugin can be used to visualize the ontology
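To make the datatype/object property distinction concrete, the sketch below writes one instance as a Turtle statement using plain string formatting. It does not use the JENA API, and the `ahrt` prefix, the class name, and the property names are all hypothetical.

```python
def to_turtle(individual, class_name, datatype_props, object_props, prefix="ahrt"):
    """Serialize one individual as a Turtle statement (prefix declared elsewhere)."""
    lines = [f"{prefix}:{individual} a {prefix}:{class_name}"]
    # Datatype properties point at literal values, written as quoted strings.
    for prop, value in datatype_props.items():
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'    {prefix}:{prop} "{escaped}"')
    # Object properties point at other individuals, written as prefixed names.
    for prop, target in object_props.items():
        lines.append(f"    {prefix}:{prop} {prefix}:{target}")
    return " ;\n".join(lines) + " ."


statement = to_turtle(
    "resume1",
    "Resume",
    datatype_props={"hasName": "Jane Doe"},
    object_props={"hasSkill": "Java"},
)
```

Here `hasName` holds a literal string, while `hasSkill` links to another individual, which is what lets an ontology viewer draw it as an edge between two nodes.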
Demo! http://denali.cs.uga.edu