Crawling, Parsing and Semantic Matching of Vacancies and CV’s
Semantic Recruitment Technology
Jakub Zavrel, Textkernel
InGRID Workshop, 11-2-2014
Textkernel:
• Spinoff from R&D in machine learning and language technology
• Founded 2001, offices in Amsterdam (HQ), Frankfurt, Paris, 45 employees; strong R&D focus
• Deloitte Fast 50 2007, 2010, 30% YoY growth
• Core technology: understanding unstructured text data. Multi-lingual
Market:
• Job boards, recruitment software, staffing and recruitment, mobility, large employers
Products:
• Multi-lingual tools (15 languages) to extract CVs and jobs
• Jobfeed: largest real-time DB for job market analysis
• Search! & Match! to connect people and jobs
Customers:
• UWV, Pôle Emploi, Adecco, Randstad, USG, Monster, Stepstone, XING, SAP, Unisys, Bosch, Axa, Philips, etc. (350 direct, 2000+ indirect)
• Large partner network (HR & recruitment software)
Language gap
"I like programming, but I’m interested to take on more project management responsibility."
"Is there a job in our organisation that better fits my degree?"
"I’d like to work on our mobile strategy. I’ve helped a friend develop a mobile app."
"I’d like to do more with my organisational talent."
We are looking to hire: An experienced tech team lead
The ideal candidate has:
• min. 5 yr of experience
• Certified scrum master
• Exp. w/ iOS, Android
• Completed academic studies Computer Science or related
• 30% travel for customer presentations
The Job ad searches directly in a database and identifies relevant candidates (or vice versa) …
Extract! CV/Job Parsing
Automatically convert each document into a complete record
Extract!
• Time savings coding CVs and jobs
• If you accept noise, 100% time savings
• Structured data allows better search: semantic searching and matching
• Coding enables reporting and statistics
Occupation coding!
• Coding follows extraction
• Customer-specific or standard taxonomies
• String-similarity-based normalization
• Lot of synonyms per language
• Distance = confidence
• Problem cases: ambiguity, context, long tail
• More complex models can help (classifiers, multi-variate models)
• Semantic matching better (occupation coding errors are counterbalanced by other variables)
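The string-similarity normalization mentioned above could look roughly like the sketch below. The slides do not name the similarity measure, so Python's standard `difflib.SequenceMatcher` is used here as a stand-in, and the similarity ratio doubles as the confidence. The taxonomy codes, the Java synonyms, the `normalize` function and the 0.6 threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical mini-taxonomy: occupation code -> known synonyms.
# The codes and the Java synonyms are illustrative, not Textkernel data.
TAXONOMY = {
    "ADMIN_ASSISTANT": [
        "administrative assistant",
        "administrative employee",
        "assistant clerk",
        "office support",
    ],
    "JAVA_DEVELOPER": [
        "java developer",
        "java software engineer",
        "j2ee developer",
    ],
}


def normalize(title, taxonomy=TAXONOMY, threshold=0.6):
    """Map a free-text job title to the code of its most similar synonym.

    Returns (code, confidence). The confidence is the similarity ratio in
    [0, 1]; the code is None when the best ratio is below the threshold
    (an assumed cut-off for the long tail of unknown titles).
    """
    query = title.lower().strip()
    best_code, best_score = None, 0.0
    for code, synonyms in taxonomy.items():
        for synonym in synonyms:
            score = SequenceMatcher(None, query, synonym).ratio()
            if score > best_score:
                best_code, best_score = code, score
    if best_score < threshold:
        return None, best_score
    return best_code, best_score


if __name__ == "__main__":
    for raw in ["Java Developer", "Administrative Asistant", "xyz"]:
        print(raw, "->", normalize(raw))
```

A pure string measure like this cannot resolve the problem cases listed above (ambiguity, context), which is where the classifiers mentioned on the slide would come in.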
Search!
• Semantic search: "Lets you find what you mean, not what you type"
Impression...
Match!
Semantic Matching Technology:
• Natural Language Processing
• Machine Learning
• Semantic Analysis
• Probabilistic Language Model
• Search Engine
• Multi-lingual taxonomies
• Recruitment knowledge-bases
Jobfeed
Search and analyse real-time online job ads as well as historical data
Jobfeed!
Knowledge of all demand for labour in the European job market
• Sales leads for recruitment and staffing companies
• Real-time labour market analytics tools
• Largest database of jobs for matching unemployed
• Perfect data source for text mining
Jobfeed!
• Real-time collection of online job ads from any (unstructured) source
• Available in NL, DE, FR, IT
• Gradually rolling out in rest of Europe
• Richly semantically structured data
Jobfeed: Multilingual Occupation Taxonomy
• Occupations: >4000 codes
• 4 languages
• 3 layer hierarchy
• >50K synonyms
• Link to other concepts:
  - Skills
  - Education level
  - Sector
  - O*NET
• UWV (Dutch Employment Agency)
• ROME
Example:
NL: administratief medewerker, EN: administrative assistant, FR: employé administratif, DE: Verwaltungsassistent (m/w)
Group: administrative personnel
Class: Administration and Customer Service
Synonyms: administrative employee, assistant clerk, office support
Skills: ms office, excel, english language, etc.
O*NET: 43-9199.00: Office and Administrative Support Workers, All Other
UWV: 1000402563: Administratief medewerker secretariaat
Based on millions of jobs, years of customer feedback and experience!
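One way to picture a taxonomy entry like the example above is as a small record type. This is only a sketch: the class name `OccupationEntry`, its field names and the `find_by_label` helper are hypothetical and say nothing about how Jobfeed stores its taxonomy. The values are the ones from the example.

```python
from dataclasses import dataclass, field


@dataclass
class OccupationEntry:
    """Hypothetical record for one occupation code in a multilingual taxonomy."""
    labels: dict            # language code -> preferred label
    group: str              # middle layer of the hierarchy (assumed)
    occupation_class: str   # top layer of the hierarchy (assumed)
    synonyms: list = field(default_factory=list)
    skills: list = field(default_factory=list)
    external_codes: dict = field(default_factory=dict)  # scheme -> code


# Values copied from the example on the slide.
entry = OccupationEntry(
    labels={
        "NL": "administratief medewerker",
        "EN": "administrative assistant",
        "FR": "employé administratif",
        "DE": "Verwaltungsassistent (m/w)",
    },
    group="administrative personnel",
    occupation_class="Administration and Customer Service",
    synonyms=["administrative employee", "assistant clerk", "office support"],
    skills=["ms office", "excel", "english language"],
    external_codes={"O*NET": "43-9199.00", "UWV": "1000402563"},
)


def find_by_label(entries, text):
    """Return the first entry whose label or synonym equals text (case-insensitive)."""
    needle = text.lower().strip()
    for candidate in entries:
        names = list(candidate.labels.values()) + candidate.synonyms
        if needle in (name.lower() for name in names):
            return candidate
    return None


if __name__ == "__main__":
    hit = find_by_label([entry], "Office Support")
    print(hit.labels["EN"], hit.external_codes)
```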
Frequent words for "Java developer"
Je, op, is, voor, te, ervaring, aan, als, and, software, en, van, de, een, je, met, in, het, Java, of, om, team, zijn, kennis, bij, Ervaring, die, the, naar, a, jaar, jij, bent, Developer, HBO, hebt, to, werken, werk
Frequent words for all professions
voor, te, is, of, zijn, aan, bent, naar, bij, om, en, van, de, een, in, het, je, met, op, Je, als, ervaring, die, Het, hebt, deze, werken, zoek, De, wij, functie, onze, ben, tot, over, werk, opleiding, uit, and, werkzaamheden, dat, binnen, u, Als, Voor, zelfstandig, kennis, ook, s, verantwoordelijk
Solution: contrast frequencies
• Observed frequency of w: O(w) = A
• Expected frequency of w: E(w) = C * B / D
• Pick words with highest score: score(w) = (O(w) - E(w))^2 / E(w)
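The contrast score can be sketched in a few lines of Python. The transcript does not define A, B, C and D (they presumably label cells of a table on the original slide), so this sketch assumes the usual chi-square style reading: A is the count of w in the ads for one occupation, B the total number of words in those ads, C the count of w in all ads, and D the total number of words in all ads. The function names and the toy tokens are hypothetical. Because the squared difference also rewards words that are rarer than expected, the sketch keeps only over-represented words, which is a further assumption.

```python
from collections import Counter


def contrast_scores(occupation_tokens, all_tokens):
    """Score each word by (O - E)^2 / E, under the assumed reading of A, B, C, D."""
    occ = Counter(occupation_tokens)
    ref = Counter(all_tokens)
    b = sum(occ.values())   # B: total words in the occupation's ads (assumed)
    d = sum(ref.values())   # D: total words in all ads (assumed)
    scores = {}
    for word, a in occ.items():      # A: observed count of the word (assumed)
        c = ref[word]                # C: count of the word in all ads (assumed)
        expected = c * b / d
        # Skip words missing from the reference corpus and words that are
        # not over-represented (the latter filter is an added assumption).
        if expected == 0 or a <= expected:
            continue
        scores[word] = (a - expected) ** 2 / expected
    return scores


def top_words(occupation_tokens, all_tokens, n=10):
    """Return the n highest-scoring words, best first."""
    scores = contrast_scores(occupation_tokens, all_tokens)
    return sorted(scores, key=scores.get, reverse=True)[:n]


if __name__ == "__main__":
    # Toy data: a few "Java developer" tokens inside a larger corpus.
    java_ads = ["java", "java", "spring", "de", "en"]
    corpus = java_ads + ["de"] * 10 + ["en"] * 10 + ["verpleegkundige"] * 5
    print(top_words(java_ads, corpus, n=2))
```

In the toy data, the function words "de" and "en" occur in the Java ads but less often than the corpus-wide rate predicts, so they drop out, while "java" and "spring" remain.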
Top words for "Java developer"
java, developer, software, spring, scrum, agile, hibernate, ontwikkelaar, u, j2ee, wij, xml, jee, o, javascript, you, kennis, ontwikkelen, oracle, ontwikkeling, development, maven, applicaties, ervaring, web, de, frameworks, jboss, mbo, senior, architectuur, webservices, informatica, werkzaamheden, technologie, developers, eclipse, bezit, het, team, wo, rijbewijs, technieken, tomcat, the, vca, zelfstandig, architect, werklocatie, html
Building rich skills profiles for thousands of occupations from millions of real time jobs…… new trends and occupations…
Supply & Demand
• Have: lots of data, technology, ideas
• Want: labour market expertise, students, research
Semantic Recruitment Technology
Thanks!