www.itbs.ee
Anton Vedeshin, 05-11-2009
Information Extraction (Retrieval) in Application to e-Business

Imagine!
• Fitting room for Web design companies
Web design company web site

Contents
• Introduction (General Information Extraction)
• Applications to e-Business
• Our Solutions

Information Extraction

Types of source
• Web pages, XML, RSS, blogs, Excel, whatever…

Uses
• Web portals
• Information systems
• Databases
• Decision systems
• whatever…

Applications to e-Business (1)
• Fitting room for Web design companies
Web design company web site

Applications to e-Business (2)
• Module for your CMS
Diagram labels: Old client’s web page; New client’s web page; CMS provider; CMS, PHP…; HTML, frames…

Applications to e-Business (3)
• Information/Comparison portals
Example: currency rates comparison
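
Once rates have been extracted from several providers, the comparison step itself is small. The following is a minimal Python sketch of that step only, not the portal's implementation: the `Quote` type, the provider names, and every number are made-up placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    source: str   # provider the rate was extracted from (hypothetical names below)
    buy: float    # local-currency amount the provider pays for 1 EUR
    sell: float   # local-currency amount the provider charges for 1 EUR


def best_quotes(quotes):
    """Return (quote with the highest buy rate, quote with the lowest sell rate)."""
    if not quotes:
        raise ValueError("no quotes to compare")
    return max(quotes, key=lambda q: q.buy), min(quotes, key=lambda q: q.sell)


# Made-up sample values standing in for rates extracted from provider pages.
QUOTES = [
    Quote("bank-a.example", buy=0.698, sell=0.709),
    Quote("bank-b.example", buy=0.700, sell=0.712),
    Quote("exchange-c.example", buy=0.695, sell=0.706),
]

highest_buy, lowest_sell = best_quotes(QUOTES)
print("Sell EUR at:", highest_buy.source)
print("Buy EUR at:", lowest_sell.source)
```

Floats keep the sketch short; a real portal would more likely store rates as decimals.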

Applications to e-Business (4)
• Context search portals
Example: discounts finder (Riga area)
Diagram labels: Your Search Engine or Portal; Internet; Context Extraction

Applications to e-Business (5)
• Product lists comparisons
Example: Pricelist compilation for hardware shops
Toshiba Portege Tablet R500-11C, WXGA 12.1, C2D U7600 ULV 1.2G, 2GB, 160GB, WLAN, Bluetooth, Vista Bus.
Toshiba Portege Tablet / 12.1" WXGA/ CD U7600 1.2GHz/ RAM 2GB/ HDD80/ WiFi/ BT/ FP/ Vista Business
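
The two listings above use different separators and abbreviations for much of the same information. One simple way to compare such lines, sketched here in Python as an illustration rather than as the method used in the framework, is to normalise tokens and measure set overlap (Jaccard similarity). The `SYNONYMS` table and the function names are assumptions made for this example.

```python
import re

# Hypothetical abbreviation table; a real one would be built per product domain.
SYNONYMS = {"wlan": "wifi", "bt": "bluetooth", "bus": "business", "1.2g": "1.2ghz"}


def tokens(listing):
    """Lowercase, split on anything but letters, digits and dots, then map synonyms."""
    raw = re.split(r"[^a-z0-9.]+", listing.lower())
    cleaned = (t.strip(".") for t in raw)
    return {SYNONYMS.get(t, t) for t in cleaned if t}


def similarity(a, b):
    """Jaccard similarity of the two token sets, between 0.0 and 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


SHOP_A = ("Toshiba Portege Tablet R500-11C, WXGA 12.1, C2D U7600 ULV 1.2G, "
          "2GB, 160GB, WLAN, Bluetooth, Vista Bus.")
SHOP_B = ('Toshiba Portege Tablet / 12.1" WXGA/ CD U7600 1.2GHz/ RAM 2GB/ '
          "HDD80/ WiFi/ BT/ FP/ Vista Business")

print(round(similarity(SHOP_A, SHOP_B), 2))
```

A score near 1.0 suggests the same product; tokens that differ (here the disk size, for instance) lower the score, so the threshold for "same product" is a tuning decision.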

Applications to e-Business (6)
• Indexing / Classification
Example: Estonian Legislative Acts Indexing

Applications to e-Business (7)
• … Indexing / Classification
A. Vedeshin, I. Liiv, E. Täks: Visualisation and Structure Analysis of Legislative Acts: A Case Study on the Law of Obligations (ICAIL’07, ACM)

Applications to e-Business (8)
• Business catalogs extraction
Example: making a database from www.zl.lv
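
Turning catalog pages into a database comes down to extracting one record per entry and inserting the records as rows. A minimal Python sketch of that flow follows. The HTML layout, the `company` table, and the sample entries are invented for illustration and do not reflect the markup of any real catalog site; `sqlite3` stands in for MySQL so the example runs without a server.

```python
import re
import sqlite3

# Hypothetical entry markup: one <div class="company"> per catalog record.
ENTRY = re.compile(
    r'<div class="company">\s*<b>(?P<name>[^<]+)</b>\s*'
    r'<span class="phone">(?P<phone>[^<]+)</span>\s*</div>'
)


def load_catalog(html, conn):
    """Extract every catalog entry from html, insert it, and return the row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS company (name TEXT, phone TEXT)")
    rows = [(m["name"].strip(), m["phone"].strip()) for m in ENTRY.finditer(html)]
    conn.executemany("INSERT INTO company (name, phone) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


# Invented sample page with two placeholder entries.
SAMPLE = """
<div class="company"><b>Example SIA</b> <span class="phone">+371 0000000</span></div>
<div class="company"><b>Demo Ltd</b> <span class="phone">+371 1111111</span></div>
"""

conn = sqlite3.connect(":memory:")
print(load_catalog(SAMPLE, conn), "entries stored")
```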

Our Solution (1)
We developed an Information Retrieval Framework:
• PHP + MySQL
• Regex rules
• Near-real-time processing
Ideal solution for e-Business
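
The slide names regex rules as the extraction mechanism without showing one. As a rough illustration of the idea, here is a minimal Python sketch (the framework itself is PHP + MySQL) in which each rule maps a field name to a pattern with one capturing group. The rule names, the patterns, and the sample markup are all assumptions, not rules taken from the framework.

```python
import re

# Hypothetical rule set: field name -> regex with one capturing group.
RULES = {
    "title": re.compile(r"<h1[^>]*>(.*?)</h1>", re.IGNORECASE | re.DOTALL),
    "price": re.compile(r'class="price"[^>]*>\s*([\d.,]+)', re.IGNORECASE),
}


def apply_rules(html, rules=RULES):
    """Return the first match of each rule, or None when a rule does not match."""
    record = {}
    for field, pattern in rules.items():
        match = pattern.search(html)
        record[field] = match.group(1).strip() if match else None
    return record


# Invented sample markup for the sketch.
SAMPLE = '<h1>Toshiba Portege R500</h1><span class="price"> 1499.00 EUR</span>'
print(apply_rules(SAMPLE))
```

Keeping the rules in a plain table like this means a new source can be supported by adding patterns rather than code, which may be one reason a rule-based design suits near-real-time processing.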

Our Solution (2)
The IR Framework uses:
• HTML / XML (all types of tags)
• JavaScript
• CSS (inline and linked)
• Images
• Human-like crawling

Our Solution (3)
• WS Internal Architecture

Our Solution (4)
THREE types of extraction:

Our Solution (5)
• Part of the algorithm (automatic mode)
Advanced Information Retrieval from Web Pages, in Proceedings of FDIA 2007

Future Use & Applications (1)
• Analyzing not only text and images (for text), but also maps, schemes, etc.
• Weather maps (waves, wind, ice…)
• Position vectors (speeds, paths…)
• Sea vehicle information (gross tonnage, people aboard…)
Risk calculation & accident prediction on the Baltic Sea

Future Use & Applications (2)
More?

Thank You!
Questions? Comments? Suggestions?
Anton Vedeshin
anton@itbs.ee
Tallinn, Estonia