Factiva Intelligent Indexing

Presentation Transcript


    1. Factiva Intelligent Indexing™ SLA 2004

    2. Agenda
       Factiva Intelligent Indexing™
       Application of Factiva Intelligent Indexing™
       Pros and Cons
       Quality Control

    3. Factiva Intelligent Indexing™

    4. FII Structure
       One universal taxonomy
       Building blocks
       Inclusive hierarchy
       Polyarchy
       Synonyms and alias names
       Full descriptions
       Variable depth and breadth

    5. Polyarchy
       Internet/Online services
          E-commerce
          Internet browsers
          Internet portals
          Internet search engines
          Internet service providers
          etc.
       Computers
          Computer hardware
          Computer services
          Computer stores
          Networking
          Semiconductors
          Software
             Applications software
             GroupWare
             Intelligent agents
             Internet browsers
             etc.
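
The defining feature of a polyarchy is that one term can sit under more than one parent, as "Internet browsers" does on this slide. A minimal sketch of that structure follows. The term names come from the slide; the dictionary layout, the function names, and the placement of "Software" under "Computers" are assumptions made for illustration, not a description of how Factiva stores its taxonomy.

```python
from collections import defaultdict

# Parent -> children edges, using the terms from the slide.
# Assumption: "Software" is a child of "Computers"; the storage layout
# is hypothetical.
CHILDREN = {
    "Internet/Online services": [
        "E-commerce", "Internet browsers", "Internet portals",
        "Internet search engines", "Internet service providers",
    ],
    "Computers": [
        "Computer hardware", "Computer services", "Computer stores",
        "Networking", "Semiconductors", "Software",
    ],
    "Software": [
        "Applications software", "GroupWare", "Intelligent agents",
        "Internet browsers",
    ],
}


def build_parents(children):
    """Invert the edges so each term lists every parent it has."""
    parents = defaultdict(list)
    for parent, kids in children.items():
        for kid in kids:
            parents[kid].append(parent)
    return dict(parents)


def ancestors(term, parents):
    """Return every broader term reachable from `term`."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


PARENTS = build_parents(CHILDREN)
```

Under these assumptions, an article coded "Internet browsers" would be retrievable through both the "Internet/Online services" branch and the "Software" branch.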

    7. FII Application
       Code mapping
       Entity extraction
       Rule-based system
       Linguistic analysis software
       Manual review

    8. Code Mapping
       Most information providers provide some form of metadata. This is matched to relevant Factiva indexing terms.
       Advantages:
          Easy and quick
          Efficient use of existing data
       Disadvantages:
          Mismatches between coding schemes
          Different interpretations of same concepts
          Variable quality – which sources do you trust?
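
Code mapping amounts to a lookup from a provider's own metadata codes to indexing terms. The sketch below shows one way that lookup might work. The provider names, the provider codes, and the function name are all invented for illustration; only the target terms come from the taxonomy slide.

```python
# Hypothetical mapping table: (provider, provider code) -> indexing term.
# Provider names and codes are invented for this example.
CODE_MAP = {
    ("provider_a", "TECH-SW"): "Software",
    ("provider_a", "TECH-HW"): "Computer hardware",
    ("provider_b", "4521"): "Software",
}


def map_codes(provider, provider_codes, code_map=CODE_MAP):
    """Split a provider's codes into mapped terms and unmapped leftovers."""
    mapped, unmapped = [], []
    for code in provider_codes:
        term = code_map.get((provider, code))
        if term is None:
            # No equivalent term: a mismatch between coding schemes.
            unmapped.append(code)
        elif term not in mapped:
            mapped.append(term)
    return mapped, unmapped
```

Returning the unmapped codes separately makes scheme mismatches visible, so they can be sent for review instead of being dropped silently.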

    9. Entity extraction
       This tool finds company names which are then compared to our controlled vocabulary.
       Advantages:
          Consistent
          Precise
       Disadvantages:
          Ambiguous names
          High maintenance costs
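
The comparison against a controlled vocabulary, and the ambiguity problem it raises, can be sketched as follows. This is an illustration only: the alias table, company names, and codes are made up, and a simple whole-word regular expression match stands in for whatever extraction tool is actually used.

```python
import re

# Hypothetical controlled vocabulary: alias -> candidate company codes.
# An alias with more than one candidate is ambiguous.
ALIASES = {
    "acme corp": ["ACME"],
    "acme": ["ACME"],
    "apex": ["APEXUS", "APEXUK"],
}


def extract_companies(text, aliases=ALIASES):
    """Return (unambiguous company codes, aliases needing disambiguation)."""
    lowered = text.lower()
    matched, ambiguous = set(), set()
    for alias, codes in aliases.items():
        # Whole-word match so "acme" does not fire inside a longer word.
        if re.search(r"\b" + re.escape(alias) + r"\b", lowered):
            if len(codes) == 1:
                matched.add(codes[0])
            else:
                ambiguous.add(alias)
    return matched, ambiguous
```

Every new alias or renamed company means another entry in the table, which is one way to read the "high maintenance costs" listed above.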

    10. Symbology Snapshot

    11. Rule-based system
        Sets of IF-THEN statements established by editors, information architects, or subject-matter experts.
        Advantages:
           Good at highly formulaic content
           Precise
        Disadvantages:
           Need thousands of rules for a complete system
           Maintenance of the rules themselves becomes VERY expensive!
           Only captures explicit concepts
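
A set of IF-THEN rules might look like the sketch below. The rule format, the trigger words, and the code names are hypothetical; the slide does not say how the production rules are written.

```python
import re

# Each rule reads: IF every `required` word appears AND no `excluded`
# word appears, THEN assign `code`. All values are illustrative.
RULES = [
    {"code": "Earnings", "required": {"quarterly", "profit"}, "excluded": set()},
    {"code": "Acquisitions", "required": {"acquire"}, "excluded": {"rumor"}},
]


def apply_rules(text, rules=RULES):
    """Return the codes of every rule whose conditions the text satisfies."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return [
        rule["code"]
        for rule in rules
        if rule["required"] <= words and not (rule["excluded"] & words)
    ]
```

A story that describes a takeover without using the word "acquire" matches nothing here, which illustrates why rules only capture explicit concepts and why a complete system needs so many of them.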

    12. Example

    13. Linguistics-based categorization
        This tool is currently employed across all English-, French-, German- and Spanish-language publications. A combination of linguistic analysis and statistical algorithms allows new content to be compared to example data and coded appropriately.
        Advantages:
           Scales to millions of documents, thousands of categories, multiple languages
           Copes well with change
           Fits editorial workflow
           Good fine-tuning tools – editorial control
           Codes implicit as well as explicit concepts
        Disadvantages:
           Training time and cost
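
The idea of comparing new content to example data can be illustrated with a deliberately simple stand-in: a bag-of-words profile per code, scored by cosine similarity. This is not the linguistic analysis software the slide refers to, whose algorithms are not described. The function names, the threshold value, and the training texts in the test are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Build one word-count profile per code from (code, text) examples."""
    profiles = {}
    for code, text in examples:
        profiles.setdefault(code, Counter()).update(tokenize(text))
    return profiles


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def categorize(text, profiles, threshold=0.2):
    """Return codes whose profile is similar enough, best match first."""
    doc = Counter(tokenize(text))
    scores = {code: cosine(doc, profile) for code, profile in profiles.items()}
    hits = [code for code, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda code: -scores[code])
```

Because the profiles are built from examples, adding or correcting training documents changes the coding without anyone writing a rule, which is the trade-off the slide describes: easier adaptation in exchange for training time and cost.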

    14. Editorial Control
        Set relevance levels
        Maintain training set
        Stop words - correlation and multiple meanings
        Example: "Chechnya" was added as a stop word to the industries model, as it was triggering the freelance journalist code (because so many of them were dying there)
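
Two of these controls, per-model stop words and relevance levels, can be sketched as simple filters applied before and after scoring. The settings, the numeric levels, and the function names below are hypothetical; only the "Chechnya" example and the freelance journalist code come from the slide.

```python
# Hypothetical editorial settings; the values are illustrative only.
STOP_WORDS = {"industries": {"chechnya"}}        # model -> words to ignore
RELEVANCE_LEVELS = {"Freelance journalists": 0.6}  # code -> minimum score
DEFAULT_LEVEL = 0.5


def strip_stop_words(tokens, model, stop_words=STOP_WORDS):
    """Drop words that editors have blocked for this model."""
    blocked = stop_words.get(model, set())
    return [token for token in tokens if token.lower() not in blocked]


def accept_codes(scores, levels=RELEVANCE_LEVELS, default=DEFAULT_LEVEL):
    """Keep only codes whose score reaches that code's relevance level."""
    return sorted(
        code for code, score in scores.items()
        if score >= levels.get(code, default)
    )
```

Keeping the stop list per model means a word can be suppressed where it causes a false correlation and still count elsewhere.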

    15. Manual coding
        About 200 editors spread across main time zones
        Advantages:
           Humans easily grasp the gist of the story
           Cope well with exceptions
           Visible/Controllable
        Disadvantages:
           Very resource-intensive = Expensive
           Slow
           Inconsistent (subjective and temporal)
           Not scalable

    16. Review process
        Lists reviewed every three months: redefinition, new codes, expansion changes
        Market research/customer feedback and behavior
        Changes to parent schemes/standards
        Editorial/Quality control feedback
        Internal coding forum
        45-day notice period

    17. Quality control
        Sampling by editors
        Scoring for precision and recall
        Analysis by source, language, code, editor etc.
        Feedback to editors and systems
        Corrective action
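
Precision is the share of assigned codes that are correct, and recall is the share of expected codes that were assigned. A minimal sketch of scoring a sample and breaking the result down by an attribute such as source or language is shown below. The record layout and function names are assumptions, and pooling the counts within each group is one reasonable choice, not necessarily the one used in practice.

```python
def precision_recall(assigned, expected):
    """Score one article: codes assigned versus codes the reviewer expected."""
    assigned, expected = set(assigned), set(expected)
    correct = len(assigned & expected)
    precision = correct / len(assigned) if assigned else 0.0
    recall = correct / len(expected) if expected else 0.0
    return precision, recall


def score_by(samples, key):
    """Pool counts per group (e.g. key='source') and score each group."""
    totals = {}
    for sample in samples:
        # Running counts per group: [correct, assigned, expected].
        counts = totals.setdefault(sample[key], [0, 0, 0])
        assigned, expected = set(sample["assigned"]), set(sample["expected"])
        counts[0] += len(assigned & expected)
        counts[1] += len(assigned)
        counts[2] += len(expected)
    return {
        group: (c / a if a else 0.0, c / e if e else 0.0)
        for group, (c, a, e) in totals.items()
    }
```

Calling `score_by` with a different key (source, language, code, or editor) gives the different cuts of analysis listed on the slide from the same sample.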

    18. Results
        Three million articles coded a month
        All receive a level of autocoding
        Seventy-nine percent automation: more than two million articles are auto-coded with no further manual review

    19. Recap
        Factiva’s taxonomy is Factiva Intelligent Indexing™
        Factiva uses a hybrid methodology for application
        Factiva has a coding team for governance and maintenance
        End result: Factiva Intelligent Indexing™ leverages our editorial strengths, combining human experience and expertise with the latest automation software to implement a completely flexible and granular indexing system across all of our content.
