HYP Progress Update By Zhao Jin
Outline • Background • Progress Update
Background • Query (Text-based) • The set of keywords to be entered into the system to retrieve the desired information or resources • Main category • Traditional IR • Web (ex. Google) • OPAC (ex. LINC) • Video (ex. TRECVID)
Background • Query Analysis • To analyze the patterns and hidden information in queries • To classify and support such queries efficiently
Progress update • Mid-May to Early June • Background reading • Around 30 to 40 papers on various topics • Summarizing the key points of each paper
Progress update • Mid-June to Late June • Log analysis • BBC Video Query • NUS OPAC Query • Background reading on OPAC and TRECVID
Progress update • July to now • Follow up on two main topics • Query classification and division into content-based and feature-based keywords (OPAC) • Identifying ASR-oriented keywords in a video query (TRECVID) • Background reading on MARC, WordNet and LOC subject headings
Progress update • Plan for the near future • Refine and experiment with the current ideas • Log analysis • Background reading (textbooks & related papers) • Preparation for implementation
End of progress update • Thank you for your attention!
Two types of keywords • Content-Based Keyword (CBK) • The keywords that concern what the item is about • Ex. title, subject heading, etc. • Feature-Based Keyword (FBK) • The keywords that concern the features of the item • Ex. author, publisher, genre, medium
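The CBK/FBK distinction above can be illustrated with a minimal sketch. It assumes a small hand-made list of feature words; the list, the feature labels, and the function name `split_keywords` are hypothetical and do not come from the slides. Any token not found in the list is treated as a content-based keyword.

```python
# Hypothetical list of feature-related words mapped to the feature they signal.
# The slides do not give an actual list; these entries are for illustration only.
FBK_SPECIAL_WORDS = {
    "dvd": "medium",
    "cd": "medium",
    "ebook": "medium",
    "fiction": "genre",
    "biography": "genre",
}


def split_keywords(query):
    """Split a text query into content-based and feature-based keywords.

    Returns (cbk, fbk): cbk is a list of tokens, fbk is a list of
    (token, feature) pairs. Unknown tokens default to content-based.
    """
    cbk, fbk = [], []
    for token in query.lower().split():
        if token in FBK_SPECIAL_WORDS:
            fbk.append((token, FBK_SPECIAL_WORDS[token]))
        else:
            cbk.append(token)
    return cbk, fbk


cbk, fbk = split_keywords("Roman history DVD")
print(cbk)  # content-based keywords
print(fbk)  # feature-based keywords with their feature label
```

A lookup table like this cannot recognize author or publisher names, which is presumably why the deck also considers machine learning for the division.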
Benefits • Benefits: • Faster retrieval • More precise retrieval • Help in relevance ranking
Possible implementation • Possible implementation: • Term co-occurrence for concept division • List of special words and machine learning for FBK and CBK division • WordNet for classification among CBKs
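As a rough sketch of the term co-occurrence idea listed above, adjacent query terms can be grouped into one concept when they appear next to each other often enough in a query log. The function names, the toy log, and the `min_count` threshold are assumptions made for illustration; the slides do not specify how co-occurrence would be measured.

```python
from collections import Counter


def count_adjacent_pairs(query_log):
    """Count how often each ordered pair of adjacent terms occurs in the log."""
    pairs = Counter()
    for query in query_log:
        tokens = query.lower().split()
        pairs.update(zip(tokens, tokens[1:]))
    return pairs


def divide_into_concepts(query, pairs, min_count=2):
    """Group adjacent terms into one concept if they co-occur at least min_count times."""
    tokens = query.lower().split()
    if not tokens:
        return []
    concepts = [[tokens[0]]]
    for prev, cur in zip(tokens, tokens[1:]):
        if pairs[(prev, cur)] >= min_count:
            concepts[-1].append(cur)   # frequent neighbours: same concept
        else:
            concepts.append([cur])     # rare neighbours: start a new concept
    return [" ".join(c) for c in concepts]


# Toy query log, invented for this example.
log = [
    "operating systems",
    "operating systems textbook",
    "silberschatz operating systems",
    "java textbook",
]
pairs = count_adjacent_pairs(log)
print(divide_into_concepts("silberschatz operating systems", pairs))
```

Raw adjacency counts are the simplest choice here; a real system might prefer a normalized association measure, but that is beyond what the slides state.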
Possible implementation • Possible implementation: • CL and IL search algorithms for actual searching with CBKs • List of special words and machine learning for classification among FBKs • MARC record search algorithms for actual searching with FBKs
Means to retrieve shots • Example: • To find shots of “Bill Clinton” • Face recognition • Closed-caption • Automatic Speech Recognition (ASR)
Metrics • Common VS Special (In reality) • How common the concept represented by the keyword is in reality • Generic VS Specific • How generic the concept represented by the keyword is
Metrics • Concrete VS Abstract • Whether the concept represented by the keyword is concrete or abstract • Topic frequency (Low VS High) • How often the keyword becomes (or is closely related to) a topic
Metrics • Formal VS Informal • Whether the keyword is in formal or informal language • Written VS Spoken • Whether the keyword is in written or spoken language
Metrics • Feature-level VS Content-level • Whether the keyword is about a feature of the video (ex. camera motion) or the content of the video