CADIP Research at UW-Milwaukee

Learn about methods for efficient image retrieval online using HTML metadata with a focus on multimedia information retrieval (MMIR) and practical search strategies. Discover the importance of image search and explore different agent implementation languages.

Presentation Transcript


  1. CADIP Research at UW-Milwaukee Ethan Munson and Yelena Tsymbalenko

  2. Research Foci • Languages for implementing agents • MS Thesis by Preeti Seshadri • Multimedia information retrieval • Exploiting metadata to improve MM IR • Yelena Tsymbalenko’s MS research • Models of media • Usability of information visualization • future work

  3. Multimedia IR

  4. Using HTML Metadata to Retrieve Relevant Images from the Web Yelena Tsymbalenko University of Wisconsin-Milwaukee

  5. Why is image search important? • The Web is a primary source of information. • Images are one of the most valuable sources of information available on the Web. • Few WWW image search engines currently exist. • Using textual search engines to find images manually is laborious.

  6. A Requirement for Web Image Search • We need an efficient method of discovering and indexing image content. • Two main sources of information about image content: • image processing • associated text • text content • markup

  7. Related work • WebSeek (J. Smith & S. Chang, Columbia University) • performs a semi-automated classification of the images • uses image file name for categorization • searches by browsing or searching through the categories • uses image features such as color content to find images of similar color

  8. Related work • WebSeer (M. Swain et al., The University of Chicago) • uses associated text and markup to supplement information derived from analyzing image content • uses multiple kinds of metadata • decides which images are photographs

  9. Why look for new methods for image retrieval? • WWW documents are growing rapidly in number and changing constantly. • Image processing is complex and computationally expensive. • We need fast and efficient methods for finding images. • Extensive image processing is not necessary.

  10. Our research • Obtain information about image content from HTML Source Code: • Explicit: file and HREF names • Implicit: markup structure • Determine which features of Web documents are best clues to image content

  11. Search Strategy Examples • Image file name • Title of HTML document • Alternate text (ALT attribute) • Text of hyperlink • Text of the same paragraph • Header text
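
As an illustration of how the clues on this slide might be gathered, here is a minimal Python sketch built on the standard library's `html.parser`. It is not the authors' implementation. The class name `ImageClueParser`, the function `extract_clues`, and the dictionary keys are hypothetical. It covers five of the six clues (file name, document title, ALT text, hyperlink text, most recent header) and leaves out paragraph text. It also assumes the header appears before the image in the document.

```python
from html.parser import HTMLParser
from posixpath import basename, splitext
from urllib.parse import urlparse

HEADER_TAGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class ImageClueParser(HTMLParser):
    """Collect textual clues for each <img> in one HTML document (sketch)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.images = []            # one dict of clues per image
        self._in_title = False
        self._header_tag = None     # e.g. "h2" while inside a header
        self._last_header = ""      # text of the most recent header
        self._link_images = None    # images seen inside the open <a>
        self._link_text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in HEADER_TAGS:
            self._header_tag = tag
            self._last_header = ""
        elif tag == "a":
            self._link_images, self._link_text = [], []
        elif tag == "img":
            src = attrs.get("src") or ""
            # File name without directory or extension, e.g. "chaplin_1925".
            name = splitext(basename(urlparse(src).path))[0]
            clues = {
                "file_name": name,
                "alt": attrs.get("alt") or "",
                "header": " ".join(self._last_header.split()),
                "link_text": "",
            }
            self.images.append(clues)
            if self._link_images is not None:
                self._link_images.append(clues)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == self._header_tag:
            self._header_tag = None
        elif tag == "a" and self._link_images is not None:
            # Attach the full hyperlink text to every image inside the link.
            text = " ".join("".join(self._link_text).split())
            for clues in self._link_images:
                clues["link_text"] = text
            self._link_images = None

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        if self._header_tag:
            self._last_header += data
        if self._link_images is not None:
            self._link_text.append(data)


def extract_clues(html):
    """Return one clue dictionary per image, each with the page title added."""
    parser = ImageClueParser()
    parser.feed(html)
    parser.close()
    title = " ".join(parser.title.split())
    return [dict(clues, title=title) for clues in parser.images]
```

A query term could then be matched against each clue field separately, which keeps it possible to measure how much each clue contributes on its own.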

  12. Analysis Plans • Will collect data about search results for a number of queries (several dozen) • Suggestions for queries are welcome! • Will test which clues are most effective • Are some redundant? • Does a combination of clues produce better recall? • Are some clues more precise? • Is search performance dependent on query type? • Proper names (Chaplin, Garvey) • Phenomena (riot, explosion)
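
The questions on this slide come down to comparing precision and recall per clue and for clues in combination. The following Python sketch shows one way such a comparison could be computed, using the standard definitions of the two measures. The function names and the "any clue" label are assumptions for illustration. The slides do not say how the authors planned to score results, and combining clues by set union is only one possible reading of "a combination of clues".

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one result set against known relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def evaluate_clues(results_by_clue, relevant):
    """Score each clue separately, then the union of all clues.

    results_by_clue maps a clue name (e.g. "file_name") to the images
    retrieved when the query is matched against that clue alone.
    """
    scores = {
        clue: precision_recall(results, relevant)
        for clue, results in results_by_clue.items()
    }
    # Union models "an image is returned if any clue matches the query".
    combined = set().union(*results_by_clue.values())
    scores["any clue"] = precision_recall(combined, relevant)
    return scores
```

If the union has higher recall than every single clue, the clues are not fully redundant. A drop in precision for the union shows what that extra recall costs.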

  13. Using HTML Metadata to Retrieve Relevant Images from the Web Yelena Tsymbalenko Department of Computer Science University of Wisconsin - Milwaukee yelena@cs.uwm.edu

  14. Agent Implementation Languages

  15. Agent Implementation Languages • Preeti Seshadri’s thesis has two parts • Survey of languages • Pure scripting languages • Tcl, Perl • Scripting/general-purpose languages • Java, Python, Telescript • Resource management service for Java • Interface design • Partial implementation

  16. Language Requirements • Good language infrastructure • OO or other good modularity features • Automated memory management • Decent performance • Byte-compilation is probably enough • Portability • Security • Mobile agents must either be trusted or controlled • Control is always better

  17. Language Survey • Systems programming languages (C, C++) • high-performance, but non-portable and insecure • Pure scripting languages (Tcl, Perl) • low-to-medium performance, portable • limited security and communication services • Scripting/general-purpose languages (Java, Python, Telescript) • medium-performance, portable • more security and communication support

  18. Systems Programming Languages • Native-code compilation yields very high performance • Native-code is not portable • compilation is too complex to perform at client site • Language definitions are limited • no security or coordination infrastructure • little is guaranteed about higher-level services • even exception handling is limited

  19. Pure Scripting Languages • Tcl is a bad choice • poorly suited for larger applications • low performance • poor language infrastructure • non-OO, no threads, no exceptions • Perl is a bit better • performance is better, but not great • limited security • language complexity is high

  20. Telescript • “Environment” for constructing agent societies • Proprietary (General Magic, Inc.) • Language, engine, communication protocols • Claimed to be fast, easy-to-use, secure • Core concepts • “places” are execution contexts and can be nested • No agent-to-agent communication • agents move to places and do things • Capability-based security (“permits”)

  21. Python • An OO scripting language • Unusual dynamic type system • Many high-level data types • Socket-level networking support • Typical byte-compiled characteristics • portability, dynamic linking • Limited security support • “Restricted execution,” similar to sandboxing • appears poorly integrated with mobility

  22. Java • General-purpose language widely used for scripting-style applications • Excellent language design • Medium performance • Strong security features • customizable “sandboxes” • Heavily and effectively hyped • portability is overrated • performance will probably never match C++

  23. Security Issues in Java • Java is very secure, but problems remain • e.g. security managers are inflexible • Agent portability is a problem • A newly arrived agent must be trusted • sandboxing addresses the obvious trust issues • Denial of service attacks are still possible • deliberate and accidental • Java lacks standard resource management services

  24. Resource Management Interface • Supports both monitoring and control • CPU time • memory • threads • Granularity is per-thread and per-threadgroup • Designed to work on bytecode, not source • can monitor “outside” agents

  25. Class Structure [diagram omitted; classes shown: RunTimeException, UsageException, ThreadRegister, ChiefMonitor, ExceedUse; relationship labels as extracted: has, uses interface, uses exception, uses, is]

  26. Interface Details • Initialization • Resource usage queries • consumption • limits • Resource usage control • set usage bounds and policy • reset usage bounds • Resource exceptions • interrupt-style control
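
To make the shape of such an interface concrete, here is a small Python sketch of per-thread usage queries, usage bounds, and an exception raised when a bound is passed. It is an analogy only. The thesis interface targets Java, and the names `ResourceMonitor`, `ResourceExceeded`, `charge`, `set_limit`, and `clear_limit` are hypothetical, not taken from the class diagram. The sketch is cooperative: the monitored code must call `charge` itself, which corresponds to a prototype in which the agent is built with internal monitoring support. It does not implement true interrupt-style control or thread groups.

```python
import threading
from collections import defaultdict


class ResourceExceeded(Exception):
    """Raised in the calling thread when it passes a usage bound."""


class ResourceMonitor:
    """Per-thread usage accounting with optional bounds (hypothetical sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        # thread id -> resource name -> amount consumed so far
        self._usage = defaultdict(lambda: defaultdict(float))
        # (thread id, resource name) -> bound
        self._limits = {}

    @staticmethod
    def _tid(thread_id):
        # Default to the calling thread when no thread id is given.
        return threading.get_ident() if thread_id is None else thread_id

    def set_limit(self, resource, bound, thread_id=None):
        """Set a usage bound for one resource on one thread."""
        with self._lock:
            self._limits[(self._tid(thread_id), resource)] = bound

    def clear_limit(self, resource, thread_id=None):
        """Remove the bound so the resource is unlimited again."""
        with self._lock:
            self._limits.pop((self._tid(thread_id), resource), None)

    def limit(self, resource, thread_id=None):
        """Query the current bound, or None if there is no bound."""
        with self._lock:
            return self._limits.get((self._tid(thread_id), resource))

    def usage(self, resource, thread_id=None):
        """Query how much of a resource a thread has consumed."""
        with self._lock:
            return self._usage[self._tid(thread_id)][resource]

    def charge(self, resource, amount):
        """Record consumption by the calling thread and enforce its bound."""
        tid = threading.get_ident()
        with self._lock:
            self._usage[tid][resource] += amount
            used = self._usage[tid][resource]
            bound = self._limits.get((tid, resource))
        if bound is not None and used > bound:
            raise ResourceExceeded(f"{resource}: used {used}, limit {bound}")
```

Raising the exception inside the offending thread is a deliberate simplification. It lets the host catch `ResourceExceeded` around the agent's entry point and apply whatever policy it chooses.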

  27. Implementation Plans • Prototype requires that agent be built with internal monitoring support • agent’s implementor must cooperate • We want to impose monitoring on arbitrary mobile agents • Solution: bytecode rewriting • All interesting operations have well-defined representation in Java bytecode • Will wrap relevant bytecodes in monitoring code • Similar to Purify/Quantify

  28. A Theory of Media for Multimedia Authoring and Browsing Systems

  29. The Original Problem • Develop a multimedia document system that allows easy addition of new media modules • Kernel/shell architecture • Shells support individual media • text, graphics, video • Kernel provides medium-independent services • document structure, scripting language, style sheet system

  30. Proteus Style Sheet System • Portable style sheet system • PSL style language adapts to application (or media module) • medium supported by application is specified with MSPEC language • Architecture designed for multiple, simultaneous presentations • Used in Ensemble document environment and in MPMosaic WWW browser
