
Discourse Level Software

Discourse Level Software. Current Status and Future Directions. Nov. 16, 2004 Lars Huttar (lars_huttar@sil.org) Knowledge Management Services. Abstract (I). Discourse analysis (DA, a.k.a. textlinguistics) is a task frequently cited as needing computer-assisted tools.


Presentation Transcript


  1. Discourse Level Software: Current Status and Future Directions • Nov. 16, 2004 • Lars Huttar (lars_huttar@sil.org) • Knowledge Management Services

  2. Abstract (I) • Discourse analysis (DA, a.k.a. textlinguistics) is a task frequently cited as needing computer-assisted tools. • Some tools are currently available for certain tasks, but as yet there are no user-ready applications specifically for the discourse charting commonly used on the field.

  3. Abstract (II) • This presentation will review a few of the existing tools most pertinent to DA on the field, and software that is planned or under development. • I will also mention the conceptual model for constituent charting described in my thesis, which uses XML encoding of text and analysis, from which a chart is rendered via XSL.

  4. Overview • The need for discourse analysis software • What’s already out there? • What’s coming down the pike?

  5. Need for Discourse Software The task: • Help the user produce charts, diagrams, and summaries of texts in such a way as to facilitate discovery of discourse patterns and to expedite testing of hypotheses.

  6. Major features desired • Import (interlinear) text • Segment and move pieces into chart columns • Mark genre(s) • Configurable auto-highlighting, e.g. color by POS • Toggle highlighting of certain features • Manual annotation of features incl. coherence and prominence • Search text, IT, and annotations • Chart/summary of results, hyperlinked to data • Accessible to MTTs/OTTs • Geoffrey Hunt, Kent Spielmann
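As an illustration of the "configurable auto-highlighting" and "toggle highlighting" items, here is a minimal Python sketch. The POS tags, the color table, and the `highlight` function are all hypothetical; they are not taken from any of the tools reviewed in this presentation.

```python
from html import escape

# Hypothetical, user-configurable mapping from part-of-speech tag to color.
POS_COLORS = {"V": "red", "N": "blue"}

def highlight(tokens, colors=POS_COLORS, enabled=None):
    """Render (word, pos) pairs as HTML, coloring only the enabled POS tags.

    `enabled` acts as the toggle: None means every tag in `colors` is on.
    """
    enabled = set(colors) if enabled is None else set(enabled)
    out = []
    for word, pos in tokens:
        safe = escape(word)
        if pos in enabled and pos in colors:
            out.append(f'<span style="color:{colors[pos]}">{safe}</span>')
        else:
            out.append(safe)
    return " ".join(out)
```

Passing `enabled={"V"}` would color only verbs and leave every other word plain.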

  7. Example constituent chart

  8. Current Practice • Pencil & paper • MS Word • MS Excel • A few brave souls use other tools

  9. The Right Tools? Specialized tools could make it quicker and easier!

  10. How to Address the Need? • Use existing software • SIL FieldWorks DA tool(s) • Extend existing tools?

  11. What’s already here? • MDA • BART • RSTTool • MATE • CiCaDA

  12. Multilinear Discourse Analysis • Generate statistics and diagrams relating to span analysis, topic continuity statistics, and other issues • Input is an SFM marked up text (e.g. from Shoebox) • In Beta 2 • More info: phil.quick@sil.org
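For readers unfamiliar with SFM (standard format marker) input, the Python sketch below shows one way such a file could be read into records. It is not MDA's code. The marker names (`\ref`, `\tx`, `\ge`) are assumed from common Shoebox practice, and a real project may use different ones.

```python
import re

def parse_sfm(text):
    """Split SFM text into (marker, value) pairs.

    A field starts with a backslash marker at the beginning of a line;
    any following unmarked line is treated as a continuation of that field.
    """
    fields = []
    for line in text.splitlines():
        m = re.match(r"\\(\S+)\s*(.*)", line)
        if m:
            fields.append([m.group(1), m.group(2).strip()])
        elif fields and line.strip():
            fields[-1][1] += " " + line.strip()
    return [tuple(f) for f in fields]

def group_records(fields, record_marker="ref"):
    """Group fields into records; each record starts at `record_marker` (assumed)."""
    records = []
    for marker, value in fields:
        if marker == record_marker or not records:
            records.append({})
        records[-1].setdefault(marker, []).append(value)
    return records
```

Statistics such as span lengths or topic continuity would then be computed over the resulting list of records.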

  13. Biblical Analysis Research Tool • BART – has features supporting discourse analysis of biblical texts • Comes with extensive built-in morphosyntax markup; supports customizable tagging and complex queries. • Only for biblical texts; can’t enter vernacular texts. • Part of TW, or available from WordSearch Corp. • www.sil.org/translation/bart.htm

  14. RSTTool • Lets user diagram relations between text “chunks.” • Free download from http://www.wagsoft.com/RSTTOOL • User can define own set of relations, schemas, etc. such as SSA or Longacre’s propositional relations. • Can generate statistics based on the tree structures built by the user. • File format is XML-based. • Text can be edited even after structuring has begun.
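To make "statistics based on the tree structures" concrete, the Python sketch below counts how often each relation occurs in a small relation tree. The `Node` class and the relation names are hypothetical and do not reflect RSTTool's actual file format or internals.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class Node:
    relation: str = ""   # relation linking this node to its parent ("" at the root)
    text: str = ""       # text of the chunk, if this is a leaf
    children: list = field(default_factory=list)

def relation_counts(root):
    """Walk the tree and count occurrences of each relation label."""
    counts = Counter()
    stack = [root]
    while stack:
        node = stack.pop()
        if node.relation:
            counts[node.relation] += 1
        stack.extend(node.children)
    return counts
```

A user-defined relation set (for example SSA relations) would simply appear as different `relation` strings.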

  15. MATE Workbench • Tool “to aid in the display, editing and querying of annotated speech corpora” • Encodes data in XML and displays via XSL-like stylesheets; could be programmed to produce various displays. • In “early demo” version (2001). Looks like it has potential, but I can’t get it to run on my machine. • http://mate.nis.sdu.dk/

  16. CiCaDA • Produce fairly feature-complete constituent charts from XML data using XSLT stylesheets. • Encode text, column assignments, and chart configuration in XML; chart is produced automatically. • Open standards promote modification/reuse of data. • There is no “application;” no user-friendly way to enter the XML data.
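The idea of encoding text and column assignments in XML and producing the chart automatically can be sketched as follows. This is only an illustration: the element and attribute names (`columns`, `clause`, `chunk`, `col`) are invented for this example and are not CiCaDA's schema, and Python's standard `xml.etree.ElementTree` stands in for the XSLT stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding: chart configuration (columns) plus text with column assignments.
SAMPLE = """
<text>
  <columns>
    <column id="pre">Pre-nuclear</column>
    <column id="S">Subject</column>
    <column id="V">Verb</column>
    <column id="O">Object</column>
  </columns>
  <clause n="1">
    <chunk col="pre">Then</chunk>
    <chunk col="S">the man</chunk>
    <chunk col="V">saw</chunk>
    <chunk col="O">a dog</chunk>
  </clause>
  <clause n="2">
    <chunk col="V">ran</chunk>
  </clause>
</text>
"""

def render_chart(xml_source):
    """Return the chart as a list of rows: a header row, then one row per clause."""
    root = ET.fromstring(xml_source.strip())
    cols = [(c.get("id"), c.text) for c in root.find("columns")]
    rows = [["#"] + [label for _, label in cols]]
    for clause in root.findall("clause"):
        cells = {ch.get("col"): ch.text for ch in clause.findall("chunk")}
        # Columns with no chunk assigned stay empty in that row.
        rows.append([clause.get("n")] + [cells.get(cid, "") for cid, _ in cols])
    return rows

def format_chart(rows):
    """Lay the rows out as a plain-text table with padded columns."""
    widths = [max(len(r[i]) for r in rows) for i in range(len(rows[0]))]
    return "\n".join(
        " | ".join(cell.ljust(w) for cell, w in zip(r, widths)) for r in rows
    )
```

Calling `print(format_chart(render_chart(SAMPLE)))` prints one row per clause. Because the data stays separate from the rendering, changing the `columns` element changes the chart layout without touching the text.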

  17. Helps available • LinguaLinks Library has several items, including: • Analyzing Discourse: a Manual of Basic Concepts – Dooley & Levinsohn (avail. on the web as well as in LLL). Very practical.

  18. Do you know of others? • Please let me know if you are aware of other useful discourse-level software tools!

  19. What’s coming? • TCC • AGTK • FieldWorks DA tools

  20. TCC • “A tool for drawing syntax trees” – could also be used for discourse “chunking” and highlighting • Looks very easy to use. Collapsible tree makes it easy to browse large text structures. • Supports Latin-1 charset. • Author taking feedback to make TCC more useful for SIL’s work. • Still in beta. No release sched. • Info: http://ulrikp.org/

  21. Annotation Graph ToolKit • AGTK is a toolkit for annotating texts • TreeTrans – edit syntactic trees; charting & chunking possible • InterTrans – interlinearize text (very beta) • Saves in an abstract XML format; potential good basis for “Lego” solution • Not ready for end users.

  22. SIL FieldWorks DA Tool(s) • FW DA software is still on the drawing board but is a high priority. • Would leverage the huge benefits of all the work that has gone into FieldWorks! • FW tools already support interlinear text, text annotations/tagging and highlighting. • Preliminary work has begun on design of constituent charting features. • Wish list for DA features exists but requirements not yet prioritized. Guidance team has not yet been formed.

  23. Conclusion • There are some good tools already out there for certain tasks related to DA. Unfortunately they don’t interoperate much, and there are no domain-aware applications for constituent charting. • SIL FieldWorks tools, as they become available, should cover certain DA tasks well, such as constituent charting.

  24. Questions? Comments?
