Part Two: Using Xaira to explore corpora • Richard Xiao • z.xiao@lancaster.ac.uk
Outline of the talk • Concordance • Wordlist • Keywords (No) • Output formats • Manipulating results • Collocation/colligation • Distribution analysis • Live demonstration • Tips for keeping away from bugs • Multilingual dimension • Xaira FAQs
Concordance • Word query • Search for a word • Phrase/Quick query • Search for a word or phrase • Addkey query • POS or lemma search • Pattern query • Regular expression search • XML query • Search for XML markup • CQL/XQL query • Search using the XML-based Corpus Query Language • Query builder • A powerful combination of all query types
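To make the idea of a pattern query shown in KWIC (key word in context) form concrete, here is a minimal Python sketch. It is not Xaira's implementation: the function names `kwic` and `format_kwic`, the character-based context width, and the sample sentence are all assumptions made for illustration.

```python
import re

def kwic(text, pattern, width=30):
    """Return (left, match, right) context triples for each regex match.

    Context is measured in characters here, which is an assumption;
    a real concordancer would normally count words or XML units.
    """
    lines = []
    for m in re.finditer(pattern, text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left, m.group(0), right))
    return lines

def format_kwic(lines, width=30):
    """Right-align the left context so the key words line up in a column."""
    return [f"{left:>{width}} [{kw}] {right}" for left, kw, right in lines]

# Hypothetical sample text, not taken from any corpus.
sample = "The cat sat on the mat. A cat may look at a king. Cats purr."
hits = kwic(sample, r"\b[Cc]ats?\b", width=15)
for row in format_kwic(hits, width=15):
    print(row)
```

The regular expression `\b[Cc]ats?\b` matches "cat", "Cat", "cats" and "Cats" as whole words, which is roughly what a pattern query over word forms would be used for.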
Wordlist • In Client >> Word query (up to 100,000 lexicon entries): sort alphabetically, by frequency, or by number of forms • In Xaira Indexer Tools >> Tools >> Indexer >> Options >> Create frequency table
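As a rough illustration of what a wordlist with alphabetical and frequency sorting involves, the following Python sketch builds a frequency table from raw text. The tokenisation rule (`\w+` on lower-cased text) and the function names are assumptions; Xaira works from its own index and lexicon, so its counts may differ. Sorting by number of forms is not covered here.

```python
import re
from collections import Counter

def wordlist(text):
    """Count word forms; \\w+ is Unicode-aware in Python 3."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)

def by_frequency(freqs):
    """Most frequent first; ties are broken alphabetically."""
    return sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))

def alphabetically(freqs):
    """Sort entries by word form."""
    return sorted(freqs.items())

# Hypothetical sample text.
freqs = wordlist("The cat sat on the mat. The dog sat too.")
for word, count in by_frequency(freqs):
    print(f"{word}\t{count}")
```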
Keywords? • Sadly, no • Use WordSmith instead • WordSmith version 4.0 fully supports Unicode
Output formats • Page mode vs. Line mode (KWIC) • Plain text vs. XML text • Scope of context • Alignment (left, right, top, bottom) • Reference (on the status bar)
Manipulating results • Edit query (to save time for related queries) • Bibliographical data • Sort KWIC concordances • Select/block select/copy concordances • Right click on a concordance • Thin/edit concordances • Random sampling • Save queries and export them in XML • Print results
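Thinning a large result set by random sampling can be pictured with the short Python sketch below. It is only an illustration of the idea, not of how Xaira selects its sample; the function name `thin_random` and the optional `seed` parameter (added so that a sample can be reproduced) are assumptions.

```python
import random

def thin_random(concordances, k, seed=None):
    """Keep k randomly chosen concordance lines, preserving their order."""
    rng = random.Random(seed)
    if k >= len(concordances):
        return list(concordances)
    keep = sorted(rng.sample(range(len(concordances)), k))
    return [concordances[i] for i in keep]

# Hypothetical result set of 100 concordance lines.
lines = [f"hit {i}" for i in range(100)]
sample = thin_random(lines, 10, seed=42)
print(sample)
```

Sampling indices rather than the lines themselves keeps the thinned set in its original corpus order, which makes it easier to compare with the full listing.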
Collocation/colligation • Statistical measure (MI or Z) • Window span • Minimum frequency • Minimum MI/Z score • Top N collocates • Computing collocation statistics for individual words • Applying selected lemmata • Colligation (Addkey tags)
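The options on this slide (measure, window span, minimum frequency, top N) map onto a small computation, sketched here in Python. The formulas are one common textbook formulation and are an assumption: expected frequency E = p × f(node) × window size, MI = log2(O / E), and z = (O − E) / sqrt(E × (1 − p)), where p is the collocate's relative frequency in the corpus. Xaira's own computation may differ in detail, and the function name and toy token list are hypothetical.

```python
import math
from collections import Counter

def collocates(tokens, node, span=2, min_freq=1):
    """Return {collocate: (observed, MI, z)} for words within +/- span of node."""
    n = len(tokens)
    freqs = Counter(tokens)
    observed = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - span), min(n, i + span + 1)
        for j in range(lo, hi):
            if j != i:
                observed[tokens[j]] += 1
    window = 2 * span  # positions considered around each node occurrence
    results = {}
    for word, o in observed.items():
        p = freqs[word] / n  # relative frequency of the collocate
        if o < min_freq or p >= 1:
            continue
        expected = p * freqs[node] * window
        mi = math.log2(o / expected)
        z = (o - expected) / math.sqrt(expected * (1 - p))
        results[word] = (o, mi, z)
    return results

# Toy token list, far too small for meaningful statistics.
tokens = "strong tea and strong coffee but weak tea".split()
results = collocates(tokens, "strong", span=1)
top_n = sorted(results.items(), key=lambda kv: -kv[1][1])[:2]  # top 2 by MI
print(top_n)
```

MI tends to favour rare collocates, which is one reason a minimum frequency threshold is usually set alongside a minimum score.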
Distribution analysis • Defining partition (subcorpora) • (Texts >> Column control to select XML tags) • Texts >> Define partition (3 ways) • Based on selected class, values in a column, or solutions to a query • Texts >> Open partition • Tabulation (text class, words, hits, %, etc.) • Normalised frequencies for subcorpora • Sorting tabulated data • Graphic presentation (pie/bar chart) • Save distribution data in various forms • Copy pie/bar chart
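Normalised frequencies make hit counts comparable across subcorpora of different sizes. The Python sketch below shows the arithmetic, assuming a per-million-words base; the partition names and figures are invented for illustration and do not come from any corpus, and the base Xaira actually reports may differ.

```python
def normalised_frequency(hits, words, per=1_000_000):
    """Hits per `per` words, so partitions of different sizes can be compared."""
    return hits * per / words

# Hypothetical partitions: name -> (hits, words in partition).
partitions = {"spoken": (120, 400_000), "written": (450, 3_600_000)}

table = {name: normalised_frequency(h, w) for name, (h, w) in partitions.items()}
for name, value in sorted(table.items(), key=lambda kv: -kv[1]):
    print(f"{name}\t{value:.1f} per million words")
```

In this made-up example the written partition has more raw hits, but the spoken partition has the higher normalised frequency, which is exactly the kind of difference raw counts hide.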
Additional features of Xaira • Annotating concordances (making notes) • Copying query text or notes • User-defined stylesheet • Colour book (e.g. different colours for different POS categories) • Remote access over a network • Platform-independent
Xaira live demonstration • Here we go… • …slides to follow
Tips for keeping away from bugs • In Line mode, a maximum of 1,524 concordances are displayed • See the rest in Page mode • In Query builder, joining query nodes in the horizontal direction (‘OR’) and then in the vertical direction (‘AND’) may produce unreliable counts when the Link type is specified as ‘One-way’ or ‘Two-way’ • Only define Link type as ‘Next’ or ‘Not next’ • If thousands of hits are downloaded and dozens of them are deleted by reverse selection in thinning, the system may crash • If concordances have been sorted or edited, a saved query may fail to open again • Save the edited concordances as an XML list using ‘Query – Listing’ in the menu or the corresponding toolbar button
Xaira FAQs • Is Xaira free and where can I get it? • Yes, it is absolutely free. You can get a copy (binary for Windows, and source code for compilation on Unix/Linux/Mac systems) at the SourceForge website. The latest release is 116. http://sourceforge.net/project/showfiles.php?group_id=130289 • Where can I get more documentation? • In addition to the built-in help file, more documentation is available at the Xaira site: http://www.oucs.ox.ac.uk/rts/xaira/ • Where can I get technical help? • You can sign up for the Xaira Preview List to get help: http://www.tei-c.org.uk/tei-bin/betatest • For a critical review, see http://www.lancs.ac.uk/postgrad/xiaoz/papers/xaira_review.pdf