
Anna Bogomolova, Tatyana N. Yudina, Oleg Karasev, Ruslan Sennov

Research Computing Center of Moscow State University; NCO Center for Information Research.





Presentation Transcript


  1. Research Computing Center of Moscow State University; NCO Center for Information Research. Anna Bogomolova, Tatyana N. Yudina, Oleg Karasev, Ruslan Sennov. University Information System RUSSIA: RF Social and Budget Statistics Modules with Research-assisting Services. System of Subject Headings to Cross-Search Data and Documents on Public Finances

  2. University Information System RUSSIA. Collections: 1,500,000 / 17.5 GB (www.cir.ru)

  3. Sociopolitical Thesaurus: 70,000 concepts, 110,000 conceptual relations • constructed specifically as a tool for automatic text processing; • contains terms from the economic, financial, political, military, social, legislative and cultural domains; • its set of relations is adapted to information-retrieval applications; • regularly tested during automatic text processing

  4. THESAURUS for Information Retrieval in Sociopolitical Domain • The thesaurus supports query refinement (reformulation and expansion) • Its terminology covers 95-98% of the words and terms in Russian government publications, academic papers and mass media texts from 1991 onward • The thesaurus is the main element of the ALTP (automatic linguistic text processing) technology.

  5. Query Refinement
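The query expansion that the thesaurus provides can be illustrated with a minimal sketch. It assumes a toy thesaurus held in a plain dictionary; the entries, the `expand_query` helper and the OR-joined output format are hypothetical and are not taken from the actual system, which works on 70,000 concepts and their conceptual relations.

```python
from typing import Dict, List

# Hypothetical toy thesaurus: concept -> related terms (illustrative entries only).
TOY_THESAURUS: Dict[str, List[str]] = {
    "budget deficit": ["fiscal deficit", "budget shortfall"],
    "tariffs": ["rates", "utility tariffs"],
}


def expand_query(query: str, thesaurus: Dict[str, List[str]]) -> List[str]:
    """Return the normalized query followed by its related terms, without duplicates."""
    key = query.strip().lower()
    terms = [key]
    for related in thesaurus.get(key, []):
        if related not in terms:
            terms.append(related)
    return terms


def to_boolean_or(terms: List[str]) -> str:
    """Join the expanded terms into a single OR query string."""
    return " OR ".join(f'"{term}"' for term in terms)


if __name__ == "__main__":
    print(to_boolean_or(expand_query("Budget Deficit", TOY_THESAURUS)))
```

A query that has no entry in the toy thesaurus is returned unchanged, so expansion never narrows the original search.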

  6. Thematic modules. University Information System RUSSIA includes: • Module of Socioeconomic State Statistics of Russia • Budget Statistics Module • Module of documents of the European Court of Human Rights

  7. System of Subject Headings for Budget Data: 87 hierarchical categories. First-level categories are: • Macroeconomic Indicators • Budget Revenues and Expenditures • Tax Concessions • Budget Deficit/Surplus • State and Municipal Debt • Budget Process • Budget Federalism • Extra-Budgetary Funds • State Authorities • Fiscal Misconduct

  8. Category Description: “Tariffs of Natural Monopolies” • Tariffs & natural monopoly • Tariffs & (gas or electricity or housing and public utilities or railway service) • Tariffs & (Unified Energy System of Russia or Gazprom)
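The category description on slide 8 reads as a set of Boolean rules: a document falls under the heading if it mentions tariffs together with at least one term from any of the three groups. The sketch below is one way to express that, under stated assumptions: the `Rule` type and `matches_category` function are hypothetical, "housing and public utilities" is treated as a single term, and matching is naive case-insensitive substring search, whereas the real system presumably matches thesaurus concepts.

```python
from typing import List, Tuple

# A rule is (required term, alternatives): required AND (alt1 OR alt2 OR ...).
Rule = Tuple[str, List[str]]

# Rules transcribed from the slide; the data layout itself is an assumption.
TARIFF_RULES: List[Rule] = [
    ("tariffs", ["natural monopoly"]),
    ("tariffs", ["gas", "electricity", "housing and public utilities", "railway service"]),
    ("tariffs", ["unified energy system of russia", "gazprom"]),
]


def matches_category(text: str, rules: List[Rule]) -> bool:
    """Return True if any rule is satisfied by the text (naive substring matching)."""
    lowered = text.lower()
    return any(
        required in lowered and any(alt in lowered for alt in alternatives)
        for required, alternatives in rules
    )


if __name__ == "__main__":
    print(matches_category("New tariffs for electricity were approved", TARIFF_RULES))
```

Substring matching is deliberately crude: it ignores word boundaries and inflection, which is exactly the gap that a thesaurus-based text processor is meant to close.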

  9. Further developments • Including microdata • Developing and testing a budget thesaurus • Developing databases of socioeconomic and budgetary statistics
