Automatic Classification of Accounting Literature
Nineteenth Annual Strategic and Emerging Technologies Workshop
Vasundhara Chakraborty, Victoria Chiu, and Miklos Vasarhelyi
San Francisco, July 31, 2010
Outline
• Introduction and Background
  • Motivation and Research Questions
• Literature Review
  • Classification of Accounting Research: The Manual Method
  • Development of Automatic Classification Method
• Methodology: A Two-Phase Experiment
  • Phase I: Keywords
  • Phase II: Full Abstract
• Results and Analysis
  • Treatment
  • Mode of Reasoning
  • Accounting Area
• Conclusion and Implication
Introduction (1/5)
Purpose
• This study explores whether a methodology can be developed to classify academic accounting publications automatically.
Research Questions
• Can we automate the classification of accounting literature by using the keywords from academic journal articles?
• Can we automate the classification of accounting literature by using the full abstracts from academic journal articles?
• Do the results vary depending on which elements are used to automate the classification, and if so, to what extent?
Introduction (2/5)
Contribution
• Extends the usefulness of automatic text classification methods to academic publications.
• Explores a possible improvement to the methodology used in research that investigates the attributes and development of knowledge in the accounting discipline.
Introduction (3/5)
Motivation
• Literature taxonomization is critical for revealing the development and evolution of knowledge in a discipline (Brown et al. 1987, 1989, Vasarhelyi et al. 1988, Brinberg and Shields 1989, Meyer and Rigsby 2001, Heck and Jensen 2007).
• Traditionally, taxonomization in this research area has been performed manually (Vasarhelyi et al. 1984, Brown et al. 1985, 1989, Badua 2005).
Introduction (4/5)
Motivation (cont.)
• The rapid growth of collections in online academic databases makes it increasingly difficult for professionals to access information in a timely and efficient way (Nobata 1999).
• Gangolly and Wu (2000): methods for automatic indexing and classification of concepts have become necessary because of
  • the increasing size of text databases, and
  • the high cost of the domain expertise needed to develop classifications.
Introduction (5/5)
• Taxonomy for Classifying Accounting Publications
  • Accounting Research Directory: The Database of Accounting Literature (ARD) (Brown, Gardner and Vasarhelyi, 1993).
  • Taxons: Treatment, Accounting Area, Mode of Reasoning, Research Method, Inference Style, Mode of Analysis, School of Thought, Information, Geography, Objective, Applicability, and Foundation Discipline.
• Treatment: identifies the major factor (independent variable) or other accounting phenomenon that is associated with or causes the Information taxon (dependent variable).
• Accounting Area: identifies the major accounting field to which the article belongs.
• Mode of Reasoning: identifies the technique used to formally arrive at the conclusion in the article.
Literature Review (1/4)
I. Accounting Literature Classification: The Manual Method
• Studies that examine the attributes and development of accounting research manually classify the publications that represent the core knowledge of the accounting discipline.
  • Kinard and Putney 1968, Gonedes and Dopuch 1974, Hofstedt 1975, 1976, Brown et al. 1987, Vasarhelyi et al. 1988, Rigsby 2001.
• Vasarhelyi et al. (1988) examine trends in accounting research between 1963 and 1984.
  • Taxonomy: Research Method, Foundation Discipline, School of Thought, and Mode of Reasoning.
Literature Review (2/4)
I. Accounting Literature Classification: The Manual Method (cont.)
• Brown et al. (1989) examined accounting publications in academic journals (AOS, TAR, JAE, and JAR) from 1976 to 1984.
  • Taxonomy: Accounting Area, Research Method, School of Thought, and Geography.
• Fleming et al. (2000) studied the evolution of research in The Accounting Review (TAR) between 1966 and 1985.
  • Attributes examined: Research Methods, Financial Accounting subtopics, article length, citations, and author background.
Literature Review (3/4)
II. Development of Automatic Classification Method
• Crouch and Yang (1992) found that automatic classification produces useful thesaurus classes when they are used to supplement query terms.
• Chen et al. (1995) automatically generated a thesaurus to evaluate the Worm Community System (WCS), adopting the algorithmic approach that Chen and Lynch (1992) developed to generate a concept network.
• Nobata (1999) used statistical and decision tree methods to identify and classify biology terms automatically.
• The algorithms applied to automate the classification process still need refinement.
Literature Review (4/4)
II. Development of Automatic Classification Method (cont.)
• Automatically classifying financial accounting concepts (Gangolly 2000)
  • Term frequency in financial accounting standards was analyzed, and clusters of concepts were derived with an agglomerative nesting algorithm.
• Automatically grouping related accounting concepts (Garnsey 2006)
  • Semantic parsing techniques and statistical methods were used.
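To make the agglomerative (bottom-up) clustering idea concrete, here is a minimal sketch, not the procedure from the cited work. It assumes average linkage and Euclidean distance, and the concept names and term-frequency numbers in `PROFILES` are hypothetical, invented only for illustration.

```python
import math

# Hypothetical term-frequency profiles: how often each concept occurs in
# three made-up standards. The numbers are illustrative, not real data.
PROFILES = {
    "lease": [9, 1, 0],
    "lessee": [8, 2, 0],
    "pension": [0, 7, 1],
    "annuity": [1, 8, 0],
    "goodwill": [0, 0, 9],
}


def average_linkage(c1, c2, profiles):
    """Mean Euclidean distance over all cross-cluster pairs (an assumed choice)."""
    pairs = [(x, y) for x in c1 for y in c2]
    return sum(math.dist(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)


def agglomerate(profiles, n_clusters):
    """Start with one cluster per concept and merge the closest pair repeatedly."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        candidates = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(
            candidates,
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], profiles),
        )
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return [sorted(c) for c in clusters]


print(agglomerate(PROFILES, 3))
```

With these toy profiles, concepts that occur in the same standards are merged first, so the result groups the two lease terms and the two pension terms and leaves "goodwill" on its own.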
Methodology (1/3)
• Sample Collection
  • 358 articles published in accounting journals were downloaded.
Methodology (2/3)
[Flowchart] Phase I: Using Keywords
• Boxes: Journal articles; Database; Parse out keywords; Word count; Term frequency; Create document-term matrix; Attribute selection; Apply classification algorithms; Validation of results
[Flowchart] Phase II: Using Full Abstract
• Boxes: Journal articles; Database; Parse out full abstract; Create phrases; Word count; Term frequency; Create document-term matrix; Attribute selection; Apply classification algorithms; Validation of results
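The steps named in the Methodology (2/3) flowcharts can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's implementation: the slides do not name the attribute selection rule, the classifier, or the validation scheme, so a document-frequency cutoff, multinomial Naive Bayes, and leave-one-out validation are stand-ins. The records in `ARTICLES` and their labels are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical (keywords, label) records; not data or subclasses from the study.
ARTICLES = [
    ("audit fees; auditor independence; audit quality", "auditing"),
    ("auditor rotation; audit risk; internal audit", "auditing"),
    ("earnings management; accruals; financial reporting", "financial"),
    ("fair value; financial reporting; disclosure", "financial"),
    ("budgeting; cost allocation; performance measurement", "managerial"),
    ("transfer pricing; cost allocation; budgeting", "managerial"),
]


def tokenize(text, ngram=1):
    """Parse out terms; ngram=2 yields two-word phrases (as for Phase II)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [" ".join(words[i:i + ngram]) for i in range(len(words) - ngram + 1)]


def build_matrix(docs, min_df=2, ngram=1):
    """Word count per document, then a document-term matrix."""
    counts = [Counter(tokenize(d, ngram)) for d in docs]
    doc_freq = Counter(term for c in counts for term in c)
    # Assumed attribute selection: keep terms found in at least min_df documents.
    vocab = sorted(t for t, n in doc_freq.items() if n >= min_df)
    return vocab, [[c[t] for t in vocab] for c in counts]


def train_nb(rows, labels):
    """Multinomial Naive Bayes with add-one smoothing (an assumed classifier)."""
    model = {}
    for cls in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == cls]
        totals = [sum(col) for col in zip(*members)]
        denom = sum(totals) + len(totals)
        prior = math.log(len(members) / len(rows))
        model[cls] = (prior, [math.log((t + 1) / denom) for t in totals])
    return model


def predict_nb(model, row):
    def score(cls):
        prior, loglik = model[cls]
        return prior + sum(x * w for x, w in zip(row, loglik))
    return max(model, key=score)


def leave_one_out_accuracy(rows, labels):
    """Validation: hold out each article once and train on the rest."""
    hits = 0
    for i in range(len(rows)):
        model = train_nb(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:])
        hits += predict_nb(model, rows[i]) == labels[i]
    return hits / len(rows)


docs = [text for text, _ in ARTICLES]
labels = [label for _, label in ARTICLES]
vocab, matrix = build_matrix(docs)
print(vocab)
print(leave_one_out_accuracy(matrix, labels))
```

For simplicity the vocabulary is selected once from all documents before validation; a stricter setup would repeat attribute selection inside each held-out fold.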
Methodology (3/3)
• Examples of Keywords used for Treatment Taxon Classification
Results and Analysis (1/5)
• Analysis on Treatment Taxon
  • Phase I: Keywords Analysis with all Subclasses
Results and Analysis (2/5)
• Analysis on Treatment Taxon
  • Phase I: Keywords Analysis with Class Modification
Results and Analysis (3/5)
• Analysis on Treatment Taxon
  • Phase II: Full Abstracts Analysis with Class Modification
Results and Analysis (4/5)
• Analysis on Accounting Area Taxon
Results and Analysis (5/5)
• Analysis on Mode of Reasoning Taxon
Conclusion (1/2)
• Comparison of Results
Conclusion (2/2)
• Findings Summary
  • This study shows that academic publications can be classified using semantic parsing and data mining techniques.
  • The Treatment and Accounting Area taxons are classified relatively better than the Mode of Reasoning taxon.
• Limitations
  • Only a limited number of articles was used in the experiments.
  • A more comprehensive database, with sufficient representation of articles from the different subclasses, is needed.
• Future Research
  • Use a larger data corpus.
  • Use the full text of articles.
  • Use semantic parsing and data mining methods to discover newly emerging paradigms in the accounting literature.
Thank You!