Automatic Analysis of Corporate Financial Disclosures • Darina M. Slattery, University of Limerick, Ph.D. Postgraduate Student • Richard F.E. Sutcliffe, University of Limerick • Eamonn J. Walsh, University College Dublin
Research Objective • To investigate the correlation between the release of disclosure information and share price responses, and to develop a system that will analyse such disclosure information and predict the likely share price response accordingly
Principal Stages of Research • Identify Interesting Content in Disclosures • Use Classification Techniques to Predict the likely Share Price Response, based on Interesting Content identified
Definition of Financial Terms • Corporate Financial Disclosures • Securities & Exchange Commission (SEC) • EDGAR
Definition of Financial Terms • SIC Code • CIK Code • Ticker Symbol • Registrant
Sources of Financial Information • Disclosures • Historical Share Price Data • Press Releases • Industry Trends • Speculation
SEC & EDGAR: Background (1/2) • The Securities Act of 1933 & the Securities Exchange Act of 1934: • Require all Public Domestic Companies to disclose and file Specific Periodic & Annual Reports with the SEC • As of May 1996, all SEC disclosures must be filed on EDGAR and thus made available online at http://www.sec.gov/edgarhp.htm
SEC & EDGAR: Background (2/2) • EDGAR can be accessed by: • Web Browser • Anonymous File Transfer Protocol (FTP) • EDGAR Search Facilities: • General-Purpose Searches • Special-Purpose Searches
EDGAR Disclosures • Format Dictated by Law • Plain Text or HTML only • Certain Limitations on HTML Tags • For Consistency
EDGAR Form Types (1/2) • Form 10-Q • Quarterly Report • Continuing View of the Company • Due within 45 Days after Quarter Close • Form 10-K • Annual Report • Comprehensive Overview of the Company • Due within 90 Days after End of Fiscal Year
EDGAR Form Types (2/2) • Form 8-K • Current Report • Reports the Occurrence of any Material Events and Corporate Changes • Of interest to Shareholders & Potential Investors • Due within 5 or 15 Days after the Event Occurrence, depending on Event
Stages Undertaken To-Date • Obtained Suitable Data Set • Analysed Structure of Form 8-Ks • Prepared Content for Classification • Attempted to Classify Documents by Likely Share Price Response
Data Set • Obtained 567 Form 8-K disclosures in SIC 7372 • Filing Date • URL • Share Price Response around Filing Date: Up, Down or Nochange • 219 of the 567 Disclosures Chosen for Experiments • Categorised by Share Price Response • 80% of each Category used as Training Data • 20% of each Category used as Test Data
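The 80%/20% split within each category can be sketched as a stratified split. This is a minimal illustration, not the authors' procedure: the function name `stratified_split`, the random shuffling, the seed and the toy records are all assumptions.

```python
import random
from collections import defaultdict

def stratified_split(records, train_fraction=0.8, seed=0):
    """Split (doc_id, label) pairs so every label keeps the same train/test ratio."""
    by_label = defaultdict(list)
    for doc_id, label in records:
        by_label[label].append(doc_id)

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, test = [], []
    for label, doc_ids in sorted(by_label.items()):
        rng.shuffle(doc_ids)
        cut = int(len(doc_ids) * train_fraction)
        train += [(d, label) for d in doc_ids[:cut]]
        test += [(d, label) for d in doc_ids[cut:]]
    return train, test

# Hypothetical toy data: 10 disclosures per category (not the real data set).
records = [(f"doc{i}", label)
           for i, label in enumerate(["up", "down", "nochange"] * 10)]
train, test = stratified_split(records)
```

Splitting per category rather than over the whole set keeps the Up/Down/Nochange proportions the same in the training and test data.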
Structural Analysis • Header • Items 1-9 • Item 1: Changes in Control of Registrant • Item 2: Acquisition or Disposition of Assets • Item 3: Bankruptcy or Receivership • Item 4: Changes in Registrant’s Certifying Accountant • Item 5: Other Materially Important Events • Item 6: Resignations of Registrant’s Directors • Item 7: Financial Statements and Exhibits • Item 8: Change in Fiscal Year • Item 9: Regulation FD Disclosure • Appendices
Content Preparation: Wordsmith • Input: 219 Disclosures • "On June 2, 1998, XXX Corporation announced in a press release the signing of an Acquisition Agreement and Plan of Merger …" • "On May 2, 2000, XXX Corporation announced in a press release the sale of the outstanding capital stock of …" • … • "On April 17, 1997, XXX Corporation signed an Agreement and Plan of Merger …" • Process: WORDSMITH • Output: Large Phrase List • agreement and plan of merger • announced in a press release • the outstanding capital stock of • … • Note: Sorted by frequency in descending order
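How a frequency-sorted phrase list might be derived can be sketched with a plain n-gram counter. This is a stand-in only: the slides do not describe how WordSmith builds its list, so the fixed phrase length `n=5`, the function name `phrase_list` and the word pattern are assumptions.

```python
import re
from collections import Counter

def phrase_list(documents, n=5):
    """Count every n-word sequence and return (phrase, count), most frequent first."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z0-9]+", text.lower())
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts.most_common()

# The three example sentences shown on the slide, up to their ellipses.
docs = [
    "On June 2, 1998, XXX Corporation announced in a press release the "
    "signing of an Acquisition Agreement and Plan of Merger",
    "On May 2, 2000, XXX Corporation announced in a press release the "
    "sale of the outstanding capital stock of",
    "On April 17, 1997, XXX Corporation signed an Agreement and Plan of Merger",
]
ranked = phrase_list(docs)
```

On these sentences, "announced in a press release" and "agreement and plan of merger" each occur twice and so rank above the phrases that occur once.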
Content Preparation: Tokenisation • Input: Large Phrase List • agreement and plan of merger • announced in a press release • the outstanding capital stock of • … • Output: Tokenised Phrase List • 'agreement','and','plan','of','merger' • 'announced','in','a','press','release' • 'the','outstanding','capital','stock','of' • … • Note: Sorted by frequency in descending order • Note: 219 Disclosures are also Tokenised
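The tokenisation step can be sketched as lowercasing the text and keeping runs of letters and digits, which reproduces the token sequences shown on the slides. The exact rules used in the study are not given, so the pattern below (which, for example, splits on apostrophes and hyphens) is an assumption.

```python
import re

def tokenise(text):
    """Lowercase the text and return its alphanumeric tokens in order."""
    return re.findall(r"[a-z0-9]+", text.lower())

tokens = tokenise("On June 2, 1998, XXX Corporation announced in a press release")
```

The same function would be applied both to the phrase list and to the 219 disclosures, so that phrases and documents are compared token by token.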
Content Preparation: Compound Search Trees • Input: Tokenised Phrase List • 'agreement','and','plan','of','merger' • 'announced','in','a','press','release' • 'the','outstanding','capital','stock','of' • … • Note: At this stage, the phrase list is cut off at 100, 1,000 & 10,000 phrases for the three experiments • Process: COMPOUND SEARCH TREE GENERATOR • Output: Search Tree • [agreement, [and, [plan, [of, [merger, []]]]]] • [announced, [in, [a, [press, [release, []]]]]] • [the, [outstanding, [capital, [stock, [of, []]]]]] • …
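The bracketed search trees can be sketched as nested lists, one per phrase, built after the phrase list is cut off. The function names `to_nested` and `build_search_trees` are hypothetical, and the generator used in the study may merge shared prefixes or differ in other ways.

```python
def to_nested(tokens):
    """['a', 'b'] -> ['a', ['b', []]], mirroring the bracketed trees on the slide."""
    tree = []
    for token in reversed(tokens):
        tree = [token, tree]
    return tree

def build_search_trees(phrases, cutoff):
    """Keep the `cutoff` most frequent phrases and nest each one."""
    return [to_nested(phrase.split()) for phrase in phrases[:cutoff]]

# Phrases assumed to be already sorted by descending frequency.
phrases = [
    "agreement and plan of merger",
    "announced in a press release",
    "the outstanding capital stock of",
]
trees = build_search_trees(phrases, cutoff=2)
```

With `cutoff=2` only the first two phrases are kept; the three experiments would use cut-offs of 100, 1,000 and 10,000.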
Content Preparation: Compound Noun Matching (1/2) • Inputs to COMPOUND NOUN MATCHING • Search Tree • [agreement, [and, [plan, [of, [merger, []]]]]] • [announced, [in, [a, [press, [release, []]]]]] • [the, [outstanding, [capital, [stock, [of, []]]]]] • … • 219 Disclosures (Tokenised) • 'on','june','2','1998','xxx','corporation','announced','in','a','press','release',… • 'on','may','2','2000','xxx','corporation','announced','in','a','press','release',… • … • 'on','april','17','1997','xxx','corporation','signed','an','agreement','and','plan',…
Content Preparation: Compound Noun Matching (2/2) • Output of COMPOUND NOUN MATCHING: 219 Disclosures With Single Token Phrases • announced_in_a_press_release agreement_and_plan_of_merger … • announced_in_a_press_release the_outstanding_capital_stock_of … • … • agreement_and_plan_of_merger … • Note: The Number of Single Token Phrases is Determined by the Length of Phrase List Used (100, 1,000 or 10,000 Phrases)
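The matching step can be sketched as a left-to-right scan that joins each matched phrase into a single underscore token. Two assumptions are made here: a dictionary-based trie is swapped in for the bracketed search trees, and unmatched words are dropped, as the slide's example output appears to show. The function names are hypothetical.

```python
def build_trie(phrases):
    """Store tokenised phrases in a nested dict; the key None marks a phrase end."""
    trie = {}
    for phrase in phrases:
        node = trie
        for token in phrase:
            node = node.setdefault(token, {})
        node[None] = True
    return trie

def match_compounds(tokens, trie):
    """Scan left to right, emitting the longest phrase that starts at each position."""
    found, i = [], 0
    while i < len(tokens):
        node, j, end = trie, i, None
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]
            j += 1
            if None in node:
                end = j  # remember the longest complete phrase so far
        if end is None:
            i += 1       # no phrase starts here; skip this word
        else:
            found.append("_".join(tokens[i:end]))
            i = end      # continue after the matched phrase
    return found

trie = build_trie([
    ["agreement", "and", "plan", "of", "merger"],
    ["announced", "in", "a", "press", "release"],
    ["the", "outstanding", "capital", "stock", "of"],
])
doc = ("on june 2 1998 xxx corporation announced in a press release the "
       "signing of an acquisition agreement and plan of merger").split()
single_tokens = match_compounds(doc, trie)
```

For the first example disclosure this yields `announced_in_a_press_release` followed by `agreement_and_plan_of_merger`, matching the slide.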
Content Preparation: Overview • 219 Disclosures → 219 Disclosures With Single Token Phrases • "On June 2, 1998, XXX Corporation announced in a press release the signing of an Acquisition Agreement and Plan of Merger …" → announced_in_a_press_release agreement_and_plan_of_merger … • "On May 2, 2000, XXX Corporation announced in a press release the sale of the outstanding capital stock of …" → announced_in_a_press_release the_outstanding_capital_stock_of … • … → … • "On April 17, 1997, XXX Corporation signed an Agreement and Plan of Merger …" → agreement_and_plan_of_merger …
Classification: DTS System (1/2) • The C4.5 Decision Tree System (DTS) • A Machine Learning Algorithm • Derives Decision Trees • Set of Records (i.e. Set of Disclosures) • Non-Categorical Attributes • Categorical Attribute • Files Required • File.names • File.data • File.test
Classification: DTS System (2/2) • File.names • Defines Non-Categorical & Categorical Attributes • File.data & File.test • Training & Test Files Take Same Format
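The contents of the three files can be sketched as follows. The layout follows the C4.5 conventions as commonly documented (class names first in the .names file, one comma-separated case per line with the class last in the .data and .test files) and should be checked against the release in use. Encoding each phrase as a present/absent (`t`/`f`) attribute is an assumption; the slides do not say how the attributes were defined.

```python
def names_file(classes, phrases):
    """First line lists the class values; each later line declares one attribute."""
    lines = [", ".join(classes) + "."]
    lines += [f"{phrase}: t, f." for phrase in phrases]
    return "\n".join(lines) + "\n"

def data_file(docs, phrases):
    """One case per line: a t/f flag per phrase, then the class label."""
    rows = []
    for found, label in docs:
        flags = ["t" if phrase in found else "f" for phrase in phrases]
        rows.append(", ".join(flags + [label]))
    return "\n".join(rows) + "\n"

phrases = ["announced_in_a_press_release", "agreement_and_plan_of_merger"]
# Hypothetical cases: (set of single token phrases found, share price response).
train_docs = [({"announced_in_a_press_release"}, "up"), (set(), "nochange")]

names_text = names_file(["up", "down", "nochange"], phrases)
data_text = data_file(train_docs, phrases)
```

Because the training and test files share one format, `data_file` would be called twice, once for File.data and once for File.test.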
Classification: IIS System • The Inverted Index System (IIS) • A Conventional Information Retrieval (IR) System • But used as a Classification System Here • Query = Disclosure • Only 3 Documents in Document Collection • Ups • Downs • Nochanges
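Using a retrieval system as a classifier can be sketched as follows: the training disclosures of each category are pooled into one document, the test disclosure is the query, and the category whose document scores highest is the prediction. The tf-idf style weighting below is an assumption, since the slides do not specify the ranking function, and the toy phrase counts are hypothetical and say nothing about real price responses.

```python
import math
from collections import Counter, defaultdict

def build_index(collection):
    """Map each term to {document name: term frequency}."""
    index = defaultdict(dict)
    for name, tokens in collection.items():
        for term, tf in Counter(tokens).items():
            index[term][name] = tf
    return index

def classify(query_tokens, index, n_docs):
    """Return the name of the highest-scoring document, or None if nothing matches."""
    scores = Counter()
    for term in set(query_tokens):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(1 + n_docs / len(postings))  # assumed weighting
        for name, tf in postings.items():
            scores[name] += tf * idf
    return scores.most_common(1)[0][0] if scores else None

# Hypothetical pooled training documents, one per category.
collection = {
    "ups": ["agreement_and_plan_of_merger", "announced_in_a_press_release",
            "agreement_and_plan_of_merger"],
    "downs": ["announced_in_a_press_release", "the_outstanding_capital_stock_of"],
    "nochanges": ["announced_in_a_press_release"],
}
index = build_index(collection)
query = ["agreement_and_plan_of_merger", "announced_in_a_press_release"]
predicted = classify(query, index, n_docs=len(collection))
```

In this toy collection the query is assigned to "ups", because the merger phrase occurs only in that pooled document.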
Classification Experiments • Conducted Three Experiments • DTS & IIS used in each experiment • First Experiment • 100 Most Frequently Occurring Phrases • Second Experiment • 1,000 Most Frequently Occurring Phrases • Third Experiment • 10,000 Most Frequently Occurring Phrases
Conclusions & Next Steps • How can we Improve the Classification Results? • Identify Significant Disclosures • Identify Significant Content • Increase Data Set • Two-Way Classification? • Automate Partitioning of Relevant Disclosures • Automate Scanning for Significant Content
Contact Details Ms. Darina Slattery, B.B.S. M.Sc. Dept. of Computer Science & Information Systems, University of Limerick, Limerick, Ireland Tel: +353-61-213551 Email: darina.slattery@ul.ie