1 / 25

Workshop on Internet and BigData Finance (WIBF 14)

Workshop on Internet and BigData Finance (WIBF 14). A Big Data Analytical Framework for Portfolio Optimization - Dhanya Jothimani , Ravi Shankar, Surendra S Yadav Department of Management Studies Indian Institute of Technology Delhi India. Flow of Presentation. Introduction

Download Presentation

Workshop on Internet and BigData Finance (WIBF 14)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Workshop on Internet and BigData Finance (WIBF 14) A Big Data Analytical Framework for Portfolio Optimization -DhanyaJothimani, Ravi Shankar, Surendra S Yadav Department of Management Studies Indian Institute of Technology Delhi India

  2. Flow of Presentation • Introduction • Scope of Work • Gaps Identified • Objective • Proposed Framework • Expected Outcomes • Conclusion • References WIBF 14

  3. Introduction • Big Data: Few hundred Gigabytes, Terabytes or Zettabytes! (Qin et al., 2012) • Dimensions: Volume, Velocity, Variety , Variability, Complexity, Low value density (Singh, 2012) • Use of big data technologies in capital firms (Verma & Mani, 2012) • Next Big Challenge for capital markets: To handle the velocity of data being produced (Verma & Mani, 2012) WIBF 14

  4. Introduction • Types of Capital Market Data • Reference Data • Fundamental Data (corporate financials, etc) • News (Earning report, Economic News, etc) • Social Media (Market Sentiment, etc) (Rauchman & Nazaruk, 2013) WIBF 14

  5. Introduction • Portfolio: Collection of assets • Portfolio Optimization: Process of making investment decisions on holding a set of financial assets to meet various criteria • Criteria: • Maximizing Return • Minimizing Risk (Markowitz, 1952) • Major Steps: • Asset selection • Asset Weighting • Asset Management WIBF 14

  6. Scope of the work • Assets range from stocks, bonds to real estate • Scope of the framework is limited to Stock Analysis WIBF 14

  7. Previous Works WIBF 14

  8. Gaps Identified • To the knowledge of the authors: • No framework handles both structured and unstructured data for portfolio optimization • Generally, quantitative data is considered for fundamental analysis WIBF 14

  9. Objective • To propose a framework that considers both stock price data and current affairs data (i.e news articles, market sentiments) to help investors to make an informed investment decision WIBF 14

  10. Proposed Framework • 5- stage Process • Stage 1: Selection of Stocks • Stage 2: Validation of Stocks • Stage 3: Stock Clustering • Stage 4: Stock Ranking • Stage 5: Optimization WIBF 14

  11. Fig 1: Proposed Framework WIBF 14

  12. Stage 1: Selection of Stocks • Input: Listed firms in a particular stock exchange as DMU • Output: Efficient firms • Methodology: DEA, a non-parametric linear programming (Cooper et al., 2004) • Input parameters: Total assets, Total equity, Cost of sales and Operating expenses • Output parameters: Net sales and Net income (Chen, 2008; Chen & Chen, 2011) WIBF 14

  13. Stage 2: Validation of Stocks • Input: Tweets and News articles of efficient firms • Output: Validated firms • Methodology: Sentiment analysis using Distributed text mining • Software framework:HadoopMapReduce • Properties: Ease-of-use, scalability and failover properties (Dittrich & Ruiz, 2012) WIBF 14

  14. Effect of Management on Stock Price Source: http://www.firstbiz.com/data/graphic-how-infosys-stock-moved-since-nr-narayana-murthys-return-86318.html WIBF 14

  15. Stage 2: Validation of Stocks Figure 2: Hadoop Framework for Distributed Text Mining WIBF 14

  16. Stage 3: Stock Clustering • Input: Validated firms • Output: Clusters of firms • Methodology: • Correlation co-efficient of returns of stocks • Clustering algorithm: Louvian or k-means clustering (Blondel et al., 2008) • Number and Quality of clusters: • Maximize similarity within a cluster • Minimize similarity between clusters WIBF 14

  17. Stage 4: Stock Ranking • Selection of appropriate stocks from each cluster • Input: Stocks in each cluster • Output: Stocks being ranked in each cluster • Methodology: ANN (Ware, 2005) • Input parameters: GDP growth rate, Interest rate • Output parameters: Future ROI (Koochakzadeh, 2013) WIBF 14

  18. Stage 5: Optimization • How much to invest in each stock?! • Input: Ranked stocks • Methodology: Output:Top 3 portfolio suggestions • Markowitz Mean-Variance model (Markowitz, 1952) • Optimization heuristics WIBF 14

  19. Expected Outcomes • Stage 1: Asset selection • Stage 2: Consideration of Qualitative aspect of firms • Stage 3: Diversification of risk • Stage 4: Aids in the selection of appropriate stocks • Stage 5: Top 3 portfolio suggestions to the investors WIBF 14

  20. Conclusion • Informed investment decisions • Considers both structured and unstructured data • Flexibility to investors for selecting appropriate stocks • Applicability to any stock data • Limitation: • Considers only stock analysis • Asset management is not considered • Future Work: Implementation of the framework WIBF 14

  21. References • E. Patari, T. Leivo, and S. Honkapuro, "Enhancement of equity portfolio performance using data envelopment analysis", European Journal of Operational Research, vol. 220, no. 3, pp. 786–797, 2012 • H.-H. Chen, “Stock selection using data envelopment analysis.,” Industrial Management and Data Systems, vol. 108, no. 9, pp. 1255–1268, 2008. • H. Markowitz, “Portfolio selection,” The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952. • J. Bollen, H. Mao, and X Zeng, "Twitter mood predicts the stock market.", Journal of Computational Science, vol. 2, no. 1, pp. 1-8, 2011 • J. Dittrich and J.-A. Quian ́ -Ruiz, “Efficient big data processing in hadoop mapreduce,” eProc. VLDB Endow., vol. 5, pp. 2014–2015, Aug. 2012. • J. Gemela, "Financial analysis using bayesian networks", Applied Stochastic Models in Business and Industry, vol. 17, no. 1, pp. 57-67, 2001 • K. Karpio, P. Lukasiewicz, A. Orlowski, and T. Zabkowski, "Mining associations on the Warsaw stock exchange." Proceedings of the 6th Polish Symposium of Physics in Economy and Social Sciences (FENS2012), vol. 123, no. 3, pp. 553–559, 2013 • L. Bakker, W. Hare, H. Khosravi, and B. Ramadanovic, "A social network model of investment behaviour in the stock market.", Physica A: Statistical Mechanics and its Applications, vol. 389, no. 6, pp. 1223 – 1229, 2010 WIBF 14

  22. References • M. Dia, "A portfolio selection methodology based on data envelopment analysis.", INFOR, vol. 47, no. 1, pp. 51–57, 2009 • M. Ismail, N. Salamudin, N. Rahman, and B. Kamaruddin, "DEA portfolio selection in Malaysian stock market.", 2012 International Conference on Innovation Management and Technology Research (ICIMTR), pp. 739–743, 2012 • M. Rauchman and A. Nazaruk, “Big Data in Capital Markets” Available online at: http://www.sigmod.org/2013/keynote1_slides.pdf, Accessed on: May 30, 2014 • N. C. P. Edirisinghe and X. Zhang, "Portfolio selection under DEA-based relative financial strength indicators: case of US industries", Journal of the Operational Research Society, vol. 59, no. 6, pp. 842–856, 2007 • N. Koochakzadeh, A heuristic stock portfolio optimization approach based on data mining techniques. PhD thesis, Department of Computer Science, University of Calgary, March 2013. • N. S. Sachchidanand Singh, “Big data analytics,” in International Conference on Communication, Information & Computing Technology (ICCICT), pp. 1-4, 2012. • P. Gupta, G. Mittal, and M. Mehlawat, "A multicriteria optimization model of portfolio rebalancing with transaction costs in fuzzy environment." Memetic Computing, pages 1–14, 2012 • R. P. Schumakerand H. Chen, "Textual analysis of stock market prediction using breaking financial news: The AZFin text system.", ACM Transactions Information Systems, vol. 27, no. 2, pp. 1–19, 2009 WIBF 14

  23. References • R. Verma and S. R. Mani, “Use of big data technologies in capital markets.” Available online at: http://www.infosys.com/industries/financialservices/whitepapers/Documents/big-data-analytics.pdf, 2012, Accessed on: April 2, 2014. • S. Shen, H. Jiang, and T. Zhang, "tock market forecasting using machine learning algorithms. Available online at: http://cs229.stanford.edu/proj2012/ShenJiangZhang-StockMarketForecastingusingMachineLearningAlgorithms.pdf., 2012 • S. T. Li and Y. c. Cheng, "A stochastic HMM-based forecasting model for fuzzy time series.", IEEE Transactions on Systems, Man, and Cybernetics, Part B, vol. 40, no. 5, pp. 1255–1266, 2010 • S. Y. Xu, "Stock price forecasting using information from yahoo finance and google trend."Available online at:https://www.econ.berkeley.edu/sites/default/files/Selene, 2012 • T. S. Quah, "DJIA stock selection assisted by neural network", Expert Systems with Applications, vol. 35, no. 12, pp. 50 – 58, 2008 • T. Ware, “Adaptive statistical evaluation tools for equity ranking models.” Submitted to: Canadian Industrial Problem Solving Workshops (Calgary, Canada, May 15-19, 2005), 2005. • V. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” Journal of Statistical Mechanics: Theory and Experiment, p. 10008, 2008. WIBF 14

  24. References • W. Cooper, L. Seiford, and J. Zhu, “Data envelopment analysis: History, models and interpretations,” in Handbook on Data Envelopment Analysis (W. W. Cooper, L. M. Seiford, and J. Zhu, eds.), vol. 71 of International Series in Operations Research & Management Science, pp. 1–39, Springer US, 2004. • X. Qin, H. Wang, F. Li, B. Zhou, Y. Cao, C. Li, H. Chen, X. Zhou, X. Du, and S. Wang, “Beyond simple integration of RDBMS and MapReduce – Paving the way toward a unified system for big data analytics: Vision and progress,” in Proceedings of the 2012 Second International Conference on Cloud and Green Computing, CGC ’12, (Washington, DC, USA), pp. 716–725, IEEE Computer Society, 2012. • Y. S. Chen and B. S. Chen, “Applying DEA, MPI and Grey model to explore the operation performance of the Taiwanese wafer fabrication industry,” Technological forecasting and social change, vol. 78, no. 3, pp. 536–546, 2011. • Y. Zuo and E. Kita, "Stock price forecast using bayesian network", Expert Systems with Applications, vol. 39, no. 8, pp. 6729–6737, 2012 WIBF 14

  25. Thank You WIBF 14

More Related