BIG DATA The next frontier for emerging market. USC CSSE Annual Research Review March 14, 2013 Rachchabhorn Wongsaroj Bank of Thailand, Visiting Scholar @ USC. Outline. Current situation What is big data? Why big data is important? Big data cases Research challenges
BIG DATA: The next frontier for emerging market • USC CSSE Annual Research Review • March 14, 2013 • Rachchabhorn Wongsaroj • Bank of Thailand, Visiting Scholar @ USC
Outline • Current situation • What is big data? • Why is big data important? • Big data cases • Research challenges • Big data in Thailand • Future research
Current Situation • Data quantity: lots of data is being created and collected (chart: global data) • Data quality problems • Data variety • Data timeliness
What is big data? Big Data = Volume, Variety, and Velocity • Volume • Variety: people to people, people to machine, machine to machine • Velocity • Examples of scale: 20 hours of video uploaded every minute; 8 billion messages/day with 845M active users; 340 million tweets/day with 140M active users • Source: Gartner & IBM
Why is big data important? Figure: Emerging Technologies Hype Cycle 2011 (Gartner)
Why is big data important? Figure: Emerging Technologies Hype Cycle 2012 (Gartner)
Why is big data important? Source: McKinsey Global Institute Analysis
Why is big data important? Big data can generate significant financial value across sectors • Global personal location data: $100 billion+ revenue for service providers; up to $700 billion value to end users • Europe public sector administration: £250 billion value/year; ~0.5% annual productivity growth • US health care: $300 billion value/year; ~0.7% annual productivity growth • US retail: 60+% increase in net margin possible; 0.5-1.0% annual productivity growth • Manufacturing: up to 50% decrease in product development; up to 7% reduction in working capital • Source: McKinsey Global Institute Analysis
Why is big data important? Health care sector has the potential to invest $300B • Clinical (49%, $165B): transparency in clinical data and clinical decision support • R&D (32%, $108B): personalized medicine, clinical trial design • Accounts (14%, $47B): advanced fraud detection, performance-based drug pricing • Public health (3%, $9B): surveillance and response systems • Business model (2%, $5B): aggregation of patient records, online platforms and communities • Source: US Department of Labor
Research Challenges • Customer micro-segmentation • Sentiment analysis • Performance transparency • Labor inputs optimization • Price comparison services • Source: McKinsey Global Institute Analysis
Big data in Thailand: Challenges • Language • Cost of implementation • Magnitude of data • Demographic data generator • Data type
Big data in Thailand: Language (natural language processing) • No spaces between words • Thai mixed with foreign languages in the same text • Lack of Thai text analytics components • Example
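Because Thai is written without spaces between words, a tokenizer has to find the word boundaries itself before any text analytics can run. The sketch below shows one common approach, dictionary-based greedy longest matching. The slide does not name a method, so the technique, the `segment` function, and the four-word toy dictionary are all assumptions made for illustration.

```python
def segment(text, dictionary):
    """Greedy longest-match word segmentation (illustrative sketch only)."""
    max_len = max(len(word) for word in dictionary)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate starting at position i, then shrink it.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
        else:
            # No dictionary word starts here (for example, foreign text):
            # emit a single character as an unknown token.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical toy dictionary: "I", "love", "language", "Thai".
TOY_DICTIONARY = {"ฉัน", "รัก", "ภาษา", "ไทย"}

tokens = segment("ฉันรักภาษาไทย", TOY_DICTIONARY)
print(tokens)
```

A real dictionary is far larger, and greedy matching can choose the wrong boundary when words overlap, which is part of why dedicated Thai text analytics components matter. Mixed Thai and foreign text falls through to the single-character branch here.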
Big data in Thailand: Cost of implementation • 13 big data vendors in 2013 • Hadoop cluster: requires ~$1 million for between 125 and 250 nodes • Hadoop distribution: annual costs of ~$4,000 per node • A small fraction of the cost of an enterprise data warehouse ($10s to $100s of millions)
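The cost figures above can be turned into a quick back-of-the-envelope calculation. This sketch assumes the ~$1 million is the cluster cost covering 125 to 250 nodes and that the ~$4,000 is a per-node annual distribution fee; the constant and function names are illustrative, not from the slide.

```python
# Approximate figures quoted on the slide.
CLUSTER_COST_USD = 1_000_000             # ~$1 million for the cluster
NODE_RANGE = (125, 250)                  # node counts that budget is said to cover
ANNUAL_DISTRIBUTION_USD_PER_NODE = 4_000


def per_node_cost(total_usd, nodes):
    """Cluster cost spread evenly over the nodes."""
    return total_usd / nodes


def annual_distribution_cost(nodes, usd_per_node=ANNUAL_DISTRIBUTION_USD_PER_NODE):
    """Yearly distribution cost for a cluster of the given size."""
    return nodes * usd_per_node


for nodes in NODE_RANGE:
    print(
        f"{nodes} nodes: "
        f"${per_node_cost(CLUSTER_COST_USD, nodes):,.0f} per node up front, "
        f"${annual_distribution_cost(nodes):,} per year for the distribution"
    )
```

Under these assumptions the up-front cost works out to between $4,000 and $8,000 per node, and the yearly distribution cost to between $500,000 and $1,000,000.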
Big data in Thailand: Magnitude of data (as of September 2012) • Local bandwidth (.th, or.th, etc.): 1,006,140 Mbps • Overseas bandwidth: 405,860 Mbps • 60% use local bandwidth • 25% use a smartphone • 8% use a tablet • Chart values without recoverable labels: 44%, 31%, 14%, 9%
Big data in Thailand: Demographic data generator • Most data comes from the younger generations • Population: 65M • Internet users: 25M (39% of the population use the Internet) • 85.9% of data is created by Internet users aged 6-24
Big data in Thailand: Types of data • Limited application of big data techniques • Only 2.12% focus on education • Source: http://www.prd.go.th/ewt_news.php?nid=23168
Bank of Thailand (BOT) Website: As is • Diagram labels: Financial institution, Template Input, Manual Checking, Manual Submit, DB 1, DB 2, DB 3, BOT data (Internet/Extranet), BTWS Working, Auto Submit, BOT Website • Problems: too many steps; once due, act first and fix later; too many stakeholders; bureaucratic management style • Source: Bank of Thailand
BOT data website: As is • Diagram labels: Input Template, Input Data, Manual Check, Manual Checking, Complex Validation, Cross Validation, Query Data (BO), Approve, Manual Submit, Website, Revision Policy • Dimensions shown: Volume, Variety, Velocity, Timeliness, Accuracy & Reliability • Source: Bank of Thailand
Future research • Data quality management • Tools • Template • Checklist • Process
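As one illustration of what a data quality tool and checklist could look like, the sketch below automates three checks on a submitted report: completeness, timeliness, and a simple cross-validation. It is not the Bank of Thailand's actual process. The `Submission` fields, the balance-sheet rule, and the tolerance are hypothetical and chosen only to make the example concrete.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Submission:
    """Hypothetical report submitted by a financial institution."""
    institution: str
    due_date: date
    submitted_on: date
    total_assets: Optional[float]
    total_liabilities: Optional[float]
    equity: Optional[float]


def check_submission(s: Submission, tolerance: float = 0.01) -> List[str]:
    """Return a list of data quality issues; an empty list means all checks passed."""
    issues = []
    amounts = {
        "total_assets": s.total_assets,
        "total_liabilities": s.total_liabilities,
        "equity": s.equity,
    }
    # Completeness: every amount must be present.
    missing = [name for name, value in amounts.items() if value is None]
    if missing:
        issues.append("missing: " + ", ".join(missing))
    # Timeliness: the report must arrive on or before its due date.
    if s.submitted_on > s.due_date:
        issues.append("late submission")
    # Cross-validation (assumed rule): assets = liabilities + equity.
    if not missing and abs(s.total_assets - (s.total_liabilities + s.equity)) > tolerance:
        issues.append("cross-validation failed: assets != liabilities + equity")
    return issues


good = Submission("Bank A", date(2013, 3, 31), date(2013, 3, 29), 100.0, 80.0, 20.0)
bad = Submission("Bank B", date(2013, 3, 31), date(2013, 4, 2), 100.0, 70.0, 20.0)
print(check_submission(good))
print(check_submission(bad))
```

Running the checks at submission time, rather than after manual submission, is one way to reduce the "act first, fix later" pattern described for the current workflow.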
References • Big Data: The next frontier for innovation, competition, and productivity, McKinsey Global Institute • Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, IBM • Gartner Report • Thailand National Statistical Office • Thailand Digital Statistic Source • Bank of Thailand (www.bot.or.th)
BIG DATA: The next frontier for emerging market • Thank you • Q & A • Rachchabhorn Wongsaroj • Bank of Thailand • Visiting Scholar @ USC