BREAKOUT SESSION – BIGBENCH
3rd Workshop on Big Data Benchmarking, July 16-17, Xi'an, China
BIGBENCH – FURTHER DEVELOPMENT (1)
• Late binding needs to be addressed in BigBench
  • Pre- or post-queries
  • Workload has to deal with missing values
  • Possibly start with Weblogs
  • Add columns to tables dynamically
• Scaling factor needs to be proven for data generation rate and query result size
• Data-model specific:
  • Integration of media resources was considered, but excluded
  • Localization (WGS84) aspect for Customer (potentially also for reviews; considered of minor importance since the postal code is available)
Late binding ::= the schema information is evaluated at runtime.
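The late-binding definition above can be illustrated with a small sketch. It is not taken from BigBench: the weblog records, the field names (`user`, `page`, `referrer`, `session_ms`) and the helper functions are hypothetical. It only shows the idea that the column set is derived while reading the data, so new columns appear dynamically and missing values have to be handled by the workload.

```python
import json

# Hypothetical weblog records; the field names are illustrative only
# and are not BigBench's actual weblog schema.
RAW_WEBLOGS = [
    '{"user": "c1", "page": "/item/42"}',
    '{"user": "c2", "page": "/item/7", "referrer": "search"}',
    '{"page": "/home", "session_ms": 1200}',
]


def discover_columns(raw_lines):
    """Derive the column set at read time instead of from a fixed schema."""
    columns = []
    for line in raw_lines:
        for key in json.loads(line):
            if key not in columns:
                columns.append(key)  # a new column shows up dynamically
    return columns


def read_late_bound(raw_lines):
    """Return (columns, rows); fields absent from a record become None."""
    columns = discover_columns(raw_lines)
    rows = [
        tuple(json.loads(line).get(col) for col in columns)
        for line in raw_lines
    ]
    return columns, rows


columns, rows = read_late_bound(RAW_WEBLOGS)
for row in rows:
    print(dict(zip(columns, row)))
```

A query written against `columns` has to tolerate `None` in any position, which is the "missing values" requirement from the slide in miniature.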
BIGBENCH – FURTHER DEVELOPMENT (2)
Support for graph structures:
• Integration of hashtag functionality
• (Re-)tweet-like methods on Customer recommendations
• On-the-fly analysis will end in graph structures (e.g., "give me all Customers retweeting a positive review of product XY")
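The example query on this slide can be sketched as a traversal over retweet edges. Everything below is an assumption made for illustration: the review and retweet data, the identifiers, and in particular the rule that a "positive" review is one with a rating of at least 4. BigBench itself does not define these structures here.

```python
# Hypothetical reviews keyed by review id; not BigBench data.
REVIEWS = {
    "r1": {"product": "XY", "rating": 5},
    "r2": {"product": "XY", "rating": 1},
    "r3": {"product": "AB", "rating": 5},
}

# Hypothetical retweet edges: (customer id, review id).
RETWEETS = [
    ("c1", "r1"),
    ("c2", "r2"),
    ("c3", "r1"),
    ("c1", "r3"),
]


def customers_retweeting_positive(product, min_rating=4):
    """Customers with a retweet edge to a positive review of `product`.

    "Positive" is assumed to mean rating >= min_rating.
    """
    positive = {
        review_id
        for review_id, review in REVIEWS.items()
        if review["product"] == product and review["rating"] >= min_rating
    }
    return sorted({cust for cust, review_id in RETWEETS if review_id in positive})


result = customers_retweeting_positive("XY")
print(result)
```

The query is a two-hop walk (product to positive reviews, reviews to retweeting customers), which is why on-the-fly analyses of this kind end up in graph structures.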
BIGBENCH – OPEN ISSUES
• Is localization an issue for a benchmark?
• Do images/other media add value to a data benchmark?
BIGBENCH – FURTHER STEPS
• Big Data Challenge
  • Have people implement BigBench
  • Hive version will be out soon
  • Discussion later
• Big Data Pipeline
  • BigBench somewhere in the middle/end?
  • Discussion later