Twitter Frenzy FPGA Data Stream Processing

Cory Kleinheksel (Team Leader), Tim Meyer, David Graziano, Josh Clausman. Project idea: Twitter Frenzy, a way to filter tweets as a set of frequencies using an FPGA to perform packet analysis.


Presentation Transcript


  1. Twitter Frenzy FPGA Data Stream Processing
     • Cory Kleinheksel (Team Leader)
     • Tim Meyer
     • David Graziano
     • Josh Clausman

  2. Project Idea
     • Twitter Frenzy: a way to filter tweets as a set of frequencies, using an FPGA to perform packet analysis.
     • Accelerate the stream processing of Twitter data queries.
     • Specifically, accelerate computationally intensive, long-lived queries over data with short lifetimes.
     • The design and implementation of a frequency-based query is the primary focus (an interesting application of signal processing).

  3. Details
     • Input: live (or simulated) Twitter stream data
        • A Java program simulates the Twitter feed by reading from a dataset
     • Processing:
        • Extract tweets from the input stream
        • Filter tweets based on query parameters (text matching)
        • Determine tweet frequency components (frequency analysis)
        • Apply signal filter (signal processing)
     • Output: tweets matching the filter
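The processing steps above can be illustrated in software with a minimal sketch. It is not the team's Java simulator or VHDL design. The record layout (one JSON object per line with `timestamp` and `text` fields), the function names, and the 60-second window are all assumptions made for illustration.

```python
import json
from collections import Counter


def extract_tweets(lines):
    """Yield (timestamp_seconds, text) pairs from an iterable of JSON lines.

    The "timestamp" and "text" field names are hypothetical; a real feed
    would need its own field mapping.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # drop records that cannot be parsed
        if isinstance(record, dict) and "timestamp" in record and "text" in record:
            yield int(record["timestamp"]), record["text"]


def matches_query(text, keywords):
    """Text matching: case-insensitive substring test against any keyword."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def count_per_window(tweets, keywords, window_seconds=60):
    """Count matching tweets per time window (the input to frequency analysis)."""
    counts = Counter()
    for timestamp, text in tweets:
        if matches_query(text, keywords):
            counts[timestamp // window_seconds] += 1
    return counts
```

The per-window counts form a discrete-time signal, which is the kind of input a later signal filter would consume.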

  4. Design Issues
     • Acquiring data from Twitter at a useful speed
     • Deciding efficiently whether a packet is useful (send or drop)
     • Managing concurrently arriving packets and multi-fragment packets
     • Calculating frequency and filtering the corresponding packets
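To make the multi-fragment issue concrete, here is a minimal software sketch of a reassembly buffer that handles interleaved fragments from several streams. The fragment header fields (`stream_id`, `index`, `total`) are hypothetical; the slides do not describe the actual packet format or the hardware buffering scheme.

```python
class FragmentBuffer:
    """Reassemble messages that arrive as interleaved fragments.

    Each fragment carries a hypothetical (stream_id, index, total) header.
    """

    def __init__(self):
        # stream_id -> {fragment index -> payload bytes}
        self._pending = {}

    def add(self, stream_id, index, total, payload):
        """Store one fragment; return the full message once all have arrived."""
        if not 0 <= index < total:
            raise ValueError("fragment index out of range")
        parts = self._pending.setdefault(stream_id, {})
        parts[index] = payload
        if len(parts) == total:
            del self._pending[stream_id]
            return b"".join(parts[i] for i in range(total))
        return None  # message still incomplete
```

A caller would forward the returned message to the text-matching stage and keep buffering while `add` returns `None`.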

  5. Implementation Issues
     • Properly buffering and sending fragmented tweets
     • Time (clock cycles) needed to perform frequency calculations
     • Time to perform hashing
        • Created a lookup-table-based hashing block
     • Modules consuming data at different rates
     • Debugging hardware
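The slides do not say which hash function the lookup-table block implements. As one well-known example of table-driven hashing, the sketch below uses Pearson hashing, which needs one table lookup and one XOR per input byte. The table contents and the seed are assumptions, not the team's values.

```python
import random


def build_table(seed=0):
    """Return a permutation of 0..255 to use as the lookup table.

    The seed is arbitrary; hardware would hold a fixed table in ROM or LUTs.
    """
    table = list(range(256))
    random.Random(seed).shuffle(table)
    return table


def pearson_hash(data, table):
    """Pearson hashing: one table lookup and one XOR per input byte."""
    h = 0
    for byte in data:
        h = table[h ^ byte]
    return h  # 8-bit hash value in 0..255
```

Because each step is a single lookup, a scheme of this kind maps naturally onto a fixed number of clock cycles per byte, which is the sort of property that matters when hashing time is a concern.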

  6. System Architecture Diagram

  7. Breakdown: Network Data Flow

  8. Breakdown: Text Matching

  9. Breakdown: Frequency Analysis

  10. Algorithms
     • Hashing
     • String matching
     • Frequency analysis
     • Filtering (FIR)
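For the filtering step, a direct-form FIR filter computes y[n] = Σ_k h[k]·x[n−k]. The sketch below applies one to a sequence of per-window tweet counts. The tap values and the threshold rule are illustrative assumptions; the slides do not give the filter coefficients or the pass criterion.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * samples[n - k].

    Samples before the start of the sequence are treated as zero.
    """
    output = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * samples[n - k]
        output.append(acc)
    return output


def windows_passing(samples, taps, threshold):
    """Return indices of windows whose filtered value reaches the threshold."""
    filtered = fir_filter(samples, taps)
    return [n for n, value in enumerate(filtered) if value >= threshold]
```

With taps `[0.5, 0.5]` the filter is a two-point moving average, so a single isolated spike in tweet counts is attenuated while a sustained burst is not.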

  11. Project Results
     • Analyzed the problem
     • Implemented a full simulator in software
     • Implemented the design in VHDL
     • Simulated in ModelSim
     • Tested on hardware and confirmed results against the software implementation
     • Dataset: JSON_29493.txt
        • Processed 29493 tweets
        • 192 passed the string filter
        • 133 passed the frequency filter
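The reported counts imply that roughly 0.65% of tweets passed the string filter, and about 69% of those went on to pass the frequency filter. The short calculation below reproduces these ratios from the three numbers on the slide; the variable names are illustrative.

```python
# Counts taken from the results slide.
total_tweets = 29493
passed_string = 192
passed_frequency = 133

# Fraction of all tweets that survive the string filter.
string_pass_rate = passed_string / total_tweets
# Fraction of string-filter survivors that also pass the frequency filter.
frequency_pass_rate = passed_frequency / passed_string
# End-to-end fraction of tweets that reach the output.
overall_pass_rate = passed_frequency / total_tweets

print(f"string filter:    {string_pass_rate:.2%}")
print(f"frequency filter: {frequency_pass_rate:.2%}")
print(f"overall:          {overall_pass_rate:.2%}")
```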

  12. Software Simulator Example

  13. Demo

  14. References
     • Berinde, Indyk, Cormode, Strauss. "Space-optimal Heavy Hitters with Strong Error Bounds"
     • Cormode, Korn, Tirthapura. "Time-Decaying Aggregates in Out-of-order Streams"
     • Charikar, Chen, Farach-Colton. "Finding Frequent Items in Data Streams"
