Crowd Size Estimation Luis Huang 12-3-08 ECE 172A - UCSD
Background and Motivation
September 27, 2007, 9:19 pm
Obama Rallies Huge Crowd in New York
By Jeff Zeleny
Senator Barack Obama rallied New Yorkers in Washington Square Park in Manhattan Thursday night. (Photo: Richard Perry/The New York Times)
When Senator Barack Obama ran through the arch and strode onto stage tonight in Washington Square Park, he paused and sized up the crowd standing before him, many of whom were waving…
In February, Mr. Obama drew 20,000 people to the Town Lake in Austin, Texas. In March, 10,000 people crowded into a plaza outside City Hall in Oakland, Calif. In April, he attracted 20,000 at an outdoor rally at Yellow Jacket Park in Atlanta.
Crowd Estimation Significance
• Common Convention
  • Literally counting out individuals in sequenced snapshots (extrapolated)
  • Aerial photographs often employed
  • Ticket sales/count with turnstiles
• Controversy
  • Crowd estimates for political rallies and protests carry political significance
  • Highly inaccurate and highly subjective
  • Personal bias is a big problem (candidates, political protests, etc.)
Ways to Approach this Problem
• Two Schools of Thought (Gray, UCSC)
  • Detection Based Estimation
    • Run a detector, then count or cluster the output
    • Pros: Relatively good accuracy for small counts
    • Cons: Requires a very good detection algorithm
  • Mapping Based Estimation
    • Extract features and map them to a value
    • Pros: Easier to scale for large crowds
    • Cons: Hard to make scene invariant
• Mapping Based Technique Extremely Difficult (see next)
• Hybrid of both
Mapping Based Method
• SIFT (Scale-Invariant Feature Transform)
  • Algorithm in computer vision used to detect and describe features in images (Lowe, 2004)
  • Four Steps: scale-space extrema detection, keypoint localization, orientation assignment, and keypoint descriptor
  • Difference of Gaussian
    • Wha?
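For reference, the difference of Gaussian used in the scale-space extrema step can be written as below. The notation follows the usual presentation in Lowe (2004) and is not defined on the slide: I is the input image, G a Gaussian kernel of scale σ, L the blurred image, k a constant factor between neighboring scales, and * denotes convolution.

```latex
G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} \, e^{-(x^{2} + y^{2}) / (2\sigma^{2})}
\qquad
L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)

D(x, y, \sigma) = \bigl( G(x, y, k\sigma) - G(x, y, \sigma) \bigr) * I(x, y)
                = L(x, y, k\sigma) - L(x, y, \sigma)
```

Candidate keypoints are the local extrema of D across both position and scale, which is why the first of the four steps is called scale-space extrema detection.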
Initial Failure With Mapping Based Method
• Used sample SIFT code from Dr. Vedaldi (UCLA) with a scene of people walking in Venice (small number of people)
• 1,000+ interest points detected
• Next step? Found a paper only for crowd density, using a complex algorithm (MFD). Useless for counting
• As of yet, no paper has found a way to differentiate crowd interest points from scene interest points
• Conclusion: Waste of almost two weeks
Project Procedure
• Skin-tone thresholding to produce a binary image
• Morphological image processing (opening and closing)
• Face detection using a convolution mask
• Blob count using the BWLABEL command
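The first two steps above can be sketched in a few lines. This is a minimal illustration in plain Python rather than the MATLAB used in the project, and the specifics are assumptions: the RGB skin rule is a simple heuristic chosen for illustration, the structuring element is a fixed 3x3 square, and the names `skin_mask`, `erode`, `dilate`, `opening` and `closing` are hypothetical helpers. The slides do not give the actual thresholds or structuring elements, and the convolution-mask face detection step is not covered here.

```python
# Sketch of skin-tone thresholding followed by opening and closing.
# ASSUMPTIONS: the RGB rule and the 3x3 square structuring element are
# illustrative choices, not the values used in the original project.

def skin_mask(rgb_image):
    """Turn rows of (r, g, b) tuples into a binary image of 0/1 values."""
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            is_skin = (
                r > 95 and g > 40 and b > 20
                and max(r, g, b) - min(r, g, b) > 15
                and abs(r - g) > 15 and r > g and r > b
            )
            mask_row.append(1 if is_skin else 0)
        mask.append(mask_row)
    return mask


def _window(mask, y, x):
    """Values in the 3x3 window around (y, x), skipping out-of-image pixels."""
    h, w = len(mask), len(mask[0])
    return [
        mask[y + dy][x + dx]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if 0 <= y + dy < h and 0 <= x + dx < w
    ]


def erode(mask):
    """A pixel stays 1 only if every in-image pixel in its window is 1."""
    return [[int(all(_window(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    """A pixel becomes 1 if any in-image pixel in its window is 1."""
    return [[int(any(_window(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def opening(mask):
    """Erosion then dilation: removes specks smaller than the window."""
    return dilate(erode(mask))


def closing(mask):
    """Dilation then erosion: fills small holes inside blobs."""
    return erode(dilate(mask))
```

Opening removes isolated skin-colored specks before counting, and closing fills small gaps (eyes, mouth, shadows) so that one face is more likely to remain one blob. Both depend heavily on the window size relative to face size in the photograph.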
BWLABEL
• Used as a blob counter
• Takes only binary images
• Produces a label matrix L
• Groups and numbers connected pixels
• Blobs are then numbered, and the numbers are overlaid on the image
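What a blob counter of this kind does can be illustrated with a small connected-component labeling routine. This is a plain Python sketch, not MATLAB's implementation: `label_blobs` is a hypothetical name, the flood fill is one of several possible labeling strategies, and labels are assigned in row-major scan order, so the numbering of individual blobs may differ from what BWLABEL would produce even when the count is the same.

```python
from collections import deque


def label_blobs(mask, connectivity=8):
    """Label connected groups of 1-pixels in a binary image.

    Returns (labels, count): labels has the same shape as mask, with 0 for
    background and 1..count for each blob.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    if connectivity == 8:
        # Horizontal, vertical and diagonal neighbors.
        offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)]
    else:
        # 4-connectivity: horizontal and vertical neighbors only.
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]

    count = 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or labels[y][x]:
                continue
            # Found an unlabeled foreground pixel: start a new blob and
            # flood-fill everything connected to it.
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


if __name__ == "__main__":
    demo = [
        [1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 1, 0, 1],
    ]
    labels, count = label_blobs(demo)
    print(count)  # 3
```

The count is only as good as the binary image: two touching faces merge into one blob, and a face split by a shadow becomes two, which is why the thresholding and morphological steps matter so much.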
Results MATLAB Examples
Discussion Of Results
• My scientific highly accurate guesstimate: ≈120 clear faces
  • Program: 260 (186)
• Estimate: ≈90
  • Program: 120
• Estimate: ≈∞
  • Program: really poor 270
What About McCain Crowds? For Fairness…
McCain Crowd Estimation Unexpected MATLAB expression.
Difficulties
• Mapping Based Method
  • SIFT (Scale-Invariant Feature Transform)
  • Unworkable (as discussed earlier)
• Thresholding Is Key
  • Faces must appear clearly in the photograph, with correct lighting, enough detail, etc.
• Blob Count Provides Rough Estimate
  • Accuracy very hard to attain
  • Any obstruction reduces accuracy
    • Signs, other people, other body parts, etc.
Limitations
• Program needs to be manually adjusted for each photograph, depending on thresholding, opening/closing operations, size of crowd, size of human features, etc.
• Detail
  • Accuracy is limited by obstructions and facial details
  • Photographs are not accurate enough to capture entire crowds
• Does not solve the bias problem. The program can be edited to produce either larger or smaller crowd counts.
Future Work and Improvements
• Automated Adjustments
  • Thresholds that can adjust to lighting conditions (too bright, too dark, etc.)
  • Automated morphological operations
• Sequential Snapshots (panoramic views or aerial photographs)
  • Piecing together dynamic images
• Video (most likely will involve SIFT descriptors and frames)
  • Continuous counter taken at given intervals
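One possible starting point for thresholds that adapt to lighting is Otsu's method, which picks the gray level that best separates an image's histogram into two classes. This is a swapped-in standard technique shown only as an illustration, not something the project used. It works on a single 8-bit channel rather than on skin tone directly, and `otsu_threshold` is a hypothetical helper name.

```python
def otsu_threshold(pixels):
    """Pick a threshold for 8-bit values by maximizing between-class variance.

    pixels: flat iterable of ints in 0..255.
    Returns t such that values <= t form one class and values > t the other.
    """
    hist = [0] * 256
    total = 0
    for p in pixels:
        hist[p] += 1
        total += 1

    sum_all = sum(level * n for level, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of their gray levels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # nothing in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break             # nothing left for the upper class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

Because the threshold is recomputed from each photograph's own histogram, a darker or brighter exposure shifts it automatically. Whether that is enough for crowd photographs with mixed lighting is an open question that the slides leave to future work.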