Example queries: "nearby restaurant" • "when in a movie to pee" • "control my laptop"
Functional Search™ Technical Overview April 2013
Outline • What is Functional Search™? • How it works • The future
Powered by Quixey: App Store Search • Preloaded App Search Widget • Voice-Activated App Search
All-Platform Search Solution
• Mobile: iPhone, Android, Windows Phone
• Desktop: Mac, PC
• Web-Based Platforms: HTML5, Facebook, LinkedIn, Salesforce, Twitter
• Browser: Firefox Add-ons, Chrome Extensions, IE Add-ons
Apps as first-order objects
• Structured metadata fields and text metadata
• (diagram: developer and user relate to an app through its editions, which target platforms and devices)
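For concreteness, here is a minimal Python sketch of the data model this slide implies: an app as a first-order object with structured fields and text metadata, and per-platform editions. The class and field names and the example app are hypothetical, not Quixey's actual schema.

```python
# Illustrative sketch only: class/field names and the example app are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class AppEdition:
    """One edition of a logical app, tied to a platform (and ultimately to devices)."""
    platform: str             # e.g. "iphone", "android", "html5"
    star_rating: float = 0.0  # store-specific structured metadata
    review_count: int = 0
    text_metadata: str = ""   # description, release notes, reviews, etc.

@dataclass
class App:
    """An app as a first-order object: structured metadata fields plus text metadata."""
    name: str
    developer: str
    category: str
    editions: List[AppEdition] = field(default_factory=list)

    def platforms(self) -> List[str]:
        return [e.platform for e in self.editions]

# Hypothetical example: one logical app with two platform editions
karaoke_now = App(
    name="KaraokeNow", developer="Example Dev Co", category="Music",
    editions=[
        AppEdition("iphone", star_rating=4.4, review_count=1200,
                   text_metadata="Find karaoke places near you."),
        AppEdition("android", star_rating=4.1, review_count=900,
                   text_metadata="Find karaoke places near you."),
    ],
)
print(karaoke_now.platforms())   # ['iphone', 'android']
```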
Outline • What is Functional Search™? • How it works • The future
Machine-Learned Regression Search
• Search: (query, app) → <feature1, …, feature100> → score
• Define features: (query, app) → <feature1, …, feature100>
• Collect training points: (query, app) → score
• Train machine-learned regression model: <feature1, …, feature100> → score
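A toy end-to-end sketch of the shape on this slide: featurize a (query, app) pair, then have a model turn the feature vector into a score. Only the overall shape comes from the slide; the three features, the hand-set weights, and the example apps are made-up assumptions.

```python
# Toy end-to-end sketch: only the (query, app) -> feature vector -> score shape
# comes from the slide; features, weights, and apps below are made up.
from typing import Dict, List

def define_features(query: str, app: Dict) -> List[float]:
    """(query, app) -> <feature1, ..., featureN>; a real system would emit ~100 features."""
    query_terms = set(query.lower().split())
    title_terms = set(app["title"].lower().split())
    return [
        float(len(query_terms)),                                     # query feature
        app["star_rating"],                                          # app feature
        len(query_terms & title_terms) / max(len(query_terms), 1),   # query-app match
    ]

# In the real pipeline these weights come from a trained regression model;
# here they are hand-set so the example runs on its own.
weights = [0.0, 0.5, 3.0]

def score(query: str, app: Dict) -> float:
    """Search-time scoring: featurize the pair, then apply the (stand-in) model."""
    return sum(w * f for w, f in zip(weights, define_features(query, app)))

apps = [
    {"title": "Karaoke Finder", "star_rating": 4.2},
    {"title": "Flashlight", "star_rating": 4.8},
]
ranked = sorted(apps, key=lambda a: score("nearby karaoke places", a), reverse=True)
print([a["title"] for a in ranked])   # ['Karaoke Finder', 'Flashlight']
```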
Types of Features
• Query Features
  • Word count
  • Popularity
  • Category classification
• Result Features (a.k.a. App Features)
  • Downloads
  • Star ratings
  • Avg. review positivity
  • Number of platforms
• Query-Result Features
  • tf-idf for app title
  • tf-idf for entire app metadata text corpus
  • Query-result category alignment
  • Matches for domain concepts like "free" and "iphone"
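A hedged sketch of a few feature functions of each type listed above. The exact formulas (the tf-idf variant, the log-download transform, the domain-concept list) are illustrative assumptions, not Quixey's definitions.

```python
# Illustrative feature functions for the three feature types on this slide.
# Formulas and constants are assumptions, not Quixey's actual definitions.
import math
from typing import Dict

DOMAIN_CONCEPTS = {"free", "iphone", "android"}   # assumed domain-concept list

def query_features(query: str) -> Dict[str, float]:
    terms = query.lower().split()
    return {"word_count": float(len(terms))}      # popularity, category, etc. omitted

def app_features(app: Dict) -> Dict[str, float]:
    return {
        "log_downloads": math.log1p(app["downloads"]),
        "star_rating": app["star_rating"],
        "num_platforms": float(len(app["platforms"])),
    }

def query_app_features(query: str, app: Dict,
                       doc_freq: Dict[str, int], n_docs: int) -> Dict[str, float]:
    terms = query.lower().split()
    title = app["title"].lower().split()
    # Simple tf-idf style match of query terms against the app title
    title_tfidf = sum(
        title.count(t) * math.log((1 + n_docs) / (1 + doc_freq.get(t, 0)))
        for t in terms
    )
    text = app["text"].lower()
    concept_matches = sum(1 for t in terms if t in DOMAIN_CONCEPTS and t in text)
    return {"title_tfidf": title_tfidf, "domain_concept_matches": float(concept_matches)}

app = {"title": "Karaoke Finder", "downloads": 50000, "star_rating": 4.2,
       "platforms": ["iphone", "android"], "text": "Free karaoke locator for iPhone and Android."}
print(query_features("free karaoke app"))
print(app_features(app))
print(query_app_features("free karaoke app", app, doc_freq={"karaoke": 20, "free": 900}, n_docs=1000))
```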
Metafeatures Reduce the ML Problem
• Instead of learning one huge regression <feature1, …, feature100> → score...
• We can define metafeatures:
  <feature1, …, feature10> → MetaFeature1 (e.g. "App Quality")
  …
  <feature91, …, feature100> → MetaFeature10 (e.g. "Query-to-App Text Match")
• Then do a smaller regression: <MetaFeature1, …, MetaFeature10> → score
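A minimal sketch of the two-stage factoring described above: small models map groups of raw features to metafeatures, and the final regression sees only the metafeatures. The groupings, the hand-set weights, and the use of simple weighted sums are assumptions.

```python
# Sketch of the two-stage "metafeature" factoring described above. The groupings,
# the hand-set weights, and the use of simple weighted sums are assumptions.
from typing import Dict

# Stage 1: each metafeature is its own small model over a subset of raw features.
def quality_metafeature(raw: Dict[str, float]) -> float:
    """e.g. "App Quality" from star ratings, review counts, and recent tweets."""
    return (0.6 * raw["star_rating"] / 5.0
            + 0.3 * min(raw["review_count"] / 1000.0, 1.0)
            + 0.1 * min(raw["recent_tweets"] / 100.0, 1.0))

def text_match_metafeature(raw: Dict[str, float]) -> float:
    """e.g. "Query-to-App Text Match" from tf-idf style scores."""
    return 0.7 * raw["title_tfidf"] + 0.3 * raw["body_tfidf"]

# Stage 2: the final regression sees only the handful of metafeatures.
def overall_score(raw: Dict[str, float]) -> float:
    metafeatures = [quality_metafeature(raw), text_match_metafeature(raw)]
    learned_weights = [1.5, 2.0]   # in practice these come from the trained model
    return sum(w * m for w, m in zip(learned_weights, metafeatures))

print(overall_score({
    "star_rating": 4.5, "review_count": 800, "recent_tweets": 40,
    "title_tfidf": 0.8, "body_tfidf": 0.4,
}))   # roughly 2.6
```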
Metafeatures: Pros and Con
• Con: Constrains what the ML can learn
  • Now it can't learn facts like "a high Text Relevance score is a bad sign for apps that have lots of tweets"
  • The concept of "# of tweets" is screened off by the concept of "Quality"
  • But anticipated relationships can be addressed
• (diagram: # of tweets → Quality; Quality and Text Relevance → Overall Score)
Metafeatures: Pros and Con
• Pro: Use our domain knowledge to factor the problem
  • Metafeatures introduce feature independence assumptions
  • E.g. the "Quality Score" metafeature takes into account:
    • Store-specific star ratings (iTunes App Store, Google Play, Windows Phone Store, BlackBerry World, etc.)
    • Store-specific review counts
    • Avg. popularity of all apps by this developer
    • Number of recent tweets about an app
Metafeatures: Pros and Con
• Pro: Easier to get high-quality test data
  • Instead of asking our testers, "How good overall is this app for this query?"
  • We can ask:
    • "How high-quality is this app, given these star ratings / reviews / tweets?"
    • "How textually relevant is this app for this query, given this selection of text?"
Pipeline recap — Collect training points: (query, app) → score
Collecting evaluation points
• Hire full-time paid testers
• (query, app) → score from 1 to 5
• Hundreds of points per tester per day
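A small sketch of how raw tester judgments might be turned into training labels. The 1-to-5 scale and the (query, app) → score shape come from the slides; averaging multiple judgments per pair is an assumption about the aggregation step, and the judgments below are toy data.

```python
# Hypothetical aggregation of tester judgments into training labels.
# Averaging multiple judgments per (query, app) pair is an assumption.
from collections import defaultdict
from statistics import mean

# (query, app_id, score_from_1_to_5) as testers might submit them (toy data)
raw_judgments = [
    ("nearby restaurant", "app_42", 5),
    ("nearby restaurant", "app_42", 4),
    ("nearby restaurant", "app_17", 2),
]

by_pair = defaultdict(list)
for query, app_id, judged_score in raw_judgments:
    by_pair[(query, app_id)].append(judged_score)

training_points = {pair: mean(scores) for pair, scores in by_pair.items()}
print(training_points)
# {('nearby restaurant', 'app_42'): 4.5, ('nearby restaurant', 'app_17'): 2}
```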
Pipeline recap — Train machine-learned regression model: <feature1, …, feature100> → score
Q: What kind of regression model does Quixey use?
A: Commercial Gradient Boosted Decision Trees (GBDT) for search ranking
…and other ML stuff for query understanding, dynamic app classification, cross-platform app edition merging
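This is not Quixey's commercial TreeNet setup, but a minimal stand-in using scikit-learn's GradientBoostingRegressor shows the same shape: fit ~100-dimensional feature vectors to 1-to-5 relevance scores, then score new (query, app) pairs. The data here is random toy data.

```python
# Stand-in for the commercial TreeNet GBDT: same shape, open-source library, toy data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X_train = rng.random((500, 100))     # 500 judged (query, app) pairs, 100 features each
y_train = rng.integers(1, 6, 500)    # tester scores on a 1-5 scale (toy labels)

model = GradientBoostingRegressor(
    learning_rate=0.1,               # analogous to TreeNet's learnRate
    n_estimators=200,                # analogous to TreeNet's maxTrees
    max_depth=3,
)
model.fit(X_train, y_train)

x_new = rng.random((1, 100))         # feature vector for one new (query, app) pair
print(model.predict(x_new)[0])       # predicted relevance score
```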
Choosing the best model
• TreeNet outputs models that minimize mean-squared error on training points
  • a metric on (query, app, score) points
• Our real metric is Normalized Discounted Cumulative Gain (nDCG)
  • a metric on (query, <app1, …, app5>) multi-app rankings
• We might evaluate several TreeNet models before seeing a real nDCG improvement. We might retry regression with:
  • Different combinations of features
  • Different TreeNet settings like learnRate and maxTrees, different splits of data
  • More/better data (or fixes to errors in our training data)
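A short sketch of the metric distinction on this slide: the regressor is trained against per-(query, app) squared error, but model selection uses nDCG over a ranked list of apps per query. The DCG/nDCG formulas are standard; the top-5 cutoff and the example relevance lists are assumptions.

```python
# nDCG over a per-query ranked list, as opposed to per-point squared error.
import math
from typing import List

def dcg(relevances: List[float], k: int = 5) -> float:
    """Discounted cumulative gain of the top-k results, in ranked order."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg(relevances: List[float], k: int = 5) -> float:
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

# Tester relevance scores (1-5) for the five apps a model ranked for one query:
model_a = [5, 3, 4, 2, 1]   # nearly ideal order
model_b = [2, 5, 1, 4, 3]   # same apps, worse order
print(round(ndcg(model_a), 3), round(ndcg(model_b), 3))   # approx. 0.987 0.831
```

Mean nDCG across many judged queries would then be the number compared across candidate models.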
Outline • What is Functional Search™? • How it works • The future
Apps -vs- Web
Technological Web (diagram): the query "list nearby karaoke places", answered by apps reachable via URLs
Functional Web (diagram): URL → Function
278 Castro Street, Mountain View, CA 94041 • www.quixey.com
Liron Shapira, CTO • liron@quixey.com