Example queries: "nearby restaurant" • "when in a movie to pee" • "control my laptop"
Functional Search™ Technical Overview April 2013
Outline • What is Functional Search™? • How it works • The future
Powered by Quixey: App Store Search • Preloaded App Search Widget • Voice-Activated App Search
All-Platform Search Solution
• Mobile: iPhone, Android, Windows Phone
• Desktop: Mac, PC
• Web-Based Platforms: HTML5, Facebook, LinkedIn, Salesforce, Twitter
• Browser: Firefox Add-ons, Chrome Extensions, IE Add-ons
Apps as first-order objects
• Structured metadata fields and text metadata
• (diagram: developer and user relate to an app through its editions, which target platforms and devices)
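For concreteness, here is a minimal Python sketch of the data model this slide implies: an app as a first-order object with structured fields and text metadata, and per-platform editions. The class and field names and the example app are hypothetical, not Quixey's actual schema.

```python
# Illustrative sketch only: class/field names and the example app are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class AppEdition:
    """One edition of a logical app, tied to a platform (and ultimately to devices)."""
    platform: str             # e.g. "iphone", "android", "html5"
    star_rating: float = 0.0  # store-specific structured metadata
    review_count: int = 0
    text_metadata: str = ""   # description, release notes, reviews, etc.

@dataclass
class App:
    """An app as a first-order object: structured metadata fields plus text metadata."""
    name: str
    developer: str
    category: str
    editions: List[AppEdition] = field(default_factory=list)

    def platforms(self) -> List[str]:
        return [e.platform for e in self.editions]

# Hypothetical example: one logical app with two platform editions
karaoke_now = App(
    name="KaraokeNow", developer="Example Dev Co", category="Music",
    editions=[
        AppEdition("iphone", star_rating=4.4, review_count=1200,
                   text_metadata="Find karaoke places near you."),
        AppEdition("android", star_rating=4.1, review_count=900,
                   text_metadata="Find karaoke places near you."),
    ],
)
print(karaoke_now.platforms())   # ['iphone', 'android']
```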
Outline • What is Functional Search™? • How it works • The future
Machine-Learned Regression Search
• Search: (query, app) → <feature1, …, feature100> → score
• Define features: (query, app) → <feature1, …, feature100>
• Collect training points: (query, app) → score
• Train machine-learned regression model: <feature1, …, feature100> → score
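A toy end-to-end sketch of the shape on this slide: featurize a (query, app) pair, then have a model turn the feature vector into a score. Only the overall shape comes from the slide; the three features, the hand-set weights, and the example apps are made-up assumptions.

```python
# Toy end-to-end sketch: only the (query, app) -> feature vector -> score shape
# comes from the slide; features, weights, and apps below are made up.
from typing import Dict, List

def define_features(query: str, app: Dict) -> List[float]:
    """(query, app) -> <feature1, ..., featureN>; a real system would emit ~100 features."""
    query_terms = set(query.lower().split())
    title_terms = set(app["title"].lower().split())
    return [
        float(len(query_terms)),                                     # query feature
        app["star_rating"],                                          # app feature
        len(query_terms & title_terms) / max(len(query_terms), 1),   # query-app match
    ]

# In the real pipeline these weights come from a trained regression model;
# here they are hand-set so the example runs on its own.
weights = [0.0, 0.5, 3.0]

def score(query: str, app: Dict) -> float:
    """Search-time scoring: featurize the pair, then apply the (stand-in) model."""
    return sum(w * f for w, f in zip(weights, define_features(query, app)))

apps = [
    {"title": "Karaoke Finder", "star_rating": 4.2},
    {"title": "Flashlight", "star_rating": 4.8},
]
ranked = sorted(apps, key=lambda a: score("nearby karaoke places", a), reverse=True)
print([a["title"] for a in ranked])   # ['Karaoke Finder', 'Flashlight']
```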
Types of Features
• Query Features
  • Word count
  • Popularity
  • Category classification
• Result Features (a.k.a. App Features)
  • Downloads
  • Star ratings
  • Avg. review positivity
  • Number of platforms
• Query-Result Features
  • tf-idf for app title
  • tf-idf for entire app metadata text corpus
  • Query-result category alignment
  • Matches for domain concepts like "free" and "iphone"
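A hedged sketch of a few feature functions of each type listed above. The exact formulas (the tf-idf variant, the log-download transform, the domain-concept list) are illustrative assumptions, not Quixey's definitions.

```python
# Illustrative feature functions for the three feature types on this slide.
# Formulas and constants are assumptions, not Quixey's actual definitions.
import math
from typing import Dict

DOMAIN_CONCEPTS = {"free", "iphone", "android"}   # assumed domain-concept list

def query_features(query: str) -> Dict[str, float]:
    terms = query.lower().split()
    return {"word_count": float(len(terms))}      # popularity, category, etc. omitted

def app_features(app: Dict) -> Dict[str, float]:
    return {
        "log_downloads": math.log1p(app["downloads"]),
        "star_rating": app["star_rating"],
        "num_platforms": float(len(app["platforms"])),
    }

def query_app_features(query: str, app: Dict,
                       doc_freq: Dict[str, int], n_docs: int) -> Dict[str, float]:
    terms = query.lower().split()
    title = app["title"].lower().split()
    # Simple tf-idf style match of query terms against the app title
    title_tfidf = sum(
        title.count(t) * math.log((1 + n_docs) / (1 + doc_freq.get(t, 0)))
        for t in terms
    )
    text = app["text"].lower()
    concept_matches = sum(1 for t in terms if t in DOMAIN_CONCEPTS and t in text)
    return {"title_tfidf": title_tfidf, "domain_concept_matches": float(concept_matches)}

app = {"title": "Karaoke Finder", "downloads": 50000, "star_rating": 4.2,
       "platforms": ["iphone", "android"], "text": "Free karaoke locator for iPhone and Android."}
print(query_features("free karaoke app"))
print(app_features(app))
print(query_app_features("free karaoke app", app, doc_freq={"karaoke": 20, "free": 900}, n_docs=1000))
```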
Metafeatures Reduce the ML Problem
• Instead of learning one huge regression <feature1, …, feature100> → score...
• We can define metafeatures:
  <feature1, …, feature10> → MetaFeature1 (e.g. "App Quality")
  …
  <feature91, …, feature100> → MetaFeature10 (e.g. "Query-to-App Text Match")
• Then do a smaller regression: <MetaFeature1, …, MetaFeature10> → score
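A minimal sketch of the two-stage factoring described above: small models map groups of raw features to metafeatures, and the final regression sees only the metafeatures. The groupings, the hand-set weights, and the use of simple weighted sums are assumptions.

```python
# Sketch of the two-stage "metafeature" factoring described above. The groupings,
# the hand-set weights, and the use of simple weighted sums are assumptions.
from typing import Dict

# Stage 1: each metafeature is its own small model over a subset of raw features.
def quality_metafeature(raw: Dict[str, float]) -> float:
    """e.g. "App Quality" from star ratings, review counts, and recent tweets."""
    return (0.6 * raw["star_rating"] / 5.0
            + 0.3 * min(raw["review_count"] / 1000.0, 1.0)
            + 0.1 * min(raw["recent_tweets"] / 100.0, 1.0))

def text_match_metafeature(raw: Dict[str, float]) -> float:
    """e.g. "Query-to-App Text Match" from tf-idf style scores."""
    return 0.7 * raw["title_tfidf"] + 0.3 * raw["body_tfidf"]

# Stage 2: the final regression sees only the handful of metafeatures.
def overall_score(raw: Dict[str, float]) -> float:
    metafeatures = [quality_metafeature(raw), text_match_metafeature(raw)]
    learned_weights = [1.5, 2.0]   # in practice these come from the trained model
    return sum(w * m for w, m in zip(learned_weights, metafeatures))

print(overall_score({
    "star_rating": 4.5, "review_count": 800, "recent_tweets": 40,
    "title_tfidf": 0.8, "body_tfidf": 0.4,
}))   # roughly 2.6
```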
Metafeatures: Pros and Con
• Con: Constrains what the ML can learn
  • Now it can't learn facts like "a high Text Relevance score is a bad sign for apps that have lots of tweets"
  • The concept of "# of tweets" is screened off by the concept of "Quality"
  • But anticipated relationships can be addressed
• (diagram: # of tweets → Quality; Quality and Text Relevance → Overall Score)
Metafeatures: Pros and Con
• Pro: Use our domain knowledge to factor the problem
  • Metafeatures introduce feature independence assumptions
  • E.g. the "Quality Score" metafeature takes into account:
    • Store-specific star ratings (iTunes App Store, Google Play, Windows Phone Store, BlackBerry World, etc.)
    • Store-specific review counts
    • Avg. popularity of all apps by this developer
    • Number of recent tweets about an app
Metafeatures: Pros and Con
• Pro: Easier to get high-quality test data
  • Instead of asking our testers, "How good overall is this app for this query?"
  • We can ask:
    • "How high-quality is this app, given these star ratings / reviews / tweets?"
    • "How textually relevant is this app for this query, given this selection of text?"
Pipeline recap — Collect training points: (query, app) → score
Collecting evaluation points
• Hire full-time paid testers
• (query, app) → score from 1 to 5
• Hundreds of points per tester per day
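A small sketch of how raw tester judgments might be turned into training labels. The 1-to-5 scale and the (query, app) → score shape come from the slides; averaging multiple judgments per pair is an assumption about the aggregation step, and the judgments below are toy data.

```python
# Hypothetical aggregation of tester judgments into training labels.
# Averaging multiple judgments per (query, app) pair is an assumption.
from collections import defaultdict
from statistics import mean

# (query, app_id, score_from_1_to_5) as testers might submit them (toy data)
raw_judgments = [
    ("nearby restaurant", "app_42", 5),
    ("nearby restaurant", "app_42", 4),
    ("nearby restaurant", "app_17", 2),
]

by_pair = defaultdict(list)
for query, app_id, judged_score in raw_judgments:
    by_pair[(query, app_id)].append(judged_score)

training_points = {pair: mean(scores) for pair, scores in by_pair.items()}
print(training_points)
# {('nearby restaurant', 'app_42'): 4.5, ('nearby restaurant', 'app_17'): 2}
```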
Pipeline recap — Train machine-learned regression model: <feature1, …, feature100> → score
Q: What kind of regression model does Quixey use?
A: Commercial Gradient Boosted Decision Trees (GBDT) for search ranking
…and other ML stuff for query understanding, dynamic app classification, cross-platform app edition merging
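This is not Quixey's commercial TreeNet setup, but a minimal stand-in using scikit-learn's GradientBoostingRegressor shows the same shape: fit ~100-dimensional feature vectors to 1-to-5 relevance scores, then score new (query, app) pairs. The data here is random toy data.

```python
# Stand-in for the commercial TreeNet GBDT: same shape, open-source library, toy data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X_train = rng.random((500, 100))     # 500 judged (query, app) pairs, 100 features each
y_train = rng.integers(1, 6, 500)    # tester scores on a 1-5 scale (toy labels)

model = GradientBoostingRegressor(
    learning_rate=0.1,               # analogous to TreeNet's learnRate
    n_estimators=200,                # analogous to TreeNet's maxTrees
    max_depth=3,
)
model.fit(X_train, y_train)

x_new = rng.random((1, 100))         # feature vector for one new (query, app) pair
print(model.predict(x_new)[0])       # predicted relevance score
```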
Choosing the best model
• TreeNet outputs models that minimize mean-squared error on training points
  • a metric on (query, app, score) points
• Our real metric is Normalized Discounted Cumulative Gain (nDCG)
  • a metric on (query, <app1, …, app5>) multi-app rankings
• We might evaluate several TreeNet models before seeing a real nDCG improvement. We might retry regression with:
  • Different combinations of features
  • Different TreeNet settings like learnRate and maxTrees, different splits of data
  • More/better data (or fixes to errors in our training data)
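A short sketch of the metric distinction on this slide: the regressor is trained against per-(query, app) squared error, but model selection uses nDCG over a ranked list of apps per query. The DCG/nDCG formulas are standard; the top-5 cutoff and the example relevance lists are assumptions.

```python
# nDCG over a per-query ranked list, as opposed to per-point squared error.
import math
from typing import List

def dcg(relevances: List[float], k: int = 5) -> float:
    """Discounted cumulative gain of the top-k results, in ranked order."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg(relevances: List[float], k: int = 5) -> float:
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

# Tester relevance scores (1-5) for the five apps a model ranked for one query:
model_a = [5, 3, 4, 2, 1]   # nearly ideal order
model_b = [2, 5, 1, 4, 3]   # same apps, worse order
print(round(ndcg(model_a), 3), round(ndcg(model_b), 3))   # approx. 0.987 0.831
```

Mean nDCG across many judged queries would then be the number compared across candidate models.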
Outline • What is Functional Search™? • How it works • The future
Apps -vs- Web
Technological Web (diagram): the query "list nearby karaoke places", answered by apps reachable via URLs
Functional Web (diagram): URL → Function
278 Castro Street, Mountain View, CA 94041 • www.quixey.com
Liron Shapira, CTO • liron@quixey.com