R For A Data Driven SEO Workflow
Sam Collins – SEO Manager
Sam Hall – Data Scientist
What does TravelSupermarket do? “To reduce the stress of booking travel by providing the best UK comparison experience”
Today we are going to take you through…
• How to solve a problem like SEO
• Turning noise into insight
• Proving value in what we’ve learnt
Why is Search Engine Optimisation important to us?
~60% of revenue, ~50% of visitors
Firstly, it has a lot of moving parts
200+, including: User Signals, Keywords, Trustworthiness, Semantic Relevance, Site Speed, UX, Backlinks, Brand Salience, Reviews
Thirdly, travel is a big marketplace: 60,000 keywords
…and then Google changes the rules…
User Signals, Keywords, Trustworthiness, Semantic Relevance, Site Speed, MEDIC, Site Profiling, Authorship, UX, Backlinks, Brand Salience, Reviews
…and doesn’t say how. Cool, thanks Google.
Which leaves you with one question: WHY?!!
[Chart: performance of Page A and Page B over time, with the Google update marked]
R to the rescue! What do we know?
• Ranking Factors
• Winners & Losers
• Date of Change
So, we can build a model
I R Scientist
Ranking data · Build models · Watch out Google!
Reframing this as a supervised learning problem, we can understand which features are associated with pages that were affected by the update.
Data sources:
• Third-party SEO tools (urlprofiler)
• Google Search Console API
• Google Analytics
• Web scraping (Chrome CLI)
CRAN: searchConsoleR, googleAnalyticsR
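To make the supervised framing concrete, here is a minimal sketch of how pages could be labelled as affected or not by comparing clicks before and after the update date. It is in Python rather than R, and the page URLs, click counts, the `avg_session_secs` feature and the 20% threshold are all hypothetical; the talk does not say how the labels were actually defined.

```python
from statistics import mean

# Hypothetical per-page data: daily SEO clicks before and after the update,
# plus one candidate feature. All values are invented for illustration.
pages = {
    "/holidays/spain": {"before": [120, 130, 125], "after": [80, 85, 78], "avg_session_secs": 95},
    "/holidays/greece": {"before": [200, 210, 190], "after": [205, 198, 202], "avg_session_secs": 340},
    "/holidays/italy": {"before": [50, 55, 60], "after": [52, 50, 51], "avg_session_secs": 310},
}


def label_page(before, after, threshold=0.2):
    """Return 1 if mean clicks fell by more than `threshold` (assumed cut-off), else 0."""
    change = (mean(after) - mean(before)) / mean(before)
    return 1 if change < -threshold else 0


def build_training_rows(pages):
    """Turn raw page data into feature rows with a binary 'affected' target."""
    rows = []
    for url, data in sorted(pages.items()):
        rows.append({
            "url": url,
            "avg_session_secs": data["avg_session_secs"],
            "affected": label_page(data["before"], data["after"]),
        })
    return rows


training_rows = build_training_rows(pages)
for row in training_rows:
    print(row)
```

Rows like these, with many more features per page, are the kind of table a classifier could then be trained on.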
We can explain the output of our model with the help of SHAP
SHAP (SHapley Additive exPlanations) values help explain each feature’s contribution to the prediction
[Plot: SHAP values per feature, ranging from negative impact on model prediction to positive impact on model prediction]
CRAN: SHAPforxgboost
https://github.com/pablo14/shap-values
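As an illustration of what a SHAP value is, the sketch below computes exact Shapley values for a single prediction by enumerating every subset of features. This brute-force approach is only practical for a handful of features and is not presented as the way SHAPforxgboost computes its values. The toy model, feature names and baseline row are assumptions made up for the example.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one row `x`.

    Features outside a coalition are replaced by their `baseline` value.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        row = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return predict(row)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = value(set(subset) | {f})
                without_f = value(set(subset))
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi


def toy_model(row):
    """Hypothetical score for 'page sees a decrease'; not the model from the talk."""
    score = 0.5
    if row["session_mins"] > 5:
        score -= 0.3
    if row["has_reviews"]:
        score -= 0.1
    return score


page = {"session_mins": 7, "has_reviews": 1}
baseline = {"session_mins": 2, "has_reviews": 0}
phi = shapley_values(toy_model, page, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the page and the prediction for the baseline row, which is the additive property that makes the values easy to read feature by feature.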
Higher session duration is good
Landing pages with session durations > 5 minutes had a negative impact on the model prediction, i.e. they were less likely to see a decrease
How do we keep people on our site?
Old destination template vs new destination template
How do we test SEO changes?
We can’t use randomised control trials…
[Diagram: a 50/50 split between Champion (current) and Challenger (new)]
But we can model the counterfactual to estimate the treatment effect
• Split all holiday pages into representative page groups (treatment and control)
• Use control to predict treatment
• Make the change to the treatment pages
• Impact: the gap between treatment actual and treatment predicted
[Chart: SEO clicks (proxy for ranks) over time for control actual, treatment actual and treatment predicted, with the date the change was made marked]
CRAN: CausalImpact, bsts
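One way to picture the counterfactual step: learn the relationship between control and treatment clicks before the change, then use the control series after the change to predict what the treatment pages would have done without it. The sketch below swaps in a plain least-squares line for the Bayesian structural time-series model that CausalImpact fits, so it gives no uncertainty intervals. The click numbers and function names are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_uplift(control, treatment, change_index):
    """Predict post-change treatment clicks from control and compare with actuals."""
    slope, intercept = fit_line(control[:change_index], treatment[:change_index])
    predicted = [slope * c + intercept for c in control[change_index:]]
    pointwise = [a - p for a, p in zip(treatment[change_index:], predicted)]
    cumulative = []
    running = 0.0
    for effect in pointwise:
        running += effect
        cumulative.append(running)
    return predicted, pointwise, cumulative


# Hypothetical daily SEO clicks; the change goes live at index 5.
control = [100, 110, 120, 130, 125, 115, 105, 120]
treatment = [210, 230, 250, 270, 260, 260, 245, 270]

predicted, pointwise, cumulative = estimate_uplift(control, treatment, change_index=5)
print("predicted (counterfactual):", [round(p, 1) for p in predicted])
print("cumulative uplift:", [round(c, 1) for c in cumulative])
```

Because the control pages experience the same seasonality and the same Google updates as the treatment pages, those effects are carried into the prediction and drop out of the estimated uplift.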
What makes this method perfect for SEO at TravelSupermarket?
• Accounts for seasonality
• Valid when Google comes along and messes things up
• Like other channels, success can be measured with a monetary value
Here’s one we made earlier…
Hypothesis: Adding price to the meta title of the page will improve position and click-through rate.
It worked.
[Chart: treatment vs control, with cumulative uplift]
In Summary: We used R to…
• SEO is hard
• Cutting through the noise
• ££££
References:
• S. M. Lundberg, S.-I. Lee (2017). A Unified Approach to Interpreting Model Predictions. http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf
• K. H. Brodersen, F. Gallusser, J. Koehler, N. Remy, S. L. Scott (2015). Inferring Causal Impact Using Bayesian Structural Time-Series Models. https://research.google.com/pubs/pub41854.html
• https://www.distilled.net/resources/what-is-seo-split-testing/
• https://cran.r-project.org/web/packages/searchConsoleR/searchConsoleR.pdf