From Split Testing to Multivariate Testing
ClickTrain October 2024
In the ever-changing landscape of Google Ads management, optimising ad performance is crucial for maximising return on investment. One of the all-time go-to strategies for refining your Google Ads campaign messaging is split testing, also known as A/B testing. However, how we do this has become less straightforward over the years, particularly with the introduction of Responsive Search Ads (RSAs) and Google’s extensive multivariate testing.
This guide is for anyone who wants to structure manual Google Ads copy tests and is looking for practical ideas to drive their success.
Why is split testing still relevant today?
RSAs use up to 15 headlines and 4 descriptions, giving more than 52,000 possible combinations for a single ad, excluding any variation driven by ad extensions. In essence, Google is always running a multivariate test. Depending on the size of your account, you may start seeing system recommendations on these variants, with Google flagging poorer-performing headlines, highlighting better-performing ones and identifying winning element combinations.
Remember that Google needs between 2,000 and 3,000 impressions per variant, so this process can take a long time. If you need direction on improving your ads quickly and don’t have a high-traffic account, you will want to consider manual A/B testing, which doesn’t depend on that volume of data or that long a wait.
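To get a feel for why Google’s own testing can drag on, here is a rough back-of-the-envelope sketch in Python. The impression threshold is the midpoint of the range above; the number of variants in play and the daily traffic figure are purely hypothetical.

```python
# Rough estimate of how long Google's multivariate testing might take.
# 2,500 is the midpoint of the 2,000-3,000 impressions-per-variant range above;
# the other two inputs are hypothetical.
impressions_needed_per_variant = 2_500
variants_in_play = 40        # hypothetical number of combinations Google is actively serving
daily_impressions = 1_500    # hypothetical ad group traffic

days = impressions_needed_per_variant * variants_in_play / daily_impressions
print(f"Roughly {days:.0f} days before every variant has enough data")  # ~67 days
```

On a low-traffic account that is months of waiting, which is exactly when a focused manual test earns its keep.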
The difference between A/B and multivariate testing
| | A/B testing (split testing) | Multivariate testing |
|---|---|---|
| What is it? | Compares two versions of a single element to determine which one performs better. | Compares multiple elements, and the combinations of them, at the same time to determine which combination performs best. |
| How does it work? | Each variant is shown to a separate group, i.e. variant A is shown to group A and variant B is shown to group B. The winning variant is then chosen. | Many combinations of elements are served at once and the performance of each combination is compared, which is essentially what Google does automatically with RSAs. |
| When would you use it? | Split testing is simpler and quicker to set up and analyse because it focuses on one variable at a time. It is best for testing specific changes, like different ad copy or landing page designs. | Multivariate testing needs far more traffic and time, but it reveals how elements interact. It is best when you want to optimise several elements at once and have the volume to support it. |
How to set up A/B tests for RSAs
Experiments & variants
Setting up a Google Experiment using ad variations is probably the most “scientific” way of approaching ad copy testing, and the trade-off is that it’s fairly limited in what you can test at a time. That isn’t a bad thing: it lets you set up quick, focused tests and move on to the next element.
You can choose the experiment run time, and Google provides a helpful reporting dashboard where you can monitor your test. Google will tell you whether enough conclusive data was gathered during your test period and will indicate which variant won or lost. Below is an example of what this dashboard looks like for a variant test.
Ad variations can be set up for one specific ad or across many campaigns for the following:
- Headlines (whole headlines or just particular words)
- Descriptions (whole descriptions or just particular words)
- Landing pages
- Specific wording (using the find-and-replace function to test one word or phrase against another; see the sketch after this list)
- Pinned vs unpinned headlines and descriptions
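As a quick illustration of the find-and-replace idea, the sketch below mimics what that option does to a set of headlines. The headlines and the word swap are hypothetical, and this is plain Python rather than anything in the Google Ads interface or API.

```python
# Illustrative only: mimics the effect of a "find and replace" ad variation.
# The headlines and the swapped word are hypothetical examples.
original_headlines = [
    "Fast Boiler Repairs",
    "Fast Same-Day Quotes",
    "Trusted Local Engineers",
]

def make_variant(headlines, find, replace):
    """Return a variant set of headlines with one word or phrase swapped."""
    return [h.replace(find, replace) for h in headlines]

variant_headlines = make_variant(original_headlines, "Fast", "Rapid")
# Variant B now tests "Rapid" against "Fast" while everything else stays identical.
```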
RSA vs RSA
This is where we move into a more “unscientific” approach, but one that PPC managers use in practice. It is closer to the kind of ad copy testing we used to do in the early days of running ad accounts, with multiple ads running in rotation over a 30 to 90 day period. It’s not strictly A/B testing in the statistical sense, as you won’t be able to split audiences into group A and group B: you are simply running both ads to a general search audience, with a high probability that the two different ads will be shown to the same person.
This approach could help test something more abstract, such as voice or tone, particularly in lower-volume ad accounts and ad groups. Because RSAs result in so many variants, you’d have to lower the number of headlines and descriptions to reduce the sheer number of combinations and ensure that each headline and description gets enough potential eyeballs on it.
A general guide would be 5 headlines and 3 descriptions. The aim is to identify a winner within a month and move on to the next test or decide to expand your current RSA and let Google continue testing combinations for you.
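To see why trimming matters, here is a minimal sketch comparing the serving combinations of a full RSA with the trimmed 5-headline, 3-description version. It assumes Google serves up to three headlines and two descriptions and that position matters; other counting conventions (including the headline figure quoted earlier) give different totals, but the contrast is the point.

```python
from math import perm

# Ordered arrangements, assuming up to 3 headlines and 2 descriptions are served.
# Exact totals depend on counting assumptions and on pinning; the gap is what matters.
full_rsa    = perm(15, 3) * perm(4, 2)   # 15 headlines, 4 descriptions -> 32,760
trimmed_rsa = perm(5, 3) * perm(3, 2)    #  5 headlines, 3 descriptions -> 360

print(full_rsa, trimmed_rsa)
```

With only a few hundred combinations in play, each headline and description has a much better chance of accumulating meaningful impressions within the one-month window suggested above.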
Determining the success of a Google Ads ad copy test
To effectively measure success, clearly defining what success looks like is essential. This involves understanding the entire marketing funnel. There’s no point in spending money to attract people to your ads if they aren’t interested in what happens after they click.
Sometimes, you may even want to deter the wrong users, for example with a value proposition so exclusive that casual browsers don’t want to click. In this instance, you might not have the best CTR, but the clicks that do end up on your website are far more likely to convert.
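Whatever metric you settle on, you also need enough data to separate a real winner from noise. Below is a minimal sketch, using hypothetical click and conversion counts, of a simple two-proportion z-test on two variants’ conversion rates. This is one common statistical approach, not a feature of Google Ads.

```python
from math import sqrt
from statistics import NormalDist

def conversion_rate_test(clicks_a, conv_a, clicks_b, conv_b):
    """Two-proportion z-test on conversion rate (conversions / clicks).

    Returns the z-score and two-sided p-value; a p-value below ~0.05
    suggests the difference between variants is unlikely to be noise.
    """
    p_a = conv_a / clicks_a
    p_b = conv_b / clicks_b
    pooled = (conv_a + conv_b) / (clicks_a + clicks_b)
    se = sqrt(pooled * (1 - pooled) * (1 / clicks_a + 1 / clicks_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical numbers: variant A converted 60 of 1,200 clicks, variant B 85 of 1,250.
z, p = conversion_rate_test(1200, 60, 1250, 85)
print(f"z = {z:.2f}, p = {p:.3f}")
```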
For most Google Ads accounts, conversion rates are a crucial metric for measuring success, and this should also be true when gauging your ad copy’s success. Even if an ad isn’t the exact point at which a user converts, it’s part of the journey and ties into a concept known as “ad scent”, or message match.
Ad Scent Consistency
Ad scent refers to the consistency between your ad and the landing page it leads to. When users click on an ad, they expect the landing page to deliver on the promise made in the ad. This means the content, design, and overall feel should align with what initially attracted them. If the landing page matches the ad’s “scent,” users will likely stay and convert.
Maintaining a consistent ad scent, focusing on meaningful conversions, and clearly defining success are critical components of a successful ad account.
What types of tests can you run?
Depending on your capacity to test, it could be worth setting aside time to define a testing schedule over a three to six-month period, so that you can keep pushing innovation on your accounts even when you are in the throes of the daily and monthly grind of routine account management. Pushing for this kind of testing on your “always on” campaigns, particularly when a client tends to run constant monthly campaigns back to back, could help you gain long-term messaging insights that will be valuable to both you and your client.
Below are some prompts to help seed some of your testing ideas. You may also want to speak to your client to see if they have any unique elements they’d like to test or specific user sentiment tests that you can assist with by running A/B testing.
Pinning and what to pin?
Google tends not to recommend pinning, but it can be a useful way of giving you and your client control over messaging. Some options for testing with pinning (a small illustrative sketch follows this list):
- Will the quality of the traffic I am driving to my landing page improve if I include the pricing of my service?
- Does having a countdown timer pinned in your top headlines drive more sales?
- In what position does dynamic keyword insertion work best, if at all?
- Does having a dynamic location pulling into the headline improve ad performance?
- Should you have your CTA before your brand name or vice versa?
- Pinning winning combinations identified through Google’s multivariate tests vs leaving them unpinned
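To make the pinned vs unpinned comparison concrete, here is a small illustrative sketch of how the two variants might be planned. The data structure, position labels and ad copy are hypothetical and this is not the Google Ads API, although the {KeyWord:...} syntax shown is standard dynamic keyword insertion.

```python
# Illustrative planning structure only, not the Google Ads API.
# Position labels and ad copy are hypothetical; {KeyWord:Boiler Repairs}
# is standard Google Ads dynamic keyword insertion syntax.
pinned_variant = {
    "headlines": [
        {"text": "{KeyWord:Boiler Repairs}", "pin": "HEADLINE_1"},  # DKI pinned to position 1
        {"text": "Book Online Today",        "pin": "HEADLINE_2"},  # CTA ahead of the brand name
        {"text": "Acme Heating Ltd",         "pin": None},
    ],
    "descriptions": [
        {"text": "Fixed-price repairs with no call-out fee.", "pin": "DESCRIPTION_1"},
        {"text": "Gas Safe registered engineers near you.",   "pin": None},
    ],
}

# Same assets, nothing pinned, so Google is free to test combinations.
unpinned_variant = {
    "headlines": [{"text": h["text"], "pin": None} for h in pinned_variant["headlines"]],
    "descriptions": [{"text": d["text"], "pin": None} for d in pinned_variant["descriptions"]],
}
```

Run the two as separate RSAs (or as an ad variation) over the same window and judge them on conversion rate rather than CTR alone.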
Language
Performance language can be a fascinating thing to test: does using it, or leaving it out, work better for your audience? Some ideas of how this could look are as follows:
- Amount off vs percentage off: which is more appealing?
- Do you get better results if you include terms like “free”, “guaranteed”, “today” or “now”? Or does it make no difference?
- Half price vs 50% off
- Pricing or no pricing included
- “What” vs “why” sentiment in headlines, i.e. what the offering is vs what the offering can do to improve your customer’s life
Voice
This test is more impressionistic, but it may be insightful if conclusive results can be drawn from it. Writing two very different ads, each with a limited set of headlines and descriptions that conveys a distinct feel, could provide a window into what matters most to your market. Here are some ideas for “voices” to test:
- Gender focused
- Exclusivity
- Aspirational
- Hard sell
Conclusion
Google’s continued testing and system recommendations are beneficial, providing valuable insights and automated optimisations that can enhance campaign performance. However, the ability to run ad copy tests remains an instrumental skill for PPC managers.
By manually testing different ad variations, you can bring in uniquely human insights that automated systems might overlook, ensuring that your campaigns are both effective and aligned with your client’s brand voice and goals.