Why you need a hypothesis

When thinking about site tests, there are two approaches you could take.

The first is to just choose something to test. This is appealing; it would be fun and easy. Whatever caught your eye, you could change and run as an A/B test against the original version. Sometimes you would be lucky. More often, though, you would be wrong. The changes would be coming from you rather than from what the audience wants. A site change’s success is entirely dependent upon the audience’s reaction. If the audience isn’t happy, your site change will not work.

The second approach is to think about the details of the test and form a hypothesis. When you set up a hypothesis you consider your audience’s point of view.

A hypothesis can be thought of as a testable short story that describes user behavior when a change is introduced. For instance, a hypothesis might be, “Changing the hero image copy to focus on the design benefit rather than technical benefits will increase revenue by 10%”. It states what’s changing (the copy) and the predicted audience behavior (10% increase in revenue).

While it doesn’t appear in the hypothesis statement, the most important part of constructing a hypothesis is why you think it will work. If you can’t convincingly answer that, then your hypothesis is very unlikely to be successful.

You can answer why a hypothesis will work because you are knowledgeable about your audience. Testing with a hypothesis is a great way to increase that knowledge. If you test new copy and it works, it enriches your existing ideas about your audience. If it doesn’t work, it adds to your knowledge of your audience. You had an idea as to why it should work but that idea was wrong. So now you need to reconsider what you know about your audience. You can use this information to make your future tests stronger.

Without a hypothesis, a winning or failing test lacks context. It’s hard to interpret because the test wasn’t created with the audience and site in mind. Instead of enriching already known information, the test results are isolated data points.

Hypotheses are also useful because you can use them to prioritize site changes. If you have many changes you want to make to a site, look at what hypotheses are most worthwhile and run those tests first. You get the most impact for your testing efforts.

Finally, creating hypotheses is a good way to get support from the people you work with. A series of hypotheses that ties into your site strategy will give people confidence that you know what you’re testing, why you’re testing it and what outcome you anticipate.

Choosing between hypotheses with confidence

A common site optimization problem is that you have too many hypotheses and not enough time to test them all. So the hypotheses have to be prioritized. People prioritize using scope, predicted impact, timeline, dependencies and risk. An often overlooked factor, though, is confidence.

Confidence is simply a measure of how certain you are that the test will succeed.

A hypothesis that is well supported by evidence is more likely to be successful, and so you’ll have a higher confidence level. The more information you have to go on, the better you can refine your hypothesis and anticipate errors. The variables that could upset your results are decreased and the hypothesis is more precise. Knowing more doesn’t guarantee that your hypothesis will be successful, but it does make it more likely.

A hypothesis with less data to support it is less likely to be successful, and so you’ll have lower confidence. Your hypothesis is less trustworthy because the information that you are unaware of can radically change the interpretation of facts. Knowing less doesn’t mean that your hypothesis will fail, but it does make failure more likely.

As a general rule, more information means greater confidence in a hypothesis and less information means a lower level of confidence. Depending on risk tolerance – both personally and within the company – low or high confidence in a hypothesis can be a significant prioritization factor.
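One way to make these prioritization factors concrete is to combine them into a rough score. The sketch below is a minimal illustration in Python; the formula (predicted impact × confidence ÷ effort) and all the numbers are invented assumptions, not a standard method, and you would weight the factors to match your own risk tolerance.

```python
# Sketch: rank hypotheses by a rough priority score.
# The scoring formula (impact * confidence / effort) and all the
# example values are illustrative assumptions, not a standard method.

hypotheses = [
    {"name": "Hero copy: design benefits", "impact": 10, "confidence": 0.7, "effort": 2},
    {"name": "Bigger photo-upload button", "impact": 5, "confidence": 0.9, "effort": 1},
    {"name": "Redesign checkout flow", "impact": 20, "confidence": 0.3, "effort": 8},
]

def score(hypothesis):
    """Higher predicted impact and confidence raise the score; effort lowers it."""
    return hypothesis["impact"] * hypothesis["confidence"] / hypothesis["effort"]

# Test the highest-scoring hypotheses first.
for h in sorted(hypotheses, key=score, reverse=True):
    print(f'{h["name"]}: {score(h):.2f}')
```

Note how a low-confidence hypothesis with a big predicted impact can still land at the bottom of the queue; that is the trade-off the scoring makes explicit.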

So what do you do if you have a fantastic idea, but not a lot of data to back it up, i.e. your confidence level is low? My favorite solution is to run thought experiments to find evidence to prove, refine or disprove the hypothesis.

Sometimes, though, even if you can’t find a lot of information to back up a test prior to launch, it’s still worth running. It comes down to what the rest of the testing queue looks like and what a successful or losing test would tell you.

Elements of a hypothesis

Hypotheses are a necessary tool for site testing. A hypothesis can be thought of as a short story that predicts how users will react when a site change is introduced.

Hypotheses are valuable because they keep you focused. If you can’t say what impact you expect a change will have or why you believe the change will work, it’s a good time to dig into the evidence and make sure you really understand what’s going on. Writing a hypothesis is an early test of your idea to ensure it holds together before pushing changes live on the site. Hypotheses also make it easier to prioritize tests and share ideas with other people in the company.

The elements of a hypothesis include: the area or functionality; what you want to change; why it matters to your audience; the metric you’re tracking; and the predicted outcome.

Let’s run through a quick example of how this all comes together.

Suppose you run a social media platform. You want your audience to interact with the platform more frequently both as creators and viewers of content. You believe the best way to accomplish this is to increase the number of photos people upload.

Area or functionality: Your audience’s interaction with the site’s interface, specifically uploading images.

What you want to change: The strength of the button for image uploads so people are more likely to see it.

Why it matters to the audience: Focus groups have indicated that customers love sharing photos over the social media network because it is quicker and easier than writing an update but still feels like keeping in touch. However, usability tests have shown that customers have trouble finding the button with which to upload photos. (Why you think it matters won’t be in the final hypothesis statement but it’s a very important step when building hypotheses. If you can’t convincingly explain why you think an audience will react well to the change you are proposing, your hypothesis is more likely to fail.)

Metric: The number of photo uploads divided by the number of total updates on the social media platform. Right now, photo uploads account for 3% of all updates.

Predicted outcome: An increase in photo uploads so they account for 5% of all updates.

Pulling all these pieces together, the hypothesis is: “Making the photo upload button stronger will increase the share of photo uploads to 5% of all updates.”
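The metric and predicted outcome above are straightforward to compute and check. Here is a minimal sketch; the update counts are invented for illustration, since the example only gives the 3% baseline and the 5% target.

```python
# Sketch: compute the photo-upload share metric from the example.
# The raw update counts below are invented for illustration.

def photo_share(photo_uploads, total_updates):
    """Photo uploads as a fraction of all updates on the platform."""
    return photo_uploads / total_updates

baseline = photo_share(300, 10_000)  # 3% of all updates, as in the example
variant = photo_share(520, 10_000)   # hypothetical result after the change

target = 0.05  # predicted outcome: photo uploads reach 5% of all updates
print(f"baseline {baseline:.1%}, variant {variant:.1%}, hit target: {variant >= target}")
```

Defining the metric as a share of all updates, rather than a raw count, matters: a raw count could rise simply because overall activity rose, telling you nothing about the button change itself.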

If you have a great idea but are having trouble creating the elements of a hypothesis, it may be because you need to narrow down what you’re testing. It is hard to create an effective hypothesis, and effective site tests, around an idea that is too broad. Another possibility is that your idea is sound but you need to refine it a bit more before you can act on it. By digging into available evidence, you can develop more insight into what will work and why.

Thought experiments: Groundwork for successful site changes

By running thought experiments before making a site change, you increase the chances that when you roll out the change it will be successful.

When you run a thought experiment, you attempt to prove, disprove or refine an idea with existing evidence. You can iterate on ideas very quickly; thought experiments are extremely cheap to run and, since they are not audience facing, they are also very low risk.

The evidence you dig into can be from a variety of sources, e.g. site analytics, focus groups, surveys, etc. The goal is to understand why a change would or would not work.

To walk through an example… Suppose the company you work for sells artwork that ranges from cheap prints to fairly expensive reproductions. Most of the orders are for cheap prints, so the company focuses on these low value, high volume sales. You have been instructed to increase revenue by promoting the cheap prints more effectively.

As a thought experiment, you decide to test whether the low value sales are what the site should primarily promote.

When you look at the order numbers, you find that most purchases are in fact low value. But when you look at the revenue, there are two peaks. One peak corresponds to the low value sales. The other peak corresponds to the high value sales. While there are far fewer high value sales, the margin is so much higher they bring in a lot of revenue.

This suggests that instead of one primary audience behavior there are two – one behavior pattern is to purchase low value products and the other is to purchase high value reproductions.

To further refine your thought experiment, you pull out information on who these high value purchasers are. You discover that they are concentrated in urban areas, specifically Los Angeles, San Francisco and New York. The types of artwork that they most frequently purchase are large, abstract paintings. The paintings are purchased by individuals, not businesses. It appears likely that people are purchasing these paintings to decorate their apartments or homes.

By following a thought experiment you’ve found out a lot about a profitable audience segment that hadn’t been identified. You also have some new ideas for how to increase revenue, e.g. make high value artwork more visible on the site, expand the types of artwork offered, start a weekly feature about how to decorate your urban apartment with artwork.
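The revenue analysis in this thought experiment amounts to bucketing orders into price bands and comparing order counts with revenue per band. A minimal sketch, with invented order values and an assumed $100 cutoff between the bands:

```python
from collections import Counter

# Sketch: reveal the two revenue peaks by bucketing orders into
# price bands. Order values and the $100 cutoff are invented
# for illustration.

orders = [25, 30, 20, 35, 25, 30, 2200, 20, 25, 2500, 30, 1900]

def band(price):
    return "low value" if price < 100 else "high value"

order_counts = Counter(band(p) for p in orders)
revenue = Counter()
for p in orders:
    revenue[band(p)] += p

for b in ("low value", "high value"):
    print(f"{b}: {order_counts[b]} orders, ${revenue[b]} revenue")
```

With this data, low value orders dominate by count but high value orders dominate by revenue, which is exactly the pattern that looking only at order numbers would have hidden.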

Thought experiments can help you find patterns and audience behaviors that have been overlooked. By running thought experiments before making site changes, your site changes are more likely to be successful because they will be informed by evidence rather than assumptions.

How to decide what to test

Imagine this: You have just finished leading the installation of a new testing system on the company’s marketing site. You’ve learned the new testing interface. You’ve done your research on A/B and multivariate testing and which to use when. All eyes are on you to improve revenue generated by the site.

Now you need to decide: What to test?

What are you testing for?

Before you start developing any tests, decide what you are testing for. Make sure you know the company’s goals. What is it that you are solving for by testing? What metric needs to improve for a test to be judged successful? Why does this metric matter? People get into trouble by not thinking this through before testing.

For instance, perhaps the VP directs you to increase the average time on page. You run some tests and successfully increase average time on page by 20%. Your site changes are judged a failure because even though you did what was asked – increase the time on page – the number of leads didn’t increase. The VP made an assumption that the longer a customer was on the page, the more likely he would be to submit his information. This assumption was wrong. What you should have been testing for was how to increase the number of leads.

To avoid this type of situation it is very important to understand what metric you are testing for and why.

Once you have your metric sorted out, the next step is to choose what to test. There are a lot of facets to deciding what to test, but to get started I’ll run through a couple of basic approaches: Best practice tests and audience motivation tests.

Best practice tests

Best practice tests involve fixing obvious usability issues. The goal is to get out of the way of your audience. It’s not the most exciting form of testing, but usually there are lots of areas to improve, especially if the site has never been tested before.

There are many ways you can find usability issues on the site. For instance, you can talk to the lead designer and see what she would like to improve. You can go through the site and ask yourself, ‘If I’d never seen the site before, what would confuse me?’ You can run usability testing and see where people hit snags.

This type of testing tends to be straightforward and generate good results.

Audience motivation tests

Audience motivation tests are more interesting, but typically harder to get right. While best practice tests are about getting out of your audience’s way, audience motivation tests are about lining the site up with what your audience wants to do.

To start thinking about audience motivation tests, consider these questions:

  • What area are you testing? What is the metric?
  • What does the company want the audience to do and why?
  • What does the audience want to do and why?
  • What do you want the audience to do next?

The answers to these questions vary widely, but as an example:

  • What area are you testing? What is the metric? The product pages. Revenue.
  • What does the company want the audience to do and why? The company wants the audience to purchase their handmade artisan clothing to generate revenue and so forward the company’s mission statement of supporting local craftsmen.
  • What does the audience want to do and why? The audience wants to purchase the clothing because it is attractive and distinctive, which feeds into their sense of self, and because they are emotionally committed to projects that help small businesses. These purchases also support how they want the world to be.
  • What do you want the audience to do next? To add a product to the cart.

If you have compelling answers for all of these questions your site changes are more likely to be successful. For instance, in this case perhaps you need to have better pictures of the product, because the audience is motivated by the clothing being distinctive and attractive. Or perhaps you need to include information about the person who made the clothing because this will reinforce the customer’s ideas about helping small businesses and about how they want the world to work.

Audience motivation testing is generally more difficult. It requires research and imagination. But, in the long term, you will build an understanding of your audience and their motivations, which will help you construct a more effective site. After all, without an audience, your site is just a collection of code.