
 

 

Key Metrics

 


If you can’t measure it, you can’t improve it

If you want to increase advertising effectiveness or improve your marketing programs, you need to find a way to measure exactly what you’re trying to improve. This may sound easy enough, but the devil is in the details…

  • A leading magazine publisher found little correlation between retail scanner data and sales numbers from wholesalers. Yet wholesaler data was used to measure company performance, while scanner data was used to analyze test results.
     
  • An Internet retailer found nearly opposite results measuring click-through rate versus conversion rate. In a test of online advertising, detailed, yet fairly bland ads drove conversion, while vague messages, exciting graphics, and a nebulous offer drove clicks. The team joked that a blank box was the best way to increase click-through rate while ensuring zero new customers.
     
  • An experienced marketing team had tested direct mail programs for years, using “keycode” numbers on each different mail package to track sales. Unfortunately, the default keycode was the same as the control. When reply cards were mailed in, company employees commonly used the default keycode to simplify their workload. Since test cell responses were underreported, tests seldom beat the control.


For accurate, reliable data, you need to:
    (a)  Choose the right metrics
    (b)  Analyze data stability and reliability
    (c)  Plan the process for data collection
    (d)  Refine the scope of the testing project, if necessary
 

Choose the right metrics

The best metrics are usually direct measures of response, sales, and profitability. Sometimes direct-response metrics are not possible, so substitute metrics are required. Catalog requests, phone calls for more information, website visits, and reply cards for “free information” may accurately measure intent, but usually have less than one-to-one correlation with actual purchase behavior.

One key benefit of in-market testing is the ability to measure purchase behavior in the real marketplace (where customers talk with their wallets and don’t know they’re being tested). In this, testing remains the only way to prove which changes directly impact sales. Any departure from a true measure of sales adds error and may lead to incorrect conclusions.

Direct marketers take note: if you want to analyze dollar sales (e.g. average order size) as a key metric, you need to calculate the standard deviation, σ (“sigma,” a measure of data variability), of all individual orders in each test cell, along with the average. Without the standard deviation, you cannot calculate statistical significance. You should also remove any outliers. Sales data need this extra step because:

  • For response data, all you need is the number mailed and the number of orders. This is because for response (yes/no) data, the standard deviation is determined by the average: for a response rate p, it is √(p(1 − p)).
     
  • For sales data (and other “continuous” numbers), there is no such relationship, so the standard deviation must be calculated for every group of data. You need to collect the individual order sizes somewhere in your database.
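The difference between the two kinds of data can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: it assumes large samples (so a normal approximation is reasonable), and the function names and inputs are hypothetical. The first function needs only counts; the second needs every individual order amount so it can compute σ for each test cell.

```python
import math
from statistics import NormalDist, mean, stdev


def response_z_test(mailed_a, orders_a, mailed_b, orders_b):
    """Two-proportion z-test for response (yes/no) data.

    Only counts are needed: the standard deviation of yes/no data
    follows from the response rate itself, sqrt(p * (1 - p)).
    """
    p_a = orders_a / mailed_a
    p_b = orders_b / mailed_b
    pooled = (orders_a + orders_b) / (mailed_a + mailed_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / mailed_a + 1 / mailed_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def order_size_z_test(orders_a, orders_b):
    """Large-sample z-test for average order size (continuous data).

    Each argument is the list of individual order amounts in one test
    cell; the standard deviation must be computed from those amounts.
    Assumes outliers have already been reviewed and removed.
    """
    sd_a, sd_b = stdev(orders_a), stdev(orders_b)
    se = math.sqrt(sd_a ** 2 / len(orders_a) + sd_b ** 2 / len(orders_b))
    z = (mean(orders_b) - mean(orders_a)) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

With small test cells, a t-test would be a safer choice than the normal approximation used here.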
     

Analyze data stability and reliability

In direct mail—when every test cell is sent to a random selection of names in the same drop—data stability is of little concern. But tests running over a period of time or within different “test units” (e.g. stores, magazines, or regions) require the analysis of historical data to select stable and comparable test units.

Stability is a statistical term relating to how predictable performance is over time. If the data are stable, then you can be confident any change beyond a normal range (as defined statistically) is due to your test elements. You can assess stability by looking at how much historical data falls within statistical “control limits,” versus showing numerous outliers, trends, or other non-normal variation.
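One common way to compute control limits for a single historical series is an individuals chart, which sets limits at the average plus or minus 2.66 times the average moving range. The sketch below assumes that chart type (the choice is an assumption for illustration); the function names are hypothetical, and `history` stands for any list of periodic figures, such as weekly sales for one store.

```python
from statistics import mean


def individuals_control_limits(history):
    """Return (lower, center, upper) limits for an individuals chart.

    Uses the standard constant 2.66 (3 / 1.128) applied to the average
    moving range between consecutive observations.
    """
    moving_ranges = [abs(b - a) for a, b in zip(history, history[1:])]
    center = mean(history)
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def out_of_control_points(history):
    """List (index, value) pairs that fall outside the control limits."""
    lower, _, upper = individuals_control_limits(history)
    return [(i, x) for i, x in enumerate(history) if x < lower or x > upper]
```

A test unit whose history shows few or no out-of-limit points is a better candidate than one with frequent outliers. Trends and other patterns inside the limits need separate checks that this sketch does not cover.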

Measurement studies and stability analyses can determine how trustworthy your data are.
 

Plan the process for data collection

Running a test or measuring advertising effectiveness requires a sharper focus on the campaigns or test units being tracked. If you are testing a print ad in one magazine, you need to be able to separate responses to that one ad from all other market inquiries. Usually, you need a unique tracking number for each campaign or test cell you want to analyze. Examples include:

  • Catalog and direct mail: a unique source code (keycode) on catalogs and reply cards, a different 800 number, and a unique URL for tracking response (otherwise, you can try to match back all responses to the correct test recipe, though this approach introduces some error)
  • E-mail and Internet: a unique landing page and URL tag on each version that follows the visitor throughout the purchase
  • Print ads: a unique contact name, phone number, or web address
  • Mass media: tests run in certain regions, with regional sales compared to a historical baseline
  • Retail: sales data collected individually for each store
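However the tracking numbers are assigned, the tally should never let a missing or unrecognized code fall back to the control cell, which is the failure described in the direct mail example earlier. A minimal sketch of that idea follows; the keycodes, cell names, and function name are all hypothetical.

```python
from collections import Counter

# Hypothetical mapping from keycode to test cell. Note there is no
# "default" entry that points at the control.
KEYCODE_TO_CELL = {
    "K100": "control",
    "K101": "recipe_1",
    "K102": "recipe_2",
}


def tally_responses(keycodes):
    """Count responses per test cell.

    Blank or unknown keycodes are counted as "unattributed" so they can
    be investigated, rather than being credited to the control.
    """
    tally = Counter()
    for code in keycodes:
        cleaned = (code or "").strip().upper()
        tally[KEYCODE_TO_CELL.get(cleaned, "unattributed")] += 1
    return tally
```

A large "unattributed" count is a signal that the data collection process, not the test recipes, needs attention first.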
     

Refine the scope of the testing project, if necessary

After assessing the availability and reliability of your data, you may need to change the scope of your test. For example, a retail test may have to focus on in-store product and promotional elements if local/regional advertising cannot be measured accurately for each test recipe. An advertising test may have to be run in a number of metropolitan areas across the U.S., so the chance of overlap in exposure is minimal. And if you want to increase long-term customer profitability, you may need a much larger sample size than if response rate is the key metric.
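To see why the choice of metric changes the required sample size, a standard two-group approximation can help: the names per cell grow with the square of the metric's standard deviation relative to the smallest difference you want to detect. The sketch below assumes a two-sided test with a normal approximation and equal cell sizes; the function name and default settings (5% significance, 80% power) are assumptions for illustration, not figures from this article.

```python
import math
from statistics import NormalDist


def sample_size_per_cell(sigma, min_difference, alpha=0.05, power=0.80):
    """Approximate names needed in each of two cells.

    sigma: standard deviation of the metric per individual
           (for a response rate p, use sqrt(p * (1 - p))).
    min_difference: smallest true difference worth detecting,
                    in the same units as the metric.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sigma / min_difference) ** 2
    return math.ceil(n)
```

Metrics such as long-term profit per customer tend to vary widely from one customer to the next, so their sigma is large relative to the differences of interest and the required sample grows accordingly.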

A large investment firm wanted to test changes to their newspaper advertisements. (a) Since the ads did not produce sales directly, they selected phone responses and website visits as a measure of advertising effectiveness. (b) The company already had a computerized system to track different phone extensions and webpages. (c) Every ad would have a unique phone number and website address printed on it, so responses could be tracked back to the correct ad. The team took 16 phone extensions out of use to save for the test and created different landing pages for each test recipe. (d) With just six editions of the national newspaper, the test would run over a few weeks with a “control” version plus five test recipes running every week. With these restrictions, the team decided to limit the test to, at most, about a dozen test elements.


The next step is to leverage statistical power in the test design.


 

© LucidView 2008. All rights reserved. Contact: 888-LucidView (888-582-4384), info@lucidview.com