How Does Did Recommend Differ from Likely to Recommend?

Have you recommended a product to a friend?

Will you recommend that same product to a friend again?

Which of these questions is a better indicator of whether you will actually recommend a product?

If people were consistent, rational, and honest, the answer would be simple: the second question asks about future intent, so it would be the logical choice.

It may come as no surprise that people aren’t necessarily good predictors of their own future behavior.

How the question is phrased, the activity being asked about, and how far in the future it lies each affect how predictive the question is. It may, therefore, be risky to rely heavily on people’s predictions of whether they WILL recommend.

We’ll look at how well-stated intentions predict future actions in a future article. In this article, we’ll examine how asking people whether they recommended something (recalling the past) differs from asking whether they will recommend (a future prediction of behavior). Are you really asking something different? If so, how do they differ and what problems arise from using only reports of past behavior to predict future behavior?

Predicting Is Hard, Especially About the Future

In theory, people should be more accurate at recalling whether they did something than at predicting whether they will do something. With this in mind, Tomer Sharon recommends what he calls the aNPS (actual NPS): asking whether people recommended over a recent period of time. Similarly, Jared Spool argues past recommendations are a better predictor of someone’s loyalty than asking what people will do in the future. He uses an example from a Netflix survey to illustrate the point. Neither, however, provides data to support the idea that these are superior measures of future intent to recommend. (Finding data is hard.)

There is success in predicting what will happen based on what has happened, especially over short intervals and when people and things remain the same.

But even when ideal conditions exist—even for behavior that is habitual—past behavior is far from a perfect predictor of what will happen. For example, when predicting whether people will exercise in the future, past exercise behavior was found to be the best predictor of future exercise. But the correlation, while high, was not perfect (r = .69).

In a meta-analysis of 16 studies [PDF], Ouellette and Wood (1998) also found that past behavior correlated with future behavior, but the relationship was more modest (r = .39). Interestingly, though, they found behavioral intentions (how likely you say you are to do something) were better predictors of future behavior (r = .54). They also found a more complex interaction between past behavior and future intent: in stable contexts, past behavior is a better predictor of future behavior; but in unstable contexts, and when the behavior isn’t performed frequently, intention is a better predictor of future acts.

Recommending products or brands to friends and colleagues is likely a less stable and infrequent action. A future analysis will examine this in more detail.

People and Products Change

Using past recommendations to predict future behavior also presents another challenge to consider: changing experiences.

Even if we have a perfect record of what people did and could exclude memory bias, change happens. People change their minds, and products and companies change (often quickly). So what people did in the past, even the recent past, may not be as good an indicator of what they will do in the future. A good example of this also comes from Netflix.

In 2011, Netflix split its mail order DVD business from its online streaming and raised prices. This angered customers (many of whom likely recently recommended the service) and Netflix lost 800,000 customers by one estimate.

We were tracking Netflix’s NPS in 2011, and this change was captured in its Net Promoter Score, which fell from one of the highest in our database (73%) to a low of -7%, as shown in Figure 1.

Figure 1: The Netflix NPS in Feb 2011 and Oct 2011, before and after announcing an unpopular product change.

The Netflix example shows how past recommendations (even recent ones) can quickly become poor predictors of future attitudes and behavior. Netflix ultimately reversed its decision and more than recovered its subscriber base (and NPS score).

But that’s probably an extreme example of how a major change to a product can produce a major change in usage and recommendations. To understand more systematically how reports of past recommendations differ from future likelihood to recommend, we compared the two in two large datasets.

Study 1: Software Recommendations

In our 2014 consumer and business software reports, we asked current users of software products whether they had recommended the product and how likely they were to recommend it in the future using the 11-point Likelihood to Recommend (LTR) item from the Net Promoter Score. We used this information to quantify the value of a promoter.

On average, 29% of software users in our study reported that they recommended the product to at least one person in the prior year. Of these past recommenders, we found 41% were most likely to recommend it again—giving a 10 on the LTR item (Figure 2). Extending this to the top two boxes we see that 64% of people who recommended are also promoters. Those who responded 8–10 on the LTR item accounted for a substantial (85%) portion of the recommendations. Including passives and promoters (7–10) accounts for almost all (93%) of the recommendations.

You can see the large drop-off in recommendations between 8 and 7 (Figure 2), where the percentage of software users who reported recommending drops by more than half (from 21% to 8%) and then by half again (8% to 3%) when moving from 7 (passive) to 6 (detractor). It would be interesting to know why respondents in this study who did recommend in the past are less likely to in the future.

Figure 2: Percent of respondents who reported recommending a software product in the past and would recommend it in the future using the 0 to 10 LTR scale.
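The detractor/passive/promoter groupings used throughout this analysis follow the standard NPS cutoffs (0–6, 7–8, 9–10). A minimal sketch, with invented responses, of how LTR scores are bucketed and an NPS computed:

```python
# Sketch: segmenting 0-10 Likelihood-to-Recommend (LTR) responses into
# the standard NPS buckets and computing the Net Promoter Score.
# The responses below are invented for illustration.
from collections import Counter

def segment(ltr: int) -> str:
    """Standard NPS cutoffs: 0-6 detractor, 7-8 passive, 9-10 promoter."""
    if ltr <= 6:
        return "detractor"
    if ltr <= 8:
        return "passive"
    return "promoter"

responses = [10, 9, 8, 10, 6, 7, 10, 3, 9, 8]
counts = Counter(segment(r) for r in responses)

# NPS = % promoters minus % detractors (passives are counted in the base).
nps = 100 * (counts["promoter"] - counts["detractor"]) / len(responses)
print(counts, f"NPS = {nps:.0f}")
```

The top-two-box and top-three-box percentages discussed above are just the promoter and promoter-plus-passive shares of this same bucketing.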

Study 2: Recent Recommendation

To corroborate the pattern and look for reasons why past recommendations don’t account for all future recommendations, we conducted another study in November 2018 with a more diverse set of products and services. We asked 2,672 participants in an online survey to think about the product or service they most recently recommended to a friend or colleague, either online or offline. Some examples of recently recommended companies and products were:

  • Barnes & Noble
  • Amazon
  • eBay
  • Colourpop Lipstick
  • PlayStation Plus
  • Spotify music
  • Bojangles Chicken and Biscuits
  • Ryka—shoes for women

After the first few hundred responses we noticed many people had recently recommended Amazon, so we asked respondents to think of another company. After recalling their most recent recommendation, we used the 0 to 10 LTR scale (0 = not at all likely to recommend and 10 = extremely likely) to find out how likely they would be to recommend this same product or service again to a friend.

The distribution of likelihood-to-recommend responses is shown in Figure 3. About half (52%) of participants selected the highest response of 10 (extremely likely) and 17% selected 9. Put another way, of the people who recommended a product or service in the past, 69% are promoters (Figure 4). Figure 3 also shows the bulk of values are at 8 and above, accounting for 84% of responses, and another substantial drop in recommendations happens from 7 to 6 (from 8% to 2%).

Figure 3: Distribution of likelihood to recommend from 2,672 US respondents who recently recommended a product or service.

Figure 4 shows that 92% of all recommendations came from passives and promoters (almost identical to the 93% in Study 1). Across both studies around 8% of past recommenders were not very likely to recommend again. Why won’t they recommend again?

Figure 4: Percent of respondents that are promoters, passives, and detractors for products or services they reported recommending in the past (n = 2,672).

To find out, we asked respondents who gave 6 or below to briefly tell us why they’re now less likely to recommend in the future.

The most common reasons given were that the product or service was disappointing to them or the person they recommended it to, or it changed since they used it. Examples include:

  • I am slightly less likely to recommend because my purchase contained dead pixels (TCL television).
  • We’ve had recent issues with their products (lightexture).
  • The website was not as user-friendly as it could be (Kohl’s).
  • Service not what I was told it would be (HughesNet).
  • Product didn’t perform as expected (Pony-O).
  • The person I recommended it to went on to have an issue using the site (Fandango).

Several respondents also didn’t feel the opportunity would come up again, supporting the idea that recommendations are infrequent and that past behavior may therefore be a poor predictor of future behavior.

  • I usually recommend Roxy to those who compliment me on my Roxy shoes. If that doesn’t happen then I don’t really bring them up.
  • I don’t expect anyone in my immediate circle to be soliciting recommendations for Dell. If they asked, I would give them a positive recommendation.
  • If it comes up in the conversation, I will recommend it. If it doesn’t I won’t necessarily go out of my way to do it.

Summary and Takeaways

In our analysis of the literature and across two studies examining past recommendations and future intent to recommend, we found:

Did recommend and will recommend offer similar results. If you want to use past recommendations as a “better” measure of future intent, this analysis suggests future intent to recommend is highly related to past recommendations. Across two studies, around 92% of respondents who did recommend are also likely to recommend again (if you include both the promoter and passive categories).

Around 8% won’t recommend again. While past recommendations and future intent to recommend are highly related, around 8% of respondents who recently recommended a company, product, or experience won’t recommend it in the future (recommendation attrition). Even the most recent recommendation wasn’t a perfect predictor of stated intent to recommend. Expect recommendation attrition to increase as more time passes between the recommendation and when you ask about it.

Poor experiences cause recommendation loss. The main reason people who recommended in the past won’t in the future is that they (or the person they recommended the product to) had a disappointing experience. This can happen when participants have a more recent bad experience or because the product or service changed (as was the case when Netflix changed its pricing). Several participants also indicated they didn’t feel an opportunity to recommend would come up again.

Don’t dismiss future intentions. Past behavior may be a good indicator of future behavior, but not universally. A meta-analysis in the psychology literature suggests stated intent both moderates the effect of past behavior and, in many cases, is the better predictor of future behavior. We’ll investigate how this holds for product and company recommendations in a future article.

Ask both past recommendations and future intent. If you’re able, ask about both the past and the future. It’s likely that people who recommended and are extremely likely to recommend again are the best predictors of who will recommend in the future. A literature review found that a combination of past behavior and future intentions may better predict future behavior, depending on the context and frequency of the behavior. Several participants in this study indicated that their recommendations were infrequent and that, despite prior recommendations, they may be less likely to recommend again (even though their attitude toward their experience didn’t change).




Source: How Does Did Recommend Differ from Likely to Recommend?

Relationships, Transactions and Heuristics

There are two general types of customer satisfaction surveys: 1) Customer Transaction Surveys and 2) Customer Relationship Surveys.

Customer Transaction Surveys allow you to track satisfaction for specific events. They are typically administered soon after the customer has a specific interaction with the company and ask the customer to rate that specific interaction.

Customer Relationship Surveys allow you to measure your customers’ attitudes across different customer touchpoints (e.g., marketing, sales, product, service/support) at a given point in time. Their administration is not linked to any specific customer interaction with the company; they are typically administered at periodic times throughout the year (e.g., every other quarter, annually). Consequently, the relationship survey asks customers to rate the company based on their past experience.

While the surveys differ with respect to what is being rated (a transaction vs. a relationship), they can share identical customer touchpoint questions (e.g., technical support, sales).

A high-tech company was conducting both a transactional survey and a relationship survey. The surveys shared identical items. Given that the ratings were coming from the same company and shared identical touchpoint questions, they expected the ratings to be the same for both the relationship survey and the transactional survey. The general finding, however, was that ratings on the transactional survey were typically higher than ratings for the same question on the relationship survey. What score is correct about the customer relationship? Why don’t ratings of identical items on relationship surveys and transactional surveys result in the same score? Humans are fallible.

Availability Heuristic

There is a line of research that examines the process by which people make judgments. This research shows that people use heuristics, or rules of thumb, when asked to make decisions or judgments about frequencies and probabilities of events. There is a heuristic called the “availability heuristic” that applies here quite well and might help us explain the difference between transactional ratings and relationship ratings of identical items.

People are said to employ the availability heuristic whenever their estimate of the frequency or probability of some event is based on the ease with which instances of that event can be brought to mind. Basically, the things you can recall more easily are judged to be more frequent than the things you can’t. For example, when presented with a list containing an equal number of male and female names, people are more likely to think the list contains more male names when the male names belong to famous men. Because the famous names were more easily recalled, people conclude there must be more male names than female names.

Customers, when rating companies as a whole (relationship surveys), are recalling their prior interactions with the company (e.g., a call into phone support, receipt of marketing material). Their “relationship rating” is a mental composite of these past interactions: negative, positive, and mundane. Negative customer experiences, unlike positive or mundane ones, tend to be more vivid and visceral and, consequently, more easily recalled. When I think of my past interactions with companies, it is much easier for me to recall negative experiences than positive ones. So, due to the availability heuristic, customers thinking about a particular company might overestimate the number of negative experiences, relative to positive experiences, that actually occurred. Their relationship ratings would thus be adversely affected.

Ratings from transactional surveys, however, are less vulnerable to the availability heuristic. Because customers are rating one recent, specific interaction, there is little for memory to distort.

Summary and Implications

Customer satisfaction ratings in relationship surveys are based on customers’ judgment of past experiences with the company, and, consequently, are susceptible to the effects of the availability heuristic. Customers may more easily recall negative experiences, and, consequently, these negative experiences negatively impact their ratings of the company overall. While it would appear that a transactional survey could be a more accurate measure than a relationship survey, you shouldn’t throw out the use of relationship surveys just yet.

While average scores on items in relationship surveys might be decreased due to the availability heuristic, the correlation among items should not be impacted by the availability heuristic because correlations are independent of scale values; decreasing ratings by a constant across all customers does not have any effect on the correlation coefficients among the items being rated. Consequently, the same drivers of satisfaction/loyalty would be found irrespective of survey type.

I’d like to hear your thoughts.

Source: Relationships, Transactions and Heuristics

How the Right Loyalty and Operational Metrics Drive Service Excellence – Webinar


Last week, I spoke at the CustomerThink Customer Experience Thought Leader Forum, which includes customer experience researchers and practitioners sharing leading-edge practices. Bob Thompson, founder of CustomerThink, organized several sessions focusing on specific CX issues facing business today. In our session, titled Customer Service Excellence: How to Optimize Channel and Metrics to Drive Omnichannel Excellence, Stephen Fioretti, VP at Oracle, and I addressed two areas of customer service. He talked about how customer service organizations can align their channel strategy to customer needs by guiding customers to the right channel based on the complexity and time sensitivity of interactions. I talked about the different types of metrics that help us understand relationship-level and transaction-level attitudes around service quality.

Self-Service Channel Adoption Increases but Delivers a Poor Experience

Stephen reported some interesting industry statistics from Forrester and Technology Services Industry Association. While the adoption of self-service is on the rise, customers are substantially less satisfied (47% satisfied) with these channels compared to the traditional (and still most popular) telephone channel (74% satisfied). So, while automated service platforms save companies money, they do so at the peril of the customer experience. As more customers adopt these automated channels, companies need to ensure they deliver a great self-service experience.

Improving the Customer Experience of Automated Channels through Behavioral/Attitudinal Analytics

In the talk, I showed how companies, by using linkage analysis, can better understand the self-service channel by analyzing the data behind the transactions, both behavioral and attitudinal. After integrating different data silos, companies can apply predictive analytics on their customer-generated data (e.g., web analytics) to make predictions about customers’ satisfaction with the experience. Using web analytics of online behavior patterns, companies might be able to profile customers who are predicted to be dissatisfied and intervene during the transaction to either improve their service experience or ameliorate its negative impact.
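As an illustration of the kind of intervention rule such analytics might feed, here is a hypothetical sketch; the session fields and thresholds are invented, not from the webinar:

```python
# Sketch: flagging likely-dissatisfied self-service sessions from simple
# behavioral signals so an intervention (e.g., offering live chat) can be
# triggered mid-transaction. Field names and thresholds are hypothetical;
# a real system would fit them from linked behavioral/attitudinal data.
def likely_dissatisfied(session: dict) -> bool:
    return (
        session.get("error_pages", 0) >= 2         # repeated error pages
        or session.get("searches", 0) >= 4         # thrashing in search
        or session.get("minutes_on_task", 0) > 15  # stuck on one task
    )

session = {"error_pages": 3, "searches": 1, "minutes_on_task": 6}
if likely_dissatisfied(session):
    print("offer live-chat escalation")
```

In practice the rule would be replaced by a model trained on sessions linked to survey responses, but the intervention logic has the same shape.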

Stephen and I cover a lot more information in the webinar. To learn more, you can access the complete CX Forum webinar recording and slides here (free registration required).


Source: How the Right Loyalty and Operational Metrics Drive Service Excellence – Webinar

Beachbody Gets Data Management in Shape with Talend Solutions

This post is co-authored by Hari Umapathy, Lead Data Engineer at Beachbody and Aarthi Sridharan, Sr.Director of Data (Enterprise Technology) at Beachbody.

Beachbody is a leading provider of fitness, nutrition, and weight-loss programs that deliver results for our more than 23 million customers. Our 350,000 independent “coach” distributors help people reach their health and financial goals.

The company was founded in 1998, and has more than 800 employees. Digital business and the management of data is a vital part of our success. We average more than 5 million monthly unique visits across our digital platforms, which generates an enormous amount of data that we can leverage to enhance our services, provide greater customer satisfaction, and create new business opportunities.

Building a Big Data Lake

One of our most important decisions with regard to data management was deploying Talend’s Real Time Big Data platform about two years ago. We wanted to build a new data environment, including a cloud-based data lake, that could help us manage the fast-growing volumes of data and the growing number of data sources. We also wanted to glean more and better business insights from all the data we are gathering, and respond more quickly to changes.

We are planning to gradually add at least 40 new data sources, including our own in-house databases as well as external sources such as Google AdWords, DoubleClick, Facebook, and a number of other social media sites.

We have a process in which we ingest data from the various sources, store the data that we ingested into the data lake, process the data and then build the reporting and the visualization layer on top of it. The process is enabled in part by Talend’s ETL (Extract, Transform, Load) solution, which can gather data from an unlimited number of sources, organize the data, and centralize it into a single repository such as a data lake.

We already had a traditional, on-premise data warehouse, which we still use, but we were looking for a new platform that could work well with both cloud and big data-related components, and could enable us to bring on the new data sources without as much need for additional development efforts.

The Talend solution enables us to run new jobs again and again as we add new data sources to the data lake, without having to write new code each time. We now reuse an existing job as a template and simply supply a different set of parameters. That saves time and money and shortens the turnaround for any new data acquisition the organization needs.
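The template-plus-parameters pattern isn’t specific to Talend. A hypothetical sketch of the same idea (all names and endpoints are invented, not Beachbody’s actual configuration):

```python
# Sketch of the reuse pattern described above: one generic ingestion job,
# parameterized per source, instead of new code for every data source.
# All names, endpoints, and paths here are hypothetical.
from dataclasses import dataclass

@dataclass
class SourceConfig:
    name: str          # logical source name
    endpoint: str      # where to pull from
    target_path: str   # where to land it in the data lake

def ingest(cfg: SourceConfig) -> str:
    # A real job would extract from cfg.endpoint and write to
    # cfg.target_path; here we just report what would happen.
    return f"ingested {cfg.name} from {cfg.endpoint} into {cfg.target_path}"

# Adding a new source is a new parameter set, not a new job.
sources = [
    SourceConfig("ads", "https://example.com/ads-api", "s3://lake/raw/ads"),
    SourceConfig("email", "https://example.com/email-api", "s3://lake/raw/email"),
]
for cfg in sources:
    print(ingest(cfg))
```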

The Results of Digital Transformation

For example, whenever a business analytics team or other group comes to us with a request for a new job, we can usually complete it over a two-week sprint. The data will be there for them to write any kind of analytics queries on top of it. That’s a great benefit.

The new data sources we are acquiring allow us to bring all kinds of data into the data lake. For example, we’re adding information such as reports related to the advertisements that we place on Google sites, the user interaction that has taken place on those sites, and the revenue we were able to generate based on those advertisements.

We are also gathering clickstream data from our on-demand streaming platform, and all the activities and transactions related to that. And we are ingesting data from the marketing cloud, which has all the information related to the email marketing that we do. For instance, there’s data about whether people opened the email, whether they responded to the email and how.

Currently, we have about 60 terabytes of data in the data lake, and as we continue to add data sources we anticipate that the volume will at least double in size within the next year.

Getting Data Management in Shape for GDPR

One of the best use cases we’ve had that’s enabled by the Talend solution relates to our efforts to comply with the General Data Protection Regulation (GDPR). The regulation, a set of rules created by the European Parliament, European Council, and European Commission that took effect in May 2018, is designed to bolster data protection and privacy for individuals within the European Union (EU).

We leverage the data lake whenever we need to quickly access customer data that falls under the domain of GDPR. So when a customer asks us for data specific to that customer, we have our team create the files from the data lake.

The entire process is simple, making it much easier to comply with such requests. Without a data lake that provides a single, centralized source of information, we would have to go to individual departments within the company to gather customer information. That’s far more complex and time-consuming.
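A hypothetical sketch of the kind of single-store, per-customer extraction described above (the records and field names are invented examples):

```python
# Sketch: pulling one customer's records from a centralized store to answer
# a GDPR data-access request, instead of querying each department's silo.
# Records and fields are invented for illustration.
records = [
    {"customer_id": "c1", "source": "orders", "detail": "program purchase"},
    {"customer_id": "c2", "source": "email",  "detail": "newsletter opened"},
    {"customer_id": "c1", "source": "stream", "detail": "workout streamed"},
]

def customer_export(customer_id: str) -> list[dict]:
    # One query over one store answers the whole request.
    return [r for r in records if r["customer_id"] == customer_id]

print(customer_export("c1"))
```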

When we built the data lake it was principally for the analytics team. But when different data projects such as this arise we can now leverage the data lake for those purposes, while still benefiting from the analytics use cases.

Looking to the Future

Our next effort, which will likely take place in 2019, will be to consolidate various data stores within the organization with our data lake. Right now different departments have their own data stores, which are siloed. Having this consolidation, which we will achieve using the Talend solutions and the automation these tools provide, will give us an even more convenient way to access data and run business analytics on the data.

We are also planning to leverage the Talend platform to increase data quality. Now that we’re increasing our data sources and getting much more into data analytics and data science, quality becomes an increasingly important consideration. Members of our organization will be able to use the data quality side of the solution in the upcoming months.

Beachbody has always been an innovative company when it comes to gleaning value from our data. But with the Talend technology we can now take data management to the next level. A variety of processes and functions within the company will see use cases and benefits from this, including sales and marketing, customer service, and others.

About the Authors: 

Hari Umapathy

Hari Umapathy is a Lead Data Engineer at Beachbody working on architecting, designing, and developing the company’s data lake using AWS, Talend, Hadoop, and Redshift. Hari is a Cloudera Certified Developer for Apache Hadoop. Previously, he worked at Infosys Limited as a Technical Project Lead managing applications and databases for a major automotive manufacturer in the United States. Hari holds a bachelor’s degree in Information Technology from Vellore Institute of Technology, Vellore, India.


Aarthi Sridharan

Aarthi Sridharan is the Sr. Director of Data (Enterprise Technology) at Beachbody LLC, a health and fitness company in Santa Monica. Aarthi’s leadership drives the organization’s ability to make data-driven decisions for accelerated growth and operational excellence. Aarthi and her team are responsible for ingesting and transforming large volumes of data into the traditional enterprise data warehouse and the data lake, and for building analytics on top of it.


Source: Beachbody Gets Data Management in Shape with Talend Solutions

Do Detractors Really Say Bad Things about a Company?

Can you think of a bad experience you had with a company?

Did you tell a friend about the bad experience?

Negative word of mouth can be devastating for company and product reputation. If companies can track it and do something to fix the problem, the damage can be contained.

This is one of the selling points of the Net Promoter Score. That is, customers who rate companies low on a 0 to 10 scale (6 and below) are dubbed “Detractors” because they’re more likely to spread negative word of mouth and discourage others from buying from a company. Companies with too much negative word of mouth would be unable to grow as much as those with more positive word of mouth.

But is there any evidence that low scorers are really more likely to say bad things?

Is the NPS Scoring Divorced from Reality?

There is some concern that these NPS designations are divorced from reality. That is, there’s no evidence (or reason) for classifying detractors as 0 to 6 and promoters as 9 to 10. If these designations are indeed arbitrary, that’s concerning. (See the tweet from a vocal critic in Figure 1.)


Figure 1: Example of a concern being expressed about the validity of the NPS designations.

To look for evidence behind the designations, I re-read the 2003 HBR article by Fred Reichheld that made the NPS famous. Reichheld mentions that the reason for the promoter classification is customer referral and repurchase rates, but he doesn’t provide much detail (not too surprising given it’s an HBR article) or explain the detractor cutoff there.


Figure 2: Quote from the HBR article “The One Number You Need to Grow,” showing the justification for the designation of detractors, passives, and promoters.

In his 2006 book, The Ultimate Question, Reichheld further explains the justification for the cutoffs between detractors, passives, and promoters. In analyzing several thousand comments, he reported that 80% of the negative word-of-mouth comments came from those who responded 0 to 6 on the likelihood-to-recommend item (p. 30). He also reiterated the claim that 80% of customer referrals came from promoters (9s and 10s).

Contrary to at least one prominent UX voice on social media, there is some evidence and justification for the designations. It’s based on referral and repurchase behaviors and the sharing of negative comments. This might not be enough evidence to convince people (and certainly not dogmatic critics) to use these designations though. It would be good to find corroborating data.

The Challenges with Purchases and Referrals

Corroborating the promoter designation means finding purchases and referrals. It’s not easy associating actual purchases and actual referrals with attitudinal data. You need a way to associate customer survey data with purchases and then track purchases from friends and colleagues. Privacy issues aside, even in the same company, purchase data is often kept in different (and guarded) databases making associations challenging. It was something I dealt with constantly while at Oracle.

What’s more, companies have little incentive to share repurchase rates and survey data with outside firms and third parties may not have access to actual purchase history. Instead, academics and researchers often rely on reported purchases and reported referrals, which may be less accurate than records of actual purchases and actual referrals (a topic for an upcoming article). It’s nonetheless common in the Market Research literature to rely on stated past behavior as a reasonable proxy for actual behavior. We’ll also address purchases and referrals in a future article.

Collecting Word-of-Mouth Comments

But what about the negative comments used to justify the cutoff between detractors and passives? We wanted to replicate Reichheld’s findings that detractors accounted for a substantial portion of negative comments using another dataset to see whether the pattern held.

We looked at open-ended comments we collected from about 500 U.S. customers regarding their most recent experiences with one of nine prominent brands and products. We collected the data ourselves from an online survey in November 2017. It included a mix of airlines, TV providers, and digital experiences. In total, we had 452 comments regarding the most recent experience with the following brands/products:

  • American Airlines
  • Delta Airlines
  • United Airlines
  • Comcast
  • DirecTV
  • Dish Network
  • Facebook
  • iTunes
  • Netflix

Participants in the survey also answered the 11-point Likelihood to Recommend question, as well as a 10-point and 5-point version of the same question.

Coding the Sentiments

The open-ended comments were coded into sentiments by two independent evaluators. Negative comments were coded -1, neutral 0, and positive 1. During the coding process, the evaluators didn’t have access to the raw LTR scores (0 to 10) or other quantitative information.

In general, there was good agreement between the evaluators. The correlation between their sentiment scores was high (r = .83), and they agreed on 82% of the comments. For the remaining 18% where there was disagreement, differences were reconciled and a single sentiment was selected.
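The two agreement measures reported here (percent agreement and the correlation between codes) can be sketched in a few lines. The rater codes below are invented for illustration; they are not from the actual dataset.

```python
# Hypothetical sentiment codes (-1, 0, 1) from two evaluators for eight comments.
rater_a = [1, 0, -1, 1, 0, -1, 1, 0]
rater_b = [1, 0, -1, 1, 1, -1, 1, 0]

# Percent agreement: share of comments where both raters assigned the same code.
agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def pearson(x, y):
    """Pearson correlation between two equal-length lists of codes."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

print(f"agreement = {agreement:.0%}, r = {pearson(rater_a, rater_b):.2f}")
```

With real data you would also reconcile the disagreeing pairs before analysis, as described above.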

Most comments were neutral (43%) or positive (39%), with only 21% of the comments being coded as negative.

Examples of positive comments

“I flew to Hawaii for vacation, the staff was friendly and helpful! I would recommend it to anyone!”—American Airlines Customer

“I love my service with Dish network. I use one of their affordable plans and get many options. I have never had an issue with them, and they are always willing to work with me if something has financially changed.”—Dish Network Customer

Examples of neutral comments

“I logged onto Facebook, checked my notifications, scrolled through my feed, liked a few things, commented on one thing, and looked at some memories.”—Facebook User

“I have a rental property and this is the current TV subscription there. I access the site to manage my account and pay my bill.”—DirecTV User

Examples of negative comments

“I took a flight back from Boston to San Francisco 2 weeks ago on United. It was so terrible. My seat was tiny and the flight attendants were rude. It also took forever to board and deboard.”—United Airlines Customer

“I do not like Comcast because their services consistently have errors and we always have issues with the internet. They also frequently try to raise prices on our bill through random fees that increase over time. And their customer service is unsatisfactory. The only reason we still have Comcast is because it is the best option in our area.”—Comcast Customer

Associating Sentiments to Likelihood to Recommend (Qual to Quant)

We then associated each coded sentiment with the 0 to 10 values on the Likelihood to Recommend item provided by the respondent. Figure 3 shows this relationship.


Figure 3: Percent of positive or negative comments associated with each LTR score from 0 to 10.

For example, 24% of all negative comments were associated with people who gave a 0 on the Likelihood to Recommend scale (the lowest response option). In contrast, 35% of positive comments were associated with people who scored the maximum 10 (most likely to recommend). This is further evidence for the extreme responder effect we’ve discussed in an earlier article.

You can see a pattern: As the score increases from 0 to 10, the percent of negative comments goes down (r = -.71) and the percent of positive comments goes up (r = .87). The relationship between comment sentiment and scores isn’t perfectly linear (otherwise the correlations would be 1). For example, the percent of positive comments is actually higher at responses of 8 than 9, and the percent of negative comments is higher at 5 than 4 (possibly an artifact of this sample size). Nonetheless, the relationship is very strong.

Detractor Threshold Supported

What’s quite interesting from this analysis is that at a score of 6, the ratio of positive to negative comments flips. Respondents with scores above a 6 (7s-10s) are more likely to make positive comments about their most recent experience. Respondents who scored their Likelihood to Recommend at 6 and below are more likely to make negative comments (spread negative word of mouth) about their most recent experience.

At a score of 6, a participant is about 70% more likely to make a negative comment than a positive comment (10% vs. 6%, respectively). As scores go lower, the ratio goes up dramatically. At a score of 5, participants are more than three times as likely to make a negative comment as a positive one. At a score of 0, customers are 42 times more likely to make a negative comment than a positive one (24% vs. 0.6%, respectively).
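The flip in the ratio can be sketched with simple arithmetic. The percentages below are illustrative stand-ins loosely matching the figures quoted in the text, not the raw dataset (which is why the score-0 ratio comes out near 40 rather than exactly 42, an artifact of rounding the inputs).

```python
# Illustrative shares (percent of all positive / negative comments) at
# selected LTR scores -- approximations of the figures quoted above.
pct_positive = {0: 0.6, 5: 3.0, 6: 6.0}
pct_negative = {0: 24.0, 5: 10.0, 6: 10.0}

# Ratio of negative to positive comment probability at each score.
ratios = {score: pct_negative[score] / pct_positive[score] for score in pct_positive}
for score, ratio in sorted(ratios.items()):
    print(f"score {score}: a negative comment is {ratio:.1f}x as likely as a positive one")
```

Any score where this ratio exceeds 1 is one where negative word of mouth is more likely than positive, which is what makes 6 the natural cutoff on the 11-point scale.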

When aggregating the raw scores into promoters, passives, and detractors, we can see that a substantial 90% of negative comments are associated with detractors (0 to 6s). This is shown in Figure 4.

The positive pattern is less pronounced, but still a majority (54%) of positive comments are associated with promoters (9s and 10s). It’s also interesting to see that the passives (7s and 8s) have a much more uniform chance of making a positive, neutral, or negative comment.

This corroborates the data from Reichheld, which showed 80% of negative comments were associated with those who scored 0 to 6. He didn’t report the percent of positive comments with promoters and didn’t associate the responses to each scale point as we did here (you’re welcome).


Figure 4: Percent of positive or negative comments associated with each NPS classification.

If your organization uses a five-point Likelihood to Recommend scale (5 = extremely likely and 1 = not at all likely), there are similar patterns, albeit on a more compressed scale (see Figure 5). At a response of 3, the ratio of positive to negative comments also flips—making responses 3 or below also good designations for detractors. At a score of 3, a customer is almost four times as likely to make a negative comment about their experience than a positive comment.


Figure 5: Percent of positive or negative comments associated with each LTR score from 1 to 5 (for companies that use a 5-point scale).

Summary & Takeaways

An examination of 452 open-ended comments about customers’ most recent experiences with nine prominent brands and products revealed:

  • Detractors accounted for 90% of negative comments. This independent evaluation corroborates the earlier analysis by Reichheld that found detractors accounted for a majority of negative word-of-mouth comments. This smaller dataset actually found a higher percentage of negative comments associated with 0 to 6 responses than Reichheld reported.
  • Six is a good threshold for identifying negative comments. The probability a comment will be negative (negative word of mouth) starts to exceed positive comment probability at 6 (on the 11-point LTR scale) and 3 (on a 5-point scale). Researchers looking at LTR scores alone can use this threshold to provide some idea about the probability of the customer sentiment about their most recent experience.
  • Repurchase and referral rates need to be examined. This analysis didn’t examine the relationship between referrals or repurchases (reported and observed) and likelihood to recommend, a topic for future research to corroborate the promoter designation.
  • Results are for specific brands used. In this analysis, we selected a range of brands and products we expected to represent a good range of NPS scores (from low to high). Future analyses can examine whether the pattern of scores at 6 or below correspond to negative sentiment in different contexts (e.g. for the most recent purchase) or for other brands/products/websites.
  • Think probabilistically. This analysis doesn’t mean a customer who gave a score of 6 or below necessarily had a bad experience or will say bad things about a company. Nor does it mean that a customer who gives a 9 or 10 necessarily had a favorable experience. You should think probabilistically about UX measures in general and NPS too. That is, it’s more likely (higher probability) that as scores go down on the Likelihood to Recommend item, the chance someone will be saying negative things goes up (but doesn’t guarantee it).
  • Examine your relationships between scores and comments. Most companies we work with have a lot of NPS data associated with verbatim comments. Use the method of coding sentiments described here to see how well the detractor designation matches sentiment and, if possible, see how well the promoter designations correspond with repurchase and referral rates or other behavioral measures (and consider sharing your results!).
  • Take a measured approach to making decisions. Many aspects of measurement aren’t intuitive and it’s easy to dismiss what we don’t understand or are skeptical about. Conversely, it’s easy to accept what’s “always been done” or published in high profile journals. Take a measured approach to deciding what’s best (including on how to use the NPS). Don’t blindly accept programs that claim to be revolutionary without examining the evidence. And don’t be quick to toss out the whole system because it has shortcomings or is over-hyped (we’d have to toss out a lot of methods and authors if this were the case). In all cases, look for corroborating evidence…probably something more than what you find on Twitter.


Humana Using Analytics to Improve Health Outcomes


Earlier this year a CDW survey revealed that analytics is a top priority for two-thirds of decision-makers in the health care industry. Nearly 70 percent of respondents said they were planning for or already implementing analytics.

This is no surprise, given the strong results seen by analytics from early adopters like Humana.

The health insurer has made analytics a foundational piece of its clinical operations and consumer engagement efforts. Humana uses predictive models to identify members who would benefit from regular contact with clinical professionals, helping them coordinate care and making needed changes in healthy lifestyle, diet and other areas. This proactive approach results in improved quality of life for members, at a lower cost, said Dr. Vipin Gopal, Enterprise VP, Clinical Analytics.

According to Humana, it identified 1.9 million members at high risk for some aspect of their health through predictive models in 2014. It also used analytics to detect and close 4.3 million instances where recommended care, such as an eye exam for a member with diabetes, had not been given. In those cases, Humana notified members and their physicians so that the gaps in care could be addressed.

“Every touch point with the health care system yields data, whether it’s a physician visit or a visit to a hospital or an outpatient facility,” Gopal said. “We use analytics to understand what can be done to improve health outcomes. Humana has over 15,000 care managers and other professionals who work with members to coordinate care and help them live safely at home, even when faced with medical and functional challenges. All of that work is powered by analytics.”

While health care has lagged other industries in adopting analytics, it accumulates a large volume of data that can be used to generate useful insights, Gopal said, adding, “Health care can hugely benefit from the analytics revolution.”

Until recently, Gopal said, many in health care “did not see analytics as a key component of doing business.” That is rapidly changing, however, largely based on the example of companies like Humana.

Real-time Analytics

Humana also used predictive analytics to help reduce the hospital readmission rate by roughly 40 percent through its Humana at Home programs. After noting that about one in six members enrolled in Humana’s Medicare plans were readmitted within 30 days of a hospital visit, the company built a predictive model to determine which members were most likely to get readmitted. It created a score quantifying the likelihood of readmission for each member; if the score rose above a certain point, a clinician would immediately follow up with the member.

This effort is especially notable, Gopal said, because it incorporates real-time analytics.

“When you are discharged from the hospital, for instance, the score is updated in real time and sent to a nurse,” he said. “If you are trying to prevent a readmission from happening within 30 days, you cannot run a predictive model once a month or even once a week.”
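The event-driven threshold logic described here can be sketched as follows. The feature names, weights, and cutoff are invented for illustration; Humana’s actual models are not public, and a production system would use a model trained on claims and clinical data.

```python
import math

READMISSION_THRESHOLD = 0.6  # hypothetical cutoff for triggering follow-up

def readmission_risk(features):
    # Toy logistic model standing in for a trained predictive model.
    z = (0.8 * features["prior_admissions"]
         + 0.5 * features["chronic_conditions"]
         - 2.0)
    return 1 / (1 + math.exp(-z))

def on_discharge_event(member_id, features, notify):
    # Recompute the score as soon as the discharge event arrives; if it
    # crosses the cutoff, route the member to a nurse for immediate follow-up.
    score = readmission_risk(features)
    if score > READMISSION_THRESHOLD:
        notify(member_id, score)
    return score
```

The design point is that scoring happens per event rather than per batch: running the same model once a month, as the quote notes, would be too late to prevent a 30-day readmission.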

One of Humana’s latest efforts involves using analytics to address the progression of diseases like diabetes, which Gopal said affects about 30 percent of senior citizens. It is classifying its members with diabetes into low, medium and high severity categories. As a person goes from low to high severity, costs of care increase by seven times and quality of life steeply declines. Foot wounds go up 36 times, for example, and the number of foot amputations rises. So Humana is using predictive models to identify members most likely to progress and, hopefully, to slow progression through clinical interventions.

“Really understanding the variables through deep analytics and helping people to not progress, will be huge for our members and for overall public health as well,” Gopal said.

Keys to Analytics Success

Humana has benefited from a relatively mature technology infrastructure, a supportive CEO and an analytics team that Gopal built by design to include a mix of professionals with varied backgrounds — not just data scientists but those with backgrounds in public health, computer science, applied math and engineering.

Of his team, Gopal said, “These are deep problems, and we need the best multidisciplinary talent working on them. It’s not something just public health people can solve, or just computer science people can solve.”

Perhaps the biggest factor in Humana’s success with its analytics program, Gopal said, is using analytics to solve meaningful challenges.

“We do not work on stuff just because it’s cool to do, we work on problems where we can make a direct impact on the business,” he said. “That is how we select projects, and see it through to implementation and right through to results.”

Gopal will discuss Humana’s use of analytics in a presentation at TechFestLou, a conference hosted by the Technology Association of Louisville Kentucky next week in Louisville. A full schedule of events and other information is available on the event website.

Note: This article originally appeared in Enterprise Apps Today.
