Is the Importance of Customer Experience Overinflated?

Companies rely on customer experience management (CEM) programs to provide insight about how to manage customer relationships effectively and grow their business. CEM programs require measurement of two primary types of variables: satisfaction with the customer experience and customer loyalty. These metrics are used to assess the importance of the customer experience in improving customer loyalty. Determining the “importance” of different customer experience attributes needs to be precise because it plays a major role in helping companies: 1) prioritize improvement efforts, 2) estimate the return on investment (ROI) of improvement efforts and 3) allocate company resources.

How We Determine Importance of Customer Experience Attributes

When we label a customer experience attribute as “important,” we typically are referring to the magnitude of the correlation between customer ratings on that attribute (e.g., product quality, account management, customer service) and a measure of customer loyalty (e.g., recommend, renew service contract). Correlations can range from -1.0 to 1.0; in this context, satisfaction-loyalty correlations are essentially always positive. Attributes that have a high correlation with customer loyalty (approaching 1.0) are considered more “important” than attributes that have a low correlation with customer loyalty (approaching 0.0).
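To make the mechanics concrete, here is a minimal sketch (in Python, with made-up ratings and hypothetical column names, not data from any study cited here) of how this kind of attribute “importance” is typically computed: correlate each attribute’s satisfaction ratings with the loyalty rating and rank attributes by the size of the correlation.

```python
# Hypothetical survey data: satisfaction ratings (0-10) for three attributes
# plus a likelihood-to-recommend rating used as the loyalty measure.
import pandas as pd

survey = pd.DataFrame({
    "product_quality":  [8, 9, 6, 7, 10, 5, 9, 8],
    "account_mgmt":     [7, 8, 5, 6,  9, 4, 8, 7],
    "customer_service": [6, 9, 4, 7,  8, 3, 9, 6],
    "recommend":        [9, 10, 5, 7, 10, 4, 9, 8],
})

# Pearson correlation of each attribute with the loyalty question;
# attributes with larger correlations are treated as more "important".
importance = survey.drop(columns="recommend").corrwith(survey["recommend"])
print(importance.sort_values(ascending=False))
```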

Measuring Satisfaction with the Customer Experience and Customer Loyalty Via Surveys

Companies typically, if not almost always, rely on customer surveys to measure both satisfaction with the customer experience (CX) and the level of customer loyalty. That is, customers are given a survey that includes questions about the customer experience and customer loyalty, and they are asked to rate their satisfaction with the customer experience and their level of customer loyalty (typically likelihood ratings).

As mentioned earlier, to identify the importance of customer experience attributes on customer loyalty, ratings of CX metrics and customer loyalty are correlated with each other.

The Problem of a Single Method of Measurement: Common Method Variance

The magnitude of the correlation between measures of satisfaction (with the customer experience) and measures of customer loyalty is made up of different components. On one hand, the correlation reflects the “true” relationship between satisfaction with the experience and customer loyalty.

On the other hand, because the two variables are measured using the same method (a survey with self-reported ratings), the magnitude of the correlation is partly due to how the data are collected. This phenomenon is referred to as Common Method Variance (CMV) and has been studied extensively in the social sciences (see Campbell and Fiske, 1959), where surveys are a common method of data collection. The general finding is that the correlation between two different measures is driven partly by the true relationship between the constructs being measured and partly by the way they are measured.

CMV likely affects customer experience management whenever the same method of collecting data (e.g., survey questions) is used for both predictors (e.g., satisfaction with the customer experience) and outcomes (e.g., customer loyalty). That is, the size of the correlation between satisfaction and loyalty metrics is partly inflated simply because both variables are measured with the same survey instrument.
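The inflation is easy to demonstrate with a small simulation. The sketch below is illustrative only; the coefficients are arbitrary assumptions rather than estimates from real CEM data. It generates two latent constructs with a modest true correlation and then adds a shared “method” factor to both survey ratings, so the observed correlation between the ratings comes out noticeably larger than the construct-level correlation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

true_sat = rng.normal(size=n)                        # latent satisfaction
true_loyalty = 0.4 * true_sat + rng.normal(size=n)   # latent loyalty (true r ~ .37)
method = rng.normal(size=n)                          # shared survey/method factor

# Both survey ratings pick up the same method factor plus measurement noise.
rated_sat     = true_sat     + 0.8 * method + 0.5 * rng.normal(size=n)
rated_loyalty = true_loyalty + 0.8 * method + 0.5 * rng.normal(size=n)

print("true construct correlation:  %.2f" % np.corrcoef(true_sat, true_loyalty)[0, 1])
print("observed survey correlation: %.2f" % np.corrcoef(rated_sat, rated_loyalty)[0, 1])
```

On a typical run, the construct-level correlation is roughly .37 while the observed survey correlation is roughly .53; the difference is entirely an artifact of the shared method.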

Customer Loyalty Measures: Real Behaviors v. Expected Behaviors

The CMV problem is not really about how we measure satisfaction with the customer experience; a survey is a good way to measure the feelings and perceptions behind the customers’ experience. The problem lies in how we measure customer loyalty. Customer loyalty is about actual customer behavior. It is real customer behavior (e.g., number of recommendations, number of products purchased, whether a customer renewed their service contract) that drives company profits. Yet popular self-report measures ask for customers’ estimates of their likelihood of engaging in certain behaviors in the future (e.g., likely to recommend, likely to purchase, likely to renew).

Using self-report measures of satisfaction and loyalty, researchers have found high correlations between these two variables. For example, Bruce Temkin has found correlations between satisfaction with the customer experience and NPS of around .70. Similarly, in my own research, I have found comparably sized correlations (r ≈ .50) when looking at the impact of the customer experience on advocacy loyalty (the recommend question is part of my advocacy metric). Are these correlations a good reflection of the importance of the customer experience in predicting loyalty (as measured by the recommend question)? Before I answer that question, let us first look at work (Sharma, Yetton and Crawford, 2009) that helps us classify different types of customer measurement and their impact on correlations.

Different Ways to Measure Customer Loyalty

Sharma et al. highlight four different types of measurement methods. I have slightly modified their four types to order customer loyalty measures from those least susceptible to CMV (coded as 1) to those most susceptible to CMV (coded as 4):

  1. System-captured metrics reflect objective metrics of customer loyalty: Data are obtained from historical records and other objective sources, including purchase records (captured in a CRM system). Example: Computer generated records of “time spent on the Web site” or “number of products/services purchased” or “whether a customer renewed their service contract.”
  2. Behavioral-continuous items reflect specific loyalty behaviors that respondents have carried out: Responses are typically captured on a continuous scale. Example item: How many friends did you tell about company XYZ in the past 12 months? (response options ranging from none to, say, 10).
  3. Behaviorally-anchored items reflect specific actions that respondents have carried out: Responses are typically captured on scales with behavioral anchors. Example item: How often have you shopped at store XYZ in the past month? Not at all to Very Often.
  4. Perceptually-anchored items reflect perceptions of loyalty behavior: Responses are typically on Likert scales, semantic differential or “agree/disagree scale”. Example: I shop at the store regularly. Agree to Disagree.

These researchers looked at 75 different studies examining the correlation between perceived usefulness (predictor) and usage of IT (criterion). While all studies used perceptually-anchored measures of perceived usefulness (a perception/attitude), different studies used one of the four different types of measures of usage (a behavior). The observed correlation depended heavily on how usage was measured: r = .59 for perceptually-anchored items, r = .42 for behaviorally-anchored items, r = .29 for behavioral-continuous items, and r = .16 for system-captured metrics. That is, the method with which researchers measure “usage” impacts the outcome of the results; as the usage measures become less susceptible to CMV (moving up the scale from 4 to 1 above), the magnitude of the correlation between perceived usefulness and usage decreases.

Looking at research in the CEM space, we commonly see customer loyalty measured with perceptually-anchored questions (type 4 above), the type of measure most susceptible to CMV.

Table 1. Descriptive statistics and correlations of two types of recommend loyalty metrics (behavioral-continuous and perceptually-anchored) with customer experience ratings.

An Example

I have some survey data on the wireless service industry that examined the impact of customer satisfaction with customer touch points (e.g., product, coverage/reliability and customer service) on customer loyalty. This study included measures of satisfaction with the customer experience (perceptually-anchored) and two different measures of customer loyalty:

  1. Self-reported number of people you recommended the company to in the past 12 months (behavioral-continuous).
  2. Self-reported likelihood to recommend (perceptually-anchored).

The correlations among these measures are located in Table 1.

As you can see, the two recommend loyalty metrics are only moderately related to each other (r = .47), suggesting that they do not measure the same construct. Additionally, and as expected under the CMV model, the behavioral-continuous measure of customer loyalty (number of friends/colleagues told) shows a substantially lower correlation (average r = .28) with customer experience ratings than does the perceptually-anchored measure of customer loyalty (likelihood to recommend, average r = .52). These findings are strikingly similar to those of Sharma et al. (2009) above.

Summary and Implications

The way in which we measure the customer experience and customer loyalty impacts the correlations we see between them. When both variables are measured with perceptually-anchored questions on the same survey, the correlation between the two is likely overinflated. I contend that the true impact of customer experience on customer loyalty can only be determined when real customer loyalty behaviors are used in the statistical modeling process.

We may be overestimating the importance (that is, the impact) of customer experience on customer loyalty simply because we measure both variables (experience and loyalty) using the same instrument: a survey with similar scale characteristics. Companies commonly use the correlation (or squared correlation) between a given attribute and customer loyalty as the basis for estimating the return on investment (ROI) of improving the customer experience. Using overinflated correlations will likely result in an overestimate of the ROI of customer experience improvement efforts. As such, companies need to temper these estimates when perceptually-anchored customer loyalty metrics are used.
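As a back-of-envelope illustration, using the average correlations from the wireless example above and treating the squared correlation as “variance explained,” the perceptually-anchored loyalty measure makes the customer experience look more than three times as consequential as the behavioral measure does:

```python
r_perceptual = 0.52   # likelihood to recommend (perceptually-anchored)
r_behavioral = 0.28   # number of recommendations (behavioral-continuous)

print(f"perceptually-anchored: {r_perceptual**2:.0%} of loyalty variance explained")  # ~27%
print(f"behavioral-continuous: {r_behavioral**2:.0%} of loyalty variance explained")  # ~8%
print(f"apparent overestimation factor: {r_perceptual**2 / r_behavioral**2:.1f}x")    # ~3.4x
```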

I have argued elsewhere that we need to use more objective metrics of customer loyalty whenever they are available. Using Big Data principles, companies can link real loyalty behaviors with customer satisfaction ratings. Using a customer-centric approach to linkage analysis, our company, TCELab, helps companies integrate customer feedback data with their CRM data, where real customer loyalty data are housed (see CEM Linkage for a deeper discussion).

While measuring customer loyalty using real, objective metrics (system-captured) would be ideal, many companies do not have the resources to collect and link customer loyalty behaviors to customer ratings of their experience. Perhaps loyalty measures that are less susceptible to CMV could be developed and used to get a more realistic assessment of the importance of the customer experience on customer loyalty.  For example, self-reported metrics that are more easily verifiable by the company (e.g., “likelihood to renew service contract” is more easily verifiable by the company than “likelihood to recommend”) might encourage customers to provide realistic ratings about their expected behaviors, thus reflecting a truer measure of customer loyalty. At TCELab, our customer survey, the Customer Relationship Diagnostic (CRD), includes verifiable types of loyalty questions (e.g., likely to renew contract, likely to purchase additional/different products, likely to upgrade).

The impact of Common Method Variance (CMV) in CEM research is likely strong in studies in which the data for customer satisfaction (the predictor) and customer loyalty (the criterion) are collected using surveys with similar item characteristics (perceptually-anchored). CEM professionals need to keep the problem of CMV in mind when interpreting customer survey results (any survey results, really) and estimating the impact of customer experience on customer loyalty and financial performance.

What kind of loyalty metrics do you use in your organization? How do you measure them?

Originally Posted at: Is the Importance of Customer Experience Overinflated?

Will China use big data as a tool of the state?

Since imperial times Chinese governments have yearned for a perfect surveillance state. Will big data now deliver it? On 5 July 2009, residents of Xinjiang, China’s far western province, found the internet wasn’t working. It’s a regular frustration in remote areas, but it rapidly became apparent that this time it wasn’t coming back. The government had hit the kill switch on the entire province when a protest in the capital Ürümqi by young Uighur men (of the area’s indigenous Turkic population) turned into a riot against the Han Chinese, in which at least 197 people were killed.

The shutdown was intended to prevent similar uprisings by the Uighur, long subjected to religious and cultural repression, and to halt revenge attacks by Han. In that respect, it might have worked; officially, there was no fatal retaliation, but in retrospect the move came to be seen as an error.

Speaking anonymously, a Chinese security advisor described the blackout as ‘a serious mistake… now we are years behind where we could have been in tracking terrorists’. Young Uighur learnt to see the internet as hostile territory – a lesson reinforced by the arrest of Ilham Tohti, a popular professor of economics, on trumped-up charges of extremism linked to an Uighur-language website he administered. ‘We turn off our phones before we talk politics’, a tech-savvy Uighur acquaintance remarked.

The Uighur continued to consume digital media, but increasingly in off-line form, whether viewing discs full of Turkish TV series or jihadist propaganda passed on memory sticks. Where once Chinese media reports claimed that arrested Uighur had been visiting ‘separatist’ websites, now they noted drawers full of burnt DVDs and flash drives.

A series of brutal terrorist attacks early in 2014 reinforced the lesson for the Chinese authorities; by driving Uighur off-line they had thrown away valuable data. Last summer, the Public Security University in Beijing began recruiting overseas experts in data analysis, including, I’m told, former members of the Israeli security forces.

In Xinjiang, tightened control means less information, and the Chinese government has always had a fraught relationship with information – private and public. Today, an explosion in available data promises to open up sources of knowledge previously tightly locked away. To some, this seems a shift toward democracy. But technocrats within the government also see it as a way to create a more efficient form of authoritarianism.

In functioning democratic societies, information is gathered from numerous independent sources: universities, newspapers, non-government organisations (NGOs), pollsters. But for the Chinese Communist Party, the idea of independent, uncontrolled media remains anathema. Media is told to ‘direct public opinion’, not reflect it.

In the 2000s, a nascent civil society was gradually forming, much of it digital. Social media, public forums, online whistle-blowing, and investigative journalism offered ways to expose corrupt officials and to force the state to follow its own laws. But in the past three years all such efforts have been crushed with fresh ruthlessness. Lawyers, journalists and activists who were once public-opinion leaders have been jailed, exiled, banned from social media, silenced through private threats, or publicly humiliated by being forced to ‘confess’ on national television. Ideological alternatives to the Party, such as house churches, have seen heightened persecution. And the life was choked from services such as Weibo, the Chinese Twitter-alike, with thousands of accounts banned and posts deleted.

Yet this has left Beijing with the same problem it has always faced. From the beginning, central government has tried simultaneously to gather information for itself and to keep it out of the hands of the public.

A vast array of locals, from seismological surveyors to secret police services, gathered data for the government, but in its progress through the hierarchies of party and state the data inevitably became distorted for political and personal ends. Now some people in government see technology as a solution: a world in which data can be gathered directly, from bottom to top, circumventing the distortions of hierarchy and the threat of oversight alike. But for others, even letting go to the minimal degree necessary to gather such data presents a threat to their own power.

Ren (a pseudonym), a native Beijinger in his late 20s, spent his college years in the West fervently defending China online. But, he now says: ‘I realised that I didn’t know what was going on, and there are so many problems everywhere.’ Back in China, and working for the government, he sees monitoring social media as the best way for the government to keep abreast of and respond to public opinion, allowing a ‘responsible’ authoritarianism. Corrupt officials can be identified, local problems brought to the attention of higher levels, and the public’s voice heard. At the same time, data-analysis techniques can be used to identify worrying groupings in certain areas, and predict possible ‘mass group incidents’ (riots and protests) before they occur.

‘Now that we’ve taken out the “big Vs”,’ Ren told me, ‘we shouldn’t worry about ordinary people speaking.’ ‘Big Vs’ are famous and much-followed ‘Verified’ users on Weibo and other social media services; celebrities, but also ‘public intellectuals’ who have been systematically eliminated over the past three years. With potential rallying points such as opinion leaders or alternative ideologies crushed, the government can view the seething mass of public grievances as a potential source of information, not a direct challenge.

The central government is acutely aware of how little it knows about the country it rules, as fragmented local authorities contend to bend data to their own ends. For instance, the evaluation of officials’ performance by their superiors is, formally, deeply dependent on statistical measurements, predominantly local gross domestic product (GDP) growth. (Informally, it depends also on family connections and outright bribery.) As a result, officials go to great lengths to juke the stats. As Li Keqiang, now the Chinese premier, told a US official in 2007 when he was Party Secretary of Liaoning Province, in a conversation later released by WikiLeaks, GDP figures are ‘man-made’ and ‘for reference only’.

Li said he relied, as many analysts do, on proxy data that’s far harder to fake. To measure growth in his own province, for instance, he looked at electricity, volume of rail cargo and disbursed loans. But he also relied on ‘official and unofficial channels’ to find information on the area he ran, including ‘friends who are not from Liaoning to gather information [I] cannot obtain myself.’

Li’s dilemma would have been familiar to any past emperor. China’s rulers have always struggled to get good data, especially in the countryside. The imperial Chinese state did its best to make its vast and diverse population legible. Household registration systems, dating back to China’s ancient precursor kingdoms, tried to monitor subjects from birth to death. Government officials trudged their way to isolated hamlets across mountains, jungles and deserts. But at the same time, local leaders reported pleasant fictions to the capital to cover their backs.

The People’s Republic inherited these problems, but added to them an obsession with statistics, acquired from the Soviets. Communism was ‘scientific’, and so the evidence had to be manufactured to support it. Newspapers in the 1950s included paragraphs of figures about increased production and national dedication. Chinese reporters still cram unnecessary (and often fictitious) statistics into stories. (‘The new factory has an area of 2,794 square meters.’) ‘According to statistics’ is one of the most overused phrases in mainland writing.

All of this has made China a society in which real information is guarded with unusual jealousy, even within the government. For decades, even the most innocuous data was treated like a state secret. Even the phone numbers of government departments were given out only to a privileged few.

The government in Beijing is well aware that information it receives from below is mangled or invented on the way up. The National Bureau of Statistics (NBS), which chiefly manages industrial data, frequently demands more direct reporting; it has repeatedly called for businesses to send their information directly to the NBS, and has begun naming and shaming firms that don’t do so, as well as local authorities that it catches fixing numbers. In September 2013, for instance, it reported on its website that a Yunnanese county had inflated its industrial growth four-fold. But the NBS is largely ‘helpless’, a junior official at a more powerful body smugly told me, lacking the internal clout to enforce its demands.

One corrective has been sudden descents by higher authorities for ‘inspection tours’. But these are usually anticipated and controlled by local officials who have long since mastered the Potemkin arts. Another longstanding solution was the petitioning system, first institutionalised in the seventh century AD. It let individuals circumvent local officials and present their plea for justice directly to higher authorities, or even directly to the capital. The system is still in place, handling millions of requests a year. But it has never worked, with petitioners more likely to be branded as troublemakers, beaten up, or imprisoned than to have their information reach anyone of note. Partly the problem is that one of the metrics used to measure officials is the number of petitioners their district produces, the theory being that good governance produces fewer complaints, and so corruption has been incentivised.

The constant interference of middlemen is why some in central government are so excited by the possibility of gathering data directly. Take the contentious issue of population. Incentives to distort information cut two ways; under the one‑child policy, rural families often try to avoid reporting births at all, but rural authorities have a strong incentive to over-report their population, since they receive size-linked benefits from the centre. Urban areas, meanwhile, have a strong incentive to under-report population figures, as they’re supposed to be limiting the speed of urbanisation to controllable levels. Beijing’s official population is 21.5 million, but public transport figures suggest the real figure might be 30-35 million.

In theory, China’s surveillance state already generates massive amounts of personal data that could provide government with valuable information. The ID card, now radio-frequency-based, is central to Chinese citizens’ lives, required from banks to hospitals. A centralised database lets ordinary people check ID numbers against names online and confirm identities. But individual transactions with the ID card often go unrecorded unless the Public Security Bureau (PSB) – essentially, the local police station – has already taken an interest in you. And so the files of perceived dissidents are thick, but the records of ordinary life are thin. Even if central agencies go looking for the information, it is distorted en route from municipal, then provincial PSBs.

Despite the vast amounts of data produced within the government, Chinese scientists and officials often find themselves turning to the same sources as Western ones. They’ve seized on projects from abroad that demonstrate how analysis could potentially map population mobility through mobile-phone usage. The mass of consumer data produced by the online shopping services run by the $255 billion Alibaba group is another huge bonanza. Now smartphones produce more of the information that the government needs than secret policemen do.

In and of itself, China’s search for data is morally neutral. As the US political scientist James Scott points out in Seeing Like a State (1998), population data can equally be used for universal vaccinations or genocidal round-ups.

If big data is used by China’s central government to identify corrupt officials, pinpoint potential epidemics and ease traffic, that can only be laudable. Better data would also help NGOs seeking to aid a huge and complex population, and firms looking to invest in China’s future. The flow of data could circumvent vested interests and open up the country’s potential. For Professor Shi Yong, deputy director of the Research Center on Fictitious Economy and Data Science in Beijing, this is a moral issue, not just a question of governance. ‘The data comes from the people,’ he said firmly, ‘so it should be shared by the people.’

Most people in China don’t want to protest against government. They want to know where the good schools are, how clean the air is, and what mortality rates are at local hospitals. Shi returned to China after spending two decades at universities in the US because he was excited by the possibilities of China’s growing information society.

‘Let’s say I want to move to a small city here,’ he told me. ‘I want to know school districts, rent, health: we don’t have this information easily available. Instead, people use personal contacts to get it.’ Shi says that there’s huge resistance to the idea of open data, from within the government and even more from businesses. ‘They might want to protect the way they run their business, they may want to hide something.’ One of his current projects is working with the People’s Bank of China (PBOC) on establishing a nationwide personal credit‑rating system.

‘Actually,’ Shi told me, ‘they have two databases: one for personal information and one for companies’ information, and they wanted us to work on both. But I said no, we would only work on the first. This data is very beautiful! Better than the American data, because all the other banks must send the information directly to the PBOC, the central bank, every day.’ The company data, in contrast, was bad enough to be unworkable. ‘You know garbage in, garbage out? With data analysis, small garbage in, big garbage out.’

Shi highlighted the ways in which the internet had already opened up the provinces for the central government. ‘Look at the PX protests,’ he said, pointing to the local outrage in August 2011 in Dalian and elsewhere against factories producing the chemical paraxylene (PX). ‘Two decades ago, that would have gone nowhere. But this time, the higher authorities took notice of it.’

Small injections of information have already had a palpable effect in China. Air pollution comes in two main forms: relatively large particles called PM 10, and relatively small ones called PM 2.5. For years, Chinese cities published only PM 10 figures, and further skewed statistics by picking selectively from less polluted areas. But after independent monitors, including the US Embassy in Beijing, began putting their own PM 2.5 figures online hourly, which spread rapidly through social media, public pressure eventually forced a shift in official policy.

The crucial issue is who gets to see and to use the data. If it’s limited to officials, however pure-minded their intentions, all it will do is reinforce the reach of the state. China’s strong data protection and privacy laws function primarily not to protect citizens from state intrusion, but to shield officials and businessmen from public scrutiny. Resistance to opening up officials’ property registration details is extremely fierce.

Even if opened up, this information means nothing without tools to find it.  In China, much of that searching is filtered through the web services and search engine Baidu, which is based in Beijing and commands three-quarters of search revenue on the mainland. Like much Chinese online innovation, Baidu profited from the government’s fears of foreign firms, which created a walled garden in which domestic products thrived. After Google announced it would cease censoring searches in China in 2010, the US giant was effectively blocked on the mainland, its share of searches falling from 36 per cent in 2009 to 1.6 per cent in 2013. But Baidu had to fight off internal competitors too, including ChinaSo, the search engine created last year by the merger of the People’s Daily newspaper and the Xinhua news agency, both state-run.

Baidu recently announced that it would launch a big-data engine to allow the public to search and analyse its available data. The firm already works with the Ministry of Transport, using data drawn from the search results on its map service to predict travel trends and help manage traffic. In a project ‘inspired by’ Google Flu Trends, it’s also working with health authorities to predict epidemic outbreaks.

Baidu is widely criticised for co‑operating with the authorities over censorship, and for its dependence on paid advertising which puts the highest-paying companies at the top of search results. That’s why, as Ren explained: ‘If you search for public opinion (minyi), you get two pages of car results’ – minyi uses the same characters as an automobile brand.

Yet the firm also puts up some quiet, informal resistance to government intrusion. It maintains less personal information on users than Google does, for instance, partly because it has fewer integrated services, but it also wipes its own records of search histories far more frequently than Western firms. Insiders say that in meetings with the authorities, Baidu plays an active role in speaking up for greater online freedoms.

That might be why Baidu isn’t popular among many in government. ‘The Party’s publicity department invited Zhou Xiaoping [a young, ultra-nationalist blogger] to speak recently,’ Ren told me. ‘Much of the speech was a rant against Baidu, how they were “rightists” [pro-US, civil rights, and free markets]. Do you know, he said, that if you search “police brutality” on Baidu, you get results about China? Why are the results not about the US, he asked. He got rounds and rounds of applause.’

Whatever measures some firms take, intrusion by the state is hard to resist. A draconian new draft for a national security law, likely to be introduced this year, specifies that the state has full access to any data it demands – already the case in practice – and that any foreign firm working in China must keep all their Chinese data inside the country. It also envisages extensive camera networks and the use of facial-recognition software on a vast scale.

Shi described to me how personal banking and credit information ‘is being used as part of the anti-corruption campaign to identify the networks of corrupt officials’, who in China often hide their graft – whether it’s property or cash – by putting it in the names of friends or family. Using data analysis, Shi suggested, the Party’s investigators could root out such previously opaque networks.

Identifying and targeting friends and family, however, is also a technique that the Chinese state has traditionally used against dissidents and whistleblowers. In earlier times, ideological deviance could cause a man’s entire family to be persecuted, or even executed. Even today, the threat of children forced out of school or spouses fired from jobs is part of the toolset deployed against ‘troublemakers’. In Xinjiang, meanwhile, the network-analysis techniques the authorities use to identify terrorists are also deployed against peaceful independence activists, academic dissidents such as Tohti (whose students were marched out to testify against him), and Islamic teachers.

When I asked Shi about the increasing discussion in the West over government surveillance, he suggested that it would come in time in China. ‘We’re not at that stage yet,’ he said. ‘Right now, we’re just setting up the basic infrastructure. In time, we’ll have the kinds of legal protections that developed countries do.’

That might happen. But I’ve been hearing from well-meaning people ever since I came to China more than a decade ago that the rule of law is right around the corner. The corner’s still there. But now it has a CCTV camera on it.

James Palmer is a British writer and editor who works closely with Chinese journalists. His latest book is The Death of Mao (2012). He lives in Beijing

Originally posted via “Will China use big data as a tool of the state?”

Source: Will China use big data as a tool of the state?

20 Best Practices in Customer Feedback Programs: Building a Customer-Centric Company

Customer feedback programs (sometimes referred to as Voice of the Customer programs or customer loyalty programs) are widely used by many companies. These programs are designed to help companies understand their customers’ attitudes and experiences to ensure they are delivering a great customer experience. The ultimate goal of a customer feedback program is to maximize customer loyalty and, consequently, improve business performance (Hayes, 2010).

Chief Customer Officers and the like look to industry professionals for help and guidance to implement or improve their customer feedback programs. These industry professionals, in turn, offer a list of best practices for implementing/running customer feedback programs (I’m guessing there are as many of these best practice lists as there are industry professionals). I wanted to create a list of best practices that was driven by empirical evidence. Does adoption of best practices actually lead to more effective programs? How do we define “effective”? Are some best practices more critical than others? I addressed these questions through a systematic study of customer feedback programs and what makes them work. I surveyed customer feedback professionals across a wide range of companies (including Microsoft, Oracle, Akamai) about their customer feedback program. Using these data, I was able to understand why some programs are good (high loyalty) and others are not (low loyalty).  If you are a customer feedback professional, you can take the best practices survey here: http://businessoverbroadway.com/resources/self-assessment-survey to understand how your company stacks up against best practices standards in customer feedback programs.

I will present the major findings of the study here, but for the interested reader, the full study can be found in my book, Beyond the Ultimate Question. While no best practices list can promise results (my list is no different), research shows that the following 20 best practices will greatly improve your chances of achieving improved customer relationship management and increasing customer loyalty.

Components of Customer Feedback Programs

Before I talk about how to best structure a customer feedback program, let us take a 30,000-ft view of an enterprise-wide customer feedback program. A customer feedback program involves more than simply surveying customers. To be useful, a customer feedback program must successfully manage many moving parts of the program, each impacting the effectiveness of the overall program. The elements of customer feedback programs can be grouped into six major areas or components. These components are: Strategy, Governance, Business Process Integration, Method, Reporting, and Applied Research. Figure 1 below represents the components of customer feedback programs.

Figure 1. Elements of a Customer Feedback Program

Strategy involves the executive-level actions that set the overarching guidelines around the company’s mission and vision regarding the company objectives. Governance deals with the organization’s policies surrounding the customer feedback program. Business Process Integration deals with the extent to which the customer feedback program is integrated into daily business processes. Method deals with the way in which customer feedback data are collected. Reporting deals with the way in which customer feedback data are summarized and disseminated throughout the company. Finally, Applied Research focuses on the extent to which companies gain additional operational and business insight through systematic research using their customer feedback data.

Best Practices Study and General Findings

While many companies have a formal customer feedback program, only some of them experience improvements in customer loyalty while the other companies find that their customer loyalty remains flat. To understand why this difference occurs, I conducted a study to understand how loyalty leading companies, compared to loyalty lagging companies, structure their customer feedback programs (see Hayes (2009) for details of the study methodology).

A total of 277 customer feedback professionals from midsize to large companies completed a survey about their company’s customer feedback program. The respondents indicated whether their company adopts 28 specific business practices related to their customer feedback program (e.g., a senior executive is the champion of the customer feedback program; Web-based surveys are used to collect customer feedback). Additionally, respondents were asked to estimate their company’s customer loyalty ranking within their industry; this question was used to segment companies into loyalty leaders (companies with a loyalty ranking of 70% or higher) and loyalty laggards (companies with a loyalty ranking below 70%).
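For readers who want to run this kind of segmentation on their own survey data, here is a minimal sketch with toy data and hypothetical column names (not the study’s actual dataset): respondents are split into loyalty leaders and laggards at the 70% ranking threshold, and adoption rates for each practice are compared across the two segments.

```python
import pandas as pd

# Toy data: each row is one company; practices are coded 1 = adopted, 0 = not adopted.
df = pd.DataFrame({
    "loyalty_ranking":    [85, 40, 90, 65, 75, 30],   # self-reported ranking within industry (%)
    "executive_champion": [1, 0, 1, 0, 1, 0],
    "web_surveys_used":   [1, 1, 1, 1, 1, 0],
    "feedback_in_crm":    [1, 0, 1, 0, 0, 0],
})

df["segment"] = df["loyalty_ranking"].ge(70).map({True: "leader", False: "laggard"})

practices = ["executive_champion", "web_surveys_used", "feedback_in_crm"]
adoption = df.groupby("segment")[practices].mean()   # adoption rate per practice, by segment
print(adoption.T)
```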

Table 1. Adoption Rates of Customer Feedback Program Practices of Loyalty Leaders and Loyalty Laggards

The survey results revealed real differences between loyalty leaders and loyalty laggards in their customer feedback programs (See Table 1). There were statistically significant differences in adoption rates between loyalty leaders and loyalty laggards across many of the business practices. Loyalty leading companies were more likely to adopt specific practices compared to their loyalty lagging counterparts, especially in areas related to strategy/governance, integration and applied research. In upcoming posts, I will explore each component of the customer feedback program and present best practices for each.

Take the Customer Feedback Programs Best Practices Survey

If you are a customer feedback professional, you can take the best practices survey to receive free feedback on your company’s customer feedback program. This self-assessment survey assesses the extent to which your company adopts best practices throughout their program. Go here to take the free survey: http://businessoverbroadway.com/resources/self-assessment-survey.

Source: 20 Best Practices in Customer Feedback Programs: Building a Customer-Centric Company by bobehayes

Improving Employee Empowerment Begins with Measurement

I read an article last week on employee empowerment by Annette Franz. She reflected on the merits of employee empowerment and also provided excellent examples of how employers can improve the customer experience by empowering their employees; she cited examples from the likes of Ritz-Carlton, Hyatt and Diamond Resorts, to name a few. Before employers institute ways to improve employee empowerment, however, they need to understand the level of empowerment their employees currently experience. How do employers know if their employees feel empowered? An effective way is simply to ask them.

Employee Empowerment Questionnaire (EEQ)

Twenty years ago (yes, 20 years ago), I developed an Employee Empowerment Questionnaire (EEQ) that includes 8 questions you can use in your employee survey to measure employee empowerment. The EEQ was designed to measure the degree to which employees believe they have the authority to act on their own to increase quality (my definition of employee empowerment). Employees are asked to indicate “the extent to which you agree or disagree with each of the following statements” on a 1 (strongly disagree) to 5 (strongly agree) scale:

  1. I am allowed to do almost anything to do a high-quality job.
  2. I have the authority to correct problems when they occur.
  3. I am allowed to be creative when I deal with problems at work.
  4. I do not have to go through a lot of red tape to change things.
  5. I have a lot of control over how I do my job.
  6. I do not need to get management’s approval before I handle problems.
  7. I am encouraged to handle job-related problems by myself.
  8. I can make changes on my job whenever I want.

The EEQ score is calculated by averaging the ratings across all eight questions. EEQ scores can range from 1 (no empowerment) to 5 (high empowerment). Studies using the EEQ show that it has high reliability (Cronbach’s alpha = .85 and .94 in two independent samples) and is related to important organizational variables; using the EEQ, I found that employees who feel empowered at work, compared to their counterparts, report higher job satisfaction and lower intentions to quit.
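Here is a short sketch of the scoring and reliability calculation described above, using made-up ratings (each row is one employee, columns are the eight EEQ items on the 1-5 scale); the Cronbach’s alpha printed here comes from the toy data and will not match the published values.

```python
import numpy as np

ratings = np.array([
    [4, 5, 4, 3, 4, 3, 4, 3],
    [2, 2, 3, 1, 2, 2, 3, 1],
    [5, 5, 4, 4, 5, 4, 5, 4],
    [3, 4, 3, 2, 3, 3, 3, 2],
])

eeq_scores = ratings.mean(axis=1)   # one EEQ score per employee, ranging from 1 to 5

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of the total score)
k = ratings.shape[1]
item_variances = ratings.var(axis=0, ddof=1).sum()
total_variance = ratings.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_variances / total_variance)

print("EEQ scores:", eeq_scores)
print("Cronbach's alpha: %.2f" % alpha)
```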

Using the EEQ for Diagnostic and Prescriptive Purposes

Employers can use the EEQ for diagnostic as well as prescriptive purposes. Comparing different employee groups, employers can identify if there is a general “empowerment problem” in their organization or if it is isolated to specific areas/roles. This simple segmentation exercise can help employers know where they need to pinpoint improvement efforts. For example, in a study of employees working for a federal government agency, I found that employees in supervisory roles reported higher empowerment (Mean EEQ = 3.71) compared to non-supervisors (Mean EEQ = 3.04). For this agency, improvement efforts around empowerment might experience the greatest ROI when focused on employees in non-supervisory roles.

In addition to acting as a diagnostic tool, results of the EEQ can prescribe ways to improve employee empowerment. While these eight questions, taken as a whole, measure one underlying construct, each question’s content shows employers how they can empower employees:

  1. Minimize red tape around change management.
  2. Allow employees to make mistakes in the name of satisfying customers.
  3. Reward employees who solve problems without the permission of management.
  4. Give employees rules of engagement but let them be creative when dealing with unique customer problems.

Summary

Employee empowerment remains an important topic of discussion in the world of customer experience management; employee empowerment is predictive of important organizational outcomes like employee job satisfaction and employee loyalty, outcomes that are associated with a better customer experience and increased customer loyalty. The Employee Empowerment Questionnaire (EEQ) allows companies to diagnose their empowerment problem and can help prescribe remedies to improve employee empowerment (e.g., minimizing bureaucratic red tape, allowing for mistakes, rewarding creative problem-solving).

As part of an annual employee survey, the EEQ can provide executives the insights they need to improve employee satisfaction and loyalty and, consequently, customer satisfaction and loyalty. To read more about the development of the Employee Empowerment Questionnaire, click here to download the free article.

Originally Posted at: Improving Employee Empowerment Begins with Measurement by bobehayes

Three Big Data Trends Analysts Can Use in 2016 and Beyond

One of the byproducts of technology’s continued expansion is a high volume of data generated by the web, mobile devices, cloud computing and the Internet of Things (IoT). Converting this “big data” into usable information has created its own side industry, one that businesses can use to drive strategy and better understand customer behavior.

The big data industry requires analysts to stay up to date with the machinery, tools and concepts associated with big data, and how each can be used to grow the field. Let’s explore three trends currently shaping the future of the big data industry:

Big Data Analytics Degrees

Mostly due to a lack of know-how, businesses aren’t tapping into the full potential of big data. In fact, most companies analyze only about 12 percent of the emails, text messages, social media, documents or other data-collecting channels available to them (Forrester). Many universities now offer big data analytics degree programs that directly address this skills gap. The programs are designed to develop analytical talent and to teach the skill sets – such as programming language proficiency, quantitative analysis tool expertise and statistical knowledge – needed to interpret big data. Analysts predict the demand for industry education will only grow, making it essential for universities to adopt analytics-based degree programs.

Predicting Consumer Behaviors

Big data allows businesses to access and extract key insights about their consumer’s behavior. Predictive analytics challenges businesses to take data interpretation a step further by not only looking for patterns and trends, but using them to predict future purchasing habits or actions. In essence, predictive analytics, which is a branch of big data and data mining, allows businesses to make more data-based predictions, optimize processes for better business outcomes and anticipate potential risk.

Another benefit of predictive analytics is the impact it will have on industries such as health informatics. Health informatics uses electronic health record (EHR) systems to solve problems in healthcare such as effectively tracking a patient’s medical history. By documenting records in electronic format, doctors can easily track and assess a patient’s medical history from any certified access port. This allows doctors to make assumptions about a patient’s health using predictive analytics based on documented results.

Cognitive Machine Improvements

A key trend evolving in 2016 is cognitive improvement in machinery. As humans, we crave relationship and identify with brands, ideas and concepts that are relatable and easy to use. We expect technology will adapt to this need by “humanizing” the way machines retain memories and interpret and process information.

Cognitive improvement aims to solve computing errors while still predicting and improving outcomes as humans would. It also looks to correct human mistakes, such as medical errors or miscalculated analytics reports. A great example of cognitive improvement is IBM’s Watson supercomputer, widely regarded as the leading cognitive machine for answering complex questions posed in natural language.

The rise of big data mirrors the rise of tech. In 2016, we will start to see trends in big data education, as well as a shift in data prediction patterns and error solutions. The future is bright for business and analytic intelligence, and it all starts with big data.

Dr. Athanasios Gentimis

Dr. Athanasios (Thanos) Gentimis is an Assistant Professor of Math and Analytics at Florida Polytechnic University. Dr. Gentimis received a Ph.D. in Theoretical Mathematics from the University of Florida, and is knowledgeable in several computer programming/technical languages that include C++, FORTRAN, Python and MATLAB.

Source: Three Big Data Trends Analysts Can Use in 2016 and Beyond by agentimis

Is Service Quality More Important than Product Quality?

This past weekend, Apple released the iPad 3. My daughter and I visited the Apple store in downtown San Francisco to take a peek at their new device. Of course, the store was packed full of Apple fans, each trying out the new iPad. This particular in-store experience got me thinking about the role of product vs. customer service/tech support in driving customer loyalty to a brand or company.

Product vs. Tech Support

Figure 1. Descriptive Statistics and Correlations among Variables

There has been much talk about how companies need to focus on customer service/tech support to help differentiate themselves from their competitors. While I believe that customer service/tech support is important in improving the customer relationship to increase customer loyalty, this focus on customer service has distracted attention from the importance of the product.

I will illustrate my point using some data on PC manufacturers I collected a few years ago. I have three variables for this analysis:

  1. Advocacy Loyalty
  2. PC Quality
  3. Tech Support Quality

Advocacy Loyalty was measured using four items (overall satisfaction, recommend, buy again, and choose again for the first time) on a 0 to 10 scale. PC Quality and Tech Support Quality were each measured on a 1 (Strongly Disagree) to 5 (Strongly Agree) scale. PC Quality was the average of three questions (PC meets expectations, PC is reliable, PC has the features I want). Tech Support Quality was the average of six questions (e.g., tech support is timely, knowledgeable, courteous, understands my needs, always there when needed).

Figure 2. Path Diagram of Study Variables

Product is More Important than Technical Support in Driving Advocacy Loyalty

The descriptive statistics and correlations among these variables are located in Figure 1. A path diagram of these variables is presented in Figure 2. As you can see, when comparing the impact of each of the customer touch points on advocacy loyalty, PC quality has the largest impact (.68) while Tech Support has the smallest impact (.21) on advocacy loyalty.

As the results show, advocacy loyalty is more highly correlated with PC quality (.79) than with technical support quality (.59). Even when examining the partial correlations among these variables (controlling for the effect of the third variable), PC quality is much more closely linked to advocacy loyalty (.68) than is technical support quality (.22).
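For readers who want to reproduce this kind of analysis with their own data, here is a sketch of how partial correlations (each touch point’s correlation with advocacy, controlling for the other touch point) can be computed. The data below are simulated with arbitrary coefficients, so the printed values will not match the figures reported above.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z from each."""
    x_resid = x - np.polyval(np.polyfit(z, x, 1), z)
    y_resid = y - np.polyval(np.polyfit(z, y, 1), z)
    return np.corrcoef(x_resid, y_resid)[0, 1]

rng = np.random.default_rng(1)
n = 500
pc_quality   = rng.normal(size=n)
tech_support = 0.5 * pc_quality + rng.normal(size=n)          # touch points correlate with each other
advocacy     = 0.8 * pc_quality + 0.2 * tech_support + rng.normal(scale=0.5, size=n)

print("r(advocacy, pc_quality):   %.2f" % np.corrcoef(advocacy, pc_quality)[0, 1])
print("r(advocacy, tech_support): %.2f" % np.corrcoef(advocacy, tech_support)[0, 1])
print("partial r for pc_quality, controlling for tech_support:   %.2f"
      % partial_corr(advocacy, pc_quality, tech_support))
print("partial r for tech_support, controlling for pc_quality:   %.2f"
      % partial_corr(advocacy, tech_support, pc_quality))
```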

Summary

People don’t frequent a store primarily because of the service. They flock to a company because of the primary product the company provides. Can we please temper all this talk about how important service quality is relative to the product?  Ask anybody at that Apple store why they were there and they would tell you it was because of the products, not the service.

Originally Posted at: Is Service Quality More Important than Product Quality?

SAS enlarges its palette for big data analysis

SAS offers new tools for training, as well as for banking and network security.

SAS Institute did big data decades before big data was the buzz, and now the company is expanding on the ways large-scale computerized analysis can help organizations.

As part of its annual SAS Global Forum, being held in Dallas this week, the company has released new software customized for banking and cybersecurity, for training more people to understand SAS analytics, and for helping non-data scientists do predictive analysis with visual tools.

Founded in 1976, SAS was one of the first companies to offer analytics software for businesses. A private company that generated US$3 billion in revenue in 2014, SAS has devoted considerable research and development funds to enhance its core Statistical Analysis System (SAS) platform over the years. The new releases are the latest fruits of these labors.

With the aim of getting more people trained in the SAS ways, the company has posted its training software, SAS University Edition, on the Amazon Web Services Marketplace. Using AWS eliminates the work of setting up the software on a personal computer, and first-time users of AWS can use the 12-month free tier program, to train on the software at no cost.

SAS launched the University Edition a year ago, and it has since been downloaded over 245,000 times, according to the company.

With the release, SAS is taking aim at one of the chief problems organizations face today when it comes to data analysis, that of finding qualified talent. By 2018, the U.S. alone will face a shortage of anywhere from 140,000 to 190,000 people with analytical expertise, The McKinsey Global Institute consultancy has estimated.

Predictive analytics is becoming necessary even in fields where it hasn’t been heavily used heretofore. One example is information technology security. Security managers for large organizations are growing increasingly frustrated at learning of breaches only after they happen. SAS is betting that applying predictive and behavioral analytics to operational IT data, such as server logs, can help identify and deter break-ins and other malicious activity, as they unfold.

Last week, SAS announced that it’s building a new software package, called SAS Cybersecurity, which will process large amounts of real-time data from network operations. The software, which will be generally available by the end of the year, will build a model of routine activity, which it can then use to identify and flag suspicious behavior.

SAS is also customizing its software for the banking industry. A new package, called SAS Model Risk Management, provides a detailed model of how a bank operates so that the bank can better understand its financial risks, as well as convey these risks to regulators.

SAS also plans to broaden its user base by making its software more appealing beyond computer statisticians and data scientists. To this end, the company has paired its data exploration software, called SAS Visual Analytics, with its software for developing predictive models, called SAS Visual Statistics. The pairing can allow non-data scientists, such as line of business analysts and risk managers, to predict future trends based on current data.

The combined products can also be tied in with SAS In-Memory Analytics, software designed to allow large amounts of data to be held entirely in the server’s memory, speeding analysis. It can also work with data on Hadoop clusters, relational database systems or SAS servers.

QVC, the TV and online retailer, has already paired the two products. At its Italian operations, QVC streamlined its supply chain operations by allowing its sales staff to spot buying trends more easily, and spend less time building reports, according to SAS.

The combined package of SAS Visual Analytics and SAS Visual Statistics will be available in May.

Originally posted via “SAS enlarges its palette for big data analysis”

Source: SAS enlarges its palette for big data analysis

Can Hadoop be Apple easy?

Hadoop is now on the minds of executives who care deeply about the power of their rapidly accumulating data. It has already inspired a broad range of big data experiments, established a beachhead as a production system in the enterprise and garnered tremendous optimism for expanded use.

However, it is also starting to create tremendous frustration. A recent analyst report showed less enthusiasm for Hadoop pilots this year than last. Many companies are getting lost on their way to big data glory. Instead, they find themselves in a confusing place of complexity and befuddlement. What’s going on?

While there are heady predictions that by 2020, 75 percent of the Fortune 2000 will be running a 1,000-node Hadoop cluster, there is also evidence that Hadoop is not being adopted as easily as one would think. In 2013, six years after the birth of Hadoop, Gartner said that only 10 percent of the organizations it surveyed were using Hadoop. According to the most recent Gartner survey, less than 50 percent of 284 respondents have invested in Hadoop technology or even plan to do so.

The current attempts to transform Hadoop into a full-blown enterprise product only accomplish the basics and leave the most challenging activities, the operations part, to the users, who, for good reason, wonder what to do next. Now we get to the problem. Hadoop is still complex to run at scale and in production.

Once you get Hadoop running, the real work is just beginning. In order to provide value to the business you need to maintain a cluster that is always up and high performance while being transparent to the end-user. You must make sure the jobs don’t get in each other’s way. You need to support different types of jobs that compete for resources. You have to monitor and troubleshoot the work as it flows through the system. This means doing all sorts of work that is managed, controlled, and monitored by experts. These tasks include diagnosing problems with users’ jobs, handling resource contention between users, resolving problems with jobs that block each other, etc.
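As one concrete (and deliberately simplified) example of the monitoring work described above, a sketch like the following could poll the YARN ResourceManager’s REST API and flag applications that have been running unusually long while holding cluster resources. The host, port, and thresholds are assumptions, and a real operations setup would cover far more than this.

```python
import requests

RM_URL = "http://resourcemanager.example.com:8088"   # hypothetical ResourceManager address
MAX_HOURS = 6                                        # arbitrary alert threshold

resp = requests.get(f"{RM_URL}/ws/v1/cluster/apps", timeout=10)
apps = (resp.json().get("apps") or {}).get("app") or []

for app in apps:
    if app.get("state") != "RUNNING":
        continue
    hours = app["elapsedTime"] / 3_600_000           # elapsedTime is reported in milliseconds
    if hours > MAX_HOURS:
        print(f"{app['id']} ({app['user']} / queue {app['queue']}): "
              f"{hours:.1f}h elapsed, {app['allocatedVCores']} vcores, {app['allocatedMB']} MB")
```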

How can companies get past the painful stage and start achieving the cost and big data benefits that Hadoop promises? When we look at the advanced practitioners, those companies that have ample data and ample resources to pursue the benefits of Hadoop, we find evidence that the current ways of using Hadoop still require significant end-customer involvement and hands-on support in order to be successful.

For example, Netflix created the Genie project to streamline the use of Amazon Elastic MapReduce by its data scientists, whom Netflix wanted to insulate from the complexity of creating and managing clusters. The Genie project fills the gaps between what Amazon offers and what Netflix actually needs to run diverse workloads in an efficient manner. After a user describes the nature of a desired workload by using metadata, Genie matches the workload with clusters that are best suited to run it, thereby granting the user’s wish.
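
As a rough illustration of the idea (not Netflix’s actual Genie API), the matching step can be thought of as tag-based routing: a job declares the capabilities it needs, and a broker picks a registered cluster whose tags cover them. Here is a minimal Python sketch with entirely hypothetical cluster names, tags, and load figures:

    # Toy metadata-to-cluster matcher; all names and tags are invented.
    from dataclasses import dataclass, field

    @dataclass
    class Cluster:
        name: str
        tags: set = field(default_factory=set)   # capabilities, e.g. {"spark", "adhoc"}
        load: float = 0.0                        # fraction of capacity currently in use

    CLUSTERS = [
        Cluster("etl-prod", {"hive", "mapreduce", "batch"}, load=0.7),
        Cluster("adhoc-01", {"spark", "hive", "adhoc"}, load=0.3),
        Cluster("ml-train", {"spark", "gpu", "batch"}, load=0.5),
    ]

    def match_cluster(job_tags):
        """Return the least-loaded cluster whose tags cover the job's requirements."""
        candidates = [c for c in CLUSTERS if job_tags <= c.tags]
        if not candidates:
            raise ValueError(f"No cluster satisfies tags {job_tags}")
        return min(candidates, key=lambda c: c.load)

    print(match_cluster({"spark", "adhoc"}).name)   # prints "adhoc-01"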

Once Hadoop finds its “genie,” the problem of turning it into a useful tool that can be run at scale and in production is solved. Hadoop adoption and the move into production are going slowly because these hard problems are being figured out over and over again, stalling progress. With this gap filled, users can do just what they want to do, and learn things about data, without wasting time learning about Hadoop.

To read the original article on Venture Beat, click here.

Source: Can Hadoop be Apple easy?

Deriving “Inherently Intelligent” Information from Artificial Intelligence

The emergence of big data and scalable data lakes has made it easy for organizations to focus on amassing enormous quantities of data, almost to the exclusion of the analytic insight that renders big data an asset.

According to Paxata co-founder and Chief Product Officer Nenshad Bardoliwalla, “People are collecting this data but they have no idea what’s actually in the data lake, so they can’t take advantage of it.”

Instead of focusing on data and its collection, enterprises should focus on information and the insight it yields, which is the natural outcome of intelligent analytics. Data preparation sits at the nexus between ingesting data and obtaining valuable information from it, and it is the critical step that has traditionally kept data in the back rooms of IT, away from the business users who need it.

Self-service data preparation tools, however, enable business users to carry out most aspects of preparation themselves, including integration, data quality measures, data governance adherence, and transformation. Incorporating myriad facets of artificial intelligence, including machine learning, natural language processing, and semantic ontologies, both automates and expedites these processes, delivering their vaunted capabilities to the business users who have the most to gain from them.

“How do I get information that allows me to very rapidly do analysis or get insight without having to make this a PhD thesis for every person in the company?” asked Paxata co-founder and CEO Prakash Nanduri. “That’s actually the challenge that’s facing our industry these days.”

Preparing Analytics with Artificial Intelligence
Contemporary artificial intelligence, and its accessibility to the enterprise today, is the answer to Nanduri’s question and the key to intelligent information. Transitioning from initial data ingestion to analytic insight in business-viable time frames requires leveraging the aforementioned artificial intelligence capabilities in smart data preparation platforms. These tools effectively render obsolete the manual data preparation that otherwise threatens to consume the time of data scientists and IT departments. “We cannot do any analysis until we have complete, clean, contextual, and consumable data,” Nanduri maintained, enumerating (at a high level) the responsibility of data preparation platforms. Artificial intelligence meets those requirements with smart systems that learn from data-derived precedents, user input, natural language, and evolving semantic models “to do all the heavy lifting for the human beings,” Nanduri said.
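
As a simple illustration of the “complete and clean” portion of that checklist, the sort of profiling a preparation tool automates can be sketched in a few lines of pandas. The file name, column names, and checks below are hypothetical, not Paxata’s implementation:

    # Sketch: basic completeness and validity profiling before any analysis.
    import pandas as pd

    df = pd.read_csv("customers.csv")          # hypothetical input file

    # Completeness: share of non-missing values per column.
    completeness = 1.0 - df.isna().mean()

    # Cleanliness: counts of obviously invalid values (hypothetical rules).
    issues = {
        "negative_revenue": int((df["annual_revenue"] < 0).sum()),
        "blank_country": int((df["country"].astype(str).str.strip() == "").sum()),
    }

    print(completeness.round(3))
    print(issues)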

Utilizing Natural Language
Artificial intelligence algorithms are at the core of modern data preparation platforms such as Paxata that have largely replaced manual preparation. “There are a series of algorithmic techniques that can automate the process of turning data into information,” Bardoliwalla explained. Those algorithms exploit natural language processing in three key ways that offer enhanced user experiences for self-service:

  • User experience is directly improved with search capabilities via natural language processing that hasten aspects of data discovery.
  • The aforementioned algorithms are invaluable for joining relevant data sets to one another for integration purposes, while suggesting to end users the best way to do so.
  • NLP is also used to standardize terms that are entered in different ways yet have the same meaning across different systems, reinforcing data quality.

“I always like to say that I just want the system to do it for me,” Bardoliwalla remarked. “Just look at my data, tell me where all the variations are, then recommend the right answer.” Human involvement in this process is vital, particularly with these types of machine learning algorithms, which provide recommendations that users choose from and which then become the basis for future actions.
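
A minimal sketch of that “recommend, then let the user choose” flow, using simple fuzzy matching from Python’s standard library as a stand-in for Paxata’s proprietary NLP (the canonical vocabulary and raw values are made up):

    # Sketch: suggest canonical spellings for messy values; a human accepts or
    # rejects each suggestion, and accepted choices become precedents.
    import difflib

    CANONICAL = ["United States", "United Kingdom", "Germany", "France"]

    def recommend_standard(value, cutoff=0.6):
        """Return up to three canonical candidates, best match first."""
        return difflib.get_close_matches(value, CANONICAL, n=3, cutoff=cutoff)

    for raw in ["Unted States", "germany", "U.K."]:
        print(f"{raw!r} -> suggestions: {recommend_standard(raw)}")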

At Scale
Perhaps one of the most discernible advantages of smart data management platforms is their ability to utilize artificial intelligence technologies at scale. Scale is one of the critical prerequisites for making such options enterprise grade, and it encompasses affirmative responses to the questions Nanduri asked of these tools: can they “handle security, can you handle lineage, can you handle mixed workloads, can you deal with full automation, do you allow for both interactive workloads and batch jobs, and have a full audit trail?” The key to meeting these facets of enterprise-grade data preparation at scale is a distributed computing environment that relies on in-memory techniques to handle the most exacting demands of big data; the algorithms that power such platforms are optimized for that setting. “It’s not enough to run these algorithms on my desktop,” Bardoliwalla commented. “You have to be able to run this on a billion rows, and standardize a billion rows, or join a billion row data set with a 500 billion row data set.”

Shifting the ETL Paradigm with “Point and Click” Transformation
Such scalability is virtually useless without the swiftness to meet the real-time and near-real-time needs of modern business. “With a series of technologies we built on top of Apache Spark including a compiler and optimizer, including columnar caching, including our own transformations that are coded as RDDs which is the core Spark abstraction, we have built a very intelligent distributed computing layer… that allows us to interact with sub-second response time on very large volumes of data,” Bardoliwalla mentioned. The most cogent example of this intersection of scale and speed is transforming data, which is typically an arduous, time-consuming process under traditional ETL methods. Whereas ETL in relational environments requires exhaustively modeling every possible question of the data in advance, and significant re-calibration time for each additional requirement or question, incorporating a semantic model across all data sources removes such concerns. “Instead of presupposing what semantics are, and premodeling the transformations necessary to get uniform semantics, in a Paxata model we build from the ground up and infer our way into a standardized model,” Bardoliwalla revealed. “We are able to allow the data to emerge into information based on the precedents and the algorithmic recommendations people are doing.”
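
For readers who want to picture the mechanics, a minimal PySpark sketch of that pattern follows: lightly standardize a join key, then join a very large fact table to a large reference table with the result cached in memory. The paths and column names are hypothetical, and this is plain Spark DataFrame code, not Paxata’s own layer:

    # Sketch: standardize-then-join at scale with Spark, keeping results cached.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("prep-sketch").getOrCreate()

    orders = spark.read.parquet("s3://example-bucket/orders/")        # very large fact table
    customers = spark.read.parquet("s3://example-bucket/customers/")  # large reference table

    # Light standardization of the join key before integrating the two sources.
    orders = orders.withColumn("customer_id", F.lower(F.trim(F.col("customer_id"))))
    customers = customers.withColumn("customer_id", F.lower(F.trim(F.col("customer_id"))))

    joined = orders.join(customers, on="customer_id", how="left").cache()
    joined.groupBy("country").count().show()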

Overcoming the Dark
The significance of self-service data preparation is not easy to summarize. It involves, yet transcends, its alignment with the overall tendency within the data management landscape to empower the business and facilitate timely control over its data at scale. It is predicated on, yet supersedes, the placement of the foremost technologies in the data space—artificial intelligence and all of its particulars—in the hands of those same people. Similarly, it is about more than the comprehensive nature of these solutions and their ability to reinforce parts of data quality, data governance, transformation, and data integration.

Quintessentially, it symbolizes a much needed victory over dark data, and helps to bring to light information assets that might otherwise remain untapped through a process vital to analytics itself.

“The entire ETL paradigm is broken,” Bardoliwalla said. “The reason we have dark data in the enterprise is because the vast majority of people cannot use the tools that are already available to them to turn data into information. They have to rely on the elite few who have these capabilities, but don’t have the business context.”

In light of this situation, smart data preparation is not only ameliorating ETL, but also fulfilling a longstanding industry need to truly democratize data and their management.

Source: Deriving “Inherently Intelligent” Information from Artificial Intelligence by jelaniharper

Is big data dating the key to long-lasting romance?

If you want to know if a prospective date is relationship material, just ask them three questions, says Christian Rudder, one of the founders of US internet dating site OKCupid.

  • “Do you like horror movies?”
  • “Have you ever travelled around another country alone?”
  • “Wouldn’t it be fun to chuck it all and go live on a sailboat?”

Why? Because these are the questions first date couples agree on most often, he says.

Mr Rudder discovered this by analysing large amounts of data on OKCupid members who ended up in relationships.

Dating agencies like OKCupid, Match.com – which acquired OKCupid in 2011 for $50m (£30m) – eHarmony and many others, amass this data by making users answer questions about themselves when they sign up.

Some agencies ask as many as 400 questions, and the answers are fed into large data repositories. Match.com estimates that it has more than 70 terabytes (70,000 gigabytes) of data about its customers.

Applying big data analytics to these treasure troves of information is helping the agencies provide better matches for their customers. And more satisfied customers mean bigger profits.

US internet dating revenues top $2bn (£1.2bn) annually, according to research company IBISWorld. Just under one in 10 American adults has tried it.

If Cleopatra had used big data analytics, perhaps she wouldn’t have made the ultimately fatal decision to hook up with Mark Antony

The market for dating using mobile apps is particularly strong and is predicted to grow from about $1bn in 2011 to $2.3bn by 2016, according to Juniper Research.

Porky pies

There is, however, a problem: people lie.

Keen to present themselves in what they believe to be a better light, customers do not always provide completely accurate information about themselves: men are most commonly economical with the truth about age, height and income, while for women it’s age, weight and build.

Mr Rudder adds that many users also supply other inaccurate information about themselves unintentionally.

“My intuition is that most of what users enter is true, but people do misunderstand themselves,” he says.

For example, a user may honestly believe that they listen mostly to classical music, but analysis of their iTunes listening history or their Spotify playlists might provide a far more accurate picture of their listening habits.

Can big data analytics really engineer the perfect match?

Inaccurate data is a problem because it can lead to unsuitable matches, so some dating agencies are exploring ways to supplement user-provided data with that gathered from other sources.

With users’ permission, dating services could access vast amounts of data from sources including their browser and search histories, film-viewing habits from services such as Netflix and Lovefilm, and purchase histories from online shops like Amazon.

But the problem with this approach is that there is a limit to how much data is really useful, Mr Rudder believes.

“We’ve found that the answers to some questions provide useful information, but if you just collect more data you don’t get high returns on it,” he says.

Social engineering

This hasn’t stopped Hinge, a Washington DC-based dating company, gathering information about its customers from their Facebook pages.

The data is likely to be accurate because other Facebook users police it, Justin McLeod, the company’s founder, believes.

Dating site Hinge uses Facebook data to supplement members’ online dating profiles

“You can’t lie about where you were educated because one of your friends is likely to say, ‘You never went to that school’,” he points out.

It also infers information about people by looking at their friends, Mr McLeod says.

“There is definitely useful information contained in the fact that you are a friend of someone.”

Hinge suggests matches with people known to their Facebook friends.

“If you show a preference for people who work in finance, or you tend to like Bob’s friends but not Ann’s, we use that when we curate possible matches,” he explains.

The pool of potential matches can be considerable, because Hinge users have an average of 700 Facebook friends, Mr McLeod adds.

‘Collaborative filtering’

But it turns out that algorithms can produce good matches without asking users for any data about themselves at all.

For example, Dr Kang Zhao, an assistant professor at the University of Iowa and an expert in business analytics and social network analysis, has created a match-making system based on a technique known as collaborative filtering.

Dr Zhao’s system looks at users’ behaviour as they browse a dating site for prospective partners, and at the responses they receive from people they contact.

“If you are a boy we identify people who like the same girls as you – which indicates similar taste – and people who get the same response from these girls as you do – which indicates similar attractiveness,” he explains.

Do opposites attract or does it come down to whether you share friends and musical taste?

Dr Zhao’s algorithm can then suggest potential partners in the same way websites like Amazon or Netflix recommend products or movies, based on the behaviour of other customers who have bought the same products, or enjoyed the same films.
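
A toy sketch of the “similar taste” half of that idea (not Dr Zhao’s actual system, which also models the responses users receive): users who liked the same profiles as you are treated as having similar taste, and the profiles they liked are suggested to you. The like matrix below is made up:

    # Sketch: user-user collaborative filtering over a tiny, invented like matrix.
    import numpy as np

    # Rows = users, columns = profiles; 1 means the user expressed interest.
    likes = np.array([
        [1, 1, 0, 0, 1],   # user 0
        [1, 1, 0, 1, 0],   # user 1 (similar taste to user 0)
        [0, 0, 1, 1, 0],   # user 2
    ])

    def recommend(user, k=1):
        # Cosine similarity between this user and every other user.
        norms = np.linalg.norm(likes, axis=1)
        sims = likes @ likes[user] / (norms * norms[user] + 1e-9)
        sims[user] = -1.0                          # exclude the user themselves
        neighbours = np.argsort(sims)[::-1][:k]
        # Score profiles by neighbours' likes, ignoring ones already liked.
        scores = likes[neighbours].sum(axis=0) * (1 - likes[user])
        return np.argsort(scores)[::-1]

    print(recommend(0))   # profile 3 comes out on top for user 0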

Internet dating may be big business, but no-one has yet devised the perfect matching system. It may well be that the secret of true love is simply not susceptible to big data or any other type of analysis.

“Two people may have exactly the same iTunes history,” OKCupid’s Christian Rudder concludes, “but if one doesn’t like the other’s clothes or the way they look then there simply won’t be any future in that relationship.”

Originally posted via “Is big data dating the key to long-lasting romance?”

Source: Is big data dating the key to long-lasting romance?