May 30, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)


[  COVER OF THE WEEK ]

[image: Data interpretation]

[ AnalyticsWeek BYTES]

>> Succession Planning by the Number [Infographics] by v1shal

>> How to be Data-Driven when Data Economics are Broken by analyticsweekpick

>> Preparing the leaders for #DataDriven #Future @ErikaAndersen by v1shal

Wanna write? Click Here

[ FEATURED COURSE]

Python for Beginners with Examples


A practical Python course for beginners with examples and exercises…. more

[ FEATURED READ]

Machine Learning With Random Forests And Decision Trees: A Visual Guide For Beginners


If you are looking for a book to help you understand how the machine learning algorithms “Random Forest” and “Decision Trees” work behind the scenes, then this is a good book for you. Those two algorithms are commonly u… more

[ TIPS & TRICKS OF THE WEEK]

Data Have Meaning
We live in a Big Data world in which everything is quantified. While the emphasis of Big Data has been on distinguishing the three characteristics of data (the infamous three Vs), we need to be cognizant of the fact that data have meaning. That is, the numbers in your data represent something of interest, an outcome that is important to your business. Whether those numbers faithfully reflect that outcome is a question of the veracity of your data.

[ DATA SCIENCE Q&A]

Q: Explain what a false positive and a false negative are. Why is it important to distinguish them from each other? Provide examples of situations where false positives are more important than false negatives, where false negatives are more important than false positives, and where the two types of errors are equally important.
A: * False positive
Reporting the presence of a condition when it is not actually present. Example: an HIV-positive test result when the patient is actually HIV negative.

* False negative
Reporting the absence of a condition when it is actually present. Example: failing to detect a disease when the patient actually has it.

When false positives are more important than false negatives:
– In a non-contagious disease, where treatment delay doesn’t have any long-term consequences but the treatment itself is grueling
– HIV test: psychological impact

When false negatives are more important than false positives:
– If early treatment is important for good outcomes
– In quality control: a defective item slips through the cracks!
– Software testing: a test to catch a virus has failed
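
A minimal Python sketch (with hypothetical labels, not drawn from any real test) showing how false positives and false negatives are counted by comparing predicted outcomes against actual ones:

# Illustrative sketch: counting false positives and false negatives
# from actual vs. predicted labels (hypothetical data).
actual    = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = condition present (e.g., disease)
predicted = [1, 0, 0, 1, 1, 0, 1, 0]  # 1 = test reports the condition

false_positives = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
false_negatives = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

print(f"False positives (reported present, actually absent): {false_positives}")
print(f"False negatives (reported absent, actually present): {false_negatives}")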

Source

[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Eloy Sasot, News Corp


Subscribe on YouTube

[ QUOTE OF THE WEEK]

Big Data is not the new oil. – Jer Thorp

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Eloy Sasot, News Corp


Subscribe 

iTunes | Google Play

[ FACT OF THE WEEK]

Market research firm IDC has released a new forecast that shows the big data market is expected to grow from $3.2 billion in 2010 to $16.9 billion in 2015.

Sourced from: Analytics.CLUB #WEB Newsletter

Designing Mobile Analytics: 4 Guidelines and 1 Unexpected Pitfall

Application users are increasingly mobile: Nurses access patient information on a tablet. Manufacturing workers review machine downtime data on the floor. Sales managers look at their latest forecasts while on the road.

If your application has embedded dashboards and reports, they need to work on every mobile device regardless of screen size. But effective mobile analytics design means more than resizing a couple of charts. Designing responsive mobile analytics requires a whole new set of skills, both in dashboard design and in scaling for a variety of screen sizes.

>> Related: 4 Misconceptions About Mobile Business Intelligence <<

Let’s look at the best practices for designing mobile BI (business intelligence)—along with one of the biggest pitfalls.

Best Practices for Mobile Analytics Design

#1: Show only relevant information

If you include too much content or present content with an overly complex design, your end users may not bother using your analytics at all. Less is more for mobile BI design. Consider the information your end users truly need. If you’re designing a dashboard for a sales team, for example, don’t include visualizations on marketing campaign channels. Placing sales and marketing metrics together will only confuse users with information they don’t really need.

#2: Hide some content

Every piece of information doesn’t need to appear on your mobile dashboard all at once. Don’t be afraid to let users drill down into some data points. Utilize icons, pop-up windows, sliding trays, and other expandable areas to show longer blocks of text as users dive deeper. Selectively showing content also has the added bonus of speeding up load times—and as any developer knows, poor performance is a killer for application engagement.

#3: Embrace white space

Determining where to put every element of your embedded analytics is an important undertaking that can make or break the success of your mobile dashboard. Negative space (or white space) is a crucial element of dashboard design of any type. It increases readability and breaks up blocks of information. Adding padding between objects also makes the application easier to use on the smaller screens of mobile devices.

#4: Use iconography

Content isn’t limited to data and charts. To support a great user experience, especially on a mobile device, application teams are using icons in navigation panes and reports. Icons are typically small graphic images, sometimes accompanied by a one- or two-word description. They help users easily navigate analytics, understand exactly what they’re looking at, and quickly discern what action to take.

#1 Pitfall of Mobile BI Design: Starting with your desktop experience

When it’s time to design your responsive analytics, your first inclination may be to start with the desktop dashboards you already have. But you’re actually better off starting from scratch and designing for the smallest screen first. In most cases, this will be a smartphone.

Starting on a small screen forces you to prioritize content. You have to choose only the most important elements, rather than paring down content from a robust desktop application with seemingly unlimited screen space. This method also allows you to add more content and features as the screen gets larger, rather than removing elements as it gets smaller.

Looking for more mobile analytics best practices? Read the ebook on Designing Responsive Dashboards >

 

Source: Designing Mobile Analytics: 4 Guidelines and 1 Unexpected Pitfall

August 28, 2017 Health and Biotech analytics news roundup

Genome sequencing method can detect clinically relevant mutations using 5 CTCs: Researchers showed that a technique that can sequence very long stretches of the genome can accurately quantify mutations using only 5 ‘circulating tumor cells’ (although they used 34 in this study).

Artificial intelligence predicts dementia before onset of symptoms: Using only one scan of the brain per patient, McGill scientists were able to accurately predict Alzheimer’s 2 years before its onset.

Using machine learning to improve patient care: Two papers from MIT made strides in the field, one that used ICU data to predict necessary treatments and another that trained models of mortality and length of stay based on electronic health record data.

How CROs Are Helping With Healthcare’s Data Problem: Clinical trial costs are a major cause of rising health care costs. To help streamline this, pharmaceutical companies are increasingly using ‘contract research organizations’ to conduct trials, as they can use their expertise and specialized business intelligence tools to cut costs.

I was worried about artificial intelligence—until it saved my life: Krista Jones had a rare form of cancer that could only be treated correctly thanks to machine learning technology.

Genomic Medicine Has Entered the Building: Some types of genome sequences now cost as much as an MRI, which has allowed organizations to undertake large-scale studies in personalized medicine.

Source: August 28, 2017 Health and Biotech analytics news roundup

How Does Did Recommend Differ from Likely to Recommend?

Have you recommended a product to a friend?

Will you recommend that same product to a friend again?

Which of these questions is a better indicator of whether you will actually recommend a product?

If people were consistent, rational, and honest, the answer would be simple. The second question asks about future intent, so it would be the logical choice.

It may come as no surprise that people aren’t necessarily good predictors of their own future behavior.

How the question is phrased, along with the activity involved and how far into the future it lies, each have an impact on how predictive the question is. It may therefore be risky to rely heavily on people’s predictions of whether they WILL recommend.

We’ll look at how well stated intentions predict future actions in a future article. In this article, we’ll examine how asking people whether they recommended something (recalling the past) differs from asking whether they will recommend (predicting future behavior). Are you really asking something different? If so, how do the questions differ, and what problems arise from using only reports of past behavior to predict future behavior?

Predicting Is Hard, Especially About the Future

In theory, people should be more accurate at recalling whether they did something than at predicting whether they will do something. With this in mind, Tomer Sharon recommends something he calls the aNPS (actual NPS): whether people said they recommended over a recent period of time. Similarly, Jared Spool argues that past recommendations are a better predictor of someone’s loyalty than asking what people will do in the future. He uses an example from a Netflix survey to illustrate the point. Neither provides data to support the idea that these are superior measures of future intent to recommend. (Finding data is hard.)

There is success in predicting what will happen based on what has happened, especially over short intervals and when people and things remain the same.

But even when ideal conditions exist, and even for behavior that is habitual, past behavior is far from a perfect predictor of what will happen. For example, when predicting whether people will exercise in the future, past exercise behavior was found to be the best predictor of future exercise. But the correlation, while high, was not perfect (r = .69).

In a meta-analysis of 16 studies [pdf], Ouellette and Wood (1998) also found that past behavior correlated with future behavior, but the relationship was more modest (r = .39). Interestingly, though, they found that behavioral intentions (how likely you say you are to do something) were better predictors of future behavior (r = .54). They also found a more complex interaction between past behavior and future intent. In stable contexts, past behavior is a better predictor of future behavior. But in unstable contexts, and when the behavior isn’t performed frequently, intention is a better predictor of future acts.
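
For readers who want to see how such a correlation is computed, here is a minimal sketch with hypothetical paired observations; the r values cited above come from the published studies, not from this code:

# Illustrative sketch: estimating the correlation between past and future
# behavior from paired observations (hypothetical data).
import numpy as np

past_behavior   = np.array([5, 2, 8, 1, 6, 3, 7, 4])  # e.g., workouts last month
future_behavior = np.array([6, 1, 7, 2, 5, 2, 8, 3])  # workouts the following month

r = np.corrcoef(past_behavior, future_behavior)[0, 1]
print(f"r = {r:.2f}")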

Recommending products or brands to friends and colleagues is likely a less stable and infrequent action. A future analysis will examine this in more detail.

People and Products Change

Using past recommendations to predict future behavior also presents another challenge to consider: changing experiences.

Even if we have a perfect record of what people did and could exclude memory bias, change happens. People change their minds, and products and companies change (often quickly). So what people did in the past, even the recent past, may not be as good an indicator of what they will do in the future. A good example of this also comes from Netflix.

In 2011, Netflix split its mail-order DVD business from its online streaming service and raised prices. This angered customers (many of whom had likely recommended the service recently), and Netflix lost 800,000 customers by one estimate.

We were tracking Netflix’s NPS in 2011, and this change was captured in its Net Promoter Score, which went from one of the highest in our database (73%) to a low of -7%, as shown in Figure 1.

Figure 1: The Netflix NPS in Feb 2011 and Oct 2011, before and after announcing an unpopular product change.

The Netflix example shows how past recommendations (even recent ones) can quickly become poor predictors of future attitudes and behavior. Netflix ultimately reversed its decision and more than recovered its subscriber base (and NPS score).

But that’s probably an extreme example of how a major change to a product can result in a major change to usage and recommendations. To more systematically understand how reports of past recommendations may differ from future likelihood recommendations, we examined data in two large datasets and compared the differences between past recommendations and stated likelihood to recommend in the future.

Study 1: Software Recommendations

In our 2014 consumer and business software reports, we asked current users of software products whether they recommended the product and how likely they are to recommend it in the future using the 11-point Likelihood to Recommend (LTR) item from the Net Promoter Score. We used this information to quantify the value of a promoter.

On average, 29% of software users in our study reported that they recommended the product to at least one person in the prior year. Of these past recommenders, we found 41% were most likely to recommend it again, giving a 10 on the LTR item (Figure 2). Extending this to the top two boxes, we see that 64% of people who recommended are also promoters. Those who responded 8–10 on the LTR item accounted for a substantial portion (85%) of the recommendations. Including passives and promoters (7–10) accounts for almost all (93%) of the recommendations.
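
As a reference for how these segments and percentages are derived, here is a minimal sketch that buckets 0–10 LTR responses from past recommenders into the standard NPS categories; the responses below are hypothetical, not the study data:

# Illustrative sketch: segmenting 0-10 Likelihood-to-Recommend (LTR)
# responses from past recommenders into NPS categories.
from collections import Counter

# Hypothetical LTR responses from people who recommended in the past
ltr_responses = [10, 10, 9, 10, 8, 7, 10, 9, 8, 6, 10, 7, 9, 10, 3, 8, 10, 9, 10, 5]

def segment(score):
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"

counts = Counter(segment(s) for s in ltr_responses)
n = len(ltr_responses)
for seg in ("promoter", "passive", "detractor"):
    print(f"{seg:>9}: {counts[seg] / n:.0%}")

# Net Promoter Score = % promoters - % detractors
nps = (counts["promoter"] - counts["detractor"]) / n * 100
print(f"NPS: {nps:.0f}")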

You can see the large drop-off in recommendations between an 8 and a 7 (Figure 2), where the percentage of software users who reported recommending drops by more than half (from 21% to 8%), and then by half again (from 8% to 3%) when moving from 7 (passive) to 6 (detractor). It would be interesting to know why respondents in this study who did recommend in the past are less likely to recommend in the future.

Figure 2: Percent of respondents who reported recommending a software product in the past and would recommend it in the future using the 0 to 10 LTR scale.

Study 2: Recent Recommendation

To corroborate the pattern and look for reasons why past recommendations don’t account for all future recommendations we conducted another study in November 2018 with a more diverse set of products and services. We asked 2,672 participants in an online survey to think about what product or service they most recently recommended to a friend or colleague, either online or offline. Some examples of recently recommended companies and products were:

  • Barnes & Noble
  • Amazon
  • eBay
  • Colourpop Lipstick
  • PlayStation Plus
  • Spotify music
  • Bojangles Chicken and Biscuits
  • Ryka—shoes for women

After the first few hundred responses we noticed many people had recently recommended Amazon, so we asked respondents to think of another company. After recalling their most recent recommendation, we used the 0 to 10 LTR scale (0 = not at all likely to recommend and 10 = extremely likely) to find out how likely they would be to recommend this same product or service again to a friend.

The distribution of likelihood-to-recommend responses is shown in Figure 3. About half (52%) of participants selected the highest response of 10 (extremely likely) and 17% selected 9. In other words, of the people who recommended a product or service in the past, 69% are promoters (Figure 4). Figure 3 also shows that the bulk of values are at 8 and above, accounting for 84% of responses, and that another substantial drop in recommendations happens from 7 to 6 (from 8% to 2%).

Figure 3: Distribution of likelihood to recommend from 2,672 US respondents who recently recommended a product or service.

Figure 4 shows that 92% of all recommendations came from passives and promoters (almost identical to the 93% in Study 1). Across both studies around 8% of past recommenders were not very likely to recommend again. Why won’t they recommend again?

Figure 4: Percent of respondents that are promoters, passives, and detractors for products or services they reported recommending in the past (n = 2,672).

To find out, we asked respondents who gave 6 or below to briefly tell us why they’re now less likely to recommend in the future.

The most common reasons given were that the product or service was disappointing to them or the person they recommended it to, or it changed since they used it. Examples include:

  • I am slightly less likely to recommend because my purchase contained dead pixels (TCL television).
  • We’ve had recent issues with their products (lightexture).
  • The website was not as user-friendly as it could be (Kohl’s).
  • Service not what I was told it would be (HughesNet).
  • Product didn’t perform as expected (Pony-O).
  • The person I recommended it to went on to have an issue using the site (Fandango).

Several respondents also didn’t feel the opportunity would come up again, supporting the idea that recommendations may be infrequent, which makes past behavior a poorer predictor of future behavior.

  • I usually recommend Roxy to those who compliment me on my Roxy shoes. If that doesn’t happen then I don’t really bring them up.
  • I don’t expect anyone in my immediate circle to be soliciting recommendations for Dell. If they asked, I would give them a positive recommendation.
  • If it comes up in the conversation, I will recommend it. If it doesn’t I won’t necessarily go out of my way to do it.

Summary and Takeaways

In our analysis of the literature and across two studies examining past recommendations and future intent to recommend, we found:

Did recommend and will recommend offer similar results. If you want to use past recommendations as a “better” measure of future intent, this analysis suggests that future intent to recommend is highly related to past recommendations. Across two studies, around 92% of respondents who did recommend are also likely to recommend again (if you include both the promoter and passive categories).

Around 8% won’t recommend again. While past recommendations and future intent to recommend are highly related, around 8% of respondents who recently recommended a company, product, or experience won’t recommend it in the future (recommendation attrition). Even the most recent recommendation wasn’t a perfect predictor of stated intent to recommend. Expect recommendation attrition to increase as more time passes between the recommendation and when you ask people about it.

Poor experiences cause recommendation loss. The main reason people who recommended in the past won’t recommend in the future is that they (or the person they recommended the product to) had a disappointing experience. This can happen when participants have a more recent bad experience or when the product or service changes (as was the case when Netflix changed its pricing). Several participants also indicated they didn’t feel an opportunity to recommend would come up again.

Don’t dismiss future intentions. Past behavior may be a good indicator of future behavior, but not universally. A meta-analysis in the psychology literature suggests stated intent both moderates the effect of past behavior and, in many cases, is a better predictor of future behavior. We’ll investigate how this holds for product and company recommendations in a future article.

Ask about both past recommendations and future intent. If you’re able, ask about both the past and the future. People who recommended and are extremely likely to recommend again are probably the best predictors of who will recommend in the future. A literature review found that a combination of past behavior and future intentions may be a better predictor of future behavior, depending on the context and frequency of the behavior. Several participants in this study indicated that their recommendations were infrequent and that, despite prior recommendations, they may be less likely to recommend again (even though their attitude toward the experience didn’t change).

 



 

Source: How Does Did Recommend Differ from Likely to Recommend?

Relationships, Transactions and Heuristics

There are two general types of customer satisfaction surveys: 1) Customer Transaction Surveys and 2) Customer Relationship Surveys. Customer Transaction Surveys allow you to track satisfaction for specific events. Transactional surveys are typically administered soon after the customer has a specific interaction with the company, and they ask the customer to rate that specific interaction. Customer Relationship Surveys allow you to measure your customers’ attitudes across different customer touchpoints (e.g., marketing, sales, product, service/support) at a given point in time. The administration of relationship surveys is not linked to any specific customer interaction with the company; they are typically administered at periodic times throughout the year (e.g., every other quarter, annually). Consequently, the relationship survey asks customers to rate the company based on their past experience. While the surveys differ with respect to what is being rated (a transaction vs. a relationship), they can share identical customer touchpoint questions (e.g., technical support, sales).

A high-tech company was conducting both a transactional survey and a relationship survey. The surveys shared identical items. Given that the ratings were coming from the same company and shared identical touchpoint questions, the company expected the ratings to be the same on both surveys. The general finding, however, was that ratings on the transactional survey were typically higher than ratings for the same question on the relationship survey. Which score correctly describes the customer relationship? Why don’t ratings of identical items on relationship surveys and transactional surveys result in the same score? Humans are fallible.

Availability Heuristic

There is a line of research that examines the process by which people make judgments. This research shows that people use heuristics, or rules of thumb, when asked to make decisions or judgments about frequencies and probabilities of events. There is a heuristic called the “availability heuristic” that applies here quite well and might help us explain the difference between transactional ratings and relationship ratings of identical items.

People are said to employ the availability heuristic whenever their estimate of the frequency or probability of some event is based on the ease with which instances of that event can be brought to mind. Basically, things you can recall more easily seem more frequent in the world than things you can’t recall easily. For example, when presented with a list containing an equal number of male and female names, people are more likely to think the list contains more male names than female names when more of the male names belong to famous men. Because the famous names are more easily recalled, people think there must be more male names than female names.

Customers, when rating companies as a whole (relationship surveys), are recalling their prior interactions with the company (e.g., their call into phone support, receipt of marketing material). Their “relationship rating” is a mental composite of these past interactions: negative, positive, and mundane. Negative customer experiences, unlike positive or mundane ones, tend to be more vivid and visceral and, consequently, are more easily recalled. When I think of my past interactions with companies, it is much easier for me to recall negative experiences than positive ones. When thinking about a particular company, due to the availability heuristic, customers might overestimate the number of negative experiences, relative to positive experiences, that actually occurred with the company. Thus, their relationship ratings would be adversely affected by the availability heuristic.

Ratings from transactional surveys, however, are less vulnerable to the effect of the availability heuristic. Because the customers are providing ratings for one recent, specific interaction, the customers’ ratings would not be impacted by the availability heuristic.

Summary and Implications

Customer satisfaction ratings in relationship surveys are based on customers’ judgments of past experiences with the company and, consequently, are susceptible to the effects of the availability heuristic. Customers may more easily recall negative experiences, and those experiences then drag down their ratings of the company overall. While it would appear that a transactional survey could be a more accurate measure than a relationship survey, you shouldn’t throw out relationship surveys just yet.

While average scores on items in relationship surveys might be decreased due to the availability heuristic, the correlation among items should not be impacted by the availability heuristic because correlations are independent of scale values; decreasing ratings by a constant across all customers does not have any effect on the correlation coefficients among the items being rated. Consequently, the same drivers of satisfaction/loyalty would be found irrespective of survey type.
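
A quick way to see this point: shift one item’s ratings down by a constant and its Pearson correlation with another item is unchanged. A minimal sketch with hypothetical ratings:

# Illustrative sketch: subtracting a constant from one set of ratings does not
# change the Pearson correlation between the items.
from statistics import mean, stdev

support_ratings = [7, 9, 6, 8, 5, 9, 7, 6]    # hypothetical item A ratings
loyalty_ratings = [6, 9, 5, 8, 5, 10, 7, 6]   # hypothetical item B ratings

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

shifted = [r - 2 for r in support_ratings]  # a constant "availability" penalty
print(round(pearson_r(support_ratings, loyalty_ratings), 3))
print(round(pearson_r(shifted, loyalty_ratings), 3))  # identical to the line above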

I’d like to hear your thoughts.

Source: Relationships, Transactions and Heuristics

How the Right Loyalty and Operational Metrics Drive Service Excellence – Webinar

cx_forum_logo

Last week, I spoke at the CustomerThink Customer Experience Thought Leader Forum, which brings together customer experience researchers and practitioners to share leading-edge practices. Bob Thompson, founder of CustomerThink, organized several sessions focusing on specific CX issues facing business today. In our session, titled Customer Service Excellence: How to Optimize Channel and Metrics to Drive Omnichannel Excellence, Stephen Fioretti, VP at Oracle, and I addressed two areas of customer service. He talked about how customer service organizations can align their channel strategy to customer needs by guiding customers to the right channel based on the complexity and time sensitivity of interactions. I talked about the different types of metrics that help us understand relationship-level and transaction-level attitudes around service quality.

Self-Service Channel Adoption Increases but Delivers a Poor Experience

Stephen reported some interesting industry statistics from Forrester and Technology Services Industry Association. While the adoption of self-service is on the rise, customers are substantially less satisfied (47% satisfied) with these channels compared to the traditional (and still most popular) telephone channel (74% satisfied). So, while automated service platforms save companies money, they do so at the peril of the customer experience. As more customers adopt these automated channels, companies need to ensure they deliver a great self-service experience.

Improving the Customer Experience of Automated Channels through Behavioral/Attitudinal Analytics

In the talk, I showed how companies, by using linkage analysis, can better understand the self-service channel by analyzing the data behind the transactions, both behavioral and attitudinal. After integrating different data silos, companies can apply predictive analytics on their customer-generated data (e.g., web analytics) to make predictions about customers’ satisfaction with the experience. Using web analytics of online behavior patterns, companies might be able to profile customers who are predicted to be dissatisfied and intervene during the transaction to either improve their service experience or ameliorate its negative impact.
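
As a hedged illustration of what such a prediction might look like (the features, data, and model choice here are assumptions made for this sketch, not the approach described in the webinar), a simple classifier over web-analytics features could score a live session for likely dissatisfaction:

# Illustrative sketch: predicting self-service dissatisfaction from
# behavioral web-analytics features (hypothetical features and data).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features per session: [pages_viewed, search_retries, minutes_on_help]
X = np.array([
    [3, 0, 2], [12, 4, 15], [5, 1, 4], [15, 6, 20],
    [2, 0, 1], [10, 3, 12], [4, 0, 3], [14, 5, 18],
])
# 1 = customer later reported dissatisfaction (hypothetical survey linkage)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])

model = LogisticRegression().fit(X, y)

# Score a session in progress; a high probability could trigger an intervention
new_session = np.array([[11, 4, 14]])
print(f"P(dissatisfied) = {model.predict_proba(new_session)[0, 1]:.2f}")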

Stephen and I cover a lot more information in the webinar. To learn more, you can access the complete CX Forum webinar recording and slides here (free registration required).

 

Source: How the Right Loyalty and Operational Metrics Drive Service Excellence – Webinar

Beachbody Gets Data Management in Shape with Talend Solutions

This post is co-authored by Hari Umapathy, Lead Data Engineer at Beachbody and Aarthi Sridharan, Sr.Director of Data (Enterprise Technology) at Beachbody.

Beachbody is a leading provider of fitness, nutrition, and weight-loss programs that deliver results for our more than 23 million customers. Our 350,000 independent “coach” distributors help people reach their health and financial goals.

The company was founded in 1998, and has more than 800 employees. Digital business and the management of data is a vital part of our success. We average more than 5 million monthly unique visits across our digital platforms, which generates an enormous amount of data that we can leverage to enhance our services, provide greater customer satisfaction, and create new business opportunities.

Building a Big Data Lake

One of our most important decisions with regard to data management was deploying Talend’s Real Time Big Data platform about two years ago. We wanted to build a new data environment, including a cloud-based data lake, that could help us manage the fast-growing volumes of data and the growing number of data sources. We also wanted to glean more and better business insights from all the data we are gathering, and respond more quickly to changes.

We are planning to gradually add at least 40 new data sources, including our own in-house databases as well as external sources such as Google AdWords, DoubleClick, Facebook, and a number of other social media sites.

We have a process in which we ingest data from the various sources, store it in the data lake, process it, and then build the reporting and visualization layer on top of it. The process is enabled in part by Talend’s ETL (Extract, Transform, Load) solution, which can gather data from an unlimited number of sources, organize the data, and centralize it into a single repository such as a data lake.

We already had a traditional, on-premise data warehouse, which we still use, but we were looking for a new platform that could work well with both cloud and big data-related components, and could enable us to bring on the new data sources without as much need for additional development efforts.

The Talend solution enables us to execute new jobs again and again when we add new data sources to ingest into the data lake, without having to write new code each time. We now have a practice of reusing an existing job via a template and simply passing in a different set of parameters, as sketched below. That saves us time and money, and shortens the turnaround time for any new data acquisitions we take on as an organization.
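
To make the reusable, parameterized job pattern concrete, here is a minimal Python sketch; the real jobs are built as Talend templates rather than hand-written code, and the source names, endpoints, and paths below are illustrative only:

# Illustrative sketch: one reusable ingestion job, parameterized per source,
# rather than new code for every data source.
from dataclasses import dataclass

@dataclass
class SourceConfig:
    name: str         # e.g., "google_adwords", "facebook"
    endpoint: str     # where the raw data is pulled from
    target_path: str  # where the raw data lands in the data lake

def ingest(config: SourceConfig) -> None:
    # In the real pipeline these steps are Talend components; here they are stubs.
    print(f"Extracting {config.name} from {config.endpoint}")
    print(f"Landing raw data at {config.target_path}")

# Adding a new source means adding a config entry, not writing a new job
sources = [
    SourceConfig("google_adwords", "https://example.com/adwords/export", "s3://lake/raw/adwords/"),
    SourceConfig("facebook", "https://example.com/facebook/export", "s3://lake/raw/facebook/"),
]
for source in sources:
    ingest(source)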

The Results of Digital Transformation

For example, whenever a business analytics team or other group comes to us with a request for a new job, we can usually complete it over a two-week sprint. The data will be there for them to write any kind of analytics queries on top of it. That’s a great benefit.

The new data sources we are acquiring allow us to bring all kinds of data into the data lake. For example, we’re adding information such as reports related to the advertisements that we place on Google sites, the user interaction that has taken place on those sites, and the revenue we were able to generate based on those advertisements.

We are also gathering clickstream data from our on-demand streaming platform, and all the activities and transactions related to that. And we are ingesting data from the Salesforce.com marketing cloud, which has all the information related to the email marketing that we do. For instance, there’s data about whether people opened the email, whether they responded to the email and how.

Currently, we have about 60 terabytes of data in the data lake, and as we continue to add data sources we anticipate that the volume will at least double in size within the next year.

Getting Data Management in Shape for GDPR

One of the best use cases we’ve had that’s enabled by the Talend solution relates to our efforts to comply with the General Data Protection Regulation (GDPR). The regulation, a set of rules created by the European Parliament, European Council, and European Commission that took effect in May 2018, is designed to bolster data protection and privacy for individuals within the European Union (EU).

We leverage the data lake whenever we need to quickly access customer data that falls under the domain of GDPR. So when a customer asks us for data specific to that customer we have our team create the files from the data lake.

The entire process is simple, making it much easier to comply with such requests. Without a data lake that provides a single, centralized source of information, we would have to go to individual departments within the company to gather customer information. That’s far more complex and time-consuming.
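
A hedged sketch of what such a customer-specific extraction from a centralized lake might look like (the table and column names are assumptions for illustration, not Beachbody’s actual schema):

# Illustrative sketch: pulling all records for one customer from several
# data lake tables to answer a GDPR data-access request.
import pandas as pd

def extract_customer_data(customer_id, tables):
    """Return the rows belonging to one customer from each table."""
    return {name: df[df["customer_id"] == customer_id] for name, df in tables.items()}

# Hypothetical slices of the data lake
tables = {
    "orders": pd.DataFrame({"customer_id": ["c1", "c2"], "amount": [49.0, 120.0]}),
    "email_activity": pd.DataFrame({"customer_id": ["c1", "c1"], "opened": [True, False]}),
}

for name, rows in extract_customer_data("c1", tables).items():
    print(f"--- {name} ---")
    print(rows)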

When we built the data lake it was principally for the analytics team. But when different data projects such as this arise we can now leverage the data lake for those purposes, while still benefiting from the analytics use cases.

Looking to the Future

Our next effort, which will likely take place in 2019, will be to consolidate various data stores within the organization with our data lake. Right now different departments have their own data stores, which are siloed. Having this consolidation, which we will achieve using the Talend solutions and the automation these tools provide, will give us an even more convenient way to access data and run business analytics on the data.

We are also planning to leverage the Talend platform to increase data quality. Now that we’re increasing our data sources and getting much more into data analytics and data science, quality becomes an increasingly important consideration. Members of our organization will be able to use the data quality side of the solution in the upcoming months.

Beachbody has always been an innovative company when it comes to gleaning value from our data. But with the Talend technology we can now take data management to the next level. A variety of processes and functions within the company will see use cases and benefits from this, including sales and marketing, customer service, and others.

About the Authors: 

Hari Umapathy

Hari Umapathy is a Lead Data Engineer at Beachbody working on architecting, designing, and developing the company’s data lake using AWS, Talend, Hadoop, and Redshift. Hari is a Cloudera Certified Developer for Apache Hadoop. Previously, he worked at Infosys Limited as a Technical Project Lead, managing applications and databases for a major automotive manufacturer in the United States. Hari holds a bachelor’s degree in Information Technology from Vellore Institute of Technology, Vellore, India.

 

Aarthi Sridharan

Aarthi Sridharan is the Sr. Director of Data (Enterprise Technology) at Beachbody LLC, a health and fitness company in Santa Monica. Aarthi’s leadership drives the organization’s ability to make data-driven decisions for accelerated growth and operational excellence. Aarthi and her team are responsible for ingesting and transforming large volumes of data into the traditional enterprise data warehouse and the data lake, and for building analytics on top of it.

The post Beachbody Gets Data Management in Shape with Talend Solutions appeared first on Talend Real-Time Open Source Data Integration Software.

Source: Beachbody Gets Data Management in Shape with Talend Solutions