How AI and Machine Learning Can Win Elections

Elections are the time when the people of a country are given the power to choose the next government that will govern their nation. The period leading up to an election is packed with intense campaigning by all political parties. Every voter has their own ideology and expectations that they would like a candidate to fulfill. The main objective of political parties is to sway voters toward their respective candidates. The usual techniques politicians use to achieve this are meeting voters in person, mass media advertising, public rallies, social media campaigns, and so on. This has been the case with political elections throughout modern history.

In recent years, technology has changed this approach drastically. Politicians now rely on technological advances such as Big Data analysis to connect and engage better with voters. Former US President Barack Obama’s campaign team used Big Data analytics to maximize the effectiveness of its email campaigns, which resulted in raising a whopping US$1 billion of campaign money.

Along with Big Data, the next technologies that are going to have a huge impact in election campaigns and political life are Artificial Intelligence and Machine Learning.

Engaging Voters Using AI

AI and machine learning can be used to engage voters in election campaigns and help them be better informed about important political issues in the country. Based on statistical techniques, machine learning algorithms can automatically identify patterns in data. By analyzing voters’ online behaviour, which includes their data consumption patterns, relationships, and social media patterns, unique psychographic and behavioural user profiles could be created. Targeted advertising campaigns could then be sent to each voter based on their individual psychology. This helps in persuading voters to vote for the party that meets their expectations.

Source: How AI and Machine Learning Can Win Elections by administrator

The Hidden Bias in Customer Metrics

Business leaders understand how their business is performing by monitoring different metrics. Metrics are essentially a summary of all the data (yes, even Big Data) in a single score. Metrics include new customer growth rate, number of sales and employee satisfaction, to name a few. Your hope is that these scores tell you something useful.

There are a few ways to calculate a metric. Using the examples above, we see that we can use percentages, a simple count and descriptive statistics (e.g., mean) to calculate a metric. In the world of customer feedback, there are a few ways to calculate metrics from structured data. Take, for example, a company that has 10,000 responses to a recent customer survey in which they used a 0-10 point rating scale (e.g., 0 = extremely dissatisfied; 10 = extremely satisfied). They have a few options for calculating a summary metric:

  1. Mean Score:  This is the arithmetic average of the set of responses. The mean is calculated by summing all responses and dividing by the number of responses. Possible mean scores can range from 0 to 10.
  2. Top Box Score: The top box score represents the percentage of respondents who gave the best responses (either a 9 or a 10 on a 0-10 scale). Possible percentage scores can range from 0 to 100.
  3. Bottom Box Score: The bottom box score represents the percentage of respondents who gave the worst responses (0 through 6 on a 0-10 scale). Possible percentage scores can range from 0 to 100.
  4. Net Score: The net score represents the difference between the Top Box Score and the Bottom Box Score. Net scores can range from -100 to 100. While the net score was made popular by the Net Promoter Score camp, others have used a net score to calculate a metric (please see Net Value Score.) While the details might be different, net scores take the same general approach in their calculations (percent of good responses – percent of bad responses). For the remainder, I will focus on the Net Promoter Score methodology.
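
To make the four calculations concrete, here is a minimal Python sketch. The function name `summary_metrics` and the ten sample ratings are illustrative assumptions, not data from the survey described above; the Top Box (9-10) and Bottom Box (0-6) cutoffs follow the definitions in the list.

```python
from statistics import mean


def summary_metrics(ratings):
    """Return the four summary metrics for a list of 0-10 ratings."""
    n = len(ratings)
    if n == 0:
        raise ValueError("ratings must not be empty")
    # Top Box: percent of respondents who answered 9 or 10.
    top_box = 100.0 * sum(1 for r in ratings if r >= 9) / n
    # Bottom Box: percent of respondents who answered 0 through 6.
    bottom_box = 100.0 * sum(1 for r in ratings if r <= 6) / n
    return {
        "mean": mean(ratings),
        "top_box": top_box,
        "bottom_box": bottom_box,
        "net": top_box - bottom_box,  # percent good minus percent bad
    }


# Hypothetical responses, for illustration only.
sample = [10, 9, 9, 8, 7, 6, 3, 10, 0, 8]
metrics = summary_metrics(sample)
print(metrics)
```

With these made-up ratings, four of ten respondents fall in the Top Box and three in the Bottom Box, so the net score is the difference between those two percentages, while the mean is computed from all ten ratings.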

Different Summary Metrics Tell You the Same Thing

Table 1. Correlations among different summary metrics of the same question (likelihood to recommend).

When I compared these summary metrics to each other, I found that they tell you pretty much the same thing about the data set. Across 48 different companies, these four common summary metrics are highly correlated with each other (See Table 1). Companies who receive high mean scores also receive a high NPS and top box scores. Likewise, companies who receive low mean scores will get low NPS and top box scores.

If these metrics are, for practical purposes, equivalent, does it matter which one we use?

How Are Metrics Interpreted by Users?

Even though different summary metrics are essentially the same, some metrics might be more beneficial to users due to their ease of interpretation. Are there differences between Mean Scores and Net Promoter Scores at helping users understand the data? Even though a mean of 7.0 is comparable to an NPS of 0.0, are there advantages of using one over the other?

Table 2. Net Promoter Scores and Predicted Values of Other Summary Metrics.

One way of answering that question is to determine how well customer experience (CX) professionals can describe the underlying distribution of ratings on which the Mean Score or Net Promoter Score is calculated.

Study

Study participants were invited to the study via a blog post about the study; the post included a hyperlink to the Web-based data collection instrument. The post was shared through social media connections, professional online communities and the author’s email list.

For the current study, each CX professional ran through a series of exercises in which they estimated the size of different customer segments based on their knowledge of either a Mean Score or the Net Promoter Score. To ensure Mean Scores and Net Promoter Scores were comparable to each other, I created the study protocol using the data from the study above. Table 2 includes a list of six summary metrics with their corresponding values. NPS values range from -100 to 100 in increments of 10. The values of other metrics are based on the regression formulas that predicted a specific summary metric from different values of the NPS. An NPS of 0.0 corresponds to a Mean Score of 7.1.

First, each study participant was given five NPS values (-100, -50, 0, 50 and 100). For each NPS value, they were asked to provide their best guess of the size of four specific customer segments from which that NPS was calculated: 1) percent of respondents with ratings of 6 or greater (Satisfied); 2) percent of respondents who have ratings of 9 or 10 (Promoters); 3) percent of respondents with ratings between 0 and 6, inclusive (Detractors) and 4) percent of respondents with ratings of 7 or 8 (Passives).

Table 3. Sample Demographics

Next, these same CX professionals were given five comparable (to the NPS values above) mean values (4.0, 5.5, 7.0, 8.5 and 10.0). For each mean score, they were asked to provide their best guess of the percent of respondents in each of the same categories above (i.e., Satisfied, Promoters, Detractors and Passives).

Results

A total of 41 CX professionals participated in the study. Most CX professionals were from B2B companies (55%) or B2B/B2C companies (42%). Three quarters of them had formal CX roles, and most (77%) considered themselves either proficient or experts in their company’s CX program. See Table 3.

Figures 1 through 4 contain the results. Each figure contains three pieces of information that illustrate the accuracy of CX professionals’ estimates. The red dots represent the actual size of the specific customer segment for each value of the NPS. The green bars represent the CX professionals’ estimates of the size of the customer segment, along with their corresponding 95% confidence intervals.

Figure 1 focuses on the estimation of the number of Promoters. Results show that CX professionals underestimate the Top Box percentage (i.e., Promoters) when the Mean Score is high. For example, CX professionals estimated that a Mean Score of 8.5 was equivalent to 45% Top Box Score when the actual Top Box Score would really be 64%. We saw a smaller effect using the NPS. In general, CX professionals could more accurately guess the Top Box Scores when using Net Promoter Scores, except for the highest NPS value of 100 (Actual Top Box Score = 100; CX professional’s estimate = 89).

Figure 1. Estimating % of Promoters from NPS and Mean Values

In Figure 2, I looked at how well study participants could guess the size of the Bottom Box Scores (i.e., Detractors). Results show that CX professionals could accurately predict the percent of Detractors throughout the range of NPS values. On the other hand, CX professionals greatly underestimated the Bottom Box Scores when the Mean Score was extremely low (Mean = 4.0; Corresponding Bottom Box Score = 89; CX professionals’ estimate of Bottom Box Score= 64).

In Figure 3, I looked at how well study participants could guess the size of the Passives segment. Again, CX professionals were able to accurately estimate the percent of Passives across all values of the NPS. When using Mean Scores, however, study participants tended to overestimate the size of the Passives segment across all levels of the Mean Score.

Figure 2. Estimating % of Detractors from NPS and Mean Values

In Figure 4, I looked at how well CX professionals could estimate the size of the Satisfied segment (rating of 6 or greater). Unlike the other findings using the NPS, we see that study participants underestimated the size of this segment across all levels of the NPS. The effect was less pronounced and slightly different when CX professionals relied on the Mean Score. Under this condition, CX professionals underestimated the size of the Satisfied segment when the Mean rating was 5.5 or above but overestimated the size of the Satisfied segment when the Mean rating was 4.0.

Figure 3. Estimating % of Passives from NPS and Mean Values

Summary and Conclusions

The results of this study show that customer metrics possess inherent bias. People tend to make consistent errors when interpreting customer metrics, especially for extreme values.

When the Mean Score was used, estimations of segment sizes suffered on the extreme ends of the scale. When things are really good (high Mean Score), CX professionals underestimated the number of Promoters they really have. When things are really bad (Mean score of 4.0), they underestimated the number of Detractors they really have.

The use of the NPS leads to more accurate estimations about underlying customer segments that are a part of the NPS lexicon (i.e., Promoters, Detractors and Passives). Net scores force the user to think about their data in specific segments. When CX professionals were estimating the size of a segment unrelated to the NPS (i.e., estimating percent of 6 – 10 ratings), they greatly underestimated the size of the segment across the entire spectrum of the NPS.

Figure 4. Estimating % of Positives from NPS and Mean Values

Generally speaking, better decisions will be made when the interpretation of results matches reality. We saw that a mean of 8.5 really indicates that 64% of the customers are very satisfied (% of 9 and 10 ratings); yet, the CX professionals think that only 45% of the customers are very satisfied, painting a vastly different picture of how they interpret the data. Any misinterpretation of a performance metric could lead to sub-optimal decisions that are driven more by biases than by what the data really tell us, leading to unnecessary investments in areas where leaders are doing better than they think they are.

My advice is to consider using a few metrics to describe what’s happening with your data. First, Mean Scores and Net Scores are equivalent. So, for trending purposes, pick either and use it consistently. Second, report the size of specific customer segments (e.g., % Top Box) to ensure people understand the true meaning of the underlying data.

With the shortage of data scientists to help fill analytic roles in business, companies are looking for ways to train existing employees on how to analyze and interpret data. In addition to training the next analytics leaders, businesses need to focus on educating the consumers (e.g., executives, managers and individual contributors) about data and the use of analytics. The current sample used professionals who have a high degree of proficiency in the use of metrics as well as in the application of those metrics in a formal company program. Yet, these savvy users still misinterpreted metrics. For data novices, we would likely see greater bias. If you are a metric-rich company (and who isn’t?), consider offering a class on basic statistics to all employees.

Some Big Data vendors hope to build solutions to help bring data science to the masses. These solutions help users gain insight through easy analysis and visualization of results. For example, Statwing and Tableau provide good examples of solutions that allow you to present data in different ways (e.g., means, frequency distributions), helping you communicate what is really going on in the data.

Remember that metrics don’t exist in a vacuum. They are interpreted by people. We saw that people are biased in their understanding of the meaning of two commonly used customer metrics, the Mean Score and Net Score. Carefully consider how you communicate your results as well as your audiences’ potential biases.

http://www.slideshare.net/bobehayes/the-hidden-bias-in-customer-metrics

Source: The Hidden Bias in Customer Metrics by bobehayes

Jul 09, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Insights  Source

[ AnalyticsWeek BYTES]

>> Predictive Workforce Analytics Studies: Do Development Programs Help Increase Performance Over Time? by groberts

>> Birst shapes data analytics around the user by analyticsweekpick

>> September 19, 2016 Health and Biotech Analytics News Roundup by pstein

[ FEATURED COURSE]

R Basics – R Programming Language Introduction

Learn the essentials of R Programming – R Beginner Level!… more

[ FEATURED READ]

Data Science from Scratch: First Principles with Python

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn … more

[ TIPS & TRICKS OF THE WEEK]

Looking for success in data science? Find a mentor
Yes, most of us don’t feel the need for one, but most of us could really use one. Because most data science professionals work in isolation, getting an unbiased perspective is not easy. It is often also hard to see how a data science career will progress. A network of mentors addresses these issues: it gives data professionals an outside perspective and an unbiased ally. It’s extremely important for successful data science professionals to build a mentor network and draw on it throughout their careers.

[ DATA SCIENCE Q&A]

Q: How do you test whether a new credit risk scoring model works?
A: * Test on a holdout set
* Kolmogorov-Smirnov test

Kolmogorov-Smirnov test:
– Non-parametric test
– Compare a sample with a reference probability distribution or compare two samples
– Quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution
– Or between the empirical distribution functions of two samples
– Null hypothesis (two-samples test): samples are drawn from the same distribution
– Can be modified as a goodness of fit test
– In our case: cumulative percentages of good, cumulative percentages of bad
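
As a rough illustration of the two-sample version, the sketch below computes the Kolmogorov-Smirnov statistic as the largest gap between the empirical distribution functions of model scores for good and bad accounts. The function name `ks_statistic` and the score lists are hypothetical, and the sketch returns only the distance, not a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    distance = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)  # share of sample_a <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample_b <= x
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance


# Hypothetical predicted default probabilities on a holdout set.
good_scores = [0.10, 0.20, 0.25, 0.30, 0.40]  # accounts that repaid
bad_scores = [0.35, 0.50, 0.60, 0.70, 0.80]   # accounts that defaulted
ks = ks_statistic(good_scores, bad_scores)
print(round(ks, 3))
```

A statistic near 1 means the cumulative percentages of good and bad accounts separate almost completely along the score, while a value near 0 means the model scores the two groups alike.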

Source

[ VIDEO OF THE WEEK]

Understanding #BigData #BigOpportunity in Big HR by @MarcRind #FutureOfData #Podcast

[ QUOTE OF THE WEEK]

Everybody gets so much information all day long that they lose their common sense. – Gertrude Stein

[ PODCAST OF THE WEEK]

@TimothyChou on World of #IOT & Its #Future Part 1 #FutureOfData #Podcast

[ FACT OF THE WEEK]

And one of my favourite facts: At the moment less than 0.5% of all data is ever analysed and used, just imagine the potential here.

Sourced from: Analytics.CLUB #WEB Newsletter

Friends of Juice: Jessica Walker

Juice wouldn’t be the successful company that it is today without our friends who have championed our mission and expanded our network. We have identified several of Juice’s closest friends and advocates and we want to introduce them to you! Meet Jessica Walker, the CEO of Care Sherpa!

For over 14 years, Jessica has consulted with health care organizations and providers to support patient and employee engagement strategies with demonstrated financial and quality improvement impact. Across predictive analytics, marketing strategies, CRM, digital engagement, population health and patient portals, Jessica has supported multiple healthcare organizations on their patient acquisition and retention journeys.

Here’s what Jessica had to say when we discussed what she loved most about Juice Analytics and our team.

How did you hear or find out about Juice? 

I first became familiar with Juice after meeting Zach at the Nashville Analytics Summit and then later became familiar with Juice’s full capabilities when I was involved in an acquisition where the former company was utilizing some dashboards and insights tools that Juice had built. 

What do you love the most about Juice or what they do? 

I appreciate the thoughtfulness of starting with the “story” or what problem you are trying to solve. Juice quickly becomes a true partner, aligned with your strategic goals and giving you the tools you need to get there. I also appreciate the team members I have had the pleasure to work with; they are very service-oriented and responsive.

Who would you recommend Juice to for their data product needs?

I think Juice can be a critical tool in any organization’s toolbox. Specifically, I believe that anyone in a consultancy or change management role would quickly accelerate their business and stakeholder impact by working with Juice. Further, senior leaders or executives who are looking to demonstrate the impact of their product line, team or investments would be well served by a Juice partnership. Finally, early-stage organizations that need performance tracking would not only benefit from the expertise that Juice brings to the table to guide them with best practices; a Juice investment can also directly impact their business growth over time.

What has impressed you the most about Juice and its team? 

Beyond what was stated prior regarding service orientation, I was most impressed with the depth of expertise and their ability to bring examples of what others have done that may be similar to my needs. This knowledge not only accelerated our project timeline but also represented true cost savings by eliminating multiple revisions.   

We are so thankful for the partnership and friendship that we have with Jessica! Thank you for being such a champion for Juice!

Originally Posted at: Friends of Juice: Jessica Walker by analyticsweek

Jul 02, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Human resource  Source

[ AnalyticsWeek BYTES]

>> The Big Data Debate: Batch vs. Streaming Processing by analyticsweekpick

>> The Changing Landscape of Customer Acquisition, Engagement and Retention in 2020 by administrator

>> ‘Big Data’ Alters Investment Research Landscape by analyticsweekpick

[ FEATURED COURSE]

Lean Analytics Workshop – Alistair Croll and Ben Yoskovitz

Use data to build a better startup faster in partnership with Geckoboard… more

[ FEATURED READ]

Storytelling with Data: A Data Visualization Guide for Business Professionals

Storytelling with Data teaches you the fundamentals of data visualization and how to communicate effectively with data. You’ll discover the power of storytelling and the way to make data a pivotal point in your story. Th… more

[ TIPS & TRICKS OF THE WEEK]

Data Have Meaning
We live in a Big Data world in which everything is quantified. While the emphasis of Big Data has been focused on distinguishing the three characteristics of data (the infamous three Vs), we need to be cognizant of the fact that data have meaning. That is, the numbers in your data represent something of interest, an outcome that is important to your business. The meaning of those numbers is about the veracity of your data.

[ DATA SCIENCE Q&A]

Q: What is latent semantic indexing? What is it used for? What are the specific limitations of the method?
A: * Indexing and retrieval method that uses singular value decomposition to identify patterns in the relationships between the terms and concepts contained in an unstructured collection of text
* Based on the principle that words that are used in the same contexts tend to have similar meanings
* “Latent”: semantic associations between words are present not explicitly but only latently
* For example: two synonyms may never occur in the same passage but should nonetheless have highly associated representations

Used for:

* Learning correct word meanings
* Subject matter comprehension
* Information retrieval
* Sentiment analysis (social network analysis)
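
The synonym example can be sketched in a few lines of Python with NumPy. Everything in the sketch is an assumption made for illustration: the five-term toy corpus, the choice of k = 2 latent dimensions, and the helper `cosine`. It applies a truncated singular value decomposition to a term-document count matrix and compares term similarity before and after the reduction.

```python
import numpy as np

# Hypothetical toy corpus: "car" and "automobile" never share a document,
# but both appear alongside "engine".
terms = ["car", "automobile", "engine", "flower", "petal"]
docs = [
    ["car", "engine"],
    ["automobile", "engine"],
    ["flower", "petal"],
    ["flower", "petal"],
]

# Term-document count matrix (rows = terms, columns = documents).
A = np.array([[doc.count(t) for doc in docs] for t in terms], dtype=float)

# Truncated SVD: keep only the k largest singular values (k = 2 is assumed).
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
term_vectors = U[:, :k] * s[:k]  # each row places a term in the latent space


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


car, auto, flower = (terms.index(t) for t in ("car", "automobile", "flower"))
raw_similarity = cosine(A[car], A[auto])
latent_similarity = cosine(term_vectors[car], term_vectors[auto])
unrelated_similarity = cosine(term_vectors[car], term_vectors[flower])
print(raw_similarity, latent_similarity, unrelated_similarity)
```

In this toy corpus the raw rows for "car" and "automobile" are orthogonal because the words never co-occur, while their reduced representations point in the same direction because both share the context word "engine".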

Source

[ VIDEO OF THE WEEK]

#FutureOfData Podcast: Conversation With Sean Naismith, Enova Decisions

[ QUOTE OF THE WEEK]

The data fabric is the next middleware. – Todd Papaioannou

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Juan Gorricho, @disney

[ FACT OF THE WEEK]

Estimates suggest that by better integrating big data, healthcare could save as much as $300 billion a year — that’s equal to reducing costs by $1000 a year for every man, woman, and child.

Sourced from: Analytics.CLUB #WEB Newsletter

Amazon Redshift COPY command cheatsheet

Although it’s getting easier, ramping up on the COPY command to import tables into Redshift can become very tricky and error-prone. Following the accordion-like hyperlinked Redshift documentation to get a complete command isn’t always straightforward, either.

Treasure Data got in on the act (we always do!) with a guide to demystify and distill all the COPY commands you could ever need into one short, straightforward guide.

Load tables into Redshift from S3, EMR, DynamoDB, over SSH, and more!

The guide includes example commands and explains how to use data sources, including the steps for setting up an SSH connection, using temporary and encrypted credentials, formatting, and much more.

Get the guide here.

Originally Posted at: Amazon Redshift COPY command cheatsheet by john-hammink

Rethinking classical approaches to analysis and predictive modeling

Synopsis:

The speaker will address the need to rethink classical approaches to analysis and predictive modeling. He will examine “iterative analytics” and extremely fine-grained segmentation down to a single customer, ultimately building one model per customer, or millions of predictive models, to deliver on the promise of the “segment of one”. The speaker will also address the speed at which all this has to work to maintain a competitive advantage for innovative businesses.

Speaker:

Afshin Goodarzi, Chief Analyst, 1010data

A veteran of analytics, Goodarzi has led several teams in designing, building and delivering predictive analytics and business analytical products to a diverse set of industries. Prior to joining 1010data, Goodarzi was the Managing Director of Mortgage at Equifax, responsible for the creation of new data products and supporting analytics to the financial industry. Previously, he led the development of various classes of predictive models aimed at the mortgage industry during his tenure at Loan Performance (Core Logic). Earlier on he had worked at BlackRock, the research center for NYNEX (present day Verizon) and Norkom Technologies. Goodarzi’s publications span the fields of data mining, data visualization, optimization and artificial intelligence.

Sponsor:
1010Data [ http://1010data.com ]
Microsoft NERD [ http://microsoftnewengland.com ]
Cognizeus [ http://cognizeus.com ]

Originally Posted at: Rethinking classical approaches to analysis and predictive modeling

Jun 25, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Extrapolating  Source

[ AnalyticsWeek BYTES]

>> Model Risk 101: A Checklist for Risk Managers by analyticsweekpick

>> Apr 26, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin

>> Consider The Close Variants During Page Segmentation For A Better SEO by thomassujain

[ FEATURED COURSE]

Probability & Statistics

This course introduces students to the basic concepts and logic of statistical reasoning and gives the students introductory-level practical ability to choose, generate, and properly interpret appropriate descriptive and… more

[ FEATURED READ]

Superintelligence: Paths, Dangers, Strategies

The human brain has some capabilities that the brains of other animals lack. It is to these distinctive capabilities that our species owes its dominant position. Other animals have stronger muscles or sharper claws, but…

[ TIPS & TRICKS OF THE WEEK]

Winter is coming, warm your Analytics Club
Yes and yes! As we head into winter, what better time to talk about our increasing dependence on data analytics in decision making? Data- and analytics-driven decision making is rapidly working its way into our core corporate DNA, yet we are not building practice grounds to test those models fast enough. Such snug-looking models have hidden nails that could cause uncharted pain if left unchecked. This is the right time to start thinking about setting up an Analytics Club (Data Analytics CoE) in your workplace to lab out best practices and provide a test environment for those models.

[ DATA SCIENCE Q&A]

Q: How frequently must an algorithm be updated?
A: You want to update an algorithm when:
– You want the model to evolve as data streams through infrastructure
– The underlying data source is changing
– Example: a retail store model that remains accurate as the business grows
– Dealing with non-stationarity

Some options:
– Incremental algorithms: the model is updated every time it sees a new training example
Note: simple, and you always have an up-to-date model, but you can’t weight data to different degrees.
Sometimes mandatory: when data must be discarded once seen (privacy)
– Periodic re-training in “batch” mode: simply buffer the relevant data and update the model every so often
Note: more decisions to make and a more complex implementation
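
As a minimal sketch of the two options above, the Python below contrasts an incremental model with a periodically re-trained one. The "model" here is just a running mean, and the class names, window size and retraining interval are illustrative assumptions, not something the Q&A specifies.

```python
from collections import deque


class IncrementalMean:
    """Incremental option: updated on every new example, keeps no raw data."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Standard running-mean update; the example can be discarded afterwards.
        self.count += 1
        self.mean += (value - self.mean) / self.count


class BatchMean:
    """Batch option: buffers recent data and re-fits every `retrain_every` examples."""

    def __init__(self, window, retrain_every):
        self.buffer = deque(maxlen=window)  # only the most recent examples are kept
        self.retrain_every = retrain_every
        self.seen = 0
        self.mean = 0.0

    def update(self, value):
        self.buffer.append(value)
        self.seen += 1
        if self.seen % self.retrain_every == 0:
            # Periodic re-fit on the buffered window only.
            self.mean = sum(self.buffer) / len(self.buffer)


# A made-up stream whose level shifts halfway through (non-stationarity).
stream = [10, 10, 10, 10, 20, 20, 20, 20]
inc = IncrementalMean()
bat = BatchMean(window=4, retrain_every=4)
for x in stream:
    inc.update(x)
    bat.update(x)

print(round(inc.mean, 6), bat.mean)
```

After the shift, the incremental mean sits at roughly 15 because it weights every example equally, while the windowed batch model has moved to 20. That is the trade-off the notes describe: the incremental model is always current but cannot favour recent data, whereas the batch version forces you to choose a window and a retraining interval.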

How frequently?
– Is the sacrifice worth it?
– Data horizon: how quickly do you need the most recent training example to be part of your model?
– Data obsolescence: how long does it take before data is irrelevant to the model? Are some older instances more relevant than the newer ones?
Economics: generally, newer instances are more relevant than older ones. However, data from the same month or quarter of the previous year can be more relevant than more recent data from the current year. In a recession, data from previous recessions can be more relevant than newer data from a different economic cycle.

Source

[ VIDEO OF THE WEEK]

Venu Vasudevan @VenuV62 (@ProcterGamble) on creating a rockstar data science team #FutureOfData #Podcast

[ QUOTE OF THE WEEK]

Hiding within those mounds of data is knowledge that could change the life of a patient, or change the world. – Atul Butte, Stanford

[ PODCAST OF THE WEEK]

Understanding #BigData #BigOpportunity in Big HR by @MarcRind #FutureOfData #Podcast

[ FACT OF THE WEEK]

Data production will be 44 times greater in 2020 than it was in 2009.

Sourced from: Analytics.CLUB #WEB Newsletter

Big Data: Would number geeks make better football managers?

Charles Reep was a retired RAF Wing Commander who loved football.

Specifically, Swindon Town. And it ached to see them losing – something the team made a habit of in the 1949/50 season.

So frustrated was Wing Cdr Reep with one particular performance, that for the second half he pulled out his notepad and started making notes on the players – their movements, their positions, the shape of their play. He identified small changes that he thought could help the team grab a few more goals.

He was decades ahead of his time.

Now, behind the biggest football teams in the world, lies a sophisticated system of data gathering, metrics and number-crunching. Success on the pitch – and on the balance sheet – is increasingly becoming about algorithms.

The richest 20 clubs in the world bring in combined revenues of 5.4bn euros ($7.4bn, £4.5bn), according to consultancy firm Deloitte. And increasingly, data is being seen as crucial to maximising that potential income by getting the most from football’s prized investments – the players.

Hoof it!

Data and football have had a strained relationship over the years.

Back in the 1950s, Swindon didn’t have much time for Wing Cdr Reep’s approach. But west London side Brentford did.

Prozone’s software offers real-time match tracking, and is used by over 300 clubs worldwide

The club was facing a relegation battle. Wing Cdr Reep was taken on as an advisor – and with his counsel, the team turned their fortunes around and were safe from relegation at the close of the season.

A triumph, you would think – but his approach, despite the measurable success, drew considerable scorn.

His data suggested that most goals were scored from fewer than three direct passes, and he therefore recommended the widely-despised “long-ball” game.

In other words, the ugliest type of football imaginable. Hoof the ball forward, hope you get a lucky break, and poke it into the net.

“Unfortunately it kind of brought statistics and football into disrepute,” says Chris Anderson, author of The Numbers Game, an analytical and historical look at the use of data in football.

“Because people pooh-poohed the idea of the long ball game in football and thought it responsible for the England team not doing nearly as well as they should have for all these years.”

Leg sensors

Wing Cdr Reep passed away in 2002. Were he alive today, he would likely be a welcome guest at German football club TSG Hoffenheim, where the “big data” revolution is changing everything about how they prepare for a match.

Through a partnership with SAP – which specialises in handling “big data” for business – the club has incorporated real-time data measurements into its training schedule.

“It’s a very new way of training,” says Stefan Lacher, head of technology at SAP.

SAP data at Hoffenheim: the data can be analysed in real time by data experts, and training schedules can be adapted

“The entire training area becomes accessible virtually by putting trackers on everything that’s important – on the goals, on the posts. Every player gets several of them – one on each shinpad – and the ball of course has a sensor as well.

“If you train for just 10 minutes with 10 players and three balls – it produces more than seven million data points, which we can then process in real time.”
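
To put that figure in perspective, here is a rough back-of-the-envelope calculation in Python. It takes seven million as a lower bound and counts only the 10 players and three balls as tracked objects. The goal and post trackers, and the exact number of sensors per player, are not given, so they are ignored here as a simplifying assumption.

```python
# Figures quoted by SAP: 10 minutes, 10 players, three balls, 7m+ data points.
data_points = 7_000_000          # lower bound ("more than seven million")
session_seconds = 10 * 60

points_per_second = data_points / session_seconds

# Assumption: treat each player and each ball as one tracked object.
tracked_objects = 10 + 3
points_per_object_per_second = points_per_second / tracked_objects

print(f"{points_per_second:,.0f} data points per second overall")
print(f"{points_per_object_per_second:,.0f} per tracked object per second")
```

Under these assumptions the session generates at least about 11,700 data points every second, or close to 900 per tracked object per second.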

SAP’s software is able to crunch that data, and suggest tweaks that each individual player can make.

“It’s about better understanding the strengths and weaknesses of the players,” Mr Lacher says, “and spending more time working on the weaknesses and making better use of the strengths.

“It’s moving from gut feeling to facts and figures.”

Career-threatening injury

But it’s in the boardroom where football data has an even more critical role to play in the success of the team, says Dr Paul Neilson from football technology specialists Prozone.

“One of the most important things within elite sport is making sure your players are available for training and matches as much as possible, and that is about mitigating injury risks,” he says.

“If you’re doing that you should be able to reduce the risk of physical overload, and reduce the risk of injury.

SAP data at Hoffenheim: the data can be relayed to players so they can work on their weaknesses

“When you’re paying players as much as players get paid, it’s very important to make sure they’re on the pitch as much as possible.”

Non-playing players are a massive financial concern for football clubs. The famous example is the case of Jonathan Woodgate, who left Newcastle United in 2004 to join Spanish giants Real Madrid – for a tasty £13.4m.

Plagued by injury, Woodgate played for Real just nine times before leaving in 2007. That’s just under £1.5m per game – without his weekly wages taken into consideration.
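
The per-game figure follows directly from the numbers quoted above. A quick check in Python, using only the transfer fee and appearance count given in the article (wages excluded, as stated):

```python
transfer_fee_gbp = 13_400_000   # Woodgate's fee as quoted in the article
appearances = 9                 # games played for Real Madrid

cost_per_game = transfer_fee_gbp / appearances
print(f"£{cost_per_game:,.0f} per game")
```

That comes to roughly £1.49m per appearance, consistent with the "just under £1.5m" cited.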

Prozone’s research lab wants to reduce this risk for clubs by using data to analyse body movements and spot, before a physio can, where future injuries may occur.

In young players, analysis of movement can also provide an early warning system for those who may develop career-threatening injuries.

Collecting this data is a sophisticated task. Prozone’s approach relies on a complex network of cameras fitted around the stadium, picking up player movements from several angles at once.

Two worlds

Football managers and coaches like to think it’s their instinct, not geeky data, that gets results. And so, uptake of data analysis in football has been a slow process.

“Football, particularly in the UK, can be a little bit conservative,” says Dr Neilson.

“You look at rugby, and the head coach/manager will often be in the stand for all the game and be surrounded by data and technology and video analysis.

SAP sensor: players at Hoffenheim attach sensors to their kit to monitor their movements

“Compare that with football and the manager is still very much in the dugout, trying to affect the players personally, in terms of instructions and shouting – and very much being part of the sometimes chaotic nature of football.”

This culture clash means there are no managers that prowl the touchline with a tablet – yet. But behind the scenes it’s a very different picture.

Prozone provides intricate data for more than 300 football clubs around the world, including every team in the lucrative English Premier League.

But to make sense of it all requires talent – and Dr Neilson believes that soon, fans will come to admire – or despise – their club’s data scientist in the same way they treat the manager now.

“In a typical football club you have technical people like your sports analysis staff, or sports science staff. They are very analytical, very objective and process driven.

“At the opposite end of the scale you have the decision makers – the chief executive who writes the cheques, the manager that makes the weekly decision in terms of team selection.

“The challenge is connecting those two worlds – so the decision makers trust in that data.”

Sadly, Wing Cdr Reep didn’t live to see the true appreciation of his craft.

And to this day, his long-ball philosophy is criticised by many who say that his data collection was far too primitive to come to such sweeping conclusions.

But nevertheless, his work pioneered what has become a cornerstone of the modern, beautiful game.

Somewhere, in the not-so-distant future, at a football club losing three-nil at home – the fans are chanting “you’re getting sacked in the morning”. Not at the manager, but at the man with the big data.

Follow Dave Lee on Twitter @DaveLeeBBC

Originally posted via “Big Data: Would number geeks make better football managers?”

Source