Oct 18, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Data shortage (Source)

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Apr 19, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin

>> Dec 07, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin

>> The ultimate customer experience [infographic] by v1shal

Wanna write? Click Here

[ NEWS BYTES]

>> Nikkei: Japan probes Facebook data security – Seeking Alpha Under Data Security

>> How Is Artificial Intelligence Changing The Business Landscape? – Forbes Under Artificial Intelligence

>> Global Risk Analytics Market report provides the data on the past progress, ongoing market scenarios and future … – The Business Investor Under Risk Analytics

More NEWS ? Click Here

[ FEATURED COURSE]

Intro to Machine Learning


Machine Learning is a first-class ticket to the most exciting careers in data analysis today. As data sources proliferate along with the computing power to process them, going straight to the data is one of the most stra… more

[ FEATURED READ]

Superintelligence: Paths, Dangers, Strategies


The human brain has some capabilities that the brains of other animals lack. It is to these distinctive capabilities that our species owes its dominant position. Other animals have stronger muscles or sharper claws, but … more

[ TIPS & TRICKS OF THE WEEK]

Data aids, not replaces, judgment
Data is a tool and a means to help build consensus and facilitate human decision-making, not replace it. Analysis converts data into information; information, via context, leads to insight. Insights lead to decisions, which ultimately lead to outcomes that bring value. So, data is just the start; context and intuition also play a role.

[ DATA SCIENCE Q&A]

Q:How to clean data?
A: 1. First: detect anomalies and contradictions
Common issues:
* Tidy data issues (Hadley Wickham's paper):
column names are values, not variable names, e.g. 26-45…
multiple variables are stored in one column, e.g. m1534 (males aged 15-34)
variables are stored in both rows and columns, e.g. tmax, tmin in the same column
multiple types of observational units are stored in the same table, e.g. a song dataset and a rank dataset in the same table
* A single observational unit is stored in multiple tables (these can be combined)
* Data-Type constraints: values in a particular column must be of a particular type: integer, numeric, factor, boolean
* Range constraints: number or dates fall within a certain range. They have minimum/maximum permissible values
* Mandatory constraints: certain columns can’t be empty
* Unique constraints: a field must be unique across a dataset, e.g. the same person must have a unique Social Security number
* Set-membership constraints: the values for a column must come from a set of discrete values or codes, e.g. gender must be female or male
* Regular expression patterns: for example, a phone number may be required to have the pattern (999)999-9999
* Misspellings
* Missing values
* Outliers
* Cross-field validation: certain conditions that involve multiple fields must hold. For instance, in laboratory medicine, the percentages of the different white blood cell types must sum to 100. In a hospital database, a patient’s discharge date can’t be earlier than the admission date
2. Clean the data using (a short sketch follows this list):
* Regular expressions: misspellings, regular expression patterns
* KNN-impute and other missing values imputing methods
* Coercing: data-type constraints
* Melting: tidy data issues
* Date/time parsing
* Removing observations
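
Below is a minimal, hedged sketch of a few of these cleaning steps using pandas and scikit-learn; the column names, phone pattern and data are invented purely for illustration and are not from the source.

```python
import pandas as pd
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical example data; column names are invented for illustration.
df = pd.DataFrame({
    "phone": ["(617)555-0101", "617-555-0102", None],
    "m1534": [10, 12, 9],    # "males aged 15-34": multiple variables hidden in one column name
    "f1534": [11, 14, 8],
    "age":   ["34", "27", "not available"],
})

# Regular-expression pattern: flag phone numbers that don't match (999)999-9999
valid_phone = df["phone"].str.match(r"^\(\d{3}\)\d{3}-\d{4}$", na=False)

# Data-type constraint: coerce 'age' to numeric; invalid entries become NaN
df["age"] = pd.to_numeric(df["age"], errors="coerce")

# Missing values: KNN-impute the numeric column(s)
df[["age"]] = KNNImputer(n_neighbors=2).fit_transform(df[["age"]])

# Tidy-data fix: melt the sex/age-band columns into explicit variables
tidy = df.melt(id_vars=["phone", "age"], value_vars=["m1534", "f1534"],
               var_name="sex_ageband", value_name="count")
print(tidy.head())
```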

Source

[ VIDEO OF THE WEEK]

@JohnTLangton from @Wolters_Kluwer discussed his #AI Lead Startup Journey #FutureOfData #Podcast

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

If you can’t explain it simply, you don’t understand it well enough. – Albert Einstein

[ PODCAST OF THE WEEK]

@chrisbishop on futurist's lens on #JobsOfFuture #FutureofWork #JobsOfFuture #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Facebook stores, accesses, and analyzes 30+ Petabytes of user generated data.

Sourced from: Analytics.CLUB #WEB Newsletter

Oracle releases database integration tool to ease big data analytics

Oracle has bolstered its database portfolio with the Oracle Data Integrator (ODI), a piece of middleware designed to help analysts sift through big data across a variety of sources.

As the name suggests, the ODI effectively eases the process of linking data in different formats and from diverse databases and clusters, such as Hadoop, NoSQL and relational databases.

This enables Oracle customers to conduct analysis on large and varied datasets without dedicating time and resources to preparing big data in an integrated and secure way prior to analysis.

In effect, the ODI allows huge pools of data to be treated as just another data source to be used alongside more regularly accessed data warehouses and structured databases.

Jeff Pollock, vice president of product management at Oracle, claimed that the ODI allows customers to be experts in extract, transform and load tools without learning the code needed to carry out such actions.

“Oracle is the only vendor that can automatically generate Spark, Hive and Pig transformations from a single mapping which allows our customers to focus on business value and the overall architecture rather than multiple programming languages,” he said.

Avoiding the need for proprietary code means that the ODI can be run natively with a company’s existing Hadoop cluster, bypassing the need to invest in additional development.

Cluster databases like Hadoop and Spark have traditionally been geared towards programmers with knowledge of the coding needed to manipulate them. On the flipside, analysts would mostly use software tools to carry out enterprise-level data analytics.

The ODI gives the non-code savvy analyst the ability to harness Hadoop and other data sources without requiring the coding knowledge to do so.

It also means that a company’s developers need not retrain to handle multiple databases. Oracle is touting this as a way for companies to save money and time on big data analysis.

Oracle’s move to build out its portfolio to focus on delivering direct data insights for its customers is indicative of the business-focused direction big data analytics is heading, underlined by Visa’s head of analytics saying big data projects must focus on making money.

Originally posted via “Oracle releases database integration tool to ease big data analytics”

Source: Oracle releases database integration tool to ease big data analytics

Creating Value from Analytics: The Nine Levers of Business Success

IBM just released the results of a global study on how businesses can get the most value from Big Data and analytics. They found nine areas that are critical to creating value from analytics. You can download the entire study here.

IBM Institute for Business Value surveyed 900 IT and business executives from 70 countries from June through August 2013. The 50+ survey questions were designed to help translate concepts relating to generating value from analytics into actions.

Nine Levers to Value Creation

Figure 1. Nine Levers to Value Creation from Analytics.

The researchers identified nine levers that help organizations create value from data. They compared leaders (those who identified their organization as substantially outperforming their industry peers) with the rest of the sample. They found that the leaders (19% of the sample) implement the nine levers to a greater degree than the non-leaders. These nine levers are:

  1. Source of value: Actions and decisions that generate results. Leaders tend to focus primarily on their ability to increase revenue and less so on cost reduction.
  2. Measurement: Evaluating the impact on business outcomes. Leaders ensure they know how their analytics impact business outcomes.
  3. Platform: Integrated capabilities delivered by hardware and software. Sixty percent of Leaders have predictive analytic capabilities, as well as simulation (55%) and optimization (67%) capabilities.
  4. Culture: Availability and use of data and analytics within an organization. Leaders make more than half of their decisions based on data and analytics.
  5. Data: Structure and formality of the organization’s data governance process and the security of its data. Two-thirds of Leaders trust the quality of their data and analytics. A majority of leaders (57%) adopt enterprise-level standards, policies and practices to integrate data across the organization.
  6. Trust: Organizational confidence. Leaders demonstrate a high degree of trust between individual employees (60% between executives, 53% between business and IT executives).
  7. Sponsorship: Executive support and involvement. Leaders (56%) oversee the use of data and analytics within their own departments, guided by an enterprise-level strategy, common policies and metrics, and standardized methodologies compared to the rest (20%).
  8. Funding: Financial rigor in the analytics funding process. Nearly two-thirds of Leaders pool resources to fund analytic investments. They evaluate these investments through pilot testing, cost/benefit analysis and forecasting KPIs.
  9. Expertise: Development of and access to data management and analytic skills and capabilities. Leaders share advanced analytics subject matter experts across projects, where analytics employees have formalized roles, clearly defined career paths and experience investments to develop their skills.

The researchers state that each of the nine levers has a different impact on the organization’s ability to deliver value from data and analytics; that is, all nine levers distinguish Leaders from the rest, but each lever impacts value creation in different ways. The Enable levers need to be in place before value can be seen through the Drive and Amplify levers. The nine levers are organized into three levels:

  1. Enable: These levers form the basis for big data and analytics.
  2. Drive: These levers are needed to realize value from data and analytics; lack of sophistication within these levers will impede value creation.
  3. Amplify: These levers boost value creation.

Recommendations: Creating an Analytic Blueprint

Figure 2. Analytics Blueprint for Creating Value from Data.

Next, the researchers offered a blueprint on how business leaders can translate the research findings into real changes for their own businesses. This operational blueprint consists of three areas: 1) Strategy, 2) Technology and 3) Organization.

1. Strategy

Strategy is about the deliberateness with which the organization approaches analytics. Businesses need to adopt practices around Sponsorship, Source of value and Funding to instill a sense of purpose to data and analytics that connects the strategic visions to the tactical activities.

2. Technology

Technology is about the enabling capabilities and resources an organization has available to manage, process, analyze, interpret and store data. Businesses need to adopt practices around Expertise, Data and Platform to create a foundation for analytic discovery to address today’s problems while planning for future data challenges.

3. Organization

Organization is about the actions taken to use data and analytics to create value. Businesses need to adopt practices around Culture, Measurement and Trust to enable the organization to be driven by fact-based decisions.

Summary

One way businesses are trying to outperform their competitors is through the use of analytics on their treasure trove of data. The IBM researchers were able to identify the necessary ingredients to extract value from analytics. The current research supports prior research on the benefits of analytics in business:

  1. Top-performing businesses are twice as likely to use analytics to guide future strategies and guide day-to-day operations compared to their low-performing counterparts.
  2. Analytic innovators 1) use analytics primarily to increase value to the customer rather than to decrease costs/allocate resources, 2) aggregate/integrate different business data silos and look for relationships among once-disparate metrics, and 3) secure executive support around the use of analytics that encourages sharing of best practices and data-driven insights throughout the company.

To extract value from analytics, businesses need to focus on improving the strategic, technological and organizational aspects of how they treat data and analytics. The research identified nine areas, or levers, executives can use to improve the value they generate from their data.

For the interested reader, I recently provided a case study (see: The Total Customer Experience: How Oracle Builds their Business Around the Customer) that illustrates how one company uses analytical best practices to help improve the customer experience and increase customer loyalty.

————————–

TCE Total Customer Experience

 

Buy TCE: Total Customer Experience at Amazon >>

In TCE: Total Customer Experience, learn more about how you can  integrate your business data around the customer and apply a customer-centric analytics approach to gain deeper customer insights.

 

Source by bobehayes

Oct 11, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Conditional Risk (Source)

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Nick Howe (@Area9Nick) talks about fabric of learning organization to bring #JobsOfFuture #podcast by v1shal

>> How do we cut through the jumble of Business Analytics? -Janet Amos Pribanic by analyticsweek

>> Why Your Company Should Use Data Science to Make Better Decisions by analyticsweekpick

Wanna write? Click Here

[ NEWS BYTES]

>> Deliver desktop and app virtualization for mobile devices – TechTarget Under Virtualization

>> Continental Protects Vehicles From Cyber Attacks – Modern Tire Dealer Under cyber security

>> Cyber security unit has no strategic plan, C&AG finds – Irish Times Under cyber security

More NEWS ? Click Here

[ FEATURED COURSE]

Statistical Thinking and Data Analysis


This course is an introduction to statistical data analysis. Topics are chosen from applied probability, sampling, estimation, hypothesis testing, linear regression, analysis of variance, categorical data analysis, and n… more

[ FEATURED READ]

The Industries of the Future


The New York Times bestseller, from leading innovation expert Alec Ross, a “fascinating vision” (Forbes) of what’s next for the world and how to navigate the changes the future will bring…. more

[ TIPS & TRICKS OF THE WEEK]

Grow at the speed of collaboration
Research by Cornerstone OnDemand pointed out the need for better collaboration within the workforce, and the data analytics domain is no different. A rapidly changing and growing industry like data analytics is very difficult for an isolated workforce to keep up with. A good collaborative work environment facilitates a better flow of ideas, improved team dynamics, rapid learning, and an increasing ability to cut through the noise. So, embrace collaborative team dynamics.

[ DATA SCIENCE Q&A]

Q:Give examples of bad and good visualizations?
A: Bad visualization:
– Pie charts: difficult to make comparisons between items when area is used, especially when there are lots of items
– Color choice for classes: abundant use of red, orange and blue. Readers can think that the colors could mean good (blue) versus bad (orange and red) whereas these are just associated with a specific segment
– 3D charts: can distort perception and therefore skew data
– Not using solid lines in a line chart: dashed and dotted lines can be distracting

Good visualization:
– Heat map with a single color: some colors stand out more than others, giving undue weight to that data; a single color with varying shades shows the intensity better
– Adding a trend line (regression line) to a scatter plot helps the reader see the trend (see the sketch below)
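
The following is a small matplotlib sketch of the two "good visualization" suggestions above (a single-hue heat map and a scatter plot with a trend line); the data is randomly generated purely for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 2.0 * x + rng.normal(0, 2, 50)          # synthetic data for illustration only

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Heat map drawn with a single hue: varying shades encode intensity
ax1.imshow(rng.random((8, 8)), cmap="Blues")
ax1.set_title("Single-hue heat map")

# Scatter plot plus a least-squares trend line to highlight the relationship
slope, intercept = np.polyfit(x, y, 1)
ax2.scatter(x, y, alpha=0.7)
ax2.plot(np.sort(x), slope * np.sort(x) + intercept)
ax2.set_title("Scatter plot with trend line")

plt.tight_layout()
plt.show()
```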

Source

[ VIDEO OF THE WEEK]

Using Topological Data Analysis on your BigData

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Information is the oil of the 21st century, and analytics is the combustion engine. – Peter Sondergaard

[ PODCAST OF THE WEEK]

@AlexWG on Unwrapping Intelligence in #ArtificialIntelligence #FutureOfData #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Data volumes are exploding: more data has been created in the past two years than in the entire previous history of the human race.

Sourced from: Analytics.CLUB #WEB Newsletter

10 Ways Big Data is Changing K-12 Education

You may have heard the term “big data” in reference to companies like Netflix, Google or Facebook. It’s the collection of all those little data points about your choices and decision making process that allows companies to know exactly what movie you’re in the mood for when you plop down on your couch with a bowl of popcorn after a long day. Recently, big data has also made a foray into the educational realm. Whether through information gathered through standardized testing or the use of adaptive learning systems, big data is well on its way to completely transforming K-12 education.

Here are 10 ways Big Data is changing K-12 education:

1. Different pace of learning
One of the main challenges that educators currently face is adapting their instruction so it accommodates many different students who learn at different paces. The tools used to collect data, like intelligent adaptive learning systems, are designed to shift the pace of instruction depending on the prior knowledge, abilities and interests of each student. Teachers, in turn, can use this data to inform their pace of instruction going forward.

2. Sharing information
When students change schools or move across state lines, it has often been a challenge for their new teachers to get a firm grasp of what they have covered and which content areas may need more attention. The Common Core standards make data interchangeable across schools and districts.

3. Pinpoint problem areas
A unique feature of big data is that it allows teachers and administrators to pinpoint academic problem areas in students as they learn rather than after they take the test. For example, if a student is working through an adaptive learning program and the data collected reveals that he or she needs more help understanding the fundamental concepts behind fractions, teachers or the adaptive learning system can set aside time to work individually with that student to address and overcome the problem.

4. Need for analysts
Of course, the collection of all of this data isn’t helpful for anyone if it just sits there – school districts are beginning to need analysts to interpret it all. Disparate data sets must be linked so that decision makers in a school district can view, sort and analyze the information to develop both long- and short-term plans for improving education. School districts may also need to set up workshops to show teachers how they can use all of this data effectively.

5. Different means of educational advancement
Traditionally, readiness for educational advancement has been determined more by age than whether or not the student was ready to learn more challenging material. Gifted students may be advanced, but they often stay in the same class as their peers because information about what they know can only be collected sporadically. Big data allows teachers and administrators to get a continuous sense of where students are falling academically, and whether or not they are ready to advance.

6. Smooth transitions
The collection of data is not only allowing for smoother transition between schools, but also grade levels. Access to information databases about what exactly students know could prove quite useful to school districts that are in the process of implementing the Common Core State Standards. Because the CCSS are changing academic requirements, some students find that they’ve inadvertently missed learning something important because it was shifted to the grade below. Data can pinpoint this problem so it can be addressed.

7. Personalized activities
Personalized learning has become a much-heralded approach to education, and big data is helping teachers tailor activities to individual learners. Technology, in particular, is playing a central role. Tech-savvy students can use computer games and adaptive learning programs to complete educational activities that are interactive and take their skill level into account.

8. Using analytics
One significant change that schools are seeing is the increasing use of analytics to inform their approaches. For example, big data can be analyzed to create plans to improve academic results, decrease dropout rates and influence the day-to-day decision making of administrators and teachers.

9. Engage parents and students
It’s extremely important for parents to be involved in their children’s education, and big data is providing a means of engaging both parents and students. If at parent/teacher conferences, educators can pinpoint exactly where a child is excelling and where more work is needed, and can provide data to back up those claims, parents will have a clearer understanding of what they can do to help their children succeed in school.

10. Customized instruction
Perhaps most exciting for teachers and students alike is the ability for customized instruction that big data provides. This differs greatly from the approach to education in the past, when teachers would deliver one lesson and expect all students to understand, even if they learned in very different ways.
Is your school using big data? What changes are you seeing?

Dan Kerns

Originally posted via “10 Ways Big Data is Changing K-12 Education”

Source by analyticsweekpick

Oct 04, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Trust the data (Source)

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Can Analytics Improve your Game? by bobehayes

>> @DrewConway on creating socially responsible data science practice #FutureOfData #Podcast by v1shal

>> Avoiding a Data Science Hype Bubble by analyticsweek

Wanna write? Click Here

[ NEWS BYTES]

>> Why It’s Time For Retail To Stop Experimenting With IoT, And Start Implementing It – PYMNTS.com Under Internet Of Things

>> Panzura Barreling Toward IPO In $68B Cloud Data Management Market – Forbes Under Cloud

>> As Hadoop landscape evolves, Hortonworks CEO plots future in hybrid cloud and IoT – SiliconANGLE News (blog) Under Hadoop

More NEWS ? Click Here

[ FEATURED COURSE]

Lean Analytics Workshop – Alistair Croll and Ben Yoskovitz


Use data to build a better startup faster in partnership with Geckoboard… more

[ FEATURED READ]

Storytelling with Data: A Data Visualization Guide for Business Professionals


Storytelling with Data teaches you the fundamentals of data visualization and how to communicate effectively with data. You’ll discover the power of storytelling and the way to make data a pivotal point in your story. Th… more

[ TIPS & TRICKS OF THE WEEK]

Save yourself from a zombie apocalypse of unscalable models
One living, breathing zombie in today’s analytical models is the absence of error bars. Not every model is scalable or holds its ground as data grows. The error bars attached to almost every model should be duly calibrated. As business models take in more data, error bars keep them sensible and in check. If error bars are not accounted for, our models become susceptible to failure, leading to a Halloween we never want to see.
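
As a hedged illustration of "duly calibrated" error bars, here is a minimal bootstrap sketch for putting a confidence interval around a model metric; the per-record errors are simulated and purely hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
errors = rng.normal(loc=0.12, scale=0.05, size=500)   # hypothetical per-record model errors

def bootstrap_ci(values, n_boot=2000, alpha=0.05):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    means = [rng.choice(values, size=len(values), replace=True).mean()
             for _ in range(n_boot)]
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return values.mean(), lo, hi

mean_err, lo, hi = bootstrap_ci(errors)
print(f"mean error = {mean_err:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```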

[ DATA SCIENCE Q&A]

Q:Explain what a local optimum is and why it is important in a specific context,
such as K-means clustering. What are specific ways of determining if you have a local optimum problem? What can be done to avoid local optima?

A: * A solution that is optimal within a neighboring set of candidate solutions
* In contrast with global optimum: the optimal solution among all others

* K-means clustering context:
It’s proven that the objective cost function will always decrease until a local optimum is reached.
Results will depend on the initial random cluster assignment

* Determining if you have a local optimum problem:
Tendency of premature convergence
Different initialization induces different optima

* Avoiding local optima in a K-means context: repeat K-means with different initializations and take the solution that has the lowest cost (see the sketch below)
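
Here is a short scikit-learn sketch of that restart strategy, assuming synthetic blob data; it is illustrative only, not the original answer's code.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)   # synthetic data

best_model, best_cost = None, np.inf
for seed in range(20):                       # 20 different random initializations
    km = KMeans(n_clusters=4, init="random", n_init=1, random_state=seed).fit(X)
    if km.inertia_ < best_cost:              # inertia_ = within-cluster sum of squares
        best_model, best_cost = km, km.inertia_

print(f"lowest objective cost over 20 restarts: {best_cost:.1f}")
# Note: KMeans(n_init=20) performs the same restart strategy internally.
```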

Source

[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Juan Gorricho, @disney

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

You can use all the quantitative data you can get, but you still have to distrust it and use your own intelligence and judgment. – Alvin Toffler

[ PODCAST OF THE WEEK]

#DataScience Approach to Reducing #Employee #Attrition

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

YouTube users upload 48 hours of new video every minute of the day.

Sourced from: Analytics.CLUB #WEB Newsletter

Data Science Skills and the Improbable Unicorn

The role of data and analytics in business continues to grow. To make sense of their plethora of data, businesses are looking to data scientists for help. The job site indeed.com shows continued growth in “data scientist” positions. To better understand the field of data science, we studied hundreds of data professionals.

In that study, we found that data scientists are not created equal. That is, data professionals differ with respect to the skills they possess. For example, some professionals are proficient in statistical and mathematical skills while others are proficient in computer science skills. Still others have a strong business acumen. In the current analysis, I want to determine the breadth of talent that data professionals possess to better understand the possibility of finding a single data scientist who is skilled in all areas. First, let’s review the study sample and the method of how we measured talent.

Assessing Proficiency in Data Skills

We surveyed hundreds of data professionals to tell us about their skills in five areas: Business, Technology, Math & Modeling, Programming and Statistics. Each skill area included five specific skills, totaling 25 different data skills in all.

For example, in the Business Skills area, data professionals were asked to rate their proficiency in such specific skills as “Business development,” and “Governance & Compliance (e.g., security).” In the Technology Skills area, they were asked to rate their proficiency in such skills as “Big and Distributed Data (e.g., Hadoop, Map/Reduce, Spark),” and “Managing unstructured data (e.g., noSQL).” In the Statistics Skills, they were asked to rate their proficiency in such skills as “Statistics and statistical modeling (e.g., general linear model, ANOVA, MANOVA, Spatio-temporal, Geographical Information System (GIS)),” and “Science/Scientific Method (e.g., experimental design, research design).”

For each of the 25 skills, respondents were asked to tell us their level of proficiency using the following scale:

  • Don’t know (0)
  • Fundamental Knowledge (20)
  • Novice (40)
  • Intermediate (60)
  • Advanced (80)
  • Expert (100)

This rating scale is based on a proficiency rating scale used by NIH. Definitions for each proficiency level were fully defined in the instructions to the data professionals.

Standard of Performance

Figure 1. Proficiency in data skills varies by job role.

The different levels of proficiency are defined around the data scientist’s ability to give, or need to receive, help. In the instructions to the data professionals, the “Intermediate” level of proficiency was defined as the ability “to successfully complete tasks as requested.” We used that proficiency level (i.e., Intermediate) as the minimum acceptable level of proficiency for each data skill. The proficiency levels below the Intermediate level (i.e., Novice, Fundamental Awareness, Don’t Know) were defined by an increasing need for help on the part of the data professional. Proficiency levels above the Intermediate level (i.e., Advanced, Expert) were defined by the data professional’s increasing ability to give help or be known by others as “a person to ask.”

We looked at the level of proficiency for the 25 different data skills across four different job roles. As is seen in Figure 1, data professionals tend to be skilled in areas that are appropriate for their job role (see green-shaded areas in Figure 1). Specifically, Business Management data professionals show the most proficiency in Business Skills. Researchers, on the other hand, show lowest level of proficiency in Business Skills and the highest in Statistics Skills.

For many of the data skills, the typical data professional does not have the minimum level of proficiency needed to be successful at work, no matter their role (see yellow- and red-shaded areas in Figure 1). These data skills include the following: Unstructured data, NLP, Machine Learning, Big and distributed data, Cloud management, Front-end programming, Optimization, Graphic models, Algorithms and Bayesian statistics.

In Search of the Elite Data Scientist

Figure 2. There are only a handful of data professionals who are proficient in all skill areas.

There are a couple of ways an organization can build its data science capability. It can either hire a single individual who is skilled in all data science areas or it can hire a team of data professionals who have complementary skills. In both cases, the organization has all the skills necessary to use data intelligently. However, the likelihood of finding a data professional who is an expert in all five skill areas is quite low (see Figure 2). In our sample, we looked at three levels of proficiency: Intermediate, Advanced and Expert. We found that only 10% of the data professionals indicated they had at least an Intermediate level of proficiency in all five skill areas. The picture looks bleaker when you look for data professionals who have advanced or expert proficiencies in data skills. The chance of finding a data professional with Advanced skills or better in all five skill areas drops to less than 1%. There were no data professionals who were considered Experts in all five skill areas.
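
As a hedged illustration of how such figures can be computed, here is a small pandas sketch that counts respondents who meet a proficiency threshold in all five skill areas; the ratings below are mock data, not the study’s dataset.

```python
import pandas as pd
import numpy as np

rng = np.random.default_rng(7)
areas = ["Business", "Technology", "Math_Modeling", "Programming", "Statistics"]
# Ratings on the NIH-style scale: 0, 20, 40, 60, 80, 100 (mock data)
ratings = pd.DataFrame(rng.choice([0, 20, 40, 60, 80, 100], size=(400, 5)),
                       columns=areas)

for label, threshold in [("Intermediate", 60), ("Advanced", 80), ("Expert", 100)]:
    share = (ratings >= threshold).all(axis=1).mean()   # proficient in all five areas
    print(f"{label}+ in all five areas: {share:.1%}")
```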

Figure 3. Proficiency levels by industry.

We looked at proficiency differences across five industries: Consulting (n = 52), Education / Science (n = 50), Financial (n = 52), Healthcare (n = 50) and IT (n = 95). We identified data professionals who had an advanced level of proficiency across the different skills. We found that data professionals in the Education / Science industry have more advanced skills (54% have at least an advanced level of proficiency in at least one skill area) compared to data professionals in the Financial (37%) and IT (34%) industries.

Summary

The term “data scientist” is ambiguous. There are different types of data scientists, each defined by their level of proficiency in one of five skill areas: Business, Technology, Programming, Math & Modeling and Statistics. Data scientists can be defined by the skills they possess. So, when somebody tells you they are a data scientist, be sure you know what type they are.

Finding a data professional who is proficient in all data science skill areas is extremely difficult. As our study shows, data professionals rarely possess proficiency in all five skill areas at the level needed to be successful at work. The chance of finding a data professional with Expert skills in all five areas (even in 3 or 4 skill areas) is akin to finding a unicorn; they just don’t exist. There were very few data professionals who even had the basic minimum level of proficiency (i.e., Intermediate level of proficiency) in all five skill areas. Additionally, our initial findings on industry differences in skill proficiency suggest that skilled data professionals might be easier to find in specific industries. These industry differences could impact recruitment and management of data professionals. An under-supply of data science talent in one industry could require companies to use more dramatic recruitment efforts to attract data professionals from outside the industry. In industries where there are plenty of skilled data professionals, companies can be more selective in their hiring efforts.

Optimizing the value of business data is dependent on the skills of the data professionals who process the data. We took a skills-based approach to understanding how organizations can extract value from their data. Based on our findings, we recommend that organizations avoid trying to find a single data professional who has the skills that span the entire spectrum of data science. Rather, a better approach is to consider building up your data science capability through the formation of teams of data professionals who have complementary skills.

Originally Posted at: Data Science Skills and the Improbable Unicorn by bobehayes

Relative Performance Assessment: Improving your Competitive Advantage

Companies continually look for ways to increase customer loyalty (e.g., recommendations, retention, continue buying, purchase different/additional offerings). A popular loyalty improvement approach is customer experience management (CEM). CEM is the process of understanding and managing customers’ interactions with and perceptions about the company/brand. The idea behind this approach is that, if you can increase the satisfaction with the customer experience, your customers will engage in more loyalty behaviors toward your company/brand.

Customer Experience Isn’t Enough: Industry Ranking Also Drives Customer Loyalty

Using the CEM approach, companies typically measure the satisfaction with the customer experiences (e.g., product, service, support) and use that information to target improvement efforts in areas that will maximize customer loyalty. Keiningham et al. (2011) argue that focusing on your absolute improvements in customer experience is not enough to drive business growth. What is necessary to increase business growth is to improve your performance relative to your competitors. He and his colleagues found, in fact, that a company’s ranking (against the competition) was strongly related to share of wallet of their customers. In their two-year longitudinal study, they found that top-ranked companies received greater share of wallet of their customers compared to bottom-ranked companies.

A way to improve customer loyalty, then, is to increase your standing relative to your competition. To improve your ranking, you need to understand two pieces of information: 1) how your customers rank you relative to competitors they have used and 2) the reasons behind your ranking.

Relative Performance Assessment (RPA): A Competitive Benchmarking Approach

I developed a competitive analytics solution called the Relative Performance Assessment  (RPA) that helps companies understand their relative ranking against their competition and identify ways to increase their ranking.  The RPA helps companies improve their competitive advantage through customer feedback on a few key questions. In its basic form, the RPA method requires two additional questions in your customer relationship survey:

  1. How do our products compare with the competitors? This question allows you to gauge each customer’s perception of where they think you stand relative to other companies/brands they have used.  Other types of comparative questions that can be used include: “What best describes our performance compared to the competitors you use?” and “How does <your company name>’s services compare to other suppliers?” The key to RPA is the rating scale options. The response options allow each customer to tell you where your company ranks against all others in your space. A 5-point scale that works for this purpose is: 1) <your company name> is the worst; 2) <your company name> is better than some; 3) <your company name> is about average (about the same as others); 4) <your company name> is better than most; and 5) <your company name> is the best). Lower ratings (1-2) indicate that your customers think you are relatively worse than the competition; higher ratings (4-5) indicate that customers think you are relatively better than the competition. A Relative Performance Index (RPI) can be calculated by averaging the responses (See Customers’ Perception of Percentile Ranking – C-PeRk below – for another scaling method).
  2. Please tell us why you think that “insert answer to question above”. This question allows each customer to indicate the reasons behind his/her ranking of your performance. The content of the customers’ comments can be aggregated to identify underlying themes. Companies can use these comments to help diagnose the reasons for high rankings (e.g., ranked the best / better than most) or low rankings (ranked the worst / better than some).
In addition to these two questions, companies can also ask their customers about the current (and past) use of specific competitors:
  • Please tell us what other competitors you currently use or have used. This question allows you to identify how your customer’s ranking of your performance is influenced by specific competitors. The response options for this question could be a simple checklist of your competitors.

I have used the RPA in different settings and will illustrate its use in next week’s blog post.

Utility of the Relative Performance Assessment (RPA)

The value of any business solution is reflected in the insight it provides you. For the RPA, the value is seen in diagnosing the reasons behind your industry ranking and how to improve your ranking to increase customer loyalty.

  1. Improve marketing communications: Understand why customers gave you the top ranking (rated you “the best”). These customers’ comments define your competitive advantage (at least to your customers). Identify themes across these customers and use them to guide the content in your marketing communications to prospects.
  2. Identify how to improve your competitive ranking: Understand why customers rank you near the bottom of the pack. These customers’ comments define your competitors’ strength (compared to you) and can help you identify business areas where you need to make improvements in order to improve your ranking against competing brands.
  3. Estimate your industry percentile ranking: Your relative performance is indexed as a percentile rank. This percentile rank, expressed as a percentage, indicates where you stand in the distribution of all other competitors. The percentile rank can vary from 0% to 100%, and a higher percentile rank indicates better relative performance. A percentile rank of 80%, for example, indicates you are better than 80% of the rest; a percentile rank of 20% indicates you are better than 20% of the rest. The ratings can be translated into percentile values as follows: the worst = 0; better than some = 25; average = 50; better than most = 75; the best = 100. The average value across your respondents on this question represents your industry percentile rank. I call this index the Customers’ Perception of Percentile Rank (C-PeRk) score (see the sketch after this list).
  4. C-PeRk Score as a Key Business Metric: Typical relationship surveys allow customers to rate their satisfaction with your performance. The metrics from these surveys (e.g., customer loyalty indices, satisfaction indices) are used to track your performance and gauge any improvements that may occur over time. By supplementing these key customer-centric metrics with the C-PeRk score, companies can now measure and track insights regarding the competitive landscape.
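
Here is a minimal sketch of computing the Relative Performance Index (RPI) and the C-PeRk score from the 5-point comparative ratings described above; the responses are made up for illustration.

```python
import numpy as np

# 1 = the worst, 2 = better than some, 3 = about average, 4 = better than most, 5 = the best
responses = np.array([4, 5, 3, 4, 2, 5, 4, 3, 4, 5])   # hypothetical survey responses

# Relative Performance Index (RPI): simple average of the 1-5 ratings
rpi = responses.mean()

# C-PeRk: translate each rating to a percentile value, then average
percentile_map = {1: 0, 2: 25, 3: 50, 4: 75, 5: 100}
c_perk = np.mean([percentile_map[r] for r in responses])

print(f"RPI = {rpi:.2f} (1-5 scale), C-PeRk = {c_perk:.0f}th percentile")
```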

Summary

The relative performance of your company (compared to competitors) is related to customer loyalty. Companies that have higher industry rankings receive more share of wallet than companies with lower industry rankings. Traditional customer experience metrics simply track your company’s performance. In addition to these customer experience metrics, companies need to ask customers about their performance relative to competitors. The Relative Performance Assessment lets customers provide valuable benchmark information and can help companies understand and improve their industry ranking and, consequently, customer loyalty. In next week’s post, I will illustrate the use of the RPA.

Source: Relative Performance Assessment: Improving your Competitive Advantage

Sep 27, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Data interpretation (Source)

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> The Blueprint for Becoming Data Driven: Data Quality by jelaniharper

>> The Role of Bias In Big Data: A Slippery Slope by analyticsweekpick

>> How to Win Business using Marketing Data [infographics] by v1shal

Wanna write? Click Here

[ FEATURED COURSE]

R, ggplot, and Simple Linear Regression


Begin to use R and ggplot while learning the basics of linear regression… more

[ FEATURED READ]

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World


In the world’s top research labs and universities, the race is on to invent the ultimate learning algorithm: one capable of discovering any knowledge from data, and doing anything we want, before we even ask. In The Mast… more

[ TIPS & TRICKS OF THE WEEK]

Grow at the speed of collaboration
Research by Cornerstone OnDemand pointed out the need for better collaboration within the workforce, and the data analytics domain is no different. A rapidly changing and growing industry like data analytics is very difficult for an isolated workforce to keep up with. A good collaborative work environment facilitates a better flow of ideas, improved team dynamics, rapid learning, and an increasing ability to cut through the noise. So, embrace collaborative team dynamics.

[ DATA SCIENCE Q&A]

Q:Explain likely differences between administrative datasets and datasets gathered from experimental studies. What are likely problems encountered with administrative data? How do experimental methods help alleviate these problems? What problem do they bring?
A: Advantages of administrative data:
– Cost
– Large coverage of population
– Captures individuals who may not respond to surveys
– Regularly updated, allowing consistent time series to be built up

Disadvantages:
– Restricted to data collected for administrative purposes (limited to administrative definitions. For instance: incomes of a married couple, not individuals, which can be more useful)
– Lack of researcher control over content
– Missing or erroneous entries
– Quality issues (addresses may not be updated, or only a postal code is provided)
– Data privacy issues
– Underdeveloped theories and methods (sampling methods…)

Source

[ VIDEO OF THE WEEK]

Making sense of unstructured data by turning strings into things

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

The data fabric is the next middleware. – Todd Papaioannou

[ PODCAST OF THE WEEK]

@DrewConway on creating socially responsible data science practice #FutureOfData #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Poor data can cost businesses 20%–35% of their operating revenue.

Sourced from: Analytics.CLUB #WEB Newsletter

What Felix Baumgartner Space Jump Could Teach Entrepreneurs

First, some data about the space jump and Felix Baumgartner: “Felix Baumgartner (pronounced [felɪks baʊmgaːɐtnəʁ]; born 20 April 1969) is an Austrian skydiver and a BASE jumper. He set the world record for skydiving an estimated 39 kilometres (128,000 ft), reaching an estimated speed of 1,342 kilometres per hour (834 mph), or Mach 1.24, on October 14, 2012.[1] He is also renowned for the particularly dangerous nature of the stunts he has performed during his career. Baumgartner spent time in the Austrian military where he practiced parachute jumping, including training to land on small target zones.
Baumgartner’s most recent project was Red Bull Stratos, in which he jumped to Earth from a helium balloon in the stratosphere on 14 October 2012. As part of said project, he reached the altitude record for a manned balloon flight.[2]”
Officials from the Fédération Aéronautique Internationale (FAI) are currently analysing data from the descent.[4] If confirmed, the jump statistics Baumgartner will have attained are:

  • Maximum altitude of 39.045 kilometres (24.261 mi)
  • Maximum velocity of 1,342 kilometres per hour (834 mph), which corresponds to Mach 1.24
  • Total free fall time of 4 minutes 20 seconds
  • Total free fall distance of 36,529 metres (119,846 ft)

(Via Wiki)

On this remarkable occasion, I would like to thank Team Red Bull Stratos for including us all through real-time media coverage to witness the making of history. It was truly a remarkable experience that will stay with us for as long as we live. I would also like to thank Felix for undertaking such an important and seemingly impossible mission; it was truly an eye-opening, breath-holding jump. While this brave and exceptional jump was instrumented with a boatload of scientific experiments, it also carried some important lessons for entrepreneurs, something each of us could learn from.

The lessons that the space jump by Felix Baumgartner teaches us are:
Aim high, and prepare well: One thing that we will all agree on is the magnitude of the project. Jumping from 120k feet is an extremely ambitious undertaking, and every effort was made to achieve it. The higher the goal, the more preparation goes into it. So, aiming big is not a problem, but not preparing for it is. In the space jump, several runs and reruns of relentless practice, simulation, emulation and trial drills led to success. Similarly, entrepreneurs should also aim high and do everything in their power to achieve it. The more the preparation, the closer everyone gets to the success of the project.
Ambitious projects can take some time, but thoughtful and consistent preparation will make every dream a reality and every project a success.

A consistent drive will take you there: Getting to the space jump was not a short-term project for Felix; the project itself took years of preparation, and Felix was already a high-altitude jumping legend, which made him fit the bill as the space jumper. Consistent effort and strong commitment led him to this fame. Not to forget that he fought tooth and nail for it and risked a lot during this journey to get to this point. An entrepreneur should learn that relentless drive and passion can prepare one for the big leagues; if not today, then success will surely come tomorrow. Consistent drive is what it takes to fill the gap between what we are and what we want to be. So, entrepreneurs should throw everything they have into the game to achieve the goal.

It takes a team to do the impossible: One more thing that stood out from this jump was an extremely capable team. It was not just one man’s effort but a rockstar team, working relentlessly toward a common ambitious goal, that led to success. The goal here was to have Felix jump from a 120k+ foot high point in the stratosphere. It was clear during the live run that every member of the team was working tirelessly to make sure the goal was achieved safely and effectively. Entrepreneurs should understand that having a team is as important, if not more so, than having a rockstar idea. After all, it is not just the idea that counts but the execution as well. So, every effort should be made to build a strong team that is capable of executing the grand vision and helping convert it to reality. With the right team, even a space jump is an achievable goal.

Take risks, but safety comes first: Another great example from the space jump was the risk involved and the dedication to safety. Although everyone was super excited and anxious to see the jump happen, every effort was still made to make it a safe jump. We saw the jump postponed on numerous occasions during trial runs. No matter how small and trivial the issue was, safety was never compromised. This dedication to safety was extremely instrumental to the success of the project. This is a great lesson for entrepreneurs. Entrepreneurs are groomed to take bold risks and walk a tightrope, but they should make sure that every effort is made to understand and reduce the risk. This will not only improve the likelihood of success but also make things more saleable for the business.

Use the best advisors: What do you think when you hear that the Felix Baumgartner project had Joe Kittinger as an advisor? Joe Kittinger had achieved a stratospheric jump from an altitude of 102.8k feet. So, who better to have as an advisor than someone who has been there and done that? Imagine getting a boatload of great advice and suggestions from someone who was in the same shoes before and made it all happen. It certainly increased the likelihood of success for the project.
So, it is extremely important to have advisors who have crossed the chasm themselves and bring to the table experience that is directly relevant to the areas that are critical to the business. Capable advisors do everything to improve the probability of success for the project.

I am certain that there are numerous other examples that could emerge from the jump, but I would leave it to my brilliant readers to suggest more lessons that we could learn from the jump and share it with the world.

Whew.. Brave friends, you have made it this far. Thank you thank you thank you.. For your bravery and hunger for knowledge, I commend you and salute you with this video.

Source by v1shal