Analytics Strategy that is Startup Compliant
With the right tools, capturing data is easy, but being unable to handle that data can lead to chaos. One of the most reliable startup strategies for adopting data analytics is TUM, or The Ultimate Metric: the metric that matters most to your startup. TUM has several advantages: it answers the most important business question, it cleans up your goals, it inspires innovation, and it helps you understand the entire quantified business.
[ DATA SCIENCE Q&A]
Q: Explain likely differences between administrative datasets and datasets gathered from experimental studies. What are likely problems encountered with administrative data? How do experimental methods help alleviate these problems? What problems do they bring?
A: Advantages of administrative data:
– Large coverage of the population
– Captures individuals who may not respond to surveys
– Regularly updated, allowing consistent time series to be built up
Disadvantages of administrative data:
– Restricted to data collected for administrative purposes and limited to administrative definitions (for instance, the income of a married couple rather than of each individual, which can be more useful)
– Lack of researcher control over content
– Missing or erroneous entries
– Quality issues (addresses may not be updated, or only a postal code may be provided)
– Data privacy issues
– Underdeveloped theories and methods (sampling methods)
The largest AT&T database boasts titles including the largest volume of data in one unique database (312 terabytes) and the second largest number of rows in a unique database (1.9 trillion), which comprises AT&T’s extensive calling records.
In this first part of a two-part podcast, @TimothyChou discusses the Internet of Things landscape. He lays out how the internet has always been about an internet of things rather than an internet of people. He sheds light on the Internet of Things as it spreads across the themes of things, connect, collect, learn, and do workflows. He builds an interesting case for achieving precision to introduce optimality.
Timothy Chou’s career has spanned academia, successful (and not so successful) startups, and large corporations. He was one of only a few people to hold the President title at Oracle. As President of Oracle On Demand, he grew the cloud business from its very beginning. Today that business is over $2B. He wrote about the move of applications to the cloud in 2004 in his first book, “The End of Software”. Today he serves on the board of Blackbaud, a nearly $700M vertical application cloud service company.
After earning his PhD in EE at the University of Illinois, he went to work for Tandem Computers, one of the original Silicon Valley startups. Had he understood stock options he would have joined earlier. He’s invested in and been a contributor to a number of other startups, some you’ve heard of like Webex, and others you’ve never heard of but were sold to companies like Cisco and Oracle. Today he is focused on several new ventures in cloud computing, machine learning and the Internet of Things.
The #FutureOfData podcast is a conversation starter that brings leaders, influencers, and lead practitioners onto the show to discuss their journeys in creating the data-driven future.
Below is the final installment of the 20 Best Practices for Customer Feedback Programs. Today’s post covers best practices in Applied Research.
Applied Research Best Practices
Customer-focused research using the customer feedback data can provide additional insight into the needs of the customer base and increase the overall value of the customer feedback program. This research extends well beyond the information that is gained from the typical reporting tools that summarize customer feedback with basic descriptive statistics.
Loyalty leaders regularly conduct applied research using their customer feedback data. Typical research projects can include creating customer-centric business metrics, building incentive compensation programs around customer metrics, and establishing training criteria that have a measured impact on customer satisfaction. Sophisticated research programs require advanced knowledge of research methods and statistics. Deciphering signal from noise in the data requires more than the inter-ocular test (eyeballing the data).
Loyalty leaders link their customer feedback data to other data sources (see Figure 5 for financial, operational, and constituency linkages). Once the data are merged (see Figure 6 for data model for financial linkage), analysis can be conducted to help us understand the causes (operational, constituency) and consequences (financial) of customer satisfaction and loyalty. Loyalty leaders can use the results of these types of studies to:
Support business case of customer feedback program (financial linkage)
Identify objective, operational metrics that impact customer satisfaction and manage employee performance using these customer-centric metrics (operational linkage)
Understand how employees and partners impact customer satisfaction to ensure proper employee and partner relationship management (constituency linkage)
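To make the linkage idea concrete, here is a minimal sketch of a financial linkage: customer feedback scores are merged with revenue figures on a shared account ID, and the two are then correlated. The account IDs, field meanings, and values are invented for illustration, and the helper names `link` and `pearson` are hypothetical; a real study would use far more accounts and a proper statistical model.

```python
from math import sqrt

# Hypothetical toy data: account IDs and values are made up for illustration.
feedback = {"A1": 9.0, "A2": 6.5, "A3": 8.0, "A4": 4.0}          # mean satisfaction per account
financials = {"A1": 120.0, "A2": 80.0, "A3": 100.0, "A5": 60.0}  # annual revenue per account

def link(left, right):
    """Inner-join two dicts on their shared keys and return aligned values."""
    keys = sorted(left.keys() & right.keys())
    return keys, [left[k] for k in keys], [right[k] for k in keys]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Only accounts present in both sources survive the merge.
accounts, satisfaction, revenue = link(feedback, financials)
r = pearson(satisfaction, revenue)
print(accounts, round(r, 3))
```

Accounts that appear in only one source (A4 and A5 here) drop out of the merge, which is worth checking in practice because it can bias the linked sample. A correlation on its own shows association, not cause.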
A list of best practices in Applied Research appears in Table 6.
Table 6. Best Practices in Applied Research
15. Ensure results from customer feedback collection processes are reliable, valid and useful
Conduct a validation study of the customer feedback program. Verify the reliability, validity and usefulness of customer feedback metrics to ensure you are measuring the right things. This assessment needs to be one of the first research projects conducted to support (and dispute any challenges regarding) the use of these customer metrics to manage the company. This research will help you create summary statistics for use in executive reporting and company dashboards; summary scores are more reliable and provide a better basis for business decisions compared to using only individual survey questions.
16. Identify linkage between customer feedback metrics and operational metrics
Demonstrate that operational metrics are related to customer feedback metrics so that these operational metrics can be used to manage employees. Additionally, because of their reliability and specificity, these operational metrics are good candidates for use in employee incentive programs.
17. Regularly conduct applied customer-focused research
Build a comprehensive research program using the customer-centric metrics (and other business metrics) to get deep insight regarding the business processes. Customer feedback can be used to improve all phases of the customer lifecycle (marketing, sales, and service).
18. Identify linkage between customer feedback metrics and business metrics
Illustrate that financial metrics (e.g., profit, sales, and revenue) are related to customer feedback metrics. Oftentimes, this type of study can be used as a business case to demonstrate the value of the customer feedback program.
19. Identify linkage between customer feedback metrics and other constituencies’ attitudes
Identify factors of constituency attitudes (e.g., employee and partner satisfaction) that are linked to customer satisfaction/loyalty. Use these insights to properly manage employee and partner relationships to ensure high customer loyalty. Surveying all constituencies in the company ecosystem helps ensure all parties are focused on the customers and their needs.
20. Understand customer segments using customer information
Compare customer groups to identify key differences among groups on customer feedback metrics (e.g., satisfaction and loyalty). This process helps identify best practices internally among customer segments.
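Practice 15 above calls for verifying the reliability of customer feedback metrics. One common way to estimate the reliability of a summary score built from several survey questions is Cronbach's coefficient alpha (Cronbach, 1951). The sketch below assumes each respondent answered the same set of items; the function name `cronbach_alpha` and the ratings are hypothetical and only illustrate the arithmetic, not the specific validation method used in any given program.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Estimate coefficient alpha.

    responses: list of respondents, each a list of k item ratings.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])
    items = list(zip(*responses))                      # one tuple of ratings per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical ratings: 5 respondents x 3 loyalty questions (0-10 scale).
ratings = [
    [8, 9, 8],
    [5, 6, 5],
    [9, 9, 10],
    [3, 4, 4],
    [7, 6, 7],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items move together, which supports averaging them into a single summary score.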
Loyalty leaders are excellent examples of customer-centric companies. Compared to their loyalty-lagging counterparts, loyalty-leading companies embed customer feedback throughout the entire company, from top to bottom. Loyalty leaders use customer feedback to set the vision and manage their business; they also integrate the feedback into daily business processes and communicate all processes, goals and results of the customer program to the entire company. Finally, they integrate different business data (operational, financial, customer feedback) to reveal deep customer insights through in-depth research.
Take the Customer Feedback Programs Best Practices Survey
You can take the best practices survey to receive free feedback on your company’s customer feedback program. This self-assessment survey assesses the extent to which your company adopts best practices throughout its program. Go here to take the free survey: http://businessoverbroadway.com/resources/self-assessment-survey.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297-334.
Finding success in data science? Find a mentor
Yes, most of us don’t feel the need for one, but most of us really could use one. Because most data science professionals work in isolation, getting an unbiased perspective is not easy. It is often also hard to understand how a data science career will progress. A network of mentors addresses these issues easily, giving data professionals an outside perspective and an unbiased ally. It’s extremely important for successful data science professionals to build a mentor network and use it throughout their success.
[ DATA SCIENCE Q&A]
Q: Is it better to design robust or accurate algorithms?
A: A. The ultimate goal is to design systems with good generalization capacity, that is, systems that correctly identify patterns in data instances not seen before
B. The generalization performance of a learning system strongly depends on the complexity of the model assumed
C. If the model is too simple, the system can only capture the actual data regularities in a rough manner. In this case, the system has poor generalization properties and is said to suffer from underfitting
D. By contrast, when the model is too complex, the system can identify accidental patterns in the training data that need not be present in the test set. These spurious patterns can be the result of random fluctuations or of measurement errors during the data collection process. In this case, the generalization capacity of the learning system is also poor. The learning system is said to be affected by overfitting
E. Spurious patterns, which are only present by accident in the data, tend to have complex forms. This is the idea behind the principle of Occam’s razor for avoiding overfitting: simpler models are preferred if more complex models do not significantly improve the quality of the description for the observations
Quick response: Occam’s razor. It depends on the learning task. Choose the right balance
F. Ensemble learning can help balance bias and variance (several weak learners together = strong learner)
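The underfitting and overfitting points above can be illustrated with a small, dependency-free sketch. It assumes a true relationship y = 2x and uses a deterministic alternating +1/-1 offset as a stand-in for measurement noise. Three hypothetical models are compared: a constant (too simple), a least-squares line (about right), and a nearest-neighbour lookup that memorizes the training points (too flexible). None of this comes from the original Q&A; it is only one way to make the trade-off visible.

```python
def mse(model, xs, ys):
    """Mean squared error of a model (a function of x) on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_constant(xs, ys):
    """Too simple: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """Ordinary least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_memorizer(xs, ys):
    """Too flexible: return the target of the nearest training point, noise included."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

# Training data: y = 2x plus an alternating +1 / -1 offset standing in for noise.
train_x = [float(i) for i in range(10)]
train_y = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(train_x)]
# Unseen data: noise-free points between the training inputs.
test_x = [x + 0.25 for x in train_x]
test_y = [2 * x for x in test_x]

results = {}
for name, fit in [("constant", fit_constant), ("linear", fit_linear), ("memorizer", fit_memorizer)]:
    model = fit(train_x, train_y)
    results[name] = (mse(model, train_x, train_y), mse(model, test_x, test_y))

for name, (train_err, test_err) in results.items():
    print(f"{name:10s} train MSE = {train_err:6.2f}   test MSE = {test_err:6.2f}")
```

The constant model has a high error on both sets (underfitting). The memorizer reaches zero training error but does worse than the line on unseen points because it has learned the noise (overfitting).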
In the developed economies of Europe, government administrators could save more than €100 billion ($149 billion) in operational efficiency improvements alone by using big data, not including using big data to reduce fraud and errors and boost the collection of tax revenues.
I wrote about net scores last week and presented evidence that showed net scores are ambiguous and unnecessary. Net scores are created by taking the difference between the percent of “positive” scores and the percent of “negative” scores. Net scores were made popular by Fred Reichheld and Satmetrix in their work on customer loyalty measurement. Their Net Promoter Score is a difference score between the percent of “promoters” (ratings of 9 or 10) and percent of “detractors” (ratings of 0 thru 6) on the question, “How likely would you be to recommend <company> to your friends/colleagues?”
This resulting Net Promoter Score is used to gauge the level of loyalty for companies or customer segments. In my post, I presented what I believe to be sound evidence that mean scores and top/bottom box scores are much better summary indices than net scores. Descriptive statistics like the mean and standard deviation provide important information that describes the location and spread of the distribution of responses. Also, top/bottom box scores provide precise information about the size of customer segments. Net scores do neither.
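The ambiguity of a net score can be seen in a short sketch. It uses the promoter (9 or 10) and detractor (0 thru 6) cutoffs given above as the top and bottom boxes. The two sets of ratings are invented for illustration, and the function name `summarize` is hypothetical.

```python
from statistics import mean, pstdev

def summarize(ratings):
    """Return mean, standard deviation, top/bottom box percentages and the net score."""
    n = len(ratings)
    top_box = 100 * sum(r >= 9 for r in ratings) / n      # percent "promoters"
    bottom_box = 100 * sum(r <= 6 for r in ratings) / n   # percent "detractors"
    return {
        "mean": mean(ratings),
        "sd": pstdev(ratings),
        "top_box": top_box,
        "bottom_box": bottom_box,
        "net": top_box - bottom_box,
    }

# Two hypothetical customer segments of ten respondents each (0-10 recommend ratings).
segment_a = [10] * 5 + [8] * 2 + [6] * 3
segment_b = [9] * 2 + [7] * 8

summary_a = summarize(segment_a)
summary_b = summarize(segment_b)
print(summary_a)
print(summary_b)
```

Both segments produce the same net score, yet they differ in mean rating, spread, and the share of customers in the top and bottom boxes, which is the information a net score throws away.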
Rob Markey, the co-author of the book, The Ultimate Question 2.0 (along with Fred Reichheld), tweeted about last week’s blog post.
I really am unclear about how Mr. Markey believes my argument is supporting (in CAPS, mind you) the NPS point of view. I responded to his tweet but never received a clarification from him.
So, I present this post as an open invitation for Mr. Markey to explain how my argument regarding the ambiguity of the NPS supports their point of view.
One More Thing
I never deliver arguments shrouded behind a mask of criticism. While my analyses focused on the NPS, my argument against net scores (difference scores) applies to any net score; I just happened to have data on the recommend question, a common question used in customer surveys. In fact, I even ran the same analyses (e.g., comparing means to net scores) on other customer loyalty questions (e.g., overall sat, likelihood to buy), but I did not present those results because they were highly redundant to what I found using the recommend question. The problem of difference scores applies to any customer metric.
I have directly and openly criticized the research on which the NPS is based in my blog posts, articles, and books. I proudly stand behind my research and critique of the Net Promoter Score. Other mask-less researchers/practitioners have also voiced concern about the “research” on which the NPS is based. See Vovici’s blog post for a review. Also, be sure to read Tim Keiningham’s interview with Research Magazine in which he calls the NPS claims “nonsense”. Yes. Nonsense.
Just to be clear, “Nonsense” does not mean “Awesome.”
In this podcast, @JohnNives discusses ways to demystify AI for the enterprise. He shares his perspective on how businesses should engage with AI and on the best practices and considerations for adopting AI in their strategic roadmaps. This podcast is great for anyone seeking to learn how to adopt AI in the enterprise landscape.
Jean-Louis (John) Nives serves as Chief Digital Officer and the Global Chair of the Digital Transformation practice at N2Growth. Prior to joining N2Growth, Mr. Nives was at IBM Global Business Services, within the Watson and Analytics Center of Competence. There he worked on Cognitive Digital Transformation projects related to Watson, Big Data, Analytics, Social Business and Marketing/Advertising Technology. Examples include CognitiveTV and the application of external unstructured data (social, weather, etc.) for business transformation.
Prior relevant experience includes executive leadership positions at Nielsen, IRI, Kraft and two successful advertising technology acquisitions (Appnexus and SintecMedia). In this capacity, Jean-Louis combined information, analytics and technology to create significant business value in transformative ways.
Jean-Louis earned a Bachelor’s Degree in Industrial Engineering from University at Buffalo and an MBA in Finance and Computer Science from Pace University. He is married with four children and lives in the New York City area.
Keeping Biases Checked during the last mile of decision making
Today, a data-driven leader, data scientist, or data expert is constantly put to the test, helping a team solve problems with skills and expertise. Believe it or not, part of that decision tree is derived from intuition, which adds a bias to our judgment and taints our suggestions. Most skilled professionals understand and handle these biases well, but in a few cases we give in to tiny traps and find ourselves caught in biases that impair our judgment. So it is important to keep intuition bias in check when working on a data problem.
[ DATA SCIENCE Q&A]
Q: Give examples of bad and good visualizations.
A: Bad visualization:
– Pie charts: difficult to make comparisons between items when area is used, especially when there are lots of items
– Color choice for classes: abundant use of red, orange and blue. Readers can think that the colors could mean good (blue) versus bad (orange and red) whereas these are just associated with a specific segment
– 3D charts: can distort perception and therefore skew data
Good visualization:
– Using a solid line in a line chart: dashed and dotted lines can be distracting
– Heat map with a single color: some colors stand out more than others, giving more weight to that data. A single color with varying shades shows the intensity better
– Adding a trend line (regression line) to a scatter plot: helps the reader see trends
A quarter of decision-makers surveyed predict that data volumes in their companies will rise by more than 60 per cent by the end of 2014, with the average of all respondents anticipating a growth of no less than 42 per cent.
The projected expansion of the data ecosystem in 2017 is causing extremely deliberate, systematic challenges for organizations attempting to exploit the most effective techniques available for maximizing data utility.
The plenitude of cognitive computing options, cloud paradigms, data science, and mobile technologies for big data has demonstrated its business value in a multitude of use cases. Pragmatically, however, its inclusion alongside conventional data management processes poses substantial questions on the back end pertaining to data governance and, more fundamentally, to data modeling.
Left unchecked, these concerns could potentially compromise any front-end merit while cluttering data-driven methods with unnecessary silos and neglected data sets. The key to addressing them lies in the implementation of swiftly adjustable data models which can broaden to include the attributes of the constantly changing business environments in which organizations compete.
According to TopQuadrant Executive VP and Director of TopBraid Technologies Ralph Hodgson, the consistency and adaptability of data modeling may play an even more dire role for the enterprise today:
“You have physical models and logical models, and they make their way into different databases from development to user acceptance into production. On that journey, things change. People might change the names of some of the columns of some of those databases. The huge need is to be able to trace that through that whole assembly line of data.”
Enterprise Data Models
One of the surest ways to create a flexible enterprise model for a top-down approach to the multiple levels of modeling Hodgson described is to use the linked data approach, which relies on semantic standards. Although there are other means of implementing enterprise data models, this approach has the advantage of being based on uniform standards that apply to all data and quickly adjust to include new requirements and use cases. Moreover, it has the added benefit of linking all data on an enterprise knowledge graph which, according to Franz CEO Jans Aasman, is one of the dominant trends to impact the coming year. “We don’t have to even talk about it anymore,” Aasman stated. “Everyone is trying to produce a knowledge graph of their data assets.”
The merit of a uniform data model for multiple domains throughout the enterprise is evinced in Master Data Management platforms as well; one can argue the linked data approach of ontological models merely extends that concept throughout the enterprise. In both cases, organizations are able to avoid situations in which “they spend so much time trying to figure out what the data model looks like and how do we integrate these different systems together so they can talk,” Stibo Systems Director of Vertical Solutions Shahrukh Arif claimed. “If you have it all in one platform, now you can actually realize that full value because you don’t have to spend so much time and money on the integrations and data models.”
Data Utility Models
The consistency of comprehensive approaches to data modeling is particularly crucial for cloud-based architecture or for incorporating data external to the enterprise. Frequently, organizations may encounter situations in which they must reconcile differences in modeling and metadata when obtaining data from third-party sources. They can address these issues upfront by creating what DISCERN Chairman and CEO Harry Blount termed a “data utility model”, in which “all of the relevant data was available and mapped to all of the relevant macro-metadata, a metamodel I should say, and you could choose which data you want” from the third party in accordance with the utility model. Actually building such a model requires going through the conventional modeling process of determining business requirements and facilitating them through IT, which organizations can have done for them by competitive service providers. “Step one is asking all the right questions, step two is you need to have a federated, real-time data integration platform so you can take in any data in any format at any time in any place and always keep it up to date,” Blount acknowledged. “The third requirement is you need to have a scalable semantic graph structure.”
Relational Data Modeling (On-Demand Schema)
Data modeling in the relational world is increasingly impacted by the modeling techniques associated with contemporary big data initiatives. Redressing the inherent modeling disparities between the two is largely a means of accounting for semi-structured and unstructured data in relational environments primarily designed for structured data. Organizations are able to clear this modeling hurdle by means of file formats which derive schema on demand. Options such as JSON and Avro are ideal for those who “want what is modeled in the big data world to align with what they have in their relational databases so they can do analytics held in their main databases,” Hodgson remarked.
One of the boons of utilizing Avro is the complete traceability it provides for data in relational settings, although such data may have originated from more contemporary unstructured sources associated with big data. The Avro format, and other file formats in this vein, allow modelers to bridge relational schema requirements and the lack of such schema intrinsic to most big data. According to Hodgson, Avro “still has the ontological connection, but it still talks in terms of property values and columns. It’s basically a table in the same sense you find in a spreadsheet. It’s that kind of table but the columns all align with the columns in a relational database, and those columns can be associated with a logical model which need not be an entity-relationship model. It can be an ontology.”
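As a rough illustration of deriving a schema on demand, the sketch below reads semi-structured JSON records and infers a column list written in Avro's JSON schema notation (a `record` with named, typed `fields`). It is a simplified assumption of how such a step might look, not the Avro library itself: the function `infer_schema`, the record name `Customer`, and the sample records are hypothetical, and only flat records with primitive values are handled.

```python
import json

# Mapping from Python types produced by the json module to Avro primitive type names.
AVRO_TYPES = {bool: "boolean", int: "long", float: "double", str: "string", type(None): "null"}

def infer_schema(records, name="Record"):
    """Infer an Avro-style record schema from a list of flat dicts."""
    field_types = {}
    for record in records:
        for key, value in record.items():
            field_types.setdefault(key, set()).add(AVRO_TYPES[type(value)])
    # A field that is missing from some records is treated as nullable.
    for key, types in field_types.items():
        if any(key not in record for record in records):
            types.add("null")
    fields = []
    for key in sorted(field_types):
        types = sorted(field_types[key])
        # A single type is written as a string; several types become a union (list).
        fields.append({"name": key, "type": types[0] if len(types) == 1 else types})
    return {"type": "record", "name": name, "fields": fields}

# Hypothetical semi-structured input: one JSON document per line.
lines = [
    '{"id": 1, "name": "Ada", "score": 9.5}',
    '{"id": 2, "name": "Lin"}',
]
records = [json.loads(line) for line in lines]
schema = infer_schema(records, name="Customer")
print(json.dumps(schema, indent=2))
```

Each inferred field lines up with a column that a relational table could hold, which is the alignment Hodgson describes. A production pipeline would rely on an actual Avro library and handle nested records, arrays, and defaults.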
Predictive models have been widely impacted by cognitive computing methods and other aspects of data science, although these two realms of data management are not necessarily synonymous with classic statistically-trained predictive models. Still, the influx of algorithms associated with various means of cognitive computing is paramount to the creation of predictive models which demonstrate their full utility on unstructured big data sets at high velocities. Organizations can access entire libraries of machine learning and deep learning models from third-party vendors through the cloud, and either readily deploy them with their own data or adapt them. “As a platform, we allow customers to build their own models or extend our models in service of their own specific needs,” indico Chief Customer Officer Vishal Daga said.
The result is not only a dramatic reduction in the overall cost, labor, and salaries of hard-to-find data scientists needed to leverage cognitive computing techniques for predictive models, but also a degree of personalization, facilitated by the intelligent algorithms involved, that enables organizations to tailor those models to their own particular use cases. Thus, AI-centered SaaS opportunities actually reflect an on-demand predictive modeling service based on some of the most relevant data-centric processes to date.
The nucleus of the enduring relevance of data modeling is the increasingly complicated data landscape (including cognitive computing and a bevy of external data sources heralded by the cloud and mobile technologies in big data quantities) and the need to effectively structure data in a meaningful way. Modeling data is the initial step to gleaning its meaning and provides the basis for all of the different incarnations of data modeling, regardless of the particular technologies involved. However, there appears to be a growing sense of credence associated with doing so on an enterprise-wide scale, as “Knowing how data’s flowing and who it’s supporting, and what kind of new sources might make a difference to those usages, it’s all going to be possible when you have a representation of the enterprise,” Hodgson commented.
Adding further conviction to the value of enterprise data modeling is the analytic output facilitated by it. All-inclusive modeling techniques at the core of enterprise-spanning knowledge graphs appear well-suited for the restructuring of the data sphere caused by the big data disruption, particularly when paired with in-memory, parallel processing graph-aware analytics engines. “As modern data diversity and volumes grow, relational database management systems (RDBMS) are proving too inflexible, expensive and time-consuming for enterprises,” Cambridge Semantics VP of Engineering Barry Zane said. “Graph-based online analytical processing (GOLAP) will find a central place in everyday business by taking on data analytics challenges of all shapes and sizes, rapidly accelerating time-to-value in data discovery and analytics.”