@TimothyChou on World of #IOT & Its #Future Part 1 #FutureOfData #Podcast

[youtube https://www.youtube.com/watch?v=ezNX6XYozIc]

In this first part of a two-part podcast, @TimothyChou discusses the Internet of Things landscape. He lays out how the internet has always been about an internet of things rather than an internet of people, and sheds light on IoT across the themes of things, connect, collect, learn, and do workflows. He builds an interesting case for achieving precision as a step toward optimality.

 

Timothy’s Recommended Read:
Life 3.0: Being Human in the Age of Artificial Intelligence by Max Tegmark http://amzn.to/2Cidyhy
Zone to Win: Organizing to Compete in an Age of Disruption Paperback by Geoffrey A. Moore http://amzn.to/2Hd5zpv

Podcast Link:
iTunes: http://math.im/itunes
GooglePlay: http://math.im/gplay

Timothy’s BIO:
Timothy Chou’s career spans academia, successful (and not so successful) startups, and large corporations. He was one of only a few people to hold the President title at Oracle. As President of Oracle On Demand he grew the cloud business from its very beginning; today that business is over $2B. He wrote about the move of applications to the cloud in 2004 in his first book, “The End of Software”. Today he serves on the board of Blackbaud, a nearly $700M vertical application cloud service company.

After earning his PhD in EE at the University of Illinois he went to work for Tandem Computers, one of the original Silicon Valley startups. Had he understood stock options he would have joined earlier. He’s invested in and been a contributor to a number of other startups, some you’ve heard of like Webex, and others you’ve never heard of but were sold to companies like Cisco and Oracle. Today he is focused on several new ventures in cloud computing, machine learning and the Internet of Things.

About #Podcast:
The #FutureOfData podcast is a conversation starter that brings leaders, influencers, and leading practitioners onto the show to discuss their journeys in creating the data-driven future.

Wanna Join?
If you or anyone you know wants to join in,
Register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor?
Email us @ info@analyticsweek.com

Keywords:
#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

Originally Posted at: @TimothyChou on World of #IOT & Its #Future Part 1 #FutureOfData #Podcast by admin

20 Best Practices for Customer Feedback Programs: Applied Research

Below is the final installment of the 20 Best Practices for Customer Feedback Programs. Today’s post covers best practices in Applied Research.

Figure 5. Common types of linkages among disparate data sources

Applied Research Best Practices

Customer-focused research using customer feedback data can provide additional insight into the needs of the customer base and increase the overall value of the customer feedback program. This research extends well beyond the information gained from typical reporting tools that summarize customer feedback with basic descriptive statistics.

Loyalty leaders regularly conduct applied research using their customer feedback data. Typical research projects include creating customer-centric business metrics, building incentive compensation programs around customer metrics, and establishing training criteria that have a measured impact on customer satisfaction. Sophisticated research programs require advanced knowledge of research methods and statistics. Deciphering signal from noise in the data requires more than the inter-ocular test (eyeballing the data).

Figure 6. Data model for financial linkage analysis

Loyalty leaders link their customer feedback data to other data sources (see Figure 5 for financial, operational, and constituency linkages). Once the data are merged (see Figure 6 for data model for financial linkage), analysis can be conducted to help us understand the causes (operational, constituency) and consequences (financial) of customer satisfaction and loyalty. Loyalty leaders can use the results of these types of studies to:

  1. Support the business case for the customer feedback program (financial linkage)
  2. Identify objective, operational metrics that impact customer satisfaction and manage employee performance using these customer-centric metrics (operational linkage)
  3. Understand how employees and partners impact customer satisfaction to ensure proper employee and partner relationship management (constituency linkage)
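
To make the linkage idea concrete, here is a minimal, illustrative sketch in Python/pandas that joins a hypothetical customer feedback extract to a financial extract and checks whether satisfaction tracks revenue and renewal. The file and column names are assumptions for the example, not part of the original study.

```python
import pandas as pd

# Hypothetical extracts; the file and column names are illustrative assumptions.
surveys = pd.read_csv("customer_feedback.csv")     # customer_id, satisfaction (0-10)
financials = pd.read_csv("customer_revenue.csv")   # customer_id, annual_revenue, renewed (0/1)

# Merge the disparate sources on a shared customer key, in the spirit of the
# data model in Figure 6 (feedback and financial records joined per customer).
linked = surveys.merge(financials, on="customer_id", how="inner")

# Consequence side of the linkage: does satisfaction track revenue?
print(linked[["satisfaction", "annual_revenue"]].corr(method="pearson"))

# Segment view: renewal rate for top-box respondents (9-10) vs. everyone else.
linked["top_box"] = linked["satisfaction"] >= 9
print(linked.groupby("top_box")["renewed"].mean())
```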

A list of best practices in Applied Research appears in Table 6.

Table 6. Best Practices in Applied Research
Best practices and the specifics:

15. Ensure results from customer feedback collection processes are reliable, valid, and useful. Conduct a validation study of the customer feedback program. Verify the reliability, validity, and usefulness of customer feedback metrics to ensure you are measuring the right things. This assessment needs to be one of the first research projects conducted to support (and dispute any challenges regarding) the use of these customer metrics to manage the company. This research will help you create summary statistics for use in executive reporting and company dashboards; summary scores are more reliable and provide a better basis for business decisions than individual survey questions alone (a minimal reliability sketch follows this table).
16. Identify linkage between customer feedback metrics and operational metrics. Demonstrate that operational metrics are related to customer feedback metrics so that these operational metrics can be used to manage employees. Additionally, because of their reliability and specificity, these operational metrics are good candidates for use in employee incentive programs.
17. Regularly conduct applied customer-focused research. Build a comprehensive research program using the customer-centric metrics (and other business metrics) to gain deep insight into business processes. Customer feedback can be used to improve all phases of the customer lifecycle (marketing, sales, and service).
18. Identify linkage between customer feedback metrics and business metrics. Illustrate that financial metrics (e.g., profit, sales, and revenue) are related to customer feedback metrics. Oftentimes, this type of study can be used as a business case to demonstrate the value of the customer feedback program.
19. Identify linkage between customer feedback metrics and other constituencies’ attitudes. Identify factors of constituency attitudes (e.g., employee and partner satisfaction) that are linked to customer satisfaction/loyalty. Use these insights to properly manage employee and partner relationships to ensure high customer loyalty. Surveying all constituencies in the company ecosystem helps ensure all parties are focused on the customers and their needs.
20. Understand customer segments using customer information. Compare customer groups to identify key differences among groups on customer feedback metrics (e.g., satisfaction and loyalty). This process helps identify best practices internally among customer segments.
Copyright © 2011 Business Over Broadway
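
Best practice 15 notes that summary scores are more reliable than individual survey questions. As a minimal sketch of one common reliability check, here is a small Python function for Cronbach's alpha (see the Cronbach, 1951 reference below) applied to a made-up matrix of item ratings; the data and scale are purely illustrative.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal-consistency reliability for an (n_respondents, n_items) rating matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    # alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical ratings from five respondents on three loyalty questions (0-10 scale).
ratings = np.array([
    [9, 8, 9],
    [7, 6, 7],
    [10, 9, 10],
    [4, 5, 3],
    [8, 8, 7],
])
print(round(cronbach_alpha(ratings), 2))
```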

Summary

Loyalty leaders are excellent examples of customer-centric companies. Compared to their loyalty lagging counterparts, loyalty leading companies embed customer feedback throughout the entire company, from top to bottom. Loyalty leaders use customer feedback to set the vision and manage their business; they also integrate the feedback into daily business processes and communicate all processes, goals and results of the customer program to the entire company. Finally, they integrate different business data (operational, financial, customer feedback), to reveal deep customer insights through in-depth research.

Take the Customer Feedback Programs Best Practices Survey

You can take the best practices survey to receive free feedback on your company’s customer feedback program. This self-assessment survey assesses the extent to which your company adopts best practices throughout its program. Go here to take the free survey: http://businessoverbroadway.com/resources/self-assessment-survey.

References

Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297-334.

Hayes, B.E. (2011). Lessons in loyalty. Quality Progress, March, 24-31.

Hayes, B.E., Goodden, R., Atkinson, R., Murdock, F. & Smith, D. (2010). Where to Start: Experts weigh in on what all of us can learn from Toyota’s challenges. Quality Progress, April, 16-23.

Hayes, B. E. (2009). Beyond the ultimate question: A systematic approach to improve customer loyalty. Quality Press. Milwaukee, WI.

Hayes, B. E. (2008a). Measuring customer satisfaction and loyalty: Survey design, use and statistical analysis methods (3rd ed.). Quality Press. Milwaukee, WI.

Hayes, B. E. (2008b). Customer loyalty 2.0: The Net Promoter Score debate and the meaning of customer loyalty, Quirk’s Marketing Research Review, October, 54-62.

Hayes, B. E. (2008c). The true test of loyalty. Quality Progress. June, 20-26.

Keiningham, T. L., Cooil, B., Andreassen, T.W., & Aksoy, L. (2007). A longitudinal examination of net promoter and firm revenue growth. Journal of Marketing, 71 (July), 39-51.

Morgan, N.A. & Rego, L.L. (2006). The value of different customer satisfaction and loyalty metrics in predicting business performance. Marketing Science, 25(5), 426-439.

Nunnally, J. M. (1978). Psychometric Theory, Second Edition. New York, NY. McGraw-Hill.

Reichheld, F. F. (2003). The One Number You Need to Grow. Harvard Business Review, 81 (December), 46-54.

Reichheld, F. F. (2006). The ultimate question: driving good profits and true growth. Harvard Business School Press. Boston.

 

 

Originally Posted at: 20 Best Practices for Customer Feedback Programs: Applied Research

Unmasking the Problem with Net Scores and the NPS Claims

I wrote about net scores last week and presented evidence showing that net scores are ambiguous and unnecessary. Net scores are created by taking the difference between the percent of “positive” scores and the percent of “negative” scores. Net scores were made popular by Fred Reichheld and Satmetrix in their work on customer loyalty measurement. Their Net Promoter Score is the difference between the percent of “promoters” (ratings of 9 or 10) and the percent of “detractors” (ratings of 0 through 6) on the question, “How likely would you be to recommend <company> to your friends/colleagues?”

This resulting Net Promoter Score is used to gauge the level of loyalty for companies or customer segments. In my post, I presented what I believe to be sound evidence that mean scores and top/bottom box scores are much better summary indices than net scores. Descriptive statistics like the mean and standard deviation provide important information that describe the location and spread of the distribution of responses. Also, top/bottom box scores provide precise information about the size of customer segments. Net scores do neither.
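
To make the comparison concrete, here is a small sketch showing how the Net Promoter Score, the mean, and top/bottom-box percentages are each computed from the same set of 0-10 likelihood-to-recommend responses; the ratings are made up for illustration.

```python
from statistics import mean

# Hypothetical 0-10 responses to the likelihood-to-recommend question.
ratings = [10, 9, 9, 8, 7, 7, 6, 5, 3, 10]

promoters = sum(r >= 9 for r in ratings) / len(ratings)   # ratings of 9 or 10
detractors = sum(r <= 6 for r in ratings) / len(ratings)  # ratings of 0 through 6

nps = (promoters - detractors) * 100   # net score: the difference of two percentages
top_box = promoters * 100              # size of the promoter segment
bottom_box = detractors * 100          # size of the detractor segment

print(f"NPS: {nps:.0f}, mean: {mean(ratings):.1f}, "
      f"top box: {top_box:.0f}%, bottom box: {bottom_box:.0f}%")
```

Note that the mean and the box percentages describe the distribution directly, while the net score collapses them into a single difference that many different distributions could produce.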

Rob Markey, co-author (along with Fred Reichheld) of the book The Ultimate Question 2.0, tweeted about last week’s blog post.

Rob Markey's Tweet

I really am unclear about how Mr. Markey believes my argument is supporting (in CAPS, mind you) the NPS point of view. I responded to his tweet but never received a clarification from him.

So, I present this post as an open invitation for Mr. Markey to explain how my argument regarding the ambiguity of the NPS supports their point of view.

One More Thing

I never deliver arguments shrouded behind a mask of criticism. While my analyses focused on the NPS, my argument against net scores (difference scores) applies to any net score; I just happened to have data on the recommend question, a common question used in customer surveys. In fact, I even ran the same analyses (e.g., comparing means to net scores) on other customer loyalty questions (e.g., overall satisfaction, likelihood to buy), but I did not present those results because they were highly redundant with what I found using the recommend question. The problem of difference scores applies to any customer metric.

I have directly and openly criticized the research on which the NPS is based in my blog posts, articles, and books. I proudly stand behind my research and critique of the Net Promoter Score. Other mask-less researchers/practitioners have also voiced concern about the “research” on which the NPS is based. See Vovici’s blog post for a review. Also, be sure to read Tim Keiningham’s interview with Research Magazine in which he calls the NPS claims “nonsense”. Yes. Nonsense.

Just to be clear, “Nonsense” does not mean “Awesome.”

Source: Unmasking the Problem with Net Scores and the NPS Claims by bobehayes

@JohnNives on ways to demystify AI for enterprise #FutureOfData

[youtube https://www.youtube.com/watch?v=daiVHrsZQMU]

@JohnNives on ways to demystify AI for enterprise #FutureOfData

Youtube: https://www.youtube.com/watch?v=daiVHrsZQMU
iTunes: http://math.im/itunes

In this podcast @JohnNives discusses ways to demystify AI for the enterprise. He shares his perspective on how businesses should engage with AI and some of the best practices and considerations for adopting AI in their strategic roadmap. This podcast is great for anyone seeking to learn about ways to adopt AI across the enterprise landscape.

John’s Recommended Listen:
FutureOfData Podcast http://math.im/itunes
War and Peace by Leo Tolstoy (Author), Frederick Davidson (Narrator), Blackstone Audio, Inc. (Publisher) https://amzn.to/2w7ObkI

Podcast Link:
iTunes: http://math.im/itunes
GooglePlay: http://math.im/gplay

John’s BIO:
Jean-Louis (John) Nives serves as Chief Digital Officer and the Global Chair of the Digital Transformation practice at N2Growth. Prior to joining N2Growth, Mr. Nives was at IBM Global Business Services, within the Watson and Analytics Center of Competence. There he worked on Cognitive Digital Transformation projects related to Watson, Big Data, Analytics, Social Business and Marketing/Advertising Technology. Examples include CognitiveTV and the application of external unstructured data (social, weather, etc.) for business transformation.
Prior relevant experience includes executive leadership positions at Nielsen, IRI, Kraft and two successful advertising technology acquisitions (Appnexus and SintecMedia). In this capacity, Jean-Louis combined information, analytics, and technology to create significant business value in transformative ways.
Jean-Louis earned a Bachelor’s Degree in Industrial Engineering from University at Buffalo and an MBA in Finance and Computer Science from Pace University. He is married with four children and lives in the New York City area.

About #Podcast:
The #FutureOfData podcast is a conversation starter that brings leaders, influencers, and leading practitioners onto the show to discuss their journeys in creating the data-driven future.

Wanna Join?
If you or anyone you know wants to join in,
Register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor?
Email us @ info@analyticsweek.com

Keywords:
#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

Source: @JohnNives on ways to demystify AI for enterprise #FutureOfData by admin

2017 Trends in Data Modeling

The projected expansion of the data ecosystem in 2017 is creating substantial, systemic challenges for organizations attempting to exploit the most effective techniques available for maximizing data utility.

The plenitude of cognitive computing options, cloud paradigms, data science, and mobile technologies for big data has demonstrated business value in a multitude of use cases. Pragmatically, however, their inclusion alongside conventional data management processes poses substantial questions on the back end pertaining to data governance and, more fundamentally, to data modeling.

Left unchecked, these concerns could potentially compromise any front-end merit while cluttering data-driven methods with unnecessary silos and neglected data sets. The key to addressing them lies in the implementation of swiftly adjustable data models which can broaden to include the attributes of the constantly changing business environments in which organizations compete.

According to TopQuadrant Executive VP and Director of TopBraid Technologies Ralph Hodgson, the consistency and adaptability of data modeling may play an even more critical role for the enterprise today:

“You have physical models and logical models, and they make their way into different databases from development to user acceptance into production. On that journey, things change. People might change the names of some of the columns of some of those databases. The huge need is to be able to trace that through that whole assembly line of data.”

Enterprise Data Models
One of the surest ways to create a flexible enterprise model for a top-down approach to the multiple levels of modeling Hodgson denoted is to use the linked-data approach built on semantic standards. Although there are other means of implementing enterprise data models, this approach has the advantage of being based on uniform standards applicable to all data, which quickly adjust to include new requirements and use cases. Moreover, it has the added benefit of linking all data on an enterprise knowledge graph which, according to Franz CEO Jans Aasman, is one of the dominant trends to impact the coming year. “We don’t have to even talk about it anymore,” Aasman stated. “Everyone is trying to produce a knowledge graph of their data assets.”
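
As a toy illustration of the linked-data approach described above, the following sketch builds a tiny knowledge graph with Python's rdflib; the namespace, classes, and triples are invented for the example and are not drawn from any vendor's product.

```python
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/enterprise/")  # assumed namespace for the sketch

g = Graph()
g.bind("ex", EX)

# Model two data assets and link them through a shared business concept.
g.add((EX.CustomerTable, RDF.type, EX.DataAsset))
g.add((EX.BillingFeed, RDF.type, EX.DataAsset))
g.add((EX.CustomerTable, EX.describes, EX.Customer))
g.add((EX.BillingFeed, EX.describes, EX.Customer))
g.add((EX.Customer, RDFS.label, Literal("Customer")))

# Ask the graph which assets describe the Customer concept.
for asset in g.subjects(EX.describes, EX.Customer):
    print(asset)
```

Because every statement uses the same uniform triple structure, new sources and new requirements can be added by asserting more triples rather than redesigning a schema.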

The merit of a uniform data model for multiple domains throughout the enterprise is evinced in Master Data Management platforms as well; one can argue the linked-data approach of ontological models merely extends that concept throughout the enterprise. In both cases, organizations are able to avoid situations in which “they spend so much time trying to figure out what the data model looks like and how do we integrate these different systems together so they can talk,” Stibo Systems Director of Vertical Solutions Shahrukh Arif claimed. “If you have it all in one platform, now you can actually realize that full value because you don’t have to spend so much time and money on the integrations and data models.”

Data Utility Models
The consistency of comprehensive approaches to data modeling are particularly crucial for cloud-based architecture or for incorporating data external to the enterprise. Frequently, organizations may encounter situations in which they must reconcile differences in modeling and metadata when attaining data from third-party sources. They can address these issues upfront by creating what DISCERN Chairman and CEO Harry Blount termed a “data utility model”, in which “all of the relevant data was available and mapped to all of the relevant macro-metadata, a metamodel I should say, and you could choose which data you want” from the third party in accordance with the utility model. Actually erecting such a model requires going through the conventional modeling process of determining business requirements and facilitating them through IT—which organizations can actually have done for them by competitive service providers. “Step one is asking all the right questions, step two is you need to have a federated, real-time data integration platform so you can take in any data in any format at any time in any place and always keep it up to date,” Blount acknowledged. “The third requirement is you need to have a scalable semantic graph structure.”

Relational Data Modeling (On-Demand Schema)
Data modeling in the relational world is increasingly influenced by the modeling techniques associated with contemporary big data initiatives. Redressing the inherent modeling disparities between the two largely means accounting for semi-structured and unstructured data in relational environments primarily designed for structured data. Organizations can hurdle this modeling issue by means of file formats that derive schema on demand. Options such as JSON and Avro are ideal for those who “want what is modeled in the big data world to align with what they have in their relational databases so they can do analytics held in their main databases,” Hodgson remarked.

One of the boons of utilizing Avro is the complete traceability it provides for data in relational settings, even when such data originated from more contemporary unstructured sources associated with big data. The Avro format, and other files in this vein, allow modelers to reconcile relational schema requirements with the lack of such schema intrinsic to most big data. According to Hodgson, Avro “still has the ontological connection, but it still talks in terms of property values and columns. It’s basically a table in the same sense you find in a spreadsheet. It’s that kind of table but the columns all align with the columns in a relational database, and those columns can be associated with a logical model which need not be an entity-relationship model. It can be an ontology.”
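
Here is a minimal sketch of that idea using the fastavro library: an Avro schema whose named, typed fields mirror relational columns, so records written from a big data pipeline carry their schema with them and can be traced back to the tables they align with. The schema and record are illustrative assumptions, not a reference to any specific system mentioned above.

```python
from io import BytesIO
from fastavro import parse_schema, reader, writer

# Avro record schema whose fields mirror columns in a relational customer table.
schema = parse_schema({
    "name": "Customer",
    "type": "record",
    "fields": [
        {"name": "customer_id", "type": "long"},
        {"name": "segment", "type": "string"},
        {"name": "lifetime_value", "type": "double"},
    ],
})

records = [{"customer_id": 1, "segment": "enterprise", "lifetime_value": 1250.0}]

buffer = BytesIO()
writer(buffer, schema, records)   # the schema travels with the data
buffer.seek(0)
for record in reader(buffer):     # the schema is recovered on read
    print(record)
```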

Predictive Models
Predictive models have been widely impacted by cognitive computing methods and other aspects of data science, although these two realms of data management are not necessarily synonymous with classic statistically trained predictive models. Still, the influx of algorithms associated with various means of cognitive computing is paramount to the creation of predictive models that demonstrate their full utility on unstructured big data sets at high velocities. Organizations can access entire libraries of machine learning and deep learning models from third-party vendors through the cloud and either readily deploy them with their own data or adapt them. “As a platform, we allow customers to build their own models or extend our models in service of their own specific needs,” indico Chief Customer Officer Vishal Daga said.

The result is not only a dramatic reduction in the overall cost, labor, and salaries associated with hard-to-find data scientists needed to leverage cognitive computing techniques for predictive models, but also a degree of personalization, facilitated by the intelligent algorithms involved, that enables organizations to tailor those models to their own particular use cases. Thus, AI-centered SaaS offerings effectively amount to a predictive-models-on-demand service based on some of the most relevant data-centric processes to date.

Enterprise Representation
The nucleus of data modeling's enduring relevance is the increasingly complicated data landscape, including cognitive computing and a bevy of external data sources heralded by the cloud and mobile technologies in big data quantities, and the need to structure data in a meaningful way. Modeling data is the initial step to gleaning its meaning and provides the basis for all of the different incarnations of data modeling, regardless of the particular technologies involved. However, there appears to be growing conviction about doing so on an enterprise-wide scale: “Knowing how data’s flowing and who it’s supporting, and what kind of new sources might make a difference to those usages, it’s all going to be possible when you have a representation of the enterprise,” Hodgson commented.

Adding further conviction to the value of enterprise data modeling is the analytic output facilitated by it. All-inclusive modeling techniques at the core of enterprise-spanning knowledge graphs appear well-suited for the restructuring of the data sphere caused by the big data disruption—particularly when paired with in-memory, parallel processing graph-aware analytics engines. “As modern data diversity and volumes grow, relational database management systems (RDBMS) are proving too inflexible, expensive and time-consuming for enterprises,” Cambridge Semantics VP of Engineering Barry Zane said. “Graph-based online analytical processing (GOLAP) will find a central place in everyday business by taking on data analytics challenges of all shapes and sizes, rapidly accelerating time-to-value in data discovery and analytics.”

 

Source

What Happens When You Put Hundreds of BI Experts in One Room?

Last week we wrapped up the second day of our global, two-day client conference: Eureka!. Our sold-out event brought together hundreds of business leaders and analytics professionals from around the globe to listen to thought-provoking presentations and engage in discussions about the evolution of the analytics industry.

You may be wondering why we chose to call our client conference “Eureka!”. I’m glad you asked. A “eureka moment” is an “aha!” moment, a moment where something clicks and finally makes sense. In hearing and sharing stories, experiences, and perspectives with industry veterans and peers, it was our hope that attendees experienced moments of surprise and enlightenment.

Unsurprisingly, some of the hottest topics at Eureka! were the shift to embedding analytics everywhere, the impact of AI and augmented analytics on businesses, and how to drive transformational change with analytics.

Eureka!

Embedding Analytics Everywhere

In his opening day keynote, Sisense CEO, Amir Orad, emphasized the importance of lowering the barrier to analytics and empowering everyone to use data to make decisions. Providing analytics to everyone, everywhere means catering to the different ways people understand data. This means moving beyond desktop dashboards and offering insights naturally throughout our lives.

Continuing with the non-traditional side of analytics, Amir pointed to three organizations using analytics in unique ways:

  1. Celestica, a global electronics manufacturer, leverages analytics to reduce its carbon footprint. Within just four months of implementing analytics, they saw a reduction of 1,041 metric tons of CO2e. That’s enough energy to power 110 homes for one full year!
  2. Skullcandy, the incredibly popular maker of headphones, earbuds, and other audio and wireless products, has used analytics in their business to virtually eliminate fraudulent returns.
  3. Indiana Donor Network, the organ and tissue donation network for the state of Indiana, has used analytics to increase skin donations by 70% and cornea donations by a whopping 224%.

Solidifying the need to embed analytics everywhere in order to transform industries was Sham Sokka of Philips, who spoke about revolutionizing patient care by delivering relevant data and analytics to the right individual at each stage of client care. “We fully believe in this concept of data democratization,” Sham said. “Not everyone is a data scientist so you want to have a platform that can serve simple data to a patient but complex data to an administrator. Getting the right data to the right person is super critical.”

AI and Augmented Analytics

There’s no doubt that artificial intelligence and augmented analytics are going to continue to impact every aspect of analytics – from data prep to insight discovery.

In her keynote, Jen Underwood of Impact Analytix, discussed the unprecedented pace of continuous technological change we’re currently witnessing. When organizations adopt augmented analytics, Jen said they see a multitude of benefits, which include:

  1. Empowering the masses: Rather than providing analytics for only around 30% of an organization, augmented analytics makes discovering insight easy enough for everyone.
  2. Saving time: Augmented data prep automates and accelerates the process, applies reinforcement learning while humans drive algorithms, and helps improve data quality for faster results.
  3. Revealing hidden patterns: Augmented analytics can find patterns in your data that a human might never detect – or detect when it’s too late – using manual techniques.
  4. Improving accuracy: With the ability to apply statistical significance, uncertainty, and risk model estimates, augmented analytics takes into account aspects of data prep and modeling that manual approaches may miss.

Joining in on the topic of artificial intelligence, Professor and Author Avi Goldfarb gave a keynote that had participants glued to their chairs. His session demonstrated how artificial intelligence will affect business, public policy, and society in virtually all fields. The point he drove home? Prediction isn’t useful unless you can do something with it. What’s useful about AI and prediction is the ability to take action and create a feedback loop – that’s where the competitive edge comes into play.

Transformational Change

Advancements in technology are great but it’s the changes they bring to organizations that make all the difference in the real world. In his session, Bill Janczak from Indiana Donor Network told his organization’s story of transformation through the implementation of analytics.

Eureka!

As a small organization with a small IT budget, Indiana Donor Network has a large mission – to help people during their time of need. Run traditionally like a non-profit, Indiana Donor Network realized that changing their behavior and adding in analytics was the missing piece to ensuring organs make it to the right place at the right time. Using analytics they were able to make some major, important changes:

  1. Within hours they can now catch errors and common data entry challenges that would normally take around 30-45 days to find. This led to improved matches for organ transplants.
  2. They are now able to monitor which donor outreach programs are successful and which are not in order to focus their activities and spend their resources on programs that actually drive more awareness and donor authorization so that more people can be helped in the long run.

We’ve Struck Gold!

The last two days were a whirlwind of bright ideas, futuristic visions, and practical applications of analytics to improve businesses around the globe. If the excitement in the room surrounding all of the technological transformations was any indication, I’d say the future for analytics is bright.

I’d like to extend a quick thank you to all of our speakers and customers for contributing to an awesome, fascinating, and fun event. Until next year!

Originally Posted at: What Happens When You Put Hundreds of BI Experts in One Room? by analyticsweek

Data Management Rules for Analytics

With analytics taking a central role in most companies’ daily operations, managing the massive data streams organizations create is more important than ever. Effective business intelligence is the product of data that is scrubbed, properly stored, and easy to find. When your organization uses raw data without proper management procedures, your results suffer.

The first step towards creating better data for analytics starts with managing data the right way. Establishing clear protocols and following them can help streamline the analytics process, offer better insights, and simplify the process of handling data. You can start by implementing these five rules to manage your data more efficiently.

1. Establish Clear Analytics Goals Before Getting Started

As the amount of data produced by organizations daily grows exponentially, sorting through terabytes of information can become problematic and reduce the efficiency of analytics. Such large data sets require significantly longer times to scrub and properly organize. For companies that deal with multiple high-bandwidth streams, having a clear line of sight toward business and analytics goals can help reduce inflows and prioritize relevant data.

It’s important to establish clear objectives for data and create parameters that filter out data points that are irrelevant or unclear. This facilitates pre-screening datasets and makes scrubbing and sorting easier by reducing white noise. Additionally, you can focus even more on measuring specific KPIs to further filter out the right data from the stream.
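
As an illustration, a pre-screening step might keep only the fields and rows tied to agreed-upon KPIs before anything reaches the analytics tool; the column names in this sketch are assumptions, not a prescription.

```python
import pandas as pd

# Assumed raw export with many columns, only some of which map to agreed KPIs.
raw = pd.read_csv("events_export.csv")

KPI_COLUMNS = ["order_id", "order_value", "channel", "fulfillment_days"]

# Keep only KPI-relevant fields and drop rows that cannot support those KPIs.
screened = (
    raw[KPI_COLUMNS]
    .dropna(subset=["order_id", "order_value"])
    .query("order_value > 0")
)
```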


2. Simplify and Centralize Your Data Streams

Another problem analytics suites face is reconciling disparate data from multiple streams. Organizations have internal, third-party, customer, and other data that must be considered as part of a larger whole instead of viewed in isolation. Leaving data as-is can be damaging to insights, as different sources may use unique formats or different styles.

Before allowing multiple streams to connect to your data analytics software, your first step should be establishing a process to collect data more centrally and unify it. This centralization not only makes it easier to feed data seamlessly into analytics tools, but also simplifies how users find and manipulate data. Consider how best to set up your data streams to reduce the number of sources and eventually produce more unified sets.
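
One lightweight way to picture that centralization step is to normalize each incoming source to a shared schema before it reaches the analytics tool. The sources, mappings, and field names below are invented for the sketch.

```python
import pandas as pd

# Each source arrives with its own column names and formats.
crm = pd.DataFrame({"AccountId": [1], "Created": ["2017-01-05"], "Spend": ["1,200"]})
web = pd.DataFrame({"account": [1], "created_at": ["2017-01-06"], "spend": [300.0]})

def normalize(frame: pd.DataFrame, mapping: dict) -> pd.DataFrame:
    """Rename to a shared schema and coerce types so sources can be combined."""
    out = frame.rename(columns=mapping)[["account_id", "created_at", "spend"]]
    out["created_at"] = pd.to_datetime(out["created_at"])
    out["spend"] = pd.to_numeric(out["spend"].astype(str).str.replace(",", ""))
    return out

unified = pd.concat([
    normalize(crm, {"AccountId": "account_id", "Created": "created_at", "Spend": "spend"}),
    normalize(web, {"account": "account_id", "created_at": "created_at", "spend": "spend"}),
], ignore_index=True)
```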

3. Scrub Your Data Before Warehousing

The endless stream of data raises questions about quality and quantity. While having more information is preferable, data loses its usefulness when it’s surrounded by noise and irrelevant points. Unscrubbed data sets make it harder to uncover insights, properly manage databases, and access information later.

Before worrying about data warehousing and access, consider the processes in place to scrub data to produce clean sets. Create phases that ensure data relevance is considered while effectively filtering out data that is not pertinent. Additionally, make sure the process is as automated as possible to reduce wasted resources. Implementing functions such as data classification and pre-sorting can help expedite the cleaning process.
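
A minimal, pandas-flavored sketch of such an automated scrubbing pass (deduplication, type coercion, and simple pre-classification) might look like the following; the rules and column names are placeholders.

```python
import pandas as pd

def scrub(frame: pd.DataFrame) -> pd.DataFrame:
    """Apply repeatable cleaning steps before anything is warehoused."""
    cleaned = (
        frame.drop_duplicates()
        .assign(
            email=lambda df: df["email"].str.strip().str.lower(),
            amount=lambda df: pd.to_numeric(df["amount"], errors="coerce"),
        )
        .dropna(subset=["email", "amount"])
    )
    # Simple pre-classification: flag records for routing instead of discarding them.
    cleaned["size_class"] = pd.cut(
        cleaned["amount"], bins=[0, 100, 1000, float("inf")],
        labels=["small", "medium", "large"],
    )
    return cleaned
```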

4. Establish Clear Data Governance Protocols

One of the biggest emerging issues facing data management is data governance. Because of the sensitive nature of many sources—consumer information, sensitive financial details, and so on—concerns about who has access to information are becoming a central topic in data management. Moreover, allowing free access to datasets and storage can lead to manipulation, mistakes, and deletions that could prove damaging.

It’s vital to establish clear and explicit rules about who can access data, when, and how. Creating tiered permission systems (read, read/write, admin) can help limit the exposure to mistakes and danger. Additionally, sorting data in ways that facilitate access to different groups can help manage data access better without the need to give free rein to all team members.
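
A tiered permission system can start as an explicit role-to-rights mapping enforced before any dataset is touched; the roles and checks below are illustrative, not a full governance implementation.

```python
from enum import Enum

class Role(Enum):
    READER = "read"
    ANALYST = "read_write"
    ADMIN = "admin"

PERMISSIONS = {
    Role.READER:  {"read"},
    Role.ANALYST: {"read", "write"},
    Role.ADMIN:   {"read", "write", "delete", "grant"},
}

def authorize(role: Role, action: str) -> None:
    """Raise if the role's tier does not include the requested action."""
    if action not in PERMISSIONS[role]:
        raise PermissionError(f"{role.name} may not {action}")

authorize(Role.READER, "read")      # allowed
# authorize(Role.READER, "delete")  # would raise PermissionError
```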

5. Create Dynamic Data Structures

Many times, storing data is reduced to a single database that limits how you can manipulate it. Static data structures are effective for holding data, but they are restrictive when it comes to analyzing and processing it. Instead, data managers should place a greater emphasis towards creating structures that encourage deeper analysis.

Dynamic data structures present a way to store real-time data that allows users to connect points better. Using three-dimensional databases, finding methods to reshape data rapidly, and creating more inter-connected data silos can help contribute to more agile business intelligence. Generate databases and structures that simplify accessing and interacting with data rather than isolating it.

The fields of data management and analytics are constantly evolving. For analytics teams, it’s vital to create infrastructures that are future-proofed and offer the best possible insights for users. By establishing best practices and following them as closely as possible, organizations can significantly enhance the quality of the insights their data produces.


Source

The Data Driven Road Less Traveled


To help better explain the topic, let me take a small detour and explain the conventional big-business paradox and why it is a threat in today's economy. Remember the days when large businesses were called 800-pound gorillas and small businesses only dreamt about touching their market share? That, in some sense, is the conventional big-business paradox. It is not true anymore. In an ever-connected world, with easy access to cutting-edge platforms and methodologies, even small businesses can reach disruptive technologies and methods. In fact, small businesses have an advantage: they can react quickly, act nimbly, and maintain better focus. So it is not a surprise that, every now and then, big businesses take blows from companies running against the conventional big-business paradox. So, what is wrong? Big organizations, more often than they should, run on their conventional ways. Sure, you could argue that scale and size make them slow; however, try explaining that to businesses like Amazon, Salesforce, and Google. In the current landscape, rapidly changing market and customer dynamics demand better ways to analyze evolving customer expectations and faster response times.

Besides getting their hands on the best talent in the market, large businesses should create ways to introduce another paradigm that helps them identify and understand changing customer expectations and technology shifts. I call it the Data Driven Enterprise.

Data never lies, never introduces bias, and never leaves anything to assumptions. Data-driven businesses have been proven time and again to be more sustainable businesses. In fact, the star products of big businesses are extensively monitored. What fails most businesses is their lack of attention to the nooks and crannies, where we first observe signs of changing customer expectations, preferences, and technology. That is why a centralized, data-driven approach is the way to go.

A data-driven framework is not complicated; it is a series of focused steps taken to achieve a data-focused enterprise. Think of it as an engine to rapidly validate hypotheses in a lean, iterative manner. Sounds cool? Yes, it is! More than ever before, businesses are aligning themselves toward becoming data-driven enterprises. Want more information?

I’ve written an ebook, which crossed 3k downloads last month. It lays out a series of easy-to-digest steps for building the thought leadership needed to take a big business down the data-driven innovation route. Please feel free to download the ebook at http://pxl.me/ddibook and let me know your thoughts.
Remember, it’s just the start of the discussion; together we all have to travel a long road to sustained business growth.

Originally Posted at: The Data Driven Road Less Traveled by d3eksha

Big Data Explained in Less Than 2 Minutes – To Absolutely Anyone

There are some things that are so big that they have implications for everyone, whether we want them to or not. Big Data is one of those concepts; it is completely transforming the way we do business and impacting most other parts of our lives.

It’s such an important idea that everyone from your grandma to your CEO needs to have a basic understanding of what it is and why it’s important.


What is Big Data?

“Big Data” means different things to different people and there isn’t, and probably never will be, a commonly agreed upon definition out there. But the phenomenon is real and it is producing benefits in so many different areas, so it makes sense for all of us to have a working understanding of the concept.

So here’s my quick and dirty definition:

The basic idea behind the phrase ‘Big Data’ is that everything we do is increasingly leaving a digital trace (or data), which we (and others) can use and analyse. Big Data therefore refers to that data being collected and our ability to make use of it.

I don’t love the term “big data” for a lot of reasons, but it seems we’re stuck with it. It’s basically a ‘stupid’ term for a very real phenomenon – the datafication of our world and our increasing ability to analyze data in a way that was never possible before.

Of course, data collection itself isn’t new. We as humans have been collecting and storing data since as far back as 18,000 BCE. What’s new are the recent technological advances in chip and sensor technology, the Internet, cloud computing, and our ability to store and analyze data, which have changed the quantity of data we can collect.

Things that have been a part of everyday life for decades — shopping, listening to music, taking pictures, talking on the phone — now happen more and more wholly or in part in the digital realm, and therefore leave a trail of data.

The other big change is in the kind of data we can analyze. It used to be that data fit neatly into tables and spreadsheets, things like sales figures and wholesale prices and the number of customers that came through the door.

Now data analysts can also look at “unstructured” data like photos, tweets, emails, voice recordings and sensor data to find patterns.

How is it being used?

As with any leap forward in innovation, the tool can be used for good or nefarious purposes. Some people are concerned about privacy, as more and more details of our lives are being recorded and analyzed by businesses, agencies, and governments every day. Those concerns are real and not to be taken lightly, and I believe that best practices, rules, and regulations will evolve alongside the technology to protect individuals.

But the benefits of big data are very real, and truly remarkable.

Most people have some idea that companies are using big data to better understand and target customers. Using big data, retailers can predict what products will sell, telecom companies can predict if and when a customer might switch carriers, and car insurance companies understand how well their customers actually drive.

It’s also used to optimize business processes. Retailers are able to optimize their stock levels based on what’s trending on social media, what people are searching for on the web, or even weather forecasts. Supply chains can be optimized so that delivery drivers use less gas and reach customers faster.

But big data goes way beyond shopping and consumerism. Big data analytics enable us to find new cures and better understand and predict the spread of diseases. Police forces use big data tools to catch criminals and even predict criminal activity, and credit card companies use big data analytics to detect fraudulent transactions. A number of cities are even using big data analytics with the aim of turning themselves into Smart Cities, where a bus would know to wait for a delayed train and where traffic signals predict traffic volumes and operate to minimize jams.

Why is it so important?

The biggest reason big data is important to everyone is that it’s a trend that’s only going to grow.

As the tools to collect and analyze the data become less and less expensive and more and more accessible, we will develop more and more uses for it — everything from smart yoga mats to better healthcare tools and a more effective police force.

And, if you live in the modern world, it’s not something you can escape. Whether you’re all for the benefits big data can bring, or worried about Big Brother, it’s important to be aware of the phenomena and tuned in to how it’s affecting your daily life.

What are your biggest questions about big data? I’d love to hear them in the comments below — and they may inspire future posts to address them.

To read the full article on Data Science Central, click here.

Source: Big Data Explained in Less Than 2 Minutes – To Absolutely Anyone

Bias: Breaking the Chain that Holds Us Back

Speaker Bio: Dr. Vivienne Ming, named one of 10 Women to Watch in Tech by Inc. Magazine, is a theoretical neuroscientist, entrepreneur, and author. She co-founded Socos Labs, her fifth company, an independent think tank exploring the future of human potential. Dr. Ming launched Socos Labs to combine her varied work with that of other creative experts and expand their impact on global policy issues, both inside companies and throughout our communities. Previously, Vivienne was a visiting scholar at UC Berkeley’s Redwood Center for Theoretical Neuroscience, pursuing her research in cognitive neuroprosthetics. In her free time, Vivienne has invented AI systems to help treat her diabetic son, predict manic episodes in bipolar sufferers weeks in advance, and reunite orphaned refugees with extended family members. She sits on the boards of numerous companies and nonprofits including StartOut, The Palm Center, Cornerstone Capital, Platypus Institute, Shiftgig, Zoic Capital, and SmartStones. Dr. Ming also speaks frequently on her AI-driven research into inclusion and gender in business. For relaxation, she is a wife and mother of two.

Distilled Blog Post Summary: Dr. Vivienne Ming’s talk at a recent Domino MeetUp delved into bias and its implications, including potential liabilities for algorithms, models, businesses, and humans. Dr. Ming’s evidence included first-hand knowledge fundraising for multiple startups, data analysis completed during her tenure as the Chief Scientist at Gild, as well as citing studies within data, economics, recruiting, and education. This blog post provides text and video clip highlights from the talk. The full video is available for viewing. If you are interested in viewing additional content from Domino’s past events, review the Data Science Popup Playlist. If you are interested in attending an event in-person, then consider the upcoming Rev.

Research, Experimentation, and Discovery: Core of Science

Research, experimentation, and discovery are at the core of all types of science, including data science. Dr. Ming kicked off the talk by indicating that “one of the powers of doing a lot of rich data work, there’s this whole range– I mean, there’s very little in this world that’s not an entree into”. While Dr. Ming provided detailed insights and evidence that pointed to the potential of rich data work throughout the talk, this blog post focuses on the implications and liabilities of bias within gender, names, and ethnic demographics. It also covers how bias isn’t solely a data or algorithm problem; it is a human problem. The first step to addressing bias is acknowledging that it exists.

Do You See the Chameleon? The Roots of Bias

Each one of us has biases and makes assessments based on those biases. Dr. Ming uses Johannes Stotter’s Chameleon to point out that “the roots of bias are fundamental and unavoidable”. Many people, when they see the image, see a chameleon. However, the image actually consists of two people covered in body paint, strategically positioned to look like a chameleon. In the video clip below, Dr. Ming indicates

“I cannot make an unbiased AI. There are no unbiased rats in the world. In a very basic sense, these systems are making decisions on their uncertainty, and the only rational way to do that is to act the best we can given the data. The problem is when you refuse to acknowledge there’s a problem with our bias and actually do something about it. And we have this tremendous amount of evidence that there is a serious problem, and it’s holding, not just small things back. But as I’m going to get to later, it’s holding us back from a transformed world, one that I think anyone can selfishly celebrate.”


Bias as the Pat on the Head (or the Chain) that Holds Us Back

While history is filled with moments when bias is not acknowledged as a problem, there are also moments when people addressed societally reinforced gender bias. Women have assumed male noms de plume to write epic novels, fight in wars, win judo championships, run marathons, and even, as Dr. Ming pointed out, create an all-women software company called Freelance Programmers in the 1960s. During the meetup, Dr. Ming indicated that Dame Stephanie “Steve” Shirley’s TED Talk, “Why do ambitious women have flat heads?”, helped her parse two distinctly different startup fundraising experiences that were grounded in gender bias.

Prior to Dr. Ming co-founding her current education technology company and obtaining her academic credentials, she dropped out of college and started a film company. When

“we started this company, and the funny thing is, despite having nothing, nothing that anyone should invest in– we didn’t have a script. We didn’t have talent. Literally, we didn’t even have talent. We didn’t have experience. We had nothing. We essentially raised what you might in the tech industry call a seed round after a few phone calls.”

However, raising funding was more difficult the second time, for her current company, despite having substantially more academic, technology, and business credentials. During one of the funding meetings with a small firm with 5 partners, Dr. Ming relayed how the last partner said “‘you should feel so proud of what you’ve built’. And at the time, I thought, oh, Jesus, at least one of these people is on our side. In fact, as we were leaving the room, he literally patted me on the head, which seemed a little strange.” This prompted Dr. Ming to consider how

“my credentials are transformed that second time. No one questioned us about the technology. They loved it. They questioned whether we know how to run a business. The product itself people loved versus a film. Everything the second time around should have been dramatically easier. Except the only real difference that I can see is that the first time I was a man and the second time I was a woman.“

This led Dr. Ming to understand what Stephanie Shirley meant by ambitious women having flat heads from all of the times they have been patted on the head. Dr. Ming relays that

“I’ve learned ever since as an entrepreneur is, as soon as it feels like they’re dealing with their favorite niece rather than me as a business person, then I know, I know that they simply are not taking me seriously. And all the PhD’s in the world doesn’t matter, all the past successes in my other companies doesn’t matter. You are just that thing to me. And what I’ve learned is, figure that out ahead of time. Don’t bother wasting days and hours, and prepping to pitch to people that simply are not capable of understanding who you are, but of course, in a lot of context, that’s all you’ve got.“

Dr. Ming also pointed out that gender bias manifested at an organization where she worked before and after her gender transition. She noted that when she went into work after her transition,

“That’s the last day anyone ever asked me a math question, which is kind of funny. I do happen to also have a PhD in psychology. But somehow one day to the next, I didn’t forget how to do convergence proofs. I didn’t forget what it meant to invent algorithms. And yet that was how people dealt with it, people who knew before. You see how powerful the change is to see someone in a different skin.”

This experience is similar to Dame Shirley’s, who, in order to start what would become a multi-billion dollar software company in the 1960s, “started to challenge the conventions of the time, even to the extent of changing my name from “Stephanie” to “Steve” in my business development letters, so as to get through the door before anyone realized that he was a she”. Dame Shirley subverted bias during a time when she, as a female, was prevented from working on the stock exchange, driving a bus, or, “Indeed, I couldn’t open a bank account without my husband’s permission”. Yet, despite the bias, Dame Shirley remarked

“who would have guessed that the programming of the black box flight recorder of Supersonic Concord would have been done by a bunch of women working in their own homes” ….”And later, when it was a company valued at over three billion dollars, and I’d made 70 of the staff into millionaires, they sort of said, “Well done, Steve!”

While it is no longer the 1960s, bias implications and liabilities are still present. Yet we in data science are able to access data and have open conversations about bias as the first step toward avoiding inaccuracies, training data liabilities, and model liabilities within our data science projects and analyses. What if, in 2018, people built and trained models based on the assumption that humans with XY chromosomes lacked the ability to code because they only reviewed and used data from Dame Shirley’s company in the 1960s? Consider that for a moment, as that is what happened to Dame Shirley, Dr. Ming, and many others. Bias implications and liabilities have real-world consequences. Being aware of the bias and then addressing it moves the industry forward toward breaking the chain that holds research, data science, and us back.

Say My Name: Biased Perceptions Uncovered

When Dr. Ming was the Chief Scientist at Gild, a reporter called her for a quote on the Jose Zamora story. This also led to Dr. Ming’s research for her upcoming book, “The Tax of Being Different”. Dr. Ming relayed anecdotes during the meetup (see video clip) and has also written about this research for the Financial Times:

“To calculate the tax on being different I made use of a data set of 122m professional profiles collected by Gild, a company specialising in tech for hiring and HR, where I worked as chief scientist. From that data, I was able to compare the career trajectories of specific populations by examining the actual individuals. For example, our data set had 151,604 people called “Joe” and 103,011 named “José”. After selecting only for software developers we still had 7,105 and 4,896 respectively, real people writing code for a living. Analysing their career trajectories I found that José typically needs a masters degree or higher compared to Joe with no degree at all to be equally likely to get a promotion for the same quality of work. The tax on being different is largely implicit. People need not act maliciously for it to be levied. This means that José needs six additional years of education and all of the tuition and opportunity costs that education entails. This is the tax on being different, and for José that tax costs $500,000-$1m over his lifetime.” (Financial Times)


While this particular example focuses on ethnicity-oriented demographic bias, during the meetup discussion Dr. Ming referenced quite a few research studies regarding name bias. In case Domino Data Science Blog readers do not have the research she cites on hand, published studies on name bias include work on names that suggest male gender, “noble-sounding” surnames in Europe, and names that are perceived as “easy to pronounce”, which also has implications for how organizations choose their names. Yet Dr. Ming did not limit the discussion to bias within gender and naming; she also dived right into how demographic bias impacts image classification, particularly with respect to ethnicity.

Bias within Image Classification: Missing Uhura and Not Unlocking your iPhone X

Before Dr. Ming was the Chief Scientist at Gild, she saw Paul Viola’s face recognition algorithm demo. In that demo, she noticed that the algorithm didn’t detect Uhura. Viola indicated that this was a problem and that it would be addressed. Fast forward years later to when Dr. Ming was the Chief Scientist at Gild: she relayed how she received “a call from The Wall Street Journal [and WSJ asked her] ‘So Google’s face recognition system just labeled a black couple as gorillas. Is AI racist?’ And I said, ‘Well, it’s the same as the rest of us. It depends on how you raise it.’”

For background context, in 2015 Google released a new photo app, and a software developer discovered that the app labeled two people of color as “gorillas”; Yonatan Zunger was the Chief Architect for Social at Google at the time. Since leaving Google, Zunger has provided candid commentary about bias. Then, in January 2018, Wired ran a follow-up story regarding the 2015 event. In the article, Wired tested Google Photos and found that the labels for gorillas, chimpanzees, chimp, and monkey “were censored from searches and image tags after the 2015 incident”. This was confirmed by Google. Wired also ran a test to assess how people were represented by conducting searches for “African American”, “black man”, “black woman”, and “black person”, which returned “an image of a grazing antelope” (on the search “African American”) as well as “black-and-white images of people, correctly sorted by gender but not filtered by race”. This points to the continued challenges involved in addressing bias in machine learning and models, bias that also has implications beyond social justice.

As Dr. Ming pointed out in the meetup video clip below, facial recognition is also built into the iPhone X. The face recognition feature has potential challenges in recognizing global faces of color. Yet, despite all of this, Dr. Ming indicates “but what you have to recognize, none of these are algorithm problems. These are human problems.” Humans made decisions to build algorithms, build models, train models, and roll out products that include bias that has wide implications.


Conclusion

Introducing liability into an algorithm or model via bias isn’t solely a data or algorithm problem, it is a human problem. Understanding that it is a problem is the first step in addressing it. In the recent Domino Meetup, Dr. Ming relayed how

“AI is an amazing tool, but it’s just a tool. It will never solve your problems for you. You have to solve them. And particularly in the work I do, there are only ever messy human problems, and they only ever have messy human solutions. What’s amazing about machine learning is that once we found some of those issues, we can actually use it to reach as many people as possible, to make this essentially cost-effective, to scale that solution to everyone. But if you think some deep neural network is going to somehow magically figure out who you want to hire when you have not been hiring the right people in the first place, what is it you think is happening in that data set?”

Domino continually curates and amplifies ideas, perspectives, and research to contribute to discussions that accelerate data science work. The full video of Dr. Ming’s talk at the recent Domino MeetUp is available. There is also an additional technical talk that Dr. Ming gave at the Berkeley Institute of Data Science on “Maximizing Human Potential Using Machine Learning-Driven Applications”. If you are interested in similar content to these talks, please feel free to visit the Domino Data Science Popup Playlist or attend the upcoming Rev.

The post Bias: Breaking the Chain that Holds Us Back appeared first on Data Science Blog by Domino.

Source: Bias: Breaking the Chain that Holds Us Back