How To Turn Your Data Into Content Marketing Gold

With every brand out there becoming a publisher, it’s harder than ever to make your content stand out. Each day, you have a choice. You can play it safe and do what everyone else is doing: re-blog the same industry studies and curate uninspired listicles. Or you can be original and craft a story that only you can tell. The good news for most of you: there is content gold right under your nose. Used correctly, it will enable you to create truly compelling content that is not only shareable but will also set you apart from your peers.

The gold is your own data.

This data is often used to inform your business strategies and tactics, such as assessing which headlines performed better or what time of day you should tweet. And while those things are important, we’re talking about a close cousin of those efforts. This is about looking at the data your team has gathered and analyzed, and identifying original insights that you can craft into engaging stories to fuel your content marketing.

Visage, a platform meant to help marketers create branded visual content, conducted a survey of 504 marketers to see just how well they are taking advantage of this opportunity for original data storytelling. 75% of those surveyed are directly responsible for creating content, and 75% work in a company with 10 or fewer people in their marketing department. Here’s what they found out:

1. Everyone is creating a lot of content

Most organizations (73%) publish original content on at least a weekly basis, and many (21%) are publishing content multiple times per day. Most brands are doing this because they know that if you aren’t sharing your latest thinking with the digital world (or at least being entertaining), your brand doesn’t exist for most people outside of your family.


2. It’s still not enough

Relatively few modern marketers believe that their organization creates enough original content. The fact is, as anyone who has rescheduled dates on an editorial calendar knows, getting into a publishing rhythm is hard. We can get enamored or overwhelmed by other brands who we see publishing a high volume of content. In such a state, it’s easy to play copycat and fall into regurgitating news and curating stories covered by other people. But your real challenge is differentiating from competitors and earning the trust of a potential customer. So you need to use your limited resources to give your content a shot at standing out and being remembered. Otherwise, it will be just one little drop flowing past in the social river.


3. Marketers are sitting on gold

Visage’s survey found that 41% of organizations conduct original market research more than once per year. A quick survey or poll is one powerful way to create a fresh, original story that hasn’t been told before. Start with a small experiment aimed at helping you understand your own market better, and keep your ideal customer profile in mind as you write your questions. The advantage of this approach is that you can structure your data collection up front and save yourself the time and money associated with cleaning up and organizing outside data. Finally, format your questions to gather the information and answers that you know your audience will find valuable.


4. Marketers aren’t using their data to its full potential

The biggest shocker was that 60% of respondents claim to be sitting on interesting data, but only 18% are publishing it externally. There are many valid reasons to keep your internal data private (e.g., security, competitive advantage), but you don’t need to take an all-or-nothing approach. For example, there’s a big opportunity to share aggregated trends and behaviors. Spotify does this with their music maps, and OkCupid does this with their OkTrends blog.


5. They see the opportunity

Brand marketers aren’t just hoarding this gold. 82% of companies said it was important or extremely important that their marketing team learn to tell better data stories. You might also notice the growing number of situations that require you to communicate with data in your own work, even in internal reports and presentations.


6. The struggle is real

So, if so many marketers are sitting on interesting data and think it is important to craft original stories from it – why isn’t it happening? As the survey showed, many marketers don’t feel they have the skills or tools to craft the story from their data. Only 34% feel their teams have above average data literacy. Even when the data is cleaned, analyzed and ready to be visualized, modern marketers still have a hard job to do. Your audience needs context, and a strong narrative is a key ingredient of communicating with data. Often, the most successful data stories come as a result of combining powerful talents – the journalist working with a graphic designer, or a content marketer working closely with a data analyst. Get both sides of the brain firing in your content creation, even if you need to combine forces.


7. How to get started

Like any new marketing initiative, success in crafting original data stories as a means of differentiating your brand will take time and money. Start where you are and do what you can, even if it feels microscopic at first. If the prospect of getting rolling with your own data seems overwhelming, get some practice with public data available from credible sources like the Census Bureau or Pew Research. The cool news is that it’s easier than ever to get started with a plethora of great tools and educational material on the web.

Data storytelling is a skill that modern marketers can and must learn. If you are committed to creating original content that makes your brand shine, consider the precious gold insights that are ready to be mined from your data to provide tangible value to your audience.

Originally published on NewsCred.

Source: How To Turn Your Data Into Content Marketing Gold

Measuring The Customer Experience Requires Fewer Questions Than You Think

Figure 1. Three Phases of the Customer Lifecycle

A formal definition of customer experience, taken from Wikipedia, states that customer experience is: “The sum of all experiences a customer has with a supplier of goods or services, over the duration of their relationship with that supplier.” In practical terms, customer experience is the customer’s perception of, and attitude about, different areas of your company or brand across the entire customer lifecycle (see Figure 1).

We know that the customer experience has a large impact on customer loyalty. Customers who are satisfied with the customer experience buy more, recommend you and are easier to up/cross-sell than customers who are dissatisfied with the customer experience. Your goal for the customer relationship survey, then, is to ensure it includes customer experience questions asking about important customer touchpoints.

Table 1. General and Specific Customer Experience Questions. In practice, the survey asks customers to rate their satisfaction with each area.

Customer Experience Questions

Customer experience questions typically account for most of the questions in customer relationship surveys. There are two types of customer experience questions: general and specific. General questions ask customers to rate broad customer touchpoints. Specific customer experience questions focus on particular aspects of those broader touchpoints. As you see in Table 1, general customer experience questions might ask customers to rate their satisfaction with 1. Product Quality, 2. Account Management, 3. Technical Support and so on. Specific customer experience questions ask customers to rate their satisfaction with detailed aspects of each broader customer experience area.

I typically see both types of questions in customer relationship surveys for B2B companies. The general experience questions are presented first and are then followed up with specific experience questions. I have seen customer relationship surveys with as few as five customer experience questions and others with 50 or more.

Figure 2. General Customer Experience Questions

General Customer Experience Questions

Here are some general customer experience questions I typically use as a starting point for helping companies build their customer survey. As you can see in Figure 2, these general questions address broad areas across the customer lifecycle, from marketing and sales to service.

While specific customer experience questions are designed to provide a greater understanding of customer loyalty, it is important to consider their usefulness. Given that we already have general customer experience questions in our survey, do we need the specific questions? Do the specific questions help us explain customer loyalty differences beyond what we learn from the general questions?

Customer Experience Questions Predicting Customer Loyalty

To answer these questions, I analyzed four different B2B customer relationship surveys, each from a different company. These were midsize to large enterprise companies. Their semi-annual customer surveys included a variety of loyalty questions and both specific and general customer experience questions. The four companies had different combinations of general (5 to 7) and specific (0 to 34) customer experience questions.

Figure 3. Impact of General and Specific Customer Experience Questions on Customer Loyalty (overall sat, recommend, buy again). Percent of variability is based on stepwise regression analysis.

The goal of the analysis was to determine whether the inclusion of specific experience questions added to our understanding of customer loyalty differences beyond what the general experience questions explained. The results of the analysis are presented in Figure 3. Through stepwise regression analysis, I first calculated the percent of variance in customer loyalty that is explained by the general customer experience questions (green area). Then, I calculated the percent of variance in customer loyalty explained by the specific questions above what the general questions explained (blue area). Clearly, the few general experience questions explain a lot of the variability in customer loyalty (42% to 85%), while the specific customer experience questions account for very little extra (2% to 4%).
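The survey data itself isn’t public, but the mechanics of this kind of analysis are easy to sketch. Below is a minimal, self-contained illustration, using synthetic ratings rather than the actual survey responses, of computing the incremental variance a specific question explains over a general one. For simplicity it uses the closed-form R² for two predictors instead of a full stepwise procedure:

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Synthetic stand-ins for survey ratings: a general-experience rating,
# a specific-experience rating that largely mirrors it (the halo effect),
# and a loyalty score driven mostly by the general rating.
random.seed(42)
general = [random.gauss(0, 1) for _ in range(500)]
specific = [g + random.gauss(0, 0.5) for g in general]
loyalty = [2 * g + random.gauss(0, 0.5) for g in general]

# Step 1: variance in loyalty explained by the general question alone.
r_yg = pearson(loyalty, general)
r2_general = r_yg ** 2

# Step 2: R^2 with both predictors (closed form for two predictors),
# then the increment the specific question adds on top.
r_ys = pearson(loyalty, specific)
r_gs = pearson(general, specific)
r2_full = (r_yg**2 + r_ys**2 - 2 * r_yg * r_ys * r_gs) / (1 - r_gs**2)
incremental = r2_full - r2_general

print(f"general alone: {r2_general:.2f}, specific adds: {incremental:.3f}")
```

Because the specific rating here is simulated to largely mirror the general one, the increment comes out near zero, the same pattern Figure 3 reports for the real surveys.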

Efficient Customer Relationship Surveys

We may be asking customers too many questions in our relationship surveys. Short relationship surveys, using general experience questions, provide great insight into understanding how to improve customer loyalty. Asking customers about specific, detailed aspects about their experience provides very little additional information about what drives customer loyalty.

Customers’ memories are fallible.  Given the non-trivial time between customer relationship surveys (up to a year between surveys), customers are unable to make fine distinctions regarding their experience with you (as measured in your survey). This might be a good example of the halo effect, the idea that a global evaluation of a company/brand (e.g., great product) influences opinions about their specific attributes (e.g., reliable product, ease of use).

Customers’ ratings of general customer experience areas explain nearly as much of the differences in customer loyalty as the full set of customer experience questions. Short relationship surveys give customers an efficient way to provide their feedback on a regular basis. Not only do these short relationship surveys provide deep customer insight about the causes of customer loyalty, they also enjoy higher response rates and show that you are considerate of customers’ time.

Source: Measuring The Customer Experience Requires Fewer Questions Than You Think by bobehayes

@chrisbishop on futurist’s lens on #JobsOfFuture



In this podcast, Christopher Bishop, Chief Reinvention Officer at Improvising Careers, talks about his journey as a multimodal careerist and his past as a rock star. He shares some of the hacks and best practices that businesses could adopt to better navigate the new age of work, worker and workplace. This podcast offers plenty of thought leadership for future HR leaders.

Chris’s Recommended Reads:
The Industries of the Future by Alec Ross
Disrupted: My Misadventure in the Start-Up Bubble by Dan Lyons
Breakout Nations: In Pursuit of the Next Economic Miracles by Ruchir Sharma
How We Got to Now: Six Innovations That Made the Modern World by Steven Johnson
The New Rules of Work: The Modern Playbook for Navigating Your Career by Alexandra Cavoulacos and Kathryn Minshew

Podcast Link:

Chris’s BIO:
Christopher Bishop has had many different careers since he graduated from Bennington College with a B.A. in German literature. He has worked as a touring rock musician (played with Robert Palmer), jingle producer (sang on the first Kit Kat jingle “Gimme A Break”) and Web site project manager (developed Johnson & Johnson’s first corporate Web site). Chris also spent 15 years at IBM in a variety of roles including business strategy consultant and communications executive driving social media adoption and use of virtual worlds.

Chris is a member of the World Future Society and gave a talk at their annual conference in Washington, D.C. last summer on “How to Succeed at Jobs That Don’t Exist Yet.” In addition, he’s on the Board of TEDxTimesSquare and gave a talk on *Openness* at the New York event in April 2013.

Chris writes, consults and speaks about “improvising careers” at universities and industry conferences.

About #Podcast:
#JobsOfFuture podcast is a conversation starter to bring leaders, influencers and lead practitioners to come on show and discuss their journey in creating the data driven future.

Want to sponsor?
Email us @

#JobsOfFuture #Leadership #Podcast #Future of #Work #Worker & #Workplace

Source: @chrisbishop on futurist’s lens on #JobsOfFuture

Ashok Srivastava(@aerotrekker) @Intuit on Winning the Art of #DataScience #FutureOfData #Podcast




In this podcast, Ashok Srivastava (@aerotrekker) talks about how the key to creating a great data science practice runs through #PeopleDataTech, and he suggests how to handle unreasonable expectations of reasonable technologies. He shares his journey through culturally diverse organizations and how he successfully built a data science practice. He also describes his role at Intuit and some of the AI/machine learning focus in his current role. This podcast is a must for all data-driven leaders, strategists and wannabe technologists who are tasked to grow their organization and build a robust data science practice.

Ashok’s Recommended Read:
Guns, Germs, and Steel: The Fates of Human Societies by Jared Diamond
Collapse: How Societies Choose to Fail or Succeed by Jared Diamond

Podcast Link:

Ashok’s BIO:
Ashok N. Srivastava, Ph.D. is the Senior Vice President and Chief Data Officer at Intuit. He is responsible for setting the vision and direction for large-scale machine learning and AI across the enterprise to help power prosperity across the world. He is hiring hundreds of people in machine learning, AI, and related areas at all levels.

Previously, he was Vice President of Big Data and Artificial Intelligence Systems and the Chief Data Scientist at Verizon. He is an Adjunct Professor at Stanford in the Electrical Engineering Department and is the Editor-in-Chief of the AIAA Journal of Aerospace Information Systems. Ashok is a Fellow of the IEEE, the American Association for the Advancement of Science (AAAS), and the American Institute of Aeronautics and Astronautics (AIAA).

Ashok has a range of business experience including serving as Senior Director at Blue Martini Software and Senior Consultant at IBM.

He has won numerous awards, including the Distinguished Engineering Alumni Award, the NASA Exceptional Achievement Medal, IBM Golden Circle Award, the Department of Education Merit Fellowship, and several fellowships from the University of Colorado. Ashok holds a Ph.D. in Electrical Engineering from the University of Colorado at Boulder.

About #Podcast:
#FutureOfData podcast is a conversation starter to bring leaders, influencers and lead practitioners to come on show and discuss their journey in creating the data driven future.

Wanna Join?
If you or any you know wants to join in,
Register your interest @

Want to sponsor?
Email us @

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy


Eradicating Silos Forever with Linked Enterprise Data

The cry for linked data began innocuously enough with the simple need to share data. It has reverberated among countless verticals, perhaps most ardently in the health care space, encompassing both the public and private sectors. The advantages of linked enterprise data can positively affect any organization’s ROI and include:

  • Greater agility
  • More effective data governance implementation
  • Coherent data integration
  • Decreased time to action for IT
  • Increased trust in data

Still, the greatest impact that linked data has on the enterprise is its penchant to interminably vanquish the silo-based culture that still persists, and which stands squarely in the way of allowing true data culture to manifest.

According to TopQuadrant Managing Director David Price, for many organizations, “The next natural step in cases where they have data about the same thing that comes from different systems is to try to make links between those so they can have one single view about sets of data.”

And, if those links are managed correctly, they may very well lead to the proverbial single version of the truth.

From Linked Open Data…
The concept of linked enterprise data stems directly from linked open data, which has typically operated at the nexus between the public and private sectors (although it can involve either one singularly) and enabled organizations to link to and access data that are not theirs. Because of the uniform approach of semantic technologies, that data is exchangeable with virtually any data management system that utilizes smart data techniques. “As long as we make sure that all of the data that we put in a semantic data lake adheres to standard RDF technology and we use standard ontologies and taxonomies to format the data, they’re already integrated,” said Franz CEO Jans Aasman. “You don’t have to do anything; you can just link them together.” Thus, organizations in the private sector can readily integrate public sector linked open data into their analytics and applications in a time frame that largely bypasses typical pain points of integration and data preparation.
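Aasman’s “already integrated” claim is easiest to see with a toy model. The sketch below represents RDF-style triples as plain Python tuples — a real system would use an RDF store and SPARQL, and all the URIs and prefixes here are invented for illustration — and shows that merging two datasets that share identifiers is just a set union, with cross-dataset queries working immediately:

```python
# Toy model of RDF data: each fact is a subject-predicate-object triple.
# Two departments describe the same customer using a shared identifier
# ("ex:cust42") and a common vocabulary, so no schema mapping is needed.
crm = {
    ("ex:cust42", "rdf:type", "schema:Person"),
    ("ex:cust42", "schema:name", "Ada Lovelace"),
}
support = {
    ("ex:ticket7", "rdf:type", "schema:SupportTicket"),
    ("ex:ticket7", "schema:openedBy", "ex:cust42"),
}

# Integration is literally a union of triples.
merged = crm | support

# Cross-dataset query: names of people who opened support tickets.
openers = {o for s, p, o in merged if p == "schema:openedBy"}
names = [o for s, p, o in merged
         if p == "schema:name" and s in openers]
print(names)
```

The union requires no transformation step because both datasets already conform to the same triple structure and vocabulary, which is the point Aasman makes about standard RDF, ontologies, and taxonomies.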

Such celerity could prove influential in massive data sharing endeavors like the Panama Papers, in which data were exchanged across international borders and numerous databases to help journalists track down instances of financial fraud. Price is leading TopQuadrant’s involvement in the Virtual Construction for Roads (V-CON) project, in which the company is contributing to an IT system that harmonizes data for road construction between a plethora of public and private sector entities in the Netherlands and Sweden. When asked if TopQuadrant’s input on the project was based on integrating and linking data among these various parties, Price commented, “That’s exactly where the focus is.”

…to Linked Enterprise Data Insight
Linked data technologies engender an identical effect when deployed within the enterprise. In cases in which different departments require the same data for different purposes, or in instances in which there are multiple repositories or applications involving the same data, linked enterprise data can provide a comprehensive data source comprising numerous tributaries relevant for all applications. “The difference is this stuff is enabled to also allow you to extract all the data and make it available for anybody to download it… and that includes locally,” Price commented. “You get more flexibility and less vendor lock-in by using standards.” In what might be the most compelling use case for linked enterprise data, organizations can also link all of their data, stemming from internal and external sources, for a more profound degree of analytics based on relationship subtleties that semantic technologies instinctively perceive. Cambridge Semantics VP of Marketing John Rueter weighed in on these benefits when leveraged at scale.

“That scale is related to an almost sort of instantaneous querying and results of an entire collection of data. It has eliminated that linear step-wise approach of multiple links or steps to get at that data. The fact that you’re marrying the combination of scale and speed you’re also, I would posit, getting better insights and more precise and accurate results based upon the sets of questions you’re asking given that you’ve got the ability to access and look at all this data.”

Agile Flexibility
Linked enterprise data allows all data systems to share ontologies—semantic models—that readily adjust to include additional models and data types. The degree of flexibility they facilitate is underscored by the decreased amounts of data preparation and maintenance required to sustain what is in effect one linked system. Instead of addressing modeling requirements and system updates individually, linked enterprise data systems handle these facets of data management holistically and, in most instances, singularly. Issuing additional requirements or updating different databases in a linked data system necessitates doing so once in a centralized manner that is simultaneously reflected in the individual components of the linked data systems. “In a semantic technology approach the data model or schema or ontology is actually its own data,” Price revealed. “The schema is just more data and the data in some database that represents me, David Price, can actually be related to different data models at the same time in the same database.” This sort of flexibility makes for a much more agile environment in which IT teams and end users spend less time preparing data, and more time reaping its benefits.

Data Governance Ramifications
Although linked enterprise data doesn’t formally affect data governance as defined as the rules, roles, and responsibilities upon which sustainable use of data depends, it greatly improves its implementation. Whether ensuring regulatory compliance or reuse of data, standards-based environments furnish consistent semantics and metadata that are understood in a uniform way—across as many different systems as an enterprise has. One of the most pivotal points for implementing governance policy is ensuring that organizations are utilizing the same terms for the same things, and vice versa. “The difference our technology brings is that things are much more flexible and can be changed more easily, and the relationships between things can be made much more clear,” Price remarked about the impact of linked data on facilitating governance. Furthermore, the uniform approach of linked data standards ensures that “the items that are managed are accurate, complete, have a good definition that’s understandable by discipline experts, and that sometimes have a more general business glossary definition and things like that,” he added.

There are multiple facets of data governance that are tied to security, such as who has the authority to view which data and how. In a linked data environment such security is imperative, particularly when sharing data across the public and private sectors. Quite possibly, security measures are reinforced even more in linked data settings than in others, since they are fortified by conventional security methods and those particular to smart data technologies. The latter involves supplementing traditional data access methods with semantic statements or triples; the former includes any array of conventional methods to protect the enterprise and its data. “The fact that you use a technology that enables things to be public doesn’t mean they have to be,” Price said. “Then you put on your own security policies. It’s all stored in a database that can be secured at various levels of accessing the database.”

Eradicating Silos
Implicit in all of the previously mentioned benefits is the fact that linked enterprise data effectively eradicates the proliferation of silos which has long complicated data management as a whole. Open data standards facilitate much more fluid data integration while decreasing the time spent on data preparation, shifting the emphasis to insight and action. This ability to rid the enterprise of silos is one which transcends verticals, a fact which Price readily acknowledged. “Our approach to the V-Con project is that although the organizations involved in this are National Roads Authority, our view is that the problem they are trying to solve is a general one across more than the roads industry.” In fact, it is applicable to the enterprise in general, particularly one that is attempting to sustain its data management in a long-term, streamlined manner to deliver both cost and performance boons.


Turning Business Users into Citizen Data Scientists

The data scientist position may well be one of the most multi-faceted jobs around, involving aspects of statistics, practical business knowledge, interpersonal skills, programming languages, and many other wide-sweeping qualifications.

For business end users of data-driven processes, however, these professionals often seem like glorified IT personnel: the new hires that business users go to, and wait upon, to get the data required to do their jobs.

Today, analytics platforms featuring conversational, interactive responses to questions can eliminate the backlog of demands for data while transforming business users into citizen data scientists capable of performing enough lower-echelon data science functions to conduct their own analytics.

Moreover, these platforms equip users with the means to modify the data and answer their own questions as needed, fostering a greater sense of ownership, and perhaps even pride, in the data that impacts their jobs.

Ben Szekely, Cambridge Semantics Vice President of Solutions and Pre-Sales, reflected that, “Because business users are getting answers back in real time they’re able to start making judgments about the data, and they’re developing a level of trust and intuition about the data in their organization that wasn’t there before.”

Real-Time Answers, Ad-Hoc Questions

The most immediately demonstrable facet of a citizen data scientist is the ability to answer one’s own data-centric questions autonomously. Dependence on external IT personnel or data scientists is not required with centralized data lake options enhanced by smart data technologies and modern query mechanisms. This combination, which leverages in-memory computing, parallel processing, and the power of the cloud to scale on demand, exploits the high-resolution copy of data assets linked together within a semantic data lake. Users are able to issue their own queries against the resulting enterprise knowledge graph through either a simple web browser interface or their favorite self-service BI tool, the latter of which is likely already in use at their organization. “They’re getting their answers through a real-time conversation and interaction with the content, versus going and asking someone and getting back a Powerpoint deck,” Szekely mentioned. “That’s a very static thing which they can’t converse with or really understand necessarily, or [understand] how the answer was come to.”

Understanding Answers and Data

Full-fledged data scientists are able to trust in data and analytics results because they have an intimate knowledge of those data and the processes they underwent to supply answers to questions. Citizen data scientists can have that same understanding and readily gain insight into data provenance. The underlying graph mechanisms powering these options deliver full lineage of data’s sources, transformations, and other aspects of their use, so citizen data scientists can retrace the data’s journey to their analytics results. Even lay business users can understand how to traverse a knowledge graph for these purposes, because all of the data modeling is done in terms predicated on business concepts and processes, as opposed to arcane query languages or IT functions. “We talk about the way things are related to basic concepts and properties,” Szekely said. “You don’t have to be able to read an ER diagram to understand the data. You just have to be able to look at basic names and relationships.” Those names and relationships are described in business terms to maximize end user understanding of data.
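As a rough sketch of that lineage traversal (the dataset names and the `derivedFrom` relationship below are hypothetical, not a specific vendor’s API), a knowledge graph that records what each artifact was derived from lets a user walk back to the original sources in a few lines:

```python
# Hypothetical lineage edges: each derived dataset points at what it
# was computed from via a "derivedFrom" relationship.
derived_from = {
    "churn_dashboard": ["churn_scores"],
    "churn_scores": ["cleaned_usage", "crm_extract"],
    "cleaned_usage": ["raw_usage_logs"],
}

def trace_sources(node, graph):
    """Walk derivedFrom edges back to the original source systems."""
    parents = graph.get(node)
    if not parents:          # nothing upstream: this is a source
        return {node}
    sources = set()
    for parent in parents:
        sources |= trace_sources(parent, graph)
    return sources

print(sorted(trace_sources("churn_dashboard", derived_from)))
```

A citizen data scientist asking “where did this dashboard’s numbers come from?” gets back the raw usage logs and the CRM extract, which is exactly the retraceable journey described above.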

Selecting and Modeling Data

Another key function of the data science position is determining which sources are relevant for questions, and modeling their data in a way so that a particular application can extract value from them. Citizen data scientists can also perform this basic functionality autonomously with a number of automated data modeling features. Relational technologies, for example, require considerable time for constructing data models, calibrating additional data to fit into predefined schemas, and successfully mapping it all together. They require data scientists or IT to “build that monolithic data warehouse model and then map everything in it,” Szekely acknowledged. Conversely, smart data lakes enable users to begin analyzing data as soon as they are ingested, without having to wait for data to be formatted to fit the schema requirements of the repository. There are even basic data cleaning and preparation formulas to facilitate this prerequisite for citizen data scientists. According to Szekely, “You can bring in new data and we’ll build a model kind of automatically from the source data. You can start exploring it and looking at it without doing any additional modeling. The modeling comes in when you want to start connecting it up to other sources or reshaping the data to help with particular business problems.”

Enterprise Accessible Data Science

Previously, data science was relegated to the domain of a select few users who functioned as gatekeepers for the rest of the enterprise. However, self-service analytics platforms are able to effectively democratize some of the rudimentary elements of this discipline so business users can begin accessing their own data. By turning business users into citizen data scientists, these technologies are helping to optimize manpower and productivity across the enterprise.


Source by jelaniharper

Tour de France’s big data experiment is… getting there


This year’s Tour de France was supposed to be different, with greater access to the teams, greater flows of big data info on riders and greater ways to watch the race unfold. But there have been some teething issues, understandably.

Last week we reported on Dimension Data’s deal with the Tour, which promised fans immense arrays of information to follow their favourite riders and teams. We were told that we could follow any of our preferred cyclists (198 in total), measure gaps between groups, find real-time speeds and more.

What has transpired has fallen a little shy of that so far. The data provided in the Twitter feed has been good, as have the wrap-up emails at the end of each stage. There’s just not a whole lot to them.

That’s because the beta website, the flagship tool for viewers, is not quite ready for the public.

After a week’s testing at the Critérium du Dauphiné race last month, Dimension Data was fine-tuning its offering, with the team behind the data transmission currently putting the finishing touches on the upgrades.

Until then, no dice.

“I have a team of people in 11 cities around the world working around the clock to get this up and available as soon as possible,” explained Adam Foster, group executive for the communications business unit at Dimension Data.

A keen cyclist himself, Foster’s pretty happy with how his company’s project is going so far.

At the moment all the teams are receiving their fill of cyclist information, as is Amaury Sport Organisation (ASO, the Tour de France organiser), with the Twitter handle @letourdata rifling out information to the public all the time.

‘I’ve done less cycling since we agreed to work on the Tour de France than ever before!’

There’s a short video infographic wrap of each day’s stage (above), with the previously mentioned emails also serving a decent purpose, to tide us over until the site goes live.

But there’s more to come, too, as Dimension Data’s dealings with TV companies will soon show.

“As soon as the data is ready you’ll see different inlays that you haven’t seen before on TV,” explained Foster. “They’re ready to go, we just have to make sure we’re populating it with good information.”

It’s not just Dimension Data helping to make the Tour a more digital-friendly event. Velon – a consortium of some of the top teams competing – and ASO have partnered with GoPro to produce added video of this year’s Tour.

Footage from team cars, mechanics, bikes, and more will be available to viewers.

But it’s Dimension Data’s site that we’re all waiting on, something Foster and his team are working extensively to fix.

An avid cyclist, competing in triathlons and the like himself, Foster has had his own life turned upside down since his company partnered with ASO.

“I’ve done less cycling since the agreement than I’ve ever done before,” he said. “This has been a massive time commitment and the Dimension Data team has been unbelievable.

“It’s been a monumental undertaking,” he explained, before saying that, when live, Twitter users will be asked for feedback, to improve a service which he feels can be spread across other sports.

“We’ve had significant interest from other sports on the back of this,” he said. “But all of our focus and attention is on making this event as successful as possible.”

That’s something that can’t be determined until we get to play around with that site, though.

To read the original article on silicon republic, click here.

Originally Posted at: Tour de France’s big data experiment is… getting there

6 Best Practices for Maximizing Big Data Value


A survey released last month indicated that Hadoop adoption is facing challenges. Specifically, the vast majority of respondents had no current plans to invest in Hadoop due to “…sizable challenges around business value and skills”. Since Hadoop is the leading tool for Big Data, this points to a bigger problem in overall Big Data adoption. Finding resources with the right analytic skills is a difficult challenge that is being targeted by a new generation of business-friendly Big Data tools. Getting business value is a more fundamental issue. For many organizations that I’ve talked to, the plan for getting business value from Big Data is simple: get a deeper understanding of how customers behave, and then leverage that knowledge to improve customer satisfaction and increase business profitability. That can be easier said than done, but reviewing some of the emerging best practices is a good place to start.

Maximizing Big Data’s value comes down to doing six things well:

  1. Start with a business problem in mind: Wandering through huge amounts of data with Hadoop and other advanced analytic tools can be lots of fun for data scientists, but it can also be a huge waste of time and resources if the results do not translate into something that can be applied to solve real world business problems. Working with business experts to understand their challenges and opportunities for improvement is a key ingredient for successful projects.  Focusing on a specific business problem makes it easier to identify useful data sources and choose appropriate tools and techniques.  It also sets you up for the next step…
  2. Look ahead to how you will put your insights into practice: To achieve real business value you have to be able to operationalize the results of your analysis. Although this sounds obvious, far too many projects are left gathering dust on the shelf because it is too hard to incorporate their findings into the business activities that would benefit from them.  Data that looks wonderful in the lab may not be available, or may be too expensive to get at the time you need it for use in day-to-day business operations.  Industry regulations can also have a huge influence over where and how your data can be used.
  3. Take advantage of the latest analytic innovations: Innovations in Business Intelligence and Business Analytics are transforming how businesses get value from their customer data.  This is causing a shift from traditional approaches that provide periodic snapshots in the form of descriptive reports and historical dashboards, to systems that continuously analyze incoming data to provide prescriptive insights that are actionable in real-time. Big data tools and infrastructure are making it faster and easier to apply machine learning techniques to explore huge datasets that include a wide variety of structured and unstructured data.
  4. Embrace analytic diversity: R, Python, Hive, Groovy, Scala, MATLAB, SQL, SAS; which one is your favorite?  One of the side effects of the exploding world of analytic innovation is that taking advantage of the latest techniques often requires learning a new set of tools.  Waiting for your favorite analytic tool vendor to catch up and provide an integrated solution isn’t usually an option.   Leading analytic teams will inevitably need to use multiple tools to support their business needs, so the best approach is to embrace diversity and create a flexible infrastructure that can operationalize models authored by a wide range of tools.  Getting multiple types of analytic models to work together in a robust production environment can be a significant challenge. However, modern decision management systems like the FICO® Decision Management Platform simplify the task by supporting extensible libraries and taking advantage of web services and standards such as the Predictive Modeling Markup Language (PMML) to combine analytic services and business rules into cohesive decisioning applications.
  5. Leverage the Cloud and productivity platforms: Creating big data analytics no longer requires making a huge investment in expensive infrastructure and specialized skills.  By running your analytic projects in the cloud you can let a dedicated third party handle the underlying systems and services while you focus on the business problem at hand. You can rent out just the capacity and services you need, at a fraction of the cost of implementing your own.
  6. Give Control to the Business Experts: The final ingredient for getting value from your big data analytics is also the most important one.  The greatest value comes from giving business experts new insights that they can quickly turn into differentiating strategies and actions that will delight customers and shareholders alike.  Interactive and highly visual dashboards and reports can provide information that helps business experts to refine and evolve high-performance strategies.  Standard decision management components such as business rules authoring services can make it faster and easier for experts to incorporate new models and insights into their business rules and policies.  Simulation and data visualizations can also speed the approval time for implementing new models and strategies by making it easier to understand and explore their potential impact.

Note: This article originally appeared in FICO. Click for link here.

Source: 6 Best Practices for Maximizing Big Data Value

What it takes to be a Santa: a data’s perspective


It should not be a surprise to imagine how massive Santa’s operations would be. Santa and his army of elves must be working like horses. Some numbers to understand what it takes to be a Santa: there are approximately two billion children (persons under 18) in the world. However, since Santa does not visit children of Muslim, Hindu, Jewish, or Buddhist (except maybe in Japan) religions, this reduces the workload for Christmas night to 15% of the total, or 378 million (according to the Population Reference Bureau). At an average (census) rate of 3.5 children per household, that comes to 108 million homes, presuming there is at least one good child in each.

Santa has about 31 hours of Christmas to work with, thanks to the different time zones and the rotation of the earth, assuming east to west (which seems logical). This works out to 967.7 visits per second. This is to say that for each Christian household with a good child, Santa has around 1/1000th of a second to park the sleigh, hop out, jump down the chimney, fill the stockings, distribute the remaining presents under the tree, eat whatever snacks have been left for him, get back up the chimney, jump into the sleigh and get onto the next house.
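The arithmetic above can be checked in a few lines, using the article's own figures:

```python
# Santa's workload, computed from the figures stated in the article.
children = 378_000_000        # children on the Christmas-night route
children_per_home = 3.5       # average (census) children per household
homes = children / children_per_home
print(round(homes))           # 108000000 homes

hours = 31                    # usable hours thanks to time zones and rotation
seconds = hours * 3600
visits_per_second = homes / seconds
print(round(visits_per_second, 1))   # 967.7 visits per second

ms_per_visit = 1000 / visits_per_second
print(round(ms_per_visit, 2))        # roughly 1/1000th of a second per home
```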

Assuming that each of these 108 million stops is evenly distributed around the earth (which, of course, we know to be false, but will accept for the purposes of our calculations), we are now talking about 0.78 miles per household; a total trip of 75.5 million miles, not counting bathroom stops or breaks.

This means Santa’s sleigh is moving at 650 miles per second – 3,000 times the speed of sound. For purposes of comparison, the fastest man-made vehicle, the Ulysses space probe, moves at a pokey 27.4 miles per second, and a conventional reindeer can run (at best) 15 miles per hour. The payload of the sleigh adds another interesting element. Assuming that each child gets nothing more than a medium-sized LEGO set (two pounds), the sleigh is carrying over 500 thousand tons, not counting Santa himself. On land, a conventional reindeer can pull no more than 300 pounds. Even granting that flying reindeer can pull ten times the normal amount, the job can’t be done with eight or even nine of them – Santa would need 360,000 of them. This increases the payload, not counting the weight of the sleigh, by another 54,000 tons, or roughly seven times the weight of the Queen Elizabeth (the ship, not the monarch).

A mass of nearly 600,000 tons traveling at 650 miles per second creates enormous air resistance – this would heat up the reindeer in the same fashion as a spacecraft re-entering the earth’s atmosphere. The lead pair of reindeer would absorb 14.3 quintillion joules of energy per second each. In short, they would burst into flames almost instantaneously, exposing the reindeer behind them and creating deafening sonic booms in their wake. The entire reindeer team would be vaporized within 4.26 thousandths of a second, or right about the time Santa reaches the fifth house on his trip. Not that it matters, however, since Santa, as a result of accelerating from a dead stop to 650 m.p.s. in .001 seconds, would be subjected to acceleration forces of 17,000 g’s.

A 250-pound Santa (which seems ludicrously slim considering all the high-calorie snacks he must have consumed over the years) would be pinned to the back of the sleigh by 4,315,015 pounds of force, instantly crushing his bones and organs and reducing him to a quivering blob of pink goo. Therefore, if Santa did exist, he must be going through one hell of a rigorous regimen to get those gifts to you. So, enjoy whatever you get; it surely took some doing to make those logistical arrangements work. MERRY CHRISTMAS, y’all!!!

Originally Posted at: What it takes to be a Santa: a data’s perspective by v1shal

The One Hidden Skill You Need to Unlock the Value of Your Data

An examination of data scientist skills reveals an often overlooked skill necessary to uncover insights from data: The Scientific Method

Data scientists are a hot commodity in today’s data-abundant world. Business leaders are relying on data scientists to improve how they acquire data, determine its value, analyze it and build algorithms for the ultimate purpose of improving how they do business. While the job title of “data scientist” was coined by D.J. Patil and Thomas H. Davenport only in 2008, it reached the status of “sexiest job of the 21st century” by 2012. But what makes for a good data scientist?

In this post, I take a look at several industry experts’ opinions about the skills, abilities and temperament needed to be a good data scientist. Specifically, I reviewed 11 articles that included lists of various data scientist skills (each link directs you to a specific list): Smart Data Collective, InformationWeek, Data Science Central, Teradata, Silicon Angle, Gigaom, Forrester, Wired, TDWI, and Dataversity. From each article, I extracted statements (96 in all) that reflected a skill, ability or temperament and grouped them into smaller categories. I let the content of the statements drive the generation of the categories. Some categories had a fairly specific, narrow meaning (e.g., NLP) and others had a broader meaning (e.g., computer science).


While the major, popular buckets of data scientist skills emerged (e.g., quantitative, computer engineering, business acumen, communication), an additional one also emerged that I call Scientific Method. First, let’s look at the details of each skill or category:

1. Quantitative


Businesses need quantitative skills if they are to extract insights from their data. Quantitative skills include statistics, mathematics and predictive modeling skills. Statistical skills can help businesses summarize their large data sets into smaller pieces of meaningful information. Predictive modeling skills help businesses create algorithms, both automatically and manually, to improve business processes. As a whole, these quantitative skills allow businesses to apply mathematical rigor to their large, quickly expanding data sets to help make sense of them.
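To make the predictive-modeling idea concrete, here is a minimal sketch: fitting a line to toy data with ordinary least squares, using the closed-form formulas. The data points are invented for illustration:

```python
# Ordinary least squares on toy data: fit y = a + b*x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]  # roughly y = 2x

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form slope and intercept.
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
    / sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x
print(round(b, 2))  # 1.99 -- slope close to the true value of 2


def predict(x):
    """Predict y for a new observation x using the fitted model."""
    return a + b * x
```

Real predictive modeling uses richer models and validation, but the workflow — fit parameters to historical data, then apply the model to new data — is the same.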

2. Computer Engineering


Another key skill is related to computer engineering. With the advent of new ways to store, analyze and retrieve data, the ideal data scientist needs skills in programming languages, distributed computing systems and open-source tools.


In addition to analyzing structured data, businesses are now trying to uncover insights from unstructured data from such sources as social media, emails, community message boards and even open-ended comments in surveys. Skills in natural language processing help data scientists transform these unstructured data into structured data to allow for quantitative analysis. Machine learning skills help businesses identify generalized patterns in the data (training data) that allow for classification of future data (target data). This pattern recognition helps drive recommendation engines that present customers with information that is relevant to them. Finally, data management skills help data scientists develop and integrate different data systems so businesses can utilize all their data in an integrated fashion.
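The unstructured-to-structured step can be sketched in a few lines: a tiny bag-of-words representation turns free-text comments into fixed-length count vectors ready for quantitative analysis. The comments are invented examples, and production NLP pipelines are far more sophisticated:

```python
# Toy bag-of-words: unstructured text -> structured feature vectors.
comments = [
    "great product, fast shipping",
    "shipping was slow but great support",
]

# Tokenize and build a shared vocabulary across all documents.
tokenized = [c.replace(",", "").split() for c in comments]
vocab = sorted({tok for doc in tokenized for tok in doc})

# Each document becomes a fixed-length count vector: structured data.
vectors = [[doc.count(term) for term in vocab] for doc in tokenized]
print(vocab)
print(vectors)
```

Once text is in this tabular form, the statistical and machine learning techniques described above apply directly.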

3. Business Acumen


Quantitative and computer skills don’t occur in a vacuum. Data scientists, to be successful, need a good understanding of the business, including its people, products and services, and how they all work together. This knowledge of how business works helps data scientists direct their energies to data that are the most valuable to the business.

4. Communication


Data scientists need to have good communication skills. This skill is closely linked to business acumen, as data scientists need to be able to convey complex quantitative, computational findings into terms that business executives, managers, and front line employees can understand. Data scientists often need to use visualization tools to help translate quantitative findings into images that are easily consumable by the masses. With good communication skills and the use of tools to visualize the data, data scientists are able to provide the insights that business leaders need to operationalize changes to their current business processes.

5. Scientific Method


The final group of skills reflects the need to approach problems using critical thinking, creativity and open-mindedness. I grouped this final set of skills under the label of “scientific method.” Formally defined, the scientific method is a body of techniques for objectively investigating phenomena, acquiring new knowledge, or correcting and integrating previous knowledge. The scientific method includes the collection of empirical evidence, subject to specific principles of reasoning. Specifically, the scientific method follows these general steps: 1) formulate a question or problem statement; 2) generate a hypothesis; 3) test the hypothesis through experimentation (when we can’t conduct true experiments, data are obtained through observations and measurements); and 4) analyze the data to draw conclusions.
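Steps 3 and 4 can be made concrete with a minimal sketch: a permutation test on two invented samples, asking whether group B's mean genuinely exceeds group A's or could have arisen by chance. The data are toy numbers chosen for illustration:

```python
import random

# Hypothesis: group B's mean exceeds group A's (toy measurements).
random.seed(0)  # fixed seed for reproducibility
group_a = [5.1, 4.8, 5.0, 5.2, 4.9]
group_b = [5.6, 5.9, 5.7, 5.5, 5.8]

observed = sum(group_b) / len(group_b) - sum(group_a) / len(group_a)

# Null hypothesis: group labels don't matter. Shuffle labels many times
# and count how often a difference at least as large arises by chance.
pooled = group_a + group_b
count = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    diff = sum(pooled[5:]) / 5 - sum(pooled[:5]) / 5
    if diff >= observed:
        count += 1

p_value = count / trials
print(p_value)  # a small p-value supports rejecting the null hypothesis
```

This mirrors the four steps above: question, hypothesis, experiment (here, simulated relabeling), and analysis of the results.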


These steps are not meant to imply that science is only a series of activities. Instead of thinking about science as an area of knowledge, it is better to conceptualize it as a way to understand how the world really works. As Carl Sagan said, “Science is a way of thinking much more than it is a body of knowledge.” The scientific method not only requires adherence to rules; it also requires creativity and imagination in order to find new possibilities, address problems in different ways and apply findings from one setting to another. Separating signal from noise, data scientists’ work truly reflects an exercise in uncovering reality.

What makes for a good data scientist? I believe that the scientific method plays a critical role in understanding any data, irrespective of their size or speed or variety. Despite the idea that Big Data will kill the need for theory and the scientific method, the human element is necessarily involved in the generation, collection and interpretation of data. As Kate Crawford points out in a thoughtful article, The Hidden Bias of Big Data, data do not speak for themselves; humans give data their voice; people draw inferences from the data and give data their meaning. Unfortunately, people introduce bias, intentional and unintentional, that weakens the quality of the data.

Additionally, I highlighted a few ways that the scientific method can help improve the veracity (validity) of data. To be of real, long-term value to business, Big Data needs to be about understanding the causal links among the variables. Hypothesis testing helps shed light on identifying the reasons why variables are related to each other and the underlying processes that drive the observed relationships. Hypothesis testing helps improve analytical models through trial and error to identify the causal variables and helps you generalize your findings across different situations.


Data scientists help businesses extract value from their data by finding insights. To solve business problems, data scientists need a variety of skills, including quantitative, computer, business acumen and communication. The current review, however, uncovered an overlooked skill needed by data scientists: the scientific method.

Even though finding a single person who possesses these data scientist skills is akin to finding a unicorn, companies need to understand all the data scientist requirements as they look to build data science teams to address their data analytic needs. Data scientists will require knowledge in research methodology to learn about different kinds of research methods they can employ (e.g., observational, experimental, quasi-experimental) as well as the threats to different kinds of validity (e.g., statistical conclusion, internal, construct and external).

The goal of the scientific method is to solve problems. If businesses want to solve their problems, they need to put the science in data science. If they don’t, all they have are data.