Big Data Not Doping: How The U.S. Olympic Women’s Cycling Team Competes On Analytics

Sports and data analytics are becoming fast friends, and their relationship is a topic I’ve explored before. Another example I recently came across is how the U.S. women’s cycling team used analytics to leap from underdog status to silver medalists at the 2012 London Olympics.

The team was struggling when it turned to Sky Christopherson for help. Christopherson, a former Olympic cyclist, broke a world record in the over-35s 200m velodrome sprint a decade after retiring as a professional athlete. He did this using a training regime he designed himself, based on data analytics and originally inspired by the work of cardiologist Dr. Eric Topol.

Christopherson developed the Optimized Athlete program after becoming disillusioned with doping in the sport, putting the phrase “data not drugs” at the core of the philosophy. He put together a set of sophisticated data-capture and monitoring techniques to record every aspect affecting the women athletes’ performance, including diet, sleep patterns, environment and training intensity. However, he soon realized the data was growing at an unmanageable rate.

This prompted him to contact San Francisco’s data analytics and visualization specialists Datameer, which helped to implement the program. Datameer’s CEO, Stefan Groschupf, himself a former competitive swimmer at a national level in Germany, immediately saw the potential of the project. Christopherson said, “They came back with some really exciting results – some connections that we hadn’t seen before. How diet, training and environment all influence each other – everything is interconnected and you can really see that in the data.”

The depth of the analytics meant that programs could be tailored to each athlete to get the best out of every team member. One insight that came up was that one cyclist, Jennie Reed, performed much better in training if she had slept at a lower temperature the night before. So she was provided with a water-cooled mattress to keep her body at an exact temperature throughout the night. “This had the effect of giving her better deep sleep, which is when the body releases human growth hormone and testosterone naturally,” says Christopherson.

Big Data enables high performance sports teams to quantify the many factors that influence performance, such as training load, recovery, and how the human body regenerates. Teams can finally measure all these elements and establish early warning signals that, for example, stop them from pushing athletes into overtraining, which often results in injury and illness. The need to train hard while avoiding the dangers of injury and illness is, in Christopherson’s opinion, the leading temptation for athletes to use the performance-enhancing drugs (PEDs) which have blighted cycling and other sports for so long.

Christopherson’s system has not been put through rigorous scientific testing, but it seems to work fairly well based on his personal success as well as the success of the team he coached. The key is finding balance during training. “It’s manipulating the training based on the data you have recorded so that you are never pushing into that danger zone, but also never backing off and under-utilizing your talent. It’s a very fine line and that’s what Big Data is enabling us to finally do.”
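The article does not say how Christopherson’s program decides where the “danger zone” begins. As a rough illustration of the idea only, the sketch below compares an athlete’s recent average training load with their own longer-term baseline and flags when the ratio drifts outside a band. The function name, the window lengths and the 0.8 and 1.3 thresholds are all assumptions made for this example, not figures from the Optimized Athlete program.

```python
def load_status(daily_load, recent_days=7, baseline_days=28, low=0.8, high=1.3):
    """Classify the latest training state as 'under', 'ok' or 'over'.

    daily_load is a list of per-day training load scores, oldest first.
    All window sizes and thresholds are hypothetical defaults.
    """
    if len(daily_load) < baseline_days:
        raise ValueError("need at least baseline_days of history")

    # Average load over the recent window and over the longer baseline window.
    recent = sum(daily_load[-recent_days:]) / recent_days
    baseline = sum(daily_load[-baseline_days:]) / baseline_days
    if baseline == 0:
        return "under"  # no recorded training at all in the baseline window

    ratio = recent / baseline
    if ratio > high:
        return "over"   # recent load well above what the athlete is used to
    if ratio < low:
        return "under"  # backing off and possibly under-using capacity
    return "ok"
```

In practice a coach would presumably combine a signal like this with the sleep, diet and environment data described above rather than rely on training load alone.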

When used accurately and efficiently, it is thought that Big Data could vastly extend the careers of professional athletes and sportsmen well beyond the typical retirement age of 30, with the right balance of diet and exercise, and avoiding injury through over-exertion. Christopherson spoke to me from Hollywood, where he is trying to finalize a distribution deal for a new documentary called ‘Personal Gold,’ that tells this amazing story in much more detail. The Optimized Athlete program has also been turned into an app (OAthlete), which will be made available to early adopters from June 18th.

To read the original article on Forbes, click here.

Source: Big Data Not Doping: How The U.S. Olympic Women’s Cycling Team Competes On Analytics

The Competitive Advantage of Managing Relationships with Multi-Domain Master Data Management

Not long ago, it merely made sense to deploy multi-domain Master Data Management (MDM) systems. The boons for doing so included reduced physical infrastructure, less cost, fewer points and instances of operational failure, more holistic data modeling, less complexity, and a better chance to get the proverbial ‘single version of the truth’—especially compared to deploying multiple single-domain hubs.

Several shifts in the contemporary business and data management climate, however, have intensified those advantages so that it is virtually impossible to justify the need for multiple single-domain platforms where a solitary multi-domain one would suffice.

The ubiquity of big data, the popularity of data lakes, and the emerging reality of digital transformation have made it vital to locate both customer and product data (as well as other domains) in a single system so that organizations “now have the opportunity to build relationships between those customers and those products,” Stibo Systems VP of Product Strategy Christophe Marcant remarked. “The pot of gold at the end of the rainbow is in managing those relationships.”

And, in exploiting them for competitive advantage.

Mastering Relationship Management
Understanding the way that multi-domain MDM hubs facilitate relationship management requires a cursory review of MDM in general. On the one hand, these systems provide access to all of the relevant data about a particular business domain, which may encompass various sources. On the other, their true value lies in the mastering capabilities of these hubs, which facilitate governance protocols and data quality measures by providing uniform consistency for those data. Redundancies, different spellings for the same customer, data profiling, metadata management, and lifecycle management are tended to, in addition to implementing facets of completeness and recentness of data and their requisite fields. Managing these measures inside of a single platform, as opposed to integrating data beforehand with external tools for governance and quality, enables organizations to account for these standards repeatedly and consistently. Conversely, the integration required for using external solutions is frequently “a one time activity: data cleansing and then publishing it to the MDM platform,” Marcant said. “And then that’s it; it’s not going back because it would be another project, another integration, yet another workflow, and yet another opportunity to be out of sync.”
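To make the “different spellings for the same customer” problem concrete, here is a minimal sketch of how near-duplicate names might be grouped. It uses Python’s standard-library difflib and is not a description of how Stibo Systems or any other MDM product performs matching; the function names, the greedy grouping strategy and the 0.85 threshold are assumptions for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, drop punctuation and collapse runs of whitespace."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity of two names after normalization, between 0.0 and 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_duplicates(names, threshold=0.85):
    """Greedy grouping: each name joins the first group whose first
    member is similar enough, otherwise it starts a new group."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups
```

Running a rule like this once is the “one time activity” Marcant warns about; the argument for doing it inside the hub is that the same rule can be applied every time a record is created or changed.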

The Multi-Domain Approach
Such limitations do not exist with the quality and governance mechanisms within MDM, in which different types of data are already integrated. When deploying multi-domain hubs, there are fewer still points of integration and workflows related to synchronicity, because data of different domains (such as product and customer) are housed together. Moreover, the changing climate in which digital transformation, big data, and data lakes have gained prominence has resulted in much greater utility from identifying and understanding relationships between data, both across and within domains. Multi-domain MDM facilitates this sort of relationship management so that organizations can determine how product data directly correlates to customer data, and vice versa. According to Marcant, a well-known book retailer uses such an approach to understand its products and customers to “better tailor what they offer them.”

Connecting the Dots between Domains with Data Modeling and Visualizations
Understanding the relationships between data across conventional domains is done at both granular and high levels. At the former, there are far fewer constraints for data modeling when utilizing a multi-domain platform. “For example, if you’re modeling products, then having the opportunities to model your suppliers, and possibly the market, the location where you make this product available… now you have the opportunity to track information that these are the intersections between suppliers and markets and products,” Marcant noted. He stated that outside of customers and products, the most relevant domains for Stibo’s customers include location, suppliers (supply chain management), and assets.
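One way to picture the “intersections between suppliers and markets and products” is as links between records that live in different domains of the same store. The sketch below is a toy model written for this article, not Stibo Systems’ data model; the class names, relation labels and sample keys are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    domain: str  # e.g. "product", "supplier", "market"
    key: str     # identifier of the record within its domain


class RelationshipStore:
    """Holds labeled links between entities of any domain."""

    def __init__(self):
        self._edges = defaultdict(set)

    def relate(self, a: Entity, relation: str, b: Entity) -> None:
        # Store the link in both directions so it can be walked from either end.
        self._edges[a].add((relation, b))
        self._edges[b].add((relation, a))

    def related(self, entity: Entity, domain: str) -> list:
        """Keys of entities in `domain` that are directly linked to `entity`."""
        return sorted(
            other.key
            for _, other in self._edges.get(entity, ())
            if other.domain == domain
        )

    def suppliers_reaching(self, market: Entity) -> list:
        """Suppliers of any product that is linked to the given market."""
        products = [e for _, e in self._edges.get(market, ()) if e.domain == "product"]
        return sorted({s for p in products for s in self.related(p, "supplier")})


store = RelationshipStore()
bike = Entity("product", "trail-bike")
store.relate(Entity("supplier", "frame-co"), "supplies", bike)
store.relate(bike, "sold_in", Entity("market", "uk"))
```

Because products, suppliers and markets sit in one store, a question such as “which suppliers reach the UK market?” is a short walk over existing links rather than a join across separate single-domain hubs.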

The ability to represent relationships with modern visualizations produces a degree of transparency and insight that is also an integral part of managing those relationships in multi-domain MDM. “It’s the ability to visualize information in a graphical manner,” Marcant observed. The charts and linear connections facilitated by competitive multi-domain MDMs exist across those domains, and are applicable to myriad use cases. “Being able to visualize relationships between people and organizations and departments is important,” Marcant said. “If you do a merger and acquisition you want to literally see on your screen this chart and be able to map a node to another node.” The visual manifestations of those relationships are a pivotal output of the dearth of modeling constraints when deploying multi-domain MDM.

Extending MDM’s Relevancy
Ultimately, the modeling liberties and visualizations associated with multi-domain MDM are responsible for extending the relevancy of Master Data Management systems. That relevancy is broadened with this approach by incorporating the domains that are most apposite to customer or product domains, and by visually rendering that relevance in the form of relationships across domains. It also provides a level of velocity unmatched by conventional point solutions for integration, data quality and governance when deploying single domain MDM hubs. That expedience is heightened with in-memory computing capabilities, which manifest in MDM via quickness in searching, profiling, onboarding and exporting data—and in producing results relevant for certain business processes and functions. “That speed is not only cost-saving in terms of labor, but really what it means down the road is that if you work faster, your product is going to be available for sale earlier,” Marcant mentioned.

Preparing for Digital Transformation
Of all the factors influencing the advantages of multi-domain MDM, digital transformation may indeed be the most formidable. Its impact on customer expectations is that “every single one of the consumers, they think their interactions at the store and online has to be consistent…that what they touch here, is reflected in like fashion there on the screen,” commented Marcant. As such, the management of relationships between the various domains of MDM is a valuable way of implementing that consistency, and of keeping ahead of the developments within and across the domains that yield competitive advantage. Organizations benefit from understanding how geographic location relates to supply chain concerns, and how those in turn influence their customer and product information. This advantage is reinforced (if not produced) by the comprehensive system of reference in multi-domain MDM systems and by their penchant for charting the relationships that exist between the many facets of master data today.

Source by jelaniharper

How Big Data And The Internet Of Things Improve Public Transport In London

Transport for London (TfL) oversees a network of buses, trains, taxis, roads, cycle paths, footpaths and even ferries which are used by millions every day. Running these vast networks, so integral to so many people’s lives in one of the world’s busiest cities, gives TfL access to huge amounts of data. This is collected through ticketing systems as well as sensors attached to vehicles and traffic signals, surveys and focus groups, and of course social media.

Lauren Sager-Weinstein, head of analytics at TfL, spoke to me about the two key priorities for collecting and analyzing this data: planning services, and providing information to customers. “London is growing at a phenomenal rate,” she says. “The population is currently 8.6 million and is expected to grow to 10m very quickly. We have to understand how they behave and how to manage their transport needs.”

“Passengers want good services and value for money from us, and they want to see us being innovative and progressive in order to meet those needs.”

Oyster prepaid travel cards were first issued in 2003 and have since been expanded across the network. Passengers effectively “charge” them by converting real money from their bank accounts into “Transport for London money,” and then swipe the card to gain access to buses and trains. This enables a huge amount of data to be collected about the precise journeys being taken.

Journey mapping

This data is anonymized and used to produce maps showing when and where people are traveling, giving both a far more accurate overall picture, as well as allowing more granular analysis at the level of individual journeys, than was possible before. As a large proportion of London journeys involve more than one method of transport, this level of analysis was not possible in the days when tickets were purchased from different services, in cash, for each individual leg of the journey.

That isn’t to say that integrating state-of-the-art data collection strategies with legacy systems has been easy in a city where public transport has operated since 1829. For example, on London Underground (Tube) journeys passengers are used to “checking in and checking out” – tickets are validated (by automatic barriers) at the start and end of a journey. On buses, however, passengers simply check in. Traditionally, tickets were purchased from the bus driver or inspector for a set fee per journey. There is no mechanism for recording where a passenger leaves the bus and ends their journey – and implementing one would have been impossible without creating an inconvenience to the customer.

“Data collection has to be tied to business operations. This was a challenge to us, in terms of tracking customer journeys,” says Sager-Weinstein. TfL worked with MIT, just one of the academic institutions with which it has research partnerships, to devise a Big Data solution to the problem. “We asked, ‘Can we use Big Data to infer where someone exited?’ We know where the bus is, because we have location data and we have Oyster data for entry,” says Sager-Weinstein. “What we do next is look at where the next tap is. If we see the next tap follows shortly after and is at the entry to a tube station, we know we are dealing with one long journey using bus and tube.”

“This allows us to understand load profiles – how crowded a particular bus or range of buses are at a certain time, and to plan interchanges, to minimize walk times and plan other services such as retail.”
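The inference Sager-Weinstein describes can be sketched in a few lines. The version below pairs each bus tap with the same card’s next tap and, if that tap follows shortly after at a Tube station, guesses that the passenger left the bus at the route stop nearest to that station. The article does not give the time window or say how the exit stop is chosen, so the 30-minute limit, the nearest-stop rule, the flat x/y coordinates and all names here are assumptions for illustration, not TfL’s or MIT’s implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Tap:
    card_id: str
    time: datetime
    mode: str      # "bus" or "tube"
    location: str  # bus route id for bus taps, station name for tube taps


MAX_GAP = timedelta(minutes=30)  # hypothetical cut-off for "shortly after"


def infer_bus_exits(taps, route_stops, station_xy):
    """Return {(card_id, boarding_time): inferred_alighting_stop}.

    route_stops: {route_id: {stop_name: (x, y)}}
    station_xy:  {station_name: (x, y)}
    """
    # Group taps by card, in time order.
    by_card = {}
    for tap in sorted(taps, key=lambda t: (t.card_id, t.time)):
        by_card.setdefault(tap.card_id, []).append(tap)

    inferred = {}
    for card_id, card_taps in by_card.items():
        # Look at each tap together with the tap that follows it.
        for current, nxt in zip(card_taps, card_taps[1:]):
            if current.mode != "bus" or nxt.mode != "tube":
                continue
            if nxt.time - current.time > MAX_GAP:
                continue  # too long a gap to treat as one linked journey
            sx, sy = station_xy[nxt.location]
            stops = route_stops[current.location]
            # Assumed rule: the passenger got off at the stop closest to the station.
            nearest = min(
                stops,
                key=lambda s: (stops[s][0] - sx) ** 2 + (stops[s][1] - sy) ** 2,
            )
            inferred[(card_id, current.time)] = nearest
    return inferred
```

Counting inferred boardings and exits stop by stop is what would then give the load profiles mentioned in the quote above.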

Unexpected events

Big Data analysis also helps TfL respond in an agile way when disruption occurs. Sager-Weinstein cites an occasion where Wandsworth Council was forced to close Putney Bridge – crossed by 870,000 people every day – for emergency repairs.

“We were able to work out that half of the journeys started or ended very close to Putney Bridge. The bridge was still open to pedestrians and cyclists, so we knew those people would be able to cross and either reach their destination or continue their journey on the other side. They either live locally, or their destination is local.”

“The other half were crossing the bridge at the half-way point of their journey. In order to serve their needs we were able to set up a transport interchange and increase bus service on alternate routes. We also sent them personalized messages about how their journey was likely to be affected. It was very helpful that we were able to use Big Data to quantify them.”

This personalized approach to providing travel information is the other key priority for TfL’s data initiatives. “We have been working really hard to really understand what our customers want from us in terms of information. We push information from 23 Twitter accounts and provide online customer services 24 hours a day.”

Personalized travel news

Travel data is also used to identify customers who regularly use specific routes and send tailored travel updates to them. “If we know a customer frequently uses a particular station, we can include information about service changes at that station in their updates. We understand that people are hit by a lot of data these days and too much can be overwhelming so there is a strong focus on sending data which is relevant,” says Sager-Weinstein.
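As a simple illustration of the filtering Sager-Weinstein describes, the sketch below counts how often each customer uses each station and keeps only the service changes that affect a station they use frequently. The function names, the input shapes and the three-visit threshold are assumptions made for this example; the article does not describe TfL’s actual rules.

```python
from collections import Counter


def frequent_stations(journeys, min_visits=3):
    """journeys: iterable of (customer_id, station) pairs, one per tap.

    Returns {customer_id: set of stations used at least min_visits times}.
    """
    counts = Counter(journeys)
    result = {}
    for (customer, station), visits in counts.items():
        if visits >= min_visits:
            result.setdefault(customer, set()).add(station)
    return result


def relevant_updates(frequent, service_changes):
    """service_changes: {station: message}.

    Returns {customer_id: sorted list of messages for their frequent stations}.
    Customers with no affected station are left out entirely.
    """
    updates = {}
    for customer, stations in frequent.items():
        messages = sorted(service_changes[s] for s in stations if s in service_changes)
        if messages:
            updates[customer] = messages
    return updates
```

Leaving unaffected customers out of the result is the point: the stated aim is to send less data, but data that is relevant.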

“We use information from the back-office systems for processing contactless payments, as well as Oyster, train location and traffic signal data, cycle hire and the congestion charge. We also take into account special events such as the Tour de France and identify people likely to be in those areas. 83% of our passengers rate this service as ‘useful’ or ‘very useful’.” Not bad when you consider that complaining about the state of public transport is considered a hobby by many British people.

TfL also provides its data through open APIs for use by 3rd party app developers, meaning that tailored solutions can be developed for niche user groups.

Its systems currently run on a number of Microsoft and Oracle platforms, but the organization is looking into adopting Hadoop and other open source solutions to cope with growing data demands. Plans for the future include increasing the capacity for real-time analytics and working on integrating an even wider range of data sources, to better plan services and inform customers.

Big Data has clearly played a big part in re-energizing London’s transport network. But importantly, it is clear that it has been implemented in a smart way, with eyes firmly on the prize. “One of the most important questions is always ‘why are we asking these questions’” explains Sager-Weinstein. “Big Data is always very interesting but sometimes it is only interesting. You need to find a business case.”

“We always try to come back to the bigger questions – growth in London and how we can meet that demand, by managing the network and infrastructure as efficiently as possible.”

To read the full article on Forbes, click here.

Originally Posted at: How Big Data And The Internet Of Things Improve Public Transport In London by analyticsweekpick

June 5, 2017 Health and Biotech analytics news roundup

First analysis of AACR Project GENIE data published: The dataset was released earlier this year. Among other results, the analysis showed that many tumors have mutations that are ‘clinically actionable.’

Database aims to personalize chemotherapy and reduce long-term heart risks: Treatments for breast cancer can result in cardiovascular disease. University of Alberta researchers will make risk profiles for this outcome and match them with genetic information.

Stamford Health’s plunge into analytics has closed gaps, opened new doors: The hospital used Tableau to improve reporting rates and to connect disparate systems.

At Big Data in Biomedicine, reexamining clinical trials in the era of precision health: Traditional trials are expensive and time-consuming, and are not necessarily the best tool for examining certain questions. Researchers may have to use observational studies more and find creative ways to make current studies larger.

Source: June 5, 2017 Health and Biotech analytics news roundup

The New Analytics Professional: Landing A Job In The Big Data Era

Along with the usual pomp and celebration of college commencements and high school graduation ceremonies we’re seeing now, the end of the school year also brings the usual brooding and questions about careers and next steps. Analytics is no exception, and with the big data surge continuing to fuel lots of analytics jobs and sub-specialties, the career questions keep coming. So here are a few answers on what it means to be an “analytics professional” today, whether you’re just entering the workforce, you’re already mid-career and looking to make a transition, or you need to hire people with this background.

The first thing to realize is that analytics is a broad term, and there are a lot of names and titles that have been used over the years that fall under the rubric of what “analytics professionals” do: The list includes “statistician,” “predictive modeler,” “analyst,” “data miner” and — most recently — “data scientist.” The term “data scientist” is probably the one with the most currency – and hype – surrounding it for today’s graduates and upwardly mobile analytics professionals. There’s even a backlash against over-use of the term by those who slap it loosely on resumes to boost salaries and perhaps exaggerate skills.


Labeling the Data Scientist

In reality, if you study what successful “data scientists” actually do and the skills they require to do it, it’s not much different from what other successful analytics professionals do and require. It is all about exploring data to uncover valuable insights often using very sophisticated techniques. Much like success in different sports depends on a lot of the same fundamental athletic abilities, so too does success with analytics depend on fundamental analytic skills. Great analytics professionals exist under many titles, but all share some core skills and traits.

The primary distinction I have seen in practice is that data scientists are more likely to come from a computer science background, to use Hadoop, and to code in languages like Python and R. Traditional analytics professionals, on the other hand, are more likely to come from a statistics, math or operations research background, are likely to work in relational or analytics server environments, and to code in SAS and SQL.

Regardless of the labels or tools of choice, however, success depends on much more than specific technical abilities or focus areas, and that’s why I prefer the term “data artist” to get at the intangibles like good judgment and boundless curiosity around data. I wrote an article on the data artist for the International Institute for Analytics (IIA). I also collaborated with the IIA and Greta Roberts from Talent Analytics to survey a large number of analytics professionals. One of our chief goals in that 2013 quantitative study was to find out whether analytics professionals have a unique, measurable mind-set and raw talent profile.

A Jack-of-All-Trades

Our survey results showed that these professionals indeed have a clear, measurable raw talent fingerprint that is dominated by curiosity and creativity; these two ranked very high among 11 characteristics we measured. They are the qualities we should prioritize alongside the technical bona fides when looking to fill jobs with analytics professionals. These qualities also happen to transcend boundaries between traditional and newer definitions of what makes an analytics professional.

This is particularly true as we see more and more enterprise analytics solutions getting built from customized mixtures of multiple systems, analytic techniques, programming languages and data types. All analytics professionals need to be creative, curious and adaptable in this complex environment that lets data move to the right analytic engines, and brings the right analytic engines to where the data may already reside.

Given that the typical “data scientist” has some experience with Hadoop and unstructured data, we tend to ascribe the creativity and curiosity characteristics automatically (you need to be creative and curious to play in a sandbox of unstructured data, after all). But that’s an oversimplification, and our Talent Analytics/International Institute for Analytics survey shows that the artistry and creative mindset we need to see in our analytics professionals are assets regardless of what tools and technologies they’ll be working with and regardless of what title they have on their business card. This is especially true when using the complex, hybrid “all-of-the-above” solutions that we’re seeing more of today and which Gartner calls the Logical Data Warehouse.

Keep all this in mind as you move forward. The barriers between the worlds of old and new; open source and proprietary; structured and unstructured are breaking down. Top quality analytics is all about being creative and flexible with the connections between all these worlds and making everything work seamlessly. Regardless of where you are in that ecosystem or what kind of “analytics professional” you may be or may want to hire, you need to prioritize creativity, curiosity and flexibility – the “artistry” – of the job.

To read the original article on Forbes, click here.

Source: The New Analytics Professional: Landing A Job In The Big Data Era by analyticsweekpick