The Potential Of Big Data In Africa


Africa may trail the US and Europe in terms of technology, but the gap is closing fast. It seems our economic woes have inspired us to harness certain technologies with more zeal and in more innovative ways. Yup, when necessity said she mothered invention, she wasn’t lying.

For example, mobile banking has enabled the small business merchant to do business in ways they could only dream of during the days of cutthroat loans and complex credit facilities. Another example is backup electricity generation, an area in which I’ll wager we’re ahead of the pack. Who needs nuclear, right? Yes, it’s a small victory, but if the power fails at our own Super Bowl, the ‘inverters’ will keep the lights on while we switch on the ‘big gen’.

Moving on, one area where technology can swiftly bridge the gap is data analytics. Data analytics, often coupled with the term “predictive modelling”, is the rapidly growing discipline of using data gathered in the past to predict what will happen in the future. It sounds crazy, yet it is a science. It is how Netflix suggests interesting TV shows, and how Target knew a teen girl was pregnant before her father did.

The Economist reports cyber security, cloud analytics and data analytics – which all have a symbiotic relationship – as three top tech trends for 2015. These are interesting times indeed. Thanks to increased investment in all three areas, the barriers to accessing big data and analytics have fallen drastically. Literally anyone with an internet connection can have a supercomputer running a thousand miles away, doing their bidding round the clock. Many African IT companies already provide their services this way, and it’s smart: one can now tap into the cloud and big data without factoring power cuts, black-market diesel or expensive internet plans into their expenditure.

Predictive modelling techniques are employed in farming, manufacturing, finance, warfare and oil exploration, to name a few, and there is plenty of untapped potential for them in Africa.

Farmers can be encouraged to subscribe to analytics – or precision agriculture. IBM’s research into precision agriculture highlights the importance of making smarter decisions about planting, fertilizing and harvesting crops, pointing out that 90% of crop losses are due to bad weather. While it might not be as easy as taking your clothes off the line before the rains, knowing weather patterns helps farmers make decisions before a single seed hits the dirt.

Knowing in advance the risk created by adverse weather, farmers are better able to access insurance and, in turn, loans from the bank. This is already being done in Kenya with the Kilimo Salama crop insurance scheme. That’s a whole ecosystem revived.

Retailers such as Konga or Jumia could gain valuable insights into the lives of the thousands of customers at their doorstep. Proper customer segmentation through analytics provides the platform to do exactly that: marketing would become more accurate, with packaging and offers tailored to individual tastes and peculiarities.

Big data and analytics present the African continent with lots of questions, and exciting possibilities. Can predictive models discern between clean and fraudulent mobile money transactions? With that kind of accuracy, we could see an open market and an upsurge in mobile money services, which would lead to a more secure, robust industry.

Big data and analytics may not be a magic wand for these problems, but an awareness of the benefits of analytics, combined with technical expertise, will create tangible value for businesses, industry and the local economy.

Note: This article originally appeared in TechCabal.

Originally Posted at: The Potential Of Big Data In Africa

Feb 22, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Pacman

[ AnalyticsWeek BYTES]

>> Rise of Data Capital by Paul Sonderegger by thebiganalytics

>> SAS enlarges its palette for big data analysis by analyticsweekpick

>> The Reliability and Validity of the Consumer Financial Protection Bureau (CFPB) Complaint Database by bobehayes


[ NEWS BYTES]

>> Hard choices: UT research hints at better breast cancer treatment – Knoxville News Sentinel Under Business Analytics

>> Secrets, statistics and implicit bias – Financial Times Under Statistics

>> Protagonist adds action tools to its narrative analytics for brands – MarTech Today Under Marketing Analytics


[ FEATURED COURSE]

CS229 – Machine Learning


This course provides a broad introduction to machine learning and statistical pattern recognition. … more

[ FEATURED READ]

Thinking, Fast and Slow


Drawing on decades of research in psychology that resulted in a Nobel Prize in Economic Sciences, Daniel Kahneman takes readers on an exploration of what influences thought example by example, sometimes with unlikely wor… more

[ TIPS & TRICKS OF THE WEEK]

Keeping Biases Checked during the last mile of decision making
Today a data-driven leader, data scientist or data-driven expert is constantly put to the test, asked to help their team solve a problem using their skills and expertise. Believe it or not, part of that decision tree is derived from intuition, which adds a bias to our judgement and taints the suggestions. Most skilled professionals understand and handle these biases well, but in a few cases we give in to tiny traps and find ourselves caught in biases that impair judgement. So it is important to keep intuition bias in check when working on a data problem.

[ DATA SCIENCE Q&A]

Q:How do you test whether a new credit risk scoring model works?
A: * Test on a holdout set
* Kolmogorov-Smirnov test

Kolmogorov-Smirnov test:
– Non-parametric test
– Compare a sample with a reference probability distribution or compare two samples
– Quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution
– Or between the empirical distribution functions of two samples
– Null hypothesis (two-samples test): samples are drawn from the same distribution
– Can be modified as a goodness of fit test
– In our case: cumulative percentages of good, cumulative percentages of bad
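
As a rough illustration (not part of the original Q&A), here is how the two-sample KS test might look in Python with SciPy, comparing simulated scores for known good and bad accounts:

```python
# Minimal sketch: simulated scores, not a real scorecard.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
# Hypothetical model scores (higher = safer). A working risk model
# should separate the two populations.
scores_good = rng.normal(loc=0.7, scale=0.10, size=1000)
scores_bad = rng.normal(loc=0.4, scale=0.15, size=200)

# The KS statistic is the maximum distance between the two empirical
# CDFs - the "cumulative percentages of good and bad" described above.
stat, p_value = ks_2samp(scores_good, scores_bad)
print(f"KS statistic: {stat:.3f}, p-value: {p_value:.3g}")
# A large statistic (small p-value) rejects the null hypothesis that
# good and bad scores come from the same distribution, i.e. the model
# discriminates between them.
```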


[ VIDEO OF THE WEEK]

@AnalyticsWeek #FutureOfData with Robin Thottungal(@rathottungal), Chief Data Scientist at @EPA


Subscribe to YouTube

[ QUOTE OF THE WEEK]

It’s easy to lie with statistics. It’s hard to tell the truth without statistics. – Andrejs Dunkels

[ PODCAST OF THE WEEK]

#FutureOfData Podcast: Conversation With Sean Naismith, Enova Decisions


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Every day, the world generates data equivalent to more than 215m high-resolution MRI scans for every person in the world.

Sourced from: Analytics.CLUB #WEB Newsletter

Happy Holidays! Top 10 blogs from 2012

Happy Holidays! 2012 was a year with lots of ups and some downs, but it has not stopped us from moving forward with optimism. 2012 was also the year when I started blogging, and you all were very supportive and shared my blogs. Here is my list of the top 10 blogs of the year, ranked by pageviews. I would like to thank you for the continuous appreciation and motivation that keep me writing. Before that, I would like to take a moment to wish you ALL “A Very Happy Holidays and a year of fun, success, prosperity, great health and peace”.

My TOP 10 Blogs for 2012:

25 Cartoons To Give Current Big Data Hype A Perspective
Have you heard about “Big-Data”? It is an important word nowadays and is driving the enterprise world. Big-data is the new black in enterprises today….more

Lessons From Apple Retail Stores, Gold for Others
Being an Apple fanboy, all I can see is how Apple is energizing the current market landscape. It is pioneering in areas around its product lineup, its App Store, as well as its retail stores. Everywhere you look, there are boatloads of lessons to learn from Apple. The Apple Retail store is no different. …more

Top 5 Lessons LEGO story teaches an entrepreneur
LEGO launched a neat video a few days back narrating its story. Not only is it nicely done, but it also encapsulates a boatload of management and startup lessons. I have listed the top 5 lessons that popped out for me in this video….more

Mobile Strategy For Brick-Mortar Stores And How Retailers Are Doing It All Wrong
No, we are not talking about the retailer’s overall mobile strategy but a sub-part of it, that is, mobile strategy for brick-and-mortar stores. Yes, it is different from the overall mobile strategy for the retailer, and no, it cannot be done correctly without thinking differently from online store strategy….more

How Retailer Should Use QR Code To Hit The Pot Of Gold
Before we dive into the topic, I want to take a step back and explain what a QR Code is: a QR Code (abbreviated from Quick Response Code) is the trademark for a type of matrix barcode (or two-dimensional code) first designed for the automotive industry. …more

10 must have books for a successful entrepreneur’s bookshelf
The entrepreneurial journey is pretty much on-the-job learning, and there is no replacement for that. But understanding some common pitfalls, and how other successful entrepreneurs overcame them, is also crucial. …more

7 Deadly Sins of Total Customer Experience
Total customer experience is achieved by forming a firm 360-degree grip around customer experience. This requires firm commitment and discipline from participating stakeholders. …more

Social Tools to Attract Local Shoppers
So, what are some social tools that retailers should use? The following is a list of social tools that every retailer should know about. It is not a complete list, but it should serve as a starting point for learning which tools exist in which categories. …more

10 Commandments to achieve Total Customer Experience
It is not difficult to understand the role of the customer in any business. Customers play a pivotal role and are solely responsible for making any business successful. To sustain a profitable life, businesses should make sure to serve customers in every possible way and provide them with the experience of a lifetime….more

Must read quotes from Steve Jobs for Entrepreneurs
As we all know, Steve Jobs was always described as an innovator, a visionary; and rightly so. He dropped out of college at the age of 21 and started Apple with his friend Steve Wozniak from his parents’ garage. …more

Source: Happy Holidays! Top 10 blogs from 2012

Big Data – What it Really Means for VoC and Customer Experience Professionals

I had the privilege of delivering a talk on the topic of Big Data on May 15, 2013 in Las Vegas as part of VOCFusion, the world’s largest voice of the customer event. I would like to thank Allegiance for hosting and organizing this excellent educational event for our industry peers. I highly recommend this multi-day event for VoC and customer experience professionals.

The goal of my talk was to illustrate how the field of Big Data applies to voice of the customer and customer experience professionals. In my talk, I start with a brief overview of the area we call Big Data. Using real customer experience data from US hospitals (Thanks, Medicare), I illustrate why the integration of different data sources (e.g., operational, financial, customer feedback) provides deeper insights than each data source alone. Additionally, I talk about how to select your first Big Data project (Thanks, IBM Big Data), the importance of veracity (the fourth, and, I think, the most important V of Big Data), objective customer loyalty metrics, and three broad implications of Big Data for the field of customer experience management. You can view the slides below.

http://www.slideshare.net/bobehayes/big-data-what-it-really-means-for-voc-and-customer-experience-professionals

Originally Posted at: Big Data – What it Really Means for VoC and Customer Experience Professionals by bobehayes

Feb 15, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Weak data


[ NEWS BYTES]

>> When Is Haskell More Useful Than R Or Python In Data Science? – Forbes Under Data Science

>> We’re Entering The Horniest Season Of The Year, According To The Australian Bureau Of Statistics – Junkee Under Statistics

>> The Amazon juggernaut has traders making record bets against America’s largest grocer – Markets Insider Under Financial Analytics


[ FEATURED COURSE]

Baseball Data Wrangling with Vagrant, R, and Retrosheet


Analytics with the Chadwick tools, dplyr, and ggplot…. more

[ FEATURED READ]

Thinking, Fast and Slow


Drawing on decades of research in psychology that resulted in a Nobel Prize in Economic Sciences, Daniel Kahneman takes readers on an exploration of what influences thought example by example, sometimes with unlikely wor… more

[ TIPS & TRICKS OF THE WEEK]

Winter is coming, warm your Analytics Club
Yes and yes! As we head into winter, what better way to stay warm than to talk about our increasing dependence on data analytics in decision making. Data- and analytics-driven decision making is rapidly sneaking its way into our core corporate DNA, yet we are not churning out practice grounds fast enough to test those models. Such snug-looking models have hidden nails that can induce uncharted pain if they go unchecked. This is the right time to start thinking about putting an Analytics Club [a Data Analytics CoE] in your workplace, to lab out best practices and provide a test environment for those models.

[ DATA SCIENCE Q&A]

Q:What is the life cycle of a data science project?
A: 1. Data acquisition
Acquiring data from both internal and external sources, including social media and web scraping. In a steady state, data extraction routines should be in place, and new sources, once identified, would be acquired following the established processes.

2. Data preparation
Also called data wrangling: cleaning the data and shaping it into a suitable form for later analyses. Involves exploratory data analysis and feature extraction.

3. Hypothesis & modelling
As in data mining, but with all the data rather than a sample. Machine learning techniques are applied to the full data set. A key sub-step is model selection: preparing a training set for the model candidates, plus validation and test sets for comparing model performance, selecting the best-performing model, gauging model accuracy and preventing overfitting.

4. Evaluation & interpretation

Steps 2 to 4 are repeated a number of times as needed; as the understanding of the data and the business becomes clearer and results from initial models and hypotheses are evaluated, further tweaks are performed. These may sometimes include step 5 and be performed in pre-production.

5. Deployment

6. Operations
Regular maintenance and operations. Includes performance tests to measure model performance, and can alert when performance degrades beyond an acceptable threshold

7. Optimization
Can be triggered by failing performance, by the need to add new data sources and retrain the model, or even to deploy new versions of an improved model

Note: with increasing maturity and well-defined project goals, pre-defined performance criteria can help evaluate the feasibility of a data science project early in the life cycle. This early comparison helps the team refine hypotheses, discard the project if it is non-viable, or change approaches.
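
To make the model-selection sub-step in stage 3 concrete, here is a minimal sketch (an illustration with synthetic data, not the source’s prescribed tooling) of splitting data into training, validation and test sets and picking the better of two candidate models:

```python
# Minimal sketch: synthetic data and arbitrary candidate models.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
# 60% train, 20% validation (model selection), 20% test (final accuracy).
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
}
for model in candidates.values():
    model.fit(X_train, y_train)

# Select on the validation set, never on the test set...
best_name = max(candidates, key=lambda n: candidates[n].score(X_val, y_val))
# ...then gauge accuracy on data that played no part in training or selection.
print(best_name, "test accuracy:", candidates[best_name].score(X_test, y_test))
```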



[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Nathaniel Lin (@analytics123), @NFPA


Subscribe to YouTube

[ QUOTE OF THE WEEK]

In God we trust. All others must bring data. – W. Edwards Deming

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with @MPFlowersNYC, @enigma_data


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

40% projected growth in global data generated per year vs. 5% growth in global IT spending.

Sourced from: Analytics.CLUB #WEB Newsletter

8 Ways Big Data and Analytics Will Change Sports

The leading minds in sports convened in Boston last week at the annual MIT Sloan Sports Analytics Conference to share ideas about how big data will be a game-changer for fans, players, coaches, officials and front-office personnel.

Analytics and big data have potential in many industries, but they are on the cusp of scoring major points in sports. From coaches and players to front offices and businesses, analytics can make a difference in scoring touchdowns, signing contracts or preventing injuries.

Coaches, players and the leading minds in sports came together to discuss the potential of analytics and big data last week at the 2014 MIT Sloan Sports Analytics Conference in Boston. Here are eight ways data analytics can improve efficiency, accuracy and profitability in sports. Who knows? Big data may even eliminate blown calls one day.

1. Better Precision in the Strike Zone

In baseball, Pitchf/x technology from Sportvision has been installed in all 30 Major League Baseball stadiums to track pitches during games. Sportvision has a suite of other technologies for baseball, football and motor sports. However, nothing has replaced the judgment calls umpires have to make at the plate in real time, says Hank Adams, CEO of Sportvision. “Sportvision technology is being adapted to use for referees and umpires. We can very accurately determine if something is a strike or a ball.”

For now, umpires still rely on the naked eye to call a strike or ball and until the technology or baseball rules evolve, catchers like Jose Molina will still be able to game the system.

2. More Resources for Analytics Buffs

On the fan side, statistic enthusiasts have a slew of websites they can visit to see breakdowns of their favorite players and slices and dices of specific games and plays.

“We take that data and organize it to make it understandable to average people. We can see how a pitcher’s performance has changed in a certain game. Or pull up a map of what an umpire’s strike/ball calls are and see the strike zone’s shape and size,” says Dan Brooks, founder and lead developer of BrooksBaseball.net, a website that makes sports statistics digestible for sports fans.

3. Data From Wearable Technologies

Many technology vendors are trying to get into the wearable technology market, given the interest in devices like Google Glass and fitness trackers.

Adidas has a system called miCoach that works by having players attach a wearable device to their jerseys. Data from the device shows the coach who the top performers are and who needs rest. It also provides real-time stats on each player, such as speed, heart rate and acceleration.

That kind of real-time data could help trainers and physicians plan for better training and conditioning. Matt Hasselbeck, a quarterback for the Indianapolis Colts, says he’s in favor of technology that helps with player safety. “[By] checking hydration levels or tracking hits to the head, we could start collecting that data now to make sports safer.”

4. Live on the Field Data Collection

Currently, lots of data is collected manually during games and sports competitions, but much of the live data moves so fast that it’s a moment lost in time. One company trying to log more of the live data is Zebra Technologies. The company makes RFID tags, as part of its MotionWorks Sports Solution, that attach to equipment, balls and players to track movement, distance and speed. The tags blink 25 times per second and deliver data in 120 milliseconds. Another company, SportVU, has six cameras in each NBA arena that collect data on the movements of every player and of the basketball 25 times per second.

5. Predictive Insight Into Fan Preferences

Analytics can advance the sports fans’ experience as teams and ticket vendors compete with the at-home experience — the better they know their fans, the better they can cater to them.


“It’s about knowing when a fan is interested in an opposing team coming to town or whether a 4 p.m. game is not too late for them. It’s about hitting them with that communication when they are in the decision mindset and giving season ticket holders more incentive to keep coming and retain their tickets,” says John Forese, senior vice president and general manager of LiveAnalytics, a LiveNation data, analytics and research company.

Many sports teams, such as the New England Patriots, are also trying to predict the wants and needs of fans with team specific mobile apps that provide special content, in-seat concession ordering and bathroom wait times.

6. Career Opportunities for the Blended Sports Fan and Numbers Whiz

Bryan Colangelo, former general manager and president of the Toronto Raptors, says teams should hire data analytics specialists in front offices to handle the data transmitted from new technologies and devices. “There are mountains of opportunity in analytics now. If you’re not spending $250K and having two to three people dedicated to it full time, you’re probably too light on it.”

Paraag Marathe, president of the San Francisco 49ers, says data needs to be digestible so players and coaches can use it to make split-second decisions. “If [data] is not synthesized in a way that a QB or coach can use it, then it’s useless.”

7. Influence Coaching Decisions

Data analysts could help deliver the most important data sets to coaches for better results on the field. Brian Burke, founder of the website Advanced NFL Stats, says data could help coaches and players make more informed decisions that could decide wins and losses.

“In football, the low hanging fruit in analytics is in coaches’ decision making. Things like punting on 4th and 1 used to make sense but maybe not anymore,” Burke says. “Offenses are better and it’s easier to get 2 point conversions.”

8. Build Arguments for Contract Negotiations

Marathe of the 49ers says good data insights can make or break a player being hired or a coach being fired. “In contract negotiations, both sides are using data and everyone is trying to find evidence that supports whatever contract demand they want to make. They can slice it in any way that helps them.” In fact, Adam Silver, commissioner of the National Basketball Association, says analytics played a role in helping end the player lockout in 2012.

Lauren Brousell is a staff writer for CIO magazine. Follow her on Twitter @LBrousell. Follow everything from CIO.com on Twitter @CIOonline, Facebook, Google + and LinkedIn.

Originally posted via “8 Ways Big Data and Analytics Will Change Sports”


New MIT algorithm rubs shoulders with human intuition in big data analysis

We all know that computers are pretty good at crunching numbers. But when it comes to analyzing reams of data and looking for important patterns, humans still come in handy: We’re pretty good at figuring out what variables in the data can help us answer particular questions. Now researchers at MIT claim to have designed an algorithm that can beat most humans at that task.


Max Kanter, who created the algorithm as part of his master’s thesis at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) along with his advisor Kalyan Veeramachaneni, entered the algorithm into three major big data competitions. In a paper to be presented this week at the IEEE International Conference on Data Science and Advanced Analytics, they announced that their “Data Science Machine” had beaten 615 of the 906 human teams it came up against.

The algorithm didn’t get the top score in any of its three competitions. But in two of them, it created models that were 94 percent and 96 percent as accurate as those of the winning teams. In the third, it managed to create a model that was 87 percent as accurate. The algorithm used raw datasets to make models predicting things such as when a student would be most at risk of dropping an online course, or what indicated that a customer during a sale would turn into a repeat buyer.

Kanter and Veeramachaneni’s algorithm isn’t meant to throw human data scientists out — at least not anytime soon. But since it seems to do a decent job of approximating human “intuition” with much less time and manpower, they hope it can provide a good benchmark.


“If the Data Science Machine performance is adequate for the purposes of the problem, no further work is necessary,” they wrote in the study.

That might not be sufficient for companies relying on intense data analysis to help them increase profits, but it could help answer data-based questions that are being ignored.

“We view the Data Science Machine as a natural complement to human intelligence,” Kanter said in a statement. “There’s so much data out there to be analyzed. And right now it’s just sitting there not doing anything. So maybe we can come up with a solution that will at least get us started on it, at least get us moving.”

This post has been updated to clarify that Kalyan Veeramachaneni also contributed to the study. 


Source: New MIT algorithm rubs shoulders with human intuition in big data analysis by analyticsweekpick

Feb 08, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Insights

[ AnalyticsWeek BYTES]

>> #OpenAnalyticsDay: A Day for Analytics by v1shal

>> Analytics and its teen years by v1shal

>> Jun 01, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin


[ NEWS BYTES]

>> HTC only Taiwan firm among top 100 world most-loved brands – Focus Taiwan News Channel Under Social Analytics

>> The evolving state of enterprise middleware – SDTimes.com Under Hadoop

>> Learn details of Canada’s condiment sauces market historic and … – WhaTech Under Sales Analytics


[ FEATURED COURSE]

Lean Analytics Workshop – Alistair Croll and Ben Yoskovitz


Use data to build a better startup faster in partnership with Geckoboard… more

[ FEATURED READ]

How to Create a Mind: The Secret of Human Thought Revealed


Ray Kurzweil is arguably today’s most influential—and often controversial—futurist. In How to Create a Mind, Kurzweil presents a provocative exploration of the most important project in human-machine civilization—reverse… more

[ TIPS & TRICKS OF THE WEEK]

Data aids, not replace judgement
Data is a tool and a means to help build consensus and facilitate human decision-making, not replace it. Analysis converts data into information; information, via context, leads to insight. Insights lead to decision making, which ultimately leads to outcomes that bring value. So, data is just the start; context and intuition play a role.

[ DATA SCIENCE Q&A]

Q:Examples of NoSQL architecture?
A: * Key-value: in a key-value NoSQL database, all of the data within consists of an indexed key and a value. Cassandra, DynamoDB
* Column-based: designed for storing data tables as sections of columns of data rather than as rows of data. HBase, SAP HANA
* Document Database: map a key to some document that contains structured information. The key is used to retrieve the document. MongoDB, CouchDB
* Graph Database: designed for data whose relations are well-represented as a graph and has elements which are interconnected, with an undetermined number of relations between them. Neo4j
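
As a rough, plain-Python illustration (not from the source; the store layouts, records and names are made up), here is how the same user record might be shaped under each of the four models. Real systems such as Cassandra, HBase, MongoDB and Neo4j add distribution, indexing and query languages on top:

```python
# Key-value: an indexed key mapping to an opaque value.
kv_store = {"user:42": b'{"name": "Ada", "city": "Lagos"}'}

# Column-based: data addressed by (row key, column) within a column family.
column_store = {"users": {"42": {"name": "Ada", "city": "Lagos"}}}

# Document: the key retrieves a structured document that can be queried.
document_store = {"42": {"name": "Ada", "address": {"city": "Lagos"}, "tags": ["admin"]}}

# Graph: nodes plus typed, open-ended relations between them.
graph = {
    "nodes": {"42": {"name": "Ada"}, "99": {"name": "Grace"}},
    "edges": [("42", "FOLLOWS", "99")],
}

print(document_store["42"]["address"]["city"])  # -> Lagos
```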


[ VIDEO OF THE WEEK]

@AnalyticsWeek: Big Data at Work: Paul Sonderegger


Subscribe to YouTube

[ QUOTE OF THE WEEK]

I keep saying that the sexy job in the next 10 years will be statisticians. And I’m not kidding. – Hal Varian

[ PODCAST OF THE WEEK]

@AngelaZutavern & @JoshDSullivan @BoozAllen discussed Mathematical Corporation #FutureOfData


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Decoding the human genome originally took 10 years to process; now it can be achieved in one week.

Sourced from: Analytics.CLUB #WEB Newsletter

“Putting Data Everywhere”: Leveraging Centralized Business Intelligence for Full-Blown Data Culture

For most organizations, the introduction to the holistic organizational transformation required for implementing data culture begins with business intelligence. According to Looker Chief Data Evangelist Daniel Mintz, however, the efficacy of even the most powerful analytics yielded from BI tools has traditionally been hampered by one onerous obstacle:

“The reality is if you make data hard to find or hard to access, more often than not people are just going to move ahead without it. And, the barriers that it takes to impede people’s access to it are not huge.”

Eliminating obstacles for accessing data, which facilitates a democratization of insights that propagates data-driven culture, is readily achieved by leveraging a centralized form of BI that effectively accomplishes what Mintz referred to as “putting data everywhere.”

That objective includes delivering analytics to the three tiers of enterprise users reliant on BI, doing so in a well-governed fashion predicated on a uniform data model, and systematically increasing the confidence and conviction of the perception of data throughout organizations.

Governed Business Intelligence

The need for a centralized approach to BI tools was largely spawned from the self-service movement that once threatened to fragment it. According to Mintz, an ingress of self-service BI tools in the early years of the previous decade was responsible for both expediting insight by decreasing lengthy reliance on IT, and eliminating “the central governance, the idea of everybody coming together and agreeing what the data means and keeping that in one place so that everyone’s working off the same playbook.”

Nonetheless, organizations can considerably reduce their physical infrastructure while simplifying their pipeline by deploying centralized data platforms that include BI tools that enable the results of analytics to readily migrate throughout the enterprise. The primary benefits of this approach involve reduced costs, less siloed tools for BI, and a streamlined pipeline that encompasses any variety of workflows. But as Mintz observed, another distinct advantage of this methodology is reinforced data governance in which “everyone has access to the data but it’s reliable, and used to make better decisions.”

Centralized Data Modeling

The governance boons of centralized BI are based on a unified model that exists throughout the entire platform. This consistency in the terms, definitions, and semantics of data is largely preferable to situations in which “I have one tool for pipeline and one tool for federation and then another tool for transformation, and then I have this visualization tool and it’s just a very complicated pipeline,” Mintz noted. Utilizing singular platforms that account for all of these aspects of analytics ingrains a solitary data model for these myriad components of the pipeline–even across business units as needed–and evolves to accommodate new requirements or additions. “That’s really core to the platform,” Mintz revealed. “That idea of having language that can transform data from raw data that is not that useful into information and knowledge about business, because that’s what people need.” Piecemeal, point solutions that attempt to address these issues frequently fall victim to data chaos that obfuscates governance and reduces data’s utility over the long term.

From ETL to ELT

Gains with a centralized form of business intelligence naturally lend themselves to transformation. The Extract, Transform, and Load (ETL) method of transformation has historically been used with data warehousing. However, this approach requires transformation prior to loading data into a warehouse, and it can take time to determine, as Mintz put it, how best to “transform it into something useful and try to get it into a star schema.” There are distinct advantages to reversing some of the core tenets of the ETL model into an Extract, Load and Transform (ELT) model, which allows data to get to the warehouse quicker than the former method does. Mintz mentioned that Looker’s centralized approach “has such transformation capabilities, and the data warehouses that we’re sitting on are so powerful,” that users can replace ETL with ELT. In this case, “Rather than using a complex process to do that transformation before the data gets into their warehouse, instead we’re seeing that they just need a pipeline to get that data out of the source and into their warehouse,” Mintz explained. “And then they can get the transformation once the data is in their warehouse.”
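
A minimal sketch of the ELT pattern (an illustration only; the table names and data are invented, and sqlite3 stands in for the cloud warehouse Looker would actually sit on): raw rows are loaded untransformed, and the transformation then runs as SQL inside the warehouse itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the warehouse

# Extract + Load: raw source rows land in the warehouse as-is.
conn.execute("CREATE TABLE raw_orders (customer TEXT, amount_cents INTEGER)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [("ada", 1250), ("ada", 800), ("grace", 4300)],
)

# Transform: performed after loading, using the warehouse's own SQL engine.
conn.execute("""
    CREATE TABLE order_totals AS
    SELECT customer, SUM(amount_cents) / 100.0 AS total_dollars
    FROM raw_orders
    GROUP BY customer
""")

for row in conn.execute("SELECT * FROM order_totals ORDER BY customer"):
    print(row)  # ('ada', 20.5) then ('grace', 43.0)
```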

Self-Service Data Culture

Perhaps the true value found in centralized BI tools pertains to the degree of self-service they contain, which in turn is a key requisite for perpetuating a truly data-centric culture. According to Mintz, there have traditionally been three constituencies of data users, each of whom benefit from centralized BI in their own ways.

  • Data Analysts: Mintz characterized these users as “The top of the pyramid. They’re the ones who already have access to the data, they generally know SQL.” These users can leverage centralized BI platforms in a self-service manner without relying on IT to write queries and take inordinate time to return the results. “Now they can do something once, and then help everybody else in the company to self-serve and that saves their time because now they can concentrate on the hard problems,” Mintz said.
  • Data Explorers: According to Mintz, these users are “The middle slice of the pyramid. Those are people who are analytically minded, who think in numbers and are probably great at Excel, but don’t have enough of SQL to help themselves to the data.” The self-service nature of centralized BI with a unified data model that pushes results throughout the enterprise enables such users to “Open this door because they had all these questions and now they can get their own answers,” Mintz said. “And then when the answer raises another question they can get their own answer.”
  • Data Consumers: Mintz described these workers as “The base of the pyramid. These are people who need data in their daily jobs, who need a dashboard or a report, but often don’t feel comfortable exploring the data on their own.” These users benefit from having all of the data they need in a centralized location, as opposed to having to attempt to manually or visually aggregate results from multiple dashboards coming from myriad sources.

Becoming Data Driven

The ultimate gain from deploying centralized BI that is self-serviced throughout the enterprise is that it empowers movement up the proverbial pyramid, so that users can transcend their positions and become higher categories of data users. “By lowering that barrier to entry and giving people a starting place, we see a lot of data consumers moving up to the data explorer world very quickly,” Mintz remarked. However, the usefulness of the democratized access to BI, and to data in general, that such an approach engenders rests on the enduring reality that it occurs in a centralized, governed framework, one that avoids the data chaos that hampered early attempts at self-service.

“For the data analyst, Looker is actually really critical because they know I’m the one that curates that data,” Mintz said. “These people aren’t making up their own business metrics, I’m giving them the business metrics. When they need lifetime customer value, they click a link that says lifetime customer value. And I know that they’re using the definition that we all agreed on, and not freelancing.”

Originally Posted at: “Putting Data Everywhere”: Leveraging Centralized Business Intelligence for Full-Blown Data Culture

Feb 01, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Fake data

[ AnalyticsWeek BYTES]

>> The Hidden Bias in Customer Metrics by bobehayes

>> A Big Data App That Helps You Find A Parking Spot by analyticsweekpick

>> Nate-Silvering Small Data Leads to Internet Service Provider (ISP) industry insights by bobehayes


[ NEWS BYTES]

>> Webinar: Analytics-Based Sales: The New Era of B2B Growth – IndustryWeek Under Sales Analytics

>> 8 Ways You Can Succeed In A Machine Learning Career – Forbes Under Machine Learning

>> Using AI To Improve Customer Experience – Forbes Under Customer Experience


[ FEATURED COURSE]

R Basics – R Programming Language Introduction


Learn the essentials of R Programming – R Beginner Level!… more

[ FEATURED READ]

Rise of the Robots: Technology and the Threat of a Jobless Future


What are the jobs of the future? How many will there be? And who will have them? As technology continues to accelerate and machines begin taking care of themselves, fewer people will be necessary. Artificial intelligence… more

[ TIPS & TRICKS OF THE WEEK]

Strong business case could save your project
Like anything in corporate culture, a project is oftentimes about the business, not the technology. The same thinking applies to data analysis: it’s not always about the technicality but about the business implications. Data science project success criteria should therefore include project management success criteria. This will ensure smooth adoption, easy buy-ins, room for wins and cooperating stakeholders. So, a good data scientist should also possess some qualities of a good project manager.

[ DATA SCIENCE Q&A]

Q:What is a decision tree?
A: 1. Take the entire data set as input
2. Search for a split that maximizes the “separation” of the classes. A split is any test that divides the data in two (e.g. if variable2>10)
3. Apply the split to the input data (divide step)
4. Re-apply steps 1 to 2 to the divided data
5. Stop when you meet some stopping criteria
6. (Optional) Clean up the tree when you went too far doing splits (called pruning)

Finding a split: methods vary, from greedy search (e.g. C4.5) to randomly selecting attributes and split points (random forests)

Purity measure: information gain, Gini coefficient, Chi Squared values

Stopping criteria: methods vary from minimum size, particular confidence in prediction, purity criteria threshold

Pruning: reduced error pruning, out of bag error pruning (ensemble methods)
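
A minimal sketch of this recipe using scikit-learn (an illustration; the Q&A itself is library-agnostic): Gini impurity as the purity measure for finding splits, with max_depth and min_samples_leaf as stopping criteria in place of post-pruning.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(
    criterion="gini",      # purity measure ("entropy" gives information gain)
    max_depth=3,           # stopping criterion: maximum depth
    min_samples_leaf=5,    # stopping criterion: minimum leaf size
    random_state=0,
)
tree.fit(X_train, y_train)

# Each printed line is one "if variable <= threshold" split from step 2.
print(export_text(tree))
print("test accuracy:", tree.score(X_test, y_test))
```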


[ VIDEO OF THE WEEK]

@AnalyticsWeek Panel Discussion: Health Informatics Analytics


Subscribe to YouTube

[ QUOTE OF THE WEEK]

It’s easy to lie with statistics. It’s hard to tell the truth without statistics. – Andrejs Dunkels

[ PODCAST OF THE WEEK]

@BrianHaugli @The_Hanover on Building a #Leadership #Security #Mindset #FutureOfData #Podcast

 @BrianHaugli @The_Hanover ?on Building a #Leadership #Security #Mindset #FutureOfData #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

14.9 percent of marketers polled in Crain’s BtoB Magazine are still wondering ‘What is Big Data?’

Sourced from: Analytics.CLUB #WEB Newsletter