The Last Layer of Cyber Security: Business Continuity and Disaster Recovery with Incremental Backups

Due to burgeoning regulatory penalties and a seemingly interminable number of threats, cyber security is a foremost concern for the contemporary enterprise.

Oftentimes, the most dependable protection involves combining methods and technologies to preserve the integrity of IT systems and their data. In this regard, an organization’s security stack is often as important as its technology stack.

According to BackupAssist Chief Executive Officer Linus Chang, there is considerable advantage to topping the former with reliable, systematic backups to ensure business continuity.

“A few years ago, people were saying that backup is dead, people are moving to the cloud, it’s all about high availability and so on,” Chang mentioned. “But these new regulations, and especially the onset of crypto ransomware, has really brought back the spotlight for having multiple layers of protection for data. There’s a lot of interest in keeping historical versions of data, and having backups as the last layer of protection should perimeter security fail.”

Ransomware
The advent of cryptographic ransomware makes for a compelling use case to preserve backup copies of information assets, and serves to highlight the various areas in which business continuity impacts the enterprise. Business continuity involves elements of cyber security, network availability, organizational risk, and disaster recovery—meaning that timely backups assist in each of these areas as well. Chang referenced an occurrence in which users “were struck by ransomware and didn’t have a proper backup system. They never got the data back and had to re-key in six weeks of data. Having backups would have minimized that to two, three hours maximum.” Modern backup systems are able to address instances of ransomware (and other types of malware) in three ways:

  • Protection—Protective capabilities can help ward off ransomware or malware attacks.
  • Detection—Scanning mechanisms detect the large-scale corruption or file-system modifications typical of ransomware; such routine scanning is a component of the backup process (a minimal sketch follows this list).
  • Response—Once ransomware or infected files are detected, backup jobs are halted and prevented from running on those files, and alerts notify users of the infected files.
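BackupAssist’s actual detection logic isn’t public; below is a minimal sketch, assuming a simple heuristic that suspends a backup job when an unusually large share of files has changed since the last run (the thresholds and function names here are hypothetical):

```python
import os
import time

# Hypothetical thresholds: flag a job when an unusually large share of files
# changed since the last backup, a pattern typical of crypto ransomware.
CHANGED_RATIO_THRESHOLD = 0.5  # more than 50% of files modified
MIN_FILES = 100                # ignore very small data sets

def mass_modification_detected(root, last_backup_time):
    """Return True if the file tree shows ransomware-like mass modification."""
    total = changed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += 1
            try:
                if os.path.getmtime(os.path.join(dirpath, name)) > last_backup_time:
                    changed += 1
            except OSError:
                continue  # file vanished mid-scan
    return total >= MIN_FILES and changed / total > CHANGED_RATIO_THRESHOLD

# Response step: halt the job and alert the user instead of overwriting
# good backups with freshly encrypted files.
if mass_modification_detected("/srv/data", last_backup_time=time.time() - 86400):
    print("ALERT: possible ransomware activity; backup job suspended.")
```

Real products combine several such signals (file entropy, known ransomware extensions, honeypot files) rather than a single ratio.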

High Availability
Network availability is an integral aspect of business continuity, particularly in the event of security breaches or disaster recovery. Nonetheless, there are a few key differences between conventional high availability methods (which frequently involve redundancy and cloud failover capabilities) and those provided by timely backups. According to Chang, “High availability is about minimizing downtime. Commonly you would say three nines [99.9%], four nines [99.99%], five nines [99.999%] of availability. Over the course of a year you might be down for 20 or 30 minutes.” Backing up data can also decrease downtime in certain network failure events. Still, backups offer benefits beyond availability. “Backup is about being able to restore whole systems to get back historical data,” Chang noted. “High availability only talks about the current version of data. It doesn’t talk about being able to pull something back from two years ago.” Thus, when regulators or legal discovery measures demand data a company may have possessed years ago, data backups—not high availability—provide the solution. As such, backups are necessary for business continuity. “Business continuity’s all about the business and do they have the data when they need it,” Chang explained.
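The “nines” Chang cites translate directly into an annual downtime budget; this illustrative arithmetic (not from the article) shows why:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

for nines in (3, 4, 5):
    availability = 1 - 10 ** -nines          # 0.999, 0.9999, 0.99999
    downtime_min = MINUTES_PER_YEAR * (1 - availability)
    print(f"{nines} nines ({availability:.3%}): ~{downtime_min:.1f} min/year down")

# 3 nines -> ~525.6 min (~8.8 h); 4 nines -> ~52.6 min; 5 nines -> ~5.3 min.
# Chang's "20 or 30 minutes" a year sits between four and five nines.
```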

Flexibility
Viewed from a business continuity perspective, backups require a degree of flexibility to serve an ever-evolving ecosystem of enterprise data needs. Cloud backups are the most commonly deployed variety. However, there are certain situations in which backing up data to local, physical storage (typically on disk) is preferable to cloud backups. “If you’ve got small data sets, then absolutely it’s more feasible to put that data in the cloud,” Chang said. “When you’ve got large datasets and complete servers that you need to get up after a disaster, you need to bring it back up for full metal disaster recovery, then it’s always faster to do that when it’s stored on a hard disk at your local office. Imagine downloading terabytes of data from the cloud. It’s just too slow.” Other backup options involve cold storage backups, in which copies of data are “disconnected from the computer and the network,” Chang commented. Best practices include storing backups in multiple locations, utilizing cold storage, and leveraging ubiquitous file formats as opposed to proprietary vendor formats that are difficult to access once software or file versions have progressed.

Secure Implementations
In addition to investing in measures to fortify perimeter security, it is increasingly necessary to preserve data with backups for a variety of use cases. Doing so is instrumental to business continuity in all of its facets, including high availability, risk, cyber security, and disaster recovery.

Source: The Last Layer of Cyber Security: Business Continuity and Disaster Recovery with Incremental Backups by jelaniharper

Jul 25, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)


[  COVER OF THE WEEK ]

[Cover image: Data shortage]


[ AnalyticsWeek BYTES]

>> Choosing an Analytics Platform: 3 Factors to Consider for Your Application by analyticsweek

>> Making Sense of the 2018 Gartner Magic Quadrant for Data Integration Tools by analyticsweekpick

>> From Storage to Data Virtualization by analyticsweekpick


[ FEATURED COURSE]

Baseball Data Wrangling with Vagrant, R, and Retrosheet


Analytics with the Chadwick tools, dplyr, and ggplot…. more

[ FEATURED READ]

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython


Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored f… more

[ TIPS & TRICKS OF THE WEEK]

Save yourself from a zombie apocalypse of unscalable models
One living, breathing zombie in today’s analytical models is the glaring absence of error bars. Not every model is scalable or holds its ground as data grows. The error bars attached to almost every model should be duly calibrated: as business models rake in more data, error bars keep them sensible and in check. If error bars are not accounted for, our models become susceptible to failure, leading us to a Halloween we never want to see.

[ DATA SCIENCE Q&A]

Q:Explain what a local optimum is and why it is important in a specific context, such as K-means clustering. What are specific ways of determining if you have a local optimum problem? What can be done to avoid local optima?

A: * A solution that is optimal within a neighboring set of candidate solutions
* In contrast with the global optimum: the optimal solution among all candidates

* K-means clustering context:
It’s proven that the objective cost function always decreases until a local optimum is reached.
Results depend on the initial random cluster assignment

* Determining if you have a local optimum problem:
Tendency toward premature convergence
Different initializations induce different optima

* Avoid local optima in a K-means context: repeat K-means and take the solution that has the lowest cost
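A brief scikit-learn illustration of the repeat-and-keep-the-lowest-cost remedy (an illustrative sketch; scikit-learn and its KMeans API are assumed to be available):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)

# Different random initializations can converge to different local optima,
# so restart several times and keep the solution with the lowest cost.
costs = []
for seed in range(10):
    km = KMeans(n_clusters=4, init="random", n_init=1, random_state=seed).fit(X)
    costs.append(km.inertia_)  # the K-means objective (within-cluster SSE)

print("costs across restarts:", [round(c, 1) for c in costs])
print("best (lowest) cost:", round(min(costs), 1))

# In practice, KMeans(n_init=10) performs these restarts internally and
# returns the fit with the lowest inertia.
```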


[ VIDEO OF THE WEEK]

@EdwardBoudrot / @Optum on #DesignThinking & #DataDriven Products #FutureOfData #Podcast


[ QUOTE OF THE WEEK]

Without big data, you are blind and deaf and in the middle of a freeway. – Geoffrey Moore

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with  John Young, @Epsilonmktg


[ FACT OF THE WEEK]

Every second we create new data. For example, we perform 40,000 search queries every second (on Google alone), which works out to roughly 3.5 billion searches per day and 1.2 trillion searches per year. In August 2015, over 1 billion people used Facebook in a single day.

Sourced from: Analytics.CLUB #WEB Newsletter

The Unexpected Connections Between Bitcoin and The Dow

It has been almost 10 years of consistent attempts to create a new financial investment tool that is digital, online, and available to buy and sell on the internet 24/7. Out of the thousands of ideas out there, Bitcoin is the latest financial craze, skyrocketing in popularity and establishing an entirely new currency market after coming into existence in January 2009. It is the most traded digital currency and the biggest comparator to centralized currencies and other trading markets.

The Dow Jones Industrial Average, on the other hand, is a 122-year-old index of 30 of the largest publicly owned companies in the US. It was calculated for the first time in 1896 and has been a staple of financial trading markets since its inception. It is an index for the stock market and a financial investment tool in and of itself.

It might seem counterintuitive but if you’re a Bitcoin investor you should be monitoring the Dow Jones Industrial Average. Why? Well, we analyzed datasets to see if the two markets had any relation and we found a significant insight. Don’t believe me? Let me break down the data we analyzed and what we found.

Connecting Tradition and Revolution

In every Econ class, there is a point when you’re taught, for example, about the relationship between trends in stock prices, interest rates, and the price of gold. There are even major academic studies about relationships between financial tools. Perhaps an important part of evaluating a financial investment tool is being able to relate it to another one as a frame of reference.

Bitcoin is no different. There are studies focused on finding influences on market trends and attempting to find a positive correlation between Bitcoin value and other market trends.

We were intrigued and set out to get the data to investigate this ourselves.

How’d we do it?

We used two open datasets from Kaggle. Dataset one included information on Bitcoin values – open value, close value, the gap between them, the number of days per week values went up or down. Dataset two included information for the Dow – open value, close value, the gap between them, the number of days per week values went up or down.

We started with the Bitcoin value data and created a growth indicator – a simple counter that determined how many days per given week the gap between days was positive.
The growth indicator equaled one if Bitcoin value went up one day per week, equaled two if Bitcoin value went up two days per week, and so on until seven days a week.

Having this new indicator, we shifted the analysis to examine things on a weekly basis. This way we could examine how the current week trends related to the previous week trends as well as how the current week trends related to trends from two weeks prior.

To make it easier to explain we gave each week a descriptor:

  • Current week = (t 0)
  • Previous week = (t -1)
  • Two weeks prior = (t -2)

Then we did the same for The Dow data. We created a growth indicator – a simple counter that determined how many days per given week the gap between days was positive. The growth indicator equaled one if The Dow value went up one day per week, equaled two if The Dow value went up two days per week, and so on until five for going up five days a week (The Dow is only traded five days per week).
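The article doesn’t publish its code; a minimal pandas sketch of this growth indicator, counting per week the days on which the close-open gap was positive, might look like the following (the file names and the date/open/close column names are assumptions about the Kaggle datasets):

```python
import pandas as pd

def weekly_growth_indicator(csv_path):
    """Count, per week, the days on which the close-open gap was positive."""
    df = pd.read_csv(csv_path, parse_dates=["date"])
    df["up_day"] = (df["close"] - df["open"]) > 0
    # ISO year-week serves as the "week id" used later to join the two markets.
    iso = df["date"].dt.isocalendar()
    df["week_id"] = iso["year"].astype(str) + "-W" + iso["week"].astype(str).str.zfill(2)
    return df.groupby("week_id")["up_day"].sum()

btc_growth = weekly_growth_indicator("bitcoin.csv")  # 0..7 up days per week
dow_growth = weekly_growth_indicator("dow.csv")      # 0..5 (five trading days)
```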

If you are curious to see the distribution of the growth indicators you can find it in our interactive dashboard or the full report.

Next, we mashed up the data using the “week id” as our joining point so we could look at the trends from a weekly perspective and compare the movements of the two markets to each other. We wanted to see how much each market moved up and down and at what times the movements occurred in order to see if there was a connection between The Dow and Bitcoin, or if one influenced the other.

We now had a classic dataset for a regression analysis investigation.
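Continuing that sketch, joining the two weekly series on the week id and regressing current-week Bitcoin growth, (t 0), on the Dow’s one- and two-week lags, (t -1) and (t -2), gives the regression setup described here (statsmodels assumed):

```python
import pandas as pd
import statsmodels.api as sm

# btc_growth and dow_growth come from the previous sketch; the inner join
# on week_id keeps only weeks present in both markets.
weekly = pd.concat(
    [btc_growth.rename("btc_t0"), dow_growth.rename("dow_t0")],
    axis=1, join="inner",
).sort_index()
weekly["dow_t1"] = weekly["dow_t0"].shift(1)  # previous week (t -1)
weekly["dow_t2"] = weekly["dow_t0"].shift(2)  # two weeks prior (t -2)
weekly = weekly.dropna()

# OLS of current-week Bitcoin growth on the lagged Dow growth indicators.
X = sm.add_constant(weekly[["dow_t1", "dow_t2"]])
print(sm.OLS(weekly["btc_t0"], X).fit().summary())
```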

What did we find?

We found that upward movements in The Dow values in weeks (t -2) and (t -1) can predict upward movements in Bitcoin value in (t 0). The graph for this actually looks quite nice:

[Chart: Bitcoin vs The Dow]

What does this mean in percentages?

[Chart: Bitcoin vs The Dow]

When The Dow showed five days of growth in a given week, Bitcoin showed between five and seven days of growth 38% of the weeks and between three and four days of growth 50% of the time! On the flip side, when The Dow showed zero days of growth in a week, Bitcoin showed between five and seven days of growth 0% of the weeks.

Here are the exact correlation numbers:

[Chart: Bitcoin vs The Dow]

In short, when the Dow went up, Bitcoin went up a week or two later. Bitcoin trends appear to follow Dow trends.

What does this mean for investors?

If you’re investing in Bitcoin, you have a pretty strong indicator to help make more informed decisions on what your next move should be.

If you are planning to invest in Bitcoin, you can look for a good week to start by zooming into the previous weeks’ growth of The Dow and evaluating your starting point. If you plan to sell your Bitcoin, you can also look at The Dow values to see whether this is a good week to sell or whether it is better to wait another week.

Want to analyze the numbers yourself? Check out our interactive dashboard or read our full report.

Source: The Unexpected Connections Between Bitcoin and The Dow by analyticsweek

Jul 18, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)


[  COVER OF THE WEEK ]

[Cover image: Weak data]


[ AnalyticsWeek BYTES]

>> [Step-by-Step] Using Talend to Bulk Load Data in Snowflake Cloud Data Warehouse by analyticsweekpick

>> Jan 24, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin

>> Best & Worst Time for Cold Call by v1shal


[ FEATURED COURSE]

CPSC 540 Machine Learning


Machine learning (ML) is one of the fastest growing areas of science. It is largely responsible for the rise of giant data companies such as Google, and it has been central to the development of lucrative products, such … more

[ FEATURED READ]

The Industries of the Future


The New York Times bestseller, from leading innovation expert Alec Ross, a “fascinating vision” (Forbes) of what’s next for the world and how to navigate the changes the future will bring…. more

[ TIPS & TRICKS OF THE WEEK]

Data Have Meaning
We live in a Big Data world in which everything is quantified. While the emphasis of Big Data has been focused on distinguishing the three characteristics of data (the infamous three Vs), we need to be cognizant of the fact that data have meaning. That is, the numbers in your data represent something of interest, an outcome that is important to your business. The meaning of those numbers is about the veracity of your data.

[ DATA SCIENCE Q&A]

Q:What is random forest? Why is it good?
A: Random forest (intuition):
– Underlying principle: several weak learners combined provide a strong learner
– Builds several decision trees on bootstrapped training samples of the data
– On each tree, each time a split is considered, a random sample of m predictors is chosen as split candidates out of all p predictors
– Rule of thumb: at each split, m ≈ √p
– Predictions: by majority rule

Why is it good?
– Very good performance (decorrelates the features)
– Can model non-linear class boundaries
– Generalization error for free: no cross-validation needed; the out-of-bag samples give an unbiased estimate of the generalization error as the trees are built
– Generates variable importance
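These points are easy to demonstrate in scikit-learn (an illustrative sketch, not part of the original answer; the dataset is just a convenient built-in):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

rf = RandomForestClassifier(
    n_estimators=300,
    max_features="sqrt",  # m = sqrt(p) candidate predictors at each split
    oob_score=True,       # out-of-bag error: generalization estimate "for free"
    random_state=0,
).fit(X, y)

print("OOB accuracy estimate:", round(rf.oob_score_, 3))
# Variable importance, averaged over the bootstrapped trees.
print("top 3 importances:", sorted(rf.feature_importances_, reverse=True)[:3])
```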


[ VIDEO OF THE WEEK]

#FutureOfData Podcast: Peter Morgan, CEO, Deep Learning Partnership


[ QUOTE OF THE WEEK]

Processed data is information. Processed information is knowledge. Processed knowledge is wisdom. – Ankala V. Subbarao

[ PODCAST OF THE WEEK]

#FutureOfData with @theClaymethod, @TiVo discussing running analytics in media industry


[ FACT OF THE WEEK]

By 2020, we will have over 6.1 billion smartphone users globally (overtaking basic fixed phone subscriptions).

Sourced from: Analytics.CLUB #WEB Newsletter

Big, Bad Data: How Talent Analytics Will Make It Work In HR


Here’s a mind-blowing fact to spark up the late-summer doldrums: research from IBM shows that 90% of the data in the world today has been created in the last two years alone. I find this fascinating.

This means that companies have access to an unprecedented amount of information: insights, intelligence, trends, future-casting. In terms of HR, it’s a gold mine of Big Data.

This past spring, I welcomed the ‘Industry Trends in Human Resources Technology and Service Delivery Survey,’ conducted by the Information Services Group (ISG), a leading technology insights, market intelligence and advisory services company. It’s a useful study, particularly for leaders and talent managers, offering a clear glimpse of what companies investing in HR tech expect to gain from their investment.


Not surprisingly, there are three key benefits companies expect to realize from investments in HR tech:

• Improved user and candidate experience

• Access to ongoing innovation and best practices to support the business

• Speed of implementation to increase the value of technology to the organization.

It’s worth noting that driving the need for an improved user interface, access, and speed is the nature of the new talent surging into the workforce: people for whom technology is nearly as much a given as air. They grew up with technology, are completely comfortable with it, and not only expect it to be available but assume it will be, as well as easy to use and responsive to all their situations, with mobile and social components.

According to the ISG study, companies want HR tech to offer strategic alignment with their business. I view this as more about enabling flexibility in talent management, recruiting and retention — all of which are increasing in importance as Boomers retire, taking with them their deep base of knowledge and experience. And companies are looking more for the analytics end of the benefit spectrum. No surprise here that the delivery model will be through cloud-based SaaS solutions.

Companies also want:

• Data security

• Data privacy

• Integration with existing systems, both HR and general IT

• Customizability—to align with internal systems and processes.

Cloud-based. According to the ISG report, more than 50% of survey respondents have implemented or are implementing cloud-based SaaS systems. It’s easy, it’s more cost-effective than on-premise software, and it’s where the exciting innovation is happening.

Mobile/social. That’s a given. Any HCM tool must have a good mobile user experience, from well-designed mobile forms and ease of access to a secure interface.

They want it to have a simple, intuitive user interface – another given. Whether accessed via desktop or mobile, the solution must offer a single, unified, simple-to-use interface.

They want it to offer social collaboration tools, which is particularly key for the influx of millennials coming into the workplace, who expect to be able to collaborate via social channels. HR is no exception here. While challenging from a security and data protection angle, it’s a must.

But the final requirement the study reported is, in my mind, the most important: Analytics and reporting. Management needs reporting to know their investment is paying off, and they also need robust analytics to keep ahead of trends within the workforce.

It’s not just a question of Big Data’s accessibility, or of sophisticated metrics, such as the Key Performance Indicators (KPIs) that reveal the critical factors for success and measure progress made towards strategic goals. For organizations to realize the promise of Big Data, they must be able to cut through the noise, and access the right analytics that will transform their companies for the better.

Given what companies are after, as shown in the ISG study, I predict more and more companies will recognize the benefits of using integrated analytics for their talent management and workforce planning processes. Talent analytics creates a powerful, invaluable amalgam of data and metrics; it can identify the meaningful patterns within them and, for whatever challenges and opportunities an organization faces, best inform decision makers on the right tactics and strategies going forward. It will take talent analytics to synthesize Big Data and metrics into the key strategic management decisions in HR. Put another way, it’s not just the numbers, it’s how they’re crunched.


Source: Big, Bad Data: How Talent Analytics Will Make It Work In HR

How Big Data Is Changing The Entertainment Industry!

Big Data is here – The latest buzzword of the Information Technology Industry!

The world is generating a humongous amount of data every second, and rapid advances in technology are making analysis of such data a cakewalk. Big Data is influencing every aspect of our lives and will continue to grow bigger and better. Retailers will push us to buy extra chips and soft drinks from the nearest outlet when we watch a T20 match with friends and our favorite teams are playing. They will even recommend a CD of our favorite party songs and encourage us to donate a dollar to the charity we visit most often. Disease prevention, share trading, marketing, and many other use cases are emerging.

Big Data is changing the sports and entertainment industries as well. Both are driven by fans and their word of mouth. Engagement with the audience is the key, and Big Data is creating opportunities to drive this engagement and influence audience sentiment.

IBM worked with a media company and ran its predictive models on the social buzz for the movie Ram Leela. According to the reports, IBM predicted a 73% success for the movie based on the right selection of cities. Similar rich analysis of social data was conducted for Barfi and Ek Tha Tiger. All these movies were runaway successes at the box office.

Hollywood uses Big Data big time! The social media buzz can predict box office success; more importantly, based on how a movie is trending, strategies can be formulated to ensure favorable positioning. All science!

Netflix is the best case study in analyzing user behavior and hitting the jackpot! The Netflix original show House of Cards was commissioned solely on the basis of big data about its customers’ preferences.

Shah Rukh Khan’s Chennai Express, one of the biggest box office grossers of 2013, used Big Data and analytics solutions to drive social media and digital marketing campaigns. IT services company Persistent Systems helped the Chennai Express team with the right strategic inputs. Chennai Express-related tweets generated over 1 billion cumulative impressions, and the total number of tweets across all hashtags was over 750 thousand over the 90-day campaign period. Persistent Systems CEO Siddhesh Bhobe said, “Shah Rukh Khan and the success of Chennai Express have proved that social media is the channel of the future and that it presents unique opportunities to marketers and brands, at an unbeatable ROI (return on investment).”

Lady Gaga and her team browse through our listening preferences and sequences to optimize the playlist for maximum impact at live events. Singapore-based Big Data analytics firm Crayon has worked with leading Hindi film industry producers to understand what kind of music to release to create the right buzz for a movie.

Sports is another area where big data is making a big impact. FIFA 2014 champions Germany have been using SAP’s Match Insights software, and it has made a big difference to the team. Data was crunched relating to player-position ‘touch maps’, passing ability, ball retention, and even metrics such as ‘aggressive play’. The Kolkata Knight Riders, an IPL team, likewise used analytics to determine the consistency of players based on 25 data points per ball. It helped in auctions as well as ongoing training.

Big Data can definitely be a boon to the entertainment and sports industries. It can improve the profitability of movies, always a high-risk business: everything from green-lighting the story to cast selection to the timing of release can be informed by data. It can help pick the right players for the sporting leagues, allowing talent to win!

Entertainment Industry leaders need to collaborate with the leading big data startups and visionaries to create new uses and deliver new success stories!


Originally Posted at: How Big Data Is Changing The Entertainment Industry! by analyticsweekpick

Jul 11, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)


[  COVER OF THE WEEK ]

[Cover image: Trust the data]


[ AnalyticsWeek BYTES]

>> Marketing Analytics – Success Through Analysis by analyticsweekpick

>> Closer Than You Think: Data Strategies Across Your Company by analyticsweek

>> CISOs’ newest fear? Criminals with a big data strategy by analyticsweekpick


[ FEATURED COURSE]

Data Mining


Data that has relevance for managerial decisions is accumulating at an incredible rate due to a host of technological advances. Electronic data capture has become inexpensive and ubiquitous as a by-product of innovations… more

[ FEATURED READ]

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World


In the world’s top research labs and universities, the race is on to invent the ultimate learning algorithm: one capable of discovering any knowledge from data, and doing anything we want, before we even ask. In The Mast… more

[ TIPS & TRICKS OF THE WEEK]

Fix the culture: spread awareness to get adoption
Adoption of analytics tools and capabilities has not yet caught up to industry standards, and talent has always been the bottleneck to comparable enterprise adoption. One of the primary reasons is a lack of understanding and knowledge among stakeholders. To facilitate wider adoption, data analytics leaders, users, and community members need to step up and create awareness within the organization. An aware organization goes a long way toward quick buy-ins and better funding, which ultimately leads to faster adoption. So be the voice that you want to hear from leadership.

[ DATA SCIENCE Q&A]

Q:Explain selection bias (with regard to a dataset, not variable selection). Why is it important? How can data management procedures such as missing data handling make it worse?
A: * Selection of individuals, groups, or data for analysis in such a way that proper randomization is not achieved
Types:
– Sampling bias: systematic error due to a non-random sample of a population, causing some members to be less likely to be included than others
– Time interval: a trial may be terminated early at an extreme value (for ethical reasons), but the extreme value is likely to be reached by the variable with the largest variance, even if all the variables have similar means
– Data: “cherry picking”, when specific subsets of the data are chosen to support a conclusion (citing examples of plane crashes as evidence that flying is unsafe, while ignoring the far more common flights that complete safely)
– Studies: performing experiments and reporting only the most favorable results
– Can lead to inaccurate or even erroneous conclusions
– Statistical methods can generally not overcome it

Why can missing data handling make it worse?
– Example: individuals who know or suspect that they are HIV positive are less likely to participate in HIV surveys
– Missing data handling (e.g., imputing from respondents) will amplify this effect, since it is based on the mostly HIV-negative participants
– Prevalence estimates will therefore be inaccurate
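A small simulation (illustrative, with made-up numbers) makes the HIV-survey example concrete: nonresponse that is correlated with the outcome biases the prevalence estimate downward, and naive mean imputation simply locks that bias in:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_prevalence = 0.10

positive = rng.random(n) < true_prevalence
# Selection bias: positives are assumed half as likely to respond to the survey.
respond_prob = np.where(positive, 0.3, 0.6)
responded = rng.random(n) < respond_prob

observed = positive[responded].mean()
print(f"true prevalence:       {true_prevalence:.3f}")
print(f"responders' estimate:  {observed:.3f}")  # biased low (~0.053)

# Naive missing-data handling: impute non-responders with the responders' mean.
# The estimate stays just as biased; it only looks more complete.
imputed = np.concatenate(
    [positive[responded].astype(float),
     np.full((~responded).sum(), observed)]
)
print(f"after mean imputation: {imputed.mean():.3f}")
```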


[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Juan Gorricho, @disney


[ QUOTE OF THE WEEK]

Data are becoming the new raw material of business. – Craig Mundie

[ PODCAST OF THE WEEK]

@EdwardBoudrot / @Optum on #DesignThinking & #DataDriven Products #FutureOfData #Podcast


[ FACT OF THE WEEK]

94% of Hadoop users perform analytics on large volumes of data not possible before; 88% analyze data in greater detail; while 82% can now retain more of their data.

Sourced from: Analytics.CLUB #WEB Newsletter

Don’t Let your Data Lake become a Data Swamp

In an always-on, competitive business environment, organizations are looking to gain an edge through digital transformation. Subsequently, many companies feel a sense of urgency to transform across all areas of their enterprise—from manufacturing to business operations—in the constant pursuit of continuous innovation and process efficiency.

Data is at the heart of all these digital transformation projects. It is the critical component that helps generate smarter, improved decision-making by empowering business users to eliminate gut feelings, unclear hypotheses, and false assumptions. As a result, many organizations believe building a massive data lake is the ‘silver bullet’ for delivering real-time business insights. In fact, according to a survey by CIO Review from IDG, 75 percent of business leaders believe their future success will be driven by their organization’s ability to make the most of their information assets. However, only four percent of these organizations said they have set up a data-driven approach that successfully benefits from their information.

Is your Data Lake becoming more of a hindrance than an enabler?

The reality is that all these new initiatives and technologies come with a unique set of generated data, which creates additional complexity in the decision-making process. To cope with the growing volume and complexity of data and alleviate IT pressure, some are migrating to the cloud.

But this transition—in turn—creates other issues. For example, once data is made more broadly available via the cloud, more employees want access to that information. Growing numbers and varieties of business roles are looking to extract value from increasingly diverse data sets, faster than ever, putting pressure on IT organizations to deliver real-time data access that serves the diverse needs of business users looking to apply real-time analytics to their everyday jobs. However, it’s not just about better analytics; business users also frequently want tools that allow them to prepare, share, and manage data.

To minimize tension and friction between IT and business departments, moving raw data to one place where everybody can access it sounded like a good move. The concept of the data lake, as first coined by James Dixon, envisioned a large body of raw data in a more natural state, where different users come to examine it, delve into it, or extract samples from it. However, organizations are increasingly beginning to realize that all the time and effort spent building massive data lakes has frequently made things worse, due to poor data governance and management, which results in the formation of so-called “Data Swamps”.

Bad data clogging up the machinery

The same way data warehouses failed to manage data analytics a decade ago, data lakes will undoubtedly become “Data Swamps” if companies don’t manage them correctly. Putting all your data in a single place won’t in and of itself solve a broader data access problem. Leaving data uncontrolled, un-enriched, unqualified, and unmanaged will dramatically hamper the benefits of a data lake, as it will still only be usable by a limited number of experts with a unique set of skills.

A successful system of real-time business insights starts with a system of trust. To illustrate the negative impact of bad data and bad governance, consider the Dieselgate emissions scandal, which highlighted the difference between real-world and official air pollutant emissions data. In this case, the issue was not a problem of data quality but of ethics, since some car manufacturers misled the measurement system by injecting fake data. This resulted in fines for car manufacturers exceeding tens of billions of dollars and consumers losing faith in the industry. After all, how can consumers trust the performance of cars now that they know the system of measure has been intentionally tampered with?

The takeaway in the context of an enterprise data lake is that its value will depend on the level of trust employees have in the data contained in the lake. Failing to control data accuracy and quality within the lake will create mistrust amongst employees, seed doubt about the competency of IT, and jeopardize the whole data value chain, which then negatively impacts overall company performance.

A cloud data warehouse to deliver trusted insights for the masses

Leading firms believe governed cloud data lakes represent an adequate solution to overcoming some of these more traditional data lake stumbling blocks. The following four-step approach helps modernize a cloud data warehouse while providing better insight across the entire organization.

  1. Unite all data sources and reconcile them: Make sure the organization has the capacity to integrate a wide array of data sources, formats and sizes. Storing a wide variety of data in one place is the first step, but it’s not enough. Bridging data pipelines and reconciling them is another way to gain the capacity to manage insights. Verify the company has a cloud-enabled data management platform combining rich integration capabilities and cloud elasticity to process high data volumes at a reasonable price.
  2. Accelerate trusted insights to the masses: Efficiently manage data with cloud data integration solutions that help prepare, profile, cleanse, and mask data while monitoring data quality over time, regardless of file format and size (a generic sketch of this step follows the list). When coupled with cloud data warehouse capabilities, data integration can enable companies to create trusted data for access, reporting, and analytics in a fraction of the time and cost of traditional data warehouses.
  3. Collaborative data governance to the rescue: The old schema of a data value chain where data is produced solely by IT in data warehouses and consumed by business users is no longer valid. Now everyone wants to create content, add context, enrich data, and share it with others. Take the example of the internet and a knowledge platform such as Wikipedia, where everybody can contribute, moderate, and create new entries in the encyclopedia. In the same way Wikipedia established collaborative governance, companies should instill collaborative governance in their organizations by delegating the appropriate role-based authority or access rights to citizen data scientists, line-of-business experts, and data analysts.
  4. Democratize data access and encourage users to be part of the Data Value Chain: Without making people accountable for what they’re doing, analyzing, and operating, there is little chance that organizations will succeed in implementing the right data strategy across business lines. Thus, you need to build a continuous Data Value Chain where business users contribute, share, and enrich the data flow in combination with a cloud data warehouse multi-cluster architecture that will accelerate data usage by load balancing data processing across diverse audiences.
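Talend’s actual tooling isn’t shown in the post; a minimal pandas sketch of the profile/cleanse/mask idea from step 2, assuming a hypothetical customers.csv with an email column, might be:

```python
import hashlib
import pandas as pd

df = pd.read_csv("customers.csv")  # hypothetical raw file landing in the lake

# Profile: a quick data-quality snapshot before anything is published.
profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "nulls": df.isna().sum(),
    "null_pct": (df.isna().mean() * 100).round(1),
    "distinct": df.nunique(),
})
print(profile)

# Cleanse: drop exact duplicates and standardize the assumed email column.
df = df.drop_duplicates()
df["email"] = df["email"].str.strip().str.lower()

# Mask: replace the direct identifier with a one-way hash before sharing.
df["email"] = df["email"].map(
    lambda v: hashlib.sha256(v.encode()).hexdigest() if pd.notna(v) else v
)
```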

In summary, think of data as the next strategic asset. Right now, it’s more like a hidden treasure at the bottom of many companies. Once modernized, shared and processed, data will reveal its true value, delivering better and faster insights to help companies get ahead of the competition.


Source: Don’t Let your Data Lake become a Data Swamp by analyticsweek

Are You Headed for the Analytics Cliff?

When was the last time you updated your analytics—or even took a hard look? Don’t feel guilty if it’s been a while. Even when there are minor indicators of trouble, many companies put analytics projects on the backburner or implement service packs as a Band-Aid solution.

What companies don’t realize, however, is that once analytics begin to fail, time is limited. Application teams that are not quick to act risk losing valuable revenue and customers. Fortunately, if you know the signs, you can avoid a catastrophe.

>> Related: Blueprint to Modern Analytics <<

Are you headed for the analytics cliff? Keep an eye out for these clear indicators that your analytics is failing:

Sign #1: Long Queue of Ad Hoc Requests

Is your queue of ad hoc requests constantly getting longer? Most companies start their analytics journeys by adding basic dashboards and reports to their applications. This satisfies users for a short period of time, but within a few months, users inevitably want more. Maybe they want to explore data on their own or connect new data sources to the application.

Eventually, you end up with a long queue of ad hoc requests for new features and capabilities. When you ignore these requests, you risk unhappy customers and skyrocketing churn rates. If you’re struggling to keep up with the influx—much less get ahead of it—you may be heading for the analytics cliff.

Sign #2: Unhappy Users & Poor Engagement

Are your customers becoming more vocal about what they don’t like about your embedded analytics? Dissatisfied customers, and in turn poor user engagement, are a clear indication something is wrong. Ask yourself these questions to determine if your application is in trouble:

  • Basic adoption: How many users are regularly accessing the application’s dashboards and reports?
  • Stickiness: Are users spending more or less time in the embedded analytics?
  • The eject button: Have you seen an increase in users exporting data outside of your application to do their own analysis?

The more valuable your embedded dashboards and reports are, the more user engagement you’ll see. Forward-thinking application teams are adding value to their embedded analytics by going beyond basic capabilities.

Sign #3: Losing Customers to Competitors

When customers start abandoning your application for the competition, you’re fast approaching an analytics cliff. Whether you like it or not, you’re stacked against your competitors. If they’re innovating their analytics while yours stay stagnant, you’ll soon lose ground (if you haven’t already).

Companies that want to use embedded analytics as a competitive advantage or a source of revenue can’t afford to put off updates. As soon as your features start to lag behind the competition, you’ll be forced to upgrade just to catch up. And if your customers have started to churn, you’ll be faced with the overwhelming task of winning back frustrated customers or winning over new ones.

Sign #4: Revenue Impact

All the previous indicators were part of a slow and steady decline. By this point, you’re teetering on the edge of the analytics cliff. Revenue impact can come in many forms, including:

  • Declining win rate
  • Slowing pipeline progression
  • Decreasing renewals
  • Drop in sales of analytics modules

A two percent reduction in revenue can be an anomaly, or an indication of a downward trend. Some software companies make the mistake of ignoring such a small decrease. But even slowing rates of growth can be disastrous. According to a recent McKinsey study, “Grow Fast or Die Slow,” company growth yields greater returns and matters more than margins or cost structure. If a software company grows less than 20 percent annually, they have a 92 percent chance of failure. Revenue impact—no matter how small—is a sign that it’s definitely time to act.

To learn more, read our ebook: 5 Early Indicators Your Analytics Will Fail >

 
