Jun 29, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Human resource  Source

[ AnalyticsWeek BYTES]

>> Must read quotes from Steve Jobs for Entrepreneurs by v1shal

>> Prepping for Data Driven Innovation by v1shal

>> 7 Deadly Sins of Total Customer Experience  by v1shal

[ NEWS BYTES]

>> The Downside of Windows Server 2016 for Virtualization Admins – Virtualization Review (under Virtualization)

>> How To Display Log Data Using Overlay Charts – Virtualization Review (under Virtualization)

>> [Bootstrap Heroes] G-Square brings in a bot and plug-and-play element into analytics – YourStory.com (under Financial Analytics)

[ FEATURED COURSE]

Hadoop Starter Kit

Hadoop learning made easy and fun. Learn HDFS, MapReduce and introduction to Pig and Hive with FREE cluster access…. more

[ FEATURED READ]

On Intelligence

Jeff Hawkins, the man who created the PalmPilot, Treo smart phone, and other handheld devices, has reshaped our relationship to computers. Now he stands ready to revolutionize both neuroscience and computing in one strok… more

[ TIPS & TRICKS OF THE WEEK]

Data Have Meaning
We live in a Big Data world in which everything is quantified. While the emphasis of Big Data has been focused on distinguishing the three characteristics of data (the infamous three Vs), we need to be cognizant of the fact that data have meaning. That is, the numbers in your data represent something of interest, an outcome that is important to your business. The meaning of those numbers is about the veracity of your data.

[ DATA SCIENCE Q&A]

Q: Explain the difference between “long” and “wide” format data. Why would you use one or the other?
A: * Long: one column containing the values and another column listing the context of the value
Fam_id year fam_inc

* Wide: each different variable in a separate column
Fam_id fam_inc96 fam_inc97 fam_inc98

Long Vs Wide:
– Data manipulations are much easier when data is in the wide format: summarize, filter
– Program requirements
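
To make the two layouts concrete, here is a minimal sketch in plain Python that reshapes the family-income example from wide to long and back. The income figures are invented for illustration, and the helper names `wide_to_long` and `long_to_wide` are hypothetical; in practice a library such as pandas offers equivalent reshaping through `melt` and `pivot`.

```python
# Wide format: one row per family, one income column per year.
# The income values below are made up for illustration.
wide = [
    {"fam_id": 1, "fam_inc96": 40000, "fam_inc97": 40500, "fam_inc98": 41000},
    {"fam_id": 2, "fam_inc96": 45000, "fam_inc97": 45400, "fam_inc98": 45800},
]


def wide_to_long(rows, id_col="fam_id", prefix="fam_inc"):
    """Turn each fam_incYY column into its own (fam_id, year, fam_inc) row."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col.startswith(prefix):
                year = int(col[len(prefix):])  # "fam_inc96" -> 96
                long_rows.append({id_col: row[id_col], "year": year, prefix: value})
    return long_rows


def long_to_wide(rows, id_col="fam_id", prefix="fam_inc"):
    """Collect the rows of each family back into a single wide row."""
    by_id = {}
    for row in rows:
        wide_row = by_id.setdefault(row[id_col], {id_col: row[id_col]})
        wide_row[f"{prefix}{row['year']}"] = row[prefix]
    return list(by_id.values())


long = wide_to_long(wide)
for row in long:
    print(row)
```

Two families with three yearly columns become six long rows, and `long_to_wide(long)` rebuilds the original wide rows.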

Source

[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with @MichOConnell, @Tibco

[ QUOTE OF THE WEEK]

The temptation to form premature theories upon insufficient data is the bane of our profession. – Sherlock Holmes

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Nathaniel Lin (@analytics123), @NFPA

[ FACT OF THE WEEK]

Poor data can cost businesses 20%–35% of their operating revenue.

Sourced from: Analytics.CLUB #WEB Newsletter

Mobile Strategy For Brick-and-Mortar Stores And How Retailers Are Doing It All Wrong

No, we are not talking about mobile strategy for retailers in general but about one part of it: mobile strategy for brick-and-mortar stores. Yes, it is different from the retailer’s overall mobile strategy, and no, it cannot be done correctly without thinking differently from the online store strategy. With the ever-increasing number of smartphones and affordable data plans, it would be a bad move for retailers not to think about mobile strategy. I hope retailers are already aware that having a mobile strategy is critical for their business; there is plenty of material already written explaining why. What I have yet to find is a separate mobile strategy that helps retailers’ brick-and-mortar stores.

For number enthusiasts:
According to comScore, as of January 2012, 101.3 million people in the United States had a smartphone. That’s almost one in every three Americans, while Nielsen puts the figure at around 43%. Fitch Ratings also predicts that by the end of 2012 two-thirds (approximately 66%) of the US population will be using smartphones. The U.S. Census Bureau recently announced that eCommerce was responsible for 48.2 billion dollars in sales during the third quarter of 2011.
Market research by Vibes shows that mobile technology plays a vital role for in-store shoppers:

  • 84% of shoppers have conducted in-store product research via smartphone
  • Nearly half of all consumers feel more confident about their purchasing decisions after pulling up additional product information on their mobile phones
  • 33% admitted to searching a competitor’s website for better deals while in-store
  • 6% of consumers said they were likely to abandon an in-store purchase for a competing offer

Interestingly, brick-and-mortar stores are not doing great even though many of them have invested in a good mobile marketing strategy. The reason is the one-size-fits-all approach retailers use: many have just one app that handles the overall retail experience, the online presence and the brick-and-mortar stores. Retailers should therefore refocus their mobile strategy and break the overall plan into two parts: an online mobile strategy and a brick-and-mortar store strategy. Each has its own focus: one primarily caters to online mobile surfers, while the other caters to visitors seeking help in brick-and-mortar stores. It is not necessary to build two different apps; both could very well be integrated into one app, but the design and feature set should specifically keep the needs of store visitors in mind. When both are served from the same app, the application framework and the app should be smart enough to identify which kind of traffic they are handling. As a starting point, it would be a good idea to experiment and test with a QR-code-backed website, which could later be integrated into the app strategy once the workflows and use cases are identified and validated.

The following design considerations would go a long way in building a strong mobile strategy specific to brick-and-mortar retail:

1. Include all possible use cases needed by store visitors and wanderers:
It is important to understand which features are most needed by users who are browsing the store or wandering around the area. The use cases may include learning more about products, asking for help, searching for accessories, price matching etc. Hiring some shoppers or running focus groups could provide some starting ground. In this data-savvy age I am strictly against focus groups, but they could certainly work well as a starting point. It is also important to restrict the research and findings to areas that impact store workflows only.

2. Connect the online store with the offline store through a seamless layer:
Considering the expanding mobile landscape, it is important not to lose sight of the bigger picture and the overall mobile strategy. Seamless connectivity between store-specific workflows and online store workflows lets users move easily between the two. The goal should be to keep users satisfied by fulfilling all their needs, thereby keeping their business within the store’s own commerce. Examples include showing product availability online and shipping a product to the customer’s home for free from the online store. In short, it is important to compensate for the shortfalls of one channel with the other, that is, online workflows and brick-and-mortar store workflows respectively.

3. Provide the ability to leave feedback, suggestions, grievances etc.:
Learning is an important part of any business. With the evolution of data tools, there is no excuse for retailers not to leverage them to their own benefit. Any possible auto-learning opportunity must be incorporated. A good customer experience management strategy provides a list of the surveys and learning manuals that should simply be enabled in the mobile framework at appropriate workflow touch points. Having done that, retailers will need nothing but this self-learning mechanism to evolve with changing market dynamics. This can lead to sustainable business growth.

4. Reward visitors to increase usage:
More usage certainly gives stores many other opportunities, such as better learning and better chances of referrals and recommendations. With that in mind, retailers should provision for a reward system to encourage the use of their mobile products, and a good design framework should provide the mechanics for integrating one. Rewards could take the form of store credits, coupons etc. It is also important to learn which rewards work at which stage.

5. Provide seamless presence and connectivity with other social platforms:
It is no surprise that users already rely on many other, better and more reliable social apps with a local presence, such as Facebook, Foursquare and Yelp. It is important for a store’s mobile strategy to incorporate some alliance with those platforms as well. The store’s presence there should be customized to attract visitors. The sooner stores get in line, the more adoption they will receive.

So, get the right gear and move on to building a robust mobile strategy for brick-and-mortar stores, one that helps stores learn faster and move with changing customer landscapes.

Source by v1shal

Data Sources for Cool Data Science Projects: Part 1

At The Data Incubator, we run a free six-week data science fellowship to help our Fellows land industry jobs. Our hiring partners love considering Fellows who don’t mind getting their hands dirty with data. That’s why our Fellows work on cool capstone projects that showcase those skills. One of the biggest obstacles to successful projects has been getting access to interesting data. Here are a few cool public data sources you can use for your next project:

Economic Data:

  1. Publicly Traded Market Data: Quandl is an amazing source of finance data. Google Finance and Yahoo Finance are additional good sources of data.  Corporate filings with the SEC are available on Edgar.
  2. Housing Price Data: You can use the Trulia API or the Zillow API.
  3. Lending data: You can find student loan defaults by university and the complete collection of peer-to-peer loans from Lending Club and Prosper, the two largest platforms in the space.
  4. Home mortgage data: There is data made available by the Home Mortgage Disclosure Act and there’s a lot of data from the Federal Housing Finance Agency available here.

Content Data:

  1. Review Content: You can get reviews of restaurants and physical venues from Foursquare and Yelp (see geodata).  Amazon has a large repository of Product Reviews.  Beer reviews from Beer Advocate can be found here.  Rotten Tomatoes Movie Reviews are available from Kaggle.
  2. Web Content: Looking for web content?  Wikipedia provides dumps of their articles.  Common Crawl has a large corpus of the internet available.  ArXiv makes all their data available via Bulk Download from AWS S3.  Want to know which URLs are malicious?  There’s a dataset for that.  Music data is available from the Million Songs Database.  You can analyze the Q&A patterns on sites like Stack Exchange (including Stack Overflow).
  3. Media Data: There are open annotated articles from the New York Times, the Reuters Dataset, and the GDELT project (a consolidation of many different news sources).  Google Books has published NGrams for books going back past 1800.
  4. Communications Data: There’s access to public messages of the Apache Software Foundation and communications amongst former execs at Enron.

Government Data:

  1. Municipal Data: Crime Data is available for City of Chicago and Washington DC.  Restaurant Inspection Data is available for Chicago and New York City.
  2. Transportation Data: NYC Taxi Trips in 2013 are available courtesy of the Freedom of Information Act.  There’s bikesharing data from NYC, Washington DC, and SF.  There’s also Flight Delay Data from the FAA.
  3. Census Data: Japanese Census Data.  US Census data from 2010, 2000, 1990.  From census data, the government has also derived time use data.  EU Census Data.  Check out popular male / female baby names going back to the 19th Century from the Social Security Administration.
  4. World Bank: They have a lot of data available on their website.
  5. Election Data: Political contribution data for the last few US elections can be downloaded from the FEC here and here.  Polling data is available from Real Clear Politics.
  6. Food, Drugs, and Devices Data: The FDA provides a number of high value public datasets.

 

While building your own project cannot replicate the experience of fellowship at The Data Incubator (our Fellows get amazing access to hiring managers and access to nonpublic data sources) we hope this will get you excited about working in data science.  And when you are ready, you can apply to be a Fellow!

Got any more data sources?  Let us know and we’ll add them to the list!

This article appeared in The Data Incubator on October 16, 2014. 

Originally Posted at: Data Sources for Cool Data Science Projects: Part 1

Jun 22, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Pacman  Source

[ AnalyticsWeek BYTES]

>> Talent analytics in practice by analyticsweekpick

>> Four Use Cases for Healthcare Predictive Analytics, Big Data by anum

>> The 7 Best Data Science and Machine Learning Podcasts by analyticsweekpick

[ NEWS BYTES]

>> Data science and analytic centre opened – The Hindu (under Data Science)

>> Cyber security or cyber snooping? | Bangkok Post: opinion – Bangkok Post (under cyber security)

>> CA Technologies: CA Technologies claims its payment security … – ETCIO.com (under Risk Analytics)

[ FEATURED COURSE]

Hadoop Starter Kit

Hadoop learning made easy and fun. Learn HDFS, MapReduce and introduction to Pig and Hive with FREE cluster access…. more

[ FEATURED READ]

The Misbehavior of Markets: A Fractal View of Financial Turbulence

Mathematical superstar and inventor of fractal geometry, Benoit Mandelbrot, has spent the past forty years studying the underlying mathematics of space and natural patterns. What many of his followers don’t realize is th… more

[ TIPS & TRICKS OF THE WEEK]

Strong business case could save your project
Like anything in corporate culture, the project is oftentimes about the business, not the technology. With data analysis, the same type of thinking goes. It’s not always about the technicality but about the business implications. Data science project success criteria should include project management success criteria as well. This will ensure smooth adoption, easy buy-ins, room for wins and co-operating stakeholders. So, a good data scientist should also possess some qualities of a good project manager.

[ DATA SCIENCE Q&A]

Q: Is it better to spend 5 days developing a 90% accurate solution, or 10 days for 100% accuracy? Depends on the context?
A: * “Premature optimization is the root of all evil”
* At the beginning: quick-and-dirty model is better
* Optimization later
Other answer:
– Depends on the context
– Is error acceptable? Fraud detection, quality assurance

Source

[ VIDEO OF THE WEEK]

@AnalyticsWeek Panel Discussion: Marketing Analytics

[ QUOTE OF THE WEEK]

If you can’t explain it simply, you don’t understand it well enough. – Albert Einstein

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Juan Gorricho, @disney

[ FACT OF THE WEEK]

Facebook stores, accesses, and analyzes 30+ Petabytes of user generated data.

Sourced from: Analytics.CLUB #WEB Newsletter

Improving Big Data Governance with Semantics

By Dr. Jans Aasman, Ph.D., CEO of Franz Inc.

Effective data governance consists of protocols, practices, and the people necessary for implementation to ensure trustworthy, consistent data. Its yields include regulatory compliance, improved data quality, and data’s increased valuation as a monetary asset that organizations can bank on.

Nonetheless, these aspects of governance would be impossible without what is arguably its most important component: the common terminologies and definitions that are sustainable throughout an entire organization, and which comprise the foundation for the aforementioned policy and governance outcomes.

When intrinsically related to the technologies used to implement governance protocols, terminology systems (containing vocabularies and taxonomies) can unify terms and definitions at a granular level. The result is a greatly increased ability to tackle the most pervasive challenges associated with big data governance including recurring issues with unstructured and semi-structured data, integration efforts (such as mergers and acquisitions), and regulatory compliance.

A Realistic Approach
Designating the common terms and definitions that are the rudiments of governance varies according to organization, business units, and specific objectives for data management. Creating policy from them and embedding them in technology that can achieve governance goals is perhaps most expediently and sustainably facilitated by semantic technologies, which are playing an increasingly pivotal role in the overall implementation of data governance in the wake of big data’s emergence.

Once organizations adopt a glossary of terminology and definitions, they can then determine rules about terms based on their relationships to one another via taxonomies. Taxonomies are useful for disambiguation purposes and can clarify preferred labels (among any number of synonyms) for different terms in accordance with governance conventions. These definitions and taxonomies form the basis for automated terminology systems that label data according to governance standards via inputs and outputs. Ingested data adheres to terminology conventions and is stored according to preferred labels. Data captured prior to the implementation of such a system can still be queried according to the system’s standards.
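
As a rough illustration of this idea, the sketch below maps synonyms to a preferred label at ingestion time and applies the same mapping at query time, so records stored under older labels can still be found. The vocabulary, the `TerminologySystem` class and its `normalize` method are hypothetical and are not taken from any particular semantic product.

```python
class TerminologySystem:
    """Toy glossary that maps synonyms to one preferred label per concept."""

    def __init__(self, preferred_to_synonyms):
        self._preferred = {}
        for preferred, synonyms in preferred_to_synonyms.items():
            for term in (preferred, *synonyms):
                self._preferred[term.lower()] = preferred

    def normalize(self, term):
        """Return the preferred label, or the term unchanged if it is unknown."""
        return self._preferred.get(term.lower(), term)


# Hypothetical medical vocabulary, invented for this example.
terms = TerminologySystem({
    "myocardial infarction": ["heart attack", "MI"],
    "hypertension": ["high blood pressure", "HTN"],
})

# Ingested records are stored under the preferred label.
incoming = ["Heart attack", "HTN", "hypertension", "fracture"]
stored = [terms.normalize(t) for t in incoming]
print(stored)

# Legacy records keep their old labels; normalizing both sides at query
# time lets them match the system's standard terms anyway.
legacy = ["MI", "high blood pressure", "MI"]
query = terms.normalize("heart attack")
hits = [r for r in legacy if terms.normalize(r) == query]
print(len(hits))
```

A real terminology system would also carry definitions and taxonomy relationships; this sketch covers only the preferred-label lookup.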

Linking Terminology Systems: Endless Possibilities
The possibilities that such terminology systems produce (especially for unstructured and semi-structured big data) are virtually limitless, particularly with the linking capabilities of semantic technologies. In the medical field, a handwritten note hastily scribbled by a doctor can be readily transcribed by the terminology system in accordance with governance policy using preferred terms, effectively giving structure to unstructured data. Moreover, it can be linked to billing coding systems per business functions. That structured data can then be stored in a knowledge repository and queried along with other data, adding to the comprehensive integration and accumulation of data that gives big data its value.

Focusing on common definitions and linking terminology systems enables organizations to leverage business intelligence and analytics on different databases across business units. This method is also critical for customer disambiguation, a frequently occurring problem across vertical industries. In finance, institutions with numerous subsidiaries and acquisitions (such as Citigroup, Citibank, Citi Bike, etc.) can use a common repository to determine how much money each subsidiary actually spent with the parent company and to address other internal, data-sensitive problems. Linking the different terminology repositories for these distinct yet related entities can achieve the same objective.

The primary way in which semantics addresses linking between terminology systems is by ensuring that those systems are utilizing the same words and definitions for the commonality of meaning required for successful linking. Vocabularies and taxonomies can provide such commonality of meaning, which can be implemented with ontologies to provide a standards-based approach to disparate systems and databases.

Subsequently, all systems that utilize those vocabularies and ontologies can be linked. In finance, the Financial Industry Business Ontology (FIBO) is being developed to grant “data harmonization and…the unambiguous sharing of meaning across different repositories.” The life sciences industry is similarly working on industry-wide standards so that numerous databases can be made available to all within this industry, while still restricting access to internal drug discovery processes according to organization.

Regulatory Compliance and Ontologies
In terms of regulatory compliance, organizations can account for new requirements far more flexibly and quickly when data throughout disparate systems and databases are linked and commonly shared, since this requires just a single update as opposed to numerous time-consuming updates in multiple places. Issues of regulatory compliance are also eased in a semantic environment through the use of ontological models, which provide the schema that can create a model specifically in adherence to regulatory requirements.

Organizations can use ontologies to describe such requirements, then write rules for them that both restrict and permit access and usage according to regulations. Although ontological models can also be created for any other sort of requirements pertaining to governance (metadata, reference data, etc.) it is somewhat idealistic to attempt to account for all facets of governance implementation via such models. The more thorough approach is to do so with terminology systems and supplement them accordingly with ontological models.

Terminologies First
The true value in utilizing a semantic approach to big data governance that focuses on terminology systems, their requisite taxonomies, and vocabularies pertains to the fact that this method is effective for governing unstructured data. Regardless of what particular schema (or lack thereof) is available, organizations can get their data to adhere to governance protocols by focusing on the terms, definitions, and relationships between them. Conversely, ontological models have a demonstrated efficacy with structured data. Given the fact that the majority of new data created is unstructured, the best means of wrapping effective governance policies and practices around them is through leveraging these terminology systems and semantic approaches that consistently achieve governance outcomes.

About the Author: Dr. Jans Aasman, Ph.D., is the CEO of Franz Inc., an early innovator in Artificial Intelligence and leading supplier of Semantic Graph Database technology. Dr. Aasman’s previous experience and educational background include:
• Experimental and cognitive psychology at the University of Groningen, specialization: Psychophysiology, Cognitive Psychology.
• Tenured Professor in Industrial Design at the Technical University of Delft. Title of the chair: Informational Ergonomics of Telematics and Intelligent Products
• KPN Research, the research lab of the major Dutch telecommunication company
• Carnegie Mellon University. Visiting Scientist at the Computer Science Department of Prof. Dr. Allen Newell

Originally Posted at: Improving Big Data Governance with Semantics

How the lack of the right data affects the promise of big data in India

Big data is the big buzzword these days. Big data refers to a collection of data sets or information too large and complex to be processed by standard tools. It is the art and science of combining enterprise data, social data and machine data to derive new insights that would otherwise not be possible to derive. It is also about combining past data with real-time data to predict or suggest outcomes in a current or future context.

The digital footprint is progressively expanding the world over, into fragmented mediums (blogs, tweets, reviews etc.) and technologies (mobile, web, cloud/SaaS etc.).

Digital landscape in India

India’s digital landscape too may be evolving quickly but overall penetration remains low, with only 1 in 5 Indians using the Internet in July 2014.

In India, enterprises and businesses have access to a veritable wealth of information. And though some of the larger organisations have made a start in harnessing this information, most Indian companies are still learning how to collect and store big data.

Telecom providers, online travel agencies and online retail stores are some of the industries that are using big data analytics to engage customers in some way or another.

However, big data analytics is still in its infancy in India. Most companies are still learning to store the data collected. Also, there are several challenges when it comes to the collection of data sets themselves. Past and current data is required to make the application of big data analytics really useful, and there is a scarcity of this in public and private sectors in India. Some of the reasons for the lack of enough data are:

Yet to be fully computerised

Healthcare, economic, and statistical data, in both private and public sectors in India, is yet to be computerised. The main reason for this is the late adoption of IT in India. Unlike in the West, most industries in India made the transition from manual records to computerised information systems only during the last decade.

Over the years, the state and central ministries have made moves towards e-governance.  Efforts to deliver public services, and to make access to these services easier, are being made as well. This is still a work in progress; huge amounts of data across many government sectors are yet to be digitised.

Quality of data

In big data analytics, data sufficiency plays a critical role when samples are run across different dimensions: enough data points are required to make informed analyses. Beyond the quantity of data, the quality of the data being crunched also influences the quality of insights. If the signal-to-noise ratio is low, the accuracy of results may vary for less-than-optimum data samples. In a country like India, there is very little information about individuals, because Indians are not overly expressive, especially on public forums.

Public social media information that is available for most individuals from India lacks quality information about users themselves. Random facts and figures in individual profiles, sharing of spam content, and fake social media accounts that are created for bots are very common in India.

Spam

Social media sites are becoming increasingly vulnerable to spam attacks. Time spent by a captive audience on social media sites opens up windows of opportunities for online threats and spammers.

Again, social media spam lowers the signal-to-noise ratio that defines the quality of big data, which takes away from the accuracy of results.

Cultural and Social influences

In most western markets, insights generated through big data can be applied across the whole consumer base. However, given the extensive cultural and linguistic variation across India, any insight generated for a consumer based out of Chandigarh, for example, will not be directly applicable to a consumer based in Chennai. This problem is made worse by the fact that a lot of local data lives in regional publications, in different languages, and has very limited online visibility.

Unstructured data leads to mapping issues

Big data in India is not structured. Most transactional data in the healthcare and retail segments are stored purely for book-keeping purposes. They have very limited appropriate information of the kind that can help big data analytics map enterprise-generated transactional data with public information.

In the case of developed countries, user data is rich enough to provide demographic or group level markers that can be used to generate customized insights while maintaining individual privacy. Lack of these standard identifiers in Indian consumer data is one of the biggest bottlenecks while mapping various transactional and social records in India.

Handsets and internet connectivity

Even though smart phones are driving the new handset market in India, feature phones still dominate everyday usage. Most connections in India are pre-paid and fewer than 10% of users have access to 3G networks. To add to it, internet connection speeds are amongst the lowest in Asia. As a result, consumer data, especially retail enterprise data, is limited.

As more people in India make the move to smart phones, and internet connectivity improves, there will be an increase in the amount of usable data generated. Because big data analytics is still in its infancy in India, huge efforts will be needed to improve the quality of data stored by organisations and enterprises. However, key contributors to the promise of big data analytics in India are steadily gaining ground. An increase in social media users, and efforts by enterprises, both public and private, towards optimum collection and storage of transactional enterprise data, will contribute to better quality data sets and a better application of big data analytics.

 About the Author: Srikant Sastri is the Co-founder of Crayon Data.

To read the original article on YourStory, click here.

Source by analyticsweekpick

Jun 15, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Data security  Source

[ AnalyticsWeek BYTES]

>> Talent analytics in practice by analyticsweekpick

>> Data center location – your DATA harbour by martin

>> The 10 Commandments for data driven leaders by v1shal

[ NEWS BYTES]

>> Women and Men Now Grocery Shop Equally: Study … – Progressive Grocer (under Prescriptive Analytics)

>> Fast-moving big data changes data preparation process for analytics – TechTarget (under Big Data)

>> Scientists use Tweet ‘sentiment analysis’ to predict Hillary Clinton win – Daily News & Analysis (under Sentiment Analysis)

[ FEATURED COURSE]

Applied Data Science: An Introduction

As the world’s data grow exponentially, organizations across all sectors, including government and not-for-profit, need to understand, manage and use big, complex data sets—known as big data…. more

[ FEATURED READ]

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython

Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored f… more

[ TIPS & TRICKS OF THE WEEK]

Strong business case could save your project
Like anything in corporate culture, the project is oftentimes about the business, not the technology. With data analysis, the same type of thinking goes. It’s not always about the technicality but about the business implications. Data science project success criteria should include project management success criteria as well. This will ensure smooth adoption, easy buy-ins, room for wins and co-operating stakeholders. So, a good data scientist should also possess some qualities of a good project manager.

[ DATA SCIENCE Q&A]

Q: Do you think 50 small decision trees are better than a large one? Why?
A: * Yes!
* More robust model (an ensemble of weak learners that combine to make a strong learner)
* Better to improve a model by taking many small steps than fewer large steps
* If one tree is erroneous, it can be auto-corrected by the following trees
* Less prone to overfitting
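
The robustness point can be illustrated with a little arithmetic. The sketch below computes how often a majority vote of n classifiers is right when each one is correct with probability p, under the strong assumption that their errors are independent. Real trees trained on the same data are correlated, so actual gains are smaller, and this models simple voting, not the sequential correction described above. The function name `majority_vote_accuracy` and the 60% accuracy figure are illustrative only, and 51 voters are used instead of 50 to avoid ties.

```python
from math import comb


def majority_vote_accuracy(n, p):
    """Probability that more than half of n independent voters are correct.

    Assumes a binary task, an odd n, and voters that err independently.
    """
    if n % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")
    needed = n // 2 + 1  # smallest number of correct votes that wins
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(needed, n + 1))


# Compare a single weak learner with ensembles of increasing size.
for n in (1, 11, 51):
    print(n, round(majority_vote_accuracy(n, 0.6), 3))
```

Under these assumptions the ensemble's accuracy rises with the number of voters whenever each voter is better than chance.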

Source

[ VIDEO OF THE WEEK]

#FutureOfData Podcast: Peter Morgan, CEO, Deep Learning Partnership

[ QUOTE OF THE WEEK]

For every two degrees the temperature goes up, check-ins at ice cream shops go up by 2%. – Andrew Hogue, Foursquare

[ PODCAST OF THE WEEK]

#FutureOfData Podcast: Conversation With Sean Naismith, Enova Decisions

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Estimates suggest that by better integrating big data, US healthcare could save as much as $300 billion a year. That is roughly $1,000 a year for every man, woman, and child in the country.

Sourced from: Analytics.CLUB #WEB Newsletter

Big Data: Are you ready for blast-off?

As Technology of Business begins a month-long series of features on the theme of Big Data, we kick off with a Q&A backgrounder answering some of those basic questions you were too afraid to ask.

What is big data?

Good question. After all, we’ve always had large amounts of data, haven’t we, from loyalty card schemes, till receipts, medical records, tax returns and so on?

As Laurie Miles, head of analytics for big data specialist SAS, says: “The term big data has been around for decades, and we’ve been doing analytics all this time. It’s not big, it’s just bigger.”

But it’s the velocity, variety and volume of data that has merited the new term.

So what made it bigger?

Most traditional data was structured, or neatly organised in databases. Then the world went digital and the internet came along. Most of what we do could be translated into strings of ones and noughts capable of being recorded, stored, searched, and analysed.

There was a proliferation of so-called unstructured data generated by all our digital interactions, from email to online shopping, text messages to tweets, Facebook updates to YouTube videos.

As the number of mobile phones grows globally, so does the volume of data they generate from call metadata, texts, emails, social media updates, photos, videos, and location

And the number of gadgets recording and transmitting data, from smartphones to intelligent fridges, industrial sensors to CCTV cameras, has also proliferated globally, leading to an explosion in the volume of data.

These data sets are now so large and complex that we need new tools and approaches to make the most of them.

How much data is there?

Nobody really knows because the volume is growing so fast. Some say that about 90% of all the data in the world today has been created in the past few years.

According to computer giant IBM, 2.5 exabytes – that’s 2.5 billion gigabytes (GB) – of data was generated every day in 2012. That’s big by anyone’s standards. “About 75% of data is unstructured, coming from sources such as text, voice and video,” says Mr Miles.
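
For readers who want to check the conversion, here is a quick sketch. It assumes decimal (SI) units, where a gigabyte is 10^9 bytes and an exabyte is 10^18 bytes; the variable names are just for illustration.

```python
# Decimal (SI) byte units, as powers of ten.
GIGABYTE = 10**9
EXABYTE = 10**18

# 2.5 exabytes, written as 25/10 so the value stays an exact integer.
daily_bytes = 25 * EXABYTE // 10
daily_gigabytes = daily_bytes // GIGABYTE

print(f"{daily_gigabytes:,} GB per day")  # prints "2,500,000,000 GB per day"
```

With binary units (2^30 bytes per gigabyte) the figure would come out somewhat lower, so the "2.5 billion" only holds under the decimal convention.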

And as mobile phone penetration is forecast to grow from about 61% of the global population in 2013 to nearly 70% by 2017, those figures can only grow. The US government’s open data project already offers more than 120,000 publicly available data sets.

Where is it all stored?

The first computers came with memories measured in kilobytes, but the latest smartphones can now store 32GB and many laptops now have one terabyte (1,000GB) hard drives as standard. Storage is not really an issue anymore.

The US National Security Agency has built a huge data centre in Bluffdale, Utah – codenamed Bumblehive – capable of storing a yottabyte of data – that’s one thousand trillion gigabytes

For large businesses “the cost of data storage has plummeted,” says Andrew Carr, UK and Ireland chief executive of IT consultancy Bull. Businesses can either keep all their data on-site, in their own remote data centres, or farm it out to “cloud-based” data storage providers.

A number of open source platforms have grown up specifically to handle these vast amounts of data quickly and efficiently, including Hadoop and NoSQL databases such as MongoDB and Cassandra.

Why is it important?

Data is only as good as the intelligence we can glean from it, and that entails effective data analytics and a whole lot of computing power to cope with the exponential increase in volume.

But a recent Bain & Co report found that, of 400 large companies, those that had already adopted big data analytics “have gained a significant lead over the rest of the corporate world.”

“Big data is not just historic business intelligence,” says Mr Carr, “it’s the addition of real-time data and the ability to mash together several data sets that makes it so valuable.”

Practically, anyone who makes, grows and sells anything can use big data analytics to make their manufacturing and production processes more efficient and their marketing more targeted and cost-effective.

It is throwing up interesting findings in the fields of healthcare, scientific research, agriculture, logistics, urban design, energy, retailing, crime reduction, and business operations – several of which we’ll be exploring over the coming weeks.

By analysing weather, soil, topography and GPS tractor data, farmers can increase crop yields

“It’s a big deal for corporations, for society and for each individual,” says Ralf Dreischmeier, head of The Boston Consulting Group’s information technology practice.

Can we handle all this data?

Big data needs new skills, but the business and academic worlds are playing catch-up. “The job of data scientist didn’t exist five or 10 years ago,” says Duncan Ross, director of data science at Teradata. “But where are they? There’s a shortage.”

And many businesses are only just waking up to the realisation that data is a valuable asset that they need to protect and exploit. “Banks only use a third of their available data because it often sits in databases that are hard to access,” says Mr Dreischmeier.

“We need to find ways to make this data more easily accessible.”

Businesses, governments and public bodies also need to keep sensitive data safe from hackers, spies and natural disasters – an increasingly tall order in this mobile, networked world.

Who owns it all?

That’s the billion dollar question. A lot depends on the service provider hosting the data, the global jurisdiction it is stored in, and how it was generated. It is a legal minefield.

Facebook’s logo – created using photos of its global users – adorns the wall of a new data centre in Sweden – its first outside the US. But who has rights to all the data?

Does telephone call metadata – the location, time, and duration of calls rather than their conversational content – belong to the caller, the phone network or any government spying agency that happens to be listening in?

When our cars become networked up, will it be the drivers, owners or manufacturers who own the data they generate?

Social media platforms will often say that their users own their own content, but then lay claim to how that content is used, reserving the right to share it with third parties. So when you tweet you effectively give up any control over how that tweet is used in future, even though Twitter terms and conditions say: “What’s yours is yours.”

Privacy and intellectual property laws have not kept up with the pace of technological change.

Originally posted via “Big Data: Are you ready for blast-off?”

Source: Big Data: Are you ready for blast-off? by anum

It’s Time to Tap into the Cloud Data Protection Market Opportunity

Until now, most businesses did not have the access or resources to implement more complete data protection, including advanced backup, disaster recovery, and secure file sync and share. In fact, a recent study from research firm IDC found that 70% of SMBs have insufficient disaster recovery protection today. At the same time, a recent Spiceworks survey reported that cloud backup and recovery is the top cloud service that IT Pros plan to start using in the next six months.

The good news is that companies today have more options for data protection than ever before. The cloud makes enterprise-grade backup and disaster recovery solutions accessible and affordable for SMBs–and this translates into a massive market opportunity for service providers.

At Acronis, we believe that service providers are uniquely positioned to tap into the cloud to bring best-in-class data protection services to their customers.

We all know that service providers are experts at providing IT services, including administration, maintenance and customer support. They’ve opened up the door to cloud computing for businesses of all sizes, especially for SMBs.

But service providers do much more than provide cloud solutions, servers and storage. For example, they are constantly improving the efficiency and cost-effectiveness of the solutions they deliver, including integrating different services into completely transparent and uniform services for their customers.

Service providers also look for opportunities to continuously enhance their offerings to provide end customers with the best possible solutions–now and in the future. Finally, service providers are the best cost managers in the business–they know how to scale solutions and make them easier to buy and deploy for end users. This relentless focus on cost-effectiveness benefits both their businesses with higher margins and their end customers with better value at a lower cost.

This is why Acronis delivers a complete set of cloud data protection solutions for service providers. We know service providers, and we know what it takes to make them successful. And there is a huge and unmet market need for easy, complete and affordable data protection for small and midsize businesses.

The bottom line: Now’s the ideal time to check out how you can grow your business with the latest solutions in cloud data protection, leveraging highly flexible go-to-market models and support for the broadest range of service provider workloads.

If you’d like to learn more about how Acronis can help you quickly tap into the growing market for cloud data protection services, you’ll find more information about our solutions here.

Read more at: http://mspmentor.net/blog/it-s-time-tap-cloud-data-protection-market-opportunity

Originally Posted at: It’s Time to Tap into the Cloud Data Protection Market Opportunity by analyticsweekpick

Jun 08, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

image
Data interpretation  Source

[ AnalyticsWeek BYTES]

>> 8 Best Practices to Maximize ROI from Predictive Analytics by analyticsweekpick

>> Data Driven Innovation: A Primer by v1shal

>> Map of US Hospitals and their Patient Experience Ratings by bobehayes

Wanna write? Click Here

[ NEWS BYTES]

>>
 RFx (request for x) encompasses the entire formal request process and can include any of the following: – TechTarget Under  Sales Analytics

>>
 SAP’s Leonardo points towards Applied Data Science as a Service – Diginomica Under  Data Science

>>
 Four ways to create the ultimate personalized customer experience – TechTarget Under  Customer Experience

More NEWS ? Click Here

[ FEATURED COURSE]

Pattern Discovery in Data Mining

image

Learn the general concepts of data mining along with basic methodologies and applications. Then dive into one subfield in data mining: pattern discovery. Learn in-depth concepts, methods, and applications of pattern disc… more

[ FEATURED READ]

Superintelligence: Paths, Dangers, Strategies

image

The human brain has some capabilities that the brains of other animals lack. It is to these distinctive capabilities that our species owes its dominant position. Other animals have stronger muscles or sharper claws, but … more

[ TIPS & TRICKS OF THE WEEK]

Analytics Strategy that is Startup Compliant
With the right tools, capturing data is easy, but being unable to handle that data can lead to chaos. One of the most reliable strategies for a startup adopting data analytics is TUM, or The Ultimate Metric: the metric that matters most to your startup. Some advantages of TUM: it answers the most important business question, it cleans up your goals, it inspires innovation, and it helps you understand the entire quantified business.

[ DATA SCIENCE Q&A]

Q:When you sample, what bias are you inflicting?
A: Selection bias:
– An online survey about computer use is likely to attract people who are more interested in technology than is typical

Undercoverage bias:
– Sampling too few observations from a segment of the population

Survivorship bias:
– Observations at the end of the study are a non-random set of those present at the beginning of the investigation
– In finance and economics: the tendency for failed companies to be excluded from performance studies because they no longer exist
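
The finance example of survivorship bias is easy to reproduce in a toy simulation. Everything here is hypothetical: 10,000 made-up funds whose annual returns come from the same distribution (mean 0%, standard deviation 10%), and an assumed rule that any fund losing more than 10% closes and drops out of the record.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical annual returns for 10,000 funds, all drawn from one
# distribution with a true mean of zero.
returns = [random.gauss(0.0, 0.10) for _ in range(10_000)]

# Assumed closure rule: funds that lose more than 10% disappear,
# so a later study only sees the survivors.
survivors = [r for r in returns if r > -0.10]

print(f"mean return, all funds:      {mean(returns):+.4f}")
print(f"mean return, survivors only: {mean(survivors):+.4f}")
print(f"funds dropped from sample:   {len(returns) - len(survivors)}")
```

The survivors look better than the full population even though no fund had any real edge. The gap comes entirely from which observations were left in the sample.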

Source

[ VIDEO OF THE WEEK]

@AnalyticsWeek: Big Data Health Informatics for the 21st Century: Gil Alterovitz

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Data beats emotions. – Sean Rad, founder of Ad.ly

[ PODCAST OF THE WEEK]

#DataScience Approach to Reducing #Employee #Attrition

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

100 terabytes of data are uploaded daily to Facebook.

Sourced from: Analytics.CLUB #WEB Newsletter