Big Data and Analytics Trends 2018

2017 was the year ML/AI technologies became mainstream and businesses became conversant with their applications and possible use cases. 2018, however, will be the year when trends from 2015-17 finally mature and we are able to see results. Commercial application of blockchain (beyond Bitcoin), wider acceptance of enterprise SaaS solutions, and optimization of investments in data lakes are just some of the examples we can think of. Customers, too, are beginning to understand more about data privacy and security, and want more control over their own data. It seems that in 2018 technology and business will finally move ahead together.

The highlights of 2018 will be as below.

[Infographic: Big Data & Analytics Trends 2018]

Source by analyticsweek

Sep 20, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Data security

[ AnalyticsWeek BYTES]

>> Leveraging Virtualization to Streamline Data Management by analyticsweekpick

>> How to Successfully Incorporate Analytics Into Your Growth Marketing Process by analyticsweek

>> Benefits of IoT for Hospitals and Healthcare by analyticsweek


[ NEWS BYTES]

>> DC BLOX Plans $785M Data Center Investment in AL – Commercial Property Executive (under Data Center)

>> How Legacy Systems Stifle Marketing Analytics – eMarketer (under Marketing Analytics)

>> How artificial intelligence will transform sales – Raconteur (under Sentiment Analysis)

[ FEATURED COURSE]

Probability & Statistics


This course introduces students to the basic concepts and logic of statistical reasoning and gives the students introductory-level practical ability to choose, generate, and properly interpret appropriate descriptive and… more

[ FEATURED READ]

On Intelligence


Jeff Hawkins, the man who created the PalmPilot, Treo smart phone, and other handheld devices, has reshaped our relationship to computers. Now he stands ready to revolutionize both neuroscience and computing in one strok… more

[ TIPS & TRICKS OF THE WEEK]

Keeping biases in check during the last mile of decision making
Today, a data-driven leader, data scientist or data-driven expert is constantly put to the test by helping the team solve problems using his or her skills and expertise. Believe it or not, part of that decision tree is derived from intuition, which adds a bias to our judgement and taints the suggestions. Most skilled professionals understand and handle these biases well, but in a few cases we give in to tiny traps and find ourselves caught in biases that impair our judgement. So it is important to keep intuition bias in check when working on a data problem.

[ DATA SCIENCE Q&A]

Q: How would you come up with a solution to identify plagiarism?
A: * Vector space model approach
* Represent the documents (the suspect and the original ones) as vectors of terms
* Terms: n-grams, with n from 1 up to as large as is practical (larger n detects passage plagiarism)
* Measure the similarity between the two documents
* Similarity measures: cosine similarity, Jaro-Winkler, Jaccard
* Declare plagiarism above a certain threshold
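
The steps in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the whitespace tokenizer, the n-gram range of 1 to 3 and the 0.5 threshold are all hypothetical choices, not part of the original answer, and a real system would need proper tokenization and a tuned threshold.

```python
import math
from collections import Counter


def ngram_counts(text, n_max=3):
    """Represent a document as counts of word n-grams for n = 1..n_max."""
    words = text.lower().split()  # naive tokenizer (assumption)
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Share of distinct n-grams that the two documents have in common."""
    union = a.keys() | b.keys()
    if not union:
        return 0.0
    return len(a.keys() & b.keys()) / len(union)


def is_plagiarism(suspect, original, threshold=0.5):
    """Flag the suspect text when cosine similarity reaches the threshold."""
    score = cosine_similarity(ngram_counts(suspect), ngram_counts(original))
    return score >= threshold
```

Cosine similarity weighs how often each n-gram occurs, while Jaccard only looks at which n-grams are shared, so the two can disagree on long documents with a few copied passages.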


[ VIDEO OF THE WEEK]

@SidProbstein / @AIFoundry on Leading #DataDriven Technology Transformation #FutureOfData #Podcast


[ QUOTE OF THE WEEK]

You can have data without information, but you cannot have information without data. – Daniel Keys Moran

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Eloy Sasot, News Corp


[ FACT OF THE WEEK]

In that same survey, by a small but noticeable margin, executives at small companies (fewer than 1,000 employees) are nearly 10 percent more likely to view data as a strategic differentiator than their counterparts at large enterprises.

Sourced from: Analytics.CLUB #WEB Newsletter

Why Entrepreneurship Should Be Compulsory In Schools


It is no surprise where the global economy is going. With high unemployment and traditional businesses preparing to improve worker efficiency and achieve more with less, world unemployment statistics will soon start to cause some fear (if they are not already). In such an economy, we need more entrepreneurs: more self-motivated people who believe in providing for other people’s needs. All of us have used education to prepare ourselves for our current jobs (though there are always questions about its relevance and quality). Imagine if the education framework were used to prepare the next generation of entrepreneurs.

So, why should the education system make room for entrepreneurship in its curriculum? If you are already living the life of an entrepreneur, it is not a difficult sell. In fact, you probably know more reasons than I do, but if not, here are 5 reasons (out of many) that come to my mind.

1. Prepping for business 2.0: Compare the size of recent companies with that of companies formed a decade or two ago. It has completely changed. Businesses are no longer contained in giant buildings accommodating thousands of employees delivering value. Today’s businesses require a much smaller setup and can hit the ground running with a low capital infusion. So why prepare the workforce, through education, for the old-school company landscape? It is extinct. The world is changing rapidly to accommodate fast-moving professionals who are dynamic enough to deliver value from anywhere in the world. What could prepare tomorrow’s workforce better than the system that produces it?

2. Easier to take risks and failures early in life: I am sure you have heard of “fail early, fail fast”. You may be scratching your head, thinking I am quoting it in the wrong context. Take a deep breath and think again. It applies very well here. If students were given a chance at an early age to be entrepreneurs and take risks, what is the worst that could happen? Failure? They would still have the prime of their lives ahead of them to recover from it. These young Turks would have a system of teachers, parents, community mentors and others wrapped around them to help them experiment, fail and iterate. So chances are high that these students would learn the art of failing and iterating so early that they would have their entire lives ahead of them to reap the benefits. Compared with the older generation, they would get to experience more failures and have more chances at success.

3. Expectations are low and energy levels are high: Remember your tender young days and what money meant then? Some extra candy, a party with friends and lots of fun. You were not planning to rule the world. The urge to rule the world is one of the mindsets that kills most startup dreams. Escalated expectations, and failing to achieve too much too soon, are disappointing. An early start comes to the rescue here as well. With smaller milestones, more focus can be given to achieving them. Also, the high energy level that kids have is difficult to match. Rarely can anyone compete with the energy of youngsters, so why not use it to their advantage? With low expectations and high energy, they are bound to achieve something substantial that boosts their confidence. This is an important ingredient in building their character.

4. More time to learn and master the craft: Many times I have heard (and I assume you have as well), “I wish I had done this earlier; I am too old to do it now”. Sure, this is sometimes a lame excuse to cover our laziness, but sometimes it comes straight from the heart. Starting early means clocking more experience to master the craft. The world is changing every minute, so getting students involved early will give them the tools and capabilities needed to survive and succeed in it. The methodologies used 50 years ago might no longer work, so getting students used to change early provides the much-needed early intervention that helps our species adapt.

5. A picture is worth 1000 words; it is all practical and makes learning effective: Enough talk about why it is a good business decision and sane career advice. What about students’ academic and learning cycles? Remember the famous saying above? It applies here as well. Doing something that you would have to do anyway a couple of years later opens students’ horizons on what life is like and gives them an early sneak peek, so that they can prepare well and fail less. Entrepreneurship can teach students things that are important for our existence but that most of us never get to experience. So a subject with lifelike experience early on is nothing but a crash course in life, giving students a better understanding of their world and of how they want to make a dent in it.

I am almost tempted to list a few more points on the importance of introducing entrepreneurship at school and what it could do for students’ education and careers, but I am okay with the 5th point doing some indirect justice to most of the others. An entrepreneurship course will prepare students better for life and let them see how education can directly affect their lives. It may also improve drop-out rates once students see the value of education and what it is preparing them for: LIFE.


Source: Why Entrepreneurship Should Be Compulsory In Schools

Intel bets on real-time analytics with high-end processor refresh

The chip company launches a new generation of its Xeon E7 processor that it hopes will spur companies to crunch data as it is collected.

Intel today refreshed its Xeon E7 family of processors with a line-up aimed at helping businesses carry out real-time analytics on big datasets.

The chip giant launched the third generation of its high-end server chips, based on its Haswell-EX microarchitecture.

Intel sees a use for these new processors in carrying out analytics on large datasets as they are collected by enterprises.

Scott Pendrey, Intel server product manager for EMEA, said exponential growth in data being collected by firms will drive renewal of analytics infrastructure.

“We continue to see larger and larger volumes of data every year. By 2020 we’re forecasting something like 50 billion devices and 44 zettabytes of data – huge amounts of information.”

Intel is betting that firms will have an appetite for carrying out real-time analytics on that data – necessitating machines with the compute and memory capacity of E7-based systems.

“It’s the real-time shift into analytics that is the big business case that we see continuing to grow.”

To boost performance for these types of analytic workloads, the new E7 range includes a feature called TSX (Transactional Synchronisation Extensions).

In tests, TSX has helped the SAP Hana in-memory analytics platform achieve six times faster data transaction times than when using previous generation processors, according to Pendrey.

On average, Intel claims the new generation of processors could deliver a 40 percent improvement in performance over their predecessors when handling “mainstream workloads”.

Other optimisations in the processor family are able to boost code written to run in parallel, with Pendrey saying such code could execute 70 percent faster when run on the top of the range offerings in the new E7 line-up, due to the higher core count, processor cache and support for AVX2 extensions.

Intel has upped the maximum number of processor cores in the E7 line, from 15 to 18, and increased cache per socket from 37.5MB to 45MB. Also new is support for DDR4 alongside DDR3 memory. DDR4 allows for a greater memory density with a lower power consumption than DDR3, as well as operating at a higher speed. However, DDR4 is also about 20 percent more expensive than the equivalent amount of DDR3 memory.

Each Xeon E7 processor has native support for up to eight sockets per system, with each socket supporting up to 1.5TB, which takes the total memory per eight-way system up to 12TB.

To help satisfy Intel’s goal of the E7 being used for mission-critical workloads, the new processors also offer 40 reliability, availability and serviceability features – including memory sparing to support back-up memory modules and various provisions for error-checking.

Like the E5, the E7 also includes the AES-NI instruction set to accelerate data encryption.

Intel and its partners – such as Dell, HP and Fujitsu – will launch about 15 systems based on the new processors today, with that number rising to 40 within 30 days.

The cost of the processors will match the previous generation E7 but systems may be more expensive due to the use of DDR4 memory.

Originally posted via “Intel bets on real-time analytics with high-end processor refresh”

Originally Posted at: Intel bets on real-time analytics with high-end processor refresh by analyticsweekpick

Sep 13, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Data security

[ NEWS BYTES]

>> Top 10 Cloud Computing Challenges – Datamation (under Cloud Security)

>> Statistics Show More Signs Of The Tourism Slowdown – Reykjavík Grapevine (under Statistics)

>> IoT In Action – Introducing Azure Sphere – Microsoft – Channel 9 (under IOT)

[ FEATURED COURSE]

Lean Analytics Workshop – Alistair Croll and Ben Yoskovitz


Use data to build a better startup faster in partnership with Geckoboard… more

[ FEATURED READ]

Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking


Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the “data-analytic thinking” necessary for e… more

[ TIPS & TRICKS OF THE WEEK]

Data aids, not replaces, judgement
Data is a tool and a means to help build consensus and facilitate human decision-making, not to replace it. Analysis converts data into information; information, via context, leads to insight. Insights lead to decisions, which ultimately lead to outcomes that bring value. So data is just the start; context and intuition also play a role.

[ DATA SCIENCE Q&A]

Q: Give examples of data that has neither a Gaussian nor a log-normal distribution.
A: * Allocation of wealth among individuals
* Values of oil reserves among oil fields (many small ones, a small number of large ones)


[ VIDEO OF THE WEEK]

@DrewConway on fabric of an IOT Startup #FutureOfData #Podcast


[ QUOTE OF THE WEEK]

Without big data, you are blind and deaf and in the middle of a freeway. – Geoffrey Moore

[ PODCAST OF THE WEEK]

@EdwardBoudrot / @Optum on #DesignThinking & #DataDriven Products #FutureOfData #Podcast


[ FACT OF THE WEEK]

We are seeing a massive growth in video and photo data, where every minute up to 300 hours of video are uploaded to YouTube alone.

Sourced from: Analytics.CLUB #WEB Newsletter

November 21, 2016 Health and Biotech analytics news roundup

Here’s the latest in health and biotech analytics, in particular some new partnerships between academia and industry:

Analytical Booster Platform to Deliver “Smarter Healthcare”: THB (Technology, Healthcare, Big Data Analytics) is targeting the Indian healthcare system. They aim to give providers the “right information and tools at the right time.”

Pitt, Pfizer team up on health data analytics: The one-year partnership will use public and private data to find relationships between brain disease, brain imaging, and genetic markers.

Broad Institute Teams Up With Intel To Integrate Genomic Data From Diverse Sources And Enhance Genomic Data Analytic Capabilities: The Intel-Broad Center for Genomic Data Engineering will seek to optimize tools to be used on Intel-based computational platforms. They will also seek to enable collaborations through common workflow models.

UC San Francisco and GE Healthcare Launch Deep Learning Partnership to Advance Care Globally: They will be developing deep learning algorithms to help with many facets of healthcare problem solving, like determining what requires normal care and what requires quick intervention.

Source: November 21, 2016 Health and Biotech analytics news roundup

Innovation at The Power of Incubation


Having worked with corporate innovation and seen innovations evolve in different sectors, I find it easier to imagine what an innovation cycle entails. Some companies do it better than others, but most spend a lot of money on innovation with little visibility into the outcome and returns. And in large corporates, a strong sense of bias still exists at all levels that can taint a disruptive idea even before it can be executed well. So, how do we fix that?
Organize an incubator to disrupt your business. Wait, don’t panic; let me explain how it could really help a Fortune company stay in business for the foreseeable future. Incubators are the next level of business case competition, but more rigorous, longer in duration and more effective. In an incubator, you don’t just get the next big idea for the company; you also get the one that can be executed in the most effective way.

Here are my 5 reasons why it is relevant:

1. It lets you find opportunities that were not visible to your focused eyes: Startup entrepreneurs have an open mind to try out the most amazing and complex projects. They also don’t face the restrictions (legal, bureaucracy, brand impact, accountability to shareholders) that pose a big hurdle to innovation in large corporates. Startups are also free from the multiple biases that plague large corporations. Hence, it is much easier and faster to innovate and try out new things in a lean, bureaucracy-free startup environment. Incubators bring out the best of both worlds: the best minds compete to bring the best products to market, while the corporate provides the necessary support without its culture being able to taint the ideas.

2. Stay close to the ideas that could disrupt your market and sleep better: What Salesforce did to Oracle, and what Amazon did to retail stores, is not something you would look forward to. If you are not keeping your eyes and ears open to the next disruption, you might miss the very last boat that could keep you afloat. Incubators can act as a breeding ground for disruptive ideas, not just for your current market landscape, but also for things that might change the marketplace for your goods forever. So an incubator gives you an opportunity to grow in your own land as well as in neighboring lands while investing limited resources.

3. Opportunity to hire entrepreneurs, who rarely show up in HR’s resume pile: Incubators can be a great place to spot talent, especially people who are motivated to endure in the land of the unknown and can make things happen. It is HR’s dream to hire the 20% who lift the other 80% of the company and take it to the GREEN zone. I have held numerous roles and seen a varied talent pool. True entrepreneurs always stand out: hustling, giving 110% of their heart and soul to make things happen. Money does not motivate them, but creating something useful does. Every company craves people who think out of the box and are lean, fast and ambitious enough to make things happen. For the sake of sustainability, this is the talent that every organization needs, especially for fostering innovation and success.

4. Keeps you current, sexy and relevant: Think of Larry Ellison dismissing the concept of cloud/big data as fluff, Steve Ballmer laughing at the iPhone, or BlackBerry’s tumble. Every big conglomerate lives in its own bubble, with a limited window onto the world it lives in. Reality distortion holds for almost all big companies: the bigger the size, the thicker the lens, the poorer the vision. Startups tend to stay current and act on the latest and greatest methodologies that exist today. Big companies get that know-how for free if they are associated with startups. They can be part of what is changing in the world and how, and see what roles startups play, so they can adapt their practices and stay current. Companies don’t need to invest millions to get ideas on staying current; sometimes all it takes is thousands to make the difference.

5. Good karma points, positive PR and strong brand building: Last but not least, incubators can really help the brand image of large corporates, as startups are considered interesting, sexy and young. They attract the youth and early adopters. Press and media want to find and write about the next big idea in the industry. So being attached to an incubator and startups gets you good media coverage and publicity and creates brand awareness. It also creates positive vibes among existing customers, reinforces their support for the brand and attracts new customers.

Many large corporates have leveraged incubation to get an edge in innovation in their industries, among them Pepsi, GE, Nike and Microsoft. All of these companies are pioneers in their respective fields and have profited from their involvement with incubators and startups.

What should we do?

No, you don’t have to get into the incubation business; there are tons of incubators out there. Find one and partner with it to start fixing some of the innovation loopholes that data innovation could not fix. Yes, big data innovation is still super relevant, and yes, incubators can help you innovate as well. So there is more than one easy, cost-effective way for big enterprises to innovate.

Here is a quick video by Christie Hefner on designing a corporate culture that is open to all ideas.


Sep 06, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Fake data

[ AnalyticsWeek BYTES]

>> 5 tips to becoming a big data superhero by analyticsweekpick

>> What Are the 3 Critical Keys to Healthcare Big Data Analytics? by analyticsweekpick

>> How Big Data Is Changing The Entertainment Industry! by analyticsweekpick


[ NEWS BYTES]

>> Insurers pay premium for cyber security experts – Financial Times (under cyber security)

>> Lawmakers Unveil Plans for Agency Telework and Cloud Security – Nextgov (under Cloud Security)

>> As VMworld nears, virtualization disrupts the cloud application ecosystem – SiliconANGLE News (blog) (under Virtualization)

[ FEATURED COURSE]

Artificial Intelligence


This course includes interactive demonstrations which are intended to stimulate interest and to help students gain intuition about how artificial intelligence methods work under a variety of circumstances…. more

[ FEATURED READ]

Hypothesis Testing: A Visual Introduction To Statistical Significance


Statistical significance is a way of determining if an outcome occurred by random chance, or did something cause that outcome to be different than the expected baseline. Statistical significance calculations find their … more


[ DATA SCIENCE Q&A]

Q: Explain likely differences between administrative datasets and datasets gathered from experimental studies. What are likely problems encountered with administrative data? How do experimental methods help alleviate these problems? What problems do they bring?
A: Advantages of administrative data:
– Lower cost
– Large coverage of the population
– Captures individuals who may not respond to surveys
– Regularly updated, allowing consistent time series to be built up

Disadvantages of administrative data:
– Restricted to data collected for administrative purposes (limited to administrative definitions; for instance, the income of a married couple rather than of individuals, which can be more useful)
– Lack of researcher control over content
– Missing or erroneous entries
– Quality issues (addresses may not be updated, or only a postal code may be provided)
– Data privacy issues
– Underdeveloped theories and methods (sampling methods…)


[ VIDEO OF THE WEEK]

Surviving Internet of Things


[ QUOTE OF THE WEEK]

It is a capital mistake to theorize before one has data. Insensibly, one begins to twist the facts to suit theories, instead of theories to suit facts. – Arthur Conan Doyle

[ PODCAST OF THE WEEK]

George (@RedPointCTO / @RedPointGlobal) on becoming an unbiased #Technologist in #DataDriven World #FutureOfData #Podcast


[ FACT OF THE WEEK]

Market research firm IDC has released a new forecast that shows the big data market is expected to grow from $3.2 billion in 2010 to $16.9 billion in 2015.

Sourced from: Analytics.CLUB #WEB Newsletter

Creating Great Choices to Enable #FutureOfWork by @JenniferRiel #JobsOfFuture #Podcast

 

In this podcast, Jennifer Riel (@JenniferRiel) sat down with Vishal (@Vishaltx from @AnalyticsWeek) to discuss her book “Creating Great Choices: A Leader’s Guide to Integrative Thinking”. She sheds light on the importance of integrative thinking in generating long-lasting solutions. She shares some of the innovative ways businesses can approach creative problem solving that prevent bias and isolation and bring diversity of opinion. Jennifer also speaks about the challenges that tribalism brings to the quality of decision making. This conversation and her book are great for anyone looking to create a future-proof organization that takes measured decisions for effective outcomes.

Her Book Link:
Creating Great Choices: A Leader’s Guide to Integrative Thinking by Jennifer Riel (Author), Roger L. Martin (Author) https://amzn.to/2JGeljS

Jennifer’s Recommended Read:
Pride and Prejudice by Jane Austen and Tony Tanner https://amzn.to/2MbHkeb
Thinking, Fast and Slow by Daniel Kahneman https://amzn.to/2sNzgbt
The Righteous Mind: Why Good People Are Divided by Politics and Religion by Jonathan Haidt https://amzn.to/2xUZFZD
Give and Take: Why Helping Others Drives Our Success by Adam M. Grant Ph.D. https://amzn.to/2xYtWHa

Podcast Link:
iTunes: http://math.im/jofitunes
GooglePlay: http://math.im/jofgplay

Here is Jennifer’s Bio:
Jennifer Riel is an adjunct professor at the Rotman School of Management, University of Toronto, specializing in creative problem solving. Her focus is on helping everyone, from undergraduate students to business executives, to create better choices, more of the time.

Jennifer is the co-author of Creating Great Choices: A Leader’s Guide to Integrative Thinking (with Roger L. Martin, former Dean of the Rotman School of Management). Based on a decade of teaching and practice with integrative thinking, the book lays out a practical methodology for tackling our most vexing business problems. Using illustrations from organizations like LEGO, Vanguard and Unilever, the book shows how individuals can leverage the tension of opposing ideas to create a third, better way forward.

An award-winning teacher, Jennifer leads training on integrative thinking, strategy and innovation at organizations of all types, from small non-profits to some of the largest companies in the world.

About #Podcast:
The #JobsOfFuture podcast is a conversation starter that brings leaders, influencers and leading practitioners onto the show to discuss their journeys in creating the work, worker and workplace of the future.

Want to sponsor?
Email us @ info@analyticsweek.com


Source: Creating Great Choices to Enable #FutureOfWork by @JenniferRiel #JobsOfFuture #Podcast by v1shal

Aug 30, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Convincing

[ AnalyticsWeek BYTES]

>> October 10, 2016 Health and Biotech Analytics News Roundup by pstein

>> Ten Guidelines for Clean Customer Feedback Data by bobehayes

>> Apr 20, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin


[ NEWS BYTES]

>> Data Analytics Add Value to Healthcare Supply Chain Management – RevCycleIntelligence.com (under Health Analytics)

>> The men’s fashion company that’s part apparel, part big data – Marketplace.org (under Big Data)

>> TV Time’s New Analytics Tool Breaks Down Fan Reaction to Shows … – Variety (under Social Analytics)

[ FEATURED COURSE]

Introduction to Apache Spark


Learn the fundamentals and architecture of Apache Spark, the leading cluster-computing framework among professionals…. more


[ TIPS & TRICKS OF THE WEEK]

Winter is coming, warm your Analytics Club
Yes and yes! As we head into winter, what better time to talk about our increasing dependence on data analytics to help with our decision making? Data- and analytics-driven decision making is rapidly sneaking its way into our core corporate DNA, and we are not building practice grounds to test those models fast enough. Such snug-looking models have hidden nails that could inflict uncharted pain if they go unchecked. This is the right time to start thinking about setting up an Analytics Club [Data Analytics CoE] in your workplace to lab out best practices and provide a test environment for those models.

[ DATA SCIENCE Q&A]

Q: What is an outlier? Explain how you might screen for outliers and what you would do if you found them in your dataset. Also, explain what an inlier is, how you might screen for inliers, and what you would do if you found them in your dataset.
A: Outliers:
– An observation point that is distant from other observations
– Can occur by chance in any distribution
– Often, they indicate measurement error or a heavy-tailed distribution
– Measurement error: discard them or use robust statistics
– Heavy-tailed distribution: high skewness, can’t use tools assuming a normal distribution
– Three-sigma rules (normally distributed data): 1 in 22 observations will differ by twice the standard deviation from the mean
– Three-sigma rules: 1 in 370 observations will differ by three times the standard deviation from the mean

Three-sigma rules example: in a sample of 1000 observations, the presence of up to 5 observations deviating from the mean by more than three times the standard deviation is within the range of what can be expected, being less than twice the expected number and hence within 1 standard deviation of the expected number (Poisson distribution).

If the nature of the distribution is known a priori, it is possible to check whether the number of outliers deviates significantly from what can be expected. For a given cutoff (samples fall beyond the cutoff with probability p), the number of outliers among n samples can be approximated by a Poisson distribution with lambda = pn. Example: for a normal distribution with a cutoff 3 standard deviations from the mean, p ≈ 0.3%, so for n = 1000 samples the number of samples whose deviation exceeds 3 sigmas is approximately Poisson with lambda = 3.
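
The Poisson approximation described here can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and it computes p from the normal distribution directly (about 0.27%, which the answer rounds to 0.3%), so lambda comes out near 2.7 rather than exactly 3 for n = 1000.

```python
import math


def tail_probability(k_sigma):
    """Two-sided probability that a normal value lies more than
    k_sigma standard deviations from the mean."""
    return math.erfc(k_sigma / math.sqrt(2))


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i)
               for i in range(k + 1))


def prob_more_outliers_than(k, n, k_sigma=3.0):
    """Approximate probability of seeing more than k points beyond
    k_sigma standard deviations in n normal samples, with lambda = p * n."""
    lam = n * tail_probability(k_sigma)
    return 1.0 - poisson_cdf(k, lam)
```

Calling `prob_more_outliers_than(5, 1000)` gives the chance of more than five 3-sigma points in 1000 normal observations; a small value suggests that such a count would be worth investigating rather than attributing to chance.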

Identifying outliers:
– No rigid mathematical method
– Subjective exercise: be careful
– Boxplots
– QQ plots (sample quantiles Vs theoretical quantiles)
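
As one concrete way to screen with the boxplot idea, the sketch below applies the common 1.5 × IQR whisker rule. The function names and the factor 1.5 are assumptions for illustration; as noted above, no rule is rigid, so flagged points should be inspected rather than dropped automatically.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return the (low, high) boxplot fences: quartiles widened by k * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """List the values that fall outside the boxplot fences."""
    low, high = tukey_fences(values, k)
    return [v for v in values if v < low or v > high]
```

For example, `flag_outliers([10, 12, 11, 13, 12, 11, 10, 12, 95])` flags only the value 95.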

Handling outliers:
– Depends on the cause
– Retention: when the underlying model is confidently known
– Regression problems: only exclude points which exhibit a large degree of influence on the estimated coefficients (Cook’s distance)

Inlier:
– Observation lying within the general distribution of other observed values
– Doesn’t perturb the results but is non-conforming and unusual
– Simple example: observation recorded in the wrong unit (°F instead of °C)

Identifying inliers:
– Mahalanobis distance
– Measures the distance between two random vectors
– Difference from Euclidean distance: accounts for correlations
– Handling: discard them
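
To illustrate how the Mahalanobis distance accounts for correlations, here is a minimal two-dimensional sketch in plain Python. The function names and the restriction to two variables are assumptions made to keep the example self-contained; in practice a linear-algebra library would handle any number of dimensions.

```python
import math


def mean_and_cov_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(point, mean, cov):
    """Distance of a point from the mean, scaled by the covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, with the 2x2 inverse written out
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)
```

With strongly correlated data such as `[(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.1)]`, the point `(4, 2)` lies inside the observed range of both variables yet breaks the correlation, so its Mahalanobis distance is far larger than that of `(4, 4)`, which follows the trend.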


[ VIDEO OF THE WEEK]

@TimothyChou on World of #IOT & Its #Future Part 1 #FutureOfData #Podcast


[ QUOTE OF THE WEEK]

The world is one big data problem. – Andrew McAfee

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with @MichOConnell, @Tibco


[ FACT OF THE WEEK]

571 new websites are created every minute of the day.

Sourced from: Analytics.CLUB #WEB Newsletter