Sep 19, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
SQL Database  Source

[ AnalyticsWeek BYTES]

>> 22 tips for better data science by analyticsweekpick

>> 6 Questions to Ask When Preparing Data for Analysis by analyticsweek

>> The UX of Brokerage Websites by analyticsweek

Wanna write? Click Here

[ FEATURED COURSE]

Tackle Real Data Challenges

image

Learn scalable data management, evaluate big data technologies, and design effective visualizations…. more

[ FEATURED READ]

Machine Learning With Random Forests And Decision Trees: A Visual Guide For Beginners

image

If you are looking for a book to help you understand how the machine learning algorithms “Random Forest” and “Decision Trees” work behind the scenes, then this is a good book for you. Those two algorithms are commonly u… more

[ TIPS & TRICKS OF THE WEEK]

Strong business case could save your project
Like anything in corporate culture, the project is oftentimes about the business, not the technology. With data analysis, the same type of thinking goes. It’s not always about the technicality but about the business implications. Data science project success criteria should include project management success criteria as well. This will ensure smooth adoption, easy buy-ins, room for wins and co-operating stakeholders. So, a good data scientist should also possess some qualities of a good project manager.

[ DATA SCIENCE Q&A]

Q:What is: collaborative filtering, n-grams, cosine distance?
A: Collaborative filtering:
– Technique used by some recommender systems
– Filtering for information or patterns using techniques involving collaboration of multiple agents: viewpoints, data sources.
1. A user expresses his/her preferences by rating items (movies, CDs.)
2. The system matches this user’s ratings against other users’ and finds people with most similar tastes
3. With similar users, the system recommends items that the similar users have rated highly but not yet being rated by this user

n-grams:
– Contiguous sequence of n items from a given sequence of text or speech
– ‘Andrew is a talented data scientist”
– Bi-gram: ‘Andrew is”, ‘is a”, ‘a talented”.
– Tri-grams: ‘Andrew is a”, ‘is a talented”, ‘a talented data”.
– An n-gram model models sequences using statistical properties of n-grams; see: Shannon Game
– More concisely, n-gram model: P(Xi|Xi?(n?1)…Xi?1): Markov model
– N-gram model: each word depends only on the n?1 last words

Issues:
– when facing infrequent n-grams
– solution: smooth the probability distributions by assigning non-zero probabilities to unseen words or n-grams
– Methods: Good-Turing, Backoff, Kneser-Kney smoothing

Cosine distance:
– How similar are two documents?
– Perfect similarity/agreement: 1
– No agreement : 0 (orthogonality)
– Measures the orientation, not magnitude

Given two vectors A and B representing word frequencies:
cosine-similarity(A,B)=?A,B?/||A||?||B||

Source

[ VIDEO OF THE WEEK]

Jeff Palmucci @TripAdvisor discusses managing a #MachineLearning #AI Team

 Jeff Palmucci @TripAdvisor discusses managing a #MachineLearning #AI Team

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

With data collection, ‘the sooner the better’ is always the best answer. – Marissa Mayer

[ PODCAST OF THE WEEK]

Solving #FutureOfWork with #Detonate mindset (by @steven_goldbach & @geofftuff) #JobsOfFuture #Podcast

 Solving #FutureOfWork with #Detonate mindset (by @steven_goldbach & @geofftuff) #JobsOfFuture #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Distributed computing (performing computing tasks using a network of computers in the cloud) is very real. Google GOOGL -0.53% uses it every day to involve about 1,000 computers in answering a single search query, which takes no more than 0.2 seconds to complete.

Sourced from: Analytics.CLUB #WEB Newsletter

Sep 12, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
Ethics  Source

[ AnalyticsWeek BYTES]

>> All the Ways to Connect Your Azure SQL Data Warehouse with Talend by analyticsweekpick

>> How oil and gas firms are failing to grasp the necessity of Big Data analytics by analyticsweekpick

>> 5 Ways Manufacturing Analytics Will Change Your Business by analyticsweek

Wanna write? Click Here

[ FEATURED COURSE]

Statistical Thinking and Data Analysis

image

This course is an introduction to statistical data analysis. Topics are chosen from applied probability, sampling, estimation, hypothesis testing, linear regression, analysis of variance, categorical data analysis, and n… more

[ FEATURED READ]

Antifragile: Things That Gain from Disorder

image

Antifragile is a standalone book in Nassim Nicholas Taleb’s landmark Incerto series, an investigation of opacity, luck, uncertainty, probability, human error, risk, and decision-making in a world we don’t understand. The… more

[ TIPS & TRICKS OF THE WEEK]

Keeping Biases Checked during the last mile of decision making
Today a data driven leader, a data scientist or a data driven expert is always put to test by helping his team solve a problem using his skills and expertise. Believe it or not but a part of that decision tree is derived from the intuition that adds a bias in our judgement that makes the suggestions tainted. Most skilled professionals do understand and handle the biases well, but in few cases, we give into tiny traps and could find ourselves trapped in those biases which impairs the judgement. So, it is important that we keep the intuition bias in check when working on a data problem.

[ DATA SCIENCE Q&A]

Q:Provide examples of machine-to-machine communications?
A: Telemedicine
– Heart patients wear specialized monitor which gather information regarding heart state
– The collected data is sent to an electronic implanted device which sends back electric shocks to the patient for correcting incorrect rhythms

Product restocking
– Vending machines are capable of messaging the distributor whenever an item is running out of stock

Source

[ VIDEO OF THE WEEK]

Pascal Marmier (@pmarmier) @SwissRe discusses running data driven innovation catalyst

 Pascal Marmier (@pmarmier) @SwissRe discusses running data driven innovation catalyst

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

He uses statistics as a drunken man uses lamp posts—for support rather than for illumination. – Andrew Lang

[ PODCAST OF THE WEEK]

George (@RedPointCTO / @RedPointGlobal) on becoming an unbiased #Technologist in #DataDriven World #FutureOfData #Podcast

 George (@RedPointCTO / @RedPointGlobal) on becoming an unbiased #Technologist in #DataDriven World #FutureOfData #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Big data is a top business priority and drives enormous opportunity for business improvement. Wikibon’s own study projects that big data will be a $50 billion business by 2017.

Sourced from: Analytics.CLUB #WEB Newsletter

Sep 05, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
Data analyst  Source

[ AnalyticsWeek BYTES]

>> Out of the Loop on the Internet of Things? Here’s a Brief Guide. by analyticsweekpick

>> Three final talent tips: how to hire data scientists by analyticsweekpick

>> Measuring The Customer Experience Requires Fewer Questions Than You Think by bobehayes

Wanna write? Click Here

[ FEATURED COURSE]

Machine Learning

image

6.867 is an introductory course on machine learning which gives an overview of many concepts, techniques, and algorithms in machine learning, beginning with topics such as classification and linear regression and ending … more

[ FEATURED READ]

Introduction to Graph Theory (Dover Books on Mathematics)

image

A stimulating excursion into pure mathematics aimed at “the mathematically traumatized,” but great fun for mathematical hobbyists and serious mathematicians as well. Requiring only high school algebra as mathematical bac… more

[ TIPS & TRICKS OF THE WEEK]

Strong business case could save your project
Like anything in corporate culture, the project is oftentimes about the business, not the technology. With data analysis, the same type of thinking goes. It’s not always about the technicality but about the business implications. Data science project success criteria should include project management success criteria as well. This will ensure smooth adoption, easy buy-ins, room for wins and co-operating stakeholders. So, a good data scientist should also possess some qualities of a good project manager.

[ DATA SCIENCE Q&A]

Q:What are confounding variables?
A: * Extraneous variable in a statistical model that correlates directly or inversely with both the dependent and the independent variable
* A spurious relationship is a perceived relationship between an independent variable and a dependent variable that has been estimated incorrectly
* The estimate fails to account for the confounding factor

Source

[ VIDEO OF THE WEEK]

#GlobalBusiness at the speed of The #BigAnalytics

 #GlobalBusiness at the speed of The #BigAnalytics

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Data is the new science. Big Data holds the answers. – Pat Gelsinger

[ PODCAST OF THE WEEK]

#FutureOfData with @theClaymethod, @TiVo discussing running analytics in media industry

 #FutureOfData with @theClaymethod, @TiVo discussing running analytics in media industry

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

A quarter of decision-makers surveyed predict that data volumes in their companies will rise by more than 60 per cent by the end of 2014, with the average of all respondents anticipating a growth of no less than 42 per cent.

Sourced from: Analytics.CLUB #WEB Newsletter

Aug 29, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
Accuracy check  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> How can you reap the advantages of Big Data in your enterprise? Services you can expect from a Remote DBA Expert by thomassujain

>> Why Organizations Are Choosing Talend vs Informatica by analyticsweekpick

>> Predicting UX Metrics with the PURE Method by analyticsweek

Wanna write? Click Here

[ FEATURED COURSE]

CS229 – Machine Learning

image

This course provides a broad introduction to machine learning and statistical pattern recognition. … more

[ FEATURED READ]

The Industries of the Future

image

The New York Times bestseller, from leading innovation expert Alec Ross, a “fascinating vision” (Forbes) of what’s next for the world and how to navigate the changes the future will bring…. more

[ TIPS & TRICKS OF THE WEEK]

Data Have Meaning
We live in a Big Data world in which everything is quantified. While the emphasis of Big Data has been focused on distinguishing the three characteristics of data (the infamous three Vs), we need to be cognizant of the fact that data have meaning. That is, the numbers in your data represent something of interest, an outcome that is important to your business. The meaning of those numbers is about the veracity of your data.

[ DATA SCIENCE Q&A]

Q:How do you control for biases?
A: * Choose a representative sample, preferably by a random method
* Choose an adequate size of sample
* Identify all confounding factors if possible
* Identify sources of bias and include them as additional predictors in statistical analyses
* Use randomization: by randomly recruiting or assigning subjects in a study, all our experimental groups have an equal chance of being influenced by the same bias

Notes:
– Randomization: in randomized control trials, research participants are assigned by chance, rather than by choice to either the experimental group or the control group.
– Random sampling: obtaining data that is representative of the population of interest

Source

[ VIDEO OF THE WEEK]

@AnalyticsWeek: Big Data Health Informatics for the 21st Century: Gil Alterovitz

 @AnalyticsWeek: Big Data Health Informatics for the 21st Century: Gil Alterovitz

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Data matures like wine, applications like fish. – James Governor

[ PODCAST OF THE WEEK]

#FutureOfData with @CharlieDataMine, @Oracle discussing running analytics in an enterprise

 #FutureOfData with @CharlieDataMine, @Oracle discussing running analytics in an enterprise

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

This year, over 1.4 billion smart phones will be shipped – all packed with sensors capable of collecting all kinds of data, not to mention the data the users create themselves.

Sourced from: Analytics.CLUB #WEB Newsletter

Aug 22, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
Conditional Risk  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Eurovision Data: Should Scandinavia Trust the Swedes? by analyticsweek

>> Why data is no longer just an IT function by analyticsweekpick

>> The Fate of Data Science: Into the Hands of Many by jelaniharper

Wanna write? Click Here

[ FEATURED COURSE]

Learning from data: Machine learning course

image

This is an introductory course in machine learning (ML) that covers the basic theory, algorithms, and applications. ML is a key technology in Big Data, and in many financial, medical, commercial, and scientific applicati… more

[ FEATURED READ]

On Intelligence

image

Jeff Hawkins, the man who created the PalmPilot, Treo smart phone, and other handheld devices, has reshaped our relationship to computers. Now he stands ready to revolutionize both neuroscience and computing in one strok… more

[ TIPS & TRICKS OF THE WEEK]

Winter is coming, warm your Analytics Club
Yes and yes! As we are heading into winter what better way but to talk about our increasing dependence on data analytics to help with our decision making. Data and analytics driven decision making is rapidly sneaking its way into our core corporate DNA and we are not churning practice ground to test those models fast enough. Such snugly looking models have hidden nails which could induce unchartered pain if go unchecked. This is the right time to start thinking about putting Analytics Club[Data Analytics CoE] in your work place to help Lab out the best practices and provide test environment for those models.

[ DATA SCIENCE Q&A]

Q:What is: collaborative filtering, n-grams, cosine distance?
A: Collaborative filtering:
– Technique used by some recommender systems
– Filtering for information or patterns using techniques involving collaboration of multiple agents: viewpoints, data sources.
1. A user expresses his/her preferences by rating items (movies, CDs.)
2. The system matches this user’s ratings against other users’ and finds people with most similar tastes
3. With similar users, the system recommends items that the similar users have rated highly but not yet being rated by this user

n-grams:
– Contiguous sequence of n items from a given sequence of text or speech
– ‘Andrew is a talented data scientist”
– Bi-gram: ‘Andrew is”, ‘is a”, ‘a talented”.
– Tri-grams: ‘Andrew is a”, ‘is a talented”, ‘a talented data”.
– An n-gram model models sequences using statistical properties of n-grams; see: Shannon Game
– More concisely, n-gram model: P(Xi|Xi?(n?1)…Xi?1): Markov model
– N-gram model: each word depends only on the n?1 last words

Issues:
– when facing infrequent n-grams
– solution: smooth the probability distributions by assigning non-zero probabilities to unseen words or n-grams
– Methods: Good-Turing, Backoff, Kneser-Kney smoothing

Cosine distance:
– How similar are two documents?
– Perfect similarity/agreement: 1
– No agreement : 0 (orthogonality)
– Measures the orientation, not magnitude

Given two vectors A and B representing word frequencies:
cosine-similarity(A,B)=?A,B?/||A||?||B||

Source

[ VIDEO OF THE WEEK]

@CRGutowski from @GE_Digital on Using #Analytics to #Transform Sales #FutureOfData #Podcast

 @CRGutowski from @GE_Digital on Using #Analytics to #Transform Sales #FutureOfData #Podcast

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

What we have is a data glut. – Vernon Vinge

[ PODCAST OF THE WEEK]

Understanding Data Analytics in Information Security with @JayJarome, @BitSight

 Understanding Data Analytics in Information Security with @JayJarome, @BitSight

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Brands and organizations on Facebook receive 34,722 Likes every minute of the day.

Sourced from: Analytics.CLUB #WEB Newsletter

Aug 15, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
SQL Database  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Mastering Deep Learning with Self-Service Data Science for Business Users by jelaniharper

>> Jun 13, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin

>> Is withholding your data simply bad science, or should it fall under scientific misconduct? by analyticsweekpick

Wanna write? Click Here

[ FEATURED COURSE]

Introduction to Apache Spark

image

Learn the fundamentals and architecture of Apache Spark, the leading cluster-computing framework among professionals…. more

[ FEATURED READ]

On Intelligence

image

Jeff Hawkins, the man who created the PalmPilot, Treo smart phone, and other handheld devices, has reshaped our relationship to computers. Now he stands ready to revolutionize both neuroscience and computing in one strok… more

[ TIPS & TRICKS OF THE WEEK]

Data Analytics Success Starts with Empowerment
Being Data Driven is not as much of a tech challenge as it is an adoption challenge. Adoption has it’s root in cultural DNA of any organization. Great data driven organizations rungs the data driven culture into the corporate DNA. A culture of connection, interactions, sharing and collaboration is what it takes to be data driven. Its about being empowered more than its about being educated.

[ DATA SCIENCE Q&A]

Q:Do you know / used data reduction techniques other than PCA? What do you think of step-wise regression? What kind of step-wise techniques are you familiar with?
A: data reduction techniques other than PCA?:
Partial least squares: like PCR (principal component regression) but chooses the principal components in a supervised way. Gives higher weights to variables that are most strongly related to the response

step-wise regression?
– the choice of predictive variables are carried out using a systematic procedure
– Usually, it takes the form of a sequence of F-tests, t-tests, adjusted R-squared, AIC, BIC
– at any given step, the model is fit using unconstrained least squares
– can get stuck in local optima
– Better: Lasso

step-wise techniques:
– Forward-selection: begin with no variables, adding them when they improve a chosen model comparison criterion
– Backward-selection: begin with all the variables, removing them when it improves a chosen model comparison criterion

Better than reduced data:
Example 1: If all the components have a high variance: which components to discard with a guarantee that there will be no significant loss of the information?
Example 2 (classification):
– One has 2 classes; the within class variance is very high as compared to between class variance
– PCA might discard the very information that separates the two classes

Better than a sample:
– When number of variables is high relative to the number of observations

Source

[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with @DavidRose, @DittoLabs

 #BigData @AnalyticsWeek #FutureOfData #Podcast with @DavidRose, @DittoLabs

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Data matures like wine, applications like fish. – James Governor

[ PODCAST OF THE WEEK]

Venu Vasudevan @VenuV62 (@ProcterGamble) on creating a rockstar data science team #FutureOfData #Podcast

 Venu Vasudevan @VenuV62 (@ProcterGamble) on creating a rockstar data science team #FutureOfData #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Walmart handles more than 1 million customer transactions every hour, which is imported into databases estimated to contain more than 2.5 petabytes of data.

Sourced from: Analytics.CLUB #WEB Newsletter

Aug 08, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
Tour of Accounting  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Unifying the Data Tech Stack: Across the Edge, Analytics and Operational Tiers by analyticsweekpick

>> How to Build a Dedicated Usability Lab by analyticsweek

>> Bridging the Gap Between Dev and Ops [Infographic] by analyticsweek

Wanna write? Click Here

[ FEATURED COURSE]

Tackle Real Data Challenges

image

Learn scalable data management, evaluate big data technologies, and design effective visualizations…. more

[ FEATURED READ]

Data Science from Scratch: First Principles with Python

image

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn … more

[ TIPS & TRICKS OF THE WEEK]

Fix the Culture, spread awareness to get awareness
Adoption of analytics tools and capabilities has not yet caught up to industry standards. Talent has always been the bottleneck towards achieving the comparative enterprise adoption. One of the primal reason is lack of understanding and knowledge within the stakeholders. To facilitate wider adoption, data analytics leaders, users, and community members needs to step up to create awareness within the organization. An aware organization goes a long way in helping get quick buy-ins and better funding which ultimately leads to faster adoption. So be the voice that you want to hear from leadership.

[ DATA SCIENCE Q&A]

Q:Is it better to have 100 small hash tables or one big hash table, in memory, in terms of access speed (assuming both fit within RAM)? What do you think about in-database analytics?
A: Hash tables:
– Average case O(1)O(1) lookup time
– Lookup time doesn’t depend on size

Even in terms of memory:
– O(n)O(n) memory
– Space scales linearly with number of elements
– Lots of dictionaries won’t take up significantly less space than a larger one

In-database analytics:
– Integration of data analytics in data warehousing functionality
– Much faster and corporate information is more secure, it doesn’t leave the enterprise data warehouse
Good for real-time analytics: fraud detection, credit scoring, transaction processing, pricing and margin analysis, behavioral ad targeting and recommendation engines

Source

[ VIDEO OF THE WEEK]

#FutureOfData Podcast: Peter Morgan, CEO, Deep Learning Partnership

 #FutureOfData Podcast: Peter Morgan, CEO, Deep Learning Partnership

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

The temptation to form premature theories upon insufficient data is the bane of our profession. – Sherlock Holmes

[ PODCAST OF THE WEEK]

#DataScience Approach to Reducing #Employee #Attrition

 #DataScience Approach to Reducing #Employee #Attrition

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Every second we create new data. For example, we perform 40,000 search queries every second (on Google alone), which makes it 3.5 searches per day and 1.2 trillion searches per year.In Aug 2015, over 1 billion people used Facebook FB +0.54% in a single day.

Sourced from: Analytics.CLUB #WEB Newsletter

Aug 01, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
SQL Database  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Making Artificial Intelligence Accountable to Data Governance by jelaniharper

>> Automating AI in the Data-Driven Enterprise Webinar by analyticsweekpick

>> Achieving tribal leadership in 5 easy steps by v1shal

Wanna write? Click Here

[ FEATURED COURSE]

Python for Beginners with Examples

image

A practical Python course for beginners with examples and exercises…. more

[ FEATURED READ]

The Future of the Professions: How Technology Will Transform the Work of Human Experts

image

This book predicts the decline of today’s professions and describes the people and systems that will replace them. In an Internet society, according to Richard Susskind and Daniel Susskind, we will neither need nor want … more

[ TIPS & TRICKS OF THE WEEK]

Data Have Meaning
We live in a Big Data world in which everything is quantified. While the emphasis of Big Data has been focused on distinguishing the three characteristics of data (the infamous three Vs), we need to be cognizant of the fact that data have meaning. That is, the numbers in your data represent something of interest, an outcome that is important to your business. The meaning of those numbers is about the veracity of your data.

[ DATA SCIENCE Q&A]

Q:What is better: good data or good models? And how do you define ‘good”? Is there a universal good model? Are there any models that are definitely not so good?
A: * Good data is definitely more important than good models
* If quality of the data wasn’t of importance, organizations wouldn’t spend so much time cleaning and preprocessing it!
* Even for scientific purpose: good data (reflected by the design of experiments) is very important

How do you define good?
– good data: data relevant regarding the project/task to be handled
– good model: model relevant regarding the project/task
– good model: a model that generalizes on external data sets

Is there a universal good model?
– No, otherwise there wouldn’t be the overfitting problem!
– Algorithm can be universal but not the model
– Model built on a specific data set in a specific organization could be ineffective in other data set of the same organization
– Models have to be updated on a somewhat regular basis

Are there any models that are definitely not so good?
– ‘all models are wrong but some are useful” George E.P. Box
– It depends on what you want: predictive models or explanatory power
– If both are bad: bad model

Source

[ VIDEO OF THE WEEK]

Advanced #Analytics in #Hadoop

 Advanced #Analytics in #Hadoop

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

I’m sure, the highest capacity of storage device, will not enough to record all our stories; because, everytime with you is very valuable da

[ PODCAST OF THE WEEK]

#DataScience Approach to Reducing #Employee #Attrition

 #DataScience Approach to Reducing #Employee #Attrition

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

And one of my favourite facts: At the moment less than 0.5% of all data is ever analysed and used, just imagine the potential here.

Sourced from: Analytics.CLUB #WEB Newsletter

Jul 25, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
Data shortage  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Choosing an Analytics Platform: 3 Factors to Consider for Your Application by analyticsweek

>> Making Sense of the 2018 Gartner Magic Quadrant for Data Integration Tools by analyticsweekpick

>> From Storage to Data Virtualization by analyticsweekpick

Wanna write? Click Here

[ FEATURED COURSE]

Baseball Data Wrangling with Vagrant, R, and Retrosheet

image

Analytics with the Chadwick tools, dplyr, and ggplot…. more

[ FEATURED READ]

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython

image

Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored f… more

[ TIPS & TRICKS OF THE WEEK]

Save yourself from zombie apocalypse from unscalable models
One living and breathing zombie in today’s analytical models is the pulsating absence of error bars. Not every model is scalable or holds ground with increasing data. Error bars that is tagged to almost every models should be duly calibrated. As business models rake in more data the error bars keep it sensible and in check. If error bars are not accounted for, we will make our models susceptible to failure leading us to halloween that we never wants to see.

[ DATA SCIENCE Q&A]

Q:Explain what a local optimum is and why it is important in a specific context,
such as K-means clustering. What are specific ways of determining if you have a local optimum problem? What can be done to avoid local optima?

A: * A solution that is optimal in within a neighboring set of candidate solutions
* In contrast with global optimum: the optimal solution among all others

* K-means clustering context:
It’s proven that the objective cost function will always decrease until a local optimum is reached.
Results will depend on the initial random cluster assignment

* Determining if you have a local optimum problem:
Tendency of premature convergence
Different initialization induces different optima

* Avoid local optima in a K-means context: repeat K-means and take the solution that has the lowest cost

Source

[ VIDEO OF THE WEEK]

@EdwardBoudrot / @Optum on #DesignThinking & #DataDriven Products #FutureOfData #Podcast

 @EdwardBoudrot / @Optum on #DesignThinking & #DataDriven Products #FutureOfData #Podcast

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Without big data, you are blind and deaf and in the middle of a freeway. – Geoffrey Moore

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with  John Young, @Epsilonmktg

 #BigData @AnalyticsWeek #FutureOfData #Podcast with John Young, @Epsilonmktg

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Every second we create new data. For example, we perform 40,000 search queries every second (on Google alone), which makes it 3.5 searches per day and 1.2 trillion searches per year.In Aug 2015, over 1 billion people used Facebook FB +0.54% in a single day.

Sourced from: Analytics.CLUB #WEB Newsletter

Jul 18, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
Weak data  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> [Step-by-Step] Using Talend to Bulk Load Data in Snowflake Cloud Data Warehouse by analyticsweekpick

>> Jan 24, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin

>> Best & Worst Time for Cold Call by v1shal

Wanna write? Click Here

[ FEATURED COURSE]

CPSC 540 Machine Learning

image

Machine learning (ML) is one of the fastest growing areas of science. It is largely responsible for the rise of giant data companies such as Google, and it has been central to the development of lucrative products, such … more

[ FEATURED READ]

The Industries of the Future

image

The New York Times bestseller, from leading innovation expert Alec Ross, a “fascinating vision” (Forbes) of what’s next for the world and how to navigate the changes the future will bring…. more

[ TIPS & TRICKS OF THE WEEK]

Data Have Meaning
We live in a Big Data world in which everything is quantified. While the emphasis of Big Data has been focused on distinguishing the three characteristics of data (the infamous three Vs), we need to be cognizant of the fact that data have meaning. That is, the numbers in your data represent something of interest, an outcome that is important to your business. The meaning of those numbers is about the veracity of your data.

[ DATA SCIENCE Q&A]

Q:What is random forest? Why is it good?
A: Random forest? (Intuition):
– Underlying principle: several weak learners combined provide a strong learner
– Builds several decision trees on bootstrapped training samples of data
– On each tree, each time a split is considered, a random sample of m predictors is chosen as split candidates, out of all p predictors
– Rule of thumb: at each split m=?p
– Predictions: at the majority rule

Why is it good?
– Very good performance (decorrelates the features)
– Can model non-linear class boundaries
– Generalization error for free: no cross-validation needed, gives an unbiased estimate of the generalization error as the trees is built
– Generates variable importance

Source

[ VIDEO OF THE WEEK]

#FutureOfData Podcast: Peter Morgan, CEO, Deep Learning Partnership

 #FutureOfData Podcast: Peter Morgan, CEO, Deep Learning Partnership

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Processed data is information. Processed information is knowledge Processed knowledge is Wisdom. – Ankala V. Subbarao

[ PODCAST OF THE WEEK]

#FutureOfData with @theClaymethod, @TiVo discussing running analytics in media industry

 #FutureOfData with @theClaymethod, @TiVo discussing running analytics in media industry

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

By 2020, we will have over 6.1 billion smartphone users globally (overtaking basic fixed phone subscriptions).

Sourced from: Analytics.CLUB #WEB Newsletter