The Truth about Artificial Intelligence: As it Stands Today

So far, Artificial Intelligence has yet to live up to the enormous expectations which have always surrounded this technology.

Its failure to garner mainstream adoption during its early years in the 1960s and 1970s kept it out of the headlines until relatively recently. The surging interest around cognitive computing and self-service platforms such as IBM Watson has set the stage for more demands than ever, perhaps culminating in the multitude of prognostications from Gartner, Forrester, and others regarding AI’s ability to transform the data sphere in 2017.

With an array of self-service machine learning platforms, cloud-based data science solutions, and significant advances in both storage and compute power, machine intelligence is certainly more accessible to the enterprise than it has ever been.

But in practice in the business world today, AI’s moment is still impending, and not yet realized. According to Cambridge Semantics VP of Financial Services Marty Loughlin, approximately 90 percent of the attention focused on AI within the finance industry is media generated, versus only approximately 10 percent practical usage. “The interest is definitely high, but it’s a little bit ahead of reality I think,” Loughlin commented.

Theory Versus Practice
Although AI includes everything from basic Natural Language Processing capabilities to some of the more advanced options for virtual reality and augmented reality, it is perhaps most practical to the enterprise via the automation capabilities of machine learning and deep learning, which in turn fuel efforts towards speech recognition and other forms of pattern recognition. The potential of these AI components to drastically improve analytics, data preparation, and even elements of data modeling is considerable. Traditionally, their utility to the enterprise was limited by:

  • Time Constraints: Organizations can spend more than six months perfecting the predictive models for machine learning or deep learning.
  • Data Quantities: The predictive capacities of both of these facets of AI require exorbitant amounts of training data, contributing to the lengthy times required to fine-tune them.
  • Data Scientists: Data Scientists are arguably the chimera of the 21st century, that rarest of commodities, with seemingly limitless demands on their time.
  • Infrastructure: Costly hardware was previously required to accommodate the massive data quantities necessary for training machine learning and deep learning models.

Self-service SaaS offerings which have automated critical aspects of data science prerequisites for these manifestations of AI, along with accelerated processing speeds, have rendered most of these concerns obsolete. Nonetheless, actual use cases for AI in transformative roles impacting business value—as opposed to rudimentary forms of machine learning in comprehensive platforms for analytics, transformation, or data preparation—are limited due to more modern concerns.

Consultancy: Redressing the Data Science Shortage
Exploiting AI to effect competitive advantage requires leveraging it in the most vital business tasks, as opposed to simply quickening portions of everyday data management. Oftentimes, finding that niche requires more than a simple cloud-based deployment and necessitates substantial consultant work, which may simply mean SaaS and PaaS providers acting as consultants while inflating costs. Dr. Lauren Tucker, Senior Vice President of Strategy, Research and Analytics at Shapiro+Raj (which specializes in assisting clients with Bayesian machine learning methods), remarked: “People who can do the modeling and so forth have very little understanding of how to connect the dots to craft a story around the insights to come out of those models. But then you need people who understand how to present them, how to get that information originally from the client, and how to do the model assessment. It takes a village, and that village is more often found in firms that focus on that type of business rather than holding them in-house.” The implication is that the time and money required to use AI are often greater than rapid self-service cloud offerings initially suggest.

Unstructured and Structured Data
One of the principal usages for AI is in assisting organizations with the abundant amounts of unstructured data they are accounting for, which is a direct consequence of the normalization of big data to the enterprise today. “The amount of data that you actually put through the ETL pipeline and structure is only the classic tip of the iceberg,” indico Chief Customer Officer Vishal Daga said. “There’s so much data out there that’s of the unstructured variety.” Analyzing unstructured data has its own innate peculiarities, which may vary according to industry. “A lot of that analysis, if you can use machine learning or AI you can get into something where you can actually do the interpretation,” Biotricity CEO Waqaas Al-Siddiq said. “But it’s a different data type. It’s what IBM Watson is always talking about. You’re not looking at traditional data, you’re looking at unstructured data.”

However, there are many organizations which are still struggling with conventional structured data. Mastery over this data domain could very well take precedence over that of its unstructured counterpart, and could partially explain why AI technology adoption has yet to become more pervasive. Loughlin spoke about this reality in the financial services vertical, stating that despite the plethora of unstructured data required for analysts and trading purposes, “I see more talk than action there. I think right now the financial organizations have such a monstrous problem on their hands with structured data that unstructured data is deferred. It’s coming. The use cases are out there, but it’s still early days for unstructured content in the financial services world from what I’ve seen.”

Early Days
Loughlin’s sentiment seems widely applicable across industries as well. What is unequivocally changing is the inclusion of basic AI capabilities in platforms which can hasten numerous time-sensitive processes. Classic examples are found in data preparation and aggregation prerequisites prior to analytics in which AI and NLP “can recommend looking at the data and say oh, I see you have a customer data set and a demographic data set,” Paxata Co-Founder and Chief Product Officer explained. “Let me tell you how to bring these two data sets together so you can accelerate to get to the point where someone can get the data that they need.” The full scope of AI, however, which is why it is so highly anticipated to alter the data landscape this year and beyond, involves accelerating core functions of business processes and perhaps even displacing job positions due to expedited automation.

This perception may function as somewhat of an unspoken caveat restraining the full advent of AI upon the enterprise, as vendors continue to push the rhetoric that its technologies are not displacing laborers but allowing them to concentrate on ‘more profound’ problems. Regardless, unveiling this potential of machine intelligence requires a sizable allocation of resources, in terms of cost and expertise, to determine just how it can make core business functions more efficient. This phase is followed by implementation and assumes (quite incorrectly, in some instances) that organizations have already mastered the fundamentals of structured data and their requirements. Thus, for better or worse, AI largely remains an intriguing idea, and one which is currently actualized only at a fraction of its full potential.

Source: The Truth about Artificial Intelligence: As it Stands Today by jelaniharper

Jan 25, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)


Statistically Significant  Source

[ AnalyticsWeek BYTES]

>> Look to Risk Modeling for Best Analytics Practices by analyticsweekpick

>> Battling Misinformation in Customer Experience Management by bobehayes

>> Visualizing Product Quality and Customer Service Quality by bobehayes



 J.D. Power promotes Thomas King to head data, analytics segment … – Auto Remarketing Under  Marketing Analytics

 Slack ups its enterprise appeal with new bots for meetings, sales, analytics – TechRepublic Under  Sales Analytics

 IBM Tames Big Data by Blending All-Flash Storage and Spectrum Scale Software – Enterprise Storage Forum Under  Big Data



CPSC 540 Machine Learning


Machine learning (ML) is one of the fastest growing areas of science. It is largely responsible for the rise of giant data companies such as Google, and it has been central to the development of lucrative products, such … more


Antifragile: Things That Gain from Disorder


Antifragile is a standalone book in Nassim Nicholas Taleb’s landmark Incerto series, an investigation of opacity, luck, uncertainty, probability, human error, risk, and decision-making in a world we don’t understand. The… more


Analytics Strategy that is Startup Compliant
With the right tools, capturing data is easy, but not being able to handle that data can lead to chaos. One of the most reliable startup strategies for adopting data analytics is TUM, or The Ultimate Metric. This is the metric that matters most to your startup. Some advantages of TUM: it answers the most important business question, it cleans up your goals, it inspires innovation, and it helps you understand the entire quantified business.


Q: What are the drawbacks of a linear model? Are you familiar with alternatives (Lasso, ridge regression)?
A: * Assumes a linear relationship and well-behaved errors
* Can’t be used for count outcomes, binary outcomes
* Can’t vary model flexibility: overfitting problems
* Alternatives: see question 4 about regularization
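
As a rough illustration of the ridge alternative named in the question, the sketch below compares an ordinary least-squares slope with a ridge-penalized slope for a single feature. The function name `ridge_slope`, the toy data and the penalty values are assumptions made for illustration; none of them come from the newsletter.

```python
def ridge_slope(xs, ys, lam=0.0):
    """Slope of a one-feature linear fit with an L2 (ridge) penalty.

    lam=0.0 gives ordinary least squares; larger lam shrinks the slope
    toward zero. Centering the data leaves the intercept unpenalized.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / (sxx + lam)


# Hypothetical toy data: roughly y = 2x with a little noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

for lam in (0.0, 1.0, 10.0):
    print(f"lambda={lam:>4}: slope={ridge_slope(xs, ys, lam):.3f}")
```

The penalty gives the modeler a flexibility dial that a plain linear fit lacks: raising `lam` trades a little bias for lower variance. Lasso uses an absolute-value penalty instead, which can shrink some coefficients exactly to zero.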



Big Data Introduction to D3



The data fabric is the next middleware. – Todd Papaioannou


#FutureOfData with Rob(@telerob) / @ConnellyAgency on running innovation in agency



More than 200bn HD movies – which would take a person 47m years to watch.

Sourced from: Analytics.CLUB #WEB Newsletter

Colleges are using big data to identify when students are likely to flame out

Virginia Commonwealth University, like lots of U.S. colleges, has worked to keep its freshmen around for sophomore year. Research has shown that students are most at risk of dropping out early in their college careers, and freshman retention rates also factor into college rankings.

Although VCU has had success in getting students to return to its Richmond campus for a second year, the university has struggled to get them all the way to graduation. Now the school is turning to big data to help it identify students who are most at risk of falling through the cracks.

Those students, referred to by some as the “murky middle,” tend to be sophomores or juniors with grade-point averages between 2.0 and 3.0, students whose academic performance has not raised any alarms and yet who ultimately are not on track to graduate. Not graduating affects a school’s standing and also can cause lasting damage for the students who don’t make it.

An estimated 400,000 students drop out of college every year across the country. Many are left with student loans and poor prospects of earning enough to repay the debt they’ve accumulated. College dropouts run a high risk of defaulting on student loans, which could damage their credit and make it difficult to buy a car or a house, or even get a job.

The Obama administration is grappling with ways to hold colleges accountable when students fail to earn their degrees, and it is developing a college-rating system to measure outcomes.

Schools have been feeling the pressure of that government scrutiny, especially as more states are tying funding to performance. And with every student who leaves, colleges are losing tuition revenue that has become critical in the face of dwindling enrollment and flagging state investment.

“People are becoming really anxious about not meeting their enrollment numbers, so they’re turning to [big data] not just as the moral imperative — it’s the right thing to do — but also as a financial imperative,” said Ed Venit, senior director at Education Advisory Board (EAB), a consulting firm that uses predictive analytics to improve retention and graduation rates. “Students represent enrollment . . . so it is best to keep them all the way to graduation.”

Seth Sykes, associate vice provost of strategic enrollment management at VCU, contacted District-based EAB after hearing about its research on students in the “murky middle.”

Researchers at the firm pored over the university’s files and found that students who were withdrawing from or failing classes were most likely to leave. With that insight, the company created a platform that let VCU advisers flag students who are in that danger zone and intervene. Sometimes that means getting students set up with a tutor or simply making sure they are taking the right classes to complete their degree.

Within one semester, the school recorded a 16 percent increase in the number of students completing courses and an 8 percent increase in students enrolling for the following term.

“It’s a little bit too early for us to see what impact this will have on graduation rates, but it feels like we’re on the right track,” Sykes said.
VCU established “success markers” for every major, identifying classes students should be completing at various points on their path to graduation. A chemistry major, for instance, should earn at least a C in general chemistry by the end of the first year, Sykes said. If that student fails or enters sophomore or junior year without finishing that course, he or she would be flagged for counseling.

Advisers can use the school’s early-alert system to search for groups of students who have accumulated a lot of credits but haven’t graduated or those starting to fall below a 2.0 grade point average.
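
The “success marker” and early-alert rules described above can be sketched as a simple rule check. This is a minimal illustration and not EAB’s or VCU’s actual platform: the `Student` class, the `flag_reasons` function, the grade scale and the sample students are all hypothetical, and treating any grade below the marker as a flag is an assumption.

```python
from dataclasses import dataclass, field

# Letter grades ranked so that "at least a C" can be checked numerically.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


@dataclass
class Student:
    name: str
    year: int                                    # 1 = freshman, 2 = sophomore, ...
    gpa: float
    grades: dict = field(default_factory=dict)   # course -> letter grade


def flag_reasons(student, marker_course, min_grade="C", due_year=1, gpa_floor=2.0):
    """Return the reasons (possibly none) a student would be flagged."""
    reasons = []
    grade = student.grades.get(marker_course)
    if grade is None:
        # Course not finished: only a problem once the due year has passed.
        if student.year > due_year:
            reasons.append(f"{marker_course} not completed by end of year {due_year}")
    elif GRADE_POINTS[grade] < GRADE_POINTS[min_grade]:
        reasons.append(f"{marker_course} grade {grade} is below {min_grade}")
    if student.gpa < gpa_floor:
        reasons.append(f"GPA {student.gpa:.2f} is below {gpa_floor:.1f}")
    return reasons


# Hypothetical students, for illustration only.
students = [
    Student("Student A", year=2, gpa=2.6, grades={"General Chemistry": "B"}),
    Student("Student B", year=2, gpa=2.4),
    Student("Student C", year=1, gpa=1.9, grades={"General Chemistry": "D"}),
]

for s in students:
    print(s.name, flag_reasons(s, "General Chemistry"))
```

Returning a list of reasons rather than a yes/no flag gives an adviser something concrete to raise in the conversation with the student.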

Until recently, many college advisers did not have the technology to quickly identify students in need of an extra nudge, Venit said. Advisers at many schools see students only when they need to register, when they seek out help or when they are on the verge of flunking out.

That was the case at Middle Tennessee State University before the school retooled its academic advisement based on EAB data. The university spent $3 million to hire 47 additional advisers and develop “campaigns” to increase re-enrollment.

Administrators discovered that 20 percent of students were leaving in their second year, a trend they figured could be reversed with more directed advising, said Richard D. Sluder, vice provost for student success.
Now advisers can pull together weekly reports of the percentage of students who have reenrolled for the next semester, giving them a shot at reaching students who are on the fence before they drop out. As of this month, enrollment rates are up three percentage points from last year.

“It doesn’t sound like very much, but when you’re working with 23,000 students, it turns out to be more than 600 students pre-registered this year,” Sluder said. “If we do nothing different, 3,000 freshmen will come here August 24, and a year later, 900 will be gone. Is that a sustainable business model? Of course not.”
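
The figures in the quote can be sanity-checked with a few lines of arithmetic. The check assumes the three-percentage-point gain applies to the full 23,000-student enrollment; the article does not state the exact base, and the variable names are made up for illustration.

```python
enrolled = 23_000
point_gain = 3                                   # percentage points
extra_students = enrolled * point_gain // 100    # integer arithmetic

freshmen = 3_000
gone_after_one_year = 900
attrition_rate = gone_after_one_year / freshmen

print(f"extra pre-registered students: {extra_students}")
print(f"first-year attrition rate:     {attrition_rate:.0%}")
```

Under that assumption the gain works out to 690 students, consistent with “more than 600”, and losing 900 of 3,000 freshmen is a 30 percent first-year attrition rate.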

To gain insight into when and why people drop out, EAB has collected 10 years of student transcripts from more than 150 four-year public and private U.S. colleges.

Researchers found that of the students who return for a second year with grade-point averages between 2.0 and 3.0, two out of five will drop out. Venit said a student’s GPA is a good indicator of their chances of graduating, though there is no conclusive evidence that bad grades lead students to quit school.

“The murky-middle students who don’t graduate tend to have GPAs that stay flat and then fall off over time. That suggests they are trucking along, doing their best, maybe they’re more susceptible to something going wrong or maybe they’re treading water,” Venit said. “You talk to advisers and there are a lot of reasons, but the most common is these students are losing motivation.”

At VCU, Sykes noticed that a lot of students landing in academic limbo were majoring in science, engineering and math. It turned out that many of those students were waiting too long to declare a major, which can be a problem in science majors that require a specific sequence of classes, Sykes said.

“The platform helps us identify students who are either spinning their wheels or are late in starting their classes,” he said. “Advisers can then have the conversation ‘Is this an appropriate major for you in terms of your ability to complete it?’ Or talk about how long it might take and the financial implications.”
VCU charges in-state undergraduates an average of $16,764 a year, after grants and scholarships are deducted. That net price is higher than the $12,830 national average for public schools, according to data from the College Board, an education nonprofit.

At those prices, taking longer to graduate can get expensive. And dropping out, in some cases, could be just as costly if students borrow to pay for school.

According to the New America Foundation, of all the students who started school in 2003-2004 and were in default on their loans six years later, more than 60 percent left college without a degree. Researchers at Education Sector found that borrowers who drop out face higher unemployment and lower median incomes.

“Even if you don’t have the capacity to advise every student multiple times a semester, which most schools don’t, you can still get a lot done with modest resources that you have,” Venit said.

To read the original article on The Washington Post, click here.

Originally Posted at: Colleges are using big data to identify when students are likely to flame out by analyticsweekpick

@CRGutowski from @GE_Digital on Using #Analytics to #Transform Sales


In this podcast, @CRGutowski from @GE_Digital talks about the importance of data and analytics in transforming sales organizations. She sheds light on the challenges and opportunities of transforming the sales organization of a transnational enterprise using analytics and of instilling a growth mindset. Cate shares some of the tenets of a transformation mindset. This podcast is great for future leaders who are thinking of reshaping their sales organizations and empowering them with a digital mindset.


Cate’s Recommended Read:
Start with Why: How Great Leaders Inspire Everyone to Take Action by Simon Sinek


Cate’s BIO:
Cate has 20 years of technical sales, marketing and product leadership experience across various global divisions in GE. Cate is currently based in Boston, MA and works as the VP – Commercial Digital Thread, leading the digital transformation of GE’s 25,000+ sales organization globally. Prior to relocating to Boston, Cate and her family lived in Budapest, Hungary where she led product management, marketing and commercial operations across EMEA for GE Current. Cate holds an M.B.A. from the University of South Florida and a Bachelor’s degree in Communications and Business Administration from the University of Illinois at Urbana-Champaign.

About #Podcast:
The #FutureOfData podcast is a conversation starter that brings leaders, influencers and leading practitioners onto the show to discuss their journeys in creating the data-driven future.


#FutureOfData, #Data, #Analytics, #Leadership Podcast, #Big Data, #Strategy

Source: @CRGutowski from @GE_Digital on Using #Analytics to #Transform Sales by v1shal

Jan 18, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)


Human resource  Source

[ AnalyticsWeek BYTES]

>> AtScale opens Hadoop’s big-data vaults to nonexpert business users by analyticsweekpick

>> 3 ways to boost revenue with data analytics by analyticsweekpick

>> The 5 Types of Healthcare Analytics Solutions by analyticsweekpick



 Customer experience is key driver to every business – Groupe Ideal Manager – Under  Customer Experience

 Using artificial intelligence on health records could help predict ailments – Economic Times Under  Artificial Intelligence

 I used to believe in gun control. But then I ran the numbers – Kansas City Star (blog) Under  Statistics



Machine Learning


6.867 is an introductory course on machine learning which gives an overview of many concepts, techniques, and algorithms in machine learning, beginning with topics such as classification and linear regression and ending … more


The Signal and the Noise: Why So Many Predictions Fail–but Some Don’t


People love statistics. Statistics, however, do not always love them back. The Signal and the Noise, Nate Silver’s brilliant and elegant tour of the modern science-slash-art of forecasting, shows what happens when Big Da… more


Keeping Biases Checked during the last mile of decision making
Today, a data-driven leader, data scientist or data-driven expert is constantly put to the test, helping the team solve problems using skill and expertise. Believe it or not, part of that decision tree is derived from intuition, which adds a bias to our judgment and taints our suggestions. Most skilled professionals understand and handle these biases well, but in a few cases we fall into tiny traps and find ourselves caught in biases that impair judgment. So, it is important to keep intuition bias in check when working on a data problem.


Q: Explain selection bias (with regard to a dataset, not variable selection). Why is it important? How can data management procedures such as missing data handling make it worse?
A: * Selection of individuals, groups or data for analysis in such a way that proper randomization is not achieved
– Sampling bias: systematic error due to a non-random sample of a population, causing some members to be less likely to be included than others
– Time interval: a trial may be terminated early at an extreme value (for ethical reasons), but the extreme value is likely to be reached by the variable with the largest variance, even if all the variables have similar means
– Data: “cherry picking”, when specific subsets of the data are chosen to support a conclusion (citing examples of plane crashes as evidence of airline flight being unsafe, while ignoring the far more common example of flights that complete safely)
– Studies: performing experiments and reporting only the most favorable results
– Can lead to inaccurate or even erroneous conclusions
– Statistical methods generally cannot overcome it

Why can missing data handling make it worse?
– Example: individuals who know or suspect that they are HIV positive are less likely to participate in HIV surveys
– Missing data handling will amplify this effect, because the values it fills in are based on respondents who are mostly HIV negative
– Prevalence estimates will be inaccurate
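
The survey example can be made concrete with a small calculation. This is a sketch under stated assumptions: the true prevalence and the two participation rates below are invented for illustration, and the function names are hypothetical.

```python
def observed_prevalence(true_prev, p_respond_pos, p_respond_neg):
    """Prevalence among respondents when participation depends on status."""
    pos = true_prev * p_respond_pos
    neg = (1 - true_prev) * p_respond_neg
    return pos / (pos + neg)


def nonrespondent_prevalence(true_prev, p_respond_pos, p_respond_neg):
    """Prevalence among the people who did NOT take part."""
    pos = true_prev * (1 - p_respond_pos)
    neg = (1 - true_prev) * (1 - p_respond_neg)
    return pos / (pos + neg)


# Hypothetical inputs: 10% true prevalence; positives respond half the time,
# negatives respond 90% of the time.
true_prev = 0.10
seen = observed_prevalence(true_prev, 0.50, 0.90)
unseen = nonrespondent_prevalence(true_prev, 0.50, 0.90)

print(f"true prevalence:               {true_prev:.3f}")
print(f"prevalence among respondents:  {seen:.3f}")
print(f"prevalence among the missing:  {unseen:.3f}")
```

With these made-up inputs the respondents understate the true rate, while the people with missing data have a far higher rate. Filling in the missing values from the respondents therefore repeats the biased estimate across the whole population instead of correcting it.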



Understanding Data Analytics in Information Security with @JayJarome, @BitSight



You can have data without information, but you cannot have information without data. – Daniel Keys Moran


@AnalyticsWeek #FutureOfData with Robin Thottungal(@rathottungal), Chief Data Scientist at @EPA



Facebook users send on average 31.25 million messages and view 2.77 million videos every minute.

Sourced from: Analytics.CLUB #WEB Newsletter

7 Keys To Market Share Growth And Sustainability

With a rapidly changing world that is more social and more open, new ways of doing business have emerged. The definition of doing business has fundamentally changed. Any business that still sticks to its legacy ways will starve and lose market share to the competition. The root of this lies in changes in customer behavior, customer expectations, technology, regulatory complexity and the competitive landscape. It is important for businesses to understand the impact of these changes and prepare accordingly. Businesses should build strategies that proactively mitigate the effects and prepare for the future. Otherwise, they will be at the mercy of the competition (new and old) to maintain profitability and will eventually go bust. So riding this wave of change, understanding it and steering in the right direction is essential for keeping market share and keeping businesses afloat.

In this wave of change, there are certain things that businesses can do to fundamentally change the way they approach their business processes and gain market share. The following key areas are good starting points.

Understand customer landscape and needs:
Focusing on the customer’s needs is the most important thing to do to gain market share. For customer-focused companies like Apple and Amazon, keeping customers happy and satisfied is what differentiates them from the herd. In recent times, customers’ expectations and needs have changed, and companies need to constantly evaluate their methodology and align themselves to read, meet and exceed those expectations. They need to set up a continuous monitoring process to analyze the customer landscape, customers’ needs, and their impact on the business. This helps keep businesses close to their customers, which is a mantra for success.

Be a thought leader:
Finding a niche and remaining a thought leader in it is another crucial step in establishing a sustainable footprint for business growth. Most successful companies are leaders in their space. Laggards will have some room to flex, but only at the mercy of the gap between demand and supply. It is important to find a core area and excel in it. Companies that do so innovate, lead and create a brand of trust and credibility with customers. So, building thought leadership is extremely important for building a successful business.

Be the network:
It is rightly said that “one who builds the network, owns the network”. There is a need to facilitate open communication within the company and with its external partners, stakeholders and other communities. Building a system that facilitates this communication benefits the company and is a win-win for everyone. It helps keep the innovation pipeline full, keeps the dialog going, guards against blind spots, and builds better business connections. This network can also be used as a channel for fostering innovation in the company.

Learn to love data:
Data is your friend. It is impartial and insightful. Make data your friend and leverage it in all business decisions. Data brings sanity, facts and metrics free of gut feeling to decisions, making success more predictable and informed. As we all know, informed decisions are always better than decisions based on hunches. Data can help a company prepare for the future, understand customers’ expectations, analyze industry trends, and do predictive modeling to answer big questions. So, there are many use cases for data in the current competitive landscape. A company that stays close to its data will never find it difficult to maintain a healthy market share.

Make partners not competitors:
It is a strong statement, but a realistic one. In today’s economy, no one can win in isolation, especially when there are hardly any boundaries. It is important to partner to make things happen. People understand the importance of shorter cycle times and fast time to market, and strategic partners and new business models can make this a reality. So, it is important to think of a system that has more partners than competitors.

Make friends not customers:
To outdo the competition, a business must stay close to its customers and improve retention. The best way to achieve this is to give products and services a personal touch and to make customers feel special, like friends. This encourages customers to reach out to you for help and suggestions, and listening and catering to them creates strong loyalty. In turn, customers act as strong brand advocates, generating leads, providing references and helping your business grow. This creates a bond and has a positive multiplier effect, especially in the socially connected world we live in.

Build a platform on which others can build:
As discussed earlier, innovation can be a cornerstone of differentiation and growth. Crowdsourcing is a widely adopted concept and can be a gateway to numerous innovations. So, companies can open certain parts of their business and leave them for other stakeholders (partners, customers) to disrupt. For instance, companies from Google to Apple have opened their platforms to let others build on them and disrupt the marketplace.


Source: 7 Keys To Market Share Growth And Sustainability

Data science’s limitations in addressing global warming

Data science is not a magic bag of tricks that can somehow find valid patterns under all circumstances. Sometimes the data itself is far too messy to analyze comprehensively in any straightforward way. Sometimes, it’s so massive, heterogeneous and internally inconsistent that no established data-scientific approach can do it justice.

When the data in question is big, the best-laid statistical models can only grasp pieces of its sprawling mosaic. As this recent article notes, that’s often the case with climate change data, which is at the heart of the global warming debate. Authors James H. Faghmous and Vipin Kumar state their thesis bluntly: “Despite the urgency, data science has had little impact on furthering our understanding of our planet in spite of the abundance of climate data….This…stems from the complex nature of climate data as well as the scientific questions climate science brings forth.”

What’s most instructive about their discussion is how they peel the methodological onion behind statistical methods in climate-data analysis. The chief issues, they argue, are as follows:

  • Historically shallow data: Modern climate science is a relatively new interdisciplinary field that integrates the work of scientists in meteorology, oceanography, hydrology, geology, biology and other established fields. Consequently, unified climatological data sets that focus on long-term trends are few and far between. Also, some current research priorities (such as global warming) have only come onto climatology’s radar over the past decade or so. As the authors note, “some datasets span only a decade or less. Although the data might be large (for example, high spatial resolution but short temporal duration), the spatiotemporal questions that we can ask from such data are limited.”
  • Spatiotemporal scale-mixing: As a closed system, the planet and all of its dynamic components interact across all spatial scales, from the global to the microscopic, and on all temporal scales, from the geological long-term to the split-second. As the authors note, “Some interactions might last hours or days—such as the influence of sea surface temperatures on the formation of a hurricane—while other interactions might occur over several years (e.g., ice sheets melting).” As anybody who has studied fractal science would point out, all these overlapping interactions introduce nonlinear dynamics that are fearsomely difficult to model statistically.
  • Heterogeneous data provenance: Given how global climate data is, it’s no surprise that no single source, method or instrumentation can possibly generate all of it, either at any single point in time or over the long timeframes necessary to identify trends. The authors note that climate data comes from four principal methodologies, each of them quite diverse in provenance: in situ (example: local meteorological stations), remotely sensed (example: satellite imaging), model output (example: simulations of climatic conditions in the distant past) and paleoclimatic (examples: core samples, tree rings, lake sediments). These sources cover myriad variables that may be complementary or redundant with each other, further complicating efforts to combine them into a unified data pool for further analysis. In addition, measurement instrumentation and data post-processing approaches change over the years, making longitudinal comparisons difficult. The heterogeneous provenance of this massive data set frustrates any attempt to ascertain its biases and vet the extent to which it meets consistent standards of quality. Consequently, any statistical models derived from this mess will suffer the same intrinsic issues.
  • Auto-correlated measurements: Even when we consider a very constrained spatiotemporal domain, the statistical modeling can prove tricky. That’s because adjacent climate-data measurements are often not statistically independent of each other. Unlike the canonical example of rolling a die, where the outcome of each roll is independent of other rolls, climate-data measurements are often quite correlated with each other, especially if they’re near to each other in space and time. Statisticians refer to this problem as “auto-correlation,” and it wreaks havoc with standard statistical modeling techniques, making it difficult to isolate the impacts of different independent variables on the outcomes of interest.
  • Machine learning difficulties: In climatological data analysis, supervised learning is complicated by the conceptual difficulties of defining what specific data pattern describes “global warming,” “ice age,” “drought” and other trends. One key issue is where you put the observational baseline. Does the training data you’re employing simply describe one climatological oscillation in a long-term cycle? Or does it describe a longer-term trend? How can you know? If you intend to use unsupervised learning, your machine learning model may fit historical data patterns. However, the model may suffer from a statistical problem known as “overfitting”: fitting the quirks and noise of the historical record so closely that it fails to generalize to new data. Such a model may also be so complex and opaque that domain scientists can’t map its variables clearly to well-understood climatological mechanisms. Either failing might make the model useless for predictive and prescriptive analyses.
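
The auto-correlation problem in the list above can be made concrete with a small sketch. The Python below uses only the standard library and entirely synthetic numbers (not climate data) to compare the lag-1 auto-correlation of independent die rolls with that of a series in which each value carries over most of the previous one. The function name `lag1_autocorrelation` and the 0.9 persistence factor are illustrative assumptions, not anything taken from the authors' paper.

```python
import random


def lag1_autocorrelation(series):
    """Return the lag-1 sample auto-correlation of a numeric sequence."""
    n = len(series)
    mean = sum(series) / n
    # Total variation around the mean (denominator).
    denom = sum((x - mean) ** 2 for x in series)
    # Co-variation between each value and its immediate successor.
    numer = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return numer / denom


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Independent observations: each die roll ignores the previous roll.
die_rolls = [rng.randint(1, 6) for _ in range(5000)]

# Persistent observations: each value keeps 90% of the previous value
# plus fresh noise (a hypothetical stand-in for nearby measurements).
persistent = [0.0]
for _ in range(4999):
    persistent.append(0.9 * persistent[-1] + rng.gauss(0, 1))

r_independent = lag1_autocorrelation(die_rolls)
r_persistent = lag1_autocorrelation(persistent)
print(f"die rolls:  {r_independent:.2f}")
print(f"persistent: {r_persistent:.2f}")
```

The die rolls should score close to zero while the persistent series scores close to the persistence factor. That gap is what standard techniques, which assume independent observations, fail to account for.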

In spite of all those issues, the authors don’t deny the value of data-scientific methods in climatological research. Instead, they call for a more harmonious balance between theory-driven domain science and data-driven statistical analysis. “What is needed,” they say, “is an approach that leverages the advances in data-driven research yet constrains both the methods and the interpretation of the results through tried-and-true scientific theory. Thus, to make significant contributions to climate science, new data science methods must encapsulate domain knowledge to produce theoretically-consistent results.”

These issues aren’t limited to climate data. Those same data-scientific issues apply to other heterogeneous data domains. For example, social-network graph analysis is a young field that has historically shallow data and attempts to analyze disparate sources, both global and local. How can data scientists effectively untangle intertwined sociological and psychological factors, considering that auto-correlations in human behavior, sentiment and influence run rampant always and everywhere?

If data science can’t get its arms around global warming, how can it make valid predictions of swings in the climate of world opinion?

Originally posted via “Data science’s limitations in addressing global warming”


Source: Data science’s limitations in addressing global warming

See what you never expected with data visualization

Written by Natan Meekers

A strong quote from John Tukey explains the essence of data visualization:

“The greatest value of a picture is when it forces us to notice what we never expected to see.”

Tukey was a famous American mathematician who truly understood data – its structure, patterns and what to look for. Because of that, he was able to come up with some great innovations, like the box plot. His powerful one-liner is a perfect introduction to this topic, because it points out the value of seeing things that we never expected to see.

With the large amounts of data generated every day, it’s impossible to keep up by looking at numbers only. Applying simple visualization techniques helps us to “hear” what the data is telling us. This is because our brain works in two parts. The left side is logical, the mathematician; the right side is creative, the artist.

Mercedes-Benz, the luxury carmaker, illustrated the value of visualization in its “Whole Brain” campaign in 2012. Ads showed how the two opposing parts of the brain complement each other. They juxtaposed the left side responsible for logic and analysis with the creative and intuitive right side. Through visualization, the campaign communicated that Mercedes-Benz, like the brain, is a combination of opposites. Working together, they create technological innovation, breakthrough engineering, inspiring design and passion.

Mercedes ad depicting left and right brain functions

Visualizing data, i.e. combining left and right sides, lets you optimize decision-making and speed up ad-hoc analysis. That helps you see trends as they’re occurring and take immediate action when needed.

The most impressive thing is that accurate and informative visualizations are just a click away for you, even as a business user. No technical background or intensive training is required at all. With the self-service capabilities of modern tools, you can get much more value out of your data just by pointing and clicking.

Data visualization plays a critical role in a world where so much data is pouring in from so many sources every day. It helps us to understand that data more easily. And we can detect hidden patterns, trends or events quicker than ever before. So start using your data TODAY for what it’s really worth.

The original article appeared on SAS Voices.


3 ways to boost revenue with data analytics

Financial management

In a mere decade, the physician practice revenue cycle has been transformed. Gone are the days when most patients had $10 or $20 co-payments and their insurance companies generally paid claims in full. Physicians can no longer order lab work and tests according to their preference without considering medical necessity. And as patients shoulder rising care costs, they have become payers themselves, and they’re not quite accustomed to this role.

All of these factors have led to an increasingly complex and challenging revenue cycle, one that requires innovation. “Doing more with less” may be a cliché, but it rings true for physician practices striving to thrive financially while providing the highest quality care. However, with the myriad new initiatives and demands vying for their time, revenue cycle managers and practice leadership may ask, “Is it even possible to do more with less?”

Surprisingly, the answer is “yes” for most practices. Fortunately, you can achieve this goal leveraging something you already have, or can obtain, within the four walls of your practice: knowledge.

Not many practices can afford to purchase technology strictly for analytics and business intelligence. Additionally, in an environment where challenges such as health reform and regulatory demands take substantial time and attention, practices don’t have the luxury of adding resources to tackle such efforts. Nonetheless, practices can jump-start their analytics efforts and fuel more informed decisions via their clearinghouse. By reviewing clearinghouse reports — both standard and custom — you can identify revenue cycle trends, spot problems and test solutions such as process improvements.

Here’s how you can leverage data to achieve revenue cycle improvement goals such as decreasing days in accounts receivable (A/R), reducing denials and optimizing contract negotiations with payers.

1. Reduce denials and rejections
Effectively managing denials and rejections has always been one of physician practices’ greatest revenue cycle challenges. The more denials and rejections a practice has, the more likely key metrics such as days in A/R are to underperform, since the practice isn’t able to get paid in a timely manner. Denials and rejections are just two of many areas that cause cash flow delays. When the reasons for denials and rejections are identified and addressed, for example by eliminating unproductive work, practices can begin to improve days in A/R and increase profitability because payment comes in more quickly. These basic revenue cycle challenges, coupled with more stringent medical necessity requirements and value-based reimbursement, are now creating even more difficulties for the healthcare industry.

Since ineligibility is often a leading cause for denials, a denial reduction strategy begins in the front office with quality eligibility information. An automated eligibility process provides front-office staff the data they need while also reducing errors. Allowing staff to check eligibility before patients are seen will set the stage for a more informed discussion regarding patient financial responsibility while also ensuring proper claims submission and reducing write-offs. Denial reports by reason are also an important tool; they can help practice managers identify staff or processes that require additional training.

A customized rejection report can help your team stay abreast of changing payer requirements and identify emerging patterns. Your clearinghouse should be able to generate a quarterly or monthly report that shows the most common reasons for claims rejections. Make sure the report details this information by practice location; staff at high-performing locations may be able to offer tips and advice to other offices with higher rejection rates.
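
If your clearinghouse can export rejected claims as raw records, the report described above is simple to approximate. The following Python is a minimal sketch under that assumption: the location names, rejection reasons and the `rejections_by_location` helper are all hypothetical, and a real export would have its own layout.

```python
from collections import Counter, defaultdict

# Hypothetical rejected-claim records: (practice location, rejection reason).
rejections = [
    ("Main Street", "Patient ineligible"),
    ("Main Street", "Missing modifier"),
    ("Main Street", "Patient ineligible"),
    ("Lakeside", "Invalid subscriber ID"),
    ("Lakeside", "Patient ineligible"),
    ("Main Street", "Patient ineligible"),
]


def rejections_by_location(records):
    """Count rejection reasons separately for each practice location."""
    report = defaultdict(Counter)
    for location, reason in records:
        report[location][reason] += 1
    return report


report = rejections_by_location(rejections)
for location in sorted(report):
    print(location)
    # most_common() lists the most frequent reasons first.
    for reason, count in report[location].most_common():
        print(f"  {reason}: {count}")
```

Grouping by location first keeps each office's most common reasons together, which makes it easier to see which sites might have tips to share.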

Practice leadership can email the report and an analysis of patterns and trends to the entire team. An excellent tool to educate managers, coders and billing staff, this email can highlight areas for improvement or where additional training is required. This analysis should be simple and easy to comprehend, providing a quick snapshot of rejections along with practical ideas for improvements. The goal is for staff to be able to make adjustments to day-to-day work processes simply by reviewing the email. It can even generate some healthy competition as teams at different locations strive to make the greatest improvements.

2. Identify problematic procedures and services
In an era of value-based reimbursement, knowing which codes are prone to reimbursement issues can help your practice navigate an increasingly tricky landscape for claims payment. This information can be particularly helpful as you acclimate your practice to each payer’s value-based methodology such as bundled payments or shared savings. A report showing denials by code and per physician can generate awareness regarding potentially problematic claims submission. It can facilitate team education regarding coding conventions, medical necessity rules and payer requirements.
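
A denials-by-code-and-physician view can be sketched along the following lines. The Python below assumes claim outcomes are available as simple records. The physician names, the sample procedure codes and the `denial_rates` helper are hypothetical. It reports the share of claims denied rather than a raw count, so that high-volume codes don't automatically look worse.

```python
from collections import defaultdict

# Hypothetical claim outcomes: (physician, procedure code, was_denied).
claims = [
    ("Dr. Adams", "99213", False),
    ("Dr. Adams", "99213", True),
    ("Dr. Adams", "93000", False),
    ("Dr. Baker", "99213", False),
    ("Dr. Baker", "93000", True),
    ("Dr. Baker", "93000", True),
]


def denial_rates(records):
    """Return {(physician, code): share of claims denied}."""
    submitted = defaultdict(int)
    denied = defaultdict(int)
    for physician, code, was_denied in records:
        submitted[(physician, code)] += 1
        if was_denied:
            denied[(physician, code)] += 1
    return {key: denied[key] / submitted[key] for key in submitted}


rates = denial_rates(claims)
# List the highest denial rates first so problem areas stand out.
for (physician, code), rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{physician} {code}: {rate:.0%}")
```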

3. Improve contract negotiations
Clearinghouse reports aren’t just useful for education and improvements within your practice; they can also provide valuable insights as you review payer contracts and prepare for negotiations. In payer-specific reports, look for trends such as the average amount paid on specific codes over time. Compare these averages with your other payers, and go into negotiations armed with this data.
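
As a rough illustration of the payer comparison, the Python sketch below averages the amount paid per code for each payer. The payer names, codes, dollar amounts and the `average_paid` helper are made up for the example. To follow the trend over time, the same grouping could also be keyed by month, assuming payment dates are available in the report.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical payment records: (payer, procedure code, amount paid in USD).
payments = [
    ("Payer A", "99213", 72.00),
    ("Payer A", "99213", 74.00),
    ("Payer B", "99213", 65.00),
    ("Payer B", "99213", 67.00),
    ("Payer A", "93000", 18.00),
    ("Payer B", "93000", 20.00),
]


def average_paid(records):
    """Return {code: {payer: average amount paid}}."""
    amounts = defaultdict(lambda: defaultdict(list))
    for payer, code, amount in records:
        amounts[code][payer].append(amount)
    return {
        code: {payer: mean(values) for payer, values in by_payer.items()}
        for code, by_payer in amounts.items()
    }


averages = average_paid(payments)
for code, by_payer in sorted(averages.items()):
    for payer, avg in sorted(by_payer.items()):
        print(f"{code} {payer}: ${avg:.2f}")
```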

A recent survey of the College of Healthcare Information Management Executives (CHIME) indicates that data analytics is the top investment priority for senior executives at large health systems, trumping both Accountable Care and ICD-10. Their reason: quality improvement and cost reduction can best be achieved by evaluating organizational data.

Physician practices can obtain the necessary data to optimize revenue without making costly technology investments. Whether your practice has two physicians or 200, the black-and-white nature of claims data can be invaluable. It can help you evaluate revenue cycle performance, identify problems, drive process changes and ultimately improve cash flow, simply by coupling your newfound knowledge with analytical and problem-solving skills.

Originally posted via “3 ways to boost revenue with data analytics”

Source: 3 ways to boost revenue with data analytics by analyticsweekpick