Jul 02, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
Human resource  Source

[ AnalyticsWeek BYTES]

>> The Big Data Debate: Batch vs. Streaming Processing by analyticsweekpick

>> The Changing Landscape of Customer Acquisition, Engagement and Retention in 2020 by administrator

>> â€˜Big Data’ Alters Investment Research Landscape by analyticsweekpick

Wanna write? Click Here

[ FEATURED COURSE]

Lean Analytics Workshop – Alistair Croll and Ben Yoskovitz

image

Use data to build a better startup faster in partnership with Geckoboard… more

[ FEATURED READ]

Storytelling with Data: A Data Visualization Guide for Business Professionals

image

Storytelling with Data teaches you the fundamentals of data visualization and how to communicate effectively with data. You’ll discover the power of storytelling and the way to make data a pivotal point in your story. Th… more

[ TIPS & TRICKS OF THE WEEK]

Data Have Meaning
We live in a Big Data world in which everything is quantified. While the emphasis of Big Data has been focused on distinguishing the three characteristics of data (the infamous three Vs), we need to be cognizant of the fact that data have meaning. That is, the numbers in your data represent something of interest, an outcome that is important to your business. The meaning of those numbers is about the veracity of your data.

[ DATA SCIENCE Q&A]

Q:What is latent semantic indexing? What is it used for? What are the specific limitations of the method?
A: * Indexing and retrieval method that uses singular value decomposition to identify patterns in the relationships between the terms and concepts contained in an unstructured collection of text
* Based on the principle that words that are used in the same contexts tend to have similar meanings
* “Latent”: semantic associations between words is present not explicitly but only latently
* For example: two synonyms may never occur in the same passage but should nonetheless have highly associated representations

Used for:

* Learning correct word meanings
* Subject matter comprehension
* Information retrieval
* Sentiment analysis (social network analysis)

Source

[ VIDEO OF THE WEEK]

#FutureOfData Podcast: Conversation With Sean Naismith, Enova Decisions

 #FutureOfData Podcast: Conversation With Sean Naismith, Enova Decisions

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

The data fabric is the next middleware. – Todd Papaioannou

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Juan Gorricho, @disney

 #BigData @AnalyticsWeek #FutureOfData #Podcast with Juan Gorricho, @disney

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Estimates suggest that by better integrating big data, healthcare could save as much as $300 billion a year — that’s equal to reducing costs by $1000 a year for every man, woman, and child.

Sourced from: Analytics.CLUB #WEB Newsletter

Amazon Redshift COPY command cheatsheet

Although it’s getting easier, ramping up on the COPY command to import tables into Redshift can become very tricky and error-prone.   Following the accordion-like hyperlinked Redshift documentation to get a complete command isn’t always straighforward, either.

Treasure Data got in on the act (we always do!) with a guide to demystify and distill all the COPY commands you could ever need into one short, straightforward guide.

Load tables into Redshift from S3, EMR, DynamoDB, over SSH, and more.
Load tables into Redshift from S3, EMR, DynamoDB, over SSH, and more!

 

Includes example commands, how to use data sources – including the steps for setting up an SSH connection,  using temporary and encrypted credentials, formatting, and much more.

Get the guide here.

Originally Posted at: Amazon Redshift COPY command cheatsheet by john-hammink

Rethinking classical approaches to analysis and predictive modeling

Rethinking classical approaches to analysis and predictive modeling
Rethinking classical approaches to analysis and predictive modeling

Synopsis:

The speaker will address the need to rethink classical approaches to analysis and predictive modeling. He will examine “iterative analytics” and extremely fine grained segmentation down to a single customer – ultimately building one model per customer or millions of predictive models delivering on the promise of “segment of one” . The speaker will also address the speed at which all this has to work to maintain a competitive advantage for innovative businesses.

Speaker:

Afshin Goodarzi Chief Analyst 1010data

A veteran of analytics, Goodarzi has led several teams in designing, building and delivering predictive analytics and business analytical products to a diverse set of industries. Prior to joining 1010data, Goodarzi was the Managing Director of Mortgage at Equifax, responsible for the creation of new data products and supporting analytics to the financial industry. Previously, he led the development of various classes of predictive models aimed at the mortgage industry during his tenure at Loan Performance (Core Logic). Earlier on he had worked at BlackRock, the research center for NYNEX (present day Verizon) and Norkom Technologies. Goodarzi’s publications span the fields of data mining, data visualization, optimization and artificial intelligence.

Presentation Video:

Presentation Slideshare:

Sponsor:
1010Data [ http://1010data.com ]
Microsoft NERD [ http://microsoftnewengland.com ]
Cognizeus [ http://cognizeus.com ]

Originally Posted at: Rethinking classical approaches to analysis and predictive modeling

Jun 25, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
Extrapolating  Source

[ AnalyticsWeek BYTES]

>> Model Risk 101: A Checklist for Risk Managers by analyticsweekpick

>> Apr 26, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin

>> Consider The Close Variants During Page Segmentation For A Better SEO by thomassujain

Wanna write? Click Here

[ FEATURED COURSE]

Probability & Statistics

image

This course introduces students to the basic concepts and logic of statistical reasoning and gives the students introductory-level practical ability to choose, generate, and properly interpret appropriate descriptive and… more

[ FEATURED READ]

Superintelligence: Paths, Dangers, Strategies

image

The human brain has some capabilities that the brains of other animals lack. It is to these distinctive capabilities that our species owes its dominant position. Other animals have stronger muscles or sharper claws, but … more

[ TIPS & TRICKS OF THE WEEK]

Winter is coming, warm your Analytics Club
Yes and yes! As we are heading into winter what better way but to talk about our increasing dependence on data analytics to help with our decision making. Data and analytics driven decision making is rapidly sneaking its way into our core corporate DNA and we are not churning practice ground to test those models fast enough. Such snugly looking models have hidden nails which could induce unchartered pain if go unchecked. This is the right time to start thinking about putting Analytics Club[Data Analytics CoE] in your work place to help Lab out the best practices and provide test environment for those models.

[ DATA SCIENCE Q&A]

Q:How frequently an algorithm must be updated?
A: You want to update an algorithm when:
– You want the model to evolve as data streams through infrastructure
– The underlying data source is changing
– Example: a retail store model that remains accurate as the business grows
– Dealing with non-stationarity

Some options:
– Incremental algorithms: the model is updated every time it sees a new training example
Note: simple, you always have an up-to-date model but you can’t incorporate data to different degrees.
Sometimes mandatory: when data must be discarded once seen (privacy)
– Periodic re-training in “batch” mode: simply buffer the relevant data and update the model every-so-often
Note: more decisions and more complex implementations

How frequently?
– Is the sacrifice worth it?
– Data horizon: how quickly do you need the most recent training example to be part of your model?
– Data obsolescence: how long does it take before data is irrelevant to the model? Are some older instances
more relevant than the newer ones?
Economics: generally, newer instances are more relevant than older ones. However, data from the same month, quarter or year of the last year can be more relevant than the same periods of the current year. In a recession period: data from previous recessions can be more relevant than newer data from different economic cycles.

Source

[ VIDEO OF THE WEEK]

Venu Vasudevan @VenuV62 (@ProcterGamble) on creating a rockstar data science team #FutureOfData #Podcast

 Venu Vasudevan @VenuV62 (@ProcterGamble) on creating a rockstar data science team #FutureOfData #Podcast

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Hiding within those mounds of data is knowledge that could change the life of a patient, or change the world. – Atul Butte, Stanford

[ PODCAST OF THE WEEK]

Understanding #BigData #BigOpportunity in Big HR by @MarcRind #FutureOfData #Podcast

 Understanding #BigData #BigOpportunity in Big HR by @MarcRind #FutureOfData #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Data production will be 44 times greater in 2020 than it was in 2009.

Sourced from: Analytics.CLUB #WEB Newsletter

Big Data: Would number geeks make better football managers?

Charles Reep was a retired RAF Wing Commander who loved football.

Specifically, Swindon Town. And it ached to see them losing – something the team made a habit of in the 1949/50 season.

So frustrated was Wing Cdr Reep with one particular performance, that for the second half he pulled out his notepad and started making notes on the players – their movements, their positions, the shape of their play. He identified small changes that he thought could help the team grab a few more goals.

He was decades ahead of his time.

Now, behind the biggest football teams in the world, lies a sophisticated system of data gathering, metrics and number-crunching. Success on the pitch – and on the balance sheet – is increasingly becoming about algorithms.

The richest 20 clubs in the world bring in combined revenues of 5.4bn euros ($7.4bn, £4.5bn), according to consultancy firm Deloitte. And increasingly, data is being seen as crucial to maximising that potential income by getting the most from football’s prized investments – the players.

Hoof it!

Data and football have had a strained relationship over the years.

Back in the 1950s, Swindon didn’t have much time for Wing Cdr Reep’s approach. But west London side Brentford did.

ProzoneProzone’s software offers real-time match tracking, and is used by over 300 clubs worldwide

The club was facing a relegation battle. Wing Cdr Reep was taken on as an advisor – and with his counsel, the team turned their fortunes around and were safe from relegation at the close of the season.

A triumph, you would think – but his approach, despite the measurable success, drew considerable scorn.

His data suggested that most goals were scored from fewer than three direct passes, and he therefore recommended the widely-despised “long-ball” game.

In other words, the ugliest type of football imaginable. Hoof the ball forward, hope you get a lucky break, and poke it into the net.

“Unfortunately it kind of brought statistics and football into disrepute,” says Chris Anderson, author of The Numbers Game, an analytical and historical look at the use of data in football.

“Because people pooh-poohed the idea of the long ball game in football and thought it responsible for the England team not doing nearly as well as they should have for all these years.”

Leg sensors

Wing Cdr Reep passed away in 2002. Were he alive today, he would likely be a welcome guest at German football club TSG Hoffenheim, where the “big data” revolution is changing everything about how they prepare for a match.

Through a partnership with SAP – which specialises in handling “big data” for business – the club has incorporated real-time data measurements into its training schedule.

“It’s a very new way of training,” says Stefan Lacher, head of technology at SAP.

SAP data at HoffenheimThe data can be analysed in real-time by data experts – and training schedules can be adapted

“The entire training area becomes accessible virtually by putting trackers on everything that’s important – on the goals, on the posts. Every player gets several of them – one on each shinpad – and the ball of course has a sensor as well.

“If you train for just 10 minutes with 10 players and three balls – it produces more than seven million data points, which we can then process in real time.”

SAP’s software is able to crunch that data, and suggest tweaks that each individual player can make.

“It’s about better understanding the strengths and weaknesses of the players,” Mr Lacher says, “and spending more time working on the weaknesses and making better use of the strengths.

“It’s moving from gut feeling to facts and figures.”

Career-threatening injury

But it’s in the boardroom where football data has an even more critical role to play in the success of the team, says Dr Paul Neilson from football technology specialists Prozone.

“One of the most important things within elite sport is making sure your players are available for training and matches as much as possible, and that is about mitigating injury risks,” he says.

“If you’re doing that you should be able to reduce the risk of physical overload, and reduce the risk of injury.

SAP data at HoffenheimThe data can be relayed to players so they can work on their weaknesses

“When you’re paying players as much as players get paid, it’s very important to make sure they’re on the pitch as much as possible.”

Non-playing players is a massive financial concern for football clubs. The famous example is the case of Jonathan Woodgate, who left Newcastle United in 2004 to join Spanish giants Real Madrid – for a tasty £13.4m.

Plagued by injury, Woodgate played for Real just nine times before leaving in 2007. That’s just under £1.5m per game – without his weekly wages taken into consideration.

Prozone’s research lab wants to reduce this risk for clubs by using data to analyse body movements and spot, before a physio can, where future injuries may occur.

In young players, analysis of movement can also provide an early warning system for those who may develop career-threatening injuries.

Collecting this data is a sophisticated task. Prozone’s approach relies on a complex network of cameras fitted around the stadium, picking up player movements from several angles at once.

Two worlds

Football managers and coaches like to think it’s their instinct, not geeky data, that gets results. And so, uptake of data analysis in football has been a slow process.

“Football, particularly in the UK, can be a little bit conservative,” says Dr Neilson.

“You look at rugby, and the head coach/manager will often be in the stand for all the game and be surrounded by data and technology and video analysis.

SAP sensor
Players at Hoffenheim attach sensors to their kit to monitor their movements

“Compare that with football and the manager is still very much in the dugout, trying to affect the players personally, in terms of instructions and shouting – and very much being part of the sometimes chaotic nature of football.”

This culture clash means there are no managers that prowl the touchline with a tablet – yet. But behind the scenes it’s a very different picture.

Prozone provides intricate data for more than 300 football clubs around the world, including every team in the lucrative English Premier League.

But to make sense of it all requires talent – and Dr Neilson believes that soon, fans will come to admire – or despise – their club’s data scientist in the same way they treat the manager now.

“In a typical football club you have technical people like your sports analysis staff, or sports science staff. They are very analytical, very objective and process driven.

“At the opposite end of the scale you have the decision makers – the chief executive who writes the cheques, the manager that makes the weekly decision in terms of team selection.

“The challenge is connecting those two worlds – so the decision makers trust in that data.”

Sadly, Wing Cdr Reep didn’t live to see the true appreciation of his craft.

And to this day, his long-ball philosophy is criticised by many who say that his data collection was far too primitive to come to such sweeping conclusions.

But nevertheless, his work pioneered what has become a cornerstone of the modern, beautiful game.

Somewhere, in the not-so-distant future, at a football club losing three-nil at home – the fans are chanting “you’re getting sacked in the morning”. Not at the manager, but at the man with the big data.

Follow Dave Lee on Twitter @DaveLeeBBC

Originally posted via “Big Data: Would number geeks make better football managers?”

Source

Jun 18, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
Extrapolating  Source

[ AnalyticsWeek BYTES]

>> Big data could power on-demand public transport: IDA by analyticsweekpick

>> The Future of Big Data [Infographics] by v1shal

>> May 11, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin

Wanna write? Click Here

[ FEATURED COURSE]

Artificial Intelligence

image

This course includes interactive demonstrations which are intended to stimulate interest and to help students gain intuition about how artificial intelligence methods work under a variety of circumstances…. more

[ FEATURED READ]

Introduction to Graph Theory (Dover Books on Mathematics)

image

A stimulating excursion into pure mathematics aimed at “the mathematically traumatized,” but great fun for mathematical hobbyists and serious mathematicians as well. Requiring only high school algebra as mathematical bac… more

[ TIPS & TRICKS OF THE WEEK]

Analytics Strategy that is Startup Compliant
With right tools, capturing data is easy but not being able to handle data could lead to chaos. One of the most reliable startup strategy for adopting data analytics is TUM or The Ultimate Metric. This is the metric that matters the most to your startup. Some advantages of TUM: It answers the most important business question, it cleans up your goals, it inspires innovation and helps you understand the entire quantified business.

[ DATA SCIENCE Q&A]

Q:What is latent semantic indexing? What is it used for? What are the specific limitations of the method?
A: * Indexing and retrieval method that uses singular value decomposition to identify patterns in the relationships between the terms and concepts contained in an unstructured collection of text
* Based on the principle that words that are used in the same contexts tend to have similar meanings
* “Latent”: semantic associations between words is present not explicitly but only latently
* For example: two synonyms may never occur in the same passage but should nonetheless have highly associated representations

Used for:

* Learning correct word meanings
* Subject matter comprehension
* Information retrieval
* Sentiment analysis (social network analysis)

Source

[ VIDEO OF THE WEEK]

#BigData #BigOpportunity in Big #HR by @MarcRind #JobsOfFuture #Podcast

 #BigData #BigOpportunity in Big #HR by @MarcRind #JobsOfFuture #Podcast

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

The temptation to form premature theories upon insufficient data is the bane of our profession. – Sherlock Holmes

[ PODCAST OF THE WEEK]

Andrea Gallego(@risenthink) / @BCG on Managing Analytics Practice #FutureOfData #Podcast

 Andrea Gallego(@risenthink) / @BCG on Managing Analytics Practice #FutureOfData #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

In 2008, Google was processing 20,000 terabytes of data (20 petabytes) a day.

Sourced from: Analytics.CLUB #WEB Newsletter

Best Practices for Using Context Variables with Talend – Part 2

First off, a big thank you to all those who have read the first part of this blog series!  If you haven’t read it, I invite you to read it now before continuing, as ~Part 2 will build upon it and dive a bit deeper.  Ready to get started? Let’s kick things off by discussing the implicit context load.

The Implicit Context Load

The Implicit Context Load is one of those pieces of functionality that can very easily be ignored but is incredibly valuable.

Simply put, the implicit context load is just a way of linking your jobs to a hardcoded file path or database connection to retrieve your context variables. That’s great, but you still have to hardcode your file path/connection settings, so how is it of any use here if we want a truly environment agnostic configuration?

Well, what is not shouted about as much as it probably should be is that the Implicit Context Load configuration variables can not only be hardcoded, but they can be populated by Talend Routine methods. This opens up a whole new world of environment agnostic functionality and makes Contexts completely redundant for configuring Context variables per environment.

You can find the Talend documentation for the Implicit Context Load here. You will notice that it doesn’t say (at the moment…maybe an amendment is due :)) that each of the fields shown in the screenshot below can be populated by Talend routine methods instead of being hardcoded.

JASYPT

Before I go any further it makes sense to jump onto a slight tangent and mention JASYPT. JASYPT is a java library which allows developers to add basic encryption capabilities to his/her projects with minimum effort, and without the need of having deep knowledge on how cryptography works. JASYPT is supplied with Talend, so there is no need to hunt around and download all sorts of Jars to use here. All you need to be able to do is write a little Java to enable you to obfuscate your values to prevent others from being able to read them in clear text.

Now, you won’t necessarily want all of your values to be obfuscated. This might actually be a bit of a pain. However, JASYPT makes this easy as well. JASYPT comes built-in with some functionality which will allow it to ingest a file of parameters and decrypt only the values which are surrounded by ….

ENC(………)

This means a file with values such as below (example SQL server connection settings)…..

TalendContextAdditionalParams=instance=TALEND_DEV

TalendContextDbName=context_db

TalendContextEnvironment=DEV

TalendContextHost=MyDBHost

TalendContextPassword=ENC(4mW0zXPwFQJu/S6zJw7MIJtHPnZCMAZB)

TalendContextPort=1433

TalendContextUser=TalendUser

…..will only have the “TalendContextPassword” variable decrypted, the rest will be left as they are.

This piece of functionality is really useful in a lot of ways and often gets overlooked by people looking to hide values which need to be made easily available to Talend Jobs. I will demonstrate precisely how to make use of this functionality later, but first I’ll show you how simple using JASYPT is if you simply want to encrypt and decrypt a String.

Simple Encrypt/Decrypt Talend Job

In the example I will give you in part 3 of this blog series (I have to have something to keep you coming back), the code will be a little harder than below. Below is an example job showing how simple it is to use the JASYPT functionality. This job could be used for encrypting whatever values you may wish to encrypt manually. It’s layout is shown below….

 

Two components. A tLibraryLoad to load the JASYPT Jar and a tJava to carry out the encryption/decryption.

The tLibraryLoad is configured as below. Your included version of JASYPT may differ from the one I have used. Use whichever comes with your Talend version.

The tJava needs to import the relevant class we are using from the JASYPT Jar. This import is shown below…..

The actual code is….

import org.jasypt.encryption.pbe.StandardPBEStringEncryptor;

Now to make use of the StandardPBEStringEncryptor I used the following configuration….

The actual code (so you can copy it) is shown below….

//Configure encryptor class

StandardPBEStringEncryptor encryptor = new StandardPBEStringEncryptor();

encryptor.setAlgorithm("PBEWithMD5AndDES");

encryptor.setPassword("BOB");




//Set the String to encrypt and print it

String stringToEncrypt = "Hello World";

System.out.println(stringToEncrypt);




//Encrypt the String and store it as the cipher String. Then print it

String cipher = encryptor.encrypt(stringToEncrypt);

System.out.println(cipher);




//Decrypt the String just encrypted and print it out

System.out.println(encryptor.decrypt(cipher));

In the above it is all hardcoded. I am encrypting the String “Hello World” using the password “BOB” and the algorithm “PBEWithMD5AndDES”. When I run the job, I get the following output….

Starting job TestEcryption at 07:47 19/03/2018.




[statistics] connecting to socket on port 3711

[statistics] connected

Hello World

73bH30rffMwflGM800S2UO/fieHNMVdB

Hello World

[statistics] disconnected

Job TestEcryption ended at 07:47 19/03/2018. [exit code=0]

These snippets of information are useful, but how do you knit them together to provide an environment agnostic Context framework to base your jobs on? I’ll dive into that in Part 3 of my best practices blog. Until next week!

The post Best Practices for Using Context Variables with Talend – Part 2 appeared first on Talend Real-Time Open Source Data Integration Software.

Source: Best Practices for Using Context Variables with Talend – Part 2

Jun 11, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://events.analytics.club/tw/eventpull.php?cat=WEB): failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found
in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

Warning: file_get_contents(http://news.analyticsweek.com/tw/newspull.php): failed to open stream: HTTP request failed! in /home3/vishaltao/public_html/mytao/script/includeit.php on line 15

[  COVER OF THE WEEK ]

image
Big Data knows everything  Source

[ AnalyticsWeek BYTES]

>> 25 Cartoons To Give Current Big Data Hype A Perspective by v1shal

>> Radar: More Evil Than Pie? by analyticsweek

>> Cloud Data Lakes, Data Warehouses and Data Prep: Making Them All Work Together by analyticsweekpick

Wanna write? Click Here

[ FEATURED COURSE]

Learning from data: Machine learning course

image

This is an introductory course in machine learning (ML) that covers the basic theory, algorithms, and applications. ML is a key technology in Big Data, and in many financial, medical, commercial, and scientific applicati… more

[ FEATURED READ]

Storytelling with Data: A Data Visualization Guide for Business Professionals

image

Storytelling with Data teaches you the fundamentals of data visualization and how to communicate effectively with data. You’ll discover the power of storytelling and the way to make data a pivotal point in your story. Th… more

[ TIPS & TRICKS OF THE WEEK]

Keeping Biases Checked during the last mile of decision making
Today a data driven leader, a data scientist or a data driven expert is always put to test by helping his team solve a problem using his skills and expertise. Believe it or not but a part of that decision tree is derived from the intuition that adds a bias in our judgement that makes the suggestions tainted. Most skilled professionals do understand and handle the biases well, but in few cases, we give into tiny traps and could find ourselves trapped in those biases which impairs the judgement. So, it is important that we keep the intuition bias in check when working on a data problem.

[ DATA SCIENCE Q&A]

Q:Give examples of bad and good visualizations?
A: Bad visualization:
– Pie charts: difficult to make comparisons between items when area is used, especially when there are lots of items
– Color choice for classes: abundant use of red, orange and blue. Readers can think that the colors could mean good (blue) versus bad (orange and red) whereas these are just associated with a specific segment
– 3D charts: can distort perception and therefore skew data
– Using a solid line in a line chart: dashed and dotted lines can be distracting

Good visualization:
– Heat map with a single color: some colors stand out more than others, giving more weight to that data. A single color with varying shades show the intensity better
– Adding a trend line (regression line) to a scatter plot help the reader highlighting trends

Source

[ VIDEO OF THE WEEK]

Pascal Marmier (@pmarmier) @SwissRe discusses running data driven innovation catalyst

 Pascal Marmier (@pmarmier) @SwissRe discusses running data driven innovation catalyst

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Numbers have an important story to tell. They rely on you to give them a voice. – Stephen Few

[ PODCAST OF THE WEEK]

Understanding Data Analytics in Information Security with @JayJarome, @BitSight

 Understanding Data Analytics in Information Security with @JayJarome, @BitSight

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

29 percent report that their marketing departments have ‘too little or no customer/consumer data.’ When data is collected by marketers, it is often not appropriate to real-time decision making.

Sourced from: Analytics.CLUB #WEB Newsletter