Originally posted on Data Science Central

Here's a different angle on a much-analyzed question at the heart of our professional activities. In this article, Steve Miller of Inquidia tackles how NoSQL has changed our traditional understanding of Predictive Analytics and Data Science. You might also look back at our previous post, How NoSQL Fundamentally Changed Machine Learning.

Here's the beginning of Steve's take on this:

My company, Inquidia Consulting, is currently engaged in or completing several predictive analytics (PA) and data science (DS) projects. While we distinguish PA from DS, there's often no hard dividing line between the two for our customers. Indeed, though we demur, some now consider data science to be any application of statistical methods to business problems.

For Inquidia, both PA and DS generally involve statistics and machine learning of some sort, often “climaxing” with predictive models trained and validated on existing data. The ultimate goal is to deploy the models to make go-forward predictions in a business process.

Inquidia's PA work is usually more narrowly focused than its DS cousin: more often than not, it is a particular modeling task on a relatively short-term project, with the relevant data identified in advance. And the PA customer may suggest “theories” on what the final models might look like for us to test. R, Python and SAS are preferred PA platforms.

DS projects, in contrast, are more comprehensive but nebulous, with substantial computation, data integration and wrangling, big (and perhaps unstructured) data, and exploration challenges that precede theorizing and subsequent modeling. In many cases, DS work is shaped more by data programming than by modeling. The Cloud, Redshift, Hadoop/Impala, Spark, R and Python are Inquidia's usual-suspect DS platforms.

Read the entire article here.
