
Featured Posts (324)

  • A Guide to Managing Webpack Dependencies

  • Top 10 Commercial Hadoop Platforms

    Guest blog post by Bernard Marr

    Hadoop – the software framework which provides the necessary tools to carry out Big Data analysis – is widely used in industry and commerce for many Big Data related tasks.

    It is open source, essentially meaning that it is free for anyone to use for any purpose, and can be modified for any use. While designed to be user-friendly, in its “raw” state it still needs considerable specialist knowledge to set up and run.

    Because of this a large number of commercial versions have come onto the market in recent years, as vendors have created their own versions designed to be more easily used, or supplied alongside consultancy services to get you crunching through your data in no time.…

  • Where & Why Do You Keep Big Data & Hadoop?

    Guest blog post by Manish Bhoge

    I am back! Yes, I am back on my learning track. Sometimes it is really necessary to take a break and reflect on why we learn before learning. It was a 9-month safe refuge to learn how Big Data & Analytics can contribute to a Data Product.


    Data strategy has always been expected to generate revenue. As Big Data and Hadoop enter the enterprise data strategy, the big data infrastructure is also expected to add revenue. This is a tough expectation for a new entrant (Hadoop) when the established candidate (Data Warehouse & BI) itself mostly struggles for its existence. So it is very pertinent for solution architects to ask WHERE and WHY to bring Big Data (obviously Hadoop) into the data strategy. And, the safe…

  • Top 30 people in Big Data and Analytics

    Originally posted on Data Science Central

    Innovation Enterprise has compiled a top 30 list of individuals in big data who have had a large impact on the development or popularity of the industry. …

  • Ember Data (a.k.a. ember-data or ember.data) is a library for robustly managing model data in Ember.js applications. The developers of Ember Data state that it is designed to be agnostic to the underlying persistence mechanism, so it works just as well with JSON APIs over HTTP as it does with streaming WebSockets or local IndexedDB storage. It provides many of the facilities you’d find in server-side object-relational mappings (ORMs) like ActiveRecord, but is designed specifically for the unique environment of JavaScript in the browser.

    While Ember Data may take some time to…

  • Google formally announced Android 7.0 a few weeks ago, but as usual, you’ll have to wait for it. Thanks to the Android update model, most users won’t get their Android 7.0 over-the-air (OTA) updates for months. However, this does not mean developers can afford to ignore Android Nougat. In this article, Toptal Technical Editor Nermin Hajdarbegovic takes a closer look at Android 7.0, outlining new features and changes. While Android 7.0 is by no means revolutionary, the introduction of a new graphics API, a new JIT compiler, and a range of UI and performance tweaks will undoubtedly unlock more potential and generate a few new possibilities.
  • I first heard of Spark in late 2013 when I became interested in Scala, the language in which Spark is written. Some time later, I did a fun data science project trying to predict survival on the Titanic. This turned out to be a great way to get further introduced to Spark concepts and programming. I highly recommend it for any aspiring Spark developers looking for a place to get started.

    Today, Spark is being adopted by major players like Amazon, eBay, and Yahoo! Many organizations run Spark on clusters with thousands of nodes. According to the Spark FAQ, the largest known cluster has over 8000 nodes. Indeed, Spark is a technology well worth taking note of and learning about.


    This article provides an introduction to Spark including use cases and examples. It contains…

  • Guest blog post by Alessandro Piva

    The proliferation of data and the huge potential for companies to turn data into valuable insights are steadily increasing the demand for Data Scientists.

    But what skills and educational background must a Data Scientist have? What is the Data Scientist's role within the organization? What tools and programming languages does he/she mostly use? These are some of the questions that the Observatory for Big Data Analytics of Politecnico di Milano is investigating through an international survey submitted to Data Scientists: if you work with data in your company, please support us in our…

  • Associative Data Modeling Demystified - Part1

    Guest blog post by Athanassios Hatzis

    Relation, Relationship and Association

    While most players in the IT sector adopted graph or document databases and Hadoop-based solutions (Hadoop being an enabler of the HBase column store), it went almost unnoticed that several new DBMSs based on associative technology appeared on the scene, among them AtomicDB (the previous database engine of X10SYS) and Sentences. We have introduced and discussed the…

  • Originally posted on Data Science Central

    In a recent post, we reviewed a path to leverage legacy Excel data and import CSV files through MySQL into Spark 2.0.1. This may apply frequently in businesses where data retention did not always take the database route… However, we demonstrate here that the same result can be achieved…

  • 25 Predictions About The Future Of Big Data

    Guest blog post by Robert J. Abate.

    In the past, I have published on the value of information, big data, advanced analytics and the Abate Information Triangle and have recently been asked to give my humble opinion on the future of Big Data.

    I have been fortunate to have been on three panels recently at industry conferences which discussed this very question with such industry thought leaders as: Bill Franks (CTO, Teradata), Louis DiModugno (CDAO, AXA US), Zhongcai Zhang (CAO, NY Community Bank), Dewey Murdick (CAO, Department of Homeland Security), Dr. Pamela Bonifay Peele (CAO, UPMC Insurance Services), Dr. Len Usvyat (VP Integrated Care Analytics, FMCNA), Jeffrey Bohn (Chief Science Officer, State Street), Kenneth Viciana (Business Analytics Leader, Equifax) and others.

    Each brought their unique perspective to the challenges of Big Data and their insights into their…

  • Guest blog post by Marc Borowczak

    Moving legacy data to a modern big data platform can be daunting at times. It doesn’t have to be. In this short tutorial, we’ll briefly review an approach and demonstrate it on my preferred data set. This isn’t an ML repository or a Kaggle competition data set, simply the data I accumulated over decades to keep track of my plastic model collection, and as such it definitely meets the legacy standard!

    We’ll describe steps followed on a laptop VirtualBox machine…

  • Java versus Python

    Originally posted on Data Science Central

    Interesting picture that went viral on Facebook. We've had plenty of discussions about Python versus R on DSC. This picture is trying to convince us that Python is superior to Java. It is about a tiny piece of code to draw a pyramid.

    This raises several questions:

    • Is Java faster than Python? If yes, under what circumstances? And by how…
  • Why Not So Hadoop?

    Guest blog post by Kashif Saiyed

    Does Big Data mean Hadoop? Not really; however, when one thinks of the term Big Data, the first thing that comes to mind is Hadoop, along with heaps of unstructured data. It is an exceptional lure for data scientists, who get the opportunity to work with large amounts of data to train their models, and for businesses, which gain knowledge previously never imagined. But has it lived up to the hype? In this article, we will look at a brief history of Hadoop and see how it stands today.

    2015 Hype Cycle – Gartner

     

    Some key takeaways from the Hype cycle of 2015:

    1. ‘Big Data’ was at the Trough of Disillusionment stage in 2014, but is not seen in the 2015 Hype cycle.
    2. Another interesting point is that the ‘Internet of Things’, which suggests a network of interconnected devices around us, has been at the peak for 2 consecutive years…
  • Originally posted on Data Science Central

    Summary

    Introducing Data Science teaches you how to accomplish the fundamental tasks that occupy data scientists. Using the Python language and common Python libraries, you'll experience firsthand the challenges of dealing with data at scale and gain a solid foundation in data science.

    About the Technology

    Many companies need developers with data science skills to work on projects ranging from social media marketing to machine learning. Discovering what you need to learn to begin a career as a data scientist can seem bewildering. This book is designed to help you get started.

    About the Book

    Introducing Data Science explains vital data science concepts and teaches you how to…

  • Originally posted on Data Science Central

    Summary:  This is the first in a series of articles aimed at providing a complete foundation and broad understanding of the technical issues surrounding an IoT or streaming system so that the reader can make intelligent decisions and ask informed questions when planning their IoT system. 

    In This Article

    In Lesson 2

    In Lesson 3

    Is it IoT or…

  • Originally posted on Data Science Central

    Cloud giants like Amazon, Google, Azure and IBM have rushed into the big data analytics cloud market.  They claim their tools will make developer tasks simple. For machine learning, they say their cloud products will free data scientists and developers from implementation details so they can focus on business logic.  …

  • Originally posted on Data Science Central

    Thousands of articles and tutorials have been written about data science and machine learning. Hundreds of books, courses and conferences are available. You could spend months just figuring out what to do to get started, even to understand what data science is about.

    In this short contribution, I share what I believe to be the most valuable resources - a small list of top resources and starting points. This will be most valuable to any data practitioner who has very little free time. 


    These resources cover data…

  • 5 Big Data Myths Businesses Should Know

    Guest blog post by Larry Alton

    Big data is seeping into every facet of our lives. Smart home gadgets are becoming part of the nervous systems of new and remodeled homes, and many renters are demanding these interconnected gadgets from landlords.

    But nowhere has Big Data created a bigger buzz than in business. Companies of all sizes are collecting data at a seemingly insurmountable rate. Big data is larger than ever before.

    We’ve collected more data in…

  • Originally posted on Data Science Central

    In this article, we have just started to provide answers to one of the largest collections of data science job interview questions ever published, and we will continue to add answers to most of these questions. Some answers link to solutions offered in my Wiley data science book: you can find this book here. The 91 job interview questions were originally published here with no answers, and we recently added 50 questions to identify a true data scientist, …

