Data Science for Developers Webinar

What exactly is data science? How does one become a data scientist? The Harvard Business Review has labelled data scientist “the sexiest job of the 21st century.” A quick search of job sites reveals that this field is in high demand. However, no one can agree on a common definition of…

Visualizing Sales Data in Python with Matplotlib

In our last post we interpreted a data set with pandas to gain some insights from it. In this post, we will do the same, but instead of interpreting the raw data we will use visualizations to help us determine patterns in the data. But before we dive into the implementation, let’s review the benefits…

Data Science and the Data Science Process

Before we get into the fun part of working with data, let’s break down how data science involves more than just statistics, why it’s becoming more important, and the data science process. Data Science vs. Statistics: In short, data science is extracting knowledge from data. But how is that different from statistics? Data science encompasses…

Data Science with R in Visual Studio

As in our previous post on Python, we will walk through the offerings Visual Studio now gives us for working with R and related tools. Since we covered installation in the previous post, and the steps are the same, I will omit that from this post. We’ll just focus on all the…

Microsoft Announces New Big Data Azure Services

Microsoft announced several new features targeting Big Data processing, including support for HDInsight on Ubuntu Linux and a set of new features in its Data Lake services. The first announcement was the availability of HDInsight, Microsoft’s Hadoop service, on Ubuntu Linux virtual machines. Features include the ability to create HDInsight clusters from the…

Microsoft Announces Azure SQL Data Warehouse Public Preview

Microsoft has announced that Azure SQL Data Warehouse is open for business, at least for a lucky few customers. The limited public preview is designed for data warehouses in the 5-10TB range, allowing Microsoft to test scalability and performance before taking on heavier lifting. The initial public preview is designed for data warehouses…

Enterprise Adoption of NoSQL

For all of its obvious success, I still think that NoSQL is underutilized (or, at least, misunderstood) in the enterprise. Some of this can be explained by inertia: the classic relational model has ruled the roost for decades, and certainly isn’t going away anytime soon. Enterprises have significant skillset and infrastructure investments in the care…

The Sterling NoSQL Database in a Mango World

I was at the MIX 2011 event. There was an “Open Source Fest” before the event, and I came to showcase the Sterling NoSQL Database project that I run (and code most of, although numerous enhancements and patches have now been added by a growing team of fantastic supporters). I was happy to speak…