Pre-processing Text Data with NLTK and Azure Machine Learning

Data comes in all forms. Lately, we’ve been going over mostly numerical and categorical data. Even though the categorical data contains words, we transform it into numerical data for our algorithms. However, what if your data is only words? That’s where natural language processing comes in, and in this post, we’ll go over the basics…

Using the Cognitive Services Text Analytics API: Detecting Languages

Microsoft offers a lot of fascinating APIs for building intelligent applications through its Cognitive Services. Among those services is the Text Analytics API. This API offers a wide range of valuable text-based functionality such as sentiment analysis and key phrase extraction. With these useful APIs available, what could be a better means of…

Beginning Statistics for Data Science: Analyzing Data

In our last post, we discussed the different types your data can have. Now let’s focus on how to analyze those types of data. Python code will be used to demonstrate a few of these concepts. To get things started with the Python code, let’s go ahead and import our packages and review…

Getting Quick Insights on Sales Data with PowerBI

To finish off getting insights from a sales data set, we’re going to look at using Microsoft’s PowerBI. PowerBI is a very helpful tool for looking at our data through visualizations. The insights will be the same that we got in our visualization post from before, but using PowerBI we get these visualizations quicker and…

Beginning Statistics for Data Science: Types of Data

Statistics is becoming a must-learn topic for anyone looking to get into data science. Look at any data scientist job posting, and you will be hard-pressed to find a listing that does not mention a degree in statistics, mathematics, or some experience in analytics as a minimum qualification. Courses in data science are including…

Creating Web Apps for Your Machine Learning Models with Dash

In the last post, we created APIs for our machine learning models so they could be deployed and clients could invoke them. However, what if you just wanted a simple web page that included interactive and attractive graphs? That’s where Dash comes to the rescue. Dash is similar to Shiny, a framework to build interactive…

Creating a Machine Learning Web API with Flask

In our previous post, we went over how to create a simple linear regression model with scikit-learn and how to use it to make predictions. But that’s not very useful for anyone other than the creator of the model, since it’s only available on their machine. In this post, we’ll go over how to use…

Top Free Data Science Books

There are probably thousands upon thousands of tutorials, articles, videos, and blog posts on all things data science on the internet now. Yet I’m still a big fan of books. Throughout history, books have given wisdom, advice, and knowledge to everyone who wants to read them. Seneca, a Stoic philosopher, mentioned something similar: Men who…

Visualizing Sales Data in Python with Matplotlib

In our last post we interpreted a data set with pandas to gain some insights from it. In this post, we will do the same, but instead of interpreting the raw data we will use visualizations to help us determine patterns in the data. But before we dive into the implementation, let’s review the benefits…

Using Pandas to Analyze Sales Data

Now that we know how the data science process works, let’s leverage some of it and try to find insights into some data. We’ll be using pandas, a popular data analysis package for Python, to load and work with our data. Feel free to follow along by downloading the Jupyter notebook. If you went through…

Data Science and the Data Science Process

Before we get into the fun part of working with data, let’s break down how data science involves more than just statistics, why it’s becoming more important, and the data science process. Data Science vs. Statistics In short, data science is extracting knowledge from data. But how is that different from statistics? Data science encompasses…