Azure Databricks Series - Part 5
Databricks Jobs

Today’s post will cover Databricks Jobs. There is also the concept of a Spark job, which will be covered briefly to avoid confusion. Spark Job: When running a Spark application, there is the concept of a Spark job. At runtime, the Spark driver converts your Spark application into a job that is transformed…

Azure Databricks Series - Part 4
Databricks Clusters

This is a continuation of my series of posts on Databricks, where we most recently reviewed the Workspace & Notebooks. Now let’s get more familiar with the concept of clusters. Clusters: Databricks breaks clusters into multiple categories: All-Purpose Clusters, Job Clusters, and Pools. Spark clusters consist of a single driver node and multiple worker nodes. The…

Azure Databricks Series - Part 1
Intro to Azure Databricks

Many companies today have aging data architectures. As you look to modernize your traditional ETL pipeline, there is a tool you should keep in mind: Azure Databricks. During your move into Azure, there will probably be a place for Azure Databricks. In the past, general DTS/SSIS pipelines and SQL Server engines were sufficient, but with…

Azure Resources: Saving Time And Money

Money is always important, whether you are developing for yourself, your employer, or a client. One way to save money in Azure is to shut down resources at the end of the day. For an Azure data engineer, this can be a tedious task that is easy to forget. There can be a lot of…

Visualizing Data with Power BI

In this webinar, we’ll introduce Power BI to give you a lightning-fast look into your data with more insight than you could ever have imagined.

An Overview of Azure Databricks

With the announcement of the general availability of Azure Databricks, we’ll take this opportunity to get a brief feel for what Azure Databricks is and what it can do. What is Databricks? Databricks is a data solution that sits on top of Apache Spark to help accelerate a business’ data analytics side…

Using the Cognitive Services Text Analytics API: Detecting Languages

Microsoft offers a lot of fascinating APIs for building intelligent applications through its Cognitive Services. Among those services is the Text Analytics API. This API offers a wide range of valuable text-based functionality such as sentiment analysis and key phrase extraction. With these useful APIs available, what could be a better means of…

Beginning Statistics for Data Science: Analyzing Data

In our last post, we discussed the different types your data can have. Now let’s focus on how to analyze those types of data. Python code will be used to demonstrate a few of these concepts. To get things started with the Python code, let’s go ahead and import our packages and review…

Getting Quick Insights on Sales Data with Power BI

To finish off getting insights from a sales data set, we’re going to look at using Microsoft’s Power BI. Power BI is a very helpful tool for looking at our data through visualizations. The insights will be the same ones we got in our earlier visualization post, but using Power BI we get these visualizations quicker and…

Beginning Statistics for Data Science: Types of Data

Statistics is becoming a must-learn topic for anyone looking to get into data science. Look at any data scientist job posting, and you will be hard-pressed to find a listing that does not mention a degree in statistics, mathematics, or some experience in analytics as a minimum qualification. Courses in data science are including…

Creating a Machine Learning Web API with Flask

In our previous post, we went over how to create a simple linear regression model with scikit-learn and how to use it to make predictions. But that’s not very useful for anyone other than the creator of the model, since it’s only available on their machine. In this post, we’ll go over how to use…