Azure Databricks Series - Part 6
Batch and Streaming Pipelines

One of the best features of Databricks is that it eases the transition from batch pipelines to streaming pipelines. Depending on your transformation logic, it may be as easy as changing two lines of code. In this post, we will review some batch and streaming concepts and show how to write a notebook that can be run…

Databricks Batch and Streaming

Databricks is a leading big data processing service available on Azure. One of the nicest features in Databricks is the ability to develop workloads that run in batch or streaming modes. In this webinar, we’ll review the basics of a notebook running in batch mode. In this example, we’ll be focusing on Delta Tables for…

Azure Databricks Series - Part 5
Databricks Jobs

Today’s post covers Databricks Jobs. Spark also has its own concept of a job, which is covered briefly to avoid confusion. Spark Job: When running a Spark application, there is the concept of a Spark job. At runtime, the Spark driver converts your Spark application into a job that is transformed…

Azure Databricks Series - Part 4
Databricks Clusters

This is a continuation of my series of posts on Databricks, where we most recently reviewed the Workspace & Notebooks. Now let’s get more familiar with the concept of clusters. Clusters: Databricks breaks clusters into multiple categories (All-Purpose Clusters, Job Clusters, and Pools). Spark clusters consist of a single driver node and multiple worker nodes. The…

Migrating Data Workloads from SQL to Azure Synapse

You’ve heard about the advances in the Azure Synapse Analytics service. Microsoft has brought together data integration, enterprise data warehousing, and big data analytics. In this webinar, we’ll review the fundamentals of Azure Synapse Analytics and why it may be the right choice for you. Both serverless and dedicated resource consumption models will be covered.…

Azure Databricks Series - Part 3
Workspaces & Notebooks

Now that you’ve instantiated the Databricks service within Azure, let’s take a tour of the workspace & become familiar with Notebooks. Workspace: The above image shows the Databricks homepage of this workspace. The left menu provides the majority of your options (outside of administration). Clicking on Workspace expands to the following: The workspace is divided…

Azure Databricks Series - Part 1
Intro to Azure Databricks

Many companies today have aging data architectures. As you look to modernize your traditional ETL pipeline, there is a tool you should keep in mind: Azure Databricks. During your move into Azure, there will probably be a place for Azure Databricks. In the past, general DTS/SSIS pipelines and SQL Server engines were sufficient, but with…

Azure Resources: Saving Time And Money

Money is always important, whether you are developing for yourself, your employer, or a client. One way to save money in Azure is to shut down resources at the end of the day. For an Azure data engineer, this can be a tedious task that is easy to forget. There can be a lot of…