T-SQL Queries Vs Spark SQL Queries

Introduction

In this blog post, we will look at some Transact-SQL (T-SQL) queries and their Spark SQL equivalents, with examples.

Some T-SQL vs Spark SQL Keyword Differences
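
As a rough illustration of the kind of keyword differences meant here, the sketch below lists a few well-known T-SQL constructs next to a Spark SQL counterpart. These pairs are not the queries from this post; the table and column names (`sales`, `region`, `customer`) are hypothetical and used only for illustration.

```python
# Illustrative T-SQL -> Spark SQL pairs (feature, T-SQL, Spark SQL).
# Assumption: the table/column names (sales, region, customer) are
# hypothetical and do not come from the original post's queries.
EQUIVALENTS = [
    # T-SQL limits rows with TOP; Spark SQL uses a trailing LIMIT clause.
    ("Limit rows",
     "SELECT TOP 10 * FROM sales;",
     "SELECT * FROM sales LIMIT 10;"),
    # T-SQL ISNULL(a, b) substitutes b for NULL; COALESCE does this in Spark SQL.
    # Spark's one-argument isnull(expr) is a boolean test, not a replacement.
    ("Replace NULL",
     "SELECT ISNULL(region, 'NA') FROM sales;",
     "SELECT COALESCE(region, 'NA') FROM sales;"),
    # Current date and time.
    ("Current timestamp",
     "SELECT GETDATE();",
     "SELECT current_timestamp();"),
    # LEN ignores trailing spaces in T-SQL; Spark's length() counts them.
    ("String length",
     "SELECT LEN(customer) FROM sales;",
     "SELECT length(customer) FROM sales;"),
]


def render(pairs):
    """Return the pairs as plain text, one three-line block per feature."""
    lines = []
    for feature, tsql, spark in pairs:
        lines.append(feature)
        lines.append(f"  T-SQL    : {tsql}")
        lines.append(f"  Spark SQL: {spark}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(EQUIVALENTS))
```

The pairs are close but not always exact equivalents (for example, `LEN` and `length()` treat trailing spaces differently), so results should be compared on real data when migrating.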

T-SQL Queries

The input and output below were captured by running T-SQL queries in SSMS against an Azure Synapse Analytics dedicated SQL pool; the same queries can also be run against a regular Azure SQL Database.

T-SQL Queries input in SSMS
T-SQL Queries output in SSMS

Equivalent Spark SQL Queries

The input and output below were captured by running Spark SQL queries in an Azure Databricks notebook.

Spark SQL Queries input in Databricks Notebook
Spark SQL Queries output in Databricks Notebook

Use case

The T-SQL queries above were originally written for and used in an Azure Synapse Analytics dedicated SQL pool, together with additional business logic. They were later migrated to Azure Databricks as Spark SQL queries to take advantage of Spark's processing power and to address performance concerns.

Benefit

After implementing the above use case, the data processing pipeline ran much faster: the run time dropped from several hours (approx. 7 hours) to minutes (approx. 35 minutes), roughly a 12x reduction.

If you are interested in reusing the T-SQL and Spark SQL code above, visit the AzureStuffs repos.

Summary

In this blog post, we looked at some Spark SQL equivalents of T-SQL queries, along with the use case and the benefit behind the migration, at a high level.
