Spark: Key Topics for Data Engineering Interviews Part - 3

Key Topics for Data Engineering Interviews

Pravash
6 min read · May 2, 2025

In this continuation of Key Topics for Data Engineering Interviews Part 2, I will explore some more crucial concepts that not only illuminate the inner workings of Spark but also serve as key markers in Spark interviews.

Whether you’re gearing up for a technical discussion or simply looking to deepen your understanding, this exploration should be a rewarding look into the core of Spark.

Let's get started:

2️⃣1️⃣ Logical Plan vs Physical Plan

Logical Plan:

  • Once the user submits code, Spark creates an unresolved logical plan. Spark then checks this plan against the Catalog to validate the table names and column names it references.
  • Once validation succeeds, Spark creates the resolved logical plan. This plan is passed to the Catalyst optimizer, which optimizes the logical plan.
  • After optimization, it generates the optimized logical plan, which is the logical DAG.
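
The three stages above (unresolved, resolved, optimized) can be sketched with a toy model in plain Python. This is only an illustration of the flow, not Spark's actual implementation: the `CATALOG` dictionary, the `Plan` class, and the `analyze` and `optimize` functions are hypothetical names invented for this sketch, and the single "optimization" shown (dropping duplicate filters) merely stands in for the much richer set of rewrites Catalyst performs.

```python
from dataclasses import dataclass, replace

# Hypothetical catalog for this sketch: table name -> known column names.
CATALOG = {"orders": {"id", "amount", "country"}}


@dataclass(frozen=True)
class Plan:
    table: str
    columns: tuple           # columns to select
    filters: tuple = ()      # (column, operator, value) predicates
    resolved: bool = False   # False = unresolved logical plan


def analyze(plan: Plan) -> Plan:
    """Validate table and column names against the catalog."""
    if plan.table not in CATALOG:
        raise ValueError(f"Table not found: {plan.table}")
    referenced = set(plan.columns) | {col for col, _, _ in plan.filters}
    missing = sorted(referenced - CATALOG[plan.table])
    if missing:
        raise ValueError(f"Cannot resolve columns: {missing}")
    return replace(plan, resolved=True)


def optimize(plan: Plan) -> Plan:
    """Toy rule: drop duplicate filter predicates, keeping first-seen order."""
    if not plan.resolved:
        raise ValueError("Only a resolved plan can be optimized")
    return replace(plan, filters=tuple(dict.fromkeys(plan.filters)))


# Unresolved plan: nothing has been checked against the catalog yet.
unresolved = Plan(
    table="orders",
    columns=("id", "amount"),
    filters=(("country", "=", "IN"), ("country", "=", "IN")),
)
resolved = analyze(unresolved)    # resolved logical plan
optimized = optimize(resolved)    # optimized logical plan
print(optimized)
```

In real Spark you do not build these stages by hand. Calling `df.explain(extended=True)` on a DataFrame prints the parsed, analyzed, and optimized logical plans, followed by the physical plan.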

Physical Plan:

  • Once the optimized logical plan is ready, Spark generates…


Written by Pravash

I am a passionate Data Engineer and Technology Enthusiast. Here I am using this platform to share my knowledge and experience on tech stacks.
