External Skills and Expertise:
· 5–10 years of total IT experience, including at least 3 years in big data engineering on Microsoft Azure.
· Strong SQL expertise with experience in writing, optimizing, and troubleshooting complex queries on Azure SQL, Synapse, or similar cloud databases.
· Solid understanding of Spark architecture and its core APIs (RDD, DataFrame, and Dataset).
· Strong understanding of the Databricks ecosystem, including Notebooks, Workflows, Unity Catalog, SQL Warehouses, serverless compute, and the latest Databricks features.
· Proven expertise in designing and developing scalable, high-performance data pipelines and automated workflows in Azure Databricks using PySpark and Spark SQL.
· Experience designing, building, and orchestrating complex, parameterized pipelines in Azure Data Factory.
· Familiarity with both batch and streaming data processing.
· Solid understanding of data-modelling techniques (dimensional, 3NF) and data warehousing concepts.
· Experience delivering at least one end-to-end Data Lakehouse solution in Azure using the Medallion Architecture.
· Knowledge of common file and table formats such as Delta Lake, Avro, Parquet, JSON, and CSV.
· Advanced programming, unit-testing, and debugging skills in Python, PySpark, and SQL.
· Knowledge of data security and lifecycle management policies across Azure environments.
· Familiarity with DevOps practices (CI/CD, Git, automated deployments).
· Collaborative mindset, with enthusiasm for working with stakeholders across the organization and taking ownership of deliverables.
Good to Have:
· Exposure to developing LLM/Generative AI-powered applications.
· Knowledge of NoSQL databases.
· Experience supporting BI and Data Science teams in consuming data in a secure, governed manner.
· Relevant Microsoft Azure or Databricks certifications are a valuable addition.