Databricks - Architect

Chennai, Bangalore, Hyderabad
Job Description

We are seeking a highly skilled and experienced Databricks Expert/Architect to join our global data services team. This individual will play a key role in designing, implementing, and optimizing Databricks solutions across our organization. The ideal candidate will have deep expertise in Databricks architecture, development, security, and governance, with a proven track record of successfully delivering complex data projects.
  • Proven experience as a Databricks Architect or in a similar role, with a deep understanding of the Databricks platform and its capabilities.
  • Analyze business requirements and translate them into technical specifications for data pipelines, data lakes, and analytical processes on the Databricks platform.
  • Design and architect end-to-end data solutions, including data ingestion, storage, transformation, and presentation layers, to meet business needs and performance requirements.
  • Lead the setup, configuration, and optimization of Databricks clusters, workspaces, and jobs to ensure the platform operates efficiently and meets performance benchmarks.
  • Manage access controls and security configurations to ensure data privacy and compliance.
  • Design and implement data integration processes, ETL workflows, and data pipelines to extract, transform, and load data from various sources into the Databricks platform.
  • Optimize ETL processes to achieve high data quality and reduce latency.
  • Monitor and optimize query performance and overall platform performance to ensure efficient execution of analytical queries and data processing jobs.
  • Identify and resolve performance bottlenecks in the Databricks environment.
  • Establish and enforce best practices, standards, and guidelines for Databricks development, ensuring data quality, consistency, and maintainability.
  • Implement data governance and data lineage processes to ensure data accuracy and traceability.
  • Mentor and train team members on Databricks best practices, features, and capabilities.
  • Conduct knowledge-sharing sessions and workshops to foster a data-driven culture within the organization.
  • Lead Databricks practice technical and partnership initiatives.
  • Build skills in technical areas that support the deployment and integration of Databricks-based solutions to complete customer projects.
Job Requirements
Strong in Unity Catalog, Delta Lake, Databricks Connect (dbConnect), and Databricks API 2.0.
Databricks workflow orchestration, security management, platform governance, and data security.
Must know the new features available in Databricks and their implications, along with possible use cases.
Must have applied architectural principles to design the solution best suited to each problem.
Must be well versed in the Databricks Lakehouse concept and its implementation in enterprise environments.
Must have a strong understanding of data warehousing and the governance and security standards around Databricks.
Must know about cluster optimization and Databricks integration with various cloud services.
Must have a good understanding of creating complex data pipelines.
Must be strong in SQL and Spark SQL.
Must have strong performance optimization skills to improve efficiency and reduce cost.
Must have worked on designing both Batch and streaming data pipelines.
Must have extensive knowledge of Spark and Hive data processing frameworks.
Must have worked on at least one cloud (Azure, AWS, GCP) and its most common services, such as ADLS/S3, ADF/Lambda, Cosmos DB/DynamoDB, ASB/SQS, and cloud databases.
Must be strong in writing unit test cases and integration tests.
Responsible for setting best practices around Databricks CI/CD.
Must understand composable architecture to take full advantage of Databricks capabilities.
Good to have REST API knowledge.
Good to have an understanding of cost distribution.
Good to have experience on a migration project building a unified data platform.
Experience with Databricks SQL endpoints and the Photon engine.
Hands-on experience designing and building Databricks-based solutions on any cloud platform.
Must be very good at designing end-to-end solutions on cloud platforms.
Must have good knowledge of data engineering concepts and the related cloud services.
In-depth, hands-on implementation knowledge of Databricks: Delta Lake, managing Delta tables, Delta Live Tables, Databricks cluster configuration, and cluster policies.
Experience handling structured and unstructured datasets.
Strong proficiency in programming languages such as Python, PySpark, Scala, or SQL.
Experience with cloud platforms like AWS, and an understanding of cloud-based data storage and compute services.
Familiarity with big data technologies like Apache Spark, Hadoop, and data lake architectures.
Develop and maintain data pipelines, ETL workflows, and analytical processes on the Databricks platform.
Should have good experience in data engineering on Databricks, covering both batch and streaming processing.
Should have good experience creating workflows and scheduling pipelines.
Should have good exposure to making packages or libraries available in Databricks.
Familiarity with Databricks default runtimes.
Databricks Certified Data Engineer Associate/Professional Certification (Desirable).
Should have experience working in an Agile methodology.
Strong verbal and written communication skills.
Strong analytical and problem-solving skills with a high attention to detail.