Wednesday, 30 April 2025

C2C-Remote Data Architect (18495-1) - AI/ML, Databricks Lakehouse architecture (Delta Lake, Unity Catalog, Photon Engine), Analytics, Data Governance, Security, Data Integration, Data Model, ETL Pipeline, Cloud

Hi,

I hope this email finds you well.



My name is Yashasvi Hasija and I am a Technical Recruiter at Empower
Professionals Inc. I came across your profile and wanted to reach out
regarding a remote Data Architect (18495-1) role with one of our
clients. Please let me know if you are available in the job market and
interested in this role (see job description below) - if so, we can
connect and speak further.




If you have any suitable profiles, please share their updated resumes
along with each candidate's location, work authorization, and expected
rate so that we can proceed.




Role: Data Architect (18495-1)
Location: Remote
Duration: 12+ Months

Must: Data Architect, AI/ML, Databricks Lakehouse architecture (Delta
Lake, Unity Catalog, Photon Engine), Analytics, Data Governance,
Security, Data Integration, Data Model, ETL Pipeline, Cloud
(AWS/Azure/GCP)

Note: Candidate must have hands-on experience in Databricks.



Requirements:

Data Strategy & Architecture Development

- Define and implement the data architecture and data strategy aligned with business goals.
- Design scalable, cost-effective, and high-performance data solutions using Databricks on AWS, Azure, or GCP.
- Establish best practices for Lakehouse Architecture and Delta Lake for optimized data storage, processing, and analytics.



Data Engineering & Integration

- Architect ETL/ELT pipelines leveraging Databricks Spark, Delta Live Tables, and Databricks Workflows.
- Optimize data ingestion from sources like Oracle Fusion Middleware, webMethods, MuleSoft, and Informatica into Databricks.
- Ensure real-time and batch data processing with Apache Spark and Delta Lake.
- Work on data integration strategies, ensuring seamless connectivity with enterprise systems (e.g., Salesforce, SAP, ERP, CRM).



Data Governance, Security & Compliance

- Implement data governance frameworks leveraging Unity Catalog for data lineage, metadata management, and access control.
- Ensure compliance with HIPAA, GDPR, and other regulatory standards in life sciences.
- Define RBAC (Role-Based Access Control) and enforce data security best practices using Databricks SQL and access policies.
- Enable data stewardship and ensure data cataloging for self-service data democratization.



Performance Optimization & Cost Management

- Optimize Databricks compute clusters (DBU usage) for cost efficiency and performance tuning.
- Define and implement query optimization techniques using the Photon Engine, Adaptive Query Execution (AQE), and caching strategies.
- Monitor Databricks workspace health, job performance, and cost analytics.



AI/ML Enablement & Advanced Analytics

- Design and support ML pipelines leveraging Databricks MLflow for model tracking and deployment.
- Enable AI-driven analytics in genomics, drug discovery, and clinical data processing.
- Collaborate with data scientists to operationalize AI/ML models in Databricks.



Collaboration & Stakeholder Alignment

- Work with business teams, data engineers, AI/ML teams, and IT leadership to align data strategy with enterprise goals.
- Collaborate with platform vendors (Databricks, AWS, Azure, GCP, Informatica, Oracle, MuleSoft) for solution architecture and support.
- Provide technical leadership, conduct PoCs, and drive Databricks adoption across the organization.



Data Democratization & Self-Service Enablement

- Implement data-sharing frameworks for self-service analytics using Databricks SQL and BI integrations (Power BI, Tableau).
- Promote data literacy and empower business users with self-service analytics.
- Establish data lineage and cataloging to improve data discoverability and governance.



Migration & Modernization

- Lead the migration of legacy data platforms (Informatica, Oracle, Hadoop, etc.) to the Databricks Lakehouse.
- Design a roadmap for cloud modernization, ensuring seamless data transition with minimal disruption.



Mandatory Key Skills:

Databricks & Spark Expertise

- Strong knowledge of Databricks Lakehouse architecture (Delta Lake, Unity Catalog, Photon Engine).
- Expertise in Apache Spark (PySpark, Scala, SQL) for large-scale data processing.
- Experience with Databricks SQL and Delta Live Tables (DLT) for real-time and batch processing.
- Understanding of Databricks Workflows, Job Clusters, and Task Orchestration.



Cloud & Infrastructure Knowledge

- Hands-on experience with Databricks on AWS, Azure, or GCP (AWS Databricks preferred).
- Strong understanding of cloud storage (ADLS, S3, GCS) and cloud networking (VPC, IAM, Private Link).
- Experience with Infrastructure as Code (Terraform, ARM, CloudFormation) for Databricks setup.



Data Modeling & Architecture

- Expertise in data modeling (Dimensional, Star Schema, Snowflake, Data Vault).
- Experience with Lakehouse, Data Mesh, and Data Fabric architectures.
- Knowledge of data partitioning, indexing, caching, and query optimization.



ETL/ELT & Data Integration

- Experience designing scalable ETL/ELT pipelines using Databricks, Informatica, MuleSoft, or Apache NiFi.
- Strong knowledge of batch and streaming ingestion (Kafka, Kinesis, Event Hubs, Auto Loader).
- Expertise in Delta Lake and Change Data Capture (CDC) for real-time updates.



Data Governance & Security

- Deep understanding of Unity Catalog, RBAC, and ABAC for data access control.
- Experience with data lineage, metadata management, and compliance (HIPAA, GDPR, SOC 2).
- Strong skills in data encryption, masking, and role-based access control (RBAC).



Performance Optimization & Cost Management

- Ability to optimize Databricks clusters (DBU usage, Auto Scaling, Photon Engine) for cost efficiency.
- Knowledge of query tuning, caching, and performance profiling.
- Experience monitoring Databricks job performance using Ganglia, CloudWatch, or Azure Monitor.



AI/ML & Advanced Analytics

- Experience integrating Databricks MLflow for model tracking and deployment.
- Knowledge of AI-driven analytics, genomics, and drug discovery in life sciences.







Awaiting your quick response. Thanks!







Thanks

Yashasvi Hasija

Technical Recruiter | Empower Professionals

......................................................................................................................................

Yashasvi@empowerprofessionals.com |

LinkedIn: linkedin.com/in/yashasvi-hasija-6a745625b

100 Franklin Square Drive – Suite 104 | Somerset, NJ 08873

www.empowerprofessionals.com

Certified NJ and NY Minority Business Enterprise (NMSDC)

Empower Professionals firmly opposes e-mail "spamming". We apologize to
those who do not wish to receive this e-mail and also to those who have
accidentally received it again. Please reply with "REMOVE" in the
subject listing, with all aliases email addresses that you would want
removed and any inconvenience caused is highly regretted. We appreciate
your patience and cooperation. This e-mail and any files transmitted
with it are for the sole use of the intended recipient(s) and may
contain confidential and privileged information. If you are not the
intended recipient(s), please reply to the sender and destroy all copies
of the original message. Any unauthorized review, use, disclosure,
dissemination, forwarding, printing or copying of this email, and/or any
action taken in reliance on the contents of this e-mail is strictly
prohibited and may be unlawful.











