Friday, 11 April 2025

Urgent Need: Senior Data Scientist with VLM (Vision-Language Models) Experience at San Jose, CA & Waukesha, WI (Onsite)


Hi,

 

This role requires 10+ years of experience.

 

An early response would be greatly appreciated.

 

Job Title: Senior Data Scientist with VLM (Vision-Language Models) Experience

Location: San Jose, CA & Waukesha, WI (Onsite)

Duration: 12 Months

 

Job Description:

We are seeking a Senior Data Scientist with expertise in Vision-Language Models (VLMs) and related technologies to lead the development of efficient, cost-effective multimodal AI solutions. The ideal candidate will have experience with advanced VLM frameworks such as VILA, Isaac, and VSS, and a proven track record of implementing production-grade VLMs for training and testing in real-world environments. A background in healthcare, particularly medical devices, is highly desirable. This role will focus on exploring and deploying state-of-the-art VLM methodologies on cloud platforms such as AWS or Azure.

 

Responsibilities:

VLM Development & Deployment:

  • Design, train, and deploy efficient Vision-Language Models (e.g., VILA, Isaac Sim) for multimodal applications.
  • Explore cost-effective methods such as knowledge distillation, modal-adaptive pruning, and LoRA fine-tuning to optimize training and inference.
  • Implement scalable pipelines for training/testing VLMs on cloud platforms (AWS SageMaker, Azure ML).

Multimodal AI Solutions:

  • Develop solutions that integrate vision and language capabilities for applications like image-text matching, visual question answering (VQA), and document data extraction.
  • Leverage interleaved image-text datasets and advanced techniques (e.g., cross-attention layers) to enhance model performance.

Healthcare Domain Expertise:

  • Apply VLMs to healthcare-specific use cases such as medical imaging analysis, position detection, motion detection, and measurement.
  • Ensure compliance with healthcare standards while handling sensitive data.

Efficiency Optimization:

  • Evaluate trade-offs between model size, performance, and cost using techniques like elastic visual encoders or lightweight architectures.
  • Benchmark different VLMs (e.g., GPT-4V, Claude 3.5) for accuracy, speed, and cost-effectiveness on specific tasks.

 

Educational Qualifications:

Master’s or Ph.D. in Computer Science, Data Science, Machine Learning, or a related field.

 

Skills:

Mandatory Skills:

Experience:

  • 10+ years of experience in machine learning or data science roles with a focus on vision-language models.
  • Proven expertise in deploying production-grade multimodal AI solutions.
  • Experience in healthcare or medical devices is highly preferred.

 

Technical Skills:

  • Proficiency in Python and ML frameworks (e.g., PyTorch, TensorFlow).
  • Hands-on experience with VLMs such as VILA, Isaac Sim, or VSS.
  • Familiarity with cloud platforms like AWS SageMaker or Azure ML Studio for scalable AI deployment.

 

Domain Knowledge:

  • Understanding of medical datasets (e.g., imaging data) and healthcare regulations.

 

Soft Skills:

  • Strong problem-solving skills with the ability to optimize models for real-world constraints.
  • Excellent communication skills to explain technical concepts to diverse stakeholders.

 

Good to Have Skills:

  • Vision-Language Models: VILA, Isaac Sim, EfficientVLM
  • Cloud Platforms: AWS SageMaker, Azure ML
  • Optimization Techniques: LoRA fine-tuning, modal-adaptive pruning
  • Multimodal Techniques: Cross-attention layers, interleaved image-text datasets
  • MLOps Tools: Docker, MLflow

 

Thanks & Regards,

Suresh Kumar Reddy Kandula

Lead - US IT

E: Sureshr@tekskillsinc.com | P: 732-847-0934

YOUR IT CONDUIT

INDIA | USA | CANADA | UK | AUSTRALIA

www.tekskillsinc.com | Follow us on LinkedIn 

ISO 9001:2015 | Appraised at CMM Level 3 | WMBE Certified Company

 

 
