Data Scientist / Applied ML
ConnectGuru
6L – 9L / yr · Contract To Hire · 4 Years · Day Shift · Work From Office · New Delhi
11-50 Employees · Women Friendly
Posted today  
Job highlights
  • 4-6 years Immediate C2H joiners 
Key skills
OCR, AI, LLMs, Real-ESRGAN, Computer Vision, Multimodal AI, DocTR, TrOCR-class models, Golden datasets, Applied Machine Learning, PyTorch

Job description

Job Description – Data Scientist / Applied ML Engineer 

Location: August Kranti Marg, Siri Fort Institutional Area, near Siri Fort Auditorium, New Delhi - 110049 

Experience: 4 – 6 years 

Contract: C2H, 3 months, with the possibility of an extension based on business requirements.

Budget: 75K to 80K per month

Role Overview 

We are looking for a highly skilled Data Scientist / Applied ML Engineer to build and optimize DigiCatalog’s probabilistic AI stack for catalogue intelligence, multilingual understanding, and product attribute extraction. 

The role focuses on selecting, orchestrating, and evaluating open-source AI models across OCR, computer vision, multimodal AI, image enhancement, and Indic language processing. The ideal candidate should have strong expertise in applied CV/NLP systems, model evaluation, pipeline optimization, and practical ML engineering with a strong focus on cost, latency, and quality trade-offs. 

This role emphasizes building efficient, purpose-built AI pipelines rather than relying on large general-purpose models. 

 

Key Responsibilities 

Design and orchestrate AI/ML pipelines using open-source models across:
  • OCR
  • Computer Vision
  • Multimodal AI
  • Indic language processing
  • Image enhancement and correction

Select and optimize models based on:
  • Cost
  • Latency
  • Accuracy
  • Deployment constraints
  • Edge-to-cloud trade-offs

Build probabilistic AI workflows using small specialized models instead of defaulting to large LLMs.  

Work with technologies and frameworks such as:
  • DocTR
  • PaddleOCR
  • TrOCR-class models
  • Grounding DINO
  • OWL-ViT
  • CLIP
  • 2–3B parameter VLMs
  • Real-ESRGAN
  • SDXL-Turbo class models
  • IndicTrans2
  • NLLB
  • IndicWhisper

Build and maintain:
  • Golden datasets
  • Automated evaluation pipelines
  • Regression testing frameworks
  • Model benchmarking systems

Define and monitor evaluation metrics including:
  • Attribute F1 score
  • OCR CER/WER
  • Hallucination rate
  • Per-language performance slices

Perform:
  • Error analysis
  • Threshold tuning
  • Calibration
  • Ensembling
  • Post-processing optimization

Run regression validation on every model replacement or pipeline change.  

Clean, curate, and prepare datasets for training and evaluation workflows.  

Fine-tune models only when justified through measurable cost and quality improvements.  

Continuously evaluate emerging research and rapidly productionize relevant open-source innovations.  

 

Required Skills & Qualifications 

4–6 years of experience in:
  • Applied Machine Learning
  • Computer Vision
  • NLP
  • Multimodal AI systems

Strong hands-on expertise in:
  • PyTorch
  • Python
  • ML pipeline orchestration
  • Model evaluation systems

Experience with:
  • OCR systems
  • Vision-language models
  • Image processing pipelines

About the company
ConnectGuru
Exclusive immediate-hire listing on ZeroNoticePeriod

ConnectGuru is a specialized IT staffing partner, connecting top talent in enterprise applications, cloud, and digital transformation. With a presence in Mumbai and Indore, we focus on speed, integrity, and quality placements.

Mode of interview
CV Screening · Technical Assessment · Client Interview