About the role
You'll work at the core of LambdaQ: training the foundation models that power every Vikasit product. From data curation and tokenization to large-scale distributed training and post-training, you'll push our models toward the frontier while keeping inference economics sane.
What you'll do
- Own parts of the pretraining pipeline: data, architecture, or training infrastructure
- Run and analyze large-scale training runs on multi-node GPU clusters
- Improve data quality, mixtures, and curricula for Indian and global use
- Design experiments that move benchmark and downstream quality measurably
What we're looking for
- Strong PyTorch and distributed-training experience (FSDP / DeepSpeed / Megatron)
- Solid grasp of transformer internals, optimization, and scaling laws
- Experience training models at 1B+ scale, or equivalent research depth
Nice to have
- MoE training experience
- Tokenizer / data-pipeline work
- Publications at top ML venues
Sound like you?
We hire for skill over credentials. Tell us why you're a fit — links and projects welcome.