Applied Research Scientist – Multimodal Modeling
About the Role
Trellion is building the intelligence core of next-generation hiring: not just parsing text, but understanding humans across video, audio, language, and behavior. Our mission is to turn messy human signals into clean, structured insight that empowers better decisions.
This role helps shape our multimodal AI backbone. We are looking for an Applied Research Scientist who can design, train, evaluate, and deploy multimodal models for real-time and offline understanding of interviews. You will drive frontier work combining computer vision, speech processing, NLP, behavioral inference, and temporal modeling into robust, high-signal systems.
Responsibilities
- Design multimodal architectures that combine video, audio, text, and interaction signals
- Build feature representations for gaze, posture, prosody, sentiment, and conversational dynamics
- Train sequence and transformer models on multimodal datasets
- Research and implement state-of-the-art models in vision, speech, and language
- Improve robustness, fairness, and generalizability across diverse users
- Work closely with ML engineering and product teams to integrate models into production
- Publish internal research notes and advance Trellion’s intellectual backbone
You will operate as both a scientist and an engineer. Ideas matter, but so does execution. Your models will power production systems used by real customers.
Requirements
Core Expertise (strong in most of the following):
- Deep learning with PyTorch or TensorFlow
- Experience with transformer architectures
- Strong mathematical grounding in optimization and representation learning
- Ability to build and iterate on multimodal models spanning at least two of: vision, audio, NLP
Vision
- Face and gesture analysis
- Action recognition or temporal CNN/ViT models
- Pose estimation, gaze tracking, or visual attention modeling
Speech & Audio
- Prosody and paralinguistic features
- Voice activity detection
- ASR models and embeddings
Language
- LLMs, text embeddings, and contextual modeling
General Skills
- Dataset curation, alignment, and augmentation
- Experiment design and large-scale training
- Model evaluation and interpretability methods
- Experience with GPUs, distributed training, and model optimization
Nice to have
- Prior published research
- Experience with fairness, bias mitigation, or explainability
- Experience working under real-time inference constraints
What We Care About
- You think clearly and write clearly
- You understand the difference between novelty and usefulness
- You believe models should earn their place in production
- You value signal over complexity
- You can collaborate with engineers, designers, and product teams
- You enjoy building systems that help real humans
Compensation & Benefits
- $160,000–$200,000 CAD base salary
- Equity in a fast-moving AI startup
- Hybrid work setting in Montreal
- Ownership over a critical research track
- A culture of rigor, speed, and autonomy
How to Apply
Send your resume, GitHub, and any research or project links to [email protected]. Strong candidates typically include a portfolio of experiments or publications demonstrating real depth.