Principal AI Scientist – Model Optimization
Join us! 🚀
We usually respond within three days
Join FlexAI:
FlexAI is at the forefront of revolutionizing AI computing by reengineering infrastructure at the system level. Our groundbreaking architecture, combined with sophisticated software intelligence, abstraction, and an orchestration layer, allows developers to leverage a diverse array of compute, resulting in efficient, more reliable computing at a fraction of the cost. We are seeking a skilled and experienced Principal AI Scientist.
Founded by Brijesh Tripathi and Dali Kilani, who bring experience from Nvidia, Apple, Tesla, Intel, Lifen, and Zoox, FlexAI is not just building a product – we’re shaping the future of AI. Our teams are strategically distributed across Paris, Silicon Valley, and Bangalore, united by a shared mission: to deliver more compute with less complexity.
If you're passionate about shaping the future of artificial intelligence, driving innovation, and contributing to a sustainable and inclusive AI ecosystem, FlexAI is the place for you!
Position Overview:
We are looking for a Principal AI Scientist with deep technical expertise in model optimization to spearhead the development of high-performance AI models derived from both open-source architectures and customer-proprietary models. You’ll lead initiatives focused on improving model efficiency, latency, and throughput across diverse hardware and deployment environments.
This role is ideal for a hands-on leader who thrives at the intersection of cutting-edge AI, systems engineering, and real-world scalability.
What you’ll do:
Model Development and Optimization:
Take ownership of adapting, compressing, and optimizing open-source or proprietary models to meet specific performance goals—such as speed, accuracy, and resource efficiency—across various edge and cloud environments.
Tech Evaluation & Customization:
Evaluate and benchmark open-source LLMs, CV, and multimodal models. Customize architectures to meet customer use-case requirements, using techniques such as quantization, pruning, distillation, and architecture search.
Hardware-Aware Optimization:
Work closely with performance engineers to tailor models for GPUs, CPUs, and emerging accelerators (e.g., AMD, ARM, or edge chips).
Customer Collaboration:
Partner with strategic customers to understand application needs and co-develop custom model optimization workflows and pipelines.
Cross-Functional Leadership:
Collaborate with product, engineering, MLOps, and systems teams to ensure end-to-end delivery of robust and production-grade models.
Innovation & Technical Strategy:
Stay ahead of AI optimization trends, propose new approaches, and guide architectural decisions around tooling, frameworks, and methodology.
What you’ll need to be successful:
Master’s or Ph.D. in Computer Science, Machine Learning, or related technical field.
8–10+ years in machine learning, AI R&D, or systems optimization roles.
Deep experience with model optimization techniques: quantization (INT8/FP16), pruning, distillation, NAS, etc.
Proven track record of working with LLMs, vision models, or multimodal architectures.
Experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and optimization tools (e.g., ONNX, TVM, TensorRT, Hugging Face Optimum).
Strong software engineering skills in Python and/or C++.
Demonstrated ability to translate AI research into high-performance production systems.
Excellent communication skills and a customer-first mindset.
What we offer:
- A competitive salary and benefits package, tailored to recognize your dedication and contributions.
- The opportunity to collaborate with leading experts in AI and cloud computing, learning from the best and the brightest, fostering continuous growth.
- An environment that values innovation, collaboration, and mutual respect.
- Support for personal and professional development, empowering you with the tools and resources to elevate your skills and leave a lasting impact.
- A pivotal role in the AI revolution, shaping the technologies that power the innovations of tomorrow.
Offices:
Our teams are strategically distributed across three continents—Europe, North America, and Asia—united by a shared mission: to deliver more compute with less complexity.
- Paris - HQ
- San Francisco (Bay Area) - US office
- Bangalore - India office
Apply NOW!
You’ve seen what this role entails. Now we want to hear from you! Does this opportunity align with your aspirations? If you’re even slightly curious, we encourage you to apply – it could be the start of something extraordinary!
At FlexAI, we believe diverse teams are the most innovative teams. We’re committed to creating an inclusive environment where everyone feels valued, and we proudly offer equal opportunities regardless of gender, sexual orientation, origin, disabilities, veteran status, or any other facets of your identity that make you uniquely you.
- Department: R&D SW
- Locations: San Francisco (Bay Area)
- Remote status: Hybrid
- Employment type: Full-time