AI Cluster Network Architect
Join us!
We usually respond within a day
Join FlexAI:
FlexAI is at the forefront of revolutionizing AI computing by reengineering infrastructure at the system level. Our groundbreaking architecture, combined with sophisticated software intelligence, abstraction, and an orchestration layer, allows developers to leverage a diverse array of compute, resulting in efficient, more reliable computing at a fraction of the cost.
The rapid evolution of machine intelligence has created a need for a new system architecture capable of handling high memory capacity and bandwidth. These are critical bottlenecks in pushing machine intelligence to the next level, where compute demand is expected to increase up to 1000 times current levels.
FlexAI has pioneered a groundbreaking solution to tackle these memory challenges. Our innovative compute architecture balances memory bandwidth, capacity, and compute density to maximize utilization of system resources. This architecture is the cornerstone of our datacenter-in-a-box concept, which is enabled by our universal AI compute cloud service. Our hardware solutions are built for seamless deployment with our own AI cloud offerings and other cloud service providers worldwide, setting new standards in performance and efficiency.
We are seeking a skilled and experienced AI Cluster Network Architect to design, optimize, and scale our AI infrastructure, driving the development and efficiency of our groundbreaking compute architecture.
Our innovative compute architecture, coupled with sophisticated software intelligence and orchestration, allows developers to leverage a diverse array of compute, resulting in efficient, more reliable computing at a fraction of the cost. This architecture ensures a well-balanced distribution of memory bandwidth, capacity, and compute density, forming the backbone of our datacenter-in-a-box concept. Enabled by our universal AI compute cloud service, our hardware solutions set new benchmarks in performance and efficiency, seamlessly integrating with our AI cloud offerings and other cloud service providers worldwide.
Position Overview:
As the AI Cluster Network Architect, you will be responsible for designing and implementing high-performance network architectures that support AI clusters, ensuring efficient communication between nodes, optimized data flow, and scalability. This role requires a deep understanding of AI workloads, networking protocols, and distributed computing. You will work closely with AI researchers, data scientists, and infrastructure teams to deliver networking solutions that meet the demands of advanced AI systems.
Success at FlexAI requires an entrepreneurial spirit and startup mindset: the ability to rapidly iterate and make meaningful progress while staying focused on our mission to deliver more compute with less complexity. Your proven expertise in cultivating influence, aligning diverse stakeholders, and driving efficient operations, while fostering a supportive environment through mentorship and thoughtful leadership of a growing team, will be critical to your success.
What you'll do:
Design and implement high-performance network architectures for AI clusters, ensuring low-latency, high-throughput data communication between nodes.
Collaborate with AI engineers, data scientists, and IT teams to understand AI workloads and optimize network infrastructure accordingly.
Architect scalable and fault-tolerant networking solutions that can handle massive datasets and distributed AI computations.
Evaluate and integrate advanced networking technologies, including RDMA (Remote Direct Memory Access), InfiniBand, and high-speed Ethernet.
Ensure that network architectures are optimized for AI-specific workloads, such as deep learning, machine learning, and data-intensive computing.
Monitor and troubleshoot network performance issues, implementing solutions to maintain optimal performance across AI clusters.
Work closely with hardware engineers and datacenter teams to design and implement networking infrastructure that supports AI training and inference.
Drive continuous improvement in network design, focusing on scalability, security, and performance.
Stay updated on the latest trends and innovations in AI networking, contributing to the development of best practices and new technologies.
Model inclusive behaviors and contribute to a culture that values and respects different backgrounds and perspectives.
What you'll need to be successful:
Bachelor's or Master's degree in Computer Science, Network Engineering, or a related field. Advanced degrees are a plus.
8+ years of network architecture experience, focusing on high-performance computing (HPC) or AI infrastructure.
Proven experience designing and implementing large-scale network architectures for AI clusters or distributed computing environments.
Deep knowledge of networking protocols, including TCP/IP, RDMA, InfiniBand, and high-speed Ethernet.
Experience with AI workloads, including deep learning and machine learning, and their impact on network design.
Strong understanding of network security best practices and strategies for protecting AI infrastructure.
Hands-on experience with network monitoring tools and techniques for performance optimization and troubleshooting.
Ability to collaborate effectively with cross-functional teams, including AI engineers, data scientists, and infrastructure teams.
Strong problem-solving skills and a data-driven approach to decision-making.
Preferred skills:
Experience with containerized environments (e.g., Kubernetes) and their networking challenges in AI clusters.
Knowledge of cloud-based AI infrastructure and hybrid cloud networking solutions.
Familiarity with network simulation and modeling tools for AI workloads.
Experience with emerging networking technologies, such as Software-Defined Networking (SDN) and Network Function Virtualization (NFV).
What we offer:
- A competitive salary and benefits package, tailored to recognize your dedication and contributions.
- The opportunity to collaborate with leading experts in AI and cloud computing, learning from the best and the brightest, fostering continuous growth.
- An environment that values innovation, collaboration, and mutual respect.
- Support for personal and professional development, empowering you with the tools and resources to elevate your skills and leave a lasting impact.
- A pivotal role in the AI revolution, shaping the technologies that power the innovations of tomorrow.
About FlexAI:
Founded by Brijesh Tripathi and Dali Kilani, who bring experience from Nvidia, Apple, Tesla, Intel, Lifen, and Zoox, FlexAI is not just building a product; we're shaping the future of AI.
Offices:
Our teams are strategically distributed across three continents (Europe, North America, and Asia), united by a shared mission: to deliver more compute with less complexity.
- Paris - HQ
- San Francisco (Bay Area) - US office
- Bangalore - India office
Apply NOW!
You've seen what this role entails. Now we want to hear from you! Does this opportunity align with your aspirations? If you're even slightly curious, we encourage you to apply; it could be the start of something extraordinary!
At FlexAI, we believe diverse teams are the most innovative teams. We're committed to creating an inclusive environment where everyone feels valued, and we proudly offer equal opportunities regardless of gender, sexual orientation, origin, disabilities, veteran status, or any other facets of your identity that make you uniquely you.
- Department: R&D HW
- Locations: San Francisco (Bay Area)
- Remote status: Hybrid
- Employment type: Full-time