Job Description
Nexus Horizon is at the forefront of the artificial intelligence revolution, pioneering the systems that will define the future of work. We are seeking a visionary Senior AI Infrastructure Engineer to architect the scalable backbone for our next-generation large language models and autonomous agents.
In this pivotal role, you will bridge the gap between cutting-edge research and production-grade deployment, ensuring our AI solutions are robust, efficient, and ready for the demands of 2026 and beyond. You will work in a dynamic environment where innovation is not just encouraged but expected.
Key Highlights:
- Impact: Directly influence the architecture of systems that serve millions of users.
- Culture: Join a diverse team of engineers, researchers, and ethicists committed to responsible AI.
- Compensation: Competitive salary, equity package, and comprehensive benefits.
Responsibilities
- Design and implement high-throughput, low-latency inference pipelines for large-scale LLMs.
- Optimize model-serving architectures to sustain millions of concurrent requests at low latency.
- Collaborate closely with ML researchers to translate experimental models into production-ready services.
- Implement robust monitoring, logging, and observability stacks (e.g., Prometheus, Grafana, ELK) for complex AI workloads.
- Ensure compliance with strict data privacy, security, and ethical AI deployment standards.
- Drive technical strategy for cloud infrastructure scaling and multi-region cost optimization.
- Mentor junior engineers and establish best practices for MLOps and DevOps within the AI team.
Qualifications
- 5+ years of experience in software engineering, with a specific focus on machine learning infrastructure.
- Deep proficiency in Python, TensorFlow, PyTorch, and CUDA.
- Strong experience with cloud platforms (AWS, GCP, or Azure) and containerization (Kubernetes, Docker).
- Experience with MLOps tools such as MLflow, Kubeflow, or Airflow.
- Strong understanding of distributed systems, message queues, and microservices architecture.
- Excellent problem-solving skills and the ability to thrive in a fast-paced, agile startup environment.