DevOps Engineer

Astana, KZ

Nace.AI is a hive for people who see a tough problem and itch to solve it with AI. We mix curiosity, rigor, and play to turn raw research into tools that feel like magic.

Role Overview

As a DevOps Engineer, you will play a critical role in building, scaling, and maintaining the infrastructure that powers our advanced AI systems and enterprise-grade platforms. You will design highly reliable CI/CD pipelines, automate cloud deployments, manage large-scale distributed compute clusters, and ensure that our AI agents, data pipelines, and backend services operate with maximum efficiency, security, and uptime. Your work will directly impact engineering velocity, system resilience, and the seamless delivery of cutting-edge AI capabilities to customers.

What You'll Do

  • Design, build, and maintain scalable DevOps infrastructure, including CI/CD pipelines, automated deployment systems, and monitoring/observability layers.

  • Manage and optimize cloud environments (AWS/GCP/Azure), ensuring high availability, fault tolerance, and cost-efficient resource allocation across compute, storage, and networking layers.

  • Implement infrastructure-as-code (IaC) using tools such as Terraform or CloudFormation, enabling reproducible and automated environment provisioning.

  • Architect and manage containerized and orchestrated environments using Docker, Kubernetes, and modern service mesh technologies.

  • Build and maintain secure, resilient, and compliant infrastructure for LLM workloads, distributed training pipelines, and large-scale inference systems.

  • Develop automated tooling for deployment, scaling, rollback, blue-green and canary releases, configuration management, and environment consistency.

  • Set up robust monitoring, alerting, logging, and observability systems using tools such as Prometheus, Grafana, the ELK stack, Datadog, and OpenTelemetry.

  • Collaborate closely with Software Engineering, AI Infrastructure, and Security teams to support rapid iteration and stable production delivery.

  • Lead incident response, root-cause analysis, post-mortems, and continuous reliability improvements across the entire stack.

Minimum Qualifications

  • Bachelor’s degree in Computer Science, Computer Engineering, Information Systems, or equivalent practical experience.

  • 3+ years of experience as a DevOps, SRE, or Infrastructure Engineer supporting production systems.

  • Hands-on experience with cloud platforms (AWS/GCP/Azure) and container orchestration (Kubernetes), including cluster management and autoscaling.

  • Strong proficiency with infrastructure-as-code tools: Terraform, Helm, or similar.

  • Experience designing, monitoring, and optimizing CI/CD pipelines (GitHub Actions, ArgoCD, etc.).

  • Practical experience with distributed systems, load balancing, networking, and high-availability architectures.

  • Proficiency in one or more scripting/programming languages: Python, Bash, Go, or similar.

Preferred Qualifications

  • Master’s degree in Computer Science, Engineering, or a related technical field.

  • Experience managing high-throughput AI infrastructure, GPU clusters, distributed training systems, or large-scale inference deployments.

  • Expertise in Kubernetes ecosystem tools: ArgoCD, Kustomize, FluxCD, Istio, Linkerd, or similar.

  • Experience implementing advanced monitoring and observability frameworks leveraging OpenTelemetry or custom instrumentation.

  • Background in supporting enterprise-grade AI/ML systems, including data pipelines, vector databases, and real-time inference services.

  • Demonstrated success in improving engineering velocity through DevOps automation, release engineering, and infrastructure scaling.

Why Nace.AI?

Pedigree: Work with a team from top-tier institutions and companies, backed by the best VCs in the world.

Impact: You are joining early enough to shape the culture and the growth roadmap of a company aiming to be the "OS" for professional knowledge.

Competitive Package: Silicon Valley-standard salary, significant equity, and premium benefits.

Apply: send CV and a brief note on a complex audit you led to career@nace.ai.

All rights reserved. Nace.AI © 2026
