DevOps Engineer (Full-time)
Company: Strong Compute Corporation
Location: San Francisco
Posted on: June 1, 2025
Job Description:
AI workloads are brutal: petabytes of data, distributed jobs, and
real-time GPU orchestration. We're building an AI-first DevOps
infrastructure that makes compute reliable, scalable, and
cost-effective. If you love infrastructure automation, cloud-native
engineering, and AI performance tuning, you'll love this role.
What you'll do
- Design and manage scalable, fault-tolerant AI compute
infrastructure
- Automate GPU provisioning, multi-cloud scheduling, and scaling
strategies
- Improve observability, logging, and monitoring for real-time AI
workloads
- Optimize containerized deployments for Kubernetes, Nomad, or
Slurm
- Enhance security, CI/CD, and cloud networking for
high-performance distributed training
- Implement security best practices for DevOps pipelines,
including secrets management, infrastructure security, and
compliance automation
- Reduce infrastructure cost and maximize performance through
automation and tuning
What we're looking for
- Deep knowledge of CI/CD pipelines and infrastructure as
code
- Hands-on experience with monitoring and logging tools
(Prometheus, Grafana, OpenTelemetry)
- Proficiency in shell scripting, Python, or Go for
automation
- Experience with security best practices for cloud environments,
including IAM, container security, and incident response
Nice to haves:
- Experience managing large-scale clusters (with Kubernetes or
other orchestrators) and cloud infrastructure
- Experience with Terraform, Ansible, Helm, or Pulumi
- Understanding of AI/ML compute environments (GPUs, CUDA, NCCL,
Slurm, Horovod)
Our culture
- We move fast. We ship weekly: new features, improvements, and
fixes go live fast.
- We test big. Every month, we stress test with large groups of
users face to face, get real-world feedback, and iterate rapidly.
- We build together. On site only, in SF or Sydney.
- We iterate relentlessly. Direct user feedback shapes our roadmap:
we release, test, refine, and keep moving.
- We travel when needed. Engineers may travel between SF and Sydney
to run events and meet with clients.
Location: SF or Sydney (OG startup house vibe, great food, late
nights, all the GPUs)
Equipment & Benefits:
- Top-spec MacBook + separate GPU cluster dev environments for each
engineer.
- Weekly cash bonus when you work out 3+ times a week.
- Comprehensive health benefits, including a choice of Kaiser,
Aetna OAMC, and HDHP (HSA-eligible) plans for our SF-based team
members.
- Highest-in-the-world 20-year exercise window for options.
Don't have all the skills? Apply anyway! We're looking for people
who move fast, learn fast, and ship fast. If that's you, let's talk.
Want to get to know us first? Attend one of our .