As a distributed systems engineer at Sieve, you’ll design and engineer systems that handle the compute, scheduling, and orchestration of complex ML + ETL pipelines that need to run quickly, reliably, and cost-effectively on large volumes of video.
Commitments
In-person at our SF HQ
Breakfast, Lunch, and Dinner covered and your choice of snacks
Requirements
3+ years of experience building foundational data infrastructure
Proficient in working across diverse cloud architectures
Designed and maintained pipelines that process petabytes of data
Developed robust CI/CD pipelines tailored for ML-focused teams
Strong coding experience with Go and Python; experience with Rust is a plus
Operates as an IC who leads by example
Experience with large-scale video data systems
Preferred Skills
Strong coding experience with Go and Python; experience with Rust is a plus
About Us
Sieve is the only AI research lab exclusively focused on video data. We combine exabyte-scale video infrastructure, novel video understanding techniques, and dozens of data sources to develop datasets that push the frontier of video modeling. Video makes up 80% of internet traffic and has become the enabling digital medium powering creativity, communication, gaming, AR/VR, and robotics. Sieve exists to solve the biggest bottleneck in the growth of these applications: high-quality training data. We've partnered with top AI labs and did $XXM last quarter alone, as a team of just 15 people. We also raised our Series A last year from Tier 1 firms such as Matrix Partners, Swift Ventures, Y Combinator, and AI Grant.
About The Role
As a distributed systems engineer at Sieve, you’ll design and engineer systems that handle the compute, scheduling, and orchestration of complex ML + ETL pipelines that need to run quickly, reliably, and cost-effectively on large volumes of video. You’re likely a good fit if you have worked with cloud technologies and love optimizing for system uptime, optimizing hyper-fast distributed systems at the scale of thousands of GPUs, and building great internal tooling and CI/CD for rapid iteration.
Requirements
3+ years of experience building foundational data infrastructure
Proficient in working across diverse cloud architectures
Designed and maintained pipelines that process petabytes of data
Developed robust CI/CD pipelines tailored for ML-focused teams
Strong coding experience with Go and Python; experience with Rust is a plus
Operates as an IC who leads by example
Experience with large-scale video data systems
In-person at our SF HQ
Benefits
401k + Full Health Insurance
Breakfast, Lunch, and Dinner covered and your choice of snacks
Ubers covered home