
Senior Software Engineer, Data Pipelines

Ginkgo Bioworks · Boston, MA (via LinkedIn)
Posted April 5, 2026
Requirements
  • Ability to obtain and maintain a U.S. security clearance per business requirements, including willingness to undergo a background investigation and meet eligibility requirements
  • 7+ years of professional experience in data or software engineering, with a focus on building production-grade data products and scalable architectures
  • Expert proficiency with SQL for complex transformations, performance tuning, and query optimization
  • Strong Python skills for data engineering workflows, including pipeline development, ETL/ELT processes, and data processing; experience with backend frameworks (FastAPI, Flask) for API development; focus on writing modular, testable, and reusable code
  • Proven experience with dbt for data modeling and transformation, including testing frameworks and documentation practices
  • Hands-on experience with cloud data warehouses (Snowflake, BigQuery, or Redshift), including performance tuning, security hardening, and managing complex schemas
  • Experience with workflow orchestration tools (Airflow, Dagster, or equivalent) for production data pipelines, including DAG development, scheduling, monitoring, and troubleshooting
  • Solid grounding in software engineering fundamentals: system design, version control (Git), CI/CD pipelines, containerization (Docker), and infrastructure-as-code (Terraform, CloudFormation)
  • Hands-on experience managing AWS resources, including S3, IAM roles/policies, API integrations, and security configurations
  • Strong ability to analyze large datasets, identify data quality issues, debug pipeline failures, and propose scalable solutions
  • Excellent communication skills and ability to work cross-functionally with scientists, analysts, and product teams to turn ambiguous requirements into maintainable data products
Preferred Skills
  • Domain familiarity with biological data (PCR, sequencing, wastewater surveillance, turnaround-time (TAT) metrics) and experience working with lab, bioinformatics, NGS, or epidemiology teams
  • Production ownership of Snowflake environments including RBAC, secure authentication patterns, and cost/performance optimization
  • Experience with observability and monitoring stacks (Grafana, Datadog, or similar) and data quality monitoring (anomaly detection, volume/velocity checks, schema drift detection)
  • Familiarity with container orchestration platforms (Kubernetes) for managing production workloads
  • Experience with data ingestion frameworks (Airbyte, Fivetran) or building custom ingestion solutions for external partner data delivery
  • Familiarity with data cataloging, governance practices, and reference data management to prevent silent data drift
  • Experience designing datasets for visualization tools (Tableau, Looker, Metabase) with strong understanding of dashboard consumption patterns; familiarity with JavaScript for custom visualizations or front-end dashboard development
  • Comfort with AI-assisted development tools (GitHub Copilot, Cursor) to accelerate code generation while maintaining quality standards
  • Startup or fast-paced environment experience with evolving priorities and rapid iteration
  • Scientific or data-intensive domain experience (life sciences, healthcare, materials science)