
AI Engineer - Responsible AI

Centific · Seattle, WA
Mid-Senior level · Posted April 2, 2026
Requirements
  • 2+ years of industry experience in applied ML/AI research or ML engineering
  • Track record of publications in AI Safety, NLP robustness, or adversarial ML (ACL, NeurIPS, ICML, EMNLP, IEEE S&P, etc.) or equivalent applied research impact
  • Strong Python and PyTorch/JAX skills with experience deploying ML models to production
  • Demonstrated experience in at least one of: LLM jailbreak attacks/defense, agentic AI safety, adversarial ML, or human-AI interaction vulnerabilities
  • Experience with containerization (Docker, Kubernetes) and cloud platforms (AWS, GCP, or Azure)
  • Proven ability to take research from concept to code to production deployment with rigorous testing and monitoring

Preferred Qualifications
  • Experience in adversarial prompt engineering, jailbreak detection (narrative, obfuscated, sequential attacks)
  • Prior work on multi-agent architectures or robust defense strategies for LLMs in production environments
  • Experience with large-scale data processing frameworks (Spark, Flink, Kafka) and data warehousing
  • MLOps expertise: model serving (Triton, TensorRT, vLLM), experiment tracking (W&B, MLflow), and CI/CD for ML
  • Infrastructure as Code experience (Terraform, Pulumi) and DevOps best practices
  • Experience with distributed computing frameworks (Ray, Dask) for scalable training and evaluation
  • Familiarity with observability stacks (Prometheus, Grafana, DataDog) and incident management
  • First-author publications, strong GitHub profile, or significant open-source contributions

Our Stack
  • Modeling: PyTorch/JAX, Hugging Face, vLLM, Mistral, LLaMA, OpenAI APIs
  • Safety: Red-teaming frameworks, LLM benchmarking (SODE, ART, HarmBench), human behavior simulation
  • Infrastructure: Kubernetes, Docker, Terraform, AWS/GCP, Ray, Spark
  • MLOps: Triton Inference Server, Weights & Biases, MLflow, Airflow, ArgoCD
  • Data: PostgreSQL, Redis, Kafka, Snowflake/BigQuery, dbt
  • Observability: Prometheus, Grafana, DataDog, PagerDuty

What Success Looks Like
  • Production systems that measurably improve safety KPIs: adversarial robustness, over-defensiveness rates, and incident response latency
  • Publishable research outcomes (with company approval) demonstrating novel contributions to AI safety
  • Well-documented, tested, and maintainable code with comprehensive CI/CD and monitoring
  • Infrastructure that scales reliably and enables the broader team to iterate quickly on safety research

Why Centific
  • Research to Production: Bridge the gap between cutting-edge research and production systems
  • Mentorship: Collaborate with Principal Architects and senior researchers in AI safety and adversarial ML
Education
  • Master's degree in CS/EE/ML/Security or related field (Ph.D. preferred); not required