Principal Scientist

Rancho BioSciences, San Diego, CA (via LinkedIn)
Posted March 13, 2026
Requirements
  • 5+ years delivering ML/AI solutions in life sciences (discovery, translational, clinical, or RWE), including 3+ years leading cross-functional technical teams.
  • Hands-on expertise with Python and core ML/DL frameworks (PyTorch and/or TensorFlow; Keras); strong software engineering practices (testing, code review, version control).
  • Proven experience building production-grade data and deployment pipelines: SQL and Spark, containerization (Docker), orchestration (Airflow/Prefect), cloud services (AWS preferred; Azure/GCP welcome).
  • Experience with multi-agent systems and agent orchestration in production use cases.
  • Track record of rigorous LLM evaluation: designing task-specific benchmarks, implementing automated evaluation frameworks, diagnosing failure modes, and iteratively optimizing retrieval and generation pipelines for accuracy, latency, and cost.
  • Practical GenAI/LLM experience: retrieval-augmented generation, vector databases (e.g., FAISS, Milvus, pgvector), prompt engineering, evaluation frameworks, and safety/guardrail techniques.
  • Strong client-facing skills: translating scientific needs into technical solutions, presenting to senior stakeholders, and contributing to scope and SOWs.
  • Domain fluency with clinical, preclinical, or RWE data and relevant standards (CDISC, OMOP, FHIR) and biomedical ontologies (e.g., OBO, SNOMED, MeSH).
Preferred Skills
  • Experience with knowledge graphs (RDF/OWL, SPARQL, Neo4j) and entity/relationship modeling.
  • Biomedical NLP (e.g., BioBERT, SciBERT) and ontology-driven text mining.
  • Privacy and compliance expertise: de-identification, data use agreements, and audit readiness.
  • Familiarity with data product thinking and monetization of curated datasets.
  • Familiarity with multimodal foundation models in biomedical domains: single-cell embeddings (e.g., scGPT, Geneformer), molecular/chemical LLMs (e.g., ChemBERTa, MolBERT), or medical imaging models (e.g., BiomedCLIP, pathology foundation models).
  • MLOps proficiency with platforms such as AWS SageMaker, Vertex AI, or Kubeflow; experiment tracking (MLflow/Weights & Biases); model registry and monitoring.
Education
  • PhD in Computational Biology, Bioinformatics, Computer Science, Statistics, or a related field (not required; comparable demonstrated relevant experience is accepted).