
Associate Principal/Forensic AI Engineer (Forensic Services practice)

Charles River Associates, Boston, MA (via LinkedIn)
Posted April 5, 2026
Requirements
  • 8–10+ years of progressive experience in machine learning engineering, AI research, digital forensics, data science, or a closely related technical field, with deep expertise in at least two of the following domains:
      • Development, fine-tuning, evaluation, or production deployment of large language models (LLMs) or generative AI systems
      • Synthetic media detection, deepfake analysis, or AI-generated content forensics
      • Natural language processing, computational linguistics, or authorship attribution
      • Digital forensics, incident response, eDiscovery, or cybercrime investigation
      • Consulting delivery, expert witness services, or client-facing technical advisory roles in a litigation or regulatory context
  • A representative portfolio of project work, including open-source contributions, published research, technical blog posts, or other observable artifacts, demonstrating sustained, applied AI proficiency across investigation-relevant domains.
  • Demonstrated ability to conduct or support expert witness engagements, produce legally defensible forensic reports, and communicate complex technical findings to non-technical audiences including judges, regulators, and corporate executives.
  • Familiarity with AI governance and risk management frameworks, including the NIST AI Risk Management Framework, ISO/IEC 42001:2023 (AI Management Systems), and the OWASP Generative AI Security guidelines.
Technical Skills
  • LLM Proficiency: Deep understanding of large language model architectures (transformer-based models including GPT, LLaMA, Mistral, Gemma, and related families); proficiency with fine-tuning methodologies including LoRA, QLoRA, and instruction tuning; and experience with alignment techniques (RLHF, RLAIF, DPO).
  • LLM Ecosystems and Tooling: Proficiency with the HuggingFace ecosystem (Transformers, PEFT, Datasets, Evaluate, TRL); major LLM inference frameworks (vLLM, llama.cpp, Ollama); and orchestration frameworks including LangChain and LlamaIndex.
  • Retrieval-Augmented Generation (RAG) Pipelines: Experience building, evaluating, and optimizing RAG pipelines using vector databases (Pinecone, Weaviate, ChromaDB, pgvector) and embedding models, including chunking strategy, retrieval evaluation, and hybrid search.
  • Deepfake Detection and Synthetic Media Forensics: Competency in deepfake detection methodologies including CNN- and transformer-based detection models (trained on FaceForensics++, DFDC, or equivalent datasets); GAN architecture analysis; multimodal artifact inspection (lip synchronization, temporal consistency, audio-visual misalignment); and pixel-level manipulation detection tools (e.g., Amped Authenticate, Sensity AI).
  • Content Provenance and Authentication: Understanding of content provenance standards including C2PA (Coalition for Content Provenance and Authenticity), cryptographic content credentials, digital watermarking approaches (including SynthID), and blockchain-based authenticity verification.
  • Text Forensics and Authorship Attribution: Experience with NLP-based forensic methodologies including stylometric analysis, semantic embedding-based authorship attribution, human vs. machine-generated text classification, and LLM source attribution (model fingerprinting, training data membership inference).
  • Programming Languages and Core Libraries: Advanced proficiency in Python; competency with PyTorch and/or TensorFlow for deep learning model development and evaluation; data analysis using pandas, NumPy, and scikit-learn; and SQL for structured data querying and investigation support.
  • Computer Vision and Multimodal Analysis: Experience with computer vision libraries (OpenCV, torchvision, PIL/Pillow) and audio/video processing tools for media forensics, including EXIF metadata analysis, sensor pattern noise analysis, and compression artifact forensics.
  • Cloud and Infrastructure: Familiarity with cloud ML platforms (AWS SageMaker, GCP Vertex AI, Azure Machine Learning) and containerization approaches (Docker, Kubernetes) for deploying and managing AI forensic tooling in enterprise environments.
  • Data Engineering: Experience with data engineering frameworks and tools including Apache Spark, Airflow, and modern data warehousing platforms (Snowflake, BigQuery, Redshift) for processing and analyzing large-scale unstructured and semi-structured datasets.
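To make the RAG bullet above concrete, here is a minimal sketch of the retrieval step: ranking document chunks by similarity to a query. A real pipeline would use an embedding model and a vector database (Pinecone, Weaviate, etc.); this toy version substitutes bag-of-words vectors and hand-rolled cosine similarity, and all names and example chunks are illustrative.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank chunks by similarity to the query and keep the top k;
    # a production system would add hybrid search and retrieval evaluation.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "The contract was signed in Boston in 2021.",
    "Deepfake detection relies on temporal consistency checks.",
    "Stylometric features include sentence length and word choice.",
]
top = retrieve("when was the contract signed", chunks, k=1)
```

The chunking strategy mentioned in the bullet determines what goes into `chunks`; retrieval quality is then evaluated by checking whether the relevant chunk lands in the top-k results.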
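The temporal-consistency check named in the deepfake detection bullet can be illustrated with a toy example. Real detectors operate on video frames with CNN or transformer models; here each "frame" is just a short list of grayscale pixel values, and the threshold and data are invented for illustration only.

```python
def temporal_consistency(frames: list[list[int]]) -> list[float]:
    # Mean absolute pixel difference between consecutive frames.
    # Abrupt spikes can indicate frame-level manipulation; genuine video
    # tends to change smoothly from one frame to the next.
    diffs = []
    for prev, cur in zip(frames, frames[1:]):
        diffs.append(sum(abs(p - c) for p, c in zip(prev, cur)) / len(prev))
    return diffs

# Toy data: a smoothly changing clip vs. one with sudden per-frame jumps.
real_clip = [[10, 10, 10], [11, 10, 10], [11, 11, 10]]
fake_clip = [[10, 10, 10], [60, 10, 10], [10, 70, 10]]
```

Actual forensic workflows combine this idea with lip-sync and audio-visual alignment checks rather than relying on any single signal.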
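For the content provenance bullet, the core idea of a cryptographic content credential is binding a signature to a hash of the asset. The sketch below uses an HMAC with a shared demo key purely as a stand-in; real C2PA manifests use PKI certificates and a structured manifest format, and the key and byte strings here are illustrative.

```python
import hashlib
import hmac

SIGNING_KEY = b"demo-key"  # illustrative only; real credentials use PKI, not a shared secret

def issue_credential(media_bytes: bytes) -> str:
    # Bind a signature to the SHA-256 hash of the asset, loosely analogous
    # to the hard-binding hash inside a C2PA manifest.
    digest = hashlib.sha256(media_bytes).hexdigest()
    return hmac.new(SIGNING_KEY, digest.encode(), hashlib.sha256).hexdigest()

def verify_credential(media_bytes: bytes, credential: str) -> bool:
    # Constant-time comparison; any change to the bytes invalidates the credential.
    return hmac.compare_digest(issue_credential(media_bytes), credential)

original = b"example media bytes"
cred = issue_credential(original)
tampered = original + b" edited"
```

Verification succeeds only on the untouched bytes, which is the property watermarking schemes such as SynthID approximate in-band rather than via an attached manifest.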
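The stylometric analysis named in the text forensics bullet can be sketched in a few lines: extract simple style features (average word length, function-word frequencies) and attribute a questioned document to whichever known author's profile is closest. The feature set, texts, and distance metric are deliberately minimal stand-ins for real stylometry.

```python
from collections import Counter

FUNCTION_WORDS = ["the", "of", "and", "to", "in", "a"]

def stylometric_profile(text: str) -> list[float]:
    words = text.lower().split()
    counts = Counter(words)
    avg_len = sum(len(w) for w in words) / len(words)
    # Relative frequencies of common function words: a classic stylometric
    # feature, since authors use them habitually and unconsciously.
    return [avg_len] + [counts[w] / len(words) for w in FUNCTION_WORDS]

def distance(a: list[float], b: list[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

known_a = "the report of the incident was filed in the morning and reviewed"
known_b = "blockchain verification anchors cryptographic hashes immutably"
questioned = "the summary of the findings was sent in the evening and archived"

profile_q = stylometric_profile(questioned)
closer_to_a = (distance(profile_q, stylometric_profile(known_a))
               < distance(profile_q, stylometric_profile(known_b)))
```

Human vs. machine text classification extends the same idea with features targeting LLM output regularities, and LLM source attribution pushes further into model fingerprinting and membership inference.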
Education
  • (Required) Bachelor’s degree in Computer Science, Electrical Engineering, Data Science, Computational Linguistics, Information Systems, or a related technical field.
  • (Preferred) Graduate degree (M.S. or Ph.D.) in Machine Learning, Artificial Intelligence, Computer Vision, Natural Language Processing, or a closely related discipline.