Language Data Scientist

LinkedIn Synodex Ridgefield Park, NJ
Posted March 30, 2026
Requirements
  • You have at least 3 years of relevant experience with data creation, curation and analysis for GenAI applications (e.g.
  • Knowledge of how components of GenAI products or services combine to work
  • Language and language data expertise: Extensive experience working with human language data and designing human evaluation tasks, including multi-phase and complex workflows.
  • Deep understanding of language and its relationship with culture
  • Ability to identify ambiguity and subjectivity in language
  • Ability to work with multi-lingual and multi-modal projects
  • Quantitative analysis skills: Advanced knowledge of statistics, metrics (e.g. F1 score, inter-rater reliability metrics), and data analysis methods such as sampling.
  • Technical skills:
    • Experience with Natural Language Processing (NLP) techniques and tools, such as spaCy, NLTK, or Hugging Face.
    • Proficiency in Python to:
      • handle / transform large datasets (e.g. pre- and postprocessing data, pandas)
      • perform quantitative analyses
      • visualize data (e.g. matplotlib, seaborn)
  • Data processing:
    • Deep understanding of data pipelines to support ML and NLP workflows
    • Knowledge of efficient data collection, transformation, and storage
    • Knowledge of data structures, algorithms, and data engineering principles
  • Excellent interpersonal skills for effective cross-functional stakeholder engagement
  • Excellent problem-solving skills, with the ability to think critically and creatively to develop innovative AI solutions
  • Ability to work independently and collaborate as part of a team
  • Adaptable to changing technologies and methodologies
  • Ability to translate experience, research and development information to understand client products and services.
Preferred Skills
  • Conducting research to stay up-to-date with the latest advancements in generative AI, machine learning, and deep learning techniques
  • Knowledge of optimizing existing generative AI models for improved performance, scalability, and efficiency
  • Experience developing and maintaining ML/AI pipelines, including data preprocessing, feature extraction, model training, and evaluation
  • Model fine-tuning: Knowledge of fine-tuning pre-trained models to adapt them to specific tasks and datasets, improving their performance and relevance
  • Developing clear and concise documentation, including technical specifications, user guides, and presentations, to communicate complex AI concepts to both technical and nontechnical stakeholders
  • Contributing to establishing best practices and standards for generative AI development with customers and within the organization
  • Providing technical mentorship and guidance to junior team members
  • Understanding of techniques such as GPT, VAEs, and GANs
Education
  • (Not required) MA in (computational) linguistics, data science, computer science (AI / ML / NLU), quantitative social sciences, or a related scientific / quantitative field; PhD strongly preferred