Internship
Posted March 14, 2026
SummitTX Capital is a multi-manager, multi-strategy hedge fund managing over $3 billion in AUM. Founded in 2015, the firm spun out from Crestline Investors in 2025 to become an independent SEC-registered adviser under the SummitTX Capital brand. We operate an open-architecture platform across Fundamental, Tactical, Quantitative, and Capital Markets strategies, with offices in Fort Worth and New York.

SummitTX is seeking exceptional master’s candidates for our Research Engineer Internship beginning in the summer of 2026. This intern will help build and scale our systematic data platform that powers alpha research and production signals. You will work end-to-end, from idea generation and data acquisition to model development, backtesting, deployment, and monitoring, with an initial portfolio mix of Long/Short Equity initiatives and Systematic Fundamental research. The role reports to the Head of Data and partners daily with portfolio managers, analysts, the central research team, risk, and operations.

Key Responsibilities
- Design, build, and maintain systematic data pipelines, including ingestion, medallion-style data modeling, feature engineering, and experiment tracking (see the PySpark sketch after this list)
- Operationalize robust ELT workflows using DBT/SQL and Python on Databricks, with strong enforcement of data quality, lineage, and documentation
- Develop research-grade datasets and features across market, alternative, and fundamental domains to support L/S Equity and systematic strategies
- Productionize models and alpha signals with CI/CD pipelines, model registries, monitoring, and cost/performance optimization on Databricks and AWS
- Partner with PMs and Analysts to translate investment hypotheses into testable research artifacts, delivering clear results, visualizations, and readouts to guide decision-making
- Contribute to the evolution of the data platform roadmap, including observability, governance, access controls, cataloging, and documentation standards
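To make the medallion-style modeling and data-quality enforcement above concrete, here is a minimal sketch of a bronze-to-silver promotion step. It is illustrative only, not the firm's actual pipeline: it assumes a Databricks notebook (where a `spark` session is provided) and hypothetical table paths and column names (`BRONZE_PATH`, `SILVER_PATH`, `ticker`, `trade_date`, `close`).

```python
# Sketch of a bronze -> silver medallion step; not the firm's actual pipeline.
# Assumes a Databricks notebook, where `spark` is provided, and hypothetical
# Delta paths and columns (BRONZE_PATH, SILVER_PATH, ticker, trade_date, close).
from pyspark.sql import functions as F

BRONZE_PATH = "/mnt/lake/bronze/prices"  # hypothetical raw landing zone
SILVER_PATH = "/mnt/lake/silver/prices"  # hypothetical cleaned layer

bronze = spark.read.format("delta").load(BRONZE_PATH)

# Simple data-quality enforcement before promotion: drop null or non-positive
# closes, deduplicate on the natural key, then stamp lineage metadata.
silver = (
    bronze
    .filter(F.col("close").isNotNull() & (F.col("close") > 0))
    .dropDuplicates(["ticker", "trade_date"])
    .withColumn("ingested_at", F.current_timestamp())
)

silver.write.format("delta").mode("overwrite").save(SILVER_PATH)
```

In a dbt-based ELT flow like the one described above, the same invariants would typically be declared as schema tests; PySpark is used here only to keep the sketches in one language.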
Qualifications
- BS or pursuing an MS in Data Science, Data Engineering, Statistics, Business Analytics, Applied Math, or a related field with strong academic performance
- Strong Python and SQL fundamentals; comfort with Git and testing frameworks
- Coursework or internship experience in data modeling, ETL/ELT, artificial intelligence/machine learning/statistics, or time-series analysis
- Clear communication skills and ability to partner with investment, risk, and operations stakeholders

Preferred
- Hands-on experience with Python, SQL, DBT, Spark, and modern data-quality toolkits
- Exposure to ML frameworks (pandas, scikit-learn, PyTorch, MLflow) and feature pipelines (see the MLflow sketch after this list)
- Familiarity with Databricks (Lakehouse, Unity Catalog) and AWS data services (S3, Glue/Athena, Lake Formation)
- Experience with visualization and BI tools (e.g., Plotly, Tableau/Power BI) and financial data platforms (e.g., Bloomberg Terminal)
- Experience in GenAI/LLM applications (prompt engineering, agentic workflows, RAG)
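As one concrete instance of the MLflow exposure and model-registry work this role involves, here is a minimal, hedged sketch of experiment tracking with scikit-learn. The experiment and model names (`alpha-signal-research`, `alpha_signal_ridge`) and the synthetic data are illustrative assumptions, not the firm's actual setup.

```python
# Minimal experiment-tracking and model-registry sketch with MLflow and
# scikit-learn. Names and data are illustrative assumptions; registering a
# model requires a tracking server with a registry backend (e.g., Databricks).
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=200, n_features=5, random_state=0)

mlflow.set_experiment("alpha-signal-research")  # hypothetical experiment name

with mlflow.start_run():
    model = Ridge(alpha=1.0).fit(X, y)
    mlflow.log_param("ridge_alpha", 1.0)
    mlflow.log_metric("train_r2", model.score(X, y))
    # Logging with registered_model_name both stores the artifact and creates
    # (or versions) the named model in the registry for later promotion.
    mlflow.sklearn.log_model(model, "model",
                             registered_model_name="alpha_signal_ridge")
```

In practice, a registered version would then move through staging gates via the CI/CD pipelines named in the responsibilities above.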
Tech Stack
- Languages & Frameworks: Python (pandas, scikit-learn, PyTorch, MLflow), SQL, DBT, Spark
- Data & Platform: Databricks (Delta Lake, Unity Catalog, Serverless Compute), DBT, AWS (EC2, S3, Athena), Bloomberg Terminal
- Tooling & Ops: GitHub/Bitbucket, Databricks Lakeflow, Airflow, CI/CD pipelines, observability frameworks, Linux, Cursor/VS Code
Compensation
Base Compensation Range: $40–$50/hr
Eligible for overtime