Mid-Senior level
Posted April 2, 2026
Responsibilities
- Design, develop, and maintain robust, scalable ETL/ELT pipelines.
- Write efficient, reusable, and scalable code in Python and PySpark for distributed data processing.
- Review existing data engineering code and identify opportunities for refactoring or performance improvement.
- Implement data validation, cleansing, reconciliation, and quality checks across the data lifecycle.
- Collaborate with IT and business stakeholders to understand data requirements and translate them into solutions.
- Monitor pipeline performance, troubleshoot failures, and optimize for latency, throughput, and cost.
- Participate in code reviews, enforce coding standards, and contribute to engineering best practices.
- Build and maintain CI/CD pipelines for testing, packaging, and deployment of data pipelines.
- Ensure data reliability, security, and consistency across environments.
- Work with cloud services and big data platforms to support modern data architecture.
Commitments
We are looking for a Data Engineer with trading and securities experience.
Title: Data Engineer - Python/PySpark
Location: Irving, TX (3 days onsite per week)
Terms: Direct hire
Job Description:
Requirements
- Strong hands-on development experience in Python, PySpark, and SQL.
- Experience building large-scale ETL/ELT pipelines for structured and unstructured data.
- Deep understanding of Spark and distributed computing fundamentals (transformations, shuffles, optimization).
- Experience with big data frameworks such as Hadoop and Spark.
- Proficiency with Git-based repositories (Bitbucket / GitHub).
- Experience working with AWS, Azure, or GCP environments.
- Strong understanding of database design, data modeling, warehouse schemas (star/snowflake).
- Experience with CI/CD automation and pipeline development.
- Strong analytical and troubleshooting skills for resolving complex data issues.
- Ability to collaborate with cross-functional teams and convert business requirements into technical solutions.
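The validation, cleansing, and reconciliation work the posting describes could be sketched roughly as follows. This is a hypothetical, simplified Python illustration only: the function names and rules are invented here, and a production pipeline at this scale would typically apply the same checks through PySpark DataFrame operations rather than plain lists.

```python
# Illustrative sketch of row-level validation plus a source/target
# reconciliation check, of the kind a data-quality step might perform.
# All names and rules here are hypothetical, not from the posting.

def validate_records(records):
    """Split records into valid and rejected rows using simple rules:
    an id must be present and the amount must be non-negative."""
    valid, rejected = [], []
    for row in records:
        if row.get("id") is not None and row.get("amount", 0) >= 0:
            valid.append(row)
        else:
            rejected.append(row)
    return valid, rejected

def reconcile_counts(source_count, loaded_count):
    """Return True when every source row is accounted for after loading."""
    return source_count == loaded_count

rows = [
    {"id": 1, "amount": 100.0},
    {"id": None, "amount": 50.0},   # rejected: missing id
    {"id": 2, "amount": -5.0},      # rejected: negative amount
]
good, bad = validate_records(rows)
print(len(good), len(bad))  # 1 valid, 2 rejected
print(reconcile_counts(len(rows), len(good) + len(bad)))  # True: nothing lost
```

In PySpark the same idea would usually be expressed with column expressions and `DataFrame.filter`, with rejected rows routed to a quarantine table for review.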