
Data Engineer II
at J.P. Morgan
Posted a month ago
- Compensation: Not specified
- Currency: Not specified
- City: Pune
- Country: India
As a Data Engineer II on the Connected Commerce Travel Technology team, you will design, build, and maintain large-scale, cloud-based data integration and analytics solutions. You will transform complex, high-volume data into actionable insights, develop scalable data pipelines, and optimize data models while ensuring data quality and governance. The role requires collaboration with cross-functional stakeholders and a focus on automation, CI/CD, and test-driven development practices. Technologies include Python, Spark, cloud Data Lakehouse platforms, Airflow, and related big-data tooling.
Location: Pune, Maharashtra, India
Job responsibilities
- Design, develop, and maintain scalable, large-scale data processing pipelines and infrastructure on the cloud, following engineering standards, governance standards, and technology best practices.
- Develop and optimize data models for large-scale datasets, ensuring efficient storage, retrieval, and analytics while maintaining data integrity and quality.
- Collaborate with cross-functional teams to translate business requirements into scalable and effective data engineering solutions.
- Demonstrate a passion for innovation and continuous improvement in data engineering, proactively identifying opportunities to enhance data infrastructure, data processing and analytics capabilities.
Required qualifications, capabilities, and skills
- Strong analytical, problem-solving, and critical-thinking skills
- Proficiency in at least one programming language (Python preferred; alternatively Java or Scala)
- Proficiency in at least one distributed data processing framework (Spark or similar)
- Proficiency in at least one cloud data Lakehouse platform (AWS data lake services or Databricks; alternatively Hadoop)
- Proficiency in at least one scheduling/orchestration tool (Airflow preferred; alternatively AWS Step Functions or similar)
- Proficiency with relational and NoSQL databases
- Proficiency in data structures, data serialization formats (JSON, Avro, Protobuf, or similar), and big-data storage formats (Parquet, Iceberg, or similar)
- Experience working in teams following Agile methodology
- Experience with test-driven development (TDD) or behavior-driven development (BDD) practices, as well as continuous integration and continuous deployment (CI/CD) tools
- Proficiency in Python and PySpark
- Proficiency in IaC (preferably Terraform; alternatively AWS CloudFormation)
- Experience with AWS Glue, AWS S3, AWS Lakehouse, AWS Athena, Airflow, Kinesis and Apache Iceberg
- Experience working with Jenkins