
PySpark Data Engineer

at Citi


Mid Level · No visa sponsorship · Data Engineering

Posted 3 days ago

Compensation: Not specified
Currency: Not specified
City: Chennai
Country: India

Citi is seeking a Python/PySpark Data Engineer to design and implement data migration, profiling, and processing pipelines on large-scale data platforms. The role focuses on PySpark distributed processing, SQL queries (Oracle), JDBC integration, and real-time streaming. You'll collaborate with data architects, data engineers, and business stakeholders to translate requirements into robust data solutions while ensuring data quality and performance.

PySpark Data Engineer

Job Req Id: 26936201
Location(s): Chennai, Tamil Nadu, India
Job Type: Hybrid
Posted: Feb. 16, 2026

Discover your future at Citi

Working at Citi is far more than just a job. A career with us means joining a team of more than 230,000 dedicated people from around the globe. At Citi, you’ll have the opportunity to grow your career, give back to your community and make a real impact.

Job Overview

We are seeking a highly motivated and intuitive Python Developer to join our dynamic team, focusing on critical data migration and profiling initiatives. The ideal candidate will be a self-starter with strong engineering principles, capable of designing and implementing robust solutions for handling large datasets and complex data flows. This role offers an exciting opportunity to work on challenging projects that drive significant impact within our data ecosystem.



Responsibilities:

  • Develop, test, and deploy high-quality Python code for data migration, data profiling, and data processing.
  • Design and implement scalable solutions for working with large and complex datasets, ensuring data integrity and performance.
  • Utilize PySpark for distributed data processing and analytics on large-scale data platforms.
  • Develop and optimize SQL queries for various database systems, including Oracle, to extract, transform, and load data efficiently.
  • Integrate Python applications with JDBC-compliant databases (e.g., Oracle) for seamless data interaction.
  • Implement data streaming solutions to process real-time or near real-time data efficiently.
  • Perform in-depth data analysis using Python libraries, especially Pandas, to understand data characteristics, identify anomalies, and support profiling efforts.
  • Collaborate with data architects, data engineers, and business stakeholders to understand requirements and translate them into technical specifications.
  • Contribute to the design and architecture of data solutions, ensuring best practices in data management and engineering.
  • Troubleshoot and resolve technical issues related to data pipelines, performance, and data quality.
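
As an illustration of the data-profiling work described above, here is a minimal sketch using Pandas; the column names and sample values are hypothetical, and a real pipeline would run this over data extracted from the source systems:

```python
# Minimal per-column profiling sketch with Pandas.
# Column names ("account_id", "balance") are invented for illustration.
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Return dtype, non-null count, null count, and distinct count per column."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "non_null": df.notna().sum(),
        "nulls": df.isna().sum(),
        "distinct": df.nunique(),  # nunique() ignores NaN values
    })

if __name__ == "__main__":
    sample = pd.DataFrame({
        "account_id": [1, 2, 2, None],
        "balance": [100.0, 250.5, 250.5, 75.0],
    })
    print(profile(sample))
```

In practice the same counts can be computed at scale with the equivalent PySpark aggregations, keeping the profiling logic consistent between exploratory Pandas work and the distributed pipeline.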


Qualifications:

  • 4-7 years of relevant experience in the Financial Services industry
  • Strong Proficiency in Python: Excellent command of Python programming, including object-oriented principles, data structures, and algorithms.
  • PySpark Experience: Demonstrated experience with PySpark for big data processing and analysis.
  • Database Expertise: Proven experience working with relational databases, specifically Oracle, and connecting applications using JDBC.
  • SQL Mastery: Advanced SQL querying skills for complex data extraction, manipulation, and optimization.
  • Big Data Handling: Experience in working with and processing large datasets efficiently.
  • Data Streaming: Familiarity with data streaming concepts and technologies (e.g., Kafka, Spark Streaming) for processing continuous data flows.
  • Data Analysis Libraries: Proficient in using data analysis libraries such as Pandas for data manipulation and exploration.
  • Software Engineering Principles: Solid understanding of software engineering best practices, including version control (Git), testing, and code review.
  • Problem-Solving: Intuitive problem-solver with a self-starter mindset and the ability to work independently and as part of a team.
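
The extract-and-transform pattern the SQL and database bullets describe can be sketched with Python's built-in sqlite3 module standing in for an Oracle source; in production this would instead go through a JDBC connection (e.g. Spark's JDBC reader), and the "trades" table with its columns is invented for illustration:

```python
# Parameterized SQL extraction sketch. sqlite3 is a stand-in for Oracle;
# the "trades" table and its columns are hypothetical.
import sqlite3

def extract_large_trades(conn: sqlite3.Connection, min_amount: float):
    """Extract trades at or above a threshold, largest first, via a bound parameter."""
    cur = conn.execute(
        "SELECT trade_id, amount FROM trades "
        "WHERE amount >= ? ORDER BY amount DESC",
        (min_amount,),  # bound parameter avoids SQL injection and aids plan caching
    )
    return cur.fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trades (trade_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO trades VALUES (?, ?)",
        [(1, 50.0), (2, 500.0), (3, 1200.0)],
    )
    print(extract_large_trades(conn, 100.0))
```

The same parameterized-query discipline carries over to JDBC prepared statements against Oracle, with the driver and connection string swapped in.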


Education:

  • Bachelor’s degree/University degree or equivalent experience


This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.

Preferred Skills & Qualifications (Good to Have):

  • Experience in developing and maintaining reusable Python packages or libraries for data engineering tasks.
  • Familiarity with cloud platforms (e.g., AWS, Azure, GCP) and their data services.
  • Knowledge of data warehousing concepts and ETL/ELT processes.
  • Experience with CI/CD pipelines for automated deployment.

------------------------------------------------------

Job Family Group:

Technology

------------------------------------------------------

Job Family:

Applications Development

------------------------------------------------------

Time Type:

Full time

------------------------------------------------------

Most Relevant Skills

Please see the requirements listed above.

------------------------------------------------------

Other Relevant Skills

For complementary skills, please see above and/or contact the recruiter.

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View Citi’s EEO Policy Statement and the Know Your Rights poster.
