
Lead Data Engineer

Company:
Citi (https://www.citi.com/)

Location:
India, Chennai

Category:
IT - Software Development

Contract Type:
Not provided

Salary:
Not provided

Job Description:

The Lead Data Engineer is responsible for building data engineering solutions using next-generation data techniques. The individual will work directly with product owners, customers, and technologists to deliver data products and solutions in a collaborative, agile environment.

Job Responsibilities:

  • Design and develop big data solutions
  • Partner with domain experts, product managers, analysts, and data scientists to develop big data pipelines
  • Deliver a data-as-a-service framework
  • Migrate all legacy workloads to the cloud platform
  • Work with data scientists to build ML pipelines using heterogeneous sources
  • Ensure automation through CI/CD across platforms
  • Research and assess open source technologies
  • Mentor other team members on big data and cloud tech stacks
  • Define needs around maintainability, testability, performance, security, quality, and usability for the data platform
  • Drive implementation, consistent patterns, reusable components, and coding standards for data engineering processes
  • Convert SAS-based pipelines to languages such as PySpark or Scala (see the sketch after this list)
  • Tune big data applications on Hadoop and non-Hadoop platforms
  • Evaluate new IT developments and evolving business requirements
  • Supervise day-to-day staff management issues
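
As an illustration of the SAS-conversion work mentioned above, here is a minimal PySpark sketch of a SAS-style grouped summary (PROC SUMMARY/MEANS) rewritten as a Spark DataFrame pipeline. The paths, table, and column names are hypothetical, invented for this example; this is a sketch of the pattern, not the team's actual code.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("sas_to_pyspark_sketch").getOrCreate()

# SAS DATA step equivalent: read the source table (hypothetical path)
txns = spark.read.parquet("/data/raw/transactions")

# PROC SUMMARY equivalent: filter, group, and aggregate
daily_totals = (
    txns.filter(F.col("status") == "POSTED")    # WHERE status = 'POSTED'
    .groupBy("account_id", "txn_date")          # CLASS account_id txn_date
    .agg(
        F.sum("amount").alias("total_amount"),  # SUM of amount
        F.count("*").alias("txn_count"),        # N (row count)
    )
)

# Persist the result for downstream consumers
daily_totals.write.mode("overwrite").parquet("/data/curated/daily_totals")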

Requirements:

  • 12+ years of total IT experience
  • 8+ years of experience with Hadoop (Cloudera) and other big data technologies
  • Advanced knowledge of the Hadoop ecosystem and big data technologies
  • Hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, Solr)
  • Experience designing and developing data pipelines for data ingestion or transformation using Java, Scala, or Python
  • Experience with Spark programming (PySpark, Scala, or Java)
  • Expert-level experience building pipelines using Apache Spark
  • Familiarity with core provider services from AWS, Azure or GCP
  • Hands-on experience with Python/PySpark/Scala and basic libraries for machine learning
  • Experience with containerization and related technologies (e.g. Docker, Kubernetes)
  • Experience with all aspects of DevOps (source control, continuous integration, deployments, etc.)
  • Proficiency in programming in Java or Python; prior Apache Beam/Spark experience is a plus
  • System-level understanding of data structures, algorithms, and distributed storage and compute
  • Can-do attitude toward solving complex business problems; good interpersonal and teamwork skills
  • Team management experience, having led a team of data engineers and analysts

Nice to have:

  • Experience with Snowflake or Delta Lake
  • 1 year of Hadoop administration experience
  • 1+ year of SAS experience

Additional Information:

Job Posted:
March 22, 2025

Employment Type:
Full-time
Work Type:
On-site work