The Lead Data Engineer is responsible for building data engineering solutions using next-generation data techniques. The individual will work directly with product owners, customers, and technologists to deliver data products and solutions in a collaborative, agile environment.
Job Responsibilities:
Design and develop big data solutions
Partner with domain experts, product managers, analysts, and data scientists to develop big data pipelines
Deliver data as a service framework
Move all legacy workloads to cloud platform
Work with data scientists to build Client pipelines using heterogeneous sources
Ensure automation through CI/CD across platforms
Research and assess open source technologies
Mentor other team members on Big Data and Cloud Tech stacks
Define needs around maintainability, testability, performance, security, quality and usability for data platform
Drive implementation, consistent patterns, reusable components, and coding standards for data engineering processes
Convert SAS-based pipelines into languages such as PySpark or Scala
Tune big data applications on Hadoop and non-Hadoop platforms
Evaluate new IT developments and evolving business requirements
Supervise day-to-day staff management issues
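To illustrate the kind of transformation these pipelines carry out, here is a minimal sketch in plain Python (standard library only, so it runs without a Spark cluster) of the ingest → transform → aggregate shape that a SAS-to-PySpark conversion typically preserves. The record fields (`dept`, `salary`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical raw records, standing in for rows ingested from a
# heterogeneous source (CSV extract, Kafka topic, Hive table, etc.).
records = [
    {"dept": "eng", "salary": 100},
    {"dept": "eng", "salary": 120},
    {"dept": "ops", "salary": 90},
]

def average_salary_by_dept(rows):
    """Group rows by department and average the salary — the same
    aggregation a SAS PROC MEANS step or a PySpark
    df.groupBy("dept").avg("salary") would express."""
    totals = defaultdict(lambda: [0, 0])  # dept -> [running sum, count]
    for row in rows:
        totals[row["dept"]][0] += row["salary"]
        totals[row["dept"]][1] += 1
    return {dept: s / n for dept, (s, n) in totals.items()}

print(average_salary_by_dept(records))  # {'eng': 110.0, 'ops': 90.0}
```

In a real Spark migration the same logic would run distributed over partitions rather than in a single dict, but the grouping semantics carry over unchanged.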
Requirements:
12+ years of total IT experience
8+ years of experience with Hadoop (Cloudera)/big data technologies
Advanced knowledge of the Hadoop ecosystem and Big Data technologies
Hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, Solr)
Experience in designing and developing data pipelines for data ingestion or transformation using Java, Scala, or Python
Experience with Spark programming (PySpark, Scala, or Java)
Expert-level experience building pipelines with Apache Spark
Familiarity with core provider services from AWS, Azure or GCP
Hands-on experience with Python/PySpark/Scala and basic libraries for machine learning
Experience with containerization and related technologies (e.g. Docker, Kubernetes)
Experience with all aspects of DevOps (source control, continuous integration, deployments, etc.)
1 year of Hadoop administration experience preferred
1+ year of SAS experience preferred
Proficient in programming in Java or Python with prior Apache Beam/Spark experience a plus
System-level understanding: data structures, algorithms, distributed storage and compute
Can-do attitude toward solving complex business problems; strong interpersonal and teamwork skills
Possess team management experience and have led a team of data engineers and analysts