Job Detail

Senior Data Engineer

Job Location

Patna

Opening: 13-Jan-2024

Skills Required: Python, Java, Scripting languages (Bash, Shell), Apache Spark, Scala, Airflow, Hadoop, Kafka, AWS, SQL and NoSQL databases

Experience Required: Minimum of 5 Years

Educational Qualification: Bachelor's degree in Computer Science, Information, Engineering, or another quantitative field

Employment Type: Full Time

Workplace Options: Onsite

Opening Summary

We are looking for an experienced Data Engineer to join our team. Data Engineers use various methods to transform raw data into useful data systems, create algorithms, and conduct statistical analysis. Overall, Data Engineers strive for efficiency by aligning data systems with business goals.

Requirements

  • Proven experience as a data engineer with a focus on AWS cloud technologies.
  • Strong knowledge of AWS data services, including S3, Redshift, Glue, EMR, EC2, DynamoDB, RDS, Lambda, and Athena.
  • Proficiency in programming languages such as Python, Scala, or Java.
  • Experience with ETL tools and frameworks, such as Apache Spark.
  • Experience with data modeling and SQL.
  • Knowledge of data governance, security, and compliance practices.
  • Excellent problem-solving and communication skills.
  • AWS certifications (e.g., AWS Certified Data Engineer) are a plus.

Responsibilities

  • Design, develop, and maintain ETL (Extract, Transform, Load) processes and data pipelines on AWS for batch and real-time data.
  • Implement data storage solutions, including data lakes and data warehouses, leveraging AWS services such as S3, Redshift, DynamoDB, RDS, Glue, EC2, and EMR, or third-party managed services such as Snowflake, Databricks, Redis, and Neo4j.
  • Select the most appropriate data format (Parquet, ORC, Avro, JSON) for processing and accessing the data.
  • Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions that meet business objectives.
  • Optimize and tune data processing and transformation jobs for performance and scalability.
  • Ensure data security, compliance, and governance practices are followed.
  • Monitor data pipeline health, troubleshoot issues, and implement proactive solutions.
  • Implement a standardized, governed, and optimized data model in the consumption layer.
  • Select and implement a CI/CD tool for the cloud environment, considering factors such as the preferred cloud provider, integration capabilities, ease of use, scalability, and pricing (e.g., Jenkins, GitLab CI/CD, AWS CodePipeline, Google Cloud Build, Bamboo, Drone, Travis CI, Chef, Puppet, or Bash scripts).
  • Maintain documentation and best practices for data engineering processes and standards.
  • Stay up to date with emerging AWS technologies and best practices in data engineering.
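To illustrate the batch ETL pattern described in the responsibilities above, here is a minimal, dependency-free sketch. In practice these steps would run on Spark or Glue against S3 rather than in-memory lists; the function names (`extract`, `transform`, `load`) and record fields (`user_id`, `amount`) are hypothetical examples, not part of this role's actual systems.

```python
import json


def extract(raw_lines):
    """Extract: parse newline-delimited JSON records, skipping malformed lines."""
    records = []
    for line in raw_lines:
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # in production, route bad records to a dead-letter location
    return records


def transform(records):
    """Transform: keep complete records and standardize field names and types."""
    out = []
    for r in records:
        if "user_id" in r and "amount" in r:
            out.append({
                "user_id": str(r["user_id"]),          # normalize IDs to strings
                "amount": round(float(r["amount"]), 2)  # normalize amounts to 2 dp
            })
    return out


def load(rows, sink):
    """Load: append rows to a sink (a stand-in for an S3/Redshift write)."""
    sink.extend(rows)
    return len(rows)
```

A quick usage example: `load(transform(extract(raw)), sink)` runs the three stages end to end, dropping malformed and incomplete records along the way.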

Notes

  • Joining time should not be longer than 30 days.