• Design, develop, and maintain scalable data pipelines and ETL processes to support data
integration and analytics.
• Collaborate with data scientists, analysts, and other stakeholders to understand data
requirements and deliver high-quality data solutions.
• Implement and optimize data storage solutions using data warehouses and data lakes.
• Develop and maintain data models, schemas, and documentation to ensure data consistency
and integrity.
• Monitor and troubleshoot data pipelines to ensure data accuracy and reliability.
• Stay up-to-date with industry trends and best practices in data engineering.
Requirements:
• Bachelor's degree in Computer Science, Engineering, or a related field.
• 4-7 years of experience as a Data Engineer or in a similar role.
• Strong proficiency in Python and SQL.
• Experience with big data processing frameworks such as Apache Spark (PySpark), Apache
Kafka, or similar.
• Experience with workflow orchestration tools such as Apache Airflow.
• Experience with Git-based version control platforms such as GitHub and Azure DevOps.
• Knowledge of data warehousing concepts and technologies.
• Excellent problem-solving skills and attention to detail.
• Strong communication and collaboration skills.
Preferred Qualifications:
• Familiarity with cloud platforms (e.g., AWS, GCP, Azure) and their data services.
• Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
• Knowledge of data governance and security best practices.