Please find the JD below:
The ideal candidate will be responsible for developing high-quality data applications.

Responsibilities:
- Design, develop, and maintain efficient and scalable solutions using PySpark (see the sketch after this list)
- Ensure data quality and integrity by implementing robust testing, validation, and cleansing processes
- Integrate data from various sources, including databases, APIs, and external datasets
- Optimize and tune PySpark jobs for performance and reliability
- Document data engineering processes, workflows, and best practices
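For illustration, here is a minimal sketch of the kind of PySpark pipeline work these responsibilities describe; the connection details, paths, and table/column names (orders, order_id, fx_rates, order_date) are all hypothetical:

```python
# Minimal sketch only; all names, paths, and connection details are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_pipeline").getOrCreate()

# Integrate data from multiple sources: a relational database and an external dataset.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/sales")  # assumed JDBC source
    .option("dbtable", "public.orders")
    .load()
)
rates = spark.read.csv("s3://bucket/fx_rates.csv", header=True, inferSchema=True)

# Basic data-quality rules: deduplicate, validate amounts, drop incomplete rows.
clean = (
    orders.dropDuplicates(["order_id"])
    .filter(F.col("amount") > 0)
    .na.drop(subset=["order_id", "currency"])
)

# Broadcast the small lookup table to avoid a shuffle, then write partitioned columnar output.
result = clean.join(F.broadcast(rates), on="currency", how="left")
result.write.mode("overwrite").partitionBy("order_date").parquet("s3://bucket/curated/orders")
```

The broadcast hint on the small lookup table is an example of the job tuning the optimization bullet refers to.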
Requirements:
- Strong understanding of databases, data modelling, and ETL tools and processes
- Strong programming skills in Python and proficiency with PySpark and SQL
- Experience with relational databases, Hadoop, Spark, Hive, and Impala
- Excellent communication and collaboration skills