Location: Pune / Indore / Hyderabad
Primary Skills:
• Master's or Bachelor's degree in Computer Science, Information Technology, or another relevant field.
• Minimum of 7 years of experience in advanced Data & Analytics architecture, ETL, and data engineering solutions using the following skills, tools, and technologies:
• AWS Data & Analytics Services: Athena, Glue, EMR, DynamoDB, Redshift, Kinesis, Lambda
• Databricks Lakehouse Platform
• PySpark, Spark SQL, Spark Streaming
• Experience with Apache Kafka / NiFi for streaming and event-based data.
• Experience with any NoSQL database.
• 3+ years of coding experience with a modern programming or scripting language (Python).
• Expert-level skills in writing and optimizing SQL.
• Experience operating very large data warehouses, data lakes, or data platforms.
• Proficiency in handling data: tracking data lineage, ensuring data quality, and improving data discoverability.
• Experience working in Agile delivery.
• Sound knowledge of distributed systems and data architecture (Lambda architecture): design and implement batch and stream data processing pipelines, and optimize the distribution, partitioning, and MPP of high-level data structures.
• Experience with full software development life cycle, including coding standards, code reviews, source control management, build processes, and testing.
• Excellent business and communication skills to work with business owners to understand data requirements.
Responsibilities:
• Understand client requirements and architect robust data platforms using AWS cloud technologies and Databricks.
• Identify and create reusable data solutions (accelerators, MVPs).
• Create reusable components for rapid development of data platforms.
• Work closely with the Product Owners and stakeholders to design the technical architecture for the data platform to meet the requirements of the proposed solution.
• Work with leadership to set the standards for software engineering practices within the Data Engineering team, and support them across other disciplines.
• Play an active role in leading team meetings and workshops with clients.
• Deliver and present proofs of concept for key technology components to project stakeholders.
• Educate clients on cloud technologies and influence the direction of the solution.
• Choose and use the right analytical libraries, programming languages, and frameworks for each task.
• Help us to shape the next generation of our products.
• Explore and learn the latest AWS Data & Analytics and Databricks Platform features /technologies to provide new capabilities and increase efficiency.
• Design, build, and operationalize large-scale enterprise data solutions and applications using one or more AWS data and analytics services in combination with third-party tools: Databricks, Spark, EMR, DynamoDB, Redshift, Kinesis, Lambda, Glue, Athena.
• Analyze, re-architect, and re-platform on-premises data warehouses, migrating them to data platforms on AWS cloud using AWS or third-party services.
• Design and build production data pipelines from ingestion to consumption within a big data architecture, using Python, PySpark, and Databricks.
• Design and implement data engineering, ingestion and curation functions on AWS cloud using AWS native or custom programming.
• Perform detailed assessments of current-state data platforms and create an appropriate transition path to AWS cloud & Databricks.
• Build and deliver high-quality data architecture to support business analysts, data scientists, and customer reporting needs.