Designed and developed a metadata-driven framework that reduces data pipeline development effort by roughly 80%
- Designed and implemented a data pipeline to process semi-structured data using PySpark, storing the processed output in Amazon Redshift and AWS RDS
- Ingested data from disparate sources such as the Salesforce API, SFTP, and other APIs using Python
- Automated routine tasks using Python scripts
- Orchestrated workflows using Airflow and monitored them with AWS CloudWatch
- Implemented alerts using AWS SNS
- Designed, developed, and tested proofs of concept
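The metadata-driven idea above can be sketched in plain Python: each pipeline is declared as a config object (source, transform, target), and one generic runner executes any declaration, so a new pipeline is a new config entry rather than new code. All names here (`PipelineSpec`, `SOURCES`, `run_pipeline`) are illustrative assumptions, not the actual framework.

```python
"""Minimal sketch of a metadata-driven pipeline runner (assumed design)."""
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class PipelineSpec:
    name: str
    source: str                          # key into the SOURCES registry
    transform: Callable[[dict], dict]    # per-record transformation
    target: list                         # stand-in for a Redshift/RDS sink

# Registry of extractors; a real framework would wrap the Salesforce API,
# SFTP pulls, and other HTTP sources behind the same interface.
SOURCES: dict[str, Callable[[], Iterable[dict]]] = {
    "orders_api": lambda: [{"id": 1, "amt": "10.5"}, {"id": 2, "amt": "3.0"}],
}

def run_pipeline(spec: PipelineSpec) -> int:
    """Extract, transform, and load, driven entirely by the spec metadata."""
    rows = 0
    for record in SOURCES[spec.source]():
        spec.target.append(spec.transform(record))
        rows += 1
    return rows

sink: list = []
spec = PipelineSpec(
    name="orders_daily",
    source="orders_api",
    transform=lambda r: {**r, "amt": float(r["amt"])},  # cast amount to float
    target=sink,
)
print(run_pipeline(spec))  # → 2 rows loaded
```

In a production setup the same declaration could be rendered into an Airflow DAG per spec, with CloudWatch metrics and SNS failure alerts attached by the runner rather than by each pipeline.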
06 Jan 2020 - 31 Dec 2021