Transform Settlement Process Using an AWS Data Pipeline

The task involves processing settlement files from various sources using AWS data pipelines. These files may arrive as zip archives, Excel sheets, or database tables. The pipeline executes business logic and transforms the inputs (often Excel) into outputs (also in Excel).
The pipeline receives multiple settlement files from different locations. We streamlined this by connecting to AWS S3, storing the input files there, and triggering an ETL job pipeline to process them.
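As a minimal sketch of this flow, the snippet below uploads a settlement file to S3 and starts a Glue ETL job with boto3. The bucket name, key prefix, and job name (`settlement-input-bucket`, `incoming/`, `settlement-transform-job`) are hypothetical placeholders for illustration, not the actual resources in our account.

```python
import boto3

s3 = boto3.client("s3")
glue = boto3.client("glue")

# Hypothetical resource names used only for illustration.
BUCKET = "settlement-input-bucket"
KEY = "incoming/settlement_2025_04.zip"
JOB_NAME = "settlement-transform-job"

# Drop the raw settlement file into the input layer of S3.
s3.upload_file("settlement_2025_04.zip", BUCKET, KEY)

# Kick off the Glue ETL job, passing the file location as a job argument.
run = glue.start_job_run(
    JobName=JOB_NAME,
    Arguments={"--input_path": f"s3://{BUCKET}/{KEY}"},
)
print("Started Glue job run:", run["JobRunId"])
```

In the actual pipeline the job is started by an event rule rather than by hand, as described further below.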

Our inputs typically come from a variety of sources: existing AWS tables as well as external files in Excel format. These diverse inputs are ultimately converted to Parquet. This documentation outlines the process, and I would like to share the AWS data pipeline ETL job architecture so that it can be replicated.
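As a rough illustration of the Excel-to-Parquet step, the sketch below reads an Excel input from S3 and writes it back as Parquet using awswrangler (the AWS SDK for pandas). The paths are hypothetical, and the snippet assumes awswrangler and the openpyxl engine are available in the job environment.

```python
import awswrangler as wr

# Hypothetical locations; the real pipeline uses its own S3 layers.
EXCEL_PATH = "s3://settlement-input-bucket/extracted/settlements.xlsx"
PARQUET_PATH = "s3://settlement-curated-bucket/settlements/"

# Read the external Excel input from S3 (pandas + openpyxl under the hood).
df = wr.s3.read_excel(EXCEL_PATH)

# Normalize column names so downstream Glue/Athena queries are predictable.
df.columns = [str(c).strip().lower().replace(" ", "_") for c in df.columns]

# Persist the curated output as Parquet; dataset=True writes a folder-style dataset.
wr.s3.to_parquet(df=df, path=PARQUET_PATH, dataset=True, mode="overwrite")
```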

*AWS Architecture:* An overview of the AWS architecture, including components such as S3, DynamoDB, EventBridge rules, Lambda functions, ETL jobs, and AWS Glue. It covers how files are processed through the different layers of S3 and the role of Lambda functions in unzipping and converting files.
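To make the unzipping role of Lambda concrete, here is a minimal handler sketch that reacts to an object landing in a "raw" prefix, extracts the archive in memory, and writes each member to an "extracted" prefix. The prefixes and event shape (an S3 notification) are assumptions for illustration; the real function and layer names may differ.

```python
import io
import zipfile
from urllib.parse import unquote_plus

import boto3

s3 = boto3.client("s3")

# Hypothetical S3 layers used for illustration.
RAW_PREFIX = "raw/"
EXTRACTED_PREFIX = "extracted/"


def handler(event, context):
    """Unzip an archive dropped in the raw layer into the extracted layer."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        if not key.startswith(RAW_PREFIX) or not key.endswith(".zip"):
            continue

        # Read the whole archive into memory (fine for modest file sizes).
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        with zipfile.ZipFile(io.BytesIO(body)) as archive:
            for member in archive.namelist():
                if member.endswith("/"):
                    continue  # skip directory entries
                s3.put_object(
                    Bucket=bucket,
                    Key=f"{EXTRACTED_PREFIX}{member}",
                    Body=archive.read(member),
                )
```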


*Event Rules and Domains:* The setup of event rules in AWS EventBridge, which trigger ETL processes when files are dropped in specific S3 folders. This section also covers the concepts of domains and tenants in AWS and how they are used to create the architectural diagrams.
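As a hedged example of such an event rule, the boto3 sketch below creates an EventBridge rule that matches "Object Created" events for a specific bucket and key prefix and routes them to a Lambda target. The rule name, bucket, prefix, and Lambda ARN are placeholders, and the bucket is assumed to have EventBridge notifications enabled.

```python
import json

import boto3

events = boto3.client("events")

# Hypothetical names/ARNs for illustration only.
RULE_NAME = "settlement-file-dropped"
BUCKET = "settlement-input-bucket"
PREFIX = "incoming/"
TARGET_LAMBDA_ARN = "arn:aws:lambda:us-east-1:123456789012:function:unzip-settlements"

# Match S3 "Object Created" events for the incoming/ folder of the bucket.
# Requires EventBridge notifications to be enabled on the bucket.
event_pattern = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created"],
    "detail": {
        "bucket": {"name": [BUCKET]},
        "object": {"key": [{"prefix": PREFIX}]},
    },
}

events.put_rule(
    Name=RULE_NAME,
    EventPattern=json.dumps(event_pattern),
    State="ENABLED",
)

# Route matching events to the Lambda (or a Glue workflow) that starts the ETL.
events.put_targets(
    Rule=RULE_NAME,
    Targets=[{"Id": "unzip-settlements", "Arn": TARGET_LAMBDA_ARN}],
)
```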