Amazon S3 Object Lambda

What is S3 Object Lambda?

S3 Object Lambda is an Amazon S3 feature that lets you run your own code against data retrieved from S3 before it is returned to your application. An AWS Lambda function you supply runs automatically each time an object is fetched through an Object Lambda access point, so you can serve different views of the same data, and change how it is processed at any time, without modifying the requesting application or storing derivative copies of the object.

How to set up S3 Object Lambda

Step 1: Create an access point for the S3 bucket

  1. Go to your S3 bucket and open the Access Points tab.
  2. Create a bucket access point.
  3. Name your access point and select ‘Internet’ as the network origin (a boto3 equivalent is sketched after this list).
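
If you prefer to script this step, the same bucket access point can be created with boto3's s3control client. A minimal sketch, assuming the account ID, access point name, and bucket name below are replaced with your own values:

import boto3

# Access points are managed through the s3control API, not the s3 client
s3control = boto3.client('s3control')

s3control.create_access_point(
    AccountId='123456789012',    # placeholder: your AWS account ID
    Name='my-bucket-ap',         # placeholder: access point name
    Bucket='your_bucket_name')   # the bucket the access point fronts
# Omitting VpcConfiguration gives the 'Internet' network origin
# chosen in the console step above.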

Step 2: Create the Lambda function

import boto3
import requests
import pandas as pd

def lambda_handler(event, context):
    print(event)
    object_get_context = event["getObjectContext"]
    request_route = object_get_context["outputRoute"]
    request_token = object_get_context["outputToken"]
    s3_url = object_get_context["inputS3Url"]

    # Get the original object from S3 via the presigned URL in the event
    response = requests.get(s3_url)
    original_object = response.content.decode('utf-8')

    # Transform object: parse the CSV and drop the password column
    columns = original_object.split('\r\n')[0].replace('"', '').replace('\ufeff', '').split(',')
    user_data = []
    for data in original_object.split('\r\n')[1:-1]:
        user_data.append(data.replace('"', '').split(','))
    user_df = pd.DataFrame(user_data, columns=columns)
    user_df.drop(columns=['password'], inplace=True)

    # Write the transformed object back to S3 Object Lambda
    s3 = boto3.client('s3')
    s3.write_get_object_response(
        Body=user_df.to_csv(index=False).encode('utf-8'),
        RequestRoute=request_route,
        RequestToken=request_token)

    return {'status_code': 200}
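
Note that requests and pandas are not included in the default Lambda Python runtime, so bundle them into the deployment package or attach them as a layer. The function's execution role also needs permission to call s3-object-lambda:WriteGetObjectResponse; the AWS managed policy AmazonS3ObjectLambdaExecutionRolePolicy grants exactly this.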

Step 3: Create an Object Lambda access point from the left-hand menu of S3

  1. Click ‘Create Object Lambda Access Point’.
  2. Select the bucket access point from Step 1 as the supporting access point.
  3. Select GetObject as the transformation.
  4. Select the Lambda function you created in Step 2.
  5. Leave the other settings at their defaults (a scripted version of this step is sketched below).
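
As with Step 1, this can be scripted instead of done in the console. A minimal boto3 sketch, assuming the placeholder account ID, region, and ARNs are replaced with your own:

import boto3

s3control = boto3.client('s3control')

s3control.create_access_point_for_object_lambda(
    AccountId='123456789012',       # placeholder: your AWS account ID
    Name='my-object-lambda-ap',     # placeholder: Object Lambda access point name
    Configuration={
        # ARN of the bucket access point from Step 1 (placeholder)
        'SupportingAccessPoint': 'arn:aws:s3:us-east-1:123456789012:accesspoint/my-bucket-ap',
        'TransformationConfigurations': [{
            'Actions': ['GetObject'],
            'ContentTransformation': {'AwsLambda': {
                # ARN of the function from Step 2 (placeholder)
                'FunctionArn': 'arn:aws:lambda:us-east-1:123456789012:function:remove-password-column'}}}]})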

Step 4: Get the object using boto3

CSV data: copy the rows below into a file named user_daily_data.csv and upload it to your S3 bucket.

id,email,username,password,name__firstname,name__lastname,phone,__v,address__geolocation__lat,address__geolocation__long,address__city,address__street,address__number,address__zipcode
1,sarahreyes@example.org,cameronrobert,&RnzRNczN6,Tracy,Doyle,9-087-607-2043,0,-81.123854,-158.066853,Thompsonland,166 Hammond Stravenue,8812,56265
2,amandawallace@example.net,lambertfranklin,$BQGXjt49y,Christopher,Hansen,6-507-543-5500,0,-56.4904755,-47.427614,Pageburgh,716 Leonard Haven Suite 277,8776,42149
import boto3

s3 = boto3.client('s3')

print('Original object from the S3 bucket:')
original = s3.get_object(
    Bucket='your_bucket_name',
    Key='user_daily_data.csv')
data_str = original['Body'].read().decode('utf-8')
print(data_str)

print('Object processed by S3 Object Lambda:')
# To route the request through the transformation, pass the Object Lambda
# access point ARN (or its alias) as the Bucket parameter.
transformed = s3.get_object(
    Bucket='arn:aws:s3-object-lambda:us-east-1:123456789012:accesspoint/my-object-lambda-ap',
    Key='user_daily_data.csv')
data_str = transformed['Body'].read().decode('utf-8')
print(data_str)
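
If everything is wired up correctly, the second printout shows the same rows as the first but with the password column removed, while the object stored in the bucket remains untouched.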

Author Bio:

Yashupadhyaya is a Certified AWS Data Engineer with over 2.5 years of experience architecting and optimizing data pipelines on AWS and Azure. Specializing in cloud-based solutions, he is proficient in technologies such as Python, PySpark, Airflow, dbt, and Flask, and has a proven track record of building scalable analytics platforms and streamlining data ingestion processes. With a passion for driving data-driven decision-making, he continues to innovate in the realm of multi-cloud environments and open-source tools.