Zipping Large Size S3 Folders and Files Using Node.js Lambda And EFS
2024-07-04

By Hyuntaek Park

Senior full-stack engineer at Twigfarm


AWS S3 is a very convenient cloud storage service. You can upload and download stored files in various ways: with the AWS CLI, the SDK, the API, and so on. But can you download an entire folder with its sub-folders and files, recursively? Notably, S3 does not provide such a feature. We need to develop our own way to recursively zip a folder and make the zip file available for download.


Requirements

Our goal is to zip entire folders, sub-folders, and the files under them in our S3 bucket while preserving the folder structure. The files can be large (> 512 MB, which is the default size of Lambda's temporary storage).


How files are stored in S3

We have created folders and stored files in our S3 bucket as follows.

[Image: folder and file structure in the S3 console]

However, to be precise, these are not folders in S3. There are just four files with the following keys:

  • folder1/sub1/image.png
  • folder1/sub2/test.txt
  • folder2/large.mov
  • folder2/test2.pdf


Solution

Remember that S3 does not have a concept of folders; instead, the key of each file carries the folder information as a prefix. Each folder level is delimited by '/', followed by the file name (e.g., folder1/sub1/image.png).

Using the folder information in the key prefix, we can create the corresponding folders in EFS and then download each file from S3 into them.
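For example, a minimal sketch in Node.js (the key is one of the four above; /mnt/efs is the EFS mount path we set up later):

```javascript
const fs = require('fs');
const path = require('path');

// The S3 key carries the folder structure as its prefix.
const key = 'folder1/sub1/image.png';

// Mirror that structure under the EFS mount point.
const localPath = path.join('/mnt/efs', key); // /mnt/efs/folder1/sub1/image.png
fs.mkdirSync(path.dirname(localPath), { recursive: true }); // creates folder1/sub1 as real folders
```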

Then Lambda simply does the zipping and uploads the zip file back to S3. The following diagram shows the sequence of our implementation and how files are laid out in S3 and EFS.

[Diagram: sequence of copying from S3 to EFS, zipping, and uploading back to S3]

One thing to keep in mind is that our Lambda and the EFS must be in the same VPC.


Create EFS (Elastic File System) and access point

There are a couple of reasons why Amazon EFS comes in handy.

  • EFS is just like a Linux file system. You can use file commands such as mkdir, ls, cp, rm, etc.
  • Lambda's temporary storage could have served the same purpose, but it has a size limit of 512 MB by default.

Let's create an EFS. Go to Elastic file system in the AWS console and click Create file system.

[Image: Create file system dialog]

Then click Create.

Now it is time to create an access point, which will be used by the Lambda function later. Choose the file system we just created, then click Access points —> Create access point.

Here are the input values you should enter:

  • Root directory path: /efs
  • POSIX user
    • User ID: 1000
    • Group ID: 1000
  • Root directory creation permissions
    • Owner user ID: 1000
    • Owner group ID: 1000
    • POSIX permissions to apply to the root directory path: 0777

[Image: access point settings]
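If you prefer the AWS CLI, the same access point can be created in one command (the file system ID here is hypothetical):

```bash
aws efs create-access-point \
  --file-system-id fs-0123456789abcdef0 \
  --posix-user Uid=1000,Gid=1000 \
  --root-directory 'Path=/efs,CreationInfo={OwnerUid=1000,OwnerGid=1000,Permissions=0777}'
```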


Create and configure EFS attached Lambda function

Let's create a Node.js Lambda function as follows:

[Images: Create function settings]

Once the Lambda function is created, click Configuration —> File systems —> Add file system.

[Image: Add file system configuration]

Choose the EFS access point that we have just created, and enter /mnt/efs for the local mount path. This is important because /mnt/efs will be your EFS folder.

Click Save. Now you have access to /mnt/efs from the Lambda function.
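The same attachment can be done with the AWS CLI (the function name and access point ARN are hypothetical):

```bash
aws lambda update-function-configuration \
  --function-name zip-s3-folder \
  --file-system-configs Arn=arn:aws:elasticfilesystem:us-east-1:123456789012:access-point/fsap-0123456789abcdef0,LocalMountPath=/mnt/efs
```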


Access to S3 from Lambda

VPC Endpoints

According to https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints.html,

A VPC endpoint enables connections between a virtual private cloud (VPC) and supported services, without requiring that you use an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.

To access S3 buckets from Lambdas inside a VPC, we need to set up a VPC endpoint for S3. Go to VPC and click Endpoints —> Create endpoint, then select the inputs as follows:

[Images: Create endpoint settings for the S3 gateway endpoint]

Then click Create endpoint. The Lambdas within the VPC can reach S3 now, but one more step is required to actually access a specific S3 bucket.
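For reference, the equivalent AWS CLI call for a gateway endpoint looks like this (the VPC ID, region, and route table ID are hypothetical):

```bash
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-0123456789abcdef0
```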


Lambda role

A Lambda role was created while we created our Lambda function. You can use an existing role for the Lambda; here we just use the newly created one. Go to our Lambda function, click Configuration —> Permissions, and then choose the role under Execution role.

[Image: Execution role under the Lambda Permissions tab]

Then go to Permissions policies —> click Add permissions —> Create inline policy. On the next screen, choose the JSON tab, then copy and paste the following, replacing YOUR_BUCKET_NAME with your own bucket name.
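A minimal policy granting the list, read, and write access this workflow needs might look like the following (the exact action list is an assumption):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*"
    }
  ]
}
```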

Click Review policy, enter a policy name you like, and then click Create policy.


More Lambda configurations

Since zipping takes time and file sizes can run to hundreds of megabytes, Lambda's default memory size (128 MB) and timeout (3 seconds) are not enough. For this task, the memory size and timeout are set to 4096 MB and 2 minutes under Configuration —> General configuration.
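You can also apply both settings from the AWS CLI (the function name is hypothetical):

```bash
aws lambda update-function-configuration \
  --function-name zip-s3-folder \
  --memory-size 4096 \
  --timeout 120
```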


Lambda code

Here's the final Lambda code. The code implements what we have described:

  1. Copies folders and files from S3 to EFS.
  2. Zips the stored files in EFS.
  3. Uploads the zip file back to S3.
  4. Deletes the temporary EFS files.

I hope the code is self-explanatory. Just one thing to mention: we used an open-source Node.js package called archiver for zipping folders and files. There are many ways to zip files in Node.js; you can choose whatever suits you best.

Strictly speaking, there should be try/catch blocks to deal with error cases, but we omit them here for simplicity.
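Here is a minimal sketch of such a handler, assuming the Node.js 18+ runtime (AWS SDK v3) and the archiver package bundled with the deployment. The bucket name and zip file name are placeholders, and pagination for buckets with more than 1,000 objects is omitted:

```javascript
const { S3Client, ListObjectsV2Command, GetObjectCommand, PutObjectCommand } = require('@aws-sdk/client-s3');
const archiver = require('archiver');
const fs = require('fs');
const path = require('path');
const { pipeline } = require('stream/promises');

const s3 = new S3Client({});
const BUCKET = 'YOUR_BUCKET_NAME'; // replace with your own bucket
const EFS_ROOT = '/mnt/efs';       // the local mount path we configured

exports.handler = async () => {
  // 1. Copy folders and files from S3 to EFS, mirroring key prefixes as folders.
  const listed = await s3.send(new ListObjectsV2Command({ Bucket: BUCKET }));
  const keys = (listed.Contents ?? [])
    .map((o) => o.Key)
    .filter((k) => !k.endsWith('/')); // skip folder-marker objects

  for (const key of keys) {
    const localPath = path.join(EFS_ROOT, key);
    fs.mkdirSync(path.dirname(localPath), { recursive: true });
    const obj = await s3.send(new GetObjectCommand({ Bucket: BUCKET, Key: key }));
    await pipeline(obj.Body, fs.createWriteStream(localPath));
  }

  // 2. Zip the stored files in EFS, keeping the folder tree via each entry name.
  const zipPath = path.join(EFS_ROOT, 'my-archive.zip');
  const output = fs.createWriteStream(zipPath);
  const archive = archiver('zip', { zlib: { level: 9 } });
  const closed = new Promise((resolve, reject) => {
    output.on('close', resolve);
    archive.on('error', reject);
  });
  archive.pipe(output);
  for (const key of keys) {
    archive.file(path.join(EFS_ROOT, key), { name: key });
  }
  await archive.finalize();
  await closed;

  // 3. Upload the zip file back to S3.
  await s3.send(new PutObjectCommand({
    Bucket: BUCKET,
    Key: 'my-archive.zip',
    Body: fs.createReadStream(zipPath),
    ContentLength: fs.statSync(zipPath).size,
  }));

  // 4. Delete the temporary EFS files (entries only; the mount point itself stays).
  for (const entry of fs.readdirSync(EFS_ROOT)) {
    fs.rmSync(path.join(EFS_ROOT, entry), { recursive: true, force: true });
  }

  return { statusCode: 200, body: 'my-archive.zip created' };
};
```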


Results

Let's go check our S3 bucket.

[Image: S3 bucket listing showing my-archive.zip]

As you can see, there is a new zip file called my-archive.zip. Let's click the file name, download the file, and unzip it.

[Image: unzipped folder structure]

The folder and file structure is exactly the same as the one at the top of this article.

We had many steps to follow to achieve this simple requirement of zipping folders and files in S3, but they are pretty standard when you have to deal with AWS:

  • Create and launch AWS services
  • Give appropriate permissions
  • Execute the logic

It took a while for me to get used to it! :)


Thanks for reading.
