Multi Stage Docker Build - smaller is better



40 days of K8s - CKA challenge (03/40)


@piyushsachdeva

Day 3/40 - Multi Stage Docker Build - Docker Tutorial For Beginners - CKA Full Course 2024

In the previous lesson I created a Docker image that is 1.2 GB.

This lesson discusses how to make an image smaller.
The chapter focuses on the main methodology:
let me list some methods considered BEST PRACTICES
for creating as small an image as possible.

But first, let's list the main reasons why LARGE images are not recommended.

There are several reasons why we want to make images smaller:

(1) Faster Deployment

* Smaller images pull and push faster
* Reduced network bandwidth usage
* Quicker container startup times

(2) Security Benefits

* Smaller attack surface
* Fewer potentially vulnerable packages
* Less code to scan and maintain

(3) Resource Efficiency

* Less disk space usage in registries
* Reduced memory footprint
* Better cache utilization

(4) Cost Savings

* Lower storage costs in container registries
* Reduced data transfer costs
* More efficient use of cloud resources

(5) Better CI/CD Performance

* Faster build pipelines
* Quicker rollbacks if needed
* Improved deployment reliability

There are several methods to make a Docker image smaller, and the most recommended one is the MULTI-STAGE BUILD.

Let's list and explain the key techniques for slimming Docker images:

(1) Start from a Minimal Base Image ('-alpine' or '-slim')

When choosing the base image of a known application, an alpine variant is usually available. When you build your own image from the ground up, use alpine as the base image. Alpine is a slim Linux operating system, so when an image is created from alpine, you usually install only what is necessary for the application.

There is also the scratch image, which is a completely empty base image: it contains no operating system at all, so the container can only run what you copy into it (typically statically compiled binaries). Using scratch is a whole different approach, and I will not write about it here.

**Sometimes official images don't use '-alpine' but '-slim' in the image tag name.**
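As a quick illustration (the image tags below are common examples, not taken from the lesson), the only change needed is the FROM line:

```
# Full Debian-based image - several hundred MB before the app is even added
FROM node:20

# Alpine variant of the same runtime - a few dozen MB
# FROM node:20-alpine

# Slim variant - Debian-based but stripped of most extras
# FROM node:20-slim
```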

(2) Remove Unnecessary Packages/Files

After executing the image logic, if some files are no longer necessary, we can remove the files that were only used during execution. **It is important to remove the files in the same *RUN* layer, because once a layer has been committed, storage cleared in a later layer will not reduce the image size.**

RUN apt-get update && apt-get install -y \
    some-package \
    && rm -rf /var/lib/apt/lists/*

After installing the package we can manually remove the files that are
not needed.
The && runs a command only after the previous command succeeded,
and everything is done in the same RUN layer.
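For contrast, a sketch of the anti-pattern (hypothetical snippet, not from the lesson): splitting the cleanup into its own RUN creates a new layer, so the apt lists remain baked into the earlier layer and the image does not get smaller:

```
# Anti-pattern: cleanup in a separate layer
RUN apt-get update && apt-get install -y some-package
# This only hides the files in a later layer - the earlier layer still stores them
RUN rm -rf /var/lib/apt/lists/*
```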


(3) Remove Cache on Layers

When adding an ARG to the Dockerfile, we can pass a dynamic value for that argument at build time. This prevents caching of the layers that follow it, which makes the build longer, but also keeps stale cached data out of the image:

FROM ....
ARG CACHEBUST=1   # default value
RUN ...


and then when executing the build:
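A typical invocation passes a changing value, for example the current timestamp (the image name here is just a placeholder):

```
docker build --build-arg CACHEBUST=$(date +%s) -t myimage:latest .
```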


(4) Remove Cache on Build

When building with --no-cache, we skip the layer cache entirely, just like with the build ARG above.
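The flag is simply added to the build command (the image name is again a placeholder):

```
docker build --no-cache -t myimage:latest .
```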


(5) Multi-Stage Builds

When we use multi-stage builds, we actually use the files produced in one stage without saving that stage into the next one. We use the result of the first stage's execution as a source in a new, clean stage, without all of the installations that were needed to create those files. It is similar to removing files after execution (method 2); however, with this method we are creating a hierarchical flow of the build.


In my multi-stage Dockerfile, I use the first node image (installer) to install all of the React packages, and then I add the files used in the build of the site. The command:

```
RUN yarn && yarn build
```

installs all the packages and then executes the build.

Then, in the second image (deployer), I install the NGINX HTTP server - a slim server, much slimmer than node, since I don't need any of node's code-compilation capabilities. Then I copy the build files from the node stage - the only files I need to serve the site - from /app/dist ==> to ==> /usr/share/nginx/html in the deployer image, and I expose its port (80) for outside connections (Docker accesses the container on port 80).
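Putting the steps above together, a minimal sketch of what such a Dockerfile can look like (stage names, base-image tags and paths follow the description above; the actual file from the lesson may differ):

```
# Stage 1: installer - node image with the full build toolchain
FROM node:18-alpine AS installer
WORKDIR /app
COPY . .
RUN yarn && yarn build

# Stage 2: deployer - slim NGINX image that only serves the built files
FROM nginx:alpine AS deployer
COPY --from=installer /app/dist /usr/share/nginx/html
EXPOSE 80
```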

Results:

The original image (single stage): 1.2 GB
The multi-stage image: 280 MB
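For reference, the build command for the multi-stage image looks roughly like this (the tag matches the run command below; the Dockerfile is assumed to be in the current directory):

```
docker build -t multi-stage-4_in_a_row:1.1 .
docker images | grep 4_in_a_row
```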


docker run -d --name 4inarow-multi -p 8080:80 multi-stage-4_in_a_row:1.1


docker ps 
CONTAINER ID   IMAGE                        COMMAND                  CREATED          STATUS          PORTS                  NAMES
2396e23aa874   multi-stage-4_in_a_row:1.1   "/docker-entrypoint.…"   21 seconds ago   Up 21 seconds   0.0.0.0:8080->80/tcp   4inarow-multi
