Dockerfile Explained: How to Build Your First Container Image

Learn what a Dockerfile is, how each instruction works, and how to write a clean Dockerfile that builds a small, fast, and reproducible container image you can ship anywhere.

If you have read "What is Docker?", you know that images are the blueprints and containers are the running instances. The thing that defines an image, the recipe Docker reads to assemble it, is a Dockerfile. Learning to read and write one well is the single most useful Docker skill, because nearly every containerized project you touch from here on will have one.

This guide walks through every instruction you will use 95% of the time, shows you a real production-grade Node.js Dockerfile, and explains the patterns (multi-stage builds, layer caching, slim base images) that separate a 1.2 GB beginner image from a 90 MB professional one. By the end you will be able to write a Dockerfile for any small app from scratch.

What a Dockerfile Actually Is

A Dockerfile is a plain text file (no extension, just Dockerfile), usually sitting at the root of your project. Each line is an instruction (FROM, COPY, RUN, CMD, and so on) that builds up the image one layer at a time.

When you run docker build -t myapp ., the Docker daemon reads the Dockerfile top to bottom, runs each instruction, caches the resulting filesystem snapshot, and stacks the layers into a final image. Each layer is content-addressed and immutable — change one instruction near the top and every layer below it has to rebuild. Change one near the bottom and Docker reuses the cached layers above. This is why instruction order matters so much for build speed.

The recommended builder in 2026 is BuildKit (the default since Docker Engine 23.0), which adds parallel execution of independent build stages, secret mounts, cache mounts, and multi-platform builds. You get all of this just by using a recent Docker release.
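
As a rough sketch of what those BuildKit features look like in practice, the fragment below opts into the current Dockerfile syntax and uses a cache mount so npm's download cache survives between builds. It assumes a Node.js project whose build runs as root (hence the /root/.npm cache path); adjust both for your own stack.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
# Cache mount: /root/.npm persists across builds but is not stored in the image.
RUN --mount=type=cache,target=/root/.npm npm ci --omit=dev
```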

The Instructions You Actually Need

Out of the dozen+ Dockerfile instructions, you will use these eight in almost every file:

  • FROM — the base image you start from. Always the first line. FROM node:22-alpine says "start from a tiny Alpine Linux image with Node 22 already installed."
  • WORKDIR — set the working directory inside the image. All subsequent commands run from here. WORKDIR /app.
  • COPY — copy files from your project into the image. COPY package*.json ./ brings just the package files in (we will see why ordering matters in a moment).
  • RUN — execute a shell command at build time. Used to install dependencies. RUN npm ci --omit=dev.
  • ENV — set environment variables. ENV NODE_ENV=production.
  • EXPOSE — document which port the container listens on. EXPOSE 3000. (Documentation only — -p at run time is what actually publishes the port.)
  • USER — switch to a non-root user. USER node.
  • CMD — the default command run when the container starts. CMD ["node", "server.js"].

There are also ENTRYPOINT (fixes the executable), ARG (build-time variables), VOLUME, LABEL, and HEALTHCHECK, which are useful but not always needed in a first Dockerfile.
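
For a feel of how some of these extra instructions fit together, here is a minimal sketch that combines ARG, LABEL, and HEALTHCHECK. The APP_VERSION argument name and the /health endpoint are assumptions made up for this illustration; your app needs to expose such an endpoint itself.

```dockerfile
FROM node:22-alpine
# ARG exists only at build time; override with --build-arg APP_VERSION=1.2.3
ARG APP_VERSION=dev
# LABEL attaches metadata that shows up in docker inspect
LABEL org.opencontainers.image.version="${APP_VERSION}"
WORKDIR /app
COPY . .
# Assumes the app serves a hypothetical /health endpoint on port 3000
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD wget -qO- http://127.0.0.1:3000/health || exit 1
CMD ["node", "server.js"]
```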

A Minimal, Solid Dockerfile

Let us write a real Dockerfile for a small Node.js API:

FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
ENV NODE_ENV=production
EXPOSE 3000
USER node
CMD ["node", "server.js"]

Read it top to bottom: start from a 50 MB Node 22 base, set the working directory, copy just package.json and package-lock.json first, install production dependencies, then copy the rest of the source. Set the environment, document the port, drop privileges to the non-root node user that the base image provides, and define the start command.

That is enough for a first production-quality image. Build it with docker build -t myapi:1.0 . and run it with docker run -p 3000:3000 myapi:1.0.

Why Layer Order Matters (Caching)

Why copy package*.json before the rest of the source? Layer caching.

Every time Docker rebuilds, it walks each instruction and asks "did anything that affects this layer change?" If not, it reuses the cached layer. The moment something changes, every instruction below has to re-run.

npm ci is the slow part of the build. By copying only the package manifests first, the RUN npm ci layer is cached as long as your dependencies do not change. Edit server.js, rebuild, and Docker skips straight to the COPY . . layer. Without that ordering trick, every code change triggers a full reinstall — turning 5-second rebuilds into 90-second ones.

Apply the same principle to any language: copy the dependency files first, install, then copy the source.
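
To illustrate that the ordering carries over, here is a sketch of the same pattern for a small Python app. It assumes dependencies are listed in requirements.txt and that the entry point is a file named app.py; both names are placeholders.

```dockerfile
FROM python:3.13-slim
WORKDIR /app
# Dependency manifest first: this layer and the install below stay cached
# until requirements.txt itself changes.
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
# Source last: editing app code only invalidates the layers from here down.
COPY . .
CMD ["python", "app.py"]
```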

Multi-Stage Builds: How Pros Get Tiny Images

The default node image is huge because it includes a compiler, a package manager, and system tools, most of which you do not need at runtime. Multi-stage builds let you use a fully equipped stage to build, then copy just the artefacts into a lean final image:

# Stage 1: build
FROM node:22-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
 
# Stage 2: runtime
FROM node:22-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY --from=build /app/dist ./dist
COPY --from=build /app/package*.json ./
RUN npm ci --omit=dev
USER node
CMD ["node", "dist/server.js"]

The final image contains only the compiled dist/ output and runtime dependencies, with no source files or dev dependencies. Depending on the project, it is often 5-10x smaller than a single-stage build on a full base image. The same pattern works well for Go (with a FROM scratch runtime stage), Rust, Java, and Python.
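
As one possible version of the Go variant mentioned above, the sketch below compiles a static binary in a Go stage and copies it into an empty scratch image. It assumes the module has go.mod and go.sum at the project root and a main package in the same directory; the /out/app path is an arbitrary choice.

```dockerfile
# Stage 1: compile a static binary
FROM golang:1.23-alpine AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 yields a statically linked binary that runs without libc
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: empty base image containing only the binary
FROM scratch
COPY --from=build /out/app /app
# Numeric UID/GID because scratch has no /etc/passwd
USER 65534:65534
ENTRYPOINT ["/app"]
```

Note that scratch contains nothing at all, so a program that makes outbound TLS calls would also need CA certificates copied in from the build stage.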

.dockerignore: The File Everyone Forgets

Right next to your Dockerfile, create a .dockerignore:

node_modules
.git
.env
.env.*
*.log
dist
.DS_Store

Without it, COPY . . slurps in your local node_modules (slow, often broken), your .git history (huge), and your .env files (a real security risk). Add it from day one — it speeds builds dramatically and prevents leaking secrets into images.

Common Mistakes Beginners Make

  • Using latest as a tag. FROM node:latest means tomorrow's build can be different from today's. Always pin: FROM node:22-alpine or even FROM node:22.11.0-alpine.
  • Copying source before installing deps. Breaks layer caching, makes every rebuild slow.
  • Running as root. The base image probably defines a node or nobody user — switch to it with USER.
  • No .dockerignore. Your image ends up with node_modules, .git, and possibly .env. Slow and unsafe.
  • Putting secrets in ENV. ENV values are visible in docker history. Use BuildKit secret mounts (RUN --mount=type=secret,id=...) or runtime env injection.
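
To make the secret-mount advice concrete, here is a minimal sketch that gives npm access to a private-registry credential file during install without baking it into a layer. The secret id npmrc is an arbitrary name chosen for this example, and the sketch assumes your credentials live in an .npmrc file on the host.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
# The secret is mounted as a file for this one RUN step only and is not
# written to any image layer.
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci --omit=dev
COPY . .
USER node
CMD ["node", "server.js"]
```

You would pass the file in at build time with something like docker build --secret id=npmrc,src=$HOME/.npmrc -t myapi:1.0 .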

Quick Reference

  • Build: docker build -t myapp:1.0 .
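  • Build for amd64 and arm64 in one command: docker buildx build --platform linux/amd64,linux/arm64 -t myapp . (usually combined with --push to a registry, since the classic local image store cannot hold a multi-platform image)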
  • Show layers + sizes: docker history myapp:1.0
  • Find what is blowing up image size: dive myapp:1.0
  • Skinny base images: node:22-alpine, python:3.13-slim, golang:1.23-alpine, eclipse-temurin:21-jre-alpine.
  • Distroless / scratch: minimal runtime images for compiled binaries.
  • BuildKit cache mount (faster npm ci): RUN --mount=type=cache,target=/root/.npm npm ci.

Key Insights

  • A Dockerfile is a small text recipe; each instruction creates a cached image layer.
  • Order matters — copy dependency manifests first so the install step caches between code changes.
  • Multi-stage builds dramatically reduce final image size by leaving build tools behind.
  • Always pin base image versions, add a .dockerignore, and switch to a non-root USER.
  • Use BuildKit features (cache mounts, secret mounts) for faster, safer builds.

Frequently Asked Questions

Why is my image so big?

Usually one of: heavy base image (`ubuntu` instead of `alpine`/`slim`), no multi-stage build, copying `node_modules` from the host, or leaving build tools in the runtime stage.

Alpine vs Debian-slim?

Alpine is smaller and faster to pull, but it uses musl libc, and some Node native modules and Python wheels do not work cleanly on it. Debian-slim is a safe default if you hit weird crashes.
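
If you do hit musl trouble, switching is usually a one-line change. A sketch of the earlier single-stage Node.js Dockerfile on a Debian-slim base, assuming nothing else in it relies on Alpine-specific tooling:

```dockerfile
# Same layout as the Alpine version; only the base image changes.
FROM node:22-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
ENV NODE_ENV=production
USER node
CMD ["node", "server.js"]
```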

ENTRYPOINT vs CMD?

`ENTRYPOINT` defines the executable; `CMD` defines its default arguments. Use `CMD` alone for simple apps. Use `ENTRYPOINT` when you want users to pass arguments to your container as if it were the binary.
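
A small sketch of the two working together, assuming a hypothetical command-line script called `cli.js`:

```dockerfile
FROM node:22-alpine
WORKDIR /app
COPY . .
USER node
# ENTRYPOINT fixes the executable; cli.js is a hypothetical script name
ENTRYPOINT ["node", "cli.js"]
# CMD supplies default arguments, replaced by anything after the image name
CMD ["--help"]
```

With an image built from this as `mytool`, `docker run mytool` runs `node cli.js --help`, while `docker run mytool --version` runs `node cli.js --version`.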

Should I `RUN apt-get update && apt-get install` in one line?

Yes — separate `RUN` instructions create separate cache layers, and an old `apt-get update` cached layer means stale package lists. Always combine.
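
A typical combined form looks roughly like this; the packages installed (`ca-certificates`, `curl`) are just placeholders for whatever your image needs:

```dockerfile
FROM debian:bookworm-slim
# update and install share one layer, so the package lists are never stale;
# removing /var/lib/apt/lists in the same step keeps them out of the image
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && rm -rf /var/lib/apt/lists/*
```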

Can I have multiple Dockerfiles?

Yes — `docker build -f Dockerfile.prod .` builds with a specific file. Common pattern: one for dev (with hot reload), one for prod (multi-stage, optimised).
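
As an illustration, a development variant might look like the sketch below. The file name `Dockerfile.dev` and the `server.js` entry point are assumptions, and it relies on Node's built-in `--watch` flag rather than any particular reload tool.

```dockerfile
# Dockerfile.dev: development image with file watching
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
# Dev dependencies are included here, unlike the production build
RUN npm ci
COPY . .
ENV NODE_ENV=development
# node --watch restarts the process when imported files change
CMD ["node", "--watch", "server.js"]
```

Build it with `docker build -f Dockerfile.dev -t myapi:dev .`; for host edits to trigger a restart, you would also need to mount your source directory into the container at run time.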

Conclusion

A clean Dockerfile is one of those tiny artefacts that pays back every single day of a project's life. Pin your base image, order instructions for cache friendliness, multi-stage your builds, drop root privileges, and add a .dockerignore. Do those five things and you are already ahead of most production Dockerfiles you will see in the wild.