
Building Images

Complete guide to building DevOps Images locally, including benchmarks, optimisation strategies, and troubleshooting.


When to Build vs Pull

Pull if you need

  • Standard tooling: Official builds have everything most teams need
  • Fast setup: Pull in seconds vs build in minutes
  • Tested builds: CI/CD tested and scanned for vulnerabilities
  • Multi-arch support: Automatic architecture selection
  • Regular updates: Weekly rebuilds with latest security patches

Quick pull:

docker pull ghcr.io/jinalshah/devops/images/all-devops:latest

Build Locally

Build if you need

  • 🔧 Custom tools: Add proprietary or internal tools
  • 🔧 Specific versions: Pin tool versions for compliance
  • 🔧 Size optimisation: Remove unused tools
  • 🔧 Custom base: Different Linux distro or base image
  • 🔧 Air-gapped: No internet access for pulls

Prerequisites

System Requirements

| Requirement | Minimum | Recommended | Notes |
|---|---|---|---|
| Docker | 20.10+ | 24.0+ | BuildKit required |
| Disk space | 10 GB free | 20 GB free | Build cache + layers |
| RAM | 4 GB | 8 GB | Parallel builds benefit |
| CPU | 2 cores | 4+ cores | Faster builds |
| Network | 10 Mbps | 100 Mbps | Package downloads |
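
The Docker and disk-space minimums above can be checked before a long build starts. The following is a rough preflight sketch and not part of the repository: the helper names `version_ge`, `free_gb` and `preflight` are hypothetical, the thresholds are copied from the table, and `sort -V` is assumed to be available (GNU coreutils or a recent macOS).

```shell
#!/bin/bash
# preflight.sh (hypothetical helper, not shipped with the images)

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Prints the free space, in whole GB, of the filesystem that holds path $1.
free_gb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Compares the Docker server version and free disk space with the minimums.
# $1 is the Docker data directory (assumed default: /var/lib/docker).
preflight() {
  local docker_version
  docker_version="$(docker version --format '{{.Server.Version}}')" || return 1
  version_ge "$docker_version" "20.10" || {
    echo "Docker $docker_version is older than 20.10" >&2
    return 1
  }
  [ "$(free_gb "${1:-/var/lib/docker}")" -ge 10 ] || {
    echo "Less than 10 GB free" >&2
    return 1
  }
  echo "Preflight OK"
}
```

Calling `preflight` with no argument checks `/var/lib/docker`; on Docker Desktop the data lives inside a VM, so pass a path that reflects where your disk image is stored.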

Enable BuildKit

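On Docker Engine 23.0 and later, BuildKit is already the default builder; the steps below matter mainly for older versions.

Shell (per build or per user):

# Enable for single build
DOCKER_BUILDKIT=1 docker build .

# Enable permanently
echo 'export DOCKER_BUILDKIT=1' >> ~/.bashrc
source ~/.bashrc

Docker Desktop:

  1. Open Docker Desktop settings
  2. Go to "Docker Engine"
  3. Add to configuration:
    {
      "features": {
        "buildkit": true
      }
    }
  4. Click "Apply & Restart"

Verify:

docker buildx version
# Prints the buildx version when the BuildKit client is installed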

Quick Start

Build Single Image

docker build \
  --target all-devops \
  --tag all-devops:local \
  .

Build time: ~21 minutes (cold), ~3 minutes (warm)

docker build \
  --target aws-devops \
  --tag aws-devops:local \
  .

Build time: ~18 minutes (cold), ~2 minutes (warm)

docker build \
  --target gcp-devops \
  --tag gcp-devops:local \
  .

Build time: ~19 minutes (cold), ~2.5 minutes (warm)

Build All Images

#!/bin/bash
# build-all.sh

for target in all-devops aws-devops gcp-devops; do
  echo "Building $target..."
  docker build \
    --target "$target" \
    --tag "$target:local" \
    .
done

Total build time: ~60 minutes (cold), ~8 minutes (warm with cache)


Build Time Benchmarks

Cold Build (No Cache)

| Image | amd64 | arm64 | Notes |
|---|---|---|---|
| Base layer | 15 min | 17 min | Rocky Linux + tools |
| aws-devops | 18 min | 20 min | +AWS CLI installation |
| gcp-devops | 19 min | 21 min | +gcloud SDK (larger) |
| all-devops | 21 min | 23 min | +both cloud CLIs |

Warm Build (Cached Layers)

| Image | Build Time | Layers Rebuilt | Layers Cached |
|---|---|---|---|
| aws-devops | 2-3 min | AWS layer only | Base layer (15 min saved) |
| gcp-devops | 2.5-3.5 min | GCP layer only | Base layer (15 min saved) |
| all-devops | 3-4 min | Cloud layers only | Base layer (15 min saved) |

CI/CD Build Times

| Platform | Cold Build | Warm Build | Cache Strategy |
|---|---|---|---|
| GitHub Actions | 20-25 min | 4-6 min | Layer caching enabled |
| GitLab CI | 22-27 min | 5-7 min | Registry cache |
| Jenkins | 18-23 min | 3-5 min | Persistent volumes |
| Local (M2 Mac) | 15-20 min | 2-4 min | BuildKit cache |
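
To compare your own machine or CI runner against the figures above, each build can be wrapped in a timer. This is a small sketch; `timed` is a hypothetical helper name, and it measures wall-clock time only.

```shell
#!/bin/bash
# Runs the given command, then reports its wall-clock duration on stderr.
# The exit status of the wrapped command is passed through unchanged.
timed() {
  local start end elapsed status=0
  start="$(date +%s)"
  "$@" || status=$?
  end="$(date +%s)"
  elapsed=$(( end - start ))
  printf '%s took %dm %02ds\n' "$*" "$(( elapsed / 60 ))" "$(( elapsed % 60 ))" >&2
  return "$status"
}

# Example (run from the repository root):
#   timed docker build --target aws-devops --tag aws-devops:local .
```

Run the same command twice to see the cold and warm numbers for your setup; the second run reuses whatever layers the first one cached.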

Build Cache Strategies

Strategy 1: Layer Caching (Default)

Docker automatically caches unchanged layers:

# Layer 1: Base (rarely changes) - CACHED
FROM rockylinux:9

# Layer 2: System packages (monthly) - CACHED
RUN dnf install -y python3 nodejs

# Layer 3: IaC tools (weekly) - REBUILT
RUN install-terraform.sh

# Layer 4: Cloud CLIs (monthly) - REBUILT (every layer after a changed layer is rebuilt)
RUN install-aws-cli.sh

Optimise Layer Order

Place frequently changing layers at the end to maximise cache hits.

Strategy 2: BuildKit Cache Mount

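# Export the build cache to a local directory and import it on the next build.
# This is BuildKit cache export/import, not a RUN --mount=type=cache mount.
# Cache export needs a buildx builder that supports it (for example one created
# with the docker-container driver); the default docker driver may reject --cache-to.
docker buildx build \
  --target all-devops \
  --cache-from type=local,src=/tmp/buildkit-cache \
  --cache-to type=local,dest=/tmp/buildkit-cache \
  --load \
  -t all-devops:local .

Benefits:

  • Persistent cache across builds
  • Shared cache between projects
  • Faster package manager operations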

Strategy 3: Registry Cache

# Pull previous build as cache
docker pull ghcr.io/jinalshah/devops/images/all-devops:latest

# Build using registry cache
docker build \
  --target all-devops \
  --cache-from ghcr.io/jinalshah/devops/images/all-devops:latest \
  -t all-devops:local .

Benefits:

  • Works in CI/CD without local cache
  • Team shares cache via registry
  • Consistent across environments
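
In CI the cache image may not exist yet, for example on the first run or in a fork. The sketch below adds `--cache-from` only when the pull succeeds and otherwise falls back to a plain build. `build_with_registry_cache` is a hypothetical helper name. Note also that BuildKit can only reuse layers from an image that carries inline cache metadata; whether the published images include it is not stated in this guide.

```shell
#!/bin/bash
# Builds a target, using a registry image as cache only if it can be pulled.
# Usage: build_with_registry_cache <target> <cache-image> <tag>
build_with_registry_cache() {
  local target="$1" cache_image="$2" tag="$3"
  if docker pull "$cache_image" >/dev/null 2>&1; then
    docker build --target "$target" --cache-from "$cache_image" -t "$tag" .
  else
    echo "Cache image unavailable, building without registry cache" >&2
    docker build --target "$target" -t "$tag" .
  fi
}

# Example:
#   build_with_registry_cache all-devops \
#     ghcr.io/jinalshah/devops/images/all-devops:latest all-devops:local
```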


Build Args Reference

Common Build Args

| Arg | Default | Purpose | Example |
|---|---|---|---|
| GCLOUD_VERSION | 501.0.0 | Google Cloud SDK version | 501.0.0 |
| PACKER_VERSION | 1.11.2 | Packer version | 1.11.2 |
| TERRAGRUNT_VERSION | 0.68.14 | Terragrunt version | 0.68.14 |
| TFLINT_VERSION | 0.50.3 | TFLint version | 0.50.3 |
| K9S_VERSION | 0.32.7 | k9s version | 0.32.7 |
| PYTHON_VERSION | 3.12.4 | Python source version to compile | 3.11.10, 3.12.4 |
| PYTHON_VERSION_TO_USE | python3.12 | Python binary selected via alternatives | python3.11, python3.12 |
| GHORG_VERSION | 1.9.10 | ghorg version | 1.9.10 |
| MONGODB_VERSION | 6.0 | MongoDB shell repository major version | 6.0, 7.0 |

Using Build Args

docker build \
  --target all-devops \
  --build-arg GCLOUD_VERSION=500.0.0 \
  --build-arg PACKER_VERSION=1.11.2 \
  --build-arg PYTHON_VERSION=3.11.10 \
  --build-arg PYTHON_VERSION_TO_USE=python3.11 \
  -t all-devops:custom .

Pin All Versions

# versions.env
GCLOUD_VERSION=501.0.0
PACKER_VERSION=1.11.2
TERRAGRUNT_VERSION=0.68.14
TFLINT_VERSION=0.50.3
K9S_VERSION=0.32.7

# Build with pinned versions
docker build \
  --target all-devops \
  --build-arg GCLOUD_VERSION=501.0.0 \
  --build-arg PACKER_VERSION=1.11.2 \
  --build-arg TERRAGRUNT_VERSION=0.68.14 \
  --build-arg TFLINT_VERSION=0.50.3 \
  --build-arg K9S_VERSION=0.32.7 \
  -t all-devops:pinned .
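
Typing each version twice invites drift between versions.env and the build command. One way to avoid that, sketched here under the assumption that versions.env contains only `NAME=value` lines, blank lines and `#` comments, is to generate the `--build-arg` flags from the file. `build_pinned` is a hypothetical helper name.

```shell
#!/bin/bash
# build-pinned.sh (hypothetical helper)

# Builds all-devops with one --build-arg per NAME=value line of the given
# file (default: versions.env). Blank lines and # comments are skipped.
build_pinned() {
  local file="${1:-versions.env}" line
  [ -r "$file" ] || { echo "Cannot read $file" >&2; return 1; }
  set --   # reuse the positional parameters as the list of flags
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      ''|'#'*) continue ;;
    esac
    set -- "$@" --build-arg "$line"
  done < "$file"
  docker build --target all-devops "$@" -t all-devops:pinned .
}

# Example:
#   build_pinned versions.env
```

Keeping versions.env under version control then gives a single place to review and bump pinned versions.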

Validate Builds

Quick Validation

# Verify all-devops
docker run --rm all-devops:local terraform version
docker run --rm all-devops:local aws --version
docker run --rm all-devops:local gcloud --version

# Verify aws-devops
docker run --rm aws-devops:local terraform version
docker run --rm aws-devops:local aws --version

# Verify gcp-devops
docker run --rm gcp-devops:local terraform version
docker run --rm gcp-devops:local gcloud --version

Comprehensive Validation

#!/bin/bash
# validate-build.sh: checks that the expected tools run inside an image

# Fail with a usage message when no image is given
IMAGE="${1:?Usage: $0 IMAGE}"

echo "Validating $IMAGE..."

# Check base tools; stop at the first failure
docker run --rm "$IMAGE" terraform version || exit 1
docker run --rm "$IMAGE" kubectl version --client || exit 1
docker run --rm "$IMAGE" helm version || exit 1
docker run --rm "$IMAGE" ansible --version || exit 1
docker run --rm "$IMAGE" trivy --version || exit 1
docker run --rm "$IMAGE" python3 --version || exit 1
docker run --rm "$IMAGE" node --version || exit 1

# Check cloud tools (optional: a missing CLI is not treated as a failure)
docker run --rm "$IMAGE" aws --version 2>/dev/null && echo "AWS CLI: OK"
docker run --rm "$IMAGE" gcloud --version 2>/dev/null && echo "gcloud: OK"

echo "✅ Validation complete!"

Usage:

./validate-build.sh all-devops:local

Repository Test Scripts

If building from the repository, additional test scripts are available:

# Test network tools
./test_network_tools.sh all-devops:local

# Test DNS resolution
./test_dns_tools.sh all-devops:local

# Test ncat (network cat)
./test_ncat_tool.sh all-devops:local

Optimisation Techniques

Reduce Build Time

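# Build multiple images in parallel
docker build --target aws-devops -t aws-devops:local . &
docker build --target gcp-devops -t gcp-devops:local . &
wait

# Raise BuildKit's per-step log limits so long steps are not truncated.
# These variables do not add parallelism: BuildKit already runs independent
# stages concurrently. They are read by the BuildKit daemon, so they may need
# to be set in the builder's environment rather than on the client.
BUILDKIT_STEP_LOG_MAX_SIZE=10000000 \
BUILDKIT_STEP_LOG_MAX_SPEED=10000000 \
docker build --target all-devops -t all-devops:local .

# Use faster mirrors (Dockerfile instruction; mirror.example.com is a placeholder)
RUN sed -i 's|^mirrorlist=|#mirrorlist=|g' /etc/yum.repos.d/Rocky-*.repo && \
    sed -i 's|^#baseurl=http://dl.rockylinux.org|baseurl=https://mirror.example.com|g' /etc/yum.repos.d/Rocky-*.repo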
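
One caveat with the parallel build snippet in this section: a bare `wait` returns 0 even when a background build fails, so a script using it can report success on a broken build. The sketch below waits on each process ID separately and fails if either build fails. `build_parallel` is a hypothetical helper name.

```shell
#!/bin/bash
# Runs the aws-devops and gcp-devops builds concurrently.
# Returns non-zero if either build fails.
build_parallel() {
  local aws_pid gcp_pid failed=0

  docker build --target aws-devops -t aws-devops:local . &
  aws_pid=$!
  docker build --target gcp-devops -t gcp-devops:local . &
  gcp_pid=$!

  # `wait <pid>` returns the exit status of that specific background job.
  wait "$aws_pid" || { echo "aws-devops build failed" >&2; failed=1; }
  wait "$gcp_pid" || { echo "gcp-devops build failed" >&2; failed=1; }
  return "$failed"
}
```

Both builds share the same base layers, so on a cold cache the two processes may duplicate some work before the cache is populated.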

Reduce Image Size

See Optimisation Guide for detailed size reduction techniques.


Troubleshooting

Build fails with 'No space left on device'

Problem: Insufficient disk space for build layers

Solutions:

  1. Clean up Docker system (this deletes all unused images, stopped containers, and unused volumes):

    docker system prune -a --volumes
    

  2. Check available space:

    df -h /var/lib/docker
    

  3. Increase Docker Desktop disk size:

    Settings → Resources → Disk image size → 60 GB

Build takes extremely long (>1 hour)

Problem: Network issues or no layer caching

Solutions:

  1. Check network speed:

    curl -L -o /dev/null https://dl.k8s.io/release/v1.29.0/bin/linux/amd64/kubectl
    

  2. Enable BuildKit (faster):

    export DOCKER_BUILDKIT=1
    

  3. Use cache from registry:

    docker pull ghcr.io/jinalshah/devops/images/all-devops:latest
    docker build --cache-from ghcr.io/jinalshah/devops/images/all-devops:latest -t all-devops:local .
    

Cannot download packages (404 errors)

Problem: Package repositories unreachable or versions unavailable

Solutions:

  1. Check internet connectivity:

    ping -c 3 dl.k8s.io
    

  2. Use build arg to pin working version:

    docker build --build-arg TERRAGRUNT_VERSION=0.67.0 -t all-devops:local .
    

  3. Check Rocky Linux mirrors:

    docker run --rm rockylinux:9 dnf repolist
    

Build fails on M1/M2 Mac

Problem: Architecture mismatch or emulation issues

Solutions:

  1. Build for native architecture:

    docker build --platform linux/arm64 --target all-devops -t all-devops:local .
    

  2. Disable Rosetta emulation in Docker Desktop:

    Settings → Features in development → Uncheck "Use Rosetta"

  3. Use native arm64 builders:

    docker buildx create --name arm-builder --platform linux/arm64
    docker buildx use arm-builder
    

Python or Node.js version errors

Problem: Incompatible Python/Node.js version

Solutions:

  1. Specify version with build arg:

    docker build --build-arg PYTHON_VERSION=3.11.10 --build-arg PYTHON_VERSION_TO_USE=python3.11 -t all-devops:local .
    

  2. Check available versions in Dockerfile:

    grep "PYTHON_VERSION" Dockerfile
    

gcloud SDK installation fails

Problem: Large download, network timeout

Solutions:

  1. Increase build timeout (CI/CD):

    # GitHub Actions
    timeout-minutes: 60
    

  2. Use cached layer:

    docker build --cache-from ghcr.io/jinalshah/devops/images/gcp-devops:latest .
    

  3. Download manually and add to build context:

    curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-501.0.0-linux-x86_64.tar.gz
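
For flaky networks, a download or build step can also be wrapped in a retry loop with a growing pause between attempts. This is a generic sketch; `retry` is a hypothetical helper name and the attempt count and delay are arbitrary.

```shell
#!/bin/bash
# Retries a command, doubling the pause after each failed attempt.
# Usage: retry <attempts> <initial-delay-seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "Giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "Attempt $n failed, retrying in ${delay}s" >&2
    sleep "$delay"
    n=$(( n + 1 ))
    delay=$(( delay * 2 ))
  done
}

# Example, with the download command from step 3:
#   retry 5 10 curl -fO <archive URL>
```

curl also has its own `--retry` and `--retry-delay` options, which may be enough when only the download is unreliable.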
    


Platform-Specific Builds

macOS (Apple Silicon)

# Build for native arm64
docker build \
  --platform linux/arm64 \
  --target all-devops \
  -t all-devops:local .

Notes:

  • Faster than emulated amd64
  • All tools have native arm64 support
  • No compatibility issues

Windows (WSL2)

# From WSL2 terminal
export DOCKER_BUILDKIT=1
docker build --target all-devops -t all-devops:local .

Notes:

  • Use WSL2 for best performance
  • Avoid Docker Desktop with Hyper-V (slower)
  • Ensure WSL2 has enough memory (Settings → WSL)

Linux

# Standard build
docker build --target all-devops -t all-devops:local .

Notes:

  • Best performance (native)
  • No emulation overhead
  • Fastest build times
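
A script shared across these platforms can pick the native platform itself instead of hard-coding it. The sketch below maps the output of `uname -m` to a Docker platform string. `platform_for_arch` and `build_native` are hypothetical helper names, and a terminal running under Rosetta may report `x86_64` on Apple Silicon.

```shell
#!/bin/bash
# Maps a machine name, as printed by `uname -m`, to a Docker platform string.
platform_for_arch() {
  case "$1" in
    x86_64|amd64)  echo "linux/amd64" ;;
    arm64|aarch64) echo "linux/arm64" ;;
    *) echo "Unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Builds all-devops for the host's own architecture to avoid emulation.
build_native() {
  local platform
  platform="$(platform_for_arch "$(uname -m)")" || return 1
  docker build --platform "$platform" --target all-devops -t all-devops:local .
}
```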


Advanced Build Topics

Multi-Platform Builds

Build for both amd64 and arm64:

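# Note: --load can import a multi-platform result only when Docker uses the
# containerd image store. With the classic image store, push to a registry
# with --push (and a registry-qualified tag) instead.
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --target all-devops \
  -t all-devops:multiarch \
  --load .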

See Multi-Platform Images for complete guide.

Custom Builds

Extend images with custom tools:

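# Dockerfile.custom
FROM ghcr.io/jinalshah/devops/images/all-devops:latest

# Add custom tools (package name and URL are placeholders)
RUN pip3 install custom-package
RUN curl -fsSL -o /usr/local/bin/custom-tool https://example.com/tool && \
    chmod +x /usr/local/bin/custom-tool

# Build (run in a shell, from the directory containing Dockerfile.custom)
docker build -f Dockerfile.custom -t all-devops:custom .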

See Customisation Guide for detailed examples.

Automated Builds

Set up CI/CD to build automatically:


Cloud-Specific Build Guides

Detailed build instructions for each variant:


Best Practices

Build Recommendations

  1. Use cache: Always enable --cache-from in CI/CD
  2. Pin versions: Use build args to lock versions for reproducibility
  3. Test locally first: Validate builds before CI/CD
  4. Monitor build times: Track and optimise slow stages
  5. Clean regularly: Run docker system prune weekly

Common Mistakes

  • ❌ Building without BuildKit (slower)
  • ❌ No layer caching (rebuilds everything)
  • ❌ Not pinning versions (non-reproducible)
  • ❌ Building on low-spec machines (slow)
  • ❌ Not validating after build (broken images)

Next Steps