Building Images¶
Complete guide to building DevOps Images locally, including benchmarks, optimisation strategies, and troubleshooting.
When to Build vs Pull¶
Pull from Registry (Recommended)¶
Pull if you need
- ✅ Standard tooling: Official builds have everything most teams need
- ✅ Fast setup: Pull in seconds vs build in minutes
- ✅ Tested builds: CI/CD tested and scanned for vulnerabilities
- ✅ Multi-arch support: Automatic architecture selection
- ✅ Regular updates: Weekly rebuilds with latest security patches
Quick pull:
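A minimal sketch of the pull, using the registry path that appears in the registry cache example later in this guide. It assumes the aws-devops and gcp-devops variants are published under the same path as all-devops, which this guide does not confirm; `image_ref` and `pull_image` are hypothetical helper names.

```shell
# Registry prefix taken from the registry cache example in this guide.
REGISTRY_PREFIX="ghcr.io/jinalshah/devops/images"

# image_ref VARIANT [TAG] -> prints the full image reference (hypothetical helper)
image_ref() {
  printf '%s/%s:%s\n' "$REGISTRY_PREFIX" "$1" "${2:-latest}"
}

# pull_image VARIANT [TAG] -> pulls the image; Docker selects the host architecture
pull_image() {
  docker pull "$(image_ref "$@")"
}
```

For example, `pull_image all-devops` pulls the `latest` tag of the multi-cloud image.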
Build Locally¶
Build if you need
- 🔧 Custom tools: Add proprietary or internal tools
- 🔧 Specific versions: Pin tool versions for compliance
- 🔧 Size optimisation: Remove unused tools
- 🔧 Custom base: Different Linux distro or base image
- 🔧 Air-gapped: No internet access for pulls
Prerequisites¶
System Requirements¶
| Requirement | Minimum | Recommended | Notes |
|---|---|---|---|
| Docker | 20.10+ | 24.0+ | BuildKit required |
| Disk Space | 10 GB free | 20 GB free | Build cache + layers |
| RAM | 4 GB | 8 GB | Parallel builds benefit |
| CPU | 2 cores | 4+ cores | Faster builds |
| Network | 10 Mbps | 100 Mbps | Package downloads |
Enable BuildKit¶
- Open Docker Desktop settings
- Go to "Docker Engine"
- Add to configuration:
- Click "Apply & Restart"
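On the command line, BuildKit can be switched on per shell session with the same variable the WSL2 example in this guide uses. The sketch below also shows, as a comment, the setting that the Docker Desktop steps above most likely refer to; treat that fragment as an assumption. `buildx_available` is a hypothetical helper. Recent Docker releases (23.0 and later) use BuildKit by default, so this mainly matters on older engines.

```shell
# Enable BuildKit for every docker build started from this shell session.
export DOCKER_BUILDKIT=1

# Docker Desktop alternative (assumed content of the Docker Engine JSON config):
#   { "features": { "buildkit": true } }

# buildx_available -> succeeds if the Buildx CLI plugin responds
buildx_available() {
  docker buildx version >/dev/null 2>&1
}
```

Running `buildx_available && echo "Buildx OK"` is a quick way to confirm the plugin is installed before relying on cache export flags.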
Quick Start¶
Build Single Image¶
Build time (all-devops): ~21 minutes (cold), ~3 minutes (warm)
Build time (aws-devops): ~18 minutes (cold), ~2 minutes (warm)
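A single-image build can be sketched as follows, reusing the `--target` names and the `:local` tag convention from the other examples in this guide. `build_variant` is a hypothetical helper; it assumes the command runs from the repository root, where the Dockerfile lives.

```shell
# build_variant TARGET -> builds one Dockerfile target and tags it TARGET:local
build_variant() {
  local target="$1"
  docker build \
    --target "$target" \
    --tag "${target}:local" \
    .
}
```

For example, `build_variant all-devops` produces `all-devops:local`, and `build_variant aws-devops` produces `aws-devops:local`.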
Build All Images¶
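#!/usr/bin/env bash
# build-all.sh: build every image variant from the repository root
set -euo pipefail  # stop at the first failed build
for target in all-devops aws-devops gcp-devops; do
  echo "Building $target..."
  docker build \
    --target "$target" \
    --tag "$target:local" \
    .
done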
Total build time: ~60 minutes (cold), ~8 minutes (warm with cache)
Build Time Benchmarks¶
Cold Build (No Cache)¶
| Image | amd64 | arm64 | Notes |
|---|---|---|---|
| Base layer | 15 min | 17 min | Rocky Linux + tools |
| aws-devops | 18 min | 20 min | +AWS CLI installation |
| gcp-devops | 19 min | 21 min | +gcloud SDK (larger) |
| all-devops | 21 min | 23 min | +both cloud CLIs |
Warm Build (Cached Layers)¶
| Image | Build Time | Layers Rebuilt | Layers Cached |
|---|---|---|---|
| aws-devops | 2-3 min | AWS layer only | Base layer (15 min saved) |
| gcp-devops | 2.5-3.5 min | GCP layer only | Base layer (15 min saved) |
| all-devops | 3-4 min | Cloud layers only | Base layer (15 min saved) |
CI/CD Build Times¶
| Platform | Cold Build | Warm Build | Cache Strategy |
|---|---|---|---|
| GitHub Actions | 20-25 min | 4-6 min | Layer caching enabled |
| GitLab CI | 22-27 min | 5-7 min | Registry cache |
| Jenkins | 18-23 min | 3-5 min | Persistent volumes |
| Local (M2 Mac) | 15-20 min | 2-4 min | BuildKit cache |
Build Cache Strategies¶
Strategy 1: Layer Caching (Default)¶
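Docker reuses a cached layer when its instruction and inputs are unchanged. Once one layer is invalidated, every layer after it is rebuilt as well:
# Layer 1: Base (rarely changes) - CACHED
FROM rockylinux:9
# Layer 2: System packages (monthly) - CACHED
RUN dnf install -y python3 nodejs
# Layer 3: IaC tools (weekly) - REBUILT when a tool version changes
RUN install-terraform.sh
# Layer 4: Cloud CLIs (monthly) - REBUILT because it follows layer 3
RUN install-aws-cli.sh
Optimise Layer Order
Place frequently changing layers at the end to maximise cache hits. In the example above, moving the cloud CLI layer ahead of the IaC layer would keep it cached across weekly tool updates.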
Strategy 2: BuildKit Cache Mount¶
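# Export the build cache to a local directory and reuse it on the next build.
# --cache-to needs Buildx; with the default docker driver, local cache export
# may require a docker-container builder or the containerd image store.
docker buildx build \
  --target all-devops \
  --cache-from type=local,src=/tmp/buildkit-cache \
  --cache-to type=local,dest=/tmp/buildkit-cache \
  -t all-devops:local \
  --load .
Benefits:
- Persistent cache across builds
- Shared cache between projects
- Faster package manager operations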
Strategy 3: Registry Cache¶
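# Pull the previous build so its layers are available locally
docker pull ghcr.io/jinalshah/devops/images/all-devops:latest
# Build using the published image as a cache source.
# With BuildKit this only produces cache hits if the image was built with
# inline cache metadata (--build-arg BUILDKIT_INLINE_CACHE=1).
docker build \
  --target all-devops \
  --cache-from ghcr.io/jinalshah/devops/images/all-devops:latest \
  -t all-devops:local .
Benefits:
- Works in CI/CD without local cache
- Team shares cache via registry
- Consistent across environments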
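For CI/CD, a dedicated cache reference in a registry is a common variant of this strategy. The sketch below is an assumption-laden illustration rather than the repository's own pipeline: `build_with_registry_cache` is a hypothetical helper, the cache reference is a placeholder you would replace with a repository you can push to, and `--cache-to type=registry` generally needs a Buildx builder that supports cache export (for example a docker-container builder).

```shell
# build_with_registry_cache CACHE_REF [TARGET]
# Imports cache from CACHE_REF, builds TARGET, then exports the updated cache.
build_with_registry_cache() {
  local cache_ref="$1" target="${2:-all-devops}"
  docker buildx build \
    --target "$target" \
    --cache-from "type=registry,ref=${cache_ref}" \
    --cache-to "type=registry,ref=${cache_ref},mode=max" \
    -t "${target}:local" \
    --load \
    .
}
```

A call might look like `build_with_registry_cache registry.example.com/team/all-devops:buildcache`. `mode=max` exports cache for intermediate stages as well, which suits a multi-target Dockerfile at the cost of a larger cache.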
Build Args Reference¶
Common Build Args¶
| Arg | Default | Purpose | Example |
|---|---|---|---|
| GCLOUD_VERSION | 501.0.0 | Google Cloud SDK version | 501.0.0 |
| PACKER_VERSION | 1.11.2 | Packer version | 1.11.2 |
| TERRAGRUNT_VERSION | 0.68.14 | Terragrunt version | 0.68.14 |
| TFLINT_VERSION | 0.50.3 | TFLint version | 0.50.3 |
| K9S_VERSION | 0.32.7 | k9s version | 0.32.7 |
| PYTHON_VERSION | 3.12.4 | Python source version to compile | 3.11.10, 3.12.4 |
| PYTHON_VERSION_TO_USE | python3.12 | Python binary selected via alternatives | python3.11, python3.12 |
| GHORG_VERSION | 1.9.10 | ghorg version | 1.9.10 |
| MONGODB_VERSION | 6.0 | MongoDB shell repository major version | 6.0, 7.0 |
Using Build Args¶
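# PYTHON_VERSION takes a full source version; PYTHON_VERSION_TO_USE must name the matching binary
docker build \
  --target all-devops \
  --build-arg GCLOUD_VERSION=500.0.0 \
  --build-arg PACKER_VERSION=1.11.2 \
  --build-arg PYTHON_VERSION=3.11.10 \
  --build-arg PYTHON_VERSION_TO_USE=python3.11 \
  -t all-devops:custom .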
Pin All Versions¶
# versions.env
GCLOUD_VERSION=501.0.0
PACKER_VERSION=1.11.2
TERRAGRUNT_VERSION=0.68.14
TFLINT_VERSION=0.50.3
K9S_VERSION=0.32.7
# Build with pinned versions (values match versions.env)
docker build \
  --target all-devops \
  --build-arg GCLOUD_VERSION=501.0.0 \
  --build-arg PACKER_VERSION=1.11.2 \
  --build-arg TERRAGRUNT_VERSION=0.68.14 \
  --build-arg TFLINT_VERSION=0.50.3 \
  --build-arg K9S_VERSION=0.32.7 \
  -t all-devops:pinned .
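To keep the build command from drifting away from the pinned file, the `--build-arg` flags can be generated from it. This is a minimal sketch, assuming the file contains only `KEY=VALUE` lines, blank lines, and `#` comments; `build_pinned` is a hypothetical helper, not a script shipped with the repository.

```shell
# build_pinned ENV_FILE -> builds all-devops with one --build-arg per KEY=VALUE line
build_pinned() {
  local env_file="$1" line
  set --                       # reuse the positional parameters as the flag list
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      ''|'#'*) continue ;;     # skip blank lines and comments
    esac
    set -- "$@" --build-arg "$line"
  done < "$env_file"
  docker build --target all-devops "$@" -t all-devops:pinned .
}
```

`build_pinned versions.env` then always builds with exactly the versions recorded in the file, so the file stays the single source of truth.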
Validate Builds¶
Quick Validation¶
# Verify all-devops
docker run --rm all-devops:local terraform version
docker run --rm all-devops:local aws --version
docker run --rm all-devops:local gcloud --version
# Verify aws-devops
docker run --rm aws-devops:local terraform version
docker run --rm aws-devops:local aws --version
# Verify gcp-devops
docker run --rm gcp-devops:local terraform version
docker run --rm gcp-devops:local gcloud --version
Comprehensive Validation¶
#!/usr/bin/env bash
# validate-build.sh: smoke-test the tools inside a built image
IMAGE="${1:?usage: validate-build.sh IMAGE}"
echo "Validating $IMAGE..."
# Check base tools; stop at the first missing or broken tool
docker run --rm "$IMAGE" terraform version || exit 1
docker run --rm "$IMAGE" kubectl version --client || exit 1
docker run --rm "$IMAGE" helm version || exit 1
docker run --rm "$IMAGE" ansible --version || exit 1
docker run --rm "$IMAGE" trivy --version || exit 1
docker run --rm "$IMAGE" python3 --version || exit 1
docker run --rm "$IMAGE" node --version || exit 1
# Check cloud tools (optional: each is only present in some variants)
docker run --rm "$IMAGE" aws --version 2>/dev/null && echo "AWS CLI: OK"
docker run --rm "$IMAGE" gcloud --version 2>/dev/null && echo "gcloud: OK"
echo "✅ Validation complete!"
Usage:
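A possible invocation, assuming the script above is saved as `validate-build.sh` in the current directory and made executable with `chmod +x validate-build.sh`. `validate_all` is a hypothetical wrapper that runs it against each locally built variant and stops at the first failure.

```shell
# validate_all -> runs validate-build.sh against every locally built variant
validate_all() {
  local target
  for target in all-devops aws-devops gcp-devops; do
    ./validate-build.sh "${target}:local" || return 1
  done
}
```

For a single image, `./validate-build.sh all-devops:local` is enough.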
Repository Test Scripts¶
If building from the repository, additional test scripts are available:
# Test network tools
./test_network_tools.sh all-devops:local
# Test DNS resolution
./test_dns_tools.sh all-devops:local
# Test ncat (network cat)
./test_ncat_tool.sh all-devops:local
Optimisation Techniques¶
Reduce Build Time¶
Reduce Image Size¶
See Optimisation Guide for detailed size reduction techniques.
Troubleshooting¶
Build fails with 'No space left on device'
Problem: Insufficient disk space for build layers
Solutions:
- Clean up Docker system:
- Check available space:
- Increase Docker Desktop disk size:
  - Settings → Resources → Disk image size → 60 GB
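The first two solutions can be sketched as shell helpers. The function names are hypothetical, and the 10 GB default comes from the minimum in the System Requirements table. Note that on Linux Docker usually stores its data under `/var/lib/docker`, so that path may be the more relevant one to check; on Docker Desktop the limit is the disk image size instead.

```shell
# free_gb PATH -> free space on the filesystem holding PATH, in whole GB
free_gb() {
  df -Pk "$1" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# has_min_space PATH [MIN_GB] -> succeeds if at least MIN_GB (default 10) is free
has_min_space() {
  [ "$(free_gb "$1")" -ge "${2:-10}" ]
}

# reclaim_space -> shows Docker disk usage, then removes stopped containers,
# unused networks, dangling images, and unused build cache without prompting
reclaim_space() {
  docker system df
  docker system prune -f
}
```

A typical guard before a cold build would be `has_min_space / 10 || reclaim_space`.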
Build takes extremely long (>1 hour)
Problem: Network issues or no layer caching
Solutions:
- Check network speed:
- Enable BuildKit (faster):
- Use cache from registry:
Cannot download packages (404 errors)
Problem: Package repositories unreachable or versions unavailable
Solutions:
- Check internet connectivity:
- Use build arg to pin working version:
- Check Rocky Linux mirrors:
Build fails on M1/M2 Mac
Problem: Architecture mismatch or emulation issues
Solutions:
- Build for native architecture:
- Disable Rosetta emulation in Docker Desktop: Settings → Features in development → Uncheck "Use Rosetta"
- Use native arm64 builders:
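Building for the native architecture can be sketched by mapping the host CPU reported by `uname -m` to a `--platform` value. Both helper names are hypothetical, and only the two architectures covered by this guide are handled.

```shell
# native_platform [ARCH] -> docker --platform value for ARCH (default: host CPU)
native_platform() {
  case "${1:-$(uname -m)}" in
    arm64|aarch64) echo "linux/arm64" ;;
    x86_64|amd64)  echo "linux/amd64" ;;
    *) echo "unsupported architecture" >&2; return 1 ;;
  esac
}

# build_native [TARGET] -> builds TARGET (default all-devops) without emulation
build_native() {
  local target="${1:-all-devops}" platform
  platform="$(native_platform)" || return 1
  docker build --platform "$platform" --target "$target" -t "${target}:local" .
}
```

On an Apple Silicon Mac, `build_native` resolves to `--platform linux/arm64`, which avoids the slower emulated amd64 path.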
Python or Node.js version errors
Problem: Incompatible Python/Node.js version
Solutions:
- Specify version with build arg:
- Check available versions in Dockerfile:
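Judging by the examples in the Build Args table (3.11.10 with python3.11, 3.12.4 with python3.12), the two Python build args need to agree on the minor version. The sketch below derives one from the other under that assumption; the helper names are hypothetical and the Dockerfile itself remains the authority on which versions are supported.

```shell
# python_binary_for X.Y.Z -> prints pythonX.Y (the assumed PYTHON_VERSION_TO_USE value)
python_binary_for() {
  case "$1" in
    [0-9]*.[0-9]*.[0-9]*) echo "python${1%.*}" ;;   # drop the patch component
    *) echo "expected a full version such as 3.11.10" >&2; return 1 ;;
  esac
}

# build_with_python X.Y.Z -> builds all-devops with matching Python build args
build_with_python() {
  local version="$1" binary
  binary="$(python_binary_for "$version")" || return 1
  docker build \
    --target all-devops \
    --build-arg "PYTHON_VERSION=${version}" \
    --build-arg "PYTHON_VERSION_TO_USE=${binary}" \
    -t all-devops:custom .
}
```

For example, `build_with_python 3.11.10` passes `PYTHON_VERSION=3.11.10` and `PYTHON_VERSION_TO_USE=python3.11` together, which avoids a mismatch between the compiled version and the selected binary.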
gcloud SDK installation fails
Problem: Large download, network timeout
Solutions:
- Increase build timeout (CI/CD):
- Use cached layer:
- Download manually and add to build context:
Platform-Specific Builds¶
macOS (Apple Silicon)¶
# Build for native arm64
docker build \
  --platform linux/arm64 \
  --target all-devops \
  -t all-devops:local .
Notes:
- Faster than emulated amd64
- All tools have native arm64 support
- No compatibility issues
Windows (WSL2)¶
# From WSL2 terminal
export DOCKER_BUILDKIT=1
docker build --target all-devops -t all-devops:local .
Notes:
- Use WSL2 for best performance
- Avoid Docker Desktop with Hyper-V (slower)
- Ensure WSL2 has enough memory (Settings → WSL)
Linux¶
Notes:
- Best performance (native)
- No emulation overhead
- Fastest build times
Advanced Build Topics¶
Multi-Platform Builds¶
Build for both amd64 and arm64:
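# Loading a multi-platform result into the local image store requires the
# containerd image store; otherwise push to a registry with --push instead.
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --target all-devops \
  -t all-devops:multiarch \
  --load .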
See Multi-Platform Images for complete guide.
Custom Builds¶
Extend images with custom tools:
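# Dockerfile.custom
FROM ghcr.io/jinalshah/devops/images/all-devops:latest
# Add custom tools
RUN pip3 install custom-package
# -f fails on HTTP errors instead of saving an error page; chmod makes the tool executable
RUN curl -fsSL -o /usr/local/bin/custom-tool https://example.com/tool \
    && chmod +x /usr/local/bin/custom-tool
# Build (run from the directory containing Dockerfile.custom)
docker build -f Dockerfile.custom -t all-devops:custom .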
See Customisation Guide for detailed examples.
Automated Builds¶
Set up CI/CD to build automatically:
- GitHub Actions build workflow - Example in repository
- GitLab CI - Build pipeline
- Jenkins - Declarative pipeline
Cloud-Specific Build Guides¶
Detailed build instructions for each variant:
- Building all-devops - Multi-cloud image
- Building aws-devops - AWS-optimised image
- Building gcp-devops - GCP-optimised image
Best Practices¶
Build Recommendations
- Use cache: Always enable --cache-from in CI/CD
- Pin versions: Use build args to lock versions for reproducibility
- Test locally first: Validate builds before CI/CD
- Monitor build times: Track and optimise slow stages
- Clean regularly: Run docker system prune weekly
Common Mistakes
- ❌ Building without BuildKit (slower)
- ❌ No layer caching (rebuilds everything)
- ❌ Not pinning versions (non-reproducible)
- ❌ Building on low-spec machines (slow)
- ❌ Not validating after build (broken images)
Next Steps¶
- Optimisation Guide - Reduce size and build time
- Customisation Guide - Extend with custom tools
- Multi-Platform Guide - Build for amd64 and arm64
- Architecture Overview - Understand image layers