Modern CI antipattern
Have you ever used modern CI platforms like GitLab CI or GitHub Actions?
Then you know that pipeline configuration usually comes from YAML files. In most cases, this is not a single YAML file but a whole jungle of files with references, inheritance, inlined shell scripts, and other disgusting things. I do see it as an antipattern people should avoid. Let’s see why.
Case study
This site you are currently reading is a static site generated with Hugo and hosted on GitHub Pages. One option to publish the site to GitHub Pages is GitHub Actions, which I considered.
So let’s take a look at the proposed Deploy Hugo site to GitHub Pages pipeline:
```yaml
# Sample workflow for building and deploying a Hugo site to GitHub Pages
name: Deploy Hugo site to Pages

on:
  # Runs on pushes targeting the default branch
  push:
    branches:
      - main

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
  contents: read
  pages: write
  id-token: write

# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
concurrency:
  group: "pages"
  cancel-in-progress: false

# Default to bash
defaults:
  run:
    shell: bash

jobs:
  # Build job
  build:
    runs-on: ubuntu-latest
    env:
      HUGO_VERSION: 0.123.0
    steps:
      - name: Install Hugo CLI
        run: |
          wget -O ${{ runner.temp }}/hugo.deb \
            https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_extended_${HUGO_VERSION}_linux-amd64.deb \
          && sudo dpkg -i ${{ runner.temp }}/hugo.deb
      - name: Install Dart Sass
        run: sudo snap install dart-sass
      - name: Checkout
        uses: actions/checkout@v4
        with:
          submodules: recursive
          fetch-depth: 0
      - name: Setup Pages
        id: pages
        uses: actions/configure-pages@v4
      - name: Install Node.js dependencies
        run: "[[ -f package-lock.json || -f npm-shrinkwrap.json ]] && npm ci || true"
      - name: Build with Hugo
        env:
          # For maximum backward compatibility with Hugo modules
          HUGO_ENVIRONMENT: production
          HUGO_ENV: production
        run: |
          hugo \
            --gc \
            --minify \
            --baseURL "${{ steps.pages.outputs.base_url }}/"
      - name: Upload artifact
        uses: actions/upload-pages-artifact@v2
        with:
          path: ./public

  # Deployment job
  deploy:
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    runs-on: ubuntu-latest
    needs: build
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v3
```
Just to generate a static site and make it available on GitHub Pages you would need:
- 80 lines of GitHub Actions YAML DSL
- Inline shell scripts, which should be the easiest part here:

```shell
wget -O ${{ runner.temp }}/hugo.deb \
  https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_extended_${HUGO_VERSION}_linux-amd64.deb \
  && sudo dpkg -i ${{ runner.temp }}/hugo.deb

...

[[ -f package-lock.json || -f npm-shrinkwrap.json ]] && npm ci || true

...

hugo \
  --gc \
  --minify \
  --baseURL "${{ steps.pages.outputs.base_url }}/"
```

- Magic values (the most problematic part in my opinion):

```yaml
uses: actions/configure-pages@v4
...
uses: actions/upload-pages-artifact@v2
...
uses: actions/deploy-pages@v3
```
These magic values are GitHub Actions, carefully prepared by the GitHub Actions team for your convenience.
For example, actions/upload-pages-artifact@v2 is a 71-line YAML file, while actions/configure-pages@v4 and actions/deploy-pages@v3 are two complex JavaScript-based actions. Even without getting deep into the implementation details, you at least need to understand how to use these building blocks and their input and output parameters. And that is not as straightforward as with any programming language you are used to working with.
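To get a feel for what hides behind one of these magic values, here is the rough shape of a composite action. This is an illustrative skeleton only, not the actual source of actions/upload-pages-artifact; the input name, default, and step are assumptions:

```yaml
# Illustrative skeleton of a composite action, not the real
# upload-pages-artifact source. Each such building block brings its own
# inputs, outputs, and steps that you have to learn to use it correctly.
name: "Upload static site as an artifact"
inputs:
  path:
    description: "Directory containing the static files"
    default: "./public"
runs:
  using: composite
  steps:
    - name: Archive the directory
      shell: bash
      run: tar --directory "${{ inputs.path }}" -cf site.tar .
```

Multiply that by every action and template in a real pipeline, and the amount of hidden machinery adds up quickly.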
What’s wrong with it?
The example above is a very simple task: publishing a static site. In enterprise development, CI pipelines are way more complicated, with testing, linting, versioning, and other activities. And if you fall into the trap of all these convenient actions and templates, you start including them in your project, configuring them with parameters for your needs, inheriting from them, producing your own actions and templates, and spawning tons of interconnected YAML files. It becomes a project in itself that requires heavy maintenance. In big companies there may even be a dedicated team to support the CI setup.
Here are the disadvantages of this technique:
- It is not possible to run CI scripts locally. After making a change to a CI script you can verify it only on a CI platform
- As a result, there are different approaches to local development and CI
- CI scripts require heavy maintenance, sometimes even consuming the capacity of a whole team
- Most likely pipelines will consist of many steps, running in different containers, consuming a lot of resources, and taking a lot of time
- If there is a dedicated team to maintain CI scripts then:
  - they will produce generic scripts to meet the needs of all teams, increasing the complexity even further
  - developers will be detached from all these scripts, caring only about a “green” pipeline without trying to understand the details
  - I’ve seen a case where a pipeline was “green” and both parties thought that everything was fine, although it was not
- A complex CI setup results in a CI platform vendor lock-in
What to do?
Now that we have a list of problems, we can avoid them by doing the opposite.
- CI has to be based on something that developers are not only able to run locally but actually run regularly. So it has to be part of the build tooling
- CI scripts have to be a thin layer that runs your regular build tools, only passing the needed configuration
- Don’t build generic solutions on CI. Like tests, CI has to be plain and straightforward
- Keep CI scripts as abstract as possible, delegating all details to the project you’re building
- Don’t prepare an environment for CI in your CI scripts. Instead, prepare it in advance (see the image sketch after this list)
- Developers have to be responsible for CI, not a dedicated team that knows how to do it better
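To illustrate the point about preparing the environment in advance: everything the Hugo pipeline above installs on every run (Hugo, Node.js, and friends) could instead be baked into an image once. The following Dockerfile is only a sketch under an assumed base image and tool versions, not an image this site actually uses:

```dockerfile
# Sketch of a prebuilt CI image: the environment is prepared here once,
# not in every pipeline run. Base image and versions are assumptions.
FROM debian:bookworm-slim

ARG HUGO_VERSION=0.123.0

RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl git make nodejs npm \
 && curl -fsSL -o /tmp/hugo.deb \
      "https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_extended_${HUGO_VERSION}_linux-amd64.deb" \
 && dpkg -i /tmp/hugo.deb \
 && rm -rf /tmp/hugo.deb /var/lib/apt/lists/*
```

Such an image can then simply be referenced by the CI job, as in the pipeline below.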
An ideal CI pipeline definition would be:
```yaml
jobs:
  build:
    runs-on: prebuilt-docker-image-for-ci
    steps:
      - name: Build
        run: make build
```
First, you prepare the environment for your job in advance: prebuilt-docker-image-for-ci. Second, you hide all the details in a Makefile. That’s it! You can easily port this setup to any CI platform, and any project can be adapted to this script.
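As a sketch of what that Makefile could contain for the Hugo case study above (the target name, flags, and default URL are assumptions, not this site’s actual build file):

```makefile
# Hypothetical Makefile: all build details live here, so the exact same
# `make build` runs locally and on CI. Recipe lines are indented with tabs.
BASE_URL ?= http://localhost:1313/

.PHONY: build
build:
	[ -f package-lock.json ] && npm ci || true
	hugo --gc --minify --baseURL "$(BASE_URL)"
```

And porting the thin layer to another platform really is trivial; a GitLab CI equivalent might look like this, with the same placeholder image name as above:

```yaml
# The same thin layer on GitLab CI: only the DSL around `make build`
# changes, the build itself stays inside the project.
build:
  image: prebuilt-docker-image-for-ci
  script:
    - make build
```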
Yes, in this case you won’t have a fancy pipeline diagram, and to understand what failed you need to look into the logs, but it is the most efficient and natural approach. An error message from the log can be attached to the failed build notification, so you don’t even need to look at the diagram. But if the industry needs these diagrams, then I’d go even further: CI platforms should be able to understand the output of all popular build tools and visualize pipelines themselves, instead of forcing us to adapt to them.
Of course, the ideal cannot be achieved in every case, but at least you know another perspective and can find a balance.
Let’s not be fooled by the sham convenience of modern CI platforms and keep things simple.