Running tests against databases in CI pipelines is an essential part of testing your application.
Provisioning databases in CI pipelines can be hard work, however. Broadly speaking, you have two options:
- Have all your pipelines use shared databases
- Use Docker to run containerised database instances
The first option has the advantage that you can test against real data, perhaps a recently restored copy of production, but it effectively serializes your pipelines as they contend for the shared database. You may be able to scale by adding more database servers, but ultimately the parallelism of your CI pipelines is capped by the number of servers available to them.
The second option is a substantial improvement in terms of parallelism, as each pipeline run gets a dedicated database spun up and torn down for its exclusive use. However, the problem of testing against realistic data becomes more acute. Where we see Docker used to provision databases in CI pipelines, the containerised database is typically populated from seed data stored in the code repository. This means you lose the confidence gained from testing against a realistic data set. If you don't want to go down the route of using seed data, you need to manage a Docker volume inside your pipeline, or run a lengthy database restore operation in each pipeline run.
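To make the seed-data approach concrete, here is a minimal sketch of how it commonly looks in a GitHub Actions workflow. The Postgres image, credentials, and the `./test/seed.sql` path are illustrative assumptions, not part of any particular project:

```yaml
# Sketch of the Docker-plus-seed-data approach described above.
# The database image, credentials, and seed file path are hypothetical.
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: postgres
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4
      - name: Load seed data from the repository
        run: psql -h localhost -U postgres -f ./test/seed.sql
        env:
          PGPASSWORD: postgres
```

Every pipeline run starts from the same small, hand-maintained seed file, which is exactly why the resulting tests drift away from production-like data.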
Fortunately, there is a third way of testing against databases in CI that combines the advantages of both approaches. Spawn allows you to run an arbitrary number of pipelines in parallel, use realistic test data in all your tests, and avoid both Docker volume management and lengthy restore operations. The remainder of this article assumes that you are familiar with the basic concepts of Spawn: data images and data containers. If not, sign up for Spawn (it's free!) and get started.
We'll create a simple GitHub Actions workflow that uses Spawn to provision some databases for us to run database migration tests against.
The action takes the name of the data image from which to create the data container, and a lifetime after which the data container will be automatically destroyed.
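A step using the create container action might look like the sketch below. The action name `red-gate/create-spawn-data-container` and the input names are assumptions for illustration; consult the Spawn documentation for the actual identifiers:

```yaml
steps:
  - name: Create Spawn data container
    id: create-data-container
    # The action name and input names below are illustrative
    # assumptions, not confirmed Spawn identifiers.
    uses: red-gate/create-spawn-data-container@v1
    with:
      dataImage: my-database:latest   # data image to create the container from
      lifetime: 1h                    # auto-destroy the container after one hour
```

Giving the step an `id` lets later steps reference its outputs via the standard `steps.<id>.outputs` syntax.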
The create container action has a number of outputs that allow us to connect to the new database server in later steps. The example workflow below shows how to connect to the new data container and run some database migration tests on it:
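Continuing the sketch, a later step can read those outputs and pass them to a test script. The output names (`dataContainerHost`, `dataContainerPort`, and so on) and the `run-migration-tests.sh` script are hypothetical placeholders:

```yaml
  - name: Run database migration tests
    # Output names below are assumed for illustration; see the
    # Spawn action documentation for the real output names.
    run: ./run-migration-tests.sh
    env:
      DB_HOST: ${{ steps.create-data-container.outputs.dataContainerHost }}
      DB_PORT: ${{ steps.create-data-container.outputs.dataContainerPort }}
      DB_USER: ${{ steps.create-data-container.outputs.dataContainerUsername }}
      DB_PASSWORD: ${{ steps.create-data-container.outputs.dataContainerPassword }}
```

Because the connection details arrive as step outputs, the test script itself needs no knowledge of Spawn at all.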
Every run of this workflow runs against a freshly provisioned, cloud-hosted, isolated database server. The server spins up in seconds, regardless of the size of the data image from which it is created. We are testing our migrations against a real database with realistic data, and we can do so in parallel with other pipeline runs.
During a test run, it is often desirable to save the state of the database so that it can be rolled back later in the run. There is no good way to do this with shared database servers or with servers provisioned via Docker, but Spawn allows any data container to be saved and reset easily. We can take advantage of this functionality using the save and reset actions:
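A save/reset sequence might be sketched as follows. The action names `red-gate/save-spawn-data-container` and `red-gate/reset-spawn-data-container`, the `dataContainerName` output, and the test script are assumptions for illustration:

```yaml
  - name: Save database state
    uses: red-gate/save-spawn-data-container@v1    # hypothetical action name
    with:
      dataContainer: ${{ steps.create-data-container.outputs.dataContainerName }}

  - name: Run destructive tests
    run: ./run-destructive-tests.sh                # hypothetical test script

  - name: Reset database to the saved state
    uses: red-gate/reset-spawn-data-container@v1   # hypothetical action name
    with:
      dataContainer: ${{ steps.create-data-container.outputs.dataContainerName }}
```

The reset step returns the data container to exactly the state captured by the save step, however much the intervening tests mutated it.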
One application of save and reset is ensuring that tests in a workflow don't mutate the state of the database for any subsequent tests.
Working with databases in CI has often been a choice between connecting to live servers and using containers, each with its own benefits and drawbacks. Spawn gives us the best of both worlds: realistic live data that spins up in seconds, plus the parallelism and isolation we get from containerised instances. Coupled with GitHub Actions to reduce the scripting required to invoke the Spawn CLI, we get smooth, frictionless database CI in our workflows.
If you want to experience how Spawn can make working with databases in CI so much easier, sign up for free now.