6 posts tagged with "development"


Instant GitHub Codespaces are dangerous without realistic data

September 6, 2021 · 4 min read

Chris Heppell

Picture the scene:

You click a button on a GitHub repository
A new tab launches with VS Code containing your code
All of the prerequisites are preinstalled and you can start coding, compiling, and running your application instantly

Amazing! Forget about walking line-by-line through “SETUP.md” to configure your local machine.

But let's fast-forward a little bit more…

You start your web app and load it in a new tab
You log in to your web app
You’re greeted with an empty screen

There’s no data. There are no other users. There’s no realistic data in this application at all.

Codespaces with databases

August 1, 2021 · 6 min read

Santiago Arias

With GitHub Codespaces you can set up a cloud-hosted, containerized VS Code environment. You can then connect to a codespace through the browser or through VS Code.

The main question we are trying to answer now is:

Can we have virtualized environments with databases created from backups or scripts, rather than just empty databases or fake data sets?

Pull request previews with WunderPreview and Spawn

July 21, 2021 · 4 min read

Andrew Farries

Level up your pull request workflow with live preview environments backed by dedicated databases.

Going all-in on cloud-based development with realistic databases

June 29, 2021 · 4 min read

Throughout 2020 and 2021 much of the world moved to remote-first working through unfortunate necessity. We got used to remote standups, meetings, reviews and collaboration. Working patterns and practices changed, but have development environments kept up?

Simple Database CI with Spawn and GitHub Actions

June 23, 2021 · 5 min read

Andrew Farries

Running tests against databases in CI pipelines is an essential part of testing your application.

Provisioning databases in CI pipelines can be hard work, however. Broadly speaking you have two options:

Have all your pipelines use shared databases
Use Docker to run containerised database instances

The first option has the advantage that you can test against real data, perhaps a recently restored copy of production, but it effectively serializes your pipelines as they contend for the shared database. You may be able to scale by adding multiple database servers, but ultimately the parallelism of your CI pipelines is limited by the number of database servers you have available to the pipelines.

The second option is a substantial improvement in terms of parallelism, as each pipeline run now has a dedicated database spun up and torn down for its exclusive use. However, the problem of testing against realistic data becomes more acute. Where we see Docker used to provision databases in CI pipelines, the containerised database is typically populated from seed data stored in the code repository. This means you lose the confidence that comes from testing against a realistic data set. If you don't want to go down the seed data route, you need to manage a Docker volume inside your pipeline, or run a lengthy database restore operation in each pipeline run.
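To put rough numbers on that trade-off, here is a minimal sketch of a back-of-the-envelope model. It assumes every pipeline run takes the same time and that a shared database server handles one run at a time. The function names and the figures in the usage note are hypothetical, chosen only for illustration.

```python
import math


def shared_db_wall_time(pipelines, db_servers, minutes_per_run):
    """Wall-clock minutes when runs queue for a fixed pool of shared servers.

    Assumption: each server serves one pipeline run at a time, so runs
    proceed in batches of `db_servers`.
    """
    batches = math.ceil(pipelines / db_servers)
    return batches * minutes_per_run


def dedicated_db_wall_time(pipelines, minutes_per_run, provision_minutes=0):
    """Wall-clock minutes when every run gets its own database.

    All runs proceed in parallel, but each one pays `provision_minutes`
    up front (for example, a restore or seed step).
    """
    if pipelines == 0:
        return 0
    return provision_minutes + minutes_per_run
```

With 12 concurrent pipeline runs of 5 minutes each and 3 shared servers, `shared_db_wall_time(12, 3, 5)` gives 20 minutes. With dedicated databases and an 8 minute restore in every run, `dedicated_db_wall_time(12, 5, 8)` gives 13 minutes. The dedicated figure stays flat as more pipelines are added, which is why the cost of provisioning realistic data per run matters so much.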

Development databases in Docker aren’t good enough

June 22, 2021 · 3 min read

Chris Heppell

Development databases in Docker aren’t good enough on their own. Why? Because they’re almost always so far from the characteristics of the production environment that you get a false sense of security in development.

Having isolated databases is far better than a shared environment where other developers trample over your changes. But because dev databases tend to either be empty, or have “happy path” data within them, they never truly demonstrate the behaviours you’ll end up seeing in production.

This leads to a variety of different problems:

Unexpected data loss during schema migrations
Unacceptable latency on specific queries because of vastly different data sizes
Poor UX due to unanticipated user-provided data
UI glitches or performance issues not caught in lower environments because of unrealistic data
Entire branches of code left unexercised due to conditions on the data not caught in lower environments
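As a small illustration of the first problem in that list, the sketch below runs a hypothetical migration that shortens a `name` column, using an in-memory SQLite database from Python's standard library. The table, column, sample names, and the 10 character limit are all assumptions made up for this example. They are not taken from any real schema.

```python
import sqlite3


def migrate_and_count_truncated(names, max_len=10):
    """Run a hypothetical 'shorten the name column' migration.

    Returns the number of rows whose name lost characters.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [(n,) for n in names]
    )

    # The migration: copy rows into a new table, truncating each name.
    conn.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "INSERT INTO users_new (id, name) "
        "SELECT id, substr(name, 1, ?) FROM users",
        (max_len,),
    )

    # Count rows where the migrated value differs from the original.
    (lost,) = conn.execute(
        "SELECT COUNT(*) FROM users u "
        "JOIN users_new n ON n.id = u.id "
        "WHERE n.name <> u.name"
    ).fetchone()
    conn.close()
    return lost


# "Happy path" seed data: every name fits comfortably.
seed_names = ["Alice", "Bob", "Carol"]

# More realistic data includes values nobody thought to put in the seed set.
realistic_names = seed_names + [
    "Wolfeschlegelsteinhausen",
    "María-José Carreño Quiñones",
]
```

Against `seed_names` the migration reports no truncated rows, so it looks safe. Against `realistic_names` it silently truncates two rows. The same migration passes in development and loses data in production.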

I think I lost more data due to database bugs in production than anything else.
— JBD ヤナドガン (@rakyll) June 19, 2021
