Faster CircleCI Builds for Rails Apps

At Later, we use CircleCI for most of our testing (and some of our deployment) needs. There are a bunch of great articles out there on how to speed up your CI builds. Over the past year we have spent some time making sure that our development and deployment processes aren’t blocked by our test suites.

In this post I am going to walk through, step by step, two of the CircleCI jobs that we run on all Ruby code pushed to GitHub. The first job caches and audits our dependencies and runs Rubocop, and the second job actually runs all of our specs. You can see the config.yml in full here. Parts of our setup require some code changes in our app; when we run into those, I’ll include the Ruby code to show how it works in conjunction with the CI build.

Some of the speedups we talk about here won’t make much of a difference for smaller projects, but the nice thing is that their benefit will grow along with your app. Add them now and don’t worry about them again.

Here is a brief look at what we’ll be going over:

  • Speeding up your rubocop step
    • caching
    • parallel
  • Enabling and configuring bootsnap for an ephemeral CI environment
  • Speeding up database setup
    • Using *-ram images
    • Tracking changes and caching schema
  • parallel_test gem
    • PARALLEL_TEST_PROCESSORS
    • database setup

Setup Job

As mentioned above we use this job to do a few things to prep for the rest of the CircleCI workflow. These are:

  • Download and cache our dependencies
  • Audit our dependencies for vulnerabilities
  • Run rubocop

Docker

The first thing to mention is that we try to use the CircleCI pre-built Docker images. If we do need to use a custom Docker image, we will typically start from one of CircleCI’s images. Our reasoning is that it is more likely that the layers for these images are already cached on the instance our job is running on. In turn this reduces the overhead associated with the ‘Spin up Environment’ step each job has to execute.

For our Ruby based apps we typically use this image:

docker:
  - image: circleci/ruby:2.5.1-node-browsers

Code checkout

This step is probably part of every job. It checks out your code to the CI machine. It is worth noting that some people might benefit from source caching. We gave it a shot, but didn’t really see any benefit.

- checkout
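For reference, source caching is typically done by wrapping the checkout step with a cache of the .git directory. A sketch of what we tried (step order and key names are our own, illustrative choices):

```yaml
# Illustrative sketch of source caching around checkout.
- restore_cache:
    keys:
      - v1-source-{{ .Branch }}-{{ .Revision }}
      - v1-source-{{ .Branch }}
- checkout
- save_cache:
    key: v1-source-{{ .Branch }}-{{ .Revision }}
    paths:
      - .git
```

With a warm cache the checkout becomes an incremental fetch instead of a full clone, which matters most for repositories with long histories.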

Get the Gemfile.lock from the master branch

This command creates a file that is a duplicate of the Gemfile.lock from the master branch. I’ll explain why we do this in the next step.

- run:
    name: Get the Gemfile.lock from the master branch
    command: |
      git show master:Gemfile.lock > MasterGemfile.lock

Restore gem cache

One of the first things you can do to help speed up your builds is to cache your gem dependencies. You can read about all the details here, but the quick synopsis is this: the job checks each of the keys in order, looking for a matching cache entry. An important note is that these aren’t exact string matches, but prefix matches. If a key (prefix) matches multiple entries, the job pulls the most recently created cache object.

Here we first try to pull the cache associated with this particular instance of our Gemfile.lock file. If we have added or updated some gems, then our Gemfile.lock will have changed and we fall back to the most recent cache object created by a CI run on the master branch. I am making a few assumptions here: one is that master is your default git branch, and the other is that master is most likely to have a version of Gemfile.lock closest to this branch’s newly updated Gemfile.lock. There are situations where other options might be closer, but I believe this generalizes pretty well across most situations.

Note: We use a v*- prefix in case we need to bust the cache all at once.

- restore_cache:
    keys:
      - v1-dependencies-gem-{{ checksum "Gemfile.lock" }}
      - v1-dependencies-gem-{{ checksum "MasterGemfile.lock" }}
      - v1-dependencies-gem
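The prefix-match-then-newest behavior described above can be sketched in a few lines of Ruby. This is a hypothetical model, not CircleCI’s actual implementation; the entry names and resolve helper are illustrative:

```ruby
# Model of restore_cache key resolution: keys are tried in order,
# each key is a PREFIX match, and among all entries matching a
# prefix, the most recently created one wins.
CacheEntry = Struct.new(:key, :created_at)

def resolve(keys, entries)
  keys.each do |prefix|
    matches = entries.select { |entry| entry.key.start_with?(prefix) }
    return matches.max_by(&:created_at) unless matches.empty?
  end
  nil
end

entries = [
  CacheEntry.new('v1-dependencies-gem-abc123', Time.utc(2018, 1, 1)),
  CacheEntry.new('v1-dependencies-gem-def456', Time.utc(2018, 2, 1))
]

# The exact checksum key misses, so the bare prefix falls through
# and picks the newest matching entry.
hit = resolve(['v1-dependencies-gem-zzz', 'v1-dependencies-gem'], entries)
hit.key # => "v1-dependencies-gem-def456"
```

This is why the bare `v1-dependencies-gem` key at the end is a useful catch-all: it always matches something once any gem cache has been saved.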

Verify and/or install dependencies

This step checks to see if our cache has everything we need and, if not, installs the missing gems. Doing a bundle check first speeds things up a little in the case where you don’t need to download anything. We set the path to vendor/bundle so we can easily cache the gems in a later step. The --jobs=4 flag adds some parallelism to speed things up.

- run:
    name: bundle install
    command: |
      bundle check --path vendor/bundle || bundle install --jobs=4 --retry=3 --path vendor/bundle

Gems security audit

Security vulnerabilities are no good. The wonderful bundler-audit gem helps you keep on top of issues with your gems. Note: the --update flag was added in version 0.5.0.

- run:
    name: Dependencies security audit
    command: |
      bundle exec bundle-audit check --update

Remove unused gems

We don’t want to cache gems or versions we are no longer using, so let’s clean things up.

- run:
    name: Clean up gems before we save
    command: |
      bundle clean

Cache gems

Cache the gems associated with this version of the Gemfile.lock. Notice the vendor/bundle path: putting all of our gems there allows us to cache them easily with this step.

A couple of things to note: if the key already exists, this step is effectively a noop. When your dependencies did change, this step can take a bit of time in mature Rails apps. This is one reason we don’t add a {{ .Branch }} to the key; that would force a cache save on the first run of each branch, which shouldn’t be required.

- save_cache:
    key: v1-dependencies-gem-{{ checksum "Gemfile.lock" }}
    paths:
      - vendor/bundle

Restore rubocop cache

Rubocop is amazing. Something people might not realize is that rubocop caches its results to be used on later runs. If you are using the same version of rubocop, the same .rubocop.yml, and the contents of a file haven’t changed, then rubocop can just display the results from a previous run. No parsing or analyzing needed!

Here we are restoring said cache. First we check for the latest data from our branch, then for the latest from master, and if that fails, for the latest from any branch using the same .rubocop.yml file.

- restore_cache:
    keys:
      - v1-rubocop-cache-{{ checksum ".rubocop.yml" }}-{{ .Branch }}
      - v1-rubocop-cache-{{ checksum ".rubocop.yml" }}-master
      - v1-rubocop-cache-{{ checksum ".rubocop.yml" }}

Run rubocop

Make sure to use the --parallel option! I didn’t know about this for a good while.

- run:
    name: Rubocop
    command: bundle exec rubocop --parallel

Save rubocop cache

Rubocop defaults to using $HOME/.cache/rubocop_cache to store all of its results. Here we add {{ epoch }} to the cache key to make sure that we are always using the most recent rubocop results for this branch. Remember that the restore_cache step is a prefix match that uses the most recently written version it finds that matches the prefix. The cache object is small, so we can feel free to write this every single run.

- save_cache:
    key: v1-rubocop-cache-{{ checksum ".rubocop.yml" }}-{{ .Branch }}-{{ epoch }}
    paths:
      - ../.cache/rubocop_cache

And that finishes up our setup job!

Specs Job

We are trying to make good use of CircleCI 2.0 and workflows at Later. Workflows let us parallelize different jobs and cut down on the end-to-end time of our CI runs. By running our setup job first, then following it up with a separate job to run our tests, we give ourselves the flexibility to parallelize our tests in a lot of different ways. We could use CircleCI’s built-in parallelism, break them up by RSpec tags, or split them by spec directory.

In addition to parallelizing our tests, when we are running CI for our master branch we are actually processing all our docs while our tests are running. If the tests pass, then we publish the docs. If the tests fail, then the workflow exits and the docs deploy job never runs. Since we use rspec_api_documentation and yard for our docs, they benefit from the gem caching we do in the setup job.

Docker

You’ll notice that we are using the same base Ruby image from CircleCI, along with a redis and a postgres image. Be sure to notice that we use the -ram variant for the postgres image. This image has postgres set up to use a RAM volume instead of the disk. These 4 characters can really help speed up your tests. Outside of some settings for our database, you can see we set the PARALLEL_TEST_PROCESSORS environment variable. This tells the parallel_test gem how many processes to use when running your specs.

Note: We have found that setting PARALLEL_TEST_PROCESSORS to 4 has been a good configuration for CircleCI’s medium instances. Unless otherwise specified, your jobs are running on a medium instance.

docker:
  - image: circleci/ruby:2.5.1-node-browsers
    environment:
      RAILS_ENV: test
      PGHOST: 127.0.0.1
      PGUSER: root
      PARALLEL_TEST_PROCESSORS: 4
  - image: redis:3.0
  - image: circleci/postgres:10.3-ram
    environment:
      POSTGRES_USER: root

Code checkout

Once again we check out our code.

- checkout

Restore gem dependencies

We only need to check one key for our dependencies in this job because we know that we cached something to this key either in the previous setup job or in a previous workflow.

- restore_cache:
    keys:
      - v1-dependencies-gem-{{ checksum "Gemfile.lock" }}

Tell bundler where our gems are located

Something I ran into while trying to break up our original CI build into multiple jobs was getting bundler to pick up the gems in the vendor/bundle directory. My solution was to just run a bundle check, which seemed to set everything up correctly for the rest of the job (presumably because bundle check --path persists the path setting into .bundle/config).

- run:
    name: Setup bundler path
    command: |
      bundle check --path vendor/bundle

Restore bootsnap cache

bootsnap is a great gem. Start up times just get faster, which is awesome. There are a few things we need to do to make sure bootsnap is configured correctly for CI and tests. We don’t use the default require 'bootsnap/setup'; we configure it manually.

In our config/boot.rb:

...

require 'bundler/setup' # Set up gems listed in the Gemfile.
require 'bootsnap'

env = ENV['RAILS_ENV'] || ENV['RACK_ENV'] || ENV['ENV']
development_mode = ['', nil, 'development'].include?(env)

cache_dir = ENV['BOOTSNAP_CACHE_DIR'] || 'tmp/bootsnap_cache'

# Coverage tools and iseq caching are mutually exclusive, so disable
# iseq caching whenever coverage will run: always in CI, or locally
# when COVERAGE is set
compile_cache_iseq = !ENV['CIRCLECI'] && !ENV['COVERAGE']

Bootsnap.setup(
  cache_dir: cache_dir,
  development_mode: development_mode,
  load_path_cache: true,
  autoload_paths_cache: true,
  disable_trace: false,
  compile_cache_iseq: compile_cache_iseq,
  compile_cache_yaml: true
)

...

Things we are configuring:

  • cache_dir to tmp/bootsnap_cache. This is what we are actually restoring for the CircleCI cache in this step.
  • development_mode is turned off outside of development. This tells bootsnap that things aren’t going to be changing much, so it can treat the cache as stable.
  • disable_trace is set to false. It is set to false in the default setup (though not on the main README, which is why I mention it here). Just stuck with it.
  • compile_cache_iseq is turned off in a couple of scenarios. Running code coverage tools and using this flag are mutually exclusive. Since we always run coverage in our CI builds, we turn this flag off if we detect that we are running on CircleCI or if we explicitly enable coverage locally via something like COVERAGE=true rspec spec/. You can read more about it here.

We also make a small change to our spec_helper.rb

if ENV['CIRCLECI'] || ENV['COVERAGE']
  require 'simplecov'
  require 'codecov'
end

This turns coverage off locally unless we explicitly enable it via the COVERAGE environment variable. We are typically more interested in the output of the tests and would rather have them run quicker than have coverage run every time.

Our actual CircleCI step:

- restore_cache:
    keys:
      - v1-bootsnap-cache-{{ checksum "Gemfile.lock" }}-{{ .Branch }}
      - v1-bootsnap-cache-{{ checksum "Gemfile.lock" }}-master

Rails smoke test

I have a confession: we have some classes that don’t have any test coverage. I know, I know. We are horrible people, and we apologize. This step gives us the tiniest of smoke tests for those classes. We are just triggering the eager load of all of our classes, which mostly checks that our syntax and our class/file naming are ok.

Please don’t look at me like that. I already apologized…

- run:
    name: Eager load classes to check for issues
    command: |
        bundle exec rails runner 'Rails.application.eager_load!'

Clockwork smoke test

We use the clockwork gem to handle our cron-like tasks. Unfortunately, we have been bitten by a few misconfigured at values; when this happens, the tasks simply don’t run. This step invokes our clockwork.rb to warn us if something is amiss.

- run:
    name: Validate clockwork.rb
    command: |
      bundle exec ruby ./clockwork.rb

Wait for PG image to start

If your tests require a database, then it is probably important to make sure it is up and running. We put this farther down the list of steps to give the database image time to start up while we check other things. If the image hasn’t started by the time we get here, this step will block for up to a minute waiting for the database to get going. After a minute it will time out and fail the job.

- run:
    name: Wait for postgres
    command: dockerize -wait tcp://localhost:5432 -timeout 1m

Wait for Redis image to start

This step is similar to the one above except for our Redis image.

- run:
    name: Wait for redis
    command: dockerize -wait tcp://localhost:6379 -timeout 1m

Install psql

I found this great article when I got around to optimizing the database portion of our CI builds. The gist of it is that you monitor your db/ directory for changes. If there are changes, you run your setup like normal, dump the resulting SQL, and cache it. If there aren’t any changes, you simply load the cached SQL into your database.

In either case you’ll need to install the client associated with your database. Since we are running Postgres 10.3 we need to jump through a few hoops to get the correct client downloaded.

It is worth noting that we could probably shave a few seconds off here by adding this to a custom Docker image. We figured that the 5 seconds we would save weren’t worth the hassle of maintaining the custom image.

- run:
    name: Install Postgres Client
    command: |
      sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" >> /etc/apt/sources.list.d/pgdg.list'
      wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
      sudo apt update
      sudo apt install postgresql-client-10

Gather database info

This step is where we actually monitor for changes in our database configuration. In the above article they do this by using the git sha associated with the db/ directory (git log --pretty=format:'%H' -n 1 -- db/). We’ve taken a bit of a different approach: we checksum each file in db/, sort the results, and write them to a file. We then add the checksum of config/database.yml as well. Next comes the number of processes we are going to use for our parallel_test command, and finally the database version. If any of these change, the cached SQL dump will keep up with it.

- run:
    name: Database Checksum
    command: |
      find db -type f -exec md5sum {} \; | sort -k 2 > db_dir_checksums.txt
      md5sum config/database.yml >> db_dir_checksums.txt
      echo $PARALLEL_TEST_PROCESSORS >> db_dir_checksums.txt
      psql postgres -A -t -c 'SELECT version()' >> db_dir_checksums.txt

See what your database setup is

This is just a fun little step to see exactly what you are depending on for your database cache.

- run:
    name: Cat db_dir_checksums.txt
    command: |
      cat db_dir_checksums.txt

Restore your database dump

Once everything is written to db_dir_checksums.txt we can use it as a checksum in our cache key.

- restore_cache:
    keys:
      - v1-database-schema-{{ checksum "db_dir_checksums.txt" }}

Setup database(s)

If we had a cache hit in the previous step, we’ll have a file called postgres_dump.sql present. If it is there, we just load its data into our database. If it isn’t, then we know that we have a new database configuration, and we need to run our setup and dump the resulting schema.

- run:
    name: Database Setup
    command: |
      if [ -e postgres_dump.sql ]
      then
        echo "Restoring databases dump"
        psql -U postgres -f postgres_dump.sql
      else
        echo "Setting up databases"
        bundle exec rake parallel:setup
        echo "Dumping databases"
        pg_dumpall > postgres_dump.sql
      fi

Cache database dump

This will noop if this was a preexisting database configuration, but if not we’ll save the new schema.

- save_cache:
    key: v1-database-schema-{{ checksum "db_dir_checksums.txt" }}
    paths:
      - postgres_dump.sql

Create our test results directory

I found I needed to create the directory I was going to place our test results in ahead of time.

- run:
    name: Create test results directory
    command: |
      mkdir -p tmp/test-results

Restore our test timings for parallel_test

Fetch the most recent test timings for this branch. If this is the first time running tests on this branch, use the timings from master.

- restore_cache:
    keys:
      - v1-test-times-{{ .Branch }}
      - v1-test-times-master

Run our tests!!!

Finally we get to run our tests! There is a bunch going on here. There are three basic parts:

  1. the parallel_test invocation
  2. the RSpec flags
  3. the CircleCI test splits

The parallel_test invocation tells us we are using rspec (-t rspec) and where previous test timings can be found (--runtime-log tmp/test-results/parallel_runtime_rspec.log). The next five lines (following the --) are the flags that will be passed to rspec: three different formatters and their respective output files. The last line is some CircleCI magic for automatically splitting up specs when you use their built-in parallelism feature.

- run:
    name: Run specs
    command: |
      bundle exec parallel_test  \
        -t rspec \
        --runtime-log tmp/test-results/parallel_runtime_rspec.log \
        -- --format progress \
        --format RspecJunitFormatter \
        --out tmp/test-results/rspec.xml \
        --format ParallelTests::RSpec::RuntimeLogger \
        --out tmp/test-results/parallel_runtime_rspec.log \
        -- $(circleci tests glob "spec/**/*_spec.rb" | circleci tests split --split-by=timings)

Cache our test timings

Congrats! If you are here, that means your tests passed! Here we save the timings for our tests so that the next run can make use of this run’s data. We use -{{ epoch }} in the key so that our restore step will always pull the latest cached data.

- save_cache:
    key: v1-test-times-{{ .Branch }}-{{ epoch }}
    paths:
      - tmp/test-results/parallel_runtime_rspec.log

Cache our bootsnap cache

Similar to the previous step, we need to save our bootsnap cache so the next run can start up blazing fast!!!

- save_cache:
    key: v1-bootsnap-cache-{{ checksum "Gemfile.lock" }}-{{ .Branch }}-{{ epoch }}
    paths:
      - tmp/bootsnap_cache

Store test results

CircleCI stores test results metadata so you can make use of some Insights.

- store_test_results:
    path: tmp/test-results

Store test artifacts

Store the results as an artifact if you need them for anything else.

- store_artifacts:
    path: tmp/test-results
    destination: test-results

Store coverage artifacts

Don’t forget your coverage results!

- store_artifacts:
    path: coverage

Some general caching tips

You may have noticed a few things we do with our cache keys. I figured I’d just spell out two of the general approaches we take with these.

The first is that if the object to be cached is small, we will always create a new cache entry by using -{{ epoch }} as part of the key. This way we always have the most up-to-date info for the next CI run.

The second is that if we use -{{ .Branch }} in our key, we will usually add a fallback key ending in -master. The reason is that the cached object for master will most likely give you a good starting point for your new branch.
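Put together, the two patterns look like this (the key names here are illustrative, not from our real config):

```yaml
# Pattern 1: small object - always write a fresh entry with epoch,
# so a prefix restore on the branch key finds the newest data.
- save_cache:
    key: v1-timings-{{ .Branch }}-{{ epoch }}
    paths:
      - tmp/timings.log

# Pattern 2: branch key with a master fallback for new branches.
- restore_cache:
    keys:
      - v1-timings-{{ .Branch }}
      - v1-timings-master
```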

Conclusion

And there you go! There is a bunch in here, but I think this is a setup that can grow with your app. If you spot any issues, which I am sure are there somewhere, or see more room for improvement, please let us know at development@later.com.