At Later, we use CircleCI for most of our testing (and some of our deployment) needs. There are a bunch of great articles out there on how to speed up your CI builds. Over the past year we have spent some time making sure that our development and deployment processes aren’t blocked by our test suites.
In this post I am going to walk step by step through two of the CircleCI jobs that we run on all Ruby code pushed to GitHub. The first job caches and audits our dependencies and runs Rubocop, and the second job actually runs all of our specs. You can see the config.yml
in full here. Parts of our setup require some code changes in our app; when we run into those, I’ll add the Ruby code to see how it works in conjunction with the CI build.
Some of the speed ups we talk about here won’t really make much of a difference for smaller projects, but the nice thing is their benefit will grow along with your app. Add them now and don’t worry about them anymore.
Here is a brief look at what we’ll be going over:
- Speeding up your rubocop step: caching, --parallel
- Enabling and configuring bootsnap for an ephemeral CI environment
- Speeding up database setup: using *-ram images, tracking changes and caching the schema
- Using the parallel_test gem: PARALLEL_TEST_PROCESSORS, database setup
Setup Job
As mentioned above we use this job to do a few things to prep for the rest of the CircleCI workflow. These are:
- Download and cache our dependencies
- Audit our dependencies for vulnerabilities
- Run rubocop
Docker
The first thing to mention is that we try to use the CircleCI pre-built Docker images. If we do need to use a custom Docker image, we will typically start from one of CircleCI’s images. Our reasoning is that it is more likely that the layers for these images are already cached on the instance our job is running on. In turn this reduces the overhead associated with the ‘Spin up Environment’ step each job has to execute.
For our Ruby based apps we typically use this image:
docker:
- image: circleci/ruby:2.5.1-node-browsers
Code checkout
This step is probably part of every job. It checks out your code to the CI machine. It is worth noting that some people might benefit from source caching. We gave it a shot, but didn’t really see any benefit.
- checkout
Get the Gemfile.lock from the master branch
This command creates a file that is a duplicate of the Gemfile.lock
from the master
branch. I’ll explain why we do this in the next step.
- run:
name: Get the Gemfile.lock from the master branch
command: |
git show master:Gemfile.lock > MasterGemfile.lock
Restore gem cache
One of the first things you can do to help speed up your builds is to cache your gem dependencies. You can read about all the details here, but I'll give a quick synopsis. The job will check each of the keys in order, looking to see if there is a matching cache entry. An important note is that these aren't exact string matches, but prefix matches. If the key (prefix) matches multiple entries, the job will pull the most recently created cache object.
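To make the prefix-matching behavior concrete, here is a toy Ruby model of how I understand restore_cache key resolution (an approximation of the documented behavior, not CircleCI's actual implementation):

```ruby
# Toy model of restore_cache: try each key in order as a PREFIX match; among
# matching entries, the most recently created one wins. This is an
# approximation of CircleCI's documented behavior, not its real code.
def resolve_cache(keys, entries)
  keys.each do |prefix|
    matches = entries.select { |entry| entry[:key].start_with?(prefix) }
    return matches.max_by { |entry| entry[:created_at] } unless matches.empty?
  end
  nil
end

entries = [
  { key: "v1-dependencies-gem-abc123", created_at: 100 }, # older cache entry
  { key: "v1-dependencies-gem-def456", created_at: 200 }  # newer cache entry
]

# A checksum key that matches nothing falls through to the bare prefix,
# which matches both entries; the newest one is returned.
resolve_cache(["v1-dependencies-gem-zzz999", "v1-dependencies-gem"], entries)
```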
Here we first try to pull the cache associated with a particular instance of our Gemfile.lock
file. If we have added or updated some gems then our Gemfile.lock
will have changed and we will fallback to the most recent cache object created by a CI run on the master
branch. I am making a few assumptions here. One is that master
is your default git branch, and the second is that master
is most likely to have a version of Gemfile.lock
closest to this branch’s newly updated Gemfile.lock
. There are situations when other options might be closer, but I believe this generalizes pretty well across most situations.
Note: We use a v*-
prefix in case we need to bust the cache all at once.
- restore_cache:
keys:
- v1-dependencies-gem-{{ checksum "Gemfile.lock" }}
- v1-dependencies-gem-{{ checksum "MasterGemfile.lock" }}
- v1-dependencies-gem
Verify and/or install dependencies
This step checks to see if our cache has everything we need and, if not, installs the missing gems. Doing a bundle check first speeds things up a little in the case where you don't need to download anything. We set the path to vendor/bundle so we can easily cache the gems in a later step. The --jobs=4 flag adds some parallelism to speed things up.
- run:
name: bundle install
command: |
bundle check --path vendor/bundle || bundle install --jobs=4 --retry=3 --path vendor/bundle
Gems security audit
Security vulnerabilities are no good. The wonderful bundle-audit
gem helps you keep on top of issues with your gems. Note: the --update
flag was added in version 0.5.0.
- run:
name: Dependencies security audit
command: |
bundle exec bundle-audit check --update
Remove unused gems
We don’t want to cache gems or versions we are no longer using, so let’s clean things up.
- run:
name: Clean up gems before we save
command: |
bundle clean
Cache gems
Cache the gems associated with this version of the Gemfile.lock
. Notice the vendor/bundle
path. By putting all of our gems there, it allows us to cache them easily with this step.
A couple of things to note: if the key already exists, this step will effectively noop. In cases where your dependencies did change, this step can take a bit of time in mature Rails apps. This is one reason we don't add a {{ .Branch }} to the key; that would force a cache save on the first run of each branch, which shouldn't be required.
- save_cache:
key: v1-dependencies-gem-{{ checksum "Gemfile.lock" }}
paths:
- vendor/bundle
Restore rubocop cache
Rubocop is amazing. Something people might not realize is that rubocop caches its results to be used on later runs. If you are using the same version of rubocop
, the same .rubocop.yml
, and the contents of a file haven’t changed, then rubocop
can just display the results from a previous run. No parsing or analyzing needed!
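As a sketch of why those three inputs matter, a result-cache key can be thought of as a digest over them. This is a hypothetical derivation for illustration only; rubocop's real cache lives in RuboCop::ResultCache and uses more inputs:

```ruby
require 'digest'

# Hypothetical sketch of a rubocop-style result cache key: same rubocop
# version + same config + unchanged file source => same key, so the stored
# offense report can be reused without re-analyzing the file.
def result_cache_key(rubocop_version, rubocop_yml, file_source)
  Digest::SHA1.hexdigest([rubocop_version, rubocop_yml, file_source].join("\0"))
end

config    = "Metrics/LineLength:\n  Max: 100\n"
unchanged = result_cache_key("0.58.2", config, "puts 'hi'\n")
rerun     = result_cache_key("0.58.2", config, "puts 'hi'\n")
edited    = result_cache_key("0.58.2", config, "puts 'bye'\n")

unchanged == rerun  # cached result reused
unchanged == edited # file changed, rubocop re-analyzes it
```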
Here we are restoring said cache. First we check for the latest data from our branch, then check for the latest from master
and if that fails, we check for the latest from any branch using the same .rubocop.yml
file.
- restore_cache:
keys:
- v1-rubocop-cache-{{ checksum ".rubocop.yml" }}-{{ .Branch }}
- v1-rubocop-cache-{{ checksum ".rubocop.yml" }}-master
- v1-rubocop-cache-{{ checksum ".rubocop.yml" }}
Run rubocop
Make sure to use the --parallel
option! Didn’t know about this for a good while.
- run:
name: Rubocop
command: bundle exec rubocop --parallel
Save rubocop cache
Rubocop defaults to using $HOME/.cache/rubocop_cache
to store all of its results. Here we add {{ epoch }}
to the cache key to make sure that we are always using the most recent rubocop results for this branch. Remember that the restore_cache
step is a prefix match that uses the most recently written version it finds that matches the prefix. The cache object is small, so we can feel free to write this every single run.
- save_cache:
key: v1-rubocop-cache-{{ checksum ".rubocop.yml" }}-{{ .Branch }}-{{ epoch }}
paths:
- ../.cache/rubocop_cache
And that finishes up our setup job!
Specs Job
We are trying to make good use of CircleCI 2.0 and workflows at Later. Workflows let us parallelize different jobs and cut down on the end-to-end time of our CI runs. By running our setup job first, then following it up with a separate job to run our tests, we give ourselves the flexibility to parallelize our tests in a lot of different ways. We could use CircleCI's built-in parallelism, break them up by rspec tags, or split on the actual spec directories.
In addition to parallelizing our tests, when we are running CI for our master branch we are actually processing all our docs while our tests are running. If the tests pass, then we publish the docs. If the tests fail, then the workflow exits and the docs deploy job never runs. Since we use rspec_api_documentation
and yard
for our docs, they benefit from the gem caching we do in the setup job.
Docker
You’ll notice that we are using the same base Ruby image from CircleCI, along with a redis and a postgres image. Be sure to notice that we use the -ram variant for the postgres image. This image has postgres set up to use a RAM volume instead of the disk. These 4 characters can really help speed up your tests. Outside of some settings for our database, you can see we set the PARALLEL_TEST_PROCESSORS environment variable. This tells the parallel_test gem how many processes to use when running your specs.
Note: We have found that setting PARALLEL_TEST_PROCESSORS
to 4
has been a good configuration for CircleCI’s medium
instances. Unless otherwise specified, your jobs are running on a medium instance.
docker:
- image: circleci/ruby:2.5.1-node-browsers
environment:
RAILS_ENV: test
PGHOST: 127.0.0.1
PGUSER: root
PARALLEL_TEST_PROCESSORS: 4
- image: redis:3.0
- image: circleci/postgres:10.3-ram
environment:
POSTGRES_USER: root
Code checkout
Once again we check out our code.
- checkout
Restore gem dependencies
We only need to check one key for our dependencies in this job because we know that we cached something to this key either in the previous setup job or in a previous workflow run.
- restore_cache:
keys:
- v1-dependencies-gem-{{ checksum "Gemfile.lock" }}
Tell bundler where our gems are located
Something I ran into while trying to break up our original CI build into multiple jobs was getting bundler to pick up the gems in the vendor/bundle
directory. My solution was to just run a bundle check
which seemed to set everything up correctly for the rest of the job.
- run:
name: Setup bundler path
command: |
bundle check --path vendor/bundle
Restore bootsnap cache
bootsnap
is a great gem. Start up times just get faster. Which is awesome. There are a few things we need to do in order to make sure bootsnap
is configured correctly for CI and tests. We don’t use the default require 'bootsnap/setup'; we configure it manually.
In our config/boot.rb
:
...
require 'bundler/setup' # Set up gems listed in the Gemfile.
require 'bootsnap'
env = ENV['RAILS_ENV'] || ENV['RACK_ENV'] || ENV['ENV']
development_mode = ['', nil, 'development'].include?(env)
cache_dir = ENV['BOOTSNAP_CACHE_DIR'] || 'tmp/bootsnap_cache'
# If we explicitly run coverage locally or are in CI where we always run it
compile_cache_iseq = !ENV['CIRCLECI'] && !ENV['COVERAGE']
Bootsnap.setup(
cache_dir: cache_dir,
development_mode: development_mode,
load_path_cache: true,
autoload_paths_cache: true,
disable_trace: false,
compile_cache_iseq: compile_cache_iseq,
compile_cache_yaml: true
)
...
Things we are configuring:
- cache_dir is set to tmp/bootsnap_cache. This is what we are actually restoring from the CircleCI cache in this step.
- development_mode is turned off. This tells bootsnap that things aren’t going to be changing a bunch, so it can treat the cache as stable.
- disable_trace is set to false. It is set to false in the default setup (though not on the main README, which is why I mention it here). We just stuck with it.
- compile_cache_iseq is turned off in a couple of scenarios. Running code coverage tools and using this flag are mutually exclusive. Since we always run coverage in our CI builds, we turn this flag off if we detect that we are running on CircleCI, or when we explicitly enable coverage locally via something like COVERAGE=true rspec spec/. You can read more about it here.
We also make a small change to our spec_helper.rb:
unless !ENV['CIRCLECI'] && !ENV['COVERAGE']
require 'simplecov'
require 'codecov'
end
This turns off coverage locally unless we explicitly enable it via the COVERAGE environment variable. We are typically more interested in the output of the tests and would rather have them run quicker than have coverage run every time.
Our actual CircleCI step:
- restore_cache:
keys:
- v1-bootsnap-cache-{{ checksum "Gemfile.lock" }}-{{ .Branch }}
- v1-bootsnap-cache-{{ checksum "Gemfile.lock" }}-master
Rails smoke test
I have a confession. We have some classes that don’t have any test coverage. I know, I know. We are horrible people, and we apologize. This step gives us the tiniest of smoke tests for those classes. We are just triggering the eager load of all of our classes. It’ll mostly check to make sure some syntax and our class/file naming is ok.
Please don’t look at me like that. I already apologized…
- run:
name: Eager load classes to check for issues
command: |
bundle exec rails runner 'Rails.application.eager_load!'
Clockwork smoke test
We use the clockwork
gem to handle our cron like tasks. Unfortunately we have been bitten by a few misconfigured at
values. When this happens, the tasks simply don’t run. This step invokes our clockwork.rb
to warn us if something is amiss.
- run:
name: Validate clockwork.rb
command: |
bundle exec ruby ./clockwork.rb
Wait for PG image to start
If your tests require a database to run, then it is probably important to make sure it is up and running. We put this farther down the list of steps to give the database image time to startup while we check other things. If by the time we get here, the image hasn’t started, this step will block for up to a minute waiting for the database to get going. After a minute it will timeout and fail the job.
- run:
name: Wait for postgres
command: dockerize -wait tcp://localhost:5432 -timeout 1m
Wait for Redis image to start
This step is similar to the one above except for our Redis image.
- run:
name: Wait for redis
command: dockerize -wait tcp://localhost:6379 -timeout 1m
Install psql
I found this great article when I got around to optimizing the database portion of our CI builds. The gist of the article is that you monitor your db/
directory for changes and if there are changes then you run your setup like normal, dump the resulting SQL, and cache it. If there aren’t any changes then you simply load the cached SQL into your database.
In either case you’ll need to install the client associated with your database. Since we are running Postgres 10.3 we need to jump through a few hoops to get the correct client downloaded.
It is worth noting that we could probably shave a few seconds off here by adding this to a custom Docker image. We figured that the 5 seconds we would save weren’t worth the hassle of maintaining the custom image.
- run:
name: Install Postgres Client
command: |
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" >> /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt update
sudo apt install -y postgresql-client-10
Gather database info
This step is where we actually monitor for changes in our database configuration. In the above article they do this by using the git sha associated with the db/
directory (git log --pretty=format:'%H' -n 1 -- db/
). We’ve taken a bit of a different approach. We decided to checksum each file in the db/ directory, sort the results, and cat them to a file. We then add the checksum of config/database.yml as well. Next come the number of processes we are going to use for our parallel_test command and the database version. If any of these change, the cached SQL dump will keep up with it.
- run:
name: Database Checksum
command: |
find db -type f -exec md5sum {} \; | sort -k 2 > db_dir_checksums.txt
md5sum config/database.yml >> db_dir_checksums.txt
echo $PARALLEL_TEST_PROCESSORS >> db_dir_checksums.txt
psql postgres -A -t -c 'SELECT version()' >> db_dir_checksums.txt
See what your database setup is
This is just a fun little step to see exactly what you are depending on for your database cache.
- run:
name: Cat db_dir_checksums.txt
command: |
cat db_dir_checksums.txt
Restore your database dump
Once everything is written to db_dir_checksums.txt
we can use it as a checksum in our cache key.
- restore_cache:
keys:
- v1-database-schema-{{ checksum "db_dir_checksums.txt" }}
Setup database(s)
If we had a cache hit in the previous step, we’ll have a file called postgres_dump.sql present. If it is there, we just load its data into our database. If it isn’t, then we know that we have a new database configuration and we need to run our setup and dump the resulting schema.
- run:
name: Database Setup
command: |
if [ -e postgres_dump.sql ]
then
echo "Restoring databases dump"
psql -U postgres -f postgres_dump.sql
else
echo "Setting up databases"
bundle exec rake parallel:setup
echo "Dumping databases"
pg_dumpall > postgres_dump.sql
fi
Cache database dump
This will noop if this was a preexisting database configuration, but if not we’ll save the new schema.
- save_cache:
key: v1-database-schema-{{ checksum "db_dir_checksums.txt" }}
paths:
- postgres_dump.sql
Create our test results directory
I found I needed to create the directory I was going to place our test results in ahead of time.
- run:
name: Create test results directory
command: |
mkdir -p tmp/test-results
Restore our test timings for parallel_test
Fetch the most recent test timings for this branch. If this is the first time running tests on this branch, use the timings from master
.
- restore_cache:
keys:
- v1-test-times-{{ .Branch }}
- v1-test-times-master
Run our tests!!!
Finally we get to run our tests! There is a bunch going on here. There are three basic parts:
1) the parallel_test
invocation
2) the RSpec flags
3) the CircleCI test splits
The parallel_test
invocation is telling us we are using rspec
(-t rspec
) and where previous test timings can be found (--runtime-log tmp/test-results/parallel_runtime_rspec.log
). The next five lines (following the --
) are the flags that will be passed to rspec
. These are three different formatters and their respective output files. The last line is some CircleCI magic for automatically splitting up specs when you use their built-in parallelism feature.
- run:
name: Run specs
command: |
bundle exec parallel_test \
-t rspec \
--runtime-log tmp/test-results/parallel_runtime_rspec.log \
-- --format progress \
--format RspecJunitFormatter \
--out tmp/test-results/rspec.xml \
--format ParallelTests::RSpec::RuntimeLogger \
--out tmp/test-results/parallel_runtime_rspec.log \
-- $(circleci tests glob "spec/**/*_spec.rb" | circleci tests split --split-by=timings)
Cache our test timings
Congrats! If you are here, that means your tests passed! Here we are saving the timings for our tests so that the next run will be able to make use of this run’s data. We use -{{ epoch }}
in the key so that our restore step will always pull the latest cached data.
- save_cache:
key: v1-test-times-{{ .Branch }}-{{ epoch }}
paths:
- tmp/test-results/parallel_runtime_rspec.log
Cache our bootsnap
cache
Similar to the previous step, we need to save our bootsnap
cache so the next run can start up blazing fast!!!
- save_cache:
key: v1-bootsnap-cache-{{ checksum "Gemfile.lock" }}-{{ .Branch }}-{{ epoch }}
paths:
- tmp/bootsnap_cache
Store test results
CircleCI stores test results metadata so you can make use of some Insights.
- store_test_results:
path: tmp/test-results
Store test artifacts
Store the results as an artifact if you need them for anything else.
- store_artifacts:
path: tmp/test-results
destination: test-results
Store coverage artifacts
Don’t forget your coverage results!
- store_artifacts:
path: coverage
Some general caching tips
You may have noticed a few things we do with our cache keys. I figured I’d just spell out two of the general approaches we take with these.
The first one is that if the object to be cached is small, we will always create a new cache entry by using -{{ epoch }}
as part of our key. This way we will always have the most recent and up to date info for the next CI run.
The second is that if we use -{{ .Branch }}
in our key, we will usually have the fallback key to be -master
. The reason is that the cached object for master
will most likely give you a good starting point for your new branch.
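Putting those two tips together, a save/restore pair for a small, frequently updated object ends up looking something like this (illustrative key names, not from our actual config):

```yaml
# Save on every run: {{ epoch }} makes each key unique, so the newest entry
# is always the one a prefix match will find.
- save_cache:
    key: v1-example-{{ .Branch }}-{{ epoch }}
    paths:
      - tmp/example

# Restore by prefix: this branch's latest first, then fall back to master.
- restore_cache:
    keys:
      - v1-example-{{ .Branch }}
      - v1-example-master
```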
Conclusion
And there you go! There is a bunch in there, but I think this setup is one that can grow with your app and CI setup. If you spot any issues, which I am sure are there somewhere, or see more room for improvement, please let us know at development@later.com.