Remote caching with sbt and S3 across multiple machines
sbt 1.4.x introduced remote caching, a feature for sharing build output in order to save time during compilation. You can think of it like incremental compilation: if you change one file in a project containing hundreds of files, you'd rather not have to recompile everything. Remote caching takes this one step further. Rather than being limited to your own machine, you can reuse compiled output across multiple machines (or multiple branches on your local machine, or what have you).
sbt's remote caching feature is technically still marked experimental, but we've been using it at work for over a year now and haven't run into issues so far. We originally only set up this feature in CI (in our case Jenkins) but recently I thought to myself, "Wouldn't it be nice to also re-use Jenkins' build output on my local machine too so that I never have to do a full recompile on this project ever again?" Considering the project in question is 300,000+ LOC and takes roughly 7 minutes to perform a clean build on my underpowered laptop, I figured this would be a huge win. And spoilers... it was! 😄 Now onto the setup:
Choosing the storage method
You have many options as to where to store the build cache. sbt can cache via Maven repository, so anywhere you can host a Maven repository is an option. That means a private Nexus repository would also work. In fact, here's a blog post from Muki Seiler which goes into that (as well as using minio as an alternative).
In our case, we didn't already have a private Nexus repository, nor were we too keen on setting one up, as that would mean another server running 24/7. We decided to try S3 instead, as it meant less infrastructure to set up and lower costs.
The setup
Luckily there already exists an sbt plugin for resolving and publishing artifacts using S3 called fm-sbt-s3-resolver. Simply add the following to your plugins.sbt (or wherever you store it):
addSbtPlugin("com.frugalmechanic" % "fm-sbt-s3-resolver" % "0.20.0")
Then create an S3 bucket where you intend to store the build cache. For example, let's call the bucket sbt_build_cache.
Once that's done, you can set the pushRemoteCacheTo sbt setting in your build file like so:
pushRemoteCacheTo := Some("S3 Remote Cache" at "s3://your-s3-bucket/sbt_build_cache/")
Note that it's called pushRemoteCacheTo but this works in both directions (push and pull). There is no pullRemoteCacheFrom.
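If your build has multiple subprojects, you'll probably want this applied everywhere rather than repeated per project. A minimal sketch, using the same setting as above but scoped to ThisBuild:

ThisBuild / pushRemoteCacheTo := Some("S3 Remote Cache" at "s3://your-s3-bucket/sbt_build_cache/")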
Credentials for S3
fm-sbt-s3-resolver has a section about credentials, but personally I didn't see the need for any of that. The default AWSCredentialsProvider should already do everything you need. For your local machine, just make sure you've already configured your credentials file (the one usually located at ~/.aws/credentials). In my case, I had already done this a long time ago, which meant I didn't need to do anything else, but if you've never done it before, follow the steps described here.
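For reference, that file is only a couple of lines. A minimal sketch with placeholder values (substitute your own IAM user's keys):

[default]
aws_access_key_id = <your access key id>
aws_secret_access_key = <your secret access key>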
Similarly, our Jenkins instance was basically already configured, as it runs on EC2 and has an AWS role attached to it that can access the S3 bucket. The default AWSCredentialsProvider takes it from there and will perform the auth. If you haven't done this before, the following article may be helpful.
And that's it. No need to set up a shared account or commit anything extra to the repository. Each developer should have their own AWS account that can be managed (and revoked) separately that accesses the S3 bucket.
While I've not tried this with GitHub Actions, I imagine the easiest solution would be to use the configure-aws-credentials action and follow the instructions described there.
Actual usage
In CI, when running your sbt command, be sure to include the pullRemoteCache and pushRemoteCache tasks. Something like pullRemoteCache; Test/compile; pushRemoteCache; test or however you'd prefer to order the tasks (such as pushing the remote cache only after your tests have passed).
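As a concrete sketch, assuming sbt is on the PATH, the CI shell step could pass the whole sequence as one quoted multi-command so everything runs in a single sbt session:

sbt "pullRemoteCache; Test/compile; pushRemoteCache; test"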
As for usage on my local machine, I simply call pullRemoteCache followed by a Test/compile in sbt. What was once 7 minutes to compile everything now takes 15 seconds. Almost all that time is spent downloading the artifacts; the compile step itself takes 2 seconds.
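If you find yourself doing that a lot, an sbt command alias saves some typing (the alias name here is just an example). In build.sbt:

addCommandAlias("cachedCompile", "pullRemoteCache; Test/compile")

Then a plain cachedCompile from the sbt shell does both steps.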
Note the download from S3 only happens once per content hash. The artifacts are stored in your local sbt cache after the first download. So if you do another clean followed by a pullRemoteCache, it'll be much faster the second time.
The tricky parts
Everything up to this point went smoothly for me. As I mentioned earlier, we've been using sbt remote caching for over a year now in CI. The difference this time is that our local machines use a different environment from Jenkins. So when I pulled the remote cache from S3 (that Jenkins had uploaded) onto my Windows dev machine, I ran into unforeseen issues. While the pullRemoteCache task worked fine, whenever I went to compile anything it would do a full recompile each time.
What I later realized is that all the cache files were getting invalidated by sbt/Zinc during the incremental compilation step. The reason was that what was being passed into scalacOptions differed slightly between my machine and Jenkins. For example, if my machine uses a -Ybackend-parallelism value of 16 while Jenkins uses 4, then Zinc will invalidate all the files and recompile everything.
An even worse example is that on Windows the plugin paths are sent as -Xplugin:target\compiler_plugins\wartremover_2.13.8-3.0.5.jar and not -Xplugin:target/compiler_plugins/wartremover_2.13.8-3.0.5.jar. The backslash difference is enough to cause everything to get invalidated.
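If you run into something similar, a quick way to spot the mismatch is to print the options on both machines and diff the output. From the sbt shell on each machine:

show Compile/scalacOptions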
As a workaround I did the following:
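// Tell Zinc to ignore scalac options matching these regexes when deciding whether
// anything needs recompiling (plugin paths and parallelism differ per machine)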
incOptions := incOptions.value
.withIgnoredScalacOptions(
incOptions.value.ignoredScalacOptions ++ Array(
"-Xplugin:.*",
"-Ybackend-parallelism [\\d]+"
)
)
Luckily this fixed it and everything works fine now.
It took me a while to find but if you're struggling with similar issues, use the following options to enable the logs for debugging purposes:
logLevel := Level.Debug
incOptions := incOptions.value.withApiDebug(true)
You'll be able to see what is causing the invalidation. Beware though, the logs are noisy; you may want to pipe the output to a file.
Additional thoughts
Consider adding an expiration policy to your S3 bucket so that these cache files don't accumulate forever. We arbitrarily went with a 2 week lifetime, but use whatever value you think makes sense.
As for sbt's incremental compilation options, I feel like it should be more aware of which scalac options are "safe" (i.e. don't ruin repeatable builds) and which are unsafe. Right now it seems way too aggressive to me as any difference will cause an invalidation. At the very least it should normalize paths. But perhaps this was never a concern before the remote caching feature was introduced. Maybe it's something that makes sense to address now.
Conclusion
And that's it! I hope that was helpful and you're able to get remote caching working. It's definitely worth it for large projects. Surprisingly I rarely see this feature mentioned, and that's a shame because it's been tremendously helpful for cutting down our CI times. And now with local compilation too!