Multi-Stack Integration Tests with CircleCI

There's an old adage about defensive security that goes like this:

A defender can find and fix a thousand vulnerabilities in their software, but if they miss even one, the attacker has already won.

We're not going to lie: defensive security is tough to get right. OWASP alone lists almost 200 classes of vulnerabilities, and between your standard XSS exploit and more obscure attacks like NoSQL injection, there are more ways for an attacker to exploit your application than any single team of engineers can be expected to protect against, at least if they want to have time left over to actually build a product. That's why we're firm believers in integrating vulnerability scanning into your DevOps process: if we can detect almost all of your vulnerabilities before your code even hits production, your engineers can spend more of their time solving problems instead of securing against them.

That's the goal at least, but not letting any vulnerabilities slip by in the first place is a task of its own. Most engineers agree that writing correct code is much easier with a solid test suite, and it's no different when dealing with vulnerability scanning, except when some vulnerabilities only manifest themselves on a misconfigured Tomcat server running on a Windows box. Unit tests are great, but unless you actually stress the application in a production setting, you risk letting some particularly nasty bugs slide through: in our case, false negatives for vulnerabilities with severe consequences.

In the course of building Tinfoil Security, we've written integration tests which pit our scanner against everything from Sinatra servers to your standard LAMP setup, with even a few Windows stacks thrown in for good measure. We soon found that our dependence on so many virtual machines meant that running our tests entirely locally was out of the question: our development machines just weren't powerful enough to run through the suite in any reasonable amount of time while also letting our engineers stay productive as the tests ran. We evaluated a few solutions, but anything viable required more resources than our small team was willing to throw at the problem. In the end, we found that we couldn't reasonably justify including some of our more expensive integration tests as part of our development cycle.

Enter CircleCI.

CircleCI markets itself as continuous integration with easy setup and little maintenance, and in our experience, that couldn't be closer to reality. We've experimented with a few solutions now, and have found that CircleCI offers just the right balance between the usability of other cloud CI providers and the extensibility of a self-hosted solution like Jenkins. For a while, we had just been using it to run our unit tests on every check-in, but in late May, CircleCI introduced the concept of parameterized builds. This was just what we needed to solve our problems with integration tests.

Every CircleCI integration is controlled using a circle.yml in your project's root. A standard configuration for a Ruby project might look something like this:

machine:
  ruby:
    version: 2.0.0-p451

test:
  override:
    - bundle exec rake spec

If your project needs it, there are dozens of other options available for setting up dependencies like MySQL or Redis. In our case, we were interested in the ability to read environment variables from the configuration file. If we could break our build process into discrete enough steps, we'd be able to parameterize our builds in such a way that we could easily run our various integration tests in parallel, nightly, without any of the maintenance cost associated with hosting our own build servers. Or at least, that was the original goal: we eventually got to the point where we could run our tests on every build, even while spinning up our small fleet of VMs.

We decided that working towards 100% coverage on WAVSEP would be an excellent first pass to test the waters. WAVSEP is a set of compliance tests that together act as a honeypot of sorts, meant to measure the performance of vulnerability scanners. Historically, we've done very well on the tests, but a recent revamp of the project meant that we had never actually run through the new suite in its entirety.

A portion of WAVSEP tests against some Windows-specific vulnerabilities, and so it requires a relatively beefy Windows installation to run. Setting up a Windows instance on EC2 is easy enough, but given the nature of WAVSEP, we didn't want to have to worry about keeping a vulnerable Windows server in pristine condition. Our builds would need to handle the lifecycle of any honeypots they test against.

If your project has unique requirements not handled natively by CircleCI's built-in options, you're able to make use of setup and cleanup blocks within your project's config file to run arbitrary code; your builds are sandboxed in an LXC container, and CircleCI gives you free rein to modify that container as you see fit. Our config file ended up looking something like this:

test:
  pre:
    - if [ -n "$RUN_WAVSEP"     ]; then bin/provision_wavsep;  fi
  override:
    - if [ -n "$RUN_UNIT_TESTS" ]; then bundle exec rake spec; fi
    - if [ -n "$RUN_WAVSEP"     ]; then bin/run_wavsep;        fi
  post:
    - if [ -n "$RUN_WAVSEP"     ]; then bin/teardown_wavsep;   fi
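
The `RUN_*` variables in the configuration above have to come from somewhere, and a parameterized build is one natural source, since build parameters are exposed to the build as environment variables. The following is a minimal sketch of requesting such a build from Ruby. It assumes the v1 REST API endpoint that CircleCI documented for parameterized builds at the time; the organization, project, branch, and token values are hypothetical placeholders.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Assumption: CircleCI's v1 REST API, whose "trigger a build on a branch"
# endpoint accepted a "build_parameters" object in the JSON body.
CIRCLE_API = 'https://circleci.com/api/v1'

# Builds (but does not send) the POST request for a parameterized build.
# org, project, branch and token are supplied by the caller.
def build_request(org:, project:, branch:, token:, parameters:)
  uri = URI("#{CIRCLE_API}/project/#{org}/#{project}/tree/#{branch}")
  uri.query = URI.encode_www_form('circle-token' => token)

  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate('build_parameters' => parameters)
  request
end

# Sends a request built above over HTTPS and returns the Net::HTTPResponse.
def trigger_build(request)
  uri = request.uri
  Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
    http.request(request)
  end
end

# Example (hypothetical names; needs a real API token to succeed):
#
#   request = build_request(org: 'example-org', project: 'scanner',
#                           branch: 'master', token: ENV.fetch('CIRCLE_TOKEN'),
#                           parameters: { 'RUN_WAVSEP' => 'true' })
#   puts trigger_build(request).code
```

Because the configuration tests each variable with `[ -n ... ]`, any non-empty value enables the corresponding step, including a value such as `false`. To skip a step, leave its parameter out of the request entirely.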

Within the provisioning step, we made use of the incredible fog gem to spin up an EC2 instance from an AMI we created of a clean WAVSEP installation. We weren't too concerned about securing this instance, but CircleCI does provide an AWS security group, which we used to block all incoming connections not originating from a build server. From here we simply called out to their API from a crontab, and before we knew it, we had WAVSEP running every night. Better yet, the constant feedback provided by the results has allowed us to cut the running time of the tests in half, making it feasible to run the tests on every build.
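
To make the instance lifecycle concrete, here is a rough sketch of what helpers behind scripts like `bin/provision_wavsep` and `bin/teardown_wavsep` could look like. It is not our actual code: the state-file path and the argument values are hypothetical, and it assumes fog's AWS compute interface (`Fog::Compute.new`, `servers.create`, `wait_for`, `servers.get`, `destroy`). The helpers take the compute connection as an argument, so they work with any object that responds like one.

```ruby
require 'fileutils'

# Hypothetical location for remembering which instance this build started.
STATE_FILE = 'tmp/wavsep_instance_id'

# Connects to EC2 through fog. The gem is required lazily so the helpers
# below do not depend on it being installed.
def aws_compute
  require 'fog' # third-party gem; newer setups may use fog-aws instead
  Fog::Compute.new(
    provider: 'AWS',
    aws_access_key_id: ENV.fetch('AWS_ACCESS_KEY_ID'),
    aws_secret_access_key: ENV.fetch('AWS_SECRET_ACCESS_KEY')
  )
end

# Boots a fresh instance from the clean image. The instance ID is recorded
# before waiting, so teardown can still find it if the wait times out.
def provision(compute, image_id:, flavor_id:, group:, state_file: STATE_FILE)
  server = compute.servers.create(
    image_id: image_id,   # AMI of the clean WAVSEP installation
    flavor_id: flavor_id, # instance size
    groups: [group]       # security group restricting inbound traffic
  )
  FileUtils.mkdir_p(File.dirname(state_file))
  File.write(state_file, server.id)
  server.wait_for { ready? }
  server
end

# Destroys the recorded instance, if any, and forgets its ID.
# Returns false when there was nothing to tear down.
def teardown(compute, state_file: STATE_FILE)
  return false unless File.exist?(state_file)

  server = compute.servers.get(File.read(state_file).strip)
  server.destroy if server
  File.delete(state_file)
  true
end
```

A provisioning script would then call something like `provision(aws_compute, image_id: ..., flavor_id: ..., group: ...)` with values for your own account, and the teardown script would call `teardown(aws_compute)`. Writing the instance ID to disk is just one way to hand it from one build step to the next.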

In the past month, we've progressed from never having run WAVSEP in its entirety, to running the entire suite every night, to, just recently, running through WAVSEP on every build. We've also uncovered numerous bugs missed by existing tests, prevented at least one particularly nasty regression, and achieved 100% coverage on the tests themselves (but that's a story for another blog post!). Based on the success of this experiment, we've since started to run all of our integration tests as part of every build, and the benefits are already showing.

Tests are awesome, but actually running them is even better.


Shane Wilton

Shane Wilton is the Grand Magistrate of Security at Tinfoil Security, and the company's resident programming language theorist. When he isn't coding in a functional language like Elixir, he's probably hacking on an interpreter for an esolang of his own, or playing around with dependent types in Idris. Security is always at the forefront of his thoughts, and he enjoys building tools which make it easy for other engineers to write secure code. His love for security is matched only by his love for bad movies - and does he ever love bad movies.
