Robust CDK snapshot testing with snapshot serializers

One of the great things about CDK (the Cloud Development Kit) is that you can treat Infrastructure as Code as Code (IaCaC). That is, you can bring all the skills and experience you've built up over the years to your IaC code. One of these skills is unit testing. Unit testing has always been a pain point in IaC because it typically requires infrastructure to actually be created in order to test that it was created properly. Unfortunately, this is slow, can be expensive, and could even introduce security problems if there are security issues in the IaC. (Arguably this isn't unit testing anyway, but integration testing since it involves cloud APIs, but I'm not going to quibble about details.) Accordingly, most IaC that I've seen doesn't even try to do unit testing.

Enter CDK

With CDK, we write our IaC in a real programming language. So, true to the IaCaC principle, we can write unit tests. In CDK you can easily create a stack and test it for specific resources. The default CDK stack created by the cdk init app command even supplies you with a simple test as an example:

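import * as cdk from 'aws-cdk-lib';
import { Template } from 'aws-cdk-lib/assertions';
// Path assumes the layout generated for a project named "testcdk"
import * as Testcdk from '../lib/testcdk-stack';

test('SQS Queue Created', () => {
  const app = new cdk.App();
  // WHEN
  const stack = new Testcdk.TestcdkStack(app, 'MyTestStack');
  // THEN
  const template = Template.fromStack(stack);

  template.hasResourceProperties('AWS::SQS::Queue', {
    VisibilityTimeout: 300
  });
});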

You can easily expand this to do as much testing as you would like. However, like much example code you find, this may not be the best way of testing. You can absolutely do this, but generally people find it tedious. Besides, CDK allows you to create resources conditionally - again, IaCaC - and this kind of unit test is tedious to write and can be difficult to fix when the conditions change.

A better way - Snapshots

Because of this drawback, many people use the built-in jest snapshot testing. By default, this allows you to unit test your entire stack or construct at once.

  test('Snapshot Test', async () => {
    const app = new cdk.App();
    const applicationStack = await ApplicationStack.construct(app, 'TestApplicationStackSnapshot', {
      env: BOGUS_ACCOUNT_ENV_PROPS.env,
    });
    const template = Template.fromStack(applicationStack);

    // This will ensure that the configuration of the stack does not have any unintended changes
    // Note: this automatically creates a snapshot if there isn't one; otherwise, it will compare
    //       the state of the template with the saved snapshot
    expect(template.toJSON()).toMatchSnapshot();
  });

The snapshot will be stored in a file with a more-or-less JSON (not quite) format, by default in a local __snapshots__ sub-directory with a file name like application-stack.test.ts.snap. You should always check these snapshot files into git. As long as you make no changes to the resources, tests pass normally.

npm run test:unit

If you make resource changes, the snapshot test will fail, but it's easy to have the stored snapshot be updated. NOTE: make sure you actually examine the output to see that no unintended changes are being made.

npm run test:unit -- -u

Even better, you can create multiple snapshots. This lets you easily set up individual jest unit test cases to test conditionally-created resources:

  test('Snapshot Test for Condition X', async () => {
    // Rest of code is same as above
    expect(template.toJSON()).toMatchSnapshot('condition-x-snapshot');
  });

So what's not to like?

With all this goodness, snapshots seem ideal. Unfortunately, there is a hidden drawback. It turns out that there are a number of things that may change that you really don't care about. For example:

  • S3 buckets may be used without being given explicit names; the generated names may change from run to run.

  • Asset names are long random sets of characters - these are the sub-directory names that appear in the cdk.out directory - and they may change when unrelated changes are made to the stack.

This causes snapshot verification to fail with false positives. And like most false positives, they are not only annoying, but actually harmful because they may make it harder to find actual problems. The last thing you want is a developer blindly clicking through ten pages of false positives every time they do a commit.

Making snapshots less fragile FTW!

Fortunately, jest has a built-in capability to modify what goes into the snapshots: the snapshot serializer.

  expect.addSnapshotSerializer({
    test: (val) => checkCondition(val),
    print: (val) => substituteValue(val)
  });

This lets you check the values and substitute ones that are less fragile. The two big ones we've found with CDK are the asset-related ones above:

  // Expected patterns (account and region are the values from the env used to build the test stack)
  const bucketMatch = new RegExp(`cdk-[0-9a-z]{9}-assets-${account}-${region}`);
  const assetMatch = /[0-9a-f]{64}\.zip/;

  expect.addSnapshotSerializer({
    test: (val) => typeof val === 'string'
      && (val.match(bucketMatch) != null
        || val.match(assetMatch) != null),
    print: (val) => {
      // Substitute both the bucket part and the asset zip part
      let sval = `${val}`;
      sval = sval.replace(bucketMatch, '[ASSET BUCKET]');
      sval = sval.replace(assetMatch, '[ASSET ZIP]');
      return `"${sval}"`;
    }
  });

Now every string that looks like a bucket or asset name will be substituted with the string [ASSET BUCKET] or [ASSET ZIP].
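To make the substitution easy to check outside of jest, here is a minimal sketch that pulls the same logic into a plain function. The account and region values, the scrubAssets function, and the assetSerializer object are names made up for illustration; the patterns are the ones shown above, with the global flag added so that every occurrence in a string is replaced rather than only the first.

```typescript
// Hypothetical values; use the account and region your test stack is built with.
const account = '123456789012';
const region = 'us-east-1';

// Same patterns as above, plus the "g" flag to replace every occurrence.
const bucketPattern = new RegExp(`cdk-[0-9a-z]{9}-assets-${account}-${region}`, 'g');
const assetPattern = /[0-9a-f]{64}\.zip/g;

// Replace the fragile parts of a string with stable placeholders.
function scrubAssets(val: string): string {
  return val
    .replace(bucketPattern, '[ASSET BUCKET]')
    .replace(assetPattern, '[ASSET ZIP]');
}

// Object with the test/print pair shown earlier in this article.
// In a jest test file, register it with expect.addSnapshotSerializer(assetSerializer).
const assetSerializer = {
  // Only claim strings that would actually change.
  test: (val: unknown): boolean =>
    typeof val === 'string' && scrubAssets(val) !== val,
  // Re-add the quotes, since print returns the final snapshot text.
  print: (val: unknown): string => `"${scrubAssets(val as string)}"`,
};
```

Keeping the replacement in its own function means you can unit test it directly with a handful of sample strings before trusting it to rewrite your snapshots.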

Anything else fragile?

Once you have the basic serializer in place, you can substitute for other fragile things.

Can I check for specific keys?

You can, after a fashion. All strings, including key strings, are actually also passed into the serializer; but you don't usually need to modify the key values. However, the way the serializer seems to work is:

  • Descend through the object tree

  • Pass each node to the snapshot serializer's test function

  • If the test function returns true, call the print function, use its output, and don't descend any further

  • Otherwise, if the node is a JavaScript object or array, descend a level and call the test function on each sub-node

As a result, the test function is called for each object or array, and also each sub-object/array and string inside the object/array. This means you can modify values for only specific keys of an object, just not as simply as one would like. It would be a lot more straightforward if it were called for each key/value pair, but alas.
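The steps above can be sketched as a small recursive function. This is a simplified model of the behavior as described in this article, not jest's actual code; the Plugin type and the render function are names made up for illustration, and the output format is much plainer than a real snapshot.

```typescript
// Shape of a serializer as used in this article: a test/print pair.
type Plugin = {
  test: (val: unknown) => boolean;
  print: (val: unknown) => string;
};

// Simplified model of the traversal; NOT jest's real implementation.
function render(node: unknown, plugins: Plugin[]): string {
  // Offer the node to each plugin; the first match wins and stops the descent.
  for (const plugin of plugins) {
    if (plugin.test(node)) return plugin.print(node);
  }
  // No plugin matched: descend into arrays...
  if (Array.isArray(node)) {
    return `[${node.map((child) => render(child, plugins)).join(', ')}]`;
  }
  // ...and into objects, offering both keys and values to the plugins.
  if (typeof node === 'object' && node !== null) {
    const entries = Object.entries(node as Record<string, unknown>)
      .map(([key, value]) => `${render(key, plugins)}: ${render(value, plugins)}`);
    return `{${entries.join(', ')}}`;
  }
  // Leaves nobody claimed get default formatting.
  return typeof node === 'string' ? JSON.stringify(node) : String(node);
}
```

In this model a plugin never sees a key together with its value, only the enclosing object, then the key, then the value, each on its own. That is why matching on a specific key means matching the whole enclosing object.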

So, let's say you want to robustify a Lambda environment variable object that contains

{
  "SNOWFLAKE_CRED_SECRET": "sf/23488fa34c/profile",
  "OTHER_LAMBDA_ENV_VAR": "foobar"
}

by substituting the value for SNOWFLAKE_CRED_SECRET. The only way I know of to do this is to match the entire environment variable object and stringify it yourself, using something like:

  expect.addSnapshotSerializer({
    test: (val) => typeof val === 'object' && val !== null
      && 'SNOWFLAKE_CRED_SECRET' in val,
    print: (val) => {
      return JSON.stringify({
        ...(val as any),
        SNOWFLAKE_CRED_SECRET: '[SNOWFLAKE CRED SECRET]'
      });
    }
  });

Iterate

Beyond the known problem cases, keep an eye on your snapshots whenever they change to see whether any of the changes should pass through silently. For example, we found that base64-encoded hashes passed to our Lambdas often caused false positives. So, we added an additional serializer - you can add as many as you want - to substitute strings of that form with [B64 HASH]. I suggest doing this review even when you have expected changes to the stack. After a couple of rounds, you'll probably have most of your false positives ironed out.
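As a rough sketch of what such an extra serializer might look like: the pattern below assumes the hashes are base64-encoded SHA-256 digests (43 base64 characters followed by a single = pad), which may not match your hashes, so adjust it to whatever actually shows up in your snapshots. The names b64HashPattern and b64HashSerializer are made up for illustration.

```typescript
// Assumption: a base64-encoded SHA-256 digest, i.e. 43 base64 characters plus one "=".
// Anchored to the whole string so longer base64 blobs are left alone.
const b64HashPattern = /^[A-Za-z0-9+\/]{43}=$/;

// In a jest test file, register it with expect.addSnapshotSerializer(b64HashSerializer).
const b64HashSerializer = {
  test: (val: unknown): boolean =>
    typeof val === 'string' && b64HashPattern.test(val),
  // The whole string is the hash, so the placeholder replaces all of it.
  print: (_val: unknown): string => '"[B64 HASH]"',
};
```

Matching the whole string keeps this serializer from firing on ordinary values that merely contain base64-looking runs, at the cost of missing hashes embedded in longer strings.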

That's it! Let us know in the comments if you find any other useful strings to ignore.