Prologue
Once upon a time, when I was just a cub programmer, I walked into the office of a senior engineer, whom I’ll call Rob (because that was his name). Rob was looking at a configuration file for some product we were using. He looked up and said to me:
“They keep making this configuration language more complex. They should just use a regular programming language.”
I don’t remember what the product was or what the language looked like, but I never forgot Rob’s comment.
The Rise of the Domain Specific Languages (DSLs)
Years went by. We saw what we might call the rise of the DSLs, led by XML. Sure, DSLs had long been with us: SQL and regular expressions had been around for decades. But with the advent of XML, we now had an extensible language that let us describe things in more complexity than, say, a Java properties file. I was so excited about XML, I joined the XML working group. (I even had dinner with Sir Tim Berners-Lee.) Before long, the wordiness of XML made it passé and JSON took over; then JSON’s explosion of punctuation led to YAML. We also had languages whose syntax was even more domain specific, such as Terraform’s HCL. DSLs were everywhere.
But they all had the same problem.
Tell me if you've ever seen this progression. You come up with a DSL in your product; you think, this will be simple, so you come up with a simple language.
{
"baseDir": "c:/src/myproject",
"verbose": false
}
People start using the product. Before long, they find shortcomings of the DSL. The following dance ensues. “You know, if I could just substitute variables, I’d be able to reuse the configuration.” So the DSL developer adds variables.
{
"baseDir": "${BASEDIR}",
"verbose": false
}
“We’d like to reuse this in different environments, and some of it is only applicable to certain ones.” OK, let’s add a simple conditional capability.
{
"cond": {
"expr": "${ENV}==prod",
"if": {
"baseDir": "${BASEDIR}"
},
"else": {
"baseDir": "."
}
},
"verbose": false
}
“Great, but in some environments we need to specify more than one value.” OK, let’s add arrays as a data type.
{
"cond": {
"expr": "${ENV}==prod",
"if": {
"baseDir": "${BASEDIR}"
},
"else": {
"baseDir": [".", "c:/temp"]
},
}
"verbose": false
}
“Now we want to repeat parts of it based on the arrays.” OK, let’s add a looping construct. “It would be great to modularize this configuration.” OK, let’s add server-side includes. “Oh, but we need to parameterize the includes.” OK, let’s make them modules with input parameters. “What if we could use inheritance…”
And so on. I’ve seen this dance between users and developers happen again and again over the years, and I've been on both sides of the dance. It seems to happen anytime a DSL gets popular. People start using it, and before long they are coming up with use cases we as the original developer didn’t think of. That’s good, right? It means the product is getting usage, and people are using it creatively. It’s what everybody wants.
Can we do this more efficiently?
I can't help but think of Rob's remark at this point. Note what we’re doing when we add all those features to a DSL: we are recapitulating the evolution of programming languages in general. From the Jacquard loom and the Difference Engine to Turing machines, to FORTRAN and COBOL, to structured, object-oriented, and aspect-oriented programming, languages have evolved to become more flexible, building up a body of software engineering concepts like modularity, abstraction, and separation of concerns. Many modifications of these DSLs expand the language in that same direction.
Some of these modifications are more general than others; sometimes limited changes need to be enhanced with future modifications. If the modifications are not carefully designed, we end up with languages that are hard to use. (Note: A lot of PHP's problems have been fixed. But it took a long time, and a lot of us suffered long because of them.) Even if there are no poor design decisions, we all need to learn new languages that are usually less well-documented than other existing programming languages. With a DSL, we don't have years of Stack Overflow posts to draw on or a large community of people to ask general software design questions regarding the language.
What if we could use a better-thought-out, fully featured means of expression from the beginning? General-purpose programming languages have this history built into them to lesser or greater degrees. Can we just use their collected wisdom? I think we can. One of the newer capabilities in this direction is the AWS Cloud Development Kit, or CDK. CDK is designed primarily (as the name implies) for cloud infrastructure - Infrastructure as Code - but I think it's worth considering as a general example.
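To make that concrete, here is a minimal sketch of the JSON configuration from earlier rewritten as plain TypeScript. It is only an illustration: the names `Environment`, `Config`, `buildConfig`, and `describeConfig` are hypothetical and do not come from any real product. The point is that the variable, the conditional, the array, and the loop all come for free from the language.

```typescript
// Hypothetical names throughout; this mirrors the JSON examples above,
// not the configuration format of any real product.
type Environment = "prod" | "dev";

interface Config {
  baseDirs: string[];
  verbose: boolean;
}

// Variables and conditionals are ordinary language features here,
// not extensions bolted onto a DSL.
function buildConfig(env: Environment, baseDir: string): Config {
  const baseDirs = env === "prod" ? [baseDir] : [".", "c:/temp"];
  return { baseDirs, verbose: false };
}

// Repeating something per array element needs no special looping construct.
function describeConfig(config: Config): string[] {
  return config.baseDirs.map((dir, i) => `baseDir[${i}] = ${dir}`);
}

const prodConfig = buildConfig("prod", "c:/src/myproject");
const devConfig = buildConfig("dev", "c:/src/myproject");

console.log(describeConfig(prodConfig));
console.log(describeConfig(devConfig));
```

Modularity falls out the same way: `buildConfig` can live in its own file and be imported wherever it is needed, with its parameters playing the role of the "module inputs" a DSL would eventually have to invent.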
With CDK, you have the full power of a real programming language - currently including TypeScript, Python, C#, Go, and Java. (Why you'd want to use Java for IaC I don't quite understand, but go for it.) Want variables? Conditionals? Loops? Modularity? Class- or prototype-based inheritance? You've got it, and you can use Stack Overflow just as you do every day. And you can choose whatever language works best for you and your development team. Here's a simple example of CDK that provisions a Lambda function and associated Lambda Layer with the citysdk npm package:
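// Imports assume AWS CDK v2 (aws-cdk-lib); adjust the package names for v1.
import * as path from 'path';
import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';
import { Construct } from 'constructs';

export class MyAPIStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    // Lambda Layer that packages the citysdk utilities
    const citysdkLayer = new lambda.LayerVersion(this, 'CitySDK Layer', {
      code: lambda.Code.fromAsset('src/layers/citysdk-utils'),
      compatibleRuntimes: [ lambda.Runtime.NODEJS_12_X, lambda.Runtime.NODEJS_14_X ]
    });
    // Lambda function bundled from TypeScript source, using the layer above
    const helloWorld = new NodejsFunction(this, 'hello-world-function', {
      memorySize: 128,
      timeout: cdk.Duration.seconds(5),
      runtime: lambda.Runtime.NODEJS_14_X,
      handler: 'main',
      entry: path.join(__dirname, `/../src/hello.ts`),
      layers: [ citysdkLayer ]
    });
  }
}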
You can examine each of these constructs in CDK's excellent documentation, but I think it is pretty clear what's going on: it's creating a stack (equivalent to, and implemented by, a CloudFormation stack) with a Lambda Layer and a Lambda that uses it. Deploying the code is pretty simple, too:
cdk deploy --profile my_aws_profile
Now, let’s say you want to customize the behavior for different environments. Fine, read the configuration from an external file. Make the layer optional or include different layers depending on the environment? No problem. Refactor the Lambda creation out to a separate source file? Fine, just import it. Package up a set of resources and reuse it across projects or publish it to the world? Fine, create a subclass of the Construct class to encapsulate the resources. In short, do anything a general-purpose programming language can do.
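The shape of those customizations can be sketched without CDK at all. In the sketch below, plain objects stand in for cloud resources, and every name (`Stage`, `FunctionSpec`, `layersFor`, `ApiBundle`, the `monitoring` layer) is hypothetical and is not part of the CDK API. In a real CDK project the reusable class would extend `Construct` and create actual resources, but the pattern is the same: an ordinary conditional picks the layers, and an ordinary class with constructor parameters packages the resources for reuse.

```typescript
// CDK-free sketch: plain objects stand in for cloud resources.
// All names here are hypothetical and are not part of the CDK API.
type Stage = "dev" | "prod";

interface FunctionSpec {
  name: string;
  memorySize: number;
  layers: string[];
}

// Choose layers per environment with an ordinary conditional.
function layersFor(stage: Stage): string[] {
  const layers = ["citysdk-utils"];
  if (stage === "prod") {
    layers.push("monitoring"); // assumed extra layer, for illustration only
  }
  return layers;
}

// A reusable, parameterized bundle of resources, analogous in spirit
// to subclassing Construct in CDK.
class ApiBundle {
  readonly functions: FunctionSpec[];

  constructor(stage: Stage, functionNames: string[], memorySize = 128) {
    // One function spec per requested name: a plain loop, no DSL construct.
    this.functions = functionNames.map((name) => ({
      name: `${name}-${stage}`,
      memorySize,
      layers: layersFor(stage),
    }));
  }
}

const devApi = new ApiBundle("dev", ["hello-world"]);
const prodApi = new ApiBundle("prod", ["hello-world", "goodbye-world"], 256);

console.log(devApi.functions);
console.log(prodApi.functions);
```

Because `ApiBundle` is just a class, it can be moved to its own file, published as a package, or subclassed, with no new language features required.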
One more thing that will appeal to anybody who has spent far too long with Mr. Google trying to understand the ins and outs of some DSL: Since this is all just TypeScript, you can control-click in Visual Studio Code on any element and see its definition and possible values. This has saved me so much time, rather than trying to find blog postings illustrating the possible options for constructs.
CDK vs IaC Alternatives
Comparing CDK to other DSLs designed for cloud IaC is instructive. Let's compare it to several alternatives. I'm sure there are more, but these are the ones I have personal experience with.
CloudFormation, the granddaddy of AWS IaC tools. CloudFormation is written using JSON or YAML. There is some parameterization and modularization capability, but there are a number of limitations (e.g. stacks that publish outputs for other stacks to use are locked together with them, not allowing changes until the dependent stack is deleted). And in the end, it's extremely wordy and inflexible, lacking even simple looping constructions.
ARM Templates, Azure's answer to CloudFormation. ARM templates also use JSON (with the same kind of wordiness), but attempt to fix some of the reusability issues of CloudFormation, with more flexible conditional and looping constructs. However, in my experience, the overall developer experience ends up being much worse. You can't delete a deployed ARM template, for example, and have it automatically delete all the resources as you can with CloudFormation (at least you couldn't when I was using them extensively a couple of years ago). In addition, ARM templates just generally don't work very well. I spent literally days trying to get somebody from Microsoft support to help me get a rather complex ARM template working. In the end, Microsoft couldn't help, and we punted. There is now a more recent language built on top of ARM templates called Bicep (get it? ARM? Bicep? Those wacky Microsoft guys!). Unfortunately, they ended up just creating another DSL that's reminiscent of Terraform's HCL.
Terraform's HCL, a cross-platform IaC tool. Terraform is a standalone provisioning tool that directly uses the cloud APIs of providers such as AWS, Azure, and Google Cloud Platform. Normally you configure Terraform using HCL, a somewhat JSON-like DSL with some extensions for conditions, looping, and modularity. Over the years, HCL has definitely gotten better: it no longer requires Terragrunt, a "thin wrapper" that made HCL more modular but, in the bargain, added a layer of opaque complexity for very little benefit. Note that Terraform also supports CDK (see below); I'm only speaking about HCL as a problematic DSL.
Serverless Application Model (SAM), a client-side toolset and clever set of macros that make it much easier to deploy serverless applications than straight CloudFormation. However, in the end, SAM is just a thin layer on top of CloudFormation, and though its client-side tools are pretty neat, I think SAM's DSL (YAML) disqualifies it from being used directly as an IaC tool. (I will soon be writing an article directly comparing the development experience in SAM vs CDK.)
Serverless Framework, a cross-platform tool for deploying serverless applications, typically configured using YAML. To get around YAML's lack of modularity and other modern programming constructs, Serverless Framework supports a plugin mechanism, with a large ecosystem of plugins. However, in my experience, these plugins are often buggy and poorly documented and don't always work with other plugins. Besides, IMO the importance of its cross-platform nature is overstated. Does anybody believe you can really write an application once and deploy it to AWS, Azure, and GCP without modifications? And do you really want to? I've done cross-cloud deployments, and they have always sucked.
One thing I should mention is that Terraform has a new capability called CDK-TF. Unfortunately, as of this writing, it is still in beta, but I believe it's pretty stable, and I have talked to members of the CDK-TF team, who I think will be taking it to 1.0 soon. CDK-TF fixes the problems of HCL and is my go-to solution for IaC code anywhere other than AWS.
The Bottom Line
Though there have been a few successful DSLs (regex and SQL being two), in my opinion, there have been many more cases that would have benefited from just using a real programming language in the first place. So, why do we keep creating DSLs? In part, it's fun: I know how much fun it is to come up with your own configuration language. But if it becomes popular, the pressure to add more and more features will never go away. To quote my old co-worker Rob once more, we "should just use a regular programming language".