
Is Your Terragrunt Truly DRY? A Smarter Approach to Infrastructure Management
Infrastructure as code, most often implemented with Terraform, has become the standard way of managing modern cloud environments. But let’s face it: as your infrastructure grows, so does its complexity. When you’re juggling multiple environments and modules, keeping your Terraform or OpenTofu code clean, consistent, and manageable often feels like an impossible balancing act. This is where Terragrunt comes in: a thin wrapper around Terraform meant to help make your infrastructure DRY (Don’t Repeat Yourself) and maintainable. But here’s the catch: does it really? While Terragrunt boldly promises to eliminate redundancy and streamline your codebase, many users find themselves falling back into the familiar pattern of copy-pasting, with folders multiplying as more environments are added and configurations quietly drifting apart. So, is Terragrunt as “DRY” as it claims to be?
In this article, we’ll dig into the common pitfalls of a typical Terragrunt project setup, challenge its weaknesses, and explore a smarter, more scalable architecture. Throughout, we’ll demonstrate key points using example code. You can access the full code examples here. Let’s start by looking at a common structure for a Terragrunt project:
.
├── envs
│   ├── _envcommon
│   │   ├── ec2.hcl
│   │   └── vpc.hcl
│   ├── dev
│   │   ├── ec2
│   │   │   └── terragrunt.hcl
│   │   ├── vpc
│   │   │   └── terragrunt.hcl
│   │   └── env.hcl
│   ├── staging
│   │   ├── ec2
│   │   │   └── terragrunt.hcl
│   │   ├── vpc
│   │   │   └── terragrunt.hcl
│   │   └── env.hcl
│   └── prod
│       ├── ec2
│       │   └── terragrunt.hcl
│       ├── vpc
│       │   └── terragrunt.hcl
│       └── env.hcl
├── modules
│   ├── ec2
│   │   └── main.tf
│   └── vpc
│       └── main.tf
└── root.hcl
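Each leaf folder under envs/ carries its own terragrunt.hcl. To make the critique concrete, here is a rough sketch of what such a file typically looks like (the include paths and inputs are illustrative, not taken from the example repo):

```hcl
# envs/dev/vpc/terragrunt.hcl (illustrative sketch)
include "root" {
  path = find_in_parent_folders("root.hcl")
}

# Pull in the component-wide defaults from _envcommon
include "envcommon" {
  path = "${dirname(find_in_parent_folders("root.hcl"))}/envs/_envcommon/vpc.hcl"
}

terraform {
  source = "../../../modules/vpc"
}

# Per-env values, repeated with small tweaks in staging and prod
inputs = {
  vpc_name = "dev-vpc"
  vpc_cidr = "10.1.0.0/16"
}
```

Near-identical copies of this file live in staging and prod, differing only in a handful of input values.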
What are the weak points of this architecture?
- Too Many Folders and Files: Each environment requires its own folder with sub-folders for every stack component. While manageable for small setups, this structure quickly becomes bulky and hard to navigate as the number of environments and components grows.
- Duplicated terragrunt.hcl Files: Even with a shared root.hcl, each component still needs its own terragrunt.hcl. This violates the DRY principle and increases maintenance, leading to inevitable configuration drift as changes are not made to all files.
- Scattered Input Variables: Input variables are spread across multiple files like env.hcl, _envcommon.hcl, and terragrunt.hcl. Inconsistent definitions using inputs or locals blocks make tracking inputs or comparing configurations between environments difficult.
- Limited Flexibility for Shared Variables: There’s no easy way to define variables for subsets of environments. You’re limited to global or environment-specific variables, making it hard to manage shared inputs for all non-prod or dev environments.
So, the solution we are looking for is a cleaner architecture that reduces clutter, eliminates duplication, and simplifies variable management across environments while staying true to the DRY principle. Let’s see how to achieve it.
As a first step, instead of a folder for each env, we will create a global template that will be used for all envs. A bit like a Terraform module, we want one Terragrunt folder that can be referenced multiple times to create different environments. That way, adding a new env will be very simple: just use the existing template! No need to copy-paste. And when you want to change or add something to the terragrunt.hcl, you only need to do it in this one file. No need to go into each environment’s terragrunt.hcl separately.
Let’s understand what I mean. Our new folder structure will look like this:
.
├── environment
│   ├── _envcommon
│   │   ├── dev.tfvars
│   │   ├── prod.tfvars
│   │   └── staging.tfvars
│   ├── ec2
│   │   ├── terragrunt.hcl
│   │   └── tfvars
│   │       ├── default.tfvars
│   │       ├── non-prod.tfvars
│   │       └── prod.tfvars
│   ├── vpc
│   │   ├── terragrunt.hcl
│   │   └── tfvars
│   │       └── default.tfvars
│   └── envs.yaml
├── modules
│   ├── ec2
│   │   └── main.tf
│   └── vpc
│       └── main.tf
└── root.hcl
The changes are:
- Instead of a folder for each env, there is one environment folder that is used for all Terragrunt runs. In it are the same sub-folders for the different components of our infra stack.
- In each component’s folder there is a folder called tfvars, in which all the configuration variables for that component are stored. Wherever possible, inputs are declared in .tfvars files, and not directly in inputs blocks or in locals blocks.
- In each environment/*/tfvars/ folder there is a default.tfvars that stores global variables shared among all environments.
- Other than the default.tfvars, there can be tfvars for specific envs (e.g. dev.tfvars, staging.tfvars), or for some combination of envs (e.g. non-prod.tfvars, qa.tfvars).
- Variables that are specific to one environment but shared among all the env’s components are stored inside _envcommon/.
- In the inputs block in each component’s terragrunt.hcl we keep only the inputs that come from a dependency. Any other inputs are declared in the relevant .tfvars file.
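To illustrate the tfvars layering (the variable names and values here are hypothetical, not from the example repo), environment/ec2/tfvars/ could look like this:

```hcl
# environment/ec2/tfvars/default.tfvars (baseline shared by every env)
instance_type = "t3.micro"
monitoring    = false

# environment/ec2/tfvars/prod.tfvars (prod-only overrides)
instance_type = "m5.large"
monitoring    = true
```

Because the files are passed to Terraform in order, a value set in prod.tfvars wins over the same variable in default.tfvars.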
OK, so we have our new structure, and we have put all the variables into tfvars files in an orderly fashion. But how do we tell Terragrunt how to run?
For that, we have the two main files: root.hcl and envs.yaml.
Let’s look at the envs.yaml:
dev:
  aws_region: "us-east-1"
  vpc:
    inputs:
      vpc_name: "dev-vpc"
      vpc_cidr: "10.1.0.0/16"
    tfvar_files:
      - default.tfvars
      - dev.tfvars
  ec2:
    inputs:
      instance_name: "dev-instance"
    tfvar_files:
      - default.tfvars
      - non-prod.tfvars
prod:
  aws_region: "us-east-1"
  vpc:
    inputs:
      vpc_name: "prod-vpc"
      vpc_cidr: "10.2.0.0/16"
    tfvar_files:
      - default.tfvars
      - prod.tfvars
  ec2:
    inputs:
      instance_name: "prod-instance"
    tfvar_files:
      - default.tfvars
      - prod.tfvars
The structure is pretty simple: for each env we want Terragrunt to deploy, we create a new top-level key and give it the environment’s name. Nested inside are all the configurations Terragrunt needs in order to deploy the resources: the AWS region to run in, the components to create, and the inputs to pass. You can see that in addition to passing the tfvars files to use, I can also put direct inputs under the inputs key. This is a personal decision; you can configure the yaml file to your needs and conventions.
What I like about this file is how easy it is to read. I can see at a glance what envs I have, what components each one has, and where all the values passed to the environment come from. Also, adding a new environment is as simple as adding a new top-level block, calling it qa, and assigning it the right values files. No need to create a whole new folder and copy the terragrunt.hcl files into it.
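For instance, a hypothetical qa environment could be added with nothing more than a new top-level block (the values here are illustrative):

```yaml
qa:
  aws_region: "us-east-1"
  vpc:
    inputs:
      vpc_name: "qa-vpc"
      vpc_cidr: "10.3.0.0/16"
    tfvar_files:
      - default.tfvars
      - non-prod.tfvars
  ec2:
    inputs:
      instance_name: "qa-instance"
    tfvar_files:
      - default.tfvars
      - non-prod.tfvars
```

No new folders, no copied terragrunt.hcl files: the existing template picks this block up automatically.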
We still haven’t seen how Terragrunt actually runs all of this, since we configured everything using yaml and not hcl. All the pieces fall into place when we look at the root.hcl file. This is where the real power of Terragrunt shines.
The first thing Terragrunt needs to know when we run any command like terragrunt apply is the context it is running in. The context is the env we want to update, or more specifically, the top-level key in the envs.yaml that Terragrunt should use in this specific run. We will use an environment variable that we export in advance and access in the locals block of the root.hcl file.
locals {
  env = get_env("TF_VAR_ENV")
}
We can now use local.env and some basic Terragrunt functions to extract the configuration for this run from the envs.yaml.
locals {
  env = get_env("TF_VAR_ENV")
  # Extract the top-level block with the env name as the key
  env_config = yamldecode(file("${get_terragrunt_dir()}/../envs.yaml"))[local.env]
  # Extract the aws_region from the env_config
  aws_region = local.env_config["aws_region"]
  # Extract the config for the current folder (component) we are in
  component        = basename("${get_original_terragrunt_dir()}")
  component_config = local.env_config[local.component]
  # Extract the direct inputs from the component_config
  inputs = try(local.component_config["inputs"], {})
  # Build the path to the env tfvars file
  env_tfvars_file = "${get_terragrunt_dir()}/../_envcommon/${local.env}.tfvars"
  # Build the paths to the additional tfvars files from the component_config
  tfvars_files = try([for file in local.component_config["tfvar_files"] : "${get_terragrunt_dir()}/tfvars/${file}"], [])
}
Example:
# Set the current context before running terragrunt:
cd environment/vpc
export TF_VAR_ENV=dev
# The local values for this terragrunt run will be:
local.env = "dev"
local.env_config = {
  aws_region: "us-east-1",
  vpc: {
    inputs: {
      vpc_name: "dev-vpc",
      vpc_cidr: "10.1.0.0/16"
    },
    tfvar_files: ["default.tfvars", "dev.tfvars"]
  },
  ec2: {
    inputs: {
      instance_name: "dev-instance"
    },
    tfvar_files: ["default.tfvars", "non-prod.tfvars"]
  }
}
local.aws_region = "us-east-1"
local.component = "vpc"
local.component_config = {
  inputs: {
    vpc_name: "dev-vpc",
    vpc_cidr: "10.1.0.0/16"
  },
  tfvar_files: ["default.tfvars", "dev.tfvars"]
}
local.inputs = {
  vpc_name: "dev-vpc",
  vpc_cidr: "10.1.0.0/16"
}
To tell Terragrunt to use the local.tfvars_files, we will add them to the terraform block:
terraform {
  extra_arguments "tfvars" {
    commands           = get_terraform_commands_that_need_vars()
    optional_var_files = concat([local.env_tfvars_file], local.tfvars_files)
  }
}
Using optional_var_files (rather than required_var_files) allows Terragrunt to run even if one of the listed files doesn’t exist.
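Under the hood, each entry in the list becomes a -var-file flag, so a dev run of the vpc component translates to roughly the following Terraform invocation (Terragrunt actually passes absolute paths; these are shortened for readability):

```shell
terraform plan \
  -var-file=environment/_envcommon/dev.tfvars \
  -var-file=environment/vpc/tfvars/default.tfvars \
  -var-file=environment/vpc/tfvars/dev.tfvars
```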
I also added some terminal output that will echo the context Terragrunt is running in, which can be useful for making sure we are not making changes to the wrong environment:
terraform {
  …
  before_hook "before_hook" {
    commands = ["apply", "plan", "init"]
    execute  = ["echo", "-e", "\\n\\033[1;33m====================================\\n🚀 STARTING TERRAGRUNT IN ENV: ${local.env}\\n====================================\\033[0m\\n"]
  }
  after_hook "after_hook" {
    commands     = ["apply", "plan", "init"]
    execute      = ["echo", "-e", "\\n\\033[1;32m====================================\\n✅ FINISHED TERRAGRUNT IN ENV: ${local.env}\\n====================================\\033[0m\\n"]
    run_on_error = true
  }
}
The output is a highlighted banner in the terminal announcing the env before and after every run.
The remote state will also use local.aws_region. In addition, we must update the way Terragrunt stores the tfstate files. In common Terragrunt usage, the folder structure serves as the path for storing the tfstate:
key = "${path_relative_to_include()}/terraform.tfstate"
The resulting bucket keys, for example, are dev/ec2/terraform.tfstate and staging/ec2/terraform.tfstate.
If we use this in the new structure, the key will always resolve to environment/ec2/terraform.tfstate regardless of the env, which will cause conflicts between environments and result in terraform destroying our resources.
To solve the problem of running different environments from the same folder path, we must change the way we construct the key:
remote_state {
  backend = "s3"
  generate = {
    path = "backend.tf"
  }
  config = {
    bucket         = "terraform-state"
    key            = "${local.env}/${local.component}/terraform.tfstate"
    region         = local.aws_region
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}
This will now result in the same keys as we had before using ${path_relative_to_include()}.
And to wrap things up, let’s not forget to pass the local.inputs to the module:
inputs = merge(local.inputs, {})
And there you have it! A truly DRY and maintainable Terragrunt project. To run it, all you need to do is cd into any component folder in the environment dir, export the TF_VAR_ENV variable, and run terragrunt apply. Then you can change the value of TF_VAR_ENV and run again to deploy a different env.
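Putting it all together, a full session looks like this (assuming Terragrunt is installed and AWS credentials are configured):

```shell
# Deploy the VPC to dev, then to staging, from the same folder
cd environment/vpc

export TF_VAR_ENV=dev
terragrunt apply

export TF_VAR_ENV=staging
terragrunt apply
```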
This is a simple setup, and the options on top of it are endless. Once you understand the relationship between the terragrunt.hcl, root.hcl, and envs.yaml, you can template and automate almost anything.
I’ll give you one last small example, if you made it this far with me. If you look at the envs.yaml file above, we are currently passing the vpc_name and instance_name as inputs for each environment:
dev:
  vpc:
    inputs:
      vpc_name: "dev-vpc"
  ec2:
    inputs:
      instance_name: "dev-instance"
prod:
  vpc:
    inputs:
      vpc_name: "prod-vpc"
  ec2:
    inputs:
      instance_name: "prod-instance"
The names follow a known pattern: they always start with the env name and continue the same way. So, instead of hardcoding the names, we can template them inside the terragrunt.hcl files, like this:
# environment/vpc/terragrunt.hcl
inputs = {
  vpc_name = "${include.root.locals.env}-vpc"
}

# environment/ec2/terragrunt.hcl
inputs = {
  vpc_id            = dependency.vpc.outputs.vpc_id
  subnet_id         = dependency.vpc.outputs.private_subnets[0]
  security_group_id = dependency.vpc.outputs.security_group_id
  instance_name     = "${include.root.locals.env}-instance"
}
The include.root.locals.env expression references the locals block of the root.hcl file, where we get the env name from the TF_VAR_ENV variable:
locals {
  env = get_env("TF_VAR_ENV")
}
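One detail worth noting: include.root.locals is only readable from a child file if its include block exposes the parent, so each component’s terragrunt.hcl should include root.hcl along these lines:

```hcl
# environment/vpc/terragrunt.hcl
include "root" {
  path   = find_in_parent_folders("root.hcl")
  expose = true # makes include.root.locals available in this file
}
```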
With this small update, we reduced redundancy and automated conventions, which can help keep our cloud environment clean, well-organized, and free of human error.
By restructuring your Terragrunt project with this DRY and modular approach, you’ll simplify your environment management and reduce maintenance overhead and the risk of configuration drift over time. This setup ensures clarity, scalability, and ease of use. With Terragrunt, the possibilities for automation and optimization are virtually limitless—your infrastructure can finally work smarter, not harder.