Splitting one tfstate into multiple files

Or how I rebuilt my basic Terraform project

Starting off, I had a small Terraform project that was simply a test bed for new ideas. I had created a small set of configurations for a demo Cloudflare site and was experimenting with some Cloudflare security features and IaC techniques. Eventually that small project needed to be mobile, so that I could check it out from GitHub when working on my laptop in various places. So I added a remote backend on AWS, which was a fun excursion that I might also write about. Then, as I was planning additional experiments using multiple cloud providers, I quickly realized that the simple design I had started with was not going to be ideal. In fact, I wanted to emulate a real production design as best I could, so there was no room for half-assed designs that "just work". Nay! I needed this project to be fully robust. And frankly, it's a good learning experience.

I wanted to follow a particular design pattern that I had read about in Marcin Kasprowicz's Medium article, Ultimate Terraform Project Structure. However, in order to reshape my test project into that design, I would need to break out parts of my current tfstate into multiple files, one for each area of the project they tracked. Ah, but now the question: how do I restructure my project code without having to destroy any resources? Well, you can do this by using the terraform state commands to manage which tfstate files your resources are tracked in. Being a real noob at this, I needed a good primer to help me with my use case. I had read several articles and discussion topics on how best to split state files, and then stumbled upon an article by Brent Woodruff from HashiCorp's own support center, How to Split State Files.

How I restructured the project

Original Layout

The original layout for this project was a very simple single-directory layout. It included my configuration for the Cloudflare experiments, the AWS backend resources, and of course the necessary providers.

.
├── aws.tf
├── cloudflare.tf
└── providers.tf

New Design

While simple enough, this obviously would not work well at a larger scale and would quickly become difficult to deal with. So I decided to restructure it into a more professional design. Using the Ultimate Terraform Project Structure layout (which I highly recommend you read, it's very good stuff), I wanted to develop something that looked a little more like this:

├── common
│   ├── README.md
│   └── config.tf
├── pre
│   ├── aws
│   └── cloudflare
├── prod
│   ├── aws
│   │   ├── db
│   │   │   ├── README.md
│   │   │   ├── config.tf
│   │   │   ├── main.tf
│   │   │   ├── outputs.tf
│   │   │   └── variables.tf
│   │   ├── README.md
│   │   ├── config.tf
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   └── variables.tf
│   └── cloudflare
│       ├── README.md
│       ├── config.tf
│       ├── main.tf
│       ├── outputs.tf
│       └── variables.tf
└── rev

Here the design would be much more robust. A few key items of note:

  1. All resources for the tfstate backend would be maintained under the "common" directory using a single config.tf file.
  2. A production (prod) and a pre-production (pre) environment would be laid out in separate directories.
  3. Additionally, a rev directory exists for workspace-level testing (check the Ultimate Terraform Project Structure article for more details).
  4. In both prod and pre, individual cloud providers are broken out into their own respective directories, so changes to each can be de-coupled.
  5. Specialized resources can be broken out in their own directories, if desired, to keep their changes de-coupled from the rest of the architecture. In my case I didn't want my database to be modified when I changed other components in the AWS configuration, so the db directory is nested inside the aws directory.
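As a rough sketch, the empty skeleton for this layout could be created with a small shell helper. The function name scaffold_layout is my own invention for illustration, and it only creates the directories, not the README.md or .tf files inside them.

```shell
# Create the empty directory skeleton for the new layout under a root directory.
# scaffold_layout is a hypothetical helper name, not part of Terraform.
scaffold_layout() {
  local root="${1:-.}"
  mkdir -p \
    "$root/common" \
    "$root/pre/aws" "$root/pre/cloudflare" \
    "$root/prod/aws/db" "$root/prod/cloudflare" \
    "$root/rev"
}
```

Calling scaffold_layout . from the repository root would create the directories in place; since mkdir -p is used, running it again does no harm.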

Fun with the terraform state command

Now with my desired design laid out I needed to place my resources into new tfstate files. In order to do this we are going to use the terraform state command with its various subcommands.

First we need to pull down our current state to a local file (current.tfstate).

terraform state pull > current.tfstate

Next copy the current.tfstate file to a new, empty directory. The empty directory is important because the next few commands will not operate as expected if run in an init'd directory. If you do run it from a directory with existing terraform configuration, the commands will use those configurations and not our current.tfstate file. We don't want that.
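The pull and the copy can be combined into one step. This is a minimal sketch, assuming the original project directory is already initialized against the remote backend; stage_state is a hypothetical helper name, and it uses mktemp to guarantee the working directory really is new and empty.

```shell
# Pull the remote state from an initialized project directory and place it,
# as current.tfstate, in a brand-new empty working directory.
# stage_state is a hypothetical helper name, not a Terraform command.
stage_state() {
  local project_dir="$1" workdir
  workdir="$(mktemp -d)" || return 1
  # Run the pull inside the project, but write the result into the new directory.
  (cd "$project_dir" && terraform state pull) > "$workdir/current.tfstate" || return 1
  # Print the working directory so the caller can cd into it.
  echo "$workdir"
}
```

Something like cd "$(stage_state ~/projects/demo)" (the path is only an example) would then drop you into the clean directory with the state copy ready to work on.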

Now we will begin moving resources from the current.tfstate file into separate tfstates for common, aws, and cloudflare. We can identify our current resources by running the command terraform state list --state=current.tfstate > current-resources.txt. This will output a list of resources similar to the following:

# current-resources.txt
aws_dynamodb_table.terraform_locks
aws_kms_key.tf_kms_key
aws_s3_bucket.terraform_state
aws_s3_bucket_server_side_encryption_configuration.tf_state_sse
cloudflare_pages_domain.yourdomain_com
cloudflare_record.dev

For each of the new tfstates we will need to move the resources (one by one unfortunately) to their respective state files like this:

#AWS Example (do this for every AWS resource)
terraform state mv --state=current.tfstate --state-out=aws.tfstate aws_dynamodb_table.terraform_locks aws_dynamodb_table.terraform_locks

#Cloudflare Example (do this for every Cloudflare resource)
terraform state mv --state=current.tfstate --state-out=cloudflare.tfstate cloudflare_pages_domain.yourdomain_com cloudflare_pages_domain.yourdomain_com
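Since the moves are repetitive, a small loop can take some of the tedium out of it. The following is only a sketch: target_state and move_all are hypothetical helper names, and sorting purely by provider prefix is an assumption that fits the resource list above. Anything bound for common.tfstate would need its own case entry placed before the prefix patterns.

```shell
# Print the destination state file for a resource address.
# The prefix-to-file mapping is an assumption; adjust it for your project.
target_state() {
  case "$1" in
    aws_*)        echo "aws.tfstate" ;;
    cloudflare_*) echo "cloudflare.tfstate" ;;
    *)            echo "no destination for $1" >&2; return 1 ;;
  esac
}

# Move every address listed in a file (default: current-resources.txt) out of
# current.tfstate, skipping blank lines and comment lines.
move_all() {
  local list="${1:-current-resources.txt}" addr dest
  while IFS= read -r addr || [ -n "$addr" ]; do
    case "$addr" in ''|'#'*) continue ;; esac
    dest="$(target_state "$addr")" || return 1
    # </dev/null keeps terraform from consuming the rest of the list on stdin.
    terraform state mv --state=current.tfstate --state-out="$dest" \
      "$addr" "$addr" </dev/null || return 1
  done < "$list"
}
```

Run move_all from the empty working directory that holds current.tfstate. Afterwards, terraform state list --state=current.tfstate should print nothing if every resource found a home.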

After moving all of the resources into their respective files, my original current.tfstate file was empty and I had three new state files: one for the common resources for my backend state, one for my aws resources, and one for my cloudflare resources.

Now each of these new state files can be moved to its respective new directory under either common or prod. Once it is in place, we need to add the state to the remote backend. You can use the same S3 bucket as before. All you need to do to support multiple tfstate files is to store them under different paths (key prefixes) in your S3 bucket. For instance:

    AWS tfstate:  terraform/prod/aws/terraform.tfstate
    Cloudflare tfstate: terraform/prod/cloudflare/terraform.tfstate
    Common tfstate:  terraform/common/terraform.tfstate
    etc...

I designed an S3 directory schema in the form terraform/{env}/{service}/terraform.tfstate. But you could also design it however makes sense for your project, so long as each tfstate gets its own unique directory.
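To keep the keys consistent across directories, the schema can be written down once as a tiny helper. This is just an illustration; state_key is a hypothetical name, and treating the service part as optional is my assumption to cover the common case, which has no service level.

```shell
# Build the S3 object key for an environment and an optional service,
# following the terraform/{env}/{service}/terraform.tfstate schema.
# state_key is a hypothetical helper name.
state_key() {
  local env="$1" service="${2:-}"
  if [ -n "$service" ]; then
    printf 'terraform/%s/%s/terraform.tfstate\n' "$env" "$service"
  else
    printf 'terraform/%s/terraform.tfstate\n' "$env"
  fi
}
```

For example, state_key prod cloudflare yields the value to put in the key argument of the backend block for the prod/cloudflare directory.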

And in each of the terraform project directories, there will be a respective config.tf that looks like this:

terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "terraform/prod/cloudflare/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-locking"
    encrypt        = true
  }

  required_providers {
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4"
    }
  }
}

With the new Terraform state file and remote config in place, we now run the terraform init command in each directory to initialize it. Then we run the terraform state push command, for example terraform state push cloudflare.tfstate, which pushes the state to our remote backend.
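The per-directory steps can be wrapped up as below. This is a sketch under a few assumptions: push_state is a hypothetical helper name, the split state file already sits inside the target directory, and the directory's .tf files already describe the same resources. The final terraform plan is an extra sanity check of my own, not something the steps above require.

```shell
# Initialize one project directory, push its split state file to the remote
# backend, then check that Terraform sees nothing to change.
# push_state is a hypothetical helper name.
push_state() {
  local dir="$1" state_file="$2"
  (
    cd "$dir" || exit 1
    terraform init -input=false || exit 1
    terraform state push "$state_file" || exit 1
    # With -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes pending.
    terraform plan -detailed-exitcode -input=false
  )
}
```

A call such as push_state prod/cloudflare cloudflare.tfstate returning 0 suggests the code and the pushed state agree, so nothing would be destroyed or recreated on the next apply.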

Having performed all these steps for all of my resources and tfstate files, the configuration was now laid out in its new, more robust structure, without my having to destroy and recreate any resources.

Hopefully this helps you restructure your own projects without having to destroy and rebuild any infrastructure in the process.