Synchronizing Terraform State Files to External Storage from TFC/TFE

As organizations scale their infrastructure with Terraform, managing state becomes increasingly critical. While HCP Terraform (previously known as Terraform Cloud, or TFC) and Terraform Enterprise (TFE) provide robust state management capabilities, some scenarios require maintaining additional copies of state files in external storage systems. Today, I'll walk through a couple of approaches for synchronizing your Terraform state from TFC/TFE to Amazon S3 or similar storage services, should your organization require it.
Why Maintain External Copies of Terraform State?
Before diving into implementation, let's consider why you might need this capability:
- Disaster recovery: Maintaining additional backups beyond TFC/TFE's built-in mechanisms
- Compliance requirements: Meeting regulatory needs for data retention or storage location
- Resiliency management: Reviewing resiliency details with AWS Resilience Hub (we'll look at this in a later post)
- Integration with custom tools: Supporting internal systems that require access to state data
Implementation Approaches
Let's explore two primary approaches to accomplish this state synchronization.
Approach 1: CLI-Based Workflow
If you're utilizing a CLI-based workflow with TFC/TFE, the process is relatively straightforward:
- After executing your Terraform operations, run terraform state pull to retrieve the current state
- Save this output to a file
- Upload the file to your external storage (e.g., Amazon S3)
Here's a simple bash script demonstrating this approach:
#!/bin/bash
set -euo pipefail

# Pull current state
terraform state pull > current_state.tfstate

# Upload to S3
aws s3 cp current_state.tfstate s3://your-bucket-name/path/to/state/current_state.tfstate

# Cleanup
rm current_state.tfstate
This approach is simple and can be integrated into your existing CLI-based HCP Terraform workflow.
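If you sync after every apply, overwriting a single object loses history. As a small extension of the script above, here's a sketch (the bucket and prefix names are placeholders of my own) that builds a timestamped object key so each sync also keeps an immutable historical copy:

```shell
#!/bin/bash
# Sketch of a retention-friendly variant: keep an immutable, timestamped copy
# of every sync in addition to overwriting a "latest" object. The prefix and
# bucket names below are placeholders -- adjust for your environment.
set -euo pipefail

PREFIX="tfstate/your-workspace"

# Build a sortable key such as tfstate/your-workspace/20240131T120000Z.tfstate
timestamped_key() {
  printf '%s/%s.tfstate\n' "$PREFIX" "$(date -u +%Y%m%dT%H%M%SZ)"
}

echo "would upload to: $(timestamped_key)"

# The sync itself would then look like:
#   terraform state pull > current_state.tfstate
#   aws s3 cp current_state.tfstate "s3://your-bucket-name/${PREFIX}/latest.tfstate"
#   aws s3 cp current_state.tfstate "s3://your-bucket-name/$(timestamped_key)"
#   rm current_state.tfstate
```

Alternatively, enabling S3 bucket versioning gives you similar retention without changing the key scheme at all.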
Approach 2: API-Based with CI/CD Integration
For VCS-driven workflows, accessing the Terraform API directly provides greater flexibility. This method can be implemented in any CI/CD system with access to your TFC/TFE environment.
An example implementation using GitHub Actions is laid out here.
The implementation follows these key steps:
- Authenticate with the Terraform Cloud/Enterprise API
- Retrieve the workspace details to identify the current state version
- Download the state file
- Upload it to Amazon S3
Here's a simplified version of the bash implementation:
#!/bin/bash
set -euo pipefail

# Set variables
TF_WORKSPACE="your-workspace-name"
TF_ORG="your-organization"
S3_BUCKET="your-s3-bucket"
TFE_TOKEN="your-tfe-token"

# Get workspace ID
WORKSPACE_ID=$(curl -s \
  --header "Authorization: Bearer $TFE_TOKEN" \
  --header "Content-Type: application/vnd.api+json" \
  "https://app.terraform.io/api/v2/organizations/${TF_ORG}/workspaces/${TF_WORKSPACE}" \
  | jq -r '.data.id')

# Get current state version and state download URL
STATE_VERSION_ID=$(curl -s \
  --header "Authorization: Bearer $TFE_TOKEN" \
  --header "Content-Type: application/vnd.api+json" \
  "https://app.terraform.io/api/v2/workspaces/${WORKSPACE_ID}/current-state-version" \
  | jq -r '.data.id')

STATE_URL=$(curl -s \
  --header "Authorization: Bearer $TFE_TOKEN" \
  --header "Content-Type: application/vnd.api+json" \
  "https://app.terraform.io/api/v2/state-versions/${STATE_VERSION_ID}" \
  | jq -r '.data.attributes."hosted-state-download-url"')

# Download state file
# -L because the download URL captured above redirects to the actual object
curl -sL \
  --header "Authorization: Bearer $TFE_TOKEN" \
  --header "Content-Type: application/vnd.api+json" \
  "$STATE_URL" \
  --output terraform.tfstate

# Upload to S3
aws s3 cp terraform.tfstate "s3://${S3_BUCKET}/terraform.tfstate"
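One caveat with the script above: when an API call fails or the token lacks permissions, the jq pipelines typically evaluate to the literal string "null" (or an empty string), and the script would carry on with a bogus value. A small guard helper (the function name is my own, not part of any API) lets it fail fast instead:

```shell
#!/bin/bash
# Sketch: fail fast when an API lookup did not resolve to a usable value.
set -u

require_value() {
  local name="$1" value="$2"
  if [ -z "$value" ] || [ "$value" = "null" ]; then
    echo "error: could not resolve ${name}" >&2
    return 1
  fi
}

# usage, right after each lookup in the script above:
#   require_value WORKSPACE_ID "$WORKSPACE_ID"
#   require_value STATE_VERSION_ID "$STATE_VERSION_ID"
#   require_value STATE_URL "$STATE_URL"
```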
Setting Up in GitHub Actions
The repository provides a complete GitHub Actions workflow for this task. Here's how you might set it up:
- Store your Terraform Cloud/Enterprise API token and AWS credentials as GitHub secrets
- Create a workflow file (e.g., .github/workflows/sync-state.yml)
- Configure the workflow to run on your preferred schedule or trigger
name: Sync Terraform State to S3

on:
  schedule:
    - cron: '0 0 * * *'  # Daily at midnight
  workflow_dispatch:      # Allow manual triggering

jobs:
  sync-state:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.IAM_ROLE }}  # role allowed to be assumed by the GitHub repo
          aws-region: us-east-1

      - name: Sync state to S3
        env:
          TFE_TOKEN: ${{ secrets.TF_TOKEN }}
          TF_ORG: "your-org-name"
          TF_WORKSPACE: "your-workspace"
          S3_BUCKET: "your-bucket-name"
        run: |
          # Script to sync state (as shown above)
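On a daily schedule, the state often hasn't changed between runs. Every Terraform state file carries a monotonically increasing serial, so a small guard can skip redundant uploads. Here's a sketch (the helper names and the last_serial S3 object are conventions I've invented for illustration, not a TFC/TFE or AWS feature):

```shell
#!/bin/bash
# Sketch: skip the upload when the state serial hasn't changed since the last
# sync. The "last_serial" object is a convention invented here.
set -euo pipefail

# Read the monotonically increasing serial from a downloaded state file
state_serial() {
  jq -r '.serial' "$1"
}

# Upload only when the serial differs from the last synced one
should_upload() {
  local new="$1" last="$2"
  [ "$new" != "$last" ]
}

# usage inside the "Sync state to S3" step:
#   NEW=$(state_serial terraform.tfstate)
#   LAST=$(aws s3 cp "s3://${S3_BUCKET}/last_serial" - 2>/dev/null || echo "")
#   if should_upload "$NEW" "$LAST"; then
#     aws s3 cp terraform.tfstate "s3://${S3_BUCKET}/terraform.tfstate"
#     echo "$NEW" | aws s3 cp - "s3://${S3_BUCKET}/last_serial"
#   fi
```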
Conclusion
While Terraform Cloud and Enterprise provide robust state management capabilities, there are legitimate scenarios where maintaining external copies of your state files could be required. The approaches outlined in this post offer flexible solutions whether you're using CLI-based workflows or VCS-driven automation.
Note: Always ensure your state file handling complies with your organization's security policies, as state files may contain sensitive information.
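On that security point, one concrete mitigation is to server-side encrypt the external copies with a customer-managed KMS key; aws s3 cp supports this through its --sse and --sse-kms-key-id options. A sketch (the key alias is a placeholder of my own):

```shell
#!/bin/bash
# Sketch: assemble the upload command with server-side KMS encryption. The key
# alias "alias/tfstate-backup" is a placeholder for your own CMK.
set -euo pipefail

build_upload_cmd() {
  local file="$1" bucket="$2" key="$3"
  printf '%s ' aws s3 cp "$file" "s3://${bucket}/${key}" \
    --sse aws:kms --sse-kms-key-id alias/tfstate-backup
  printf '\n'
}

# Inspect the command before wiring it into the sync scripts above
build_upload_cmd terraform.tfstate your-bucket-name tfstate/terraform.tfstate
```

Pairing this with a bucket policy that rejects unencrypted puts keeps every copy protected even if a script omits the flags.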