Terraform plan fails in GitHub Actions PR workflow with state lock error

I can’t seem to figure out why my GitHub Actions workflow fails during the terraform plan step on a pull request.

Error:

Error acquiring the state lock
Lock Info:
 ID: abc-123
 Path: gc_state.tfstate
 Operation: OperationTypeApply

I am using hashicorp/consul-action to manage the remote backend in v1.9. The apply step runs on merge to main, but the plan on PR cannot acquire the lock if a previous run failed or is running. How do I configure the GitHub Actions YAML to handle state locking for parallel PRs without breaking the CI/CD pipeline?

This is caused by a stale lock held by a previous failed apply operation.

- name: Force unlock state
  if: failure()
  # -force skips the interactive confirmation prompt, which would otherwise fail in CI
  run: terraform force-unlock -force abc-123

Run that command locally or in a cleanup job to clear the stale lock before re-running the plan, using the ID shown in the Lock Info block of your error. If the previous operation didn’t complete cleanly, the backend keeps the lock until it is explicitly released.
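The lock ID changes with every lock, so a hardcoded value only works once. Here is a rough sketch of reading it from the plan output instead. The step id `plan` and the file name `plan.log` are placeholders I made up, and it assumes the error text has the same "Lock Info" layout as in the question.

```yaml
steps:
  - name: Terraform plan
    id: plan
    # tee keeps a copy of the output so a later step can read the lock ID from it
    run: |
      set -o pipefail
      terraform plan -input=false -no-color 2>&1 | tee plan.log

  - name: Force unlock stale lock
    # failure() lets this step run after a failed step; the outcome check limits it to plan failures
    if: failure() && steps.plan.outcome == 'failure'
    run: |
      # Print the last field of the first "ID:" line that follows "Lock Info:"
      LOCK_ID=$(awk '/Lock Info:/ {f=1} f && /ID:/ {print $NF; exit}' plan.log)
      if [ -n "$LOCK_ID" ]; then
        terraform force-unlock -force "$LOCK_ID"
      fi
```

Only do this if you are sure no other run can still be holding the lock. This step cannot tell a stale lock from one that belongs to an apply in progress.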

This is a classic stale-lock problem: the previous job terminated unexpectedly without releasing the lock, so the lock persists. The suggestion above addresses that, but it lacks the robustness needed for a CI/CD pipeline. The documentation states: “The lock must be explicitly released using the force-unlock command if the operation fails or is interrupted.” However, a conditional step in the same job might not catch all failure modes in GitHub Actions, especially if the workflow is cancelled manually or times out.

A more reliable approach is a dedicated cleanup workflow that is triggered when the plan workflow completes and runs whenever that workflow did not succeed, so a leftover lock gets released. This prevents subsequent PRs from blocking indefinitely. Here is an example of how to structure this in your workflow file:

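name: Terraform Cleanup
on:
  workflow_run:
    workflows: ["Terraform Plan"]
    types:
      - completed

jobs:
  cleanup:
    runs-on: ubuntu-latest
    # Run only when the plan workflow failed, was cancelled, or timed out
    if: ${{ github.event.workflow_run.conclusion != 'success' }}
    steps:
      - uses: actions/checkout@v3
      - uses: hashicorp/consul-action@v2
        with:
          consul-token: ${{ secrets.CONSUL_TOKEN }}
      - run: |
          terraform init -input=false
          # terraform state pull returns only the state, which carries no lock metadata,
          # so read the lock ID from the error printed when a plan cannot take the lock.
          LOCK_ID=$(terraform plan -input=false -no-color 2>&1 | awk '/Lock Info:/ {f=1} f && /ID:/ {print $NF; exit}' || true)
          if [ -n "$LOCK_ID" ]; then
            # -force skips the confirmation prompt, which cannot be answered in CI
            terraform force-unlock -force "$LOCK_ID"
          fi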

This ensures that any failed or cancelled plan operation will automatically trigger a cleanup job to release the lock, allowing future plans to proceed without manual intervention. Make sure to configure the Consul action with the correct token and backend settings to match your existing configuration.

If I remember correctly, force-unlock is dangerous in shared CI because it can remove a lock that another run still holds. I debug state locks in genesyscloud-core by ensuring terraform destroy or explicit unlock runs on post_run hooks. Check whether hashicorp/consul-action configures lock=true correctly. A stale lock often means the previous job didn’t exit cleanly, not just a missing unlock step.

Check your workflow termination logic. The suggestion above regarding force-unlock is valid for manual recovery, but it creates race conditions in parallel CI runs. In my .NET pipeline setup, I handle this by ensuring the backend configuration explicitly handles lock timeouts and failure hooks.
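To make the lock-timeout idea concrete, here is a minimal sketch rather than my exact setup. The workflow name, the group key `terraform-state`, and the 5m timeout are placeholder values, and backend credentials are left out. `-lock-timeout` makes Terraform keep retrying the lock for that long instead of failing immediately, and the `concurrency` group keeps runs of this workflow from overlapping.

```yaml
name: Terraform Plan
on:
  pull_request:

# At most one run in this group executes at a time.
# cancel-in-progress: false lets a running plan finish instead of being cancelled mid-run.
concurrency:
  group: terraform-state
  cancel-in-progress: false

jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init -input=false
      # Retry the state lock for up to 5 minutes before giving up
      - run: terraform plan -input=false -lock-timeout=5m
```

The timeout only helps when the lock holder is still running and will finish. A truly stale lock never clears by itself, so you still need a recovery path for that case.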

The issue often stems from the consul-action not properly releasing the lock on job cancellation. GitHub Actions has no finally block, so put the if: always() condition on the cleanup step to guarantee the unlock runs even when the job fails or is cancelled.

Here is the corrected step configuration for your terraform plan job:

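- name: Cleanup Terraform Lock
  if: always()
  run: |
    # terraform state pull returns only the state, which carries no lock metadata,
    # so read the lock ID from the error printed when a plan cannot take the lock.
    LOCK_ID=$(terraform plan -input=false -no-color 2>&1 | awk '/Lock Info:/ {f=1} f && /ID:/ {print $NF; exit}' || true)
    if [ -n "$LOCK_ID" ]; then
      terraform force-unlock -force "$LOCK_ID"
    fi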

This ensures the lock is released regardless of whether the plan succeeds or fails. I tested this in my reporting dashboard deployment. It prevents the OperationTypeApply stale lock error. Verify your Consul backend settings too.