The Ops Community ⚙️

Arseny Zinchenko

Originally published at rtfm.co.ua

Terraform: data types, loops, indexes, and the "resource must be replaced" issue

We have an automation for AWS IAM that creates EKS Access Entries to give AWS IAM Users access to an EKS cluster.

I don’t remember if I wrote it myself or if some LLM generated it (although judging by the code, I did :-) ), but later I discovered an unpleasant feature of how this automation works: when a user is deleted from variables, Terraform starts “re-mapping” other users.

Today we'll look at how I did it, recall Terraform's data types, and see how it should have been done to avoid such problems.

Although the issue came up with the aws_eks_access_entry resource (the examples here use local_file instead), it actually concerns the general approach to using indexes and loops in Terraform.

Content

  • Current implementation
  • Variables and data types
  • variable “eks_clusters”
  • variable “eks_users”
  • local.eks_users_access_entries_backend
  • resource “local_file” “backend”
  • The Issue
  • The Fix

Current implementation

Simplified, it looks like this:

variable "eks_clusters" {
  description = "List of EKS clusters to create records"
  type = set(string)
  default = [
    "cluster-1",
    "cluster-2"
  ]
}

variable "eks_users" {
  description = "IAM Users to be added to EKS with aws_eks_access_entry, one item in the set() per each IAM User"
  type = map(list(string))
  default = {
    backend = [
      "user1",
      "user2",
      "user3",
    ]
  }
}

locals {
  eks_users_access_entries_backend = flatten([
    for cluster in var.eks_clusters : [
      for user_arn in var.eks_users.backend : {
        cluster_name = cluster
        principal_arn = user_arn
      }
    ]
  ])
}

resource "local_file" "backend" {
  for_each = { for cluster, user in local.eks_users_access_entries_backend : cluster => user }

  filename = "${each.value.cluster_name}@${each.value.principal_arn}.txt"
  content  = <<EOF
    cluster_name=${each.value.cluster_name}
    principal_arn=${each.value.principal_arn}
  EOF
}

The original code uses resource "aws_eks_access_entry" instead of resource "local_file".

In this code:

  • variable "eks_clusters": contains the set of our EKS clusters to which users should be attached
  • variable "eks_users": contains a map of groups (backend in this example), each with a list of the users in that group - user1, user2, user3
  • locals.eks_users_access_entries_backend: builds a flat list with one object for each unique combination of EKS cluster + IAM user
  • resource "local_file": for each cluster and each user, creates a file with a name like EKS-cluster@IAM-user.txt

Now let’s take a closer look at the variables and data types — since I did this a long time ago, it’s useful to recall.

Variables and data types

variable "eks_clusters"

The type is simply set(string) with two elements - "cluster-1" and "cluster-2":

variable "eks_clusters" {
  description = "List of EKS clusters to create records"
  type = set(string)
  default = [
    "cluster-1",
    "cluster-2"
  ]
}

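The set[] type has no indexes: it is an unordered collection of unique values, so its elements cannot be addressed by position. When Terraform iterates over a set of strings in a for expression, it visits them in lexical order.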
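To make the set behaviour concrete, here is a minimal sketch. The local names and values are made up for illustration and are not part of the original automation; it relies only on the built-in toset() and sort() functions.

```hcl
locals {
  # Hypothetical example values, not taken from the original code.
  # toset() drops the duplicate "cluster-1", so the set has two elements.
  clusters = toset(["cluster-2", "cluster-1", "cluster-1"])

  # local.clusters[0] would fail here: a set has no indexes.
  # sort() converts the set to a list(string) in lexical order,
  # and a list can be indexed.
  clusters_sorted = sort(local.clusters)
  first_cluster   = local.clusters_sorted[0] # "cluster-1"
}
```

You can inspect local.clusters_sorted and local.first_cluster in the terraform console.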

variable "eks_users"

variable "eks_users" {
  description = "IAM Users to be added to EKS with aws_eks_access_entry, one item in the set() per each IAM User"
  type = map(list(string))
  default = {
    backend = [
      "user1",
      "user2",
      "user3",
    ]
  }
}

Here we have a variable with type map(list(string)).

map{} is a collection of key => value pairs. Here the key is the name of a user group (devops, backend, qa), and the value is a nested list[] of strings, where each string is a username.

A list, unlike a set, is ordered: every element has an index, and elements are always accessed in that order.

That is, we can access them by the indexes 0, 1, 2 (the .0 form used here is the older attribute-style equivalent of the more common [0] syntax):

  • var.eks_users["backend"].0: will be user1
  • var.eks_users["backend"].1: will be user2
  • var.eks_users["backend"].2: will be user3

You can check this using outputs:

output "eks_users_all" {
  value = var.eks_users
}

output "eks_users_backend" {
  value = var.eks_users["backend"]
}

output "eks_users_backend_user_1" {
  value = var.eks_users["backend"].0
}

output "eks_users_backend_user_2" {
  value = var.eks_users["backend"].1
}
...

And as a result of terraform apply we get:

$ terraform apply
...
eks_users_all = tomap({
  "backend" = tolist([
    "user1",
    "user2",
    "user3",
  ])
})
eks_users_backend = tolist([
  "user1",
  "user2",
  "user3",
])
eks_users_backend_user_1 = "user1"
eks_users_backend_user_2 = "user2"

Or just look in the terraform console:

> var.eks_users["backend"].0
"user1"
> var.eks_users["backend"].1
"user2"
> var.eks_users["backend"].2
"user3"

The problem is not the indexes themselves but how they shift when an item is removed from or moved within the list, especially when those indexes are used as keys for for_each. We'll get to that soon.
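As a quick illustration of that shift, here is a small sketch with made-up local names (they do not exist in the original code). It builds an index-keyed collection from a list before and after "user2" is removed.

```hcl
locals {
  # Hypothetical lists for illustration only.
  users_before = ["user1", "user2", "user3"]
  users_after  = ["user1", "user3"]

  # Keys are the list indexes, converted to strings.
  # Result: { "0" = "user1", "1" = "user2", "2" = "user3" }
  by_index_before = { for idx, user in local.users_before : idx => user }

  # After removing "user2", "user3" moves from key "2" to key "1".
  # Result: { "0" = "user1", "1" = "user3" }
  by_index_after = { for idx, user in local.users_after : idx => user }
}
```

Nothing about "user3" itself changed, yet the key it lives under did.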

local.eks_users_access_entries_backend

locals {
  eks_users_access_entries_backend = flatten([
    for cluster in var.eks_clusters : [
      for user_arn in var.eks_users.backend : {
        cluster_name = cluster
        principal_arn = user_arn
      }
    ]
  ])
}

Here we use a nested for: the outer loop iterates over each cluster in var.eks_clusters (a set(string)), and the inner loop over each user in var.eks_users.backend (a list(string)).

Let’s remove flatten() for now:

...
  eks_users_access_entries_backend_unflatten = [
    for cluster in var.eks_clusters : [
      for user_arn in var.eks_users.backend : {
        cluster_name = cluster
        principal_arn = user_arn
      }
    ]
  ]
...

Now eks_users_access_entries_backend_unflatten = [ ... ] gives us a nested list - list(list(object)):

  • the outer name = [ ... ] is the first level of the list, with one element per cluster from var.eks_clusters
  • then for cluster in var.eks_clusters : [ ... ] generates a separate nested list of objects for each cluster
  • and then for user_arn in var.eks_users.backend : { ... } creates objects with the fields cluster_name and principal_arn - one object for each pair of cluster and user

Let’s also look at it with the terraform console again:

> local.eks_users_access_entries_backend_unflatten
tolist([
  [
    {
      "cluster_name" = "cluster-1"
      "principal_arn" = "user1"
    },
    {
      "cluster_name" = "cluster-1"
      "principal_arn" = "user2"
    },
    {
      "cluster_name" = "cluster-1"
      "principal_arn" = "user3"
    },
  ],
  [
    {
      "cluster_name" = "cluster-2"
      "principal_arn" = "user1"
    },
    {
      "cluster_name" = "cluster-2"
      "principal_arn" = "user2"
    },
    {
      "cluster_name" = "cluster-2"
      "principal_arn" = "user3"
    },
  ],
])

And flatten() simply removes this nesting of list(list(object)), and turns the result into a flat list(object), where each object is a unique cluster + user pair:

...
eks_users_access_entries_backend = [
  {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user1"
  },
  {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user2"
  },
  {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user3"
  },
  {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user1"
  },
  {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user2"
  },
  {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user3"
  },
]
...
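As an aside, the same flat list of cluster + user pairs can probably be built without the nested for and flatten(), using the built-in setproduct() function. This is only a sketch of an alternative, not what the original code does, and the local name with the _alt suffix is made up.

```hcl
locals {
  # Hypothetical alternative to the nested for + flatten().
  # setproduct() returns every [cluster, user] combination.
  eks_users_access_entries_backend_alt = [
    for pair in setproduct(var.eks_clusters, var.eks_users.backend) : {
      cluster_name  = pair[0]
      principal_arn = pair[1]
    }
  ]
}
```

Either way the result is a plain list of objects, so the index problem discussed in this post applies to both variants.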

Okay, that's sorted out. Let's move on and see how this list is used in for_each, and what mistakes I made there.

resource "local_file" "backend"

Finally, the main part: using this list, the code creates one resource for each cluster + user pair:

...
resource "local_file" "backend" {
  for_each = { for cluster, user in local.eks_users_access_entries_backend : cluster => user }

  filename = "${each.value.cluster_name}@${each.value.principal_arn}.txt"
  content  = <<EOF
    cluster_name=${each.value.cluster_name}
    principal_arn=${each.value.principal_arn}
  EOF
}

In the original, it looks like this:

resource "aws_eks_access_entry" "backend" {
  for_each = { for cluster, user in local.eks_users_access_entries_backend : cluster => user }

  cluster_name = each.value.cluster_name
  principal_arn = each.value.principal_arn

  kubernetes_groups = [
    "backend-team"
  ]
}

And again, loops and lists :-)

What do we have here? The expression { for cluster, user in local.eks_users_access_entries_backend : cluster => user } builds a map{} where the key (cluster) is the index of each element in the local.eks_users_access_entries_backend list, and the value (user) is the object at that index, containing the cluster_name and principal_arn fields.

That is, cluster takes the values 0, 1, 2 and so on, and user takes the values { cluster_name = "cluster-1", principal_arn = "user1" }, { cluster_name = "cluster-1", principal_arn = "user2" }, { cluster_name = "cluster-1", principal_arn = "user3" } and so on, respectively.

So, my first mistake is the names cluster and user in the for loop itself: it would be more accurate to write for index, entry in ..., or index (or idx) and user, because each value is not just a user but a cluster + user combination.

One more thing: although the loop index is a number, Terraform converts it to a string when it becomes a key ("0", "1", ...), because map and object keys are always strings. And a for expression written in { } always produces an object{} value rather than a map{}, which is why type() reports object here; for_each accepts it as if it were a map:

> type({ for cluster, user in local.eks_users_access_entries_backend : cluster => user })
object({
    0: object({
        cluster_name: string,
        principal_arn: string,
    }),
    1: object({
        cluster_name: string,
        principal_arn: string,
    }),
...

Although this is not essential now.
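For the curious, the relationship between these types can be checked with a small sketch like the one below. The sample data and local names are hypothetical; only the built-in tomap() function is assumed.

```hcl
locals {
  # Hypothetical sample data in the same shape as the flattened list.
  sample_entries = [
    { cluster_name = "cluster-1", principal_arn = "user1" },
    { cluster_name = "cluster-1", principal_arn = "user2" },
  ]

  # A for expression in { } yields an object; the numeric index
  # is converted to the string keys "0" and "1".
  by_index = { for idx, entry in local.sample_entries : idx => entry }

  # All values have the same type, so the object converts to a map.
  by_index_map = tomap(local.by_index)

  # Keys are strings, so the lookup uses "0", not 0.
  first_entry = local.by_index_map["0"]
}
```

Running type(local.by_index) and type(local.by_index_map) in the terraform console should show the difference between the two.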

The Issue

Now let’s move on to the main problem: if we delete a user in the list of users in the variable "eks_users", i.e. instead of:

variable "eks_users" {
  description = "IAM Users to be added to EKS with aws_eks_access_entry, one item in the set() per each IAM User"
  type = map(list(string))
  default = {
    backend = [
      "user1",
      "user2",
      "user3",
    ]
  }
}

we write this:

variable "eks_users" {
  description = "IAM Users to be added to EKS with aws_eks_access_entry, one item in the set() per each IAM User"
  type = map(list(string))
  default = {
    backend = [
      "user1",
      "user3",
    ]
  }
}

First of all, this changes local.eks_users_access_entries_backend, because instead of six objects:

> local.eks_users_access_entries_backend
[
  {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user1"
  },
  {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user2"
  },
  {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user3"
  },
  {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user1"
  },
  {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user2"
  },
  {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user3"
  },
]

We get a new list with four objects:

> local.eks_users_access_entries_backend
[
  {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user1"
  },
  {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user3"
  },
  {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user1"
  },
  {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user3"
  },
]

And since for_each is formed based on the indexes of the local.eks_users_access_entries_backend list:

for_each = {
  for cluster, user in local.eks_users_access_entries_backend :
  cluster => user
}

Then, when the number of elements in local.eks_users_access_entries_backend changes, the map{} (which is really an object) passed to for_each changes too, because its keys are the indexes of that list.

That is, instead of 0, 1, … 5:

> { for cluster, user in local.eks_users_access_entries_backend : cluster => user }
{
  "0" = {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user1"
  }
  "1" = {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user2"
  }
  "2" = {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user3"
  }
  "3" = {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user1"
  }
  "4" = {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user2"
  }
  "5" = {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user3"
  }
}

We now have 0, 1, … 3:

> { for cluster, user in local.eks_users_access_entries_backend : cluster => user }
{
  "0" = {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user1"
  }
  "1" = {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user3"
  }
  "2" = {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user1"
  }
  "3" = {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user3"
  }
}

And if earlier for_each received a map with the key "3" and the value {"cluster_name" = "cluster-2", "principal_arn" = "user1"}, now the key "3" has the value {"cluster_name" = "cluster-2", "principal_arn" = "user3"}.

For Terraform, it looks like the value under the same key has changed, so it must delete the old resource local_file.backend["3"] and create a new one under the same key with the new content. The same happens to keys "1" and "2", while keys "4" and "5" no longer exist, so their resources are destroyed:

Terraform will perform the following actions:

  # local_file.backend["1"] must be replaced
-/+ resource "local_file" "backend" {
      ~ content = <<-EOT # forces replacement
            cluster_name=cluster-1
          - principal_arn=user2
          + principal_arn=user3
        EOT
...

  # local_file.backend["2"] must be replaced
-/+ resource "local_file" "backend" {
      ~ content = <<-EOT # forces replacement
          - cluster_name=cluster-1
          + cluster_name=cluster-2
          - principal_arn=user3
          + principal_arn=user1
        EOT
...

  # local_file.backend["3"] must be replaced
-/+ resource "local_file" "backend" {
      ~ content = <<-EOT # forces replacement
            cluster_name=cluster-2
          - principal_arn=user1
          + principal_arn=user3
        EOT
...

And all this is because for_each is based on an unstable index that can change.

The Fix

So, how can we prevent this from happening?

Simply change the way keys are created for the for_each.

Instead of using the list index as the key and the object as the value, as is done now:

> { for cluster, user in local.eks_users_access_entries_backend : cluster => user }
{
  "0" = {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user1"
  }
  ...
  }
  "5" = {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user3"
  }
}

We can create a unique key for each cluster+user pair, and iterate on that key:

> { for entry in local.eks_users_access_entries_backend : "${entry.cluster_name}-${entry.principal_arn}" => entry }
{
  "cluster-1-user1" = {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user1"
  }
  "cluster-1-user2" = {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user2"
  }
  "cluster-1-user3" = {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user3"
  }
  "cluster-2-user1" = {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user1"
  }
  "cluster-2-user2" = {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user2"
  }
  "cluster-2-user3" = {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user3"
  }
}

And then, when you delete "user2", the remaining keys passed to for_each do not change, and Terraform leaves the other files alone:

> { for entry in local.eks_users_access_entries_backend : "${entry.cluster_name}-${entry.principal_arn}" => entry }
{
  "cluster-1-user1" = {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user1"
  }
  "cluster-1-user3" = {
    "cluster_name" = "cluster-1"
    "principal_arn" = "user3"
  }
  "cluster-2-user1" = {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user1"
  }
  "cluster-2-user3" = {
    "cluster_name" = "cluster-2"
    "principal_arn" = "user3"
  }
}

In the code, it will look like this:

resource "local_file" "backend" {
  for_each = { for entry in local.eks_users_access_entries_backend : "${entry.cluster_name}-${entry.principal_arn}" => entry }

  filename = "${each.value.cluster_name}@${each.value.principal_arn}.txt"
  content  = <<EOF
    cluster_name=${each.value.cluster_name}
    principal_arn=${each.value.principal_arn}
  EOF
}

On the first apply after this change, Terraform will still recreate all the resources because their keys have changed, but after that you can safely add and remove users.
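If that one-time recreation is undesirable, one option that should avoid it is to tell Terraform that each old index key now lives under a new composite key, using moved blocks (available since Terraform 1.1). The mapping below is only a sketch derived from the six-element example in this post, with two clusters and three users; it is not part of the original code, so check terraform plan before applying it to real resources.

```hcl
# Assumed mapping: old index keys "0".."5" to the new cluster-user keys,
# in the order the flattened list had with user1, user2, user3.
moved {
  from = local_file.backend["0"]
  to   = local_file.backend["cluster-1-user1"]
}

moved {
  from = local_file.backend["1"]
  to   = local_file.backend["cluster-1-user2"]
}

moved {
  from = local_file.backend["2"]
  to   = local_file.backend["cluster-1-user3"]
}

moved {
  from = local_file.backend["3"]
  to   = local_file.backend["cluster-2-user1"]
}

moved {
  from = local_file.backend["4"]
  to   = local_file.backend["cluster-2-user2"]
}

moved {
  from = local_file.backend["5"]
  to   = local_file.backend["cluster-2-user3"]
}
```

With these in place, the plan should report the instances as moved rather than destroyed and created. Once the change has been applied everywhere, the moved blocks can be removed.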

Originally published at RTFM: Linux, DevOps, and system administration.

