GitOps and IaC at Scale – AWS, ArgoCD, Terragrunt, and OpenTofu – Part 2 – Creating Spoke environments
Publish Date: Apr 20
Level 400
Hi clouders! In the previous post you explored the considerations for deploying GitOps and IaC at scale. In this post you will learn how to deploy spoke clusters using the GitOps Bridge framework, with specific use cases for AWS enterprise capabilities and security best practices.
Architecture Overview
Spoke clusters and Scenarios
According to best practices, there are many considerations and possible scenarios, for example:
You can have one stack or IaC repository with the definitions for each account, team, or squad; it depends on your internal organization and shared responsibility model. For this scenario, suppose the model is a decentralized DevOps operational model and each squad owns its custom IaC for each project. Keep clear separation and isolation between environments and match your platform strategy: for example, some organizations have a single hub for managing dev, QA, and prod clusters; others have one hub per environment; and others have one hub for pre-production environments and another for production.
Another key point is capacity planning and how applications are distributed per team. Some organizations share the same cluster across an organizational unit and group the applications by namespace in each environment; others prefer one environment and cluster per workload or application. Networking and security are the main pain points and challenges in both scenarios. In this series the main assumption is that each workload has a dedicated cluster and account per environment, while a transversal platform team manages the cluster configuration and control plane. The following table describes the relationship between scenarios and AWS accounts:
Scenario | AWS Accounts | Clusters
--- | --- | ---
Single hub – N spokes | 1 account for the hub – N accounts per environment | 1 hub cluster, N spoke clusters per environment
M hubs – N spokes | M accounts for the hub environments – N accounts per environment | M hub clusters, N environment clusters
Figure 1 depicts the architecture for this scenario.
Figure 1. GitOps bridge Deployment architecture in AWS
FinOps practices are a key point: regardless of how you distribute resources, you must decide on the best strategy to track costs and shared resources.
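For example, a simple way to keep cost tracking consistent across stacks is to apply default tags at the provider level. A minimal sketch, assuming the AWS provider is generated from Terragrunt (the tag keys and values here are illustrative, not part of the demo repository):

locals {
  workspace = get_env("TF_VAR_env", "dev")
}

generate "aws_provider" {
  path      = "aws_provider.tf"
  if_exists = "overwrite"
  contents  = <<EOF
provider "aws" {
  region = "us-east-2"

  # Every resource created by this stack inherits these tags, which can
  # later be activated as cost allocation tags in the Billing console.
  default_tags {
    tags = {
      Project     = "gitops-scale"
      Environment = "${local.workspace}"
      Team        = "platform"
      CostCenter  = "platform-ops"
    }
  }
}
EOF
}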
Hands On
First, modify the hub infrastructure stacks: add a stack to manage the cross-account credentials so that the CI infrastructure agents can retrieve them and register the cluster in the control plane. Also, create the IAM role for Argo CD to enable authentication between the Argo CD hub and the spoke clusters.
Updating Control Plane Infrastructure
So, let's set up the credentials according to best practices and the scenario described in the previous section: the cluster credentials are stored in Parameter Store and shared with the organizational units of each team or business unit.
You must enable RAM as a trusted service in your organization, from the Organizations management account and the RAM console.
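This is a one-time, organization-level step (the CLI equivalent is aws ram enable-sharing-with-aws-organization). As a hedged sketch, it can also be captured in code, assuming a small stack that runs with management-account credentials:

# Hypothetical one-off resource applied from the Organizations management account.
# Equivalent to enabling "Sharing with AWS Organizations" in the RAM console.
resource "aws_ram_sharing_with_organization" "this" {}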
For this task, a local module, terraform-aws-parameter-store, was created. The module creates the parameter and shares it with organization, organizational unit, or account ID principals.
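The module itself is not reproduced in this post; as a rough sketch of what it might contain (resource and variable names here are illustrative), it pairs an advanced-tier SSM parameter with a RAM resource share:

# Hypothetical contents of the local parameter-store module (illustrative)
variable "parameter_name"     { type = string }
variable "parameter_value"    { type = string }
variable "sharing_principals" { type = list(string) }

resource "aws_ssm_parameter" "this" {
  name  = var.parameter_name
  type  = "SecureString"
  tier  = "Advanced" # the advanced tier is required to share parameters through RAM
  value = var.parameter_value
}

resource "aws_ram_resource_share" "this" {
  name                      = "control-plane-credentials"
  allow_external_principals = false
}

resource "aws_ram_resource_association" "this" {
  resource_arn       = aws_ssm_parameter.this.arn
  resource_share_arn = aws_ram_resource_share.this.arn
}

# One association per organization, organizational unit, or account ID principal
resource "aws_ram_principal_association" "this" {
  for_each           = toset(var.sharing_principals)
  principal          = each.value
  resource_share_arn = aws_ram_resource_share.this.arn
}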
Now, the module is called from Terragrunt to create a new stack (a Terragrunt unit).
# parameter_store-terragrunt.hcl
include "root" {
  path   = find_in_parent_folders("root.hcl")
  expose = true
}

dependency "eks" {
  config_path = "${get_parent_terragrunt_dir("root")}/infrastructure/containers/eks_control_plane"
  mock_outputs = {
    cluster_name                       = "dummy-cluster-name"
    cluster_endpoint                   = "dummy_cluster_endpoint"
    cluster_certificate_authority_data = "dummy_cluster_certificate_authority_data"
    cluster_version                    = "1.31"
    cluster_platform_version           = "1.31"
    oidc_provider_arn                  = "dummy_arn"
    cluster_arn                        = "arn:aws:eks:us-east-2:105171185823:cluster/gitops-scale-dev-hub"
  }
  mock_outputs_merge_strategy_with_state = "shallow"
}

locals {
  # Define parameters for each workspace
  env = {
    default = {
      parameter_name     = "/control_plane/${include.root.locals.environment.locals.workspace}/credentials"
      sharing_principals = ["ou-w3ow-k24p2opx"]
      tags = {
        Environment = "control-plane"
        Layer       = "Operations"
      }
    }
    "dev" = {
      create = true
    }
    "prod" = {
      create = true
    }
  }

  # Merge parameters
  environment_vars = contains(keys(local.env), include.root.locals.environment.locals.workspace) ? include.root.locals.environment.locals.workspace : "default"
  workspace        = merge(local.env["default"], local.env[local.environment_vars])
}

terraform {
  source = "../../../modules/terraform-aws-ssm-parameter-sotre"
}

inputs = {
  parameter_name        = "${local.workspace["parameter_name"]}"
  parameter_description = "Control plane credentials"
  parameter_type        = "SecureString"
  parameter_tier        = "Advanced"
  create_kms            = true
  enable_sharing        = true
  sharing_principals    = local.workspace["sharing_principals"]
  parameter_value = jsonencode({
    cluster_name                       = dependency.eks.outputs.cluster_name,
    cluster_endpoint                   = dependency.eks.outputs.cluster_endpoint,
    cluster_certificate_authority_data = dependency.eks.outputs.cluster_certificate_authority_data,
    cluster_version                    = dependency.eks.outputs.cluster_version,
    cluster_platform_version           = dependency.eks.outputs.cluster_platform_version,
    oidc_provider_arn                  = dependency.eks.outputs.oidc_provider_arn,
    hub_account_id                     = split(":", dependency.eks.outputs.cluster_arn)[4]
  })
  tags = local.workspace["tags"]
}
Now another stack is necessary: the IAM role that lets the Argo CD service accounts use IAM authentication against the spoke clusters. The module terraform-aws-iam/iam-eks-role creates the IRSA role, but a custom policy is also needed to allow assuming roles in the spoke accounts. Figure 2 depicts this setup in depth.
You can create a simple stack to manage the role, or a module that wraps both the EKS definition and the IAM role.
Figure 2. GitOps authentication summary.
So, the module lives in modules/terraform-aws-irsa-eks-hub.
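The module is not shown in full here; a minimal sketch of what it could wrap, assuming the public terraform-aws-modules iam-eks-role submodule plus a hypothetical policy that lets the hub controllers assume the spoke roles (the role naming convention is an assumption):

# Hypothetical contents of modules/terraform-aws-irsa-eks-hub (illustrative)
variable "role_name"                { type = string }
variable "cluster_service_accounts" { type = map(list(string)) }

module "argocd_hub_role" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-eks-role"
  version = "~> 5.0"

  role_name                = var.role_name
  cluster_service_accounts = var.cluster_service_accounts
}

# Allow the hub Argo CD controllers to assume the roles exposed by each spoke account
resource "aws_iam_role_policy" "assume_spoke_roles" {
  name = "assume-spoke-roles"
  role = module.argocd_hub_role.iam_role_name
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "sts:AssumeRole"
      Resource = "arn:aws:iam::*:role/argocd-spoke-*" # illustrative naming convention for spoke roles
    }]
  })
}

The Terragrunt unit that calls the module looks like this: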
# eks_role-terragrunt.hcl
include "root" {
  path   = find_in_parent_folders("root.hcl")
  expose = true
}

dependency "eks" {
  config_path = "${get_parent_terragrunt_dir("root")}/infrastructure/containers/eks_control_plane"
  mock_outputs = {
    cluster_name                       = "dummy-cluster-name"
    cluster_endpoint                   = "dummy_cluster_endpoint"
    cluster_certificate_authority_data = "dummy_cluster_certificate_authority_data"
    cluster_version                    = "1.31"
    cluster_platform_version           = "1.31"
    oidc_provider_arn                  = "dummy_arn"
  }
  mock_outputs_merge_strategy_with_state = "shallow"
}

locals {
  # Define parameters for each workspace
  env = {
    default = {
      environment = "control-plane"
      role_name   = "eks-role-hub"
      tags = {
        Environment = "control-plane"
        Layer       = "Networking"
      }
    }
    "dev" = {
      create = true
    }
    "prod" = {
      create = true
    }
  }

  # Merge parameters
  environment_vars = contains(keys(local.env), include.root.locals.environment.locals.workspace) ? include.root.locals.environment.locals.workspace : "default"
  workspace        = merge(local.env["default"], local.env[local.environment_vars])
}

terraform {
  source = "../../../modules/terraform-aws-irsa-eks-hub"
}

inputs = {
  role_name = "${local.workspace["role_name"]}-${local.workspace["environment"]}"
  cluster_service_accounts = {
    "${dependency.eks.outputs.cluster_name}" = [
      "argocd:argocd-application-controller",
      "argocd:argo-cd-argocd-repo-server",
      "argocd:argocd-server",
    ]
  }
  tags = local.workspace["tags"]
}
Finally, the gitops_bridge stack should look like this:
# gitops_bridge-terragrunt.hcl
include "root" {
  path   = find_in_parent_folders("root.hcl")
  expose = true
}

include "k8s_helm_provider" {
  path = find_in_parent_folders("/common/additional_providers/provider_k8s_helm.hcl")
}

dependency "eks" {
  config_path = "${get_parent_terragrunt_dir("root")}/infrastructure/containers/eks_control_plane"
  mock_outputs = {
    cluster_name                       = "dummy-cluster-name"
    cluster_endpoint                   = "dummy_cluster_endpoint"
    cluster_certificate_authority_data = "dummy_cluster_certificate_authority_data"
    cluster_version                    = "1.31"
    cluster_platform_version           = "1.31"
    oidc_provider_arn                  = "dummy_arn"
  }
  mock_outputs_merge_strategy_with_state = "shallow"
}

dependency "eks_role" {
  config_path = "${get_parent_terragrunt_dir("root")}/infrastructure/iam/eks_role"
  mock_outputs = {
    iam_role_arn = "arn::..."
  }
  mock_outputs_merge_strategy_with_state = "shallow"
}

locals {
  # Define parameters for each workspace
  env = {
    default = {
      environment = "control-plane"
      oss_addons = {
        enable_argo_workflows = true
        #enable_foo = true
        # you can add any addon here, make sure to update the gitops repo with the corresponding application set
      }
      addons_metadata = merge({
        addons_repo_url      = "https://github.com/gitops-bridge-dev/gitops-bridge-argocd-control-plane-template"
        addons_repo_basepath = ""
        addons_repo_path     = "bootstrap/control-plane/addons"
        addons_repo_revision = "HEAD"
      })
      argocd_apps = {
        addons = file("./bootstrap/addons.yaml")
        #workloads = file("./bootstrap/workloads.yaml")
      }
      tags = {
        Environment = "control-plane"
        Layer       = "Networking"
      }
    }
    "dev" = {
      create = true
    }
    "prod" = {
      create = true
    }
  }

  # Merge parameters
  environment_vars = contains(keys(local.env), include.root.locals.environment.locals.workspace) ? include.root.locals.environment.locals.workspace : "default"
  workspace        = merge(local.env["default"], local.env[local.environment_vars])
}

terraform {
  source = "tfr:///gitops-bridge-dev/gitops-bridge/helm?version=0.1.0"
}

inputs = {
  cluster_name                       = dependency.eks.outputs.cluster_name
  cluster_endpoint                   = dependency.eks.outputs.cluster_endpoint
  cluster_platform_version           = dependency.eks.outputs.cluster_platform_version
  oidc_provider_arn                  = dependency.eks.outputs.oidc_provider_arn
  cluster_certificate_authority_data = dependency.eks.outputs.cluster_certificate_authority_data

  cluster = {
    cluster_name = dependency.eks.outputs.cluster_name
    environment  = local.workspace["environment"]
    metadata     = local.workspace["addons_metadata"]
    addons = merge(
      local.workspace["oss_addons"],
      { kubernetes_version = dependency.eks.outputs.cluster_version }
    )
  }

  apps = local.workspace["argocd_apps"]

  argocd = {
    namespace = "argocd"
    #set = [
    #  {
    #    name  = "server.service.type"
    #    value = "LoadBalancer"
    #  }
    #]
    values = [yamlencode({
      configs = {
        params = {
          "server.insecure" = true
        }
      }
      server = {
        "serviceAccount" = {
          annotations = {
            "eks.amazonaws.com/role-arn" = dependency.eks_role.outputs.iam_role_arn
          }
        }
        service = {
          type = "NodePort"
        }
        ingress = {
          enabled    = false
          controller = "aws"
          ingressClassName : "alb"
          aws = {
            serviceType : "NodePort"
          }
          annotations = {
            #"alb.ingress.kubernetes.io/backend-protocol" = "HTTPS"
            #"alb.ingress.kubernetes.io/ssl-redirect" = "443"
            #"service.beta.kubernetes.io/aws-load-balancer-type" = "external"
            #"service.beta.kubernetes.io/aws-load-balancer-nlb-target-type" = "ip"
            #"alb.ingress.kubernetes.io/listen-ports" : "[{\"HTTPS\":443}]"
          }
        }
      }
      controller = {
        "serviceAccount" = {
          annotations = {
            "eks.amazonaws.com/role-arn" = dependency.eks_role.outputs.iam_role_arn
          }
        }
      }
      repoServer = {
        "serviceAccount" = {
          annotations = {
            "eks.amazonaws.com/role-arn" = dependency.eks_role.outputs.iam_role_arn
          }
        }
      }
    })]
  }

  tags = local.workspace["tags"]
}
Basically, the main change is the introduction of the argocd map, which sets the values for the Helm chart deployment and enables the use of the IRSA role.
When you use cross-account deployment, the profile that creates the secrets in the hub cluster must have the required access and permissions; for example, in the repository the eks_control_plane stack introduces a new access entry:
# eks_control_plane-terragrunt.hcl
include "root" {
  path   = find_in_parent_folders("root.hcl")
  expose = true
}

dependency "vpc" {
  config_path = "${get_parent_terragrunt_dir("root")}/infrastructure/network/vpc"
  mock_outputs = {
    vpc_id = "vpc-04e3e1e302f8c8f06"
    public_subnets = [
      "subnet-0e4c5aedfc2101502",
      "subnet-0d5061f70b69eda14",
    ]
    private_subnets = [
      "subnet-0e4c5aedfc2101502",
      "subnet-0d5061f70b69eda14",
      "subnet-0d5061f70b69eda15",
    ]
  }
  mock_outputs_merge_strategy_with_state = "shallow"
}

locals {
  # Define parameters for each workspace
  env = {
    default = {
      create          = false
      cluster_name    = "${include.root.locals.common_vars.locals.project}-${include.root.locals.environment.locals.workspace}-hub"
      cluster_version = "1.32"

      # Optional
      cluster_endpoint_public_access = true

      # Optional: Adds the current caller identity as an administrator via cluster access entry
      enable_cluster_creator_admin_permissions = true

      access_entries = {
        ##############################################################################################
        # Admin installation and setup for spoke accounts - Demo purpose - must be the CI agent role
        ##############################################################################################
        admins_sso = {
          kubernetes_groups = []
          principal_arn     = "arn:aws:sts::123456781234:role/aws-reserved/sso.amazonaws.com/us-east-2/AWSReservedSSO_AWSAdministratorAccess_877fe9e4127a368d"
          user_name         = "arn:aws:sts::123456781234:assumed-role/AWSReservedSSO_AWSAdministratorAccess_877fe9e4127a368d/{{SessionName}}"
          policy_associations = {
            single = {
              policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
              access_scope = {
                type = "cluster"
              }
            }
          }
        }
      }

      cluster_compute_config = {
        enabled    = true
        node_pools = ["general-purpose"]
      }

      tags = {
        Environment = "control-plane"
        Layer       = "Networking"
      }
    }
    "dev" = {
      create = true
    }
    "prod" = {
      create = true
    }
  }

  # Merge parameters
  environment_vars = contains(keys(local.env), include.root.locals.environment.locals.workspace) ? include.root.locals.environment.locals.workspace : "default"
  workspace        = merge(local.env["default"], local.env[local.environment_vars])
}

terraform {
  source = "tfr:///terraform-aws-modules/eks/aws?version=20.33.1"
}

inputs = {
  create          = local.workspace["create"]
  cluster_name    = local.workspace["cluster_name"]
  cluster_version = local.workspace["cluster_version"]

  # Optional
  cluster_endpoint_public_access = local.workspace["cluster_endpoint_public_access"]

  # Optional: Adds the current caller identity as an administrator via cluster access entry
  enable_cluster_creator_admin_permissions = local.workspace["enable_cluster_creator_admin_permissions"]

  cluster_compute_config = local.workspace["cluster_compute_config"]

  vpc_id     = dependency.vpc.outputs.vpc_id
  subnet_ids = dependency.vpc.outputs.private_subnets

  access_entries = local.workspace["access_entries"]

  tags = merge(
    {
      Environment = include.root.locals.environment.locals.workspace
      Terraform   = "true"
    },
    local.workspace["tags"]
  )
}
Public demo for GitOps Bridge using Terragrunt, OpenTofu, and EKS
AWS GitOps Blueprint with Terragrunt
This project provides a blueprint for implementing GitOps on AWS using Terragrunt and Argo CD. It offers a structured approach to managing infrastructure as code and deploying applications across multiple environments.
The blueprint is designed to streamline the process of setting up a GitOps workflow on AWS, leveraging Terragrunt for managing Terraform configurations and Argo CD for continuous deployment. It includes configurations for essential AWS services such as EKS (Elastic Kubernetes Service) and VPC (Virtual Private Cloud), as well as GitOps components for managing cluster addons and platform-level resources.
Key features of this blueprint include:
Modular infrastructure setup using Terragrunt
EKS cluster configuration for container orchestration
VPC network setup for secure and isolated environments
GitOps bridge for seamless integration between infrastructure and application deployments
Argo CD ApplicationSets for managing cluster addons and platform resources
Environment-specific configurations for multi-environment deployments
The second step is creating the spoke cluster infrastructure. 🧑🏾💻
To manage a spoke cluster, a separate Terragrunt/OpenTofu project is created for each team from a template. The infrastructure therefore has its own pipeline, each team can use custom CI/CD agents, and there is more flexibility to add features and components to the infrastructure stacks. In some cases you can instead have a single pipeline that manages the infrastructure setup and works with environments or parameters managed by the orchestration CI/CD tool; this approach is used when central governance and control are required, but consider the rate of change, common tasks, environment setup, and the CI/CD worker capacity assigned to the central ops team.
The code is similar to the hub repository; the main difference is in the GitOps bridge stack.
Let's look at it in depth. 🕵️♀️
First, a new provider configuration is necessary: the hub cluster provider, in terragrunt_aws_gitops_spoke_blueprint/common/additional_providers:
locals {
  workspace      = get_env("TF_VAR_env", "dev")
  pipeline       = "false"
  hub_account_id = "105171185823"
}

generate "k8s_helm_provider" {
  path      = "k8s_helm_provider.tf"
  if_exists = "overwrite"
  contents  = <<EOF
################################################################################
# Kubernetes Access for Spoke Cluster
################################################################################
# First, define the parameter store data source
data "aws_ssm_parameter" "hub_cluster_config" {
count = 1
with_decryption = true
name = "arn:aws:ssm:us-east-2:${local.hub_account_id}:parameter/control_plane/${local.workspace}/credentials"
#"/control_plane/${local.workspace}/credentials" # Adjust the parameter path as needed
}
provider "kubernetes" {
host = try(jsondecode(data.aws_ssm_parameter.hub_cluster_config[0].value).cluster_endpoint, var.cluster_endpoint)
cluster_ca_certificate = try(base64decode(jsondecode(data.aws_ssm_parameter.hub_cluster_config[0].value).cluster_certificate_authority_data), var.cluster_certificate_authority_data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = [
"eks",
"get-token",
"--cluster-name",
try(jsondecode(data.aws_ssm_parameter.hub_cluster_config[0].value).cluster_name, var.cluster_name),
"--region",
try(jsondecode(data.aws_ssm_parameter.hub_cluster_config[0].value).cluster_region, data.aws_region.current.name),
"--profile",
var.profile["${local.workspace}"]["profile"]
]
}
alias = "hub"
}
EOF
}
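Keep in mind that the identity running the spoke stacks needs permission to read and decrypt the shared parameter in the hub account. A hedged example of the extra permissions that role might need (the account ID is the one used in this demo; the other ARNs are illustrative):

# Hypothetical policy attached to the spoke CI/CD role so it can read the shared credentials
data "aws_iam_policy_document" "read_hub_credentials" {
  statement {
    sid       = "ReadSharedParameter"
    actions   = ["ssm:GetParameter", "ssm:GetParameters"]
    resources = ["arn:aws:ssm:us-east-2:105171185823:parameter/control_plane/*"]
  }

  statement {
    sid       = "DecryptSharedParameter"
    actions   = ["kms:Decrypt"]
    resources = ["arn:aws:kms:us-east-2:105171185823:key/*"] # the KMS key created by the parameter store stack
  }
}

resource "aws_iam_policy" "read_hub_credentials" {
  name   = "read-hub-control-plane-credentials"
  policy = data.aws_iam_policy_document.read_hub_credentials.json
}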
The gitops_bridge stack is:
# gitops_bridge-terragrunt.hcl
include "root" {
  path   = find_in_parent_folders("root.hcl")
  expose = true
}

include "k8s_helm_provider" {
  path = find_in_parent_folders("/common/additional_providers/provider_k8s_hub.hcl")
}

dependency "eks" {
  config_path = "${get_parent_terragrunt_dir("root")}/infrastructure/containers/eks_spoke"
  mock_outputs = {
    cluster_name                       = "dummy-cluster-name"
    cluster_endpoint                   = "dummy_cluster_endpoint"
    cluster_certificate_authority_data = "dummy_cluster_certificate_authority_data"
    cluster_version                    = "1.31"
    cluster_platform_version           = "1.31"
    oidc_provider_arn                  = "dummy_arn"
  }
  mock_outputs_merge_strategy_with_state = "shallow"
}

locals {
  # Define parameters for each workspace
  env = {
    default = {
      environment = "control-plane"
      oss_addons = {
        enable_argo_workflows = true
        #enable_foo = true
        # you can add any addon here, make sure to update the gitops repo with the corresponding application set
      }
      addons_metadata = merge({
        addons_repo_url      = "https://github.com/gitops-bridge-dev/gitops-bridge-argocd-control-plane-template"
        addons_repo_basepath = ""
        addons_repo_path     = "bootstrap/control-plane/addons"
        addons_repo_revision = "HEAD"
      })
      tags = {
        Environment = "control-plane"
        Layer       = "Networking"
      }
    }
    "dev" = {
      create = true
    }
    "prod" = {
      create = true
    }
  }

  # Merge parameters
  environment_vars = contains(keys(local.env), include.root.locals.environment.locals.workspace) ? include.root.locals.environment.locals.workspace : "default"
  workspace        = merge(local.env["default"], local.env[local.environment_vars])
}

terraform {
  source = "../../../modules/terraform-aws-gitops-bridge-spoke"
}

inputs = {
  cluster_name                       = dependency.eks.outputs.cluster_name
  cluster_endpoint                   = dependency.eks.outputs.cluster_endpoint
  cluster_platform_version           = dependency.eks.outputs.cluster_platform_version
  oidc_provider_arn                  = dependency.eks.outputs.oidc_provider_arn
  cluster_certificate_authority_data = dependency.eks.outputs.cluster_certificate_authority_data

  create_kubernetes_resources = false

  cluster = {
    cluster_name = dependency.eks.outputs.cluster_name
    environment  = local.workspace["environment"]
    metadata     = local.workspace["addons_metadata"]
    addons = merge(
      local.workspace["oss_addons"],
      { kubernetes_version = dependency.eks.outputs.cluster_version }
    )
  }

  hub_account_id = include.root.locals.common_vars.locals.hub_account_id

  tags = local.workspace["tags"]
}
This stack defines the data for the cluster secret in the hub cluster and the metadata for the addons. The local module terraform-aws-gitops-bridge-spoke creates the access entry that enables hub access through the spoke role (as in Figure 2) and reuses gitops-bridge-dev/gitops-bridge/helm with parameters that deploy only the secret, not the Argo CD installation, in the spoke clusters. So the infrastructure composition for the spokes' IaC is:
Figure 4. Infrastructure composition.
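To tie this back to Figure 2, here is a rough sketch of the kind of access entry the terraform-aws-gitops-bridge-spoke module could create on the spoke side, assuming a spoke role that trusts the hub's IRSA role (the names are illustrative and the real module may differ):

variable "cluster_name"   { type = string }
variable "hub_account_id" { type = string }

# Hypothetical spoke-side role trusted by the hub Argo CD IRSA role created earlier
resource "aws_iam_role" "argocd_spoke" {
  name = "argocd-spoke-${var.cluster_name}"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { AWS = "arn:aws:iam::${var.hub_account_id}:role/eks-role-hub-control-plane" } # role from the hub eks_role stack (assumed name)
    }]
  })
}

# EKS access entry that grants the spoke role admin access on the spoke cluster
resource "aws_eks_access_entry" "argocd_hub" {
  cluster_name  = var.cluster_name
  principal_arn = aws_iam_role.argocd_spoke.arn
}

resource "aws_eks_access_policy_association" "argocd_hub_admin" {
  cluster_name  = var.cluster_name
  principal_arn = aws_iam_role.argocd_spoke.arn
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"

  access_scope {
    type = "cluster"
  }
}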
An alternative approach is to use the External Secrets Operator together with Secrets Manager; leave a comment if you want to know how to do it.
Finally, after running the spoke stacks, you can see the clusters and their metadata in the Argo CD hub server:
Figure 5. Cluster and metadata Information.
For example, some addons were deployed to the spoke clusters using ApplicationSets.
Figure 6. ApplicationSets in spoke clusters.
In the next post, you will learn how to customize addons and add advanced setups. 🦸🦸
Spoke cluster infrastructure blueprint for the GitOps Bridge
AWS GitOps Scale Infrastructure with EKS and VPC
A comprehensive Infrastructure as Code (IaC) solution that enables scalable GitOps deployments on AWS using EKS clusters in a hub-spoke architecture with automated infrastructure provisioning and configuration management.
This project provides a complete infrastructure setup using Terragrunt and Terraform to create and manage AWS resources including VPC networking, EKS clusters, and GitOps tooling. It implements a hub-spoke architecture where a central hub cluster manages multiple spoke clusters through GitOps practices, enabling consistent and automated application deployments at scale.
The solution includes automated VPC creation with proper network segmentation, EKS cluster provisioning with secure configurations, and integration with GitOps tools through a bridge component that enables declarative infrastructure and application management.
Repository Structure
├── common/ # Common configuration and variable definitions
│ ├── additional_providers/ # Provider configurations for Kubernetes, Helm, etc.
│ ├── common.hcl # Common Terragrunt configuration
│ ├── common.tfvars #