Introduction
Azure Kubernetes Service (AKS) is Microsoft's managed Kubernetes offering that simplifies the deployment, management, and operations of Kubernetes clusters in Azure. When building enterprise-grade applications, one of the most critical aspects is network security and isolation. This is where Virtual Network (VNET) integration becomes essential.
VNET integration allows your AKS cluster to communicate securely with other Azure resources while providing network-level isolation and control. In this article, we'll explore the various aspects of AKS and VNET integration, including different networking models, configuration options, and best practices.
Understanding how to properly integrate AKS with Azure Virtual Networks is crucial for:
- Security: Implementing network segmentation and access controls
- Compliance: Meeting organizational and regulatory requirements
- Performance: Optimizing network traffic and reducing latency
- Scalability: Planning for future growth and resource expansion
AKS Networking Models
1. Kubenet Networking
Kubenet is the basic networking plugin that provides simple network connectivity for AKS clusters. In this model:
- Node IP addresses: Assigned from the Azure VNET subnet
- Pod IP addresses: Assigned from a logically separate address space (the pod CIDR) that is not part of the VNET
- Network Address Translation (NAT): Used for pod-to-internet communication
- Route tables: Azure manages routing between nodes and pods
Advantages of Kubenet:
- Simple configuration and management
- Lower IP address consumption in the VNET
- Suitable for development and testing environments
Limitations of Kubenet:
- Limited integration with Azure networking features
- Complex routing for advanced scenarios
- Potential performance impact due to NAT
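For reference, a minimal sketch of creating a kubenet-based cluster attached to an existing VNET subnet might look like the following (resource names and CIDR ranges are placeholders, and $SUBNET_ID is assumed to hold the subnet's resource ID, as shown later in this article):
# Create an AKS cluster that uses kubenet; pods get IPs from --pod-cidr,
# which lives outside the VNET address space
az aks create \
--resource-group myResourceGroup \
--name myKubenetCluster \
--network-plugin kubenet \
--vnet-subnet-id $SUBNET_ID \
--pod-cidr 10.244.0.0/16 \
--service-cidr 10.2.0.0/24 \
--dns-service-ip 10.2.0.10
The pod CIDR must not overlap with the VNET, the service CIDR, or any peered or on-premises networks, because Azure routes pod traffic back to the owning node through a managed route table.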
2. Azure Container Networking Interface (CNI)
Azure CNI provides advanced networking capabilities by assigning IP addresses from the VNET to both nodes and pods:
- Direct IP assignment: Pods receive IP addresses directly from the VNET subnet
- Native VNET integration: Pods can communicate directly with VNET resources
- Network policies: Support for Kubernetes network policies
- Service integration: Direct integration with Azure Load Balancer and Application Gateway
Advantages of Azure CNI:
- Better performance with direct networking
- Enhanced security with network policies
- Seamless integration with Azure services
- Support for advanced networking features
Considerations for Azure CNI:
- Higher IP address consumption
- More complex IP address planning required
- Potential for IP address exhaustion in large clusters
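As a rough illustration of the planning involved: with Azure CNI, each node reserves one IP for itself plus one IP for every pod it can host, so a 50-node cluster using the default of 30 pods per node consumes about 50 × (30 + 1) = 1,550 addresses from the subnet, before accounting for extra nodes created temporarily during upgrades or scale-out. Sizing the subnet well above the immediate requirement (a /21 or larger in this example) helps avoid IP exhaustion later.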
VNET Integration Configurations
Basic VNET Integration
For basic VNET integration, you need to:
- Create a VNET with appropriate subnets:
# Create resource group
az group create --name myResourceGroup --location eastus
Explanation: Creates a new Azure resource group named myResourceGroup
in the East US region. Resource groups are logical containers that hold related Azure resources.
Terraform equivalent:
resource "azurerm_resource_group" "main" {
name = "myResourceGroup"
location = "East US"
}
# Create VNET
az network vnet create \
--resource-group myResourceGroup \
--name myVnet \
--address-prefixes 10.0.0.0/8 \
--subnet-name myAKSSubnet \
--subnet-prefix 10.240.0.0/16
Explanation: Creates a virtual network with a large address space (10.0.0.0/8) and an initial subnet (10.240.0.0/16) specifically for AKS nodes. The large address space allows for future expansion and multiple subnets.
Terraform equivalent:
resource "azurerm_virtual_network" "main" {
name = "myVnet"
address_space = ["10.0.0.0/8"]
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
}
resource "azurerm_subnet" "aks" {
name = "myAKSSubnet"
resource_group_name = azurerm_resource_group.main.name
virtual_network_name = azurerm_virtual_network.main.name
address_prefixes = ["10.240.0.0/16"]
}
- Deploy AKS cluster with VNET integration:
# Get subnet ID
SUBNET_ID=$(az network vnet subnet show \
--resource-group myResourceGroup \
--vnet-name myVnet \
--name myAKSSubnet \
--query id -o tsv)
Explanation: Retrieves the unique resource ID of the AKS subnet. This ID is required when creating the AKS cluster to specify which subnet the nodes should be placed in.
Terraform equivalent:
# In Terraform, you can reference the subnet ID directly
# using: azurerm_subnet.aks.id
# Create AKS cluster
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--network-plugin azure \
--vnet-subnet-id $SUBNET_ID \
--docker-bridge-address 172.17.0.1/16 \
--dns-service-ip 10.2.0.10 \
--service-cidr 10.2.0.0/24
Explanation: Creates an AKS cluster with Azure CNI networking. Key parameters:
- --network-plugin azure: Enables Azure CNI for advanced networking features
- --vnet-subnet-id: Specifies the subnet for node placement
- --dns-service-ip: IP address for the cluster DNS service (must fall within the service CIDR)
- --service-cidr: CIDR range for Kubernetes services (must not overlap with the VNET)
Older examples also pass --docker-bridge-address, but that flag is deprecated and no longer needed now that AKS uses containerd.
Terraform equivalent:
resource "azurerm_kubernetes_cluster" "main" {
name = "myAKSCluster"
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
dns_prefix = "myakscluster"
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_D2_v2"
vnet_subnet_id = azurerm_subnet.aks.id
}
network_profile {
network_plugin = "azure"
dns_service_ip = "10.2.0.10"
service_cidr = "10.2.0.0/24"
}
identity {
type = "SystemAssigned"
}
}
Advanced VNET Integration with Multiple Subnets
For production environments, consider using multiple subnets:
# Create additional subnets
az network vnet subnet create \
--resource-group myResourceGroup \
--vnet-name myVnet \
--name myInternalSubnet \
--address-prefixes 10.241.0.0/16
Explanation: Creates an additional subnet for internal services and resources that need network isolation from the AKS nodes but still require VNET connectivity.
Terraform equivalent:
resource "azurerm_subnet" "internal" {
name = "myInternalSubnet"
resource_group_name = azurerm_resource_group.main.name
virtual_network_name = azurerm_virtual_network.main.name
address_prefixes = ["10.241.0.0/16"]
}
az network vnet subnet create \
--resource-group myResourceGroup \
--vnet-name myVnet \
--name myApplicationGatewaySubnet \
--address-prefixes 10.242.0.0/24
Explanation: Creates a dedicated subnet for Azure Application Gateway. Application Gateway requires its own subnet and cannot share it with other resources.
Terraform equivalent:
resource "azurerm_subnet" "appgw" {
name = "myApplicationGatewaySubnet"
resource_group_name = azurerm_resource_group.main.name
virtual_network_name = azurerm_virtual_network.main.name
address_prefixes = ["10.242.0.0/24"]
}
Private AKS Clusters
Private AKS clusters provide enhanced security by making the API server accessible only from within the VNET:
az aks create \
--resource-group myResourceGroup \
--name myPrivateAKSCluster \
--network-plugin azure \
--vnet-subnet-id $SUBNET_ID \
--enable-private-cluster \
--private-dns-zone system
Explanation: Creates a private AKS cluster where the Kubernetes API server is only accessible from within the VNET or connected networks. Key parameters:
- --enable-private-cluster: Makes the API server private
- --private-dns-zone: Controls the private DNS zone for the API server endpoint; system lets AKS create and manage the zone, none skips private DNS, or you can pass the resource ID of your own zone named like privatelink.<region>.azmk8s.io
Terraform equivalent:
resource "azurerm_kubernetes_cluster" "private" {
name = "myPrivateAKSCluster"
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
dns_prefix = "myprivateaks"
private_cluster_enabled = true
private_dns_zone_id = "System"
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_D2_v2"
vnet_subnet_id = azurerm_subnet.aks.id
}
network_profile {
network_plugin = "azure"
}
identity {
type = "SystemAssigned"
}
}
Network Security and Policies
Network Security Groups (NSGs)
Configure NSGs to control traffic flow:
# Create NSG
az network nsg create \
--resource-group myResourceGroup \
--name myAKSSecurityGroup
Explanation: Creates a Network Security Group (NSG) which acts as a basic firewall containing access control rules. NSGs can be associated with subnets or individual network interfaces to filter network traffic.
Terraform equivalent:
resource "azurerm_network_security_group" "aks" {
name = "myAKSSecurityGroup"
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
}
# Add rule for HTTPS traffic
az network nsg rule create \
--resource-group myResourceGroup \
--nsg-name myAKSSecurityGroup \
--name AllowHTTPS \
--direction inbound \
--priority 1000 \
--source-address-prefixes '*' \
--destination-port-ranges 443 \
--access allow \
--protocol tcp
Explanation: Creates an inbound security rule that allows HTTPS traffic (port 443) from any source. The priority determines rule evaluation order (lower numbers = higher priority).
Terraform equivalent:
resource "azurerm_network_security_rule" "allow_https" {
name = "AllowHTTPS"
priority = 1000
direction = "Inbound"
access = "Allow"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "443"
source_address_prefix = "*"
destination_address_prefix = "*"
resource_group_name = azurerm_resource_group.main.name
network_security_group_name = azurerm_network_security_group.aks.name
}
Kubernetes Network Policies
Implement micro-segmentation within your cluster:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress: []
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
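Note that AKS only enforces NetworkPolicy objects when the cluster has a network policy engine enabled; on a cluster without one, these manifests are accepted but have no effect. The engine is typically chosen when the cluster is created. A minimal sketch using the Azure-provided engine (Calico is another supported option):
# Enable network policy enforcement at cluster creation time
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--network-plugin azure \
--network-policy azure \
--vnet-subnet-id $SUBNET_ID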
Integration with Azure Services
Azure Application Gateway Integration
Azure Application Gateway provides Layer 7 load balancing and web application firewall capabilities:
# Use the dedicated Application Gateway subnet created earlier
APPGW_SUBNET_ID=$(az network vnet subnet show \
--resource-group myResourceGroup \
--vnet-name myVnet \
--name myApplicationGatewaySubnet \
--query id -o tsv)
Explanation: Application Gateway requires a dedicated subnet that contains no other resources. This command looks up the resource ID of the myApplicationGatewaySubnet subnet (10.242.0.0/24) created earlier, rather than creating a second subnet with an overlapping address range.
Terraform equivalent: reuse the azurerm_subnet.appgw resource defined earlier and pass azurerm_subnet.appgw.id wherever a subnet ID is required.
# Enable Application Gateway Ingress Controller
az aks enable-addons \
--resource-group myResourceGroup \
--name myAKSCluster \
--addons ingress-appgw \
--appgw-subnet-id $APPGW_SUBNET_ID
Explanation: Enables the Application Gateway Ingress Controller (AGIC) add-on on the AKS cluster. This creates an Application Gateway in the specified subnet and configures it as an ingress controller for the cluster.
Terraform equivalent:
resource "azurerm_kubernetes_cluster" "main" {
# ... other configuration ...
ingress_application_gateway {
subnet_id = azurerm_subnet.appgw.id
}
}
# For an existing cluster, enable the add-on with the az aks enable-addons
# command shown above, or install AGIC via its Helm chart; the add-on is not
# installed through an azurerm cluster extension resource.
Azure Load Balancer Integration
Configure Azure Load Balancer for external access:
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "myInternalSubnet"
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: my-app
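After applying a manifest like this, the service should receive a private IP from the specified subnet rather than a public address. A quick way to confirm, using the service name from the example above:
# The EXTERNAL-IP column should show an address from myInternalSubnet (10.241.x.x)
kubectl get service my-service -o wide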
Azure Private Link Integration
Enable private connectivity to Azure PaaS services:
# Create private endpoint for Azure SQL Database
az network private-endpoint create \
--resource-group myResourceGroup \
--name myPrivateEndpoint \
--vnet-name myVnet \
--subnet myInternalSubnet \
--private-connection-resource-id $SQL_SERVER_RESOURCE_ID \
--group-ids sqlServer \
--connection-name myPrivateConnection
Explanation: Creates a private endpoint that allows secure access to Azure SQL Database over a private IP address within your VNET. This eliminates the need to access the database over the public internet.
Terraform equivalent:
resource "azurerm_private_endpoint" "sql" {
name = "myPrivateEndpoint"
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
subnet_id = azurerm_subnet.internal.id
private_service_connection {
name = "myPrivateConnection"
private_connection_resource_id = var.sql_server_resource_id
subresource_names = ["sqlServer"]
is_manual_connection = false
}
}
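For the private endpoint to be useful from inside the cluster, name resolution of the SQL server's FQDN must return the private IP. This is usually handled with an Azure private DNS zone linked to the VNET; a sketch of the additional steps, using the standard zone name for Azure SQL:
# Create the private DNS zone and link it to the VNET
az network private-dns zone create \
--resource-group myResourceGroup \
--name privatelink.database.windows.net
az network private-dns link vnet create \
--resource-group myResourceGroup \
--zone-name privatelink.database.windows.net \
--name myDnsLink \
--virtual-network myVnet \
--registration-enabled false
A DNS zone group on the private endpoint (az network private-endpoint dns-zone-group create) can then keep the A record for the SQL server up to date automatically.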
Testing VNET Integration
Once your AKS cluster is deployed with VNET integration, it's essential to validate that the networking configuration is working correctly. This section provides practical examples for testing various aspects of VNET integration.
Testing Basic Connectivity
1. Validate Pod-to-Pod Communication
Create test pods to verify internal cluster communication:
# test-pod-1.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod-1
  labels:
    app: test-connectivity
spec:
  containers:
  - name: network-test
    image: busybox
    command: ["sleep", "3600"]
---
# test-pod-2.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod-2
  labels:
    app: test-connectivity
spec:
  containers:
  - name: network-test
    image: busybox
    command: ["sleep", "3600"]
Deploy and test connectivity:
# Deploy test pods
kubectl apply -f test-pod-1.yaml
kubectl apply -f test-pod-2.yaml
Explanation: Deploys the test pod manifests to the Kubernetes cluster. kubectl apply creates or updates resources based on the YAML definitions.
Terraform equivalent:
resource "kubernetes_pod" "test_pod_1" {
metadata {
name = "test-pod-1"
labels = {
app = "test-connectivity"
}
}
spec {
container {
image = "busybox"
name = "network-test"
command = ["sleep", "3600"]
}
}
}
resource "kubernetes_pod" "test_pod_2" {
metadata {
name = "test-pod-2"
labels = {
app = "test-connectivity"
}
}
spec {
container {
image = "busybox"
name = "network-test"
command = ["sleep", "3600"]
}
}
}
# Get pod IPs
kubectl get pods -o wide
Explanation: Lists all pods with additional details including their assigned IP addresses, node placement, and status.
# Test ping between pods
kubectl exec test-pod-1 -- ping -c 3 <test-pod-2-ip>
Explanation: Executes a ping command inside test-pod-1 to test network connectivity to test-pod-2. The -c 3
parameter limits the ping to 3 packets.
# Test DNS resolution
kubectl exec test-pod-1 -- nslookup kubernetes.default.svc.cluster.local
Explanation: Tests in-cluster DNS resolution by looking up the kubernetes API service from test-pod-1. (Bare pod names are not registered in cluster DNS, so resolve a service name rather than test-pod-2.)
2. Test External Connectivity
Verify internet access and external DNS resolution:
# Test external connectivity
kubectl exec test-pod-1 -- ping -c 3 8.8.8.8
Explanation: Tests internet connectivity by pinging Google's public DNS server (8.8.8.8) from within the pod.
# Test DNS resolution
kubectl exec test-pod-1 -- nslookup google.com
Explanation: Tests external DNS resolution by resolving the google.com domain name.
# Test HTTPS connectivity
kubectl exec test-pod-1 -- wget -O- https://www.microsoft.com
Explanation: Tests HTTPS connectivity by downloading the Microsoft homepage. This verifies both DNS resolution and HTTPS traffic flow.
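It can also be useful to confirm which public IP your outbound (SNAT) traffic uses, for example when an external firewall allow-lists the cluster's egress address. One way, using a public "what is my IP" service purely as an illustration:
# Shows the source IP seen by external services (typically the outbound IP of
# the cluster's Standard Load Balancer or NAT gateway)
kubectl exec test-pod-1 -- wget -qO- https://ifconfig.me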
Testing Azure Service Integration
1. Test Azure SQL Database Connectivity
Create a test pod to verify database connectivity:
# sql-test-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: sql-test-pod
spec:
  containers:
  - name: sql-test
    image: mcr.microsoft.com/mssql-tools
    command: ["sleep", "3600"]
    env:
    - name: SQL_SERVER
      value: "your-server.database.windows.net"
    - name: SQL_DATABASE
      value: "your-database"
    - name: SQL_USER
      value: "your-username"
    - name: SQL_PASSWORD
      valueFrom:
        secretKeyRef:
          name: sql-secret
          key: password
Test the connection:
# Deploy the test pod
kubectl apply -f sql-test-pod.yaml
Explanation: Deploys a test pod with SQL Server tools to test database connectivity.
Terraform equivalent:
resource "kubernetes_pod" "sql_test" {
metadata {
name = "sql-test-pod"
}
spec {
container {
image = "mcr.microsoft.com/mssql-tools"
name = "sql-test"
command = ["sleep", "3600"]
env {
name = "SQL_SERVER"
value = "your-server.database.windows.net"
}
env {
name = "SQL_DATABASE"
value = "your-database"
}
env {
name = "SQL_USER"
value = "your-username"
}
env {
name = "SQL_PASSWORD"
value_from {
secret_key_ref {
name = "sql-secret"
key = "password"
}
}
}
}
}
}
# Test SQL connection (run it in a shell inside the pod so the pod's own
# environment variables are expanded there, not on your workstation)
kubectl exec sql-test-pod -- /bin/bash -c \
'/opt/mssql-tools/bin/sqlcmd -S $SQL_SERVER -d $SQL_DATABASE -U $SQL_USER -P $SQL_PASSWORD -Q "SELECT 1"'
Explanation: Executes sqlcmd inside the test pod to connect to Azure SQL Database and run a simple query. This validates that the pod can reach the database through the private endpoint.
2. Test Azure Storage Integration
Verify connectivity to Azure Storage accounts:
# Test storage account connectivity
kubectl run storage-test --image=mcr.microsoft.com/azure-cli \
--command -- sleep 3600
Explanation: Creates a pod with Azure CLI tools to test connectivity to Azure Storage services.
Terraform equivalent:
resource "kubernetes_pod" "storage_test" {
metadata {
name = "storage-test"
}
spec {
container {
image = "mcr.microsoft.com/azure-cli"
name = "azure-cli"
command = ["sleep", "3600"]
}
}
}
# Test blob storage access
kubectl exec storage-test -- az storage blob list \
--account-name yourstorageaccount \
--container-name yourcontainer \
--account-key youraccountkey
Explanation: Lists blobs in an Azure Storage container to verify that pods can access Azure Storage services through the network configuration.
Testing Network Policies
1. Deploy Test Applications
Create applications to test network policy enforcement:
# frontend-app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
        tier: frontend
    spec:
      containers:
      - name: frontend
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: frontend-service
spec:
  selector:
    app: frontend
  ports:
  - port: 80
    targetPort: 80
---
# backend-app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
        tier: backend
    spec:
      containers:
      - name: backend
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: backend-service
spec:
  selector:
    app: backend
  ports:
  - port: 80
    targetPort: 80
2. Test Without Network Policies
# Deploy applications
kubectl apply -f frontend-app.yaml
kubectl apply -f backend-app.yaml
# Test connectivity before applying policies
kubectl exec deployment/frontend -- curl -s backend-service
3. Apply Network Policy and Test
# network-policy-test.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-netpol
spec:
  podSelector:
    matchLabels:
      tier: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          tier: frontend
    ports:
    - protocol: TCP
      port: 80
Test policy enforcement:
# Apply network policy
kubectl apply -f network-policy-test.yaml
# Test allowed connectivity (should work)
kubectl exec deployment/frontend -- curl -s backend-service
# Test denied connectivity (should time out and fail)
kubectl run test-denied --image=busybox --command -- sleep 3600
kubectl exec test-denied -- wget -qO- -T 5 backend-service
Testing Private Cluster Access
1. Verify API Server Access
Test access to the private API server:
# From a VM in the same VNET
kubectl get nodes
# Test API server endpoint resolution
nslookup your-aks-cluster-api-fqdn
# Verify private endpoint connectivity
curl -k https://your-aks-cluster-api-fqdn/api/v1
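If you do not yet have a VM or VPN connection inside the VNET, the az aks command invoke feature can run kubectl commands against a private cluster through the Azure control plane; a quick sketch:
# Run a kubectl command inside the private cluster without direct network access
az aks command invoke \
--resource-group myResourceGroup \
--name myPrivateAKSCluster \
--command "kubectl get nodes"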
2. Test Jump Box Connectivity
Create a jump box to test private cluster access:
# Create a subnet for the jump box
az network vnet subnet create \
--resource-group myResourceGroup \
--vnet-name myVnet \
--name jumpbox-subnet \
--address-prefixes 10.243.0.0/24
# Create jump box VM in the same VNET
az vm create \
--resource-group myResourceGroup \
--name jumpbox \
--image Ubuntu2204 \
--vnet-name myVnet \
--subnet jumpbox-subnet \
--admin-username azureuser \
--generate-ssh-keys
Explanation: Creates a Linux virtual machine in the same VNET as the private AKS cluster. This VM acts as a jump box to access the private cluster since the API server is not accessible from the internet.
Terraform equivalent:
resource "azurerm_subnet" "jumpbox" {
name = "jumpbox-subnet"
resource_group_name = azurerm_resource_group.main.name
virtual_network_name = azurerm_virtual_network.main.name
address_prefixes = ["10.243.0.0/24"]
}
resource "azurerm_public_ip" "jumpbox" {
name = "jumpbox-pip"
resource_group_name = azurerm_resource_group.main.name
location = azurerm_resource_group.main.location
allocation_method = "Static"
}
resource "azurerm_network_interface" "jumpbox" {
name = "jumpbox-nic"
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
ip_configuration {
name = "testconfiguration1"
subnet_id = azurerm_subnet.jumpbox.id
private_ip_address_allocation = "Dynamic"
public_ip_address_id = azurerm_public_ip.jumpbox.id
}
}
resource "azurerm_linux_virtual_machine" "jumpbox" {
name = "jumpbox"
resource_group_name = azurerm_resource_group.main.name
location = azurerm_resource_group.main.location
size = "Standard_B1s"
admin_username = "azureuser"
disable_password_authentication = true
network_interface_ids = [
azurerm_network_interface.jumpbox.id,
]
admin_ssh_key {
username = "azureuser"
public_key = file("~/.ssh/id_rsa.pub")
}
os_disk {
caching = "ReadWrite"
storage_account_type = "Standard_LRS"
}
source_image_reference {
publisher = "Canonical"
offer = "0001-com-ubuntu-server-focal"
sku = "20_04-lts"
version = "latest"
}
}
# Install kubectl on jump box
ssh azureuser@jumpbox-ip
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
Explanation: SSH into the jump box and install kubectl, the Kubernetes command-line tool needed to interact with the AKS cluster.
# Configure kubectl with AKS credentials
az aks get-credentials --resource-group myResourceGroup --name myPrivateAKSCluster
Explanation: Downloads the cluster configuration and credentials, configuring kubectl to connect to the private AKS cluster. This step requires the Azure CLI to be installed, and logged in, on the jump box.
# Test cluster access
kubectl get nodes
kubectl get pods --all-namespaces
Explanation: Verifies that kubectl can successfully connect to the private cluster by listing nodes and pods.
Testing Load Balancer and Ingress
1. Test LoadBalancer Service
Deploy a test application with LoadBalancer service:
# loadbalancer-test.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lb-test-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: lb-test
  template:
    metadata:
      labels:
        app: lb-test
    spec:
      containers:
      - name: web
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: lb-test-service
spec:
  type: LoadBalancer
  selector:
    app: lb-test
  ports:
  - port: 80
    targetPort: 80
Test the LoadBalancer:
# Deploy the test application
kubectl apply -f loadbalancer-test.yaml
# Get external IP
kubectl get service lb-test-service
# Test external access
curl http://<external-ip>
# Test load balancing
for i in {1..10}; do curl http://<external-ip>; done
2. Test Application Gateway Ingress
Create an ingress resource for testing:
# ingress-test.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-ingress
  annotations:
    kubernetes.io/ingress.class: azure/application-gateway
spec:
  rules:
  - host: test.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: lb-test-service
            port:
              number: 80
Test ingress functionality:
# Apply ingress resource
kubectl apply -f ingress-test.yaml
# Get ingress status
kubectl get ingress test-ingress
# Test with host header
curl -H "Host: test.example.com" http://<ingress-ip>
Network Diagnostics and Troubleshooting
1. Network Connectivity Troubleshooting
Use diagnostic pods for network troubleshooting:
# Deploy network troubleshooting pod
kubectl run netshoot --image=nicolaka/netshoot --command -- sleep 3600
# Network diagnostics commands
kubectl exec netshoot -- ping <target-ip>
kubectl exec netshoot -- traceroute <target-ip>
kubectl exec netshoot -- nslookup <hostname>
kubectl exec netshoot -- netstat -tulpn
kubectl exec netshoot -- ss -tulpn
2. DNS Resolution Testing
# Test cluster DNS
kubectl exec netshoot -- nslookup kubernetes.default.svc.cluster.local
# Test external DNS
kubectl exec netshoot -- nslookup google.com
# Check DNS configuration
kubectl exec netshoot -- cat /etc/resolv.conf
3. Performance Testing
Test network performance between pods:
# Deploy iperf3 server
kubectl run iperf3-server --image=networkstatic/iperf3 --command -- iperf3 -s
# Deploy iperf3 client and test
kubectl run iperf3-client --image=networkstatic/iperf3 --command -- sleep 3600
# Get server IP
SERVER_IP=$(kubectl get pod iperf3-server -o jsonpath='{.status.podIP}')
# Run performance test
kubectl exec iperf3-client -- iperf3 -c $SERVER_IP -t 30
Validation Checklist
Use this checklist to validate your VNET integration:
- [ ] Pods can communicate with each other within the cluster
- [ ] Pods can access external internet resources
- [ ] DNS resolution works correctly for internal and external services
- [ ] Azure services are accessible from pods (if configured)
- [ ] Network policies are enforced correctly
- [ ] LoadBalancer services are accessible externally
- [ ] Ingress controllers route traffic properly
- [ ] Private cluster API server is accessible only from authorized networks
- [ ] NSG rules are working as expected
- [ ] No unnecessary network ports are exposed
Best Practices and Recommendations
IP Address Planning
- Reserve sufficient IP addresses: Plan for cluster scaling and pod density
- Use non-overlapping CIDR blocks: Avoid conflicts with on-premises networks
- Consider future growth: Allocate larger subnets than immediately needed
Example IP planning:
VNET: 10.0.0.0/8
├── AKS Subnet: 10.240.0.0/16 (65,536 IPs)
├── Internal Services: 10.241.0.0/16 (65,536 IPs)
├── Application Gateway: 10.242.0.0/24 (256 IPs)
└── Management: 10.243.0.0/24 (256 IPs)
Security Considerations
- Implement defense in depth:
  - Use NSGs for subnet-level filtering
  - Apply Kubernetes Network Policies for pod-level segmentation
  - Enable Azure Policy for governance
- Use private clusters for production:
az aks create \
--enable-private-cluster \
--enable-managed-identity \
--enable-aad \
--enable-azure-rbac
Explanation: Creates a production-ready AKS cluster with enhanced security features:
- --enable-private-cluster: API server only accessible from private networks
- --enable-managed-identity: Uses an Azure managed identity for secure authentication to other Azure resources
- --enable-aad and --enable-azure-rbac: Integrate the cluster with Microsoft Entra ID and use Azure RBAC for fine-grained Kubernetes authorization (Kubernetes RBAC itself is enabled by default)
Terraform equivalent:
resource "azurerm_kubernetes_cluster" "production" {
name = "production-aks"
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
dns_prefix = "production-aks"
private_cluster_enabled = true
default_node_pool {
name = "default"
node_count = 3
vm_size = "Standard_D2_v2"
}
identity {
type = "SystemAssigned"
}
role_based_access_control_enabled = true
}
- Secure container registry access:
# Create a container registry with public network access disabled
az acr create \
--resource-group myResourceGroup \
--name myRegistry \
--sku Premium \
--public-network-enabled false
Explanation: Creates an Azure Container Registry with private endpoint capability:
- --sku Premium: Required for private endpoint support and advanced features
- --public-network-enabled false: Disables public access, forcing all access through private endpoints
Terraform equivalent:
resource "azurerm_container_registry" "main" {
name = "myRegistry"
resource_group_name = azurerm_resource_group.main.name
location = azurerm_resource_group.main.location
sku = "Premium"
public_network_access_enabled = false
}
resource "azurerm_private_endpoint" "acr" {
name = "acr-private-endpoint"
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
subnet_id = azurerm_subnet.internal.id
private_service_connection {
name = "acr-privateserviceconnection"
private_connection_resource_id = azurerm_container_registry.main.id
subresource_names = ["registry"]
is_manual_connection = false
}
}
Monitoring and Troubleshooting
- Enable Container Insights:
az aks enable-addons \
--resource-group myResourceGroup \
--name myAKSCluster \
--addons monitoring
Explanation: Enables Azure Monitor Container Insights for the AKS cluster, providing comprehensive monitoring of cluster performance, resource utilization, and application logs.
Terraform equivalent:
resource "azurerm_log_analytics_workspace" "main" {
name = "aks-logs"
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
sku = "PerGB2018"
}
resource "azurerm_kubernetes_cluster" "main" {
# ... other configuration ...
oms_agent {
log_analytics_workspace_id = azurerm_log_analytics_workspace.main.id
}
}
- Use Network Watcher for troubleshooting:
# Enable Network Watcher
az network watcher configure \
--resource-group myResourceGroup \
--locations eastus \
--enabled
Explanation: Enables Azure Network Watcher in the specified region. Network Watcher provides network monitoring, diagnostic, and analytics tools to help troubleshoot network connectivity issues.
Terraform equivalent:
resource "azurerm_network_watcher" "main" {
name = "NetworkWatcher_eastus"
location = "East US"
resource_group_name = azurerm_resource_group.main.name
}
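Once Network Watcher is enabled, its diagnostics can confirm whether NSG rules permit a given flow. As a sketch, IP flow verify can be run against the jump box VM created earlier (the IP addresses and port are illustrative; for AKS nodes in a scale set, the portal's Connection Troubleshoot experience is often more convenient):
# Check whether NSG rules allow the jump box to reach a node IP on port 443
az network watcher test-ip-flow \
--resource-group myResourceGroup \
--vm jumpbox \
--direction Outbound \
--protocol TCP \
--local 10.243.0.4:60000 \
--remote 10.240.0.4:443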
- Monitor network performance:
  - Track pod-to-pod communication latency
  - Monitor ingress/egress bandwidth
  - Set up alerts for network anomalies
Performance Optimization
- Choose appropriate VM sizes: Select VM SKUs with adequate network performance
- Enable accelerated networking: Improve network performance for supported VM sizes
- Optimize pod placement: Use node affinity and anti-affinity rules
- Implement horizontal pod autoscaling: Scale based on network metrics
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3
  maxReplicas: 100
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
Conclusion
Azure AKS and VNET integration is a fundamental component of enterprise Kubernetes deployments on Azure. By properly implementing VNET integration, organizations can achieve:
- Enhanced Security: Network-level isolation and access controls provide multiple layers of protection
- Improved Performance: Direct networking reduces latency and improves application responsiveness
- Seamless Integration: Native connectivity with Azure services simplifies architecture design
- Scalability: Proper IP planning and network design support future growth requirements
Key takeaways for successful AKS VNET integration:
- Plan your network architecture carefully: Consider current requirements and future growth
- Choose the right networking model: Azure CNI for advanced features, Kubenet for simplicity
- Implement security best practices: Use NSGs, network policies, and private clusters
- Monitor and optimize: Continuously monitor network performance and security
The choice between Kubenet and Azure CNI depends on your specific requirements, but for production workloads requiring advanced networking features and better integration with Azure services, Azure CNI is typically the preferred option.
As Kubernetes and Azure continue to evolve, staying informed about new networking features and best practices will help you maintain a secure, performant, and scalable container infrastructure.