Deploying AKS Kubenet With Calico | by Burak Tahtacıoğlu | May, 2022


A CNCF roadmap

Photo by Mark König on Unsplash

There are multiple ways to deploy an AKS cluster. Our focus is on the Azure-CLI, but that doesn’t mean I will leave you in the dark about how the other methods work.

The Azure portal is the official GUI that you can use to create, manage, and deploy your cloud resources. The good thing about the Azure portal is that it is always up to date, and it works on any device with a capable modern web browser. To create a resource here, first click on the Create button. Then, in the window that opens, narrow down to the Kubernetes Services option, either by searching in the search bar or by going through the thumbnails, and click on its Create link. I won’t comment on every single field, since that would get tedious.

So let me go over the important things that require your attention as we navigate the creation steps together. Let’s start with Region. A region is mandatory when creating your resources; in a production environment, the region should be close to your customers so they get the best experience. If you are using a free account, you might not be able to create your resources in some regions if those regions are under heavy load at the time. In that case, please change the location to a less crowded region. If your Kubernetes workloads require a specific version of Kubernetes, you can adjust it here as well.

The default node count value is 3, which can cause a problem if you are on a free account and try to upgrade your cluster. If you don’t need that many nodes, make sure you always choose one. I didn’t include node size here because it depends on your scenario, but I always pick the cheapest option that can support my labs. In Azure, your nodes are members of a node pool. If you click on your current node pool, a window will pop up that allows you to modify it. In the OS type section, you will notice that Windows is greyed out; this is because the first node pool must run Linux machines to host the Kubernetes system workloads. If you scroll further down to the optional settings, you should be able to spot the Max pods per node dropdown. This controls how many Pods are allowed to run on a single node. You can’t choose a number less than 10, since that is a hardcoded limit in Azure, and depending on your CNI of choice, your maximum is either 250 Pods if you are using Azure CNI or 110 if you are using kubenet.
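If you prefer to script this instead of clicking through the portal, the same setting is exposed by the --max-pods flag of the az aks create command. The resource group and cluster names below are placeholders for illustration only; adjust the value to your CNI choice and quota.

az aks create --resource-group <YOUR-RESOURCE-GROUP> --name <YOUR-CLUSTER-NAME> --node-count 1 --max-pods 110 --network-plugin kubenet --generate-ssh-keys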

This value is enforced by the AKS service, and if you need to increase it after deployment, you have to create another node pool with the new setting and migrate all your workloads to it. If you would like to add a Windows node pool to your cluster as a secondary pool, you need to make sure that the network plugin under the Networking tab is set to Azure CNI, or the portal will greet you with an error. Let’s skip the Access tab and go to Networking. Here you can choose your networking CNI; your options are Azure CNI or kubenet.
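For reference, adding such a secondary node pool from the command line looks roughly like the following sketch; the pool name and the new Max pods value are arbitrary placeholders, and the cluster must already exist.

az aks nodepool add --resource-group <YOUR-RESOURCE-GROUP> --cluster-name <YOUR-CLUSTER-NAME> --name newpool --node-count 1 --max-pods 110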

Next, choose your policy engine; your options are Calico, Azure, and None. Keep in mind that when you choose Calico to be deployed by AKS, it will install a version of Calico from the Microsoft image repository. The None option will result in a Kubernetes cluster without a policy enforcer; in this mode, you can install the latest version of Calico and configure it manually. I should explain why I’m skipping some of these tabs: modifying those options requires configuring the latest Azure CNI vendor-specific features, which is out of the scope of this course. Now let’s get back to our deployment and click on the “Review + Create” tab. Here, you can push the Create button to deploy your cluster, or click on the “Download a template for automation” link, which brings us nicely to the next topic.

Azure Resource Manager (ARM) is a declarative way to deploy Azure resources using code. An ARM template is a regular JSON file that describes, in detail, which resources must be deployed. After exporting the ARM template, you can use it with the Azure-CLI or the Azure portal to create as many resources as you desire in a single move. Azure-CLI is the official command-line utility for interacting with the Azure cloud platform. You can use the Azure-CLI to create every resource that the Azure platform offers from the convenience of your command line, but I’m getting ahead of myself now.
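For example, assuming you saved the exported files as template.json and parameters.json (hypothetical file names), a command-line deployment of that template would look roughly like this:

az deployment group create --resource-group <YOUR-RESOURCE-GROUP> --template-file template.json --parameters @parameters.json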

I would like to mention that I’ve chosen CanadaCentral as my region. Feel free to change this to any other region that is closer to you.

Keep in mind that the Azure free account might not allow you to create resources in certain regions if there is a huge demand in that location.

Use the following command to get a comprehensive list of available regions.

az account list-locations -o table

AKS usually supports the mainstream versions of Kubernetes in all regions and provides support for three GA minor versions of Kubernetes. You can get the list of currently supported Kubernetes versions by issuing the following command.

az aks get-versions --location canadacentral -o table

At the time of writing, the default version of AKS Kubernetes is v1.21.9.

To run a Kubernetes cluster in Azure, you must create multiple resources that share the same lifespan and assign them to a resource group.

A resource group is a way to group related resources in Azure for easier management and accessibility. You are permitted to create multiple resource groups with unique names in one location.

Create a resource group:

az group create --name ccol2azure-lab --location canadacentral

Deploying an AKS cluster with Azure-CLI is a simple command execution.

Network plugin refers to the CNI plugin that will establish networking.

Network policy refers to the CNI plugin that will enforce the network security policies.

Keep in mind that you cannot change these options after deployment. In such a case, your only option is to destroy and redeploy your cluster from scratch, because the Azure control plane periodically checks and re-configures these values.

Use the following command to deploy an AKS cluster.

az aks create --resource-group ccol2azure-lab --name CalicoAKSCluster --node-count 1 --node-vm-size Standard_B2s --network-plugin kubenet --network-policy calico --pod-cidr 192.168.0.0/16 --generate-ssh-keys
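Because these options are locked in at creation time, it is worth confirming them once the deployment finishes. A query such as the following, reusing the names from the command above, should report kubenet and calico.

az aks show --resource-group ccol2azure-lab --name CalicoAKSCluster --query "{plugin:networkProfile.networkPlugin,policy:networkProfile.networkPolicy}" -o table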

Azure-CLI can be used to export the cluster config file for an AKS deployment. This file can then be used with kubectl to communicate with the cluster API server and to manage and maintain the cluster.

Use the following command to export the cluster config file.

az aks get-credentials --resource-group ccol2azure-lab --name CalicoAKSCluster --admin

After you have exported the config file, use kubectl to verify that you can access your cluster’s control plane.

For example, list the nodes.

kubectl get nodes

You should see a result similar to the following.

NAME                                STATUS   ROLES   AGE   VERSION
aks-nodepool1-15457940-vmss000000   Ready    agent   86s   v1.21.9

In an AKS cluster with the Kubenet CNI plugin, Nodes get their IP address from the underlying Azure Network.

Use the following command to get the available IP range inside your VNET.

az network vnet list -o table | egrep ccol2azure-lab

Use the following command to check the IP address of the participating node in this AKS cluster.

kubectl get node -o wide

An AKS cluster deployed with kubenet CNI uses the host-local plugin to allocate the IP addresses to your workload Pods.

The host-local IPAM plugin is a commonly used IP address management plugin, which allocates a fixed-size IP address range (CIDR) to each node, and then assigns IP addresses to each Pod within that range. Use the following command to inspect the CNI configuration.

kubectl get configmap -n calico-system cni-config -o yaml

Look for the following output that verifies host-local is in charge of IP allocation.

"ipam": { "type": "host-local", "subnet": "usePodCidr"},

The default address range size is /24 (256 IP addresses), though two of those IP addresses are reserved for special purposes and not assigned to Pods. Use the following command to see the Pod CIDR allocated to each node.

kubectl cluster-info dump | egrep -i 'PodCIDR"'

Kubenet uses User Defined Routing to establish connections between cluster resources that are hosted on different nodes.

First, let’s verify that our cluster is not using any encapsulation.

kubectl get installations default -o jsonpath="{.spec.calicoNetwork}" | jq

If you take a closer look at the output, you will see that both backend (bgp) and ipPool encapsulations are disabled.

Next, look at the list of routes in the VNET routing table.

az network route-table list -o table

Use the Name and ResourceGroup from the previous output and run the following command.

az network route-table show -g <NAME-OF-YOUR-ROUTE-TABLE-RESOURCE-GROUP> --name <NAME-OF-YOUR-AKS-ROUTETABLE> --query "{addressPrefix:routes[].addressPrefix,nexhop:routes[].nextHopIpAddress}"

You should see a result similar to:

{
  "addressPrefix": [
    "192.168.0.0/24"
  ],
  "nexhop": [
    "10.240.0.4"
  ]
}

As the output suggests, our Azure network knows that the subnet range 192.168.0.0/24 is reachable via 10.240.0.4.

Routing is a Layer 3 concept and requires an additional hop to move the packet to its destination. This might add a minor latency to Pod communication.

Only the routes for the first IP pool are generated automatically when you create a cluster that uses kubenet; additional IP pools require User Defined Route entries, which adds a layer of complexity for the administrator.
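As an illustration of that extra work, a manually added route for a node in a second IP pool would look something like the following sketch; the route name, address prefix, and next hop IP are placeholders, not values from this lab.

az network route-table route create --resource-group <NAME-OF-YOUR-ROUTE-TABLE-RESOURCE-GROUP> --route-table-name <NAME-OF-YOUR-AKS-ROUTETABLE> --name second-pool-node0 --address-prefix 10.245.0.0/24 --next-hop-type VirtualAppliance --next-hop-ip-address 10.240.0.5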

Windows node pools and Virtual Nodes are not supported when using kubenet.

YAOBank is a demo application with three different tiers named Customer, Summary, and Database. The Customer pod connects to the Summary pod, which connects to the Database pod. It has a deployment manifest in YAML and can be helpful in testing and learning about Kubernetes networking.

Use the following command to deploy the YaoBank manifest.

kubectl apply -f https://raw.githubusercontent.com/tigera/ccol2azure/main/week2/yaobank.yaml

We can verify our deployment by looking at the deployment status and filtering the output with the yao keyword.

kubectl get deployments -A | egrep yao

Now that we have successfully deployed our application, let’s create a load balancer service and connect to the customer Pod using a browser.

To do this, we can use the following manifest.

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: yaobank-customer
  namespace: yaobank-customer
spec:
  selector:
    app: customer
  ports:
  - port: 80
    targetPort: 80
  type: LoadBalancer
EOF

Use the following command to verify the service deployment.

kubectl get svc -n yaobank-customer yaobank-customer

Note: It might take 1–2 minutes for the load balancer service to acquire an external IP address.
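If you would rather watch for the address than rerun the command, kubectl’s watch flag works here:

kubectl get svc -n yaobank-customer yaobank-customer -w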

Use the load balancer’s EXTERNAL-IP address in a browser, and you should be able to see the YaoBank web UI.

You can use the following command to check the connectivity log.

kubectl logs -n yaobank-customer deployments/customer

Note: The IP address reported in the log belongs to the kube-proxy Pod(s).

First, let’s verify if the customer pod can access the database directly.

Use the following command to access the database from one of the customer pods.

kubectl exec -it -n yaobank-customer deployments/customer -- curl --connect-timeout 5 http://database.yaobank-database:2379/v2/keys?recursive=true | python -m json.tool

Since there are no policies in place, the database answers to anyone who attempts to connect to it.

As its name suggests, a default deny policy establishes isolation by denying all types of communication in an environment. To create a Kubernetes NetworkPolicy that achieves such a goal, we need to use selectors in a way that affects traffic in both directions.

Use the following policy to establish isolation in the yaobank-database namespace.

kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: yaobank-database
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
EOF

Let’s do the connectivity check again.

kubectl exec -it -n yaobank-customer deployments/customer -- curl --connect-timeout 5 http://database.yaobank-database:2379/v2/keys?recursive=true

Perfect! We have blocked direct access to our database Pod. However, as a side effect of putting the database in isolation, we have rendered our services unusable to our customers.

Note: If you refresh your browser, you will no longer be able to visit the YAOBank site.

Next, we will explore how to fix this situation and allow other services to talk to the database.

This time instead of denying, let’s use a policy that will permit traffic. Our goal is to describe what type of traffic is allowed in and out of the database namespace.

kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: p-summary2database
  namespace: yaobank-database
spec:
  podSelector:
    matchLabels:
      app: database
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          ns: summary
    - podSelector:
        matchLabels:
          app: summary
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          ns: summary
    - podSelector:
        matchLabels:
          app: summary
EOF

Use a browser to connect to the load balancer service and verify that our service is working correctly.

Kubernetes policies are great for targeting namespaced traffic. However, a cluster has many parts that these policies cannot control.

For example, if you wish to establish true isolation, you must apply a default deny to every namespace within your cluster.
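To get a feel for the amount of work involved, the following shell loop is a rough sketch that stamps the same default-deny policy into every namespace. It is for illustration only; it skips nothing, so do not run it against a real cluster without excluding the system namespaces first.

for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
kubectl apply -n "$ns" -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
EOF
done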

It is also worth noting that in the absence of policies, the default behavior of Calico is to permit all traffic. However, once a policy is present, the behavior changes to block all traffic except what the policies explicitly allow.

Before starting the next lab, please delete the default-deny policy.

kubectl delete networkpolicy -n yaobank-database default-deny

We’ve done a lot in the previous section, so I’d like to go over what just happened and explain my thought process behind the Kubernetes network policy resources we implemented. We implemented a Kubernetes NetworkPolicy resource; this part was obvious because the header of our policy indicated it.

Then we needed to make sure the policy only affects the database namespace, which is why we set the namespace in the metadata section. A quick note here: a Kubernetes NetworkPolicy resource requires a namespace, and when you don’t specify one, it is automatically assigned to your default namespace. After that, we added a pod selector with a loose condition, meaning it matches all Pod traffic happening inside the database namespace. Then, by declaring the direction of traffic, we concluded our policy, created isolation in the database namespace, and successfully blocked everyone from accessing our services in this namespace.

After that, we created a permit policy to allow our customers in. This time our podSelector criteria were more restrictive: by attaching match labels, we declared that this policy should affect any Pod that has the “app: database” label. For inbound traffic, or ingress, we expressed that only traffic coming from a namespace with the “ns: summary” label or from a Pod with the “app: summary” label should be allowed. In the same policy, for the outbound traffic, or egress, we expressed that the database namespace is only permitted to talk to Pods that have the “app: summary” label or are located in a namespace with the “ns: summary” label. A quick note here: because a Kubernetes NetworkPolicy resource is bound to a namespace, when you want to establish isolation in multiple namespaces you need to write a deny policy for each one individually.

Before we begin, please make sure that your cluster has only these two Kubernetes network policies.

Use the following command to list the policies.

kubectl get networkpolicy -A

If you just created a new cluster, use the following command to create the p-summary2database network policy.

kubectl apply -f https://raw.githubusercontent.com/tigera/ccol2azure/main/week2/01_allow_summary.yaml

The Calico API server provides a REST interface for interacting with the projectcalico.org/v3 API group. It allows kubectl to manage Calico API group resources without installing calicoctl.

Use the following command to verify that calico-apiserver is not yet installed in the cluster.

kubectl get deployments -n calico-apiserver

Use the following command to install the calico-apiserver.

kubectl apply -f - <<EOF
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}
EOF

Use the following command to get information about the API server rollout status.

kubectl get tigerastatus
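If you want the shell to block until the API server reports Available (handy in scripts), kubectl wait can watch the same status; this assumes the tigerastatus resource exposes an Available condition, as it does in current Calico releases.

kubectl wait --for=condition=Available tigerastatus/apiserver --timeout=120s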

The Calico GlobalNetworkPolicy resource is a security policy that can affect your cluster as a whole. This type of resource can affect both namespaced (traffic inside the cluster) and non-namespaced (external and NIC) traffic.

Note: In this policy, traffic from non-namespaced resources and from the kube-system, calico-system, and calico-apiserver namespaces is deliberately excluded to simplify the flow of content.

Use the following command to establish isolation.

kubectl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: default-app-policy
spec:
  namespaceSelector: has(projectcalico.org/name) && projectcalico.org/name not in {"kube-system", "calico-system", "calico-apiserver"}
  types:
  - Ingress
  - Egress
  egress:
  - action: Allow
    protocol: UDP
    destination:
      selector: k8s-app == "kube-dns"
      ports:
      - 53
EOF

Let’s do the connectivity test again, but this time we are trying to access an external URL from a namespace that is not part of the policy’s exclusion list.

kubectl exec -it -n yaobank-customer deployments/customer --  curl --connect-timeout 10 -LIs https://projectcalico.docs.tigera.io/ | egrep HTTP

A global policy applies to the whole of your cluster, which means the isolation has been established without having to add a policy for every namespace. This is true even for the namespaces that you will create in the future.
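You can convince yourself of this by creating a brand-new namespace and running a throwaway Pod in it. With the GlobalNetworkPolicy in place, the outbound request below should time out; the namespace, Pod name, and curl image are arbitrary choices for this sketch.

kubectl create namespace isolation-test
kubectl run -n isolation-test curl-test --rm -it --restart=Never --image=curlimages/curl -- curl --connect-timeout 5 -sI https://projectcalico.docs.tigera.io/
kubectl delete namespace isolation-test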

In addition to the GlobalNetworkPolicy resource, Calico also offers a NetworkPolicy resource that can be applied to namespaces individually.

Use the following command to add the required rules.

kubectl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: p-external2customer
  namespace: yaobank-customer
spec:
  selector: app == "customer"
  ingress:
  - action: Allow
    protocol: TCP
    destination:
      selector: app == "customer"
  egress:
  - action: Allow
    protocol: TCP
    destination:
      serviceAccounts:
        names:
        - summary
      namespaceSelector: ns == "summary"
---
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: p-customer2summary
  namespace: yaobank-summary
spec:
  selector: app == "summary"
  ingress:
  - action: Allow
    protocol: TCP
    destination:
      selector: app == "summary"
      ports:
      - 80
  egress:
  - action: Allow
    protocol: TCP
    destination:
      selector: app == "database"
      namespaceSelector: projectcalico.org/name == "yaobank-database"
EOF

At this point, your cluster should be secured, and you will be able to revisit the WebUI.

Use the following command to see the list of Calico GlobalNetworkPolicy rules.

kubectl get globalnetworkpolicy

Use the following command to see the list of Calico NetworkPolicy rules.

kubectl get caliconetworkpolicy -A

Use the following command to see the list of Kubernetes Policy rules.

kubectl get networkpolicy -n yaobank-database

We just needed to add a GlobalNetworkPolicy to lock down the cluster and a couple of network policies to permit communication between the customer and summary namespaces.

Now I should explain why we needed these exceptions. A true isolation policy denies everything by default, and implementing one requires fine-tuned exceptions to keep the essential Kubernetes services working. This is because a GlobalNetworkPolicy affects both namespaced and non-namespaced environments, and without proper exceptions it will cause a disruption of service. For example, Calico needs to communicate with the kube-system namespace for many reasons, and total isolation can break that crucial flow of traffic. Another example is the Kubernetes DNS service: the DNS Pods are located in the kube-system namespace, and total isolation blocks the DNS queries launched by cluster resources. To solve these issues, we used simple logic to create a general exemption for the essential Kubernetes services that works in any environment.

Our namespaceSelector uses a Calico-specific expression to tell the policy engine that we are only interested in traffic from namespaced resources that are not part of the exception list. Now, you might be wondering: if the DNS Pods are located in the kube-system namespace, why did we need the egress part of the policy? While the kube-system namespace itself can send and receive traffic, the workload namespaces remain in isolation, which means their inbound and outbound communication is blocked. To solve this, we added a cluster-wide egress rule that targets UDP traffic on port 53 and lets all namespaced resources send traffic to Pods that carry the kube-dns label.
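If you want to confirm that the DNS carve-out is doing its job, a lookup from an otherwise isolated namespace should still succeed. The busybox image and Pod name below are just placeholders for this quick check.

kubectl run -n yaobank-customer dns-test --rm -it --restart=Never --image=busybox -- nslookup kubernetes.default.svc.cluster.local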

Azure-CLI is the one-stop-shop application for managing Azure cloud platform resources. Just as it can create an AKS cluster, its “az aks delete” subcommand offers a quick and clean way to remove all participating resources in an AKS cluster.

Use the following commands to delete the AKS cluster and its resource group.

az aks delete --name CalicoAKSCluster --resource-group ccol2azure-lab -y
az group delete --resource-group ccol2azure-lab -y

Use the following commands to delete the AKS cluster entries from your kubectl config file.

kubectl config delete-cluster CalicoAKSCluster
kubectl config delete-context CalicoAKSCluster-admin
kubectl config delete-user clusterAdmin_ccol2azure-lab_CalicoAKSCluster

Thanks for reading


