
A Hands-On Tour of Kubernetes: Part 1 - Introduction


Introduction

Kubernetes is a divisive topic in the world of software development. It seems to attract ardent camps of both promoters and detractors.

For some, Kubernetes is the herald of an upcoming golden age of cloud native software. We are on the cusp of reveling in workloads and infrastructure that are self-healing and always-available. We’ll no longer think in terms of “iteration cycles” since we’ll be delivering a constant stream of value into the world.

For others, Kubernetes is the tangled manifestation of an industry driven by hype and complexity. We now toil away on the inconsequential mess we’ve created for ourselves, rather than solving problems faced by businesses in the real world. We’ve prioritized job security over simplicity.

Perhaps it’s best to step away from the emotion and look at the numbers. According to the 2022 CNCF Annual Survey, nearly half of all organizations using containers run Kubernetes to deploy and manage at least some of those containers. Worldwide, there are 5.6 million developers using Kubernetes today.

And yet, I know there are many still wondering: what is Kubernetes?

The official kubernetes.io website describes Kubernetes as follows:

Kubernetes, also known as K8s, is an open-source system for automating deployment, scaling, and management of containerized applications.

I remember reading this sentence several years ago before I knew anything about Kubernetes. On the surface, this description did nothing to help me understand what it was. I didn’t wake up like Neo from a training program, with all the kubectl commands suddenly at my disposal.

[I Know Kubernetes]


Rather, I continued on to the documentation page where I was met with a buffet of information. I remember feeling lost in the sea of concepts and cross-referenced materials. I was on a boat without a skipper, and the further I dove into the material, the more I felt like Kubernetes was beyond my ability to understand.

Eventually, I realized I felt lost because my learning style just didn’t align with how the official docs are organized. The official docs are an amazing reference for Kubernetes, but the longer I go before I convert reading into doing, the more likely I am to forget the material.

This blog post, and the ones that follow, aim to provide that hands-on, “hello world” introduction to Kubernetes for others like me. Rather than serving as an exhaustive reference, these posts focus on the basics of Kubernetes by running commands against a real cluster. By the end, you should feel more familiar with some of the Kubernetes terminology and underlying mechanics. Consider these posts the “context onramp” into the official docs.

In the interest of keeping these posts concise, I’ve made a few assumptions about the things you already know:

  • You are familiar with Docker/containers.
  • You are comfortable with running commands from the command line.

However, you do not need to be an expert in either of these things. We won’t be building our own container images, and we won’t be piping together long sequences of sed or awk. Mostly, we’ll be running a single command, inspecting the output, then showing what that information means.

And since we’re going to learn by doing, it seems appropriate to start by running Kubernetes locally.

Running Kubernetes Locally

Kubernetes is not a single piece of software; it is a set of software components working together.

It is possible to run Kubernetes by manually installing and configuring each of these components. There are guides that explain how to install Kubernetes from scratch, but this is a rather tedious endeavor. There are tools (such as kubeadm) that automate many of the necessary tasks, but even these tools expect a level of administration knowledge that is beyond the scope of these blog posts.
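To give a flavor of what kubeadm automates, here is a rough sketch of bootstrapping a cluster with it (illustrative only; real usage requires a container runtime, kubelet, and networking prerequisites on every machine):

# On the machine that will host the control plane:
$ sudo kubeadm init

# On each worker machine, using the token and hash printed by `kubeadm init`:
$ sudo kubeadm join <control-plane-host>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>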

In a production setting, you will likely use the hosted Kubernetes service provided by your favorite cloud provider. Examples include:

  • Amazon Elastic Kubernetes Service (EKS)
  • Azure Kubernetes Service (AKS)
  • Google Kubernetes Engine (GKE)

If you are managing your own data center, you may choose to use one of the numerous, commercially-backed Kubernetes distributions, such as:

  • Red Hat OpenShift
  • Rancher (RKE2)
  • VMware Tanzu

There are a lot of different Kubernetes offerings out there. These offerings simplify many of the administration tasks while offering their own unique functionality and integration opportunities, but most of these options are too heavyweight for our current needs. We need a local cluster that we can start and stop with ease.

Luckily for us, there are two freely* available pieces of software that provide a single-click installation of Kubernetes for the desktop:

  • Docker Desktop
  • Rancher Desktop

The sections below describe the setup process for these two pieces of software. Note that both options support Windows, macOS, and Linux.

* Docker Desktop is free for personal use, as explained on the pricing page.

Docker Desktop

For many, “Docker” is synonymous with “container”. You may already have it available on your machine. After installing Docker Desktop, you can enable Kubernetes by following the official instructions.

Rancher Desktop

Rancher Desktop is an open-source project (GitHub repo) that aims to bring Kubernetes to the desktop. When installing Rancher Desktop, you have a choice of using containerd or dockerd as the container runtime. We will be using dockerd.

Verify Your Setup

kubectl is THE command line tool for interacting with Kubernetes. Before continuing, let’s make sure kubectl is installed and available within our PATH. From your system’s command prompt, run:

$ kubectl config current-context

You should see a single line of output whose value depends on how you are running Kubernetes.

  • If you are using Docker Desktop, you should see something like docker-desktop
  • If you are using Rancher Desktop, you should see something like rancher-desktop

This output tells us that kubectl is working and pointing to our local Kubernetes instance. If the output of the above command shows a different value, please review the installation directions for Docker Desktop / Rancher Desktop and make sure the chosen software is running.
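If kubectl is installed but pointing at a different cluster, you can list the contexts it knows about and switch to the appropriate one:

$ kubectl config get-contexts
$ kubectl config use-context docker-desktop   # or rancher-desktop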


Note: If you are using Rancher Desktop, when you first use kubectl you may see some extra messages like this:

$ kubectl config current-context
I0509 15:57:19.646691   13564 versioner.go:58] invalid configuration: no configuration has been provided
I0509 15:57:19.724839   13564 versioner.go:64] No local kubectl binary found, fetching latest stable release version
I0509 15:57:19.993116   13564 versioner.go:84] Right kubectl missing, downloading version 1.24.0
Downloading https://storage.googleapis.com/kubernetes-release/release/v1.24.0/bin/darwin/amd64/kubectl
...

This is expected, as Rancher Desktop may delay the installation of kubectl until it is first used.


Assuming the previous command matches the suggested output, we can now verify that our local Kubernetes instance is running.

$ kubectl get nodes

The output of this command will also differ depending on how you’re running Kubernetes.

Docker Desktop:

$ kubectl get nodes
NAME             STATUS   ROLES           AGE     VERSION
docker-desktop   Ready    control-plane   2m19s   v1.28.2

Rancher Desktop:

$ kubectl get nodes
NAME                   STATUS   ROLES                  AGE   VERSION
lima-rancher-desktop   Ready    control-plane,master   34d   v1.28.5+k3s1

Note: the numbers in the VERSION column may be different for you.

If your output looks similar, then congrats! You are successfully running Kubernetes locally.

Basic Concepts

Alright, I lied a little. Before we dive further into kubectl commands, I’d like to introduce some initial Kubernetes concepts. These blog posts will absolutely be hands-on, but having the context for our actions will help with our understanding.

Kubernetes can do a lot of things, but I think its purpose is best summarized as a dialog.

Me: Hey Kubernetes, here are some machines for running applications. Let’s call them “nodes”.

Kubernetes: Sounds good. Terminology noted.

Me: Perfect. Also, here are my applications. They are packaged as containers.

Kubernetes: Looks good to me. What do you want me to do with this information?

Me: I’d like to run the containers on these nodes, and I want you to figure out how to make it work.

Kubernetes: I’m on it!

Ultimately, Kubernetes exists to help us run our containerized applications across a set of machines. These machines provide compute (CPU and memory) alongside storage and networking, and we’re not interested in specifying every detail of how our applications run and communicate. Instead, we’d rather tell Kubernetes our desired outcome and let Kubernetes figure out where things belong.

Nodes

As suggested in the above dialog, the machines that are available to run our containers are called nodes, or sometimes “worker nodes”. A node can be either a bare-metal server or a virtual machine, but either way, the nodes run software that allows them to reach each other and run containerized workloads.
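We can ask kubectl for extra node details, such as the internal IP, operating system, and container runtime each node is using (your values will differ depending on your setup):

$ kubectl get nodes -o wide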

The nodes are registered to the control plane, which tracks the state of the cluster. The control plane itself is some software running on one or more machines. The software running on the control plane determines which containers run on which nodes.

How you set up all these machines can vary quite a bit:

  • We can run our control plane on a single machine and have multiple worker nodes connected.
  • We can roll with a highly available (HA) cluster which replicates the control plane components across multiple machines. Most managed Kubernetes offerings (such as EKS, AKS, and GKE) follow this topology.
  • We can use a single machine to simultaneously host the control plane and function as a worker node. This is the setup we’re using for our local cluster.

Regardless of how many machines we’re using, the interface to Kubernetes remains the same. From the perspective of the application developer, there is little difference between a single-node cluster and a 15,000-node cluster apart from the redundancy and compute resources available. No functionality becomes “unlocked” after you’ve added your eighth node to the cluster, for example.

Kubernetes API

The control plane hosts an API server and a database. Similar to other APIs, we send HTTP requests to the API server to manipulate the resources in the database.

However, these API resources don’t represent things like “customer”, “cart”, or “invoice”. Rather, these API resources represent things like “a running container”, “a network connection”, or “application storage”. While we could interact with Kubernetes entirely through its API endpoints, the kubectl tool provides a friendly wrapper around all these HTTP requests.
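We can see this for ourselves. The sketch below uses kubectl proxy to open an authenticated tunnel to the API server, then queries a couple of its endpoints with plain curl:

# In one terminal: proxy local requests to the API server
$ kubectl proxy --port=8001

# In another terminal: talk to the API directly over HTTP
$ curl http://localhost:8001/version
$ curl http://localhost:8001/api/v1/namespaces/default/pods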

Only the API server connects to the database. All other components read and update the state of the cluster through the API server. This design leads to a key aspect of Kubernetes: the desired state of the cluster is entirely contained in the database. In fact, we can take a snapshot of the database and use it to restore the cluster if there was a critical failure.
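In most clusters, that database is etcd. As a sketch, taking such a snapshot on a kubeadm-style cluster might look like the following (assuming shell access to a control plane node; the certificate paths shown are kubeadm’s defaults):

$ ETCDCTL_API=3 etcdctl snapshot save /tmp/cluster-backup.db \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key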

Other software components (called controllers) compare the desired state of the cluster with the actual state. If the actual state doesn’t match the desired state, these controllers perform the necessary actions to bring the actual state closer to the desired state.

Unless you are a cluster administrator, you don’t need to worry about the detailed workings of these various components. However, I think some awareness of what’s happening behind the scenes helps demystify Kubernetes. With the initial concepts out of the way, the remainder of these blog posts will focus on these different API resources and how we use them to run our applications.

Pods

We’re going to start with the most fundamental building block in Kubernetes: the pod. Put simply, a pod is a collection of one or more containers along with an execution environment. For simplicity, the pods we’ll create in this series will only have a single container.
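If you’d like proof straight from the source, kubectl ships with built-in schema documentation for every resource type; this (optional) command confirms that a pod’s containers field is a list:

$ kubectl explain pod.spec.containers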

Let’s get our hands on the keyboard and run our first pod.

$ kubectl run my-first-pod --image=nginx:1.24
pod/my-first-pod created

If you’re familiar with running containers with docker, this command will look familiar (see the comparison after this list).

  • After kubectl run we specify the name of our pod, which is “my-first-pod”.
  • We use the --image option to specify which image to use for the single container in our pod. In this case, we are using an nginx image from Docker Hub.
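For comparison, a rough docker equivalent of the command above might look like this (a sketch; it creates a standalone container rather than a Kubernetes pod):

$ docker run --name my-first-pod --detach nginx:1.24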

Let’s now verify the pod is running.

$ kubectl get pods
NAME           READY   STATUS    RESTARTS   AGE
my-first-pod   1/1     Running   0          38s

The output tells us that one of the pod’s one containers is ready to accept traffic (READY 1/1), the pod’s status is Running, the restart counter shows the container has not been restarted, and the AGE column shows how long ago the pod was created.

Assuming your pod is also running with zero restarts, then congrats! You have successfully run nginx in your local Kubernetes cluster!
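If you want more detail than get provides, kubectl describe prints a fuller report, including which node the pod was scheduled on and a log of recent lifecycle events (output omitted here, as it is lengthy):

$ kubectl describe pod my-first-pod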

The nginx image we’re using binds to port 80 by default. Let’s try to send an HTTP request to our nginx instance:

$ curl http://localhost:80                   
curl: (7) Failed to connect to localhost port 80 after 0 ms: Couldn't connect to server

Unless you have another service listening on port 80 on your machine, you should see a “Couldn’t connect to server” error. Similar to Docker, containers running in Kubernetes have their own (virtual) network interfaces. We’ll look at cluster networking some more later on, but for now, let’s reassure the skeptic that nginx is running by first setting up port forwarding between our localhost and the pod.

$ kubectl port-forward pod/my-first-pod 8000:80
Forwarding from 127.0.0.1:8000 -> 80
Forwarding from [::1]:8000 -> 80

Requests to localhost:8000 will now be forwarded to port 80 of our pod. In a separate terminal, let’s try sending our HTTP request again.

$ curl http://localhost:8000
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Ta-da! We received the default nginx response. We can actually inspect the nginx logs and verify it wasn’t some other instance of nginx that someone sneakily ran on our machine.

$ kubectl logs pod/my-first-pod
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2024/01/24 22:24:49 [notice] 1#1: using the "epoll" event method
2024/01/24 22:24:49 [notice] 1#1: nginx/1.24.0
2024/01/24 22:24:49 [notice] 1#1: built by gcc 10.2.1 20210110 (Debian 10.2.1-6) 
2024/01/24 22:24:49 [notice] 1#1: OS: Linux 6.1.64-0-virt
2024/01/24 22:24:49 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/01/24 22:24:49 [notice] 1#1: start worker processes
2024/01/24 22:24:49 [notice] 1#1: start worker process 29
2024/01/24 22:24:49 [notice] 1#1: start worker process 30
127.0.0.1 - - [24/Jan/2024:22:25:46 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/8.4.0" "-"

The final log message shows our HTTP request. Satisfied with these results, we can now hit “ctrl-c” in the terminal running our port forward command to stop the port forwarding.

In fact, we no longer need this pod. Let’s delete it.

$ kubectl delete pod my-first-pod
pod "my-first-pod" deleted

We can get the pods again to verify it’s gone:

$ kubectl get pods
No resources found in default namespace.

The keen reader may notice some similarities with these commands. For the most part, kubectl commands use the following pattern:

kubectl <verb> <type> <name>

  • verb indicates the desired action.
  • type refers to the resource type. So far we’ve only talked about pods, but we’ll see more resource types soon.
  • name is the name of the resource.

The kubectl documentation further explains the command syntax and all the possible actions.
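For example, the commands we’ve already run fit this mold:

$ kubectl get pods                  # verb + type: list all pods
$ kubectl delete pod my-first-pod   # verb + type + name
$ kubectl logs pod/my-first-pod     # type and name can also be combined as type/name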


I don’t think I’d be too much of a reductionist to say that Kubernetes is all about pod manipulation. Given one or more containers, Kubernetes will help me configure the storage, networking, lifecycle, organization, and security of those containers.

With this in mind, everything we learn from here is about giving ourselves more tools to manipulate our pods. In the next post, we’ll look at how we can use namespaces and labels to organize our objects.
