The concept of sticking our compiled code into containers is becoming quite popular at the moment, and for very good reason. If you develop for the cloud and don't use containers or similar technology now, then you should really take a look at what's involved. The huge benefits of using containerized infrastructure/deployment are being pushed by the cloud vendors, and are also increasingly being recognized by mainstream enterprise as an all-round good egg and something worth investing time and resources into embracing. Containers and 'server-less computing' are two technologies that most developers are going to have to embrace in the next few years.
This article progresses the others in my DevOps/Infrastructure focused series, where we are looking at building a cloud agnostic, clustered virtual machine solution based on container based services. In the first article in this series, we looked at 'Terraform', what it is, and how to use it. The second article dug a bit deeper into that very useful piece of Infrastructure technology. The third one went through approaches for provisioning virtual machines with Terraform once they have been deployed. Since we have our Virtual Machines set up, the next thing we are going to look at is using this thing called Kubernetes to manage our code containers. If you are a pure Windows person, I would urge you to dip your toes into the Linux world - it's fascinating and has the full backing of Microsoft. For those folk, I have included deeper step by step instructions in the article; others can just skip those parts.
Containers are incredibly useful, but you can only fit so many on a machine before you have to worry about how to manage them all - this management of containers is known as orchestration. Arguably the most popular container system is the one provided by the Docker engine. While Docker has its own answer to orchestration, in 'Docker Swarm', the one that seems to be pulling ahead in the race is called 'Kubernetes'.
Kubernetes is a system that will manage the lifecycle of your containers no matter if they are distributed on clusters of five or five thousand virtual machines. The technology itself is based on Google's 'Borg system', which has been Google's internal system for managing clusters for many years. Just as Terraform aims to keep your 'desired state' of nodes up and running, in the same way Kubernetes will ensure that the containers you give it are safely distributed around the machine cluster and operating in a healthy manner.
I will delve into Kubernetes and Docker in further articles; for this article we are going to focus on setting up a cluster. If you are interested in the background to Kubernetes, these two Google papers are well worth a read:
Large-scale cluster management at Google with Borg
Borg, Omega, and Kubernetes - Lessons learned from three container management systems over a decade
Microsoft are making a big bet and very large investment in the Kubernetes world. As developers, we now have the ability to package directly to a container from within Visual Studio, and they have recently brought out some amazing tooling for it on Azure with Azure container service, as well as employing some of the world's best Kubernetes engineers. In my experience, when Redmond go full force behind something like this, it's time to sit up and take notice!
I think most IT folk will agree that life is short enough and there's simply too much to learn at times! ... while I am completely in favor of going deep dive into technologies *when it's needed*, most of the time I am quite happy to stand on the shoulders of others and use the fruits of the community (and that's one of the reasons I write articles, to give back ... please try it out yourself, every article helps someone, somewhere!).
Kubernetes from scratch is not the easiest of things to set up. There can be complications and dependencies that can trip you up easily, and until you get used to it, it can feel like the proverbial three steps forward, two steps back. For the purposes of this article I am going to show you one of the best, most configurable ways I have come across of setting up a Kubernetes environment, using a community supplied Ansible based solution called 'Kube Spray'. If you prefer to do things from the ground up, I can highly recommend you go step by step through 'Kubernetes the hard way', an excellent in-depth tutorial covering more than you will ever need to know :)
If you are not aware, Ansible is software that helps automate software provisioning, config management, and app deployment. It's extremely popular and very widely used in the IT operations world. I will cover it in an upcoming article in this series.
Kube-spray (originally known as 'Kargo') is, at its core, a set of Ansible playbooks that automate the installation of Kubernetes master and client nodes. It is open-source, developed and maintained by the community, and now part of the Kubernetes incubation program. Critically, KubeSpray is *production ready*, so it takes quite a lot of pain out of the entire setup operation for us.
What we are going to do in this article is a step by step install of Kubernetes using Kube-Spray, along with a very swish dashboard or three to let you easily manage your container cluster.
Building a virtual network on Azure
If you are following along with the series (see links at top of article), you will be using Terraform to spin up your cluster. If you haven't got there yet or want to play around with different configurations, here's a quick run-through of doing it from the Azure Portal:
(1) After login, click NEW and from the list presented, select a resource (for this example I have chosen Ubuntu server).
(2) Next step is to give the VM a name and password (don't use ssh key in this instance for development purposes). We also make the VM a member of a 'Resource group'. For the first VM, give a new unique resource group name, and for any other VMs/resources you are adding to this group, you can subsequently select the resource group name from the 'existing resource group' drop-box.
(3) Having entered all required information, click NEXT to select the virtual machine SIZE you want for the machine.
You have a final chance to review before you commit to setting the machine up.
After a short time you will be notified that the machine is ready for use. To connect to it, open the master virtual machine, click 'connect' at the top and this will give you an SSH IP address you can use to log-in.
Using Kube-Spray on Azure
The following instructions assume you have 5 x virtual machines provisioned and are able to access one via a Public-IP (this will be your master node) and the others by using SSH via the master node.
In Kubernetes we have two types of machines. The main machine is known as the master, and typically is called 'KubeMaster'. Other virtual machines that are controlled by the master used to be called 'minions' but are now called 'nodes'.
For development, we set up five (one master, four nodes) of the following machines:
(Azure D4S_V3) Standard
4 x vCPU
16 GB RAM
8000 max IOPS
32 GB local SSD
Premium disk support
It is critical for setup that you perform all commands given as the root user - once you have logged into a VM take care to "sudo -s" and become the root user before carrying out any commands. For the avoidance of ANY doubt, I'll say it again - you MUST do everything from here on in as root. If you don't, you will have issues :)
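If you end up scripting any of the steps that follow, a small guard can catch the "forgot to become root" mistake early. This is only a sketch: the function name require_root is my own invention, and all it does is compare the current user id against 0.

```shell
# Hypothetical helper: stop early unless the current user is root (uid 0).
require_root() {
  if [ "$(id -u)" -ne 0 ]; then
    echo "This step must be run as root (try 'sudo -s' first)." >&2
    return 1
  fi
}

# Example: put this at the top of a setup script.
# require_root || exit 1
```

Typing "sudo -s" by hand works just as well; the guard only earns its keep once the steps live in a script.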
The following is how I set up my test cluster for this article .. note the auto-shutdown option I have turned on - critical when testing if you don't want your credit card smacked with a big bill at the end of the month!
1 - Ensure SSH service is installed and running, then SSH into the master machine. I used to use the Putty app for this in the past, but now that BASH FOR WINDOWS is available I use this exclusively ... it's far easier and a fantastic Windows integrated tool if you are dipping your feet into the DevOps world. If you don't have it installed and want to give it a spin, please check out my step by step instructions on how to install BASH for Windows.
Connecting to the remote machine using Windows Bash
To find out how to login, on the dashboard, select the master machine and click the 'connect' button on the top. This will give you the main account and IP to log into.
To connect, open your BASH prompt and enter the command given in the connect information. I generally copy this to the clipboard and paste it into the BASH window.
To cut and paste into the bash or any text based window, click on the top left icon and then select edit -> paste from the popup menu shown.
After pasting in the ssh command, press enter!
You will notice from the screenshot below, that once we have connected, as we have not SSH'd into this machine before, it will give us a security warning about the machine and ask us to accept. Normally when responding to a yes/no you can enter Y or N, but in this case you MUST enter the full word 'yes' ... assuming you wish to continue! :)
After accepting the connect query, you are then prompted to enter the password for the user on the machine you are connecting to (in this case, you are requesting to connect as user 'kubeadmin' on machine 18.104.22.168).
Once the password is entered correctly, you are now live on the remote machine. It shows your username and machine name in green, and the dollar ($) prompt. Next steps are to start preparing the machine for the main install.
Once we are connected, we then need to elevate our user to ROOT, and create security keys that will be used to communicate securely between hosts.
To change to the root user, enter:
sudo -s <enter>
You will be asked for the main password.
After entering the password, we need to move into the ROOT folder:
cd /root
and generate the RSA security keys:
ssh-keygen -t rsa
NB: Follow these instructions carefully! .... it is important to be logged in as root and important to generate the key without a password.
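If you would rather not answer the ssh-keygen prompts by hand, the same key can be generated non-interactively. The sketch below is an assumption on my part rather than the method used in this article: the function name is made up, and it relies on the standard ssh-keygen options -N (passphrase, empty here) and -f (output file).

```shell
# Hypothetical helper: create an RSA key pair with an empty passphrase and
# no interactive prompts. An existing key is left untouched.
generate_passwordless_key() {
  local key_path=${1:-/root/.ssh/id_rsa}
  if [ -e "$key_path" ]; then
    echo "Key already exists at $key_path, leaving it alone." >&2
    return 0
  fi
  # Create the containing folder with owner-only permissions if it is missing.
  mkdir -p -m 700 "$(dirname "$key_path")"
  # -q quiet, -N "" empty passphrase, -f output file
  ssh-keygen -q -t rsa -N "" -f "$key_path"
}

# Example (as root on the master):
# generate_passwordless_key /root/.ssh/id_rsa
```

The public half ends up next to the private key with a .pub extension, which is the file we copy to the nodes next.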
The next thing we are going to do is print the key we just generated to screen, and copy that in preparation for pasting it on the remote nodes. Running the command 'cat' (short for 'concatenate') on the public key file ('cat /root/.ssh/id_rsa.pub') will read the file you pass in as a parameter and write its contents to the screen.
To select and copy this data at the bash command line, highlight with your mouse and hit <enter>
Sync security between nodes
Now that we have a key ready, we need to copy this over to the other nodes so that they see our master machine as authorized. This next step gets repeated for *each node* and, in addition, gets done for the master itself as well. I am showing you a manual method here - there are multiple ways to do this. In production I normally create a 'base' machine as a node that's set up and ready to go, and then 'clone' this for each new node I need - it saves a lot of the manual work.
Step 1 - connect to remote node.
We have four nodes to connect and make part of our security group. In my test cluster I have named these node1, node2, node3, node4. Each has the same admin username (kubeadmin) and password. The initial login is the same experience as connecting our local windows machine to the remote server. Initially you will get a security warning that you need to agree to, then you can login with the password.
Once logged in, as before, we need to elevate ourselves to the root user with 'sudo -s'. Having done that, we are then going to use a simple Linux text editor called 'nano' to edit 'authorized_keys', the file that handles authentication of other machines trying to connect to this one, which is located in the '/root/.ssh' folder.
The full command, 'nano /root/.ssh/authorized_keys', calls the text editor with a parameter of the file you wish to open in edit mode.
For those with sufficient grey hair, nano is not unlike the old WordStar text editor. In fact it seems that George RR Martin, author of the novels behind the television series 'Game of Thrones', writes his novels on WordStar version 4!
We are going to keep this extremely simple. We will use the paste method mentioned above (click top left icon, remember?) to paste the contents of the public key from our master machine now into the editor...
(the entire key pasted does not show above but it does go in!)
So to save this file, we press the 'CTRL + O' key combination - this will prompt a save - just accept by hitting the enter key.
We then use the key combination 'CTRL + X' to exit nano. OK, so now we are ready to test that it worked as expected. To do this, we will logout, and then attempt to log back in again but as root this time, not kubeadmin. To back out of the remote user bash shell you are in, type 'exit' and hit enter until you are back at the 'root@kubemaster' prompt.
Now from this starting point, we attempt to login *using no password* to the remote node.
Success! ... note how no password was asked for and we are landed straight into the 'root' account prompt.
The above needs to be repeated for every virtual machine/node in your cluster.
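Once every node has the key, it can be handy to check them all in one go instead of logging into each by hand. The following is a rough sketch under a few assumptions: the function name is hypothetical, the node names are the ones from my test cluster, and each node's host key has already been accepted once (otherwise the check will report a failure). The BatchMode=yes option tells ssh to fail instead of prompting for a password.

```shell
# Hypothetical helper: try a key-only root login on each node and report it.
check_passwordless_login() {
  local failed=0 node
  for node in "$@"; do
    # 'true' is run on the remote side; we only care whether login succeeds.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$node" true </dev/null; then
      echo "$node: ok"
    else
      echo "$node: FAILED"
      failed=1
    fi
  done
  return "$failed"
}

# Example, run as root on the master:
# check_passwordless_login node1 node2 node3 node4
```

Any node reported as FAILED most likely needs the authorized_keys step repeated.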
Step 2 - finalizing access for the master node itself.
For the master node, we finally need to copy its own public key into its own authorized key file. We can do this by copying (cp) the pub file into the auth file:
cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
We can confirm it worked by running a CAT on the auth file after the copy.
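One thing to be aware of: cp replaces whatever was in the authorized key file before. On a fresh machine that is fine, but if the file might already hold other keys, appending is safer. Here is a minimal sketch of that idea; the function name is my own, and it only adds the key when an identical line is not already present.

```shell
# Hypothetical helper: add a public key to an authorized_keys file only if it
# is not already listed, so existing entries are preserved.
authorize_key() {
  local pub_file=$1 auth_file=$2
  mkdir -p -m 700 "$(dirname "$auth_file")"
  touch "$auth_file"
  chmod 600 "$auth_file"
  # -q quiet, -x whole-line match, -F fixed string, -f read pattern from file
  if ! grep -qxFf "$pub_file" "$auth_file"; then
    cat "$pub_file" >>"$auth_file"
  fi
}

# Example for the master node:
# authorize_key /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
```

Running it twice leaves a single copy of the key in the file, so it is safe to repeat.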
There are a number of installs and updates we need to carry out on the master machine before we can install KubeSpray - these are in effect its dependencies.
We run each of these commands individually (answering y/yes where appropriate/when asked), or, you can run them from the attached script (just to make your life a small bit easier .. you're welcome! :D)
apt-get install software-properties-common
apt-get install ansible
apt-get -y upgrade
apt-get install python-pip
pip install jinja2
pip install netaddr
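Before moving on, it does no harm to confirm that the tools we just installed are actually on the path. The helper below is a hypothetical sketch (the name check_commands is mine); it simply asks the shell whether each command can be found.

```shell
# Hypothetical helper: report any required command that is not on the PATH.
check_commands() {
  local missing=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: the tools the remaining steps rely on.
# check_commands ansible ansible-playbook pip git
```

It does not check the Python modules (jinja2, netaddr), only the command-line tools.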
The official GIT repository for KubeSpray is the usual starting point. For the purpose of this article I am going to point you to a community branch customised by the very talented crew in Xenonstack ... their particular branch is already configured for a number of extremely useful dashboards and services so we may as well stand on their shoulders (as an aside, I have used Xenon for training and support on a number of occasions and they are really excellent - my go-to experts for immediate professional advice and assistance!)
git clone https://github.com/xenonstack/kubespray.git
Having downloaded KubeSpray (now residing in '/root/kubespray'), we next need to tell the configuration the names and/or IPs of our nodes.
Navigate to /root/kubespray/inventory and edit the inventory.ini file (using our new found friend 'nano'), adding in references to 'KubeMaster' and 'nodeX' (1..4) as shown below.
(as you might guess, navigation uses basic commands quite like dos/powershell ... cd for change directory would be useful at this point!)
(ensure that the section headers are uncommented by removing the leading '#', so that '#[etcd]' becomes '[etcd]')
The sections are as follows:
all - this lists all nodes on the cluster. If a node has multiple network addresses you can specify the one you need to link to by using the 'ip=x' setting that is commented out in the example below.
kube-master - list the Kubernetes master node, in our case 'KubeMaster'
etcd - this is a simple distributed key/value store and used by Kubernetes to keep track of nodes and containers. In this example we are storing it on the master, but in production it may be on a separate machine replicated.
kube-node - a listing of the child/minion nodes being controlled by the master (node 1..4)
k8s-cluster:children - this refers to the nodes to be part of the cluster. Here we list all nodes. In production this may be different.
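To make the section layout above concrete, here is a sketch that writes out an inventory for one master and four nodes. Treat it as an illustration only: the function name is made up, the 10.0.0.x addresses are placeholders, the lowercase host name 'kubemaster' and the use of Ansible's 'ansible_host' variable are my assumptions, and your copy of inventory.ini may use slightly different variable names.

```shell
# Hypothetical example: write an inventory for one master and four nodes.
# The 10.0.0.x addresses are placeholders; substitute your own.
write_example_inventory() {
  local target=$1
  cat >"$target" <<'EOF'
# Add ip=<address> to a host line if the node has several network addresses.
[all]
kubemaster ansible_host=10.0.0.4
node1 ansible_host=10.0.0.5
node2 ansible_host=10.0.0.6
node3 ansible_host=10.0.0.7
node4 ansible_host=10.0.0.8

[kube-master]
kubemaster

[etcd]
kubemaster

[kube-node]
node1
node2
node3
node4

[k8s-cluster:children]
kube-node
kube-master
EOF
}

# Example (this overwrites the target file):
# write_example_inventory /root/kubespray/inventory/inventory.ini
```

Editing the supplied file in nano gets you to the same place; the point here is just to show how the five sections relate to each other.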
NB: If you wish to use a different Linux distro instead of Ubuntu, you will need to edit the /kubespray/inventory/group_vars/all.yml file and change 'bootstrap_os: ubuntu' to 'bootstrap_os: XX' where XX is the name of the OS (listed at the top of the all.yml file)
We are now ready to test connectivity between the master and nodes. From the kubespray/inventory folder run the following command:
ansible --inventory-file=inventory.ini -m ping all
Assuming that the instructions have been followed carefully, you should see a 'green light' response as follows:
So, good to go, the final thing to do is run the actual install itself and then check everything has worked!
Right, so here's the reason we came to this party! ... after all our setup, the installation itself couldn't be simpler ... we just execute one command line from the kubespray folder (one up from inventory where we have been, try 'cd ..')...
ansible-playbook -i inventory/inventory.ini cluster.yml
Depending on the number of nodes you have in the cluster, and what spec they are, the script will run for anything from 5 to 30 minutes (that's a lot of work, thank goodness KubeSpray is there to do it all for us!).
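Because the run is long and produces a lot of output, you may want a copy of it on disk in case something fails halfway through. The wrapper below is a sketch with a made-up name (run_logged): it shows output on screen, saves it to a log file, and still hands back the exit status of the command itself, which a plain pipe into 'tee' would otherwise hide.

```shell
# Hypothetical helper: run a command, show its output live, keep a copy in a
# log file, and return the command's own exit status.
run_logged() {
  local log_file=$1 status_file rc
  shift
  status_file=$(mktemp)
  # The pipeline reports the status of 'tee', so the command's status is
  # stashed in a temporary file and read back afterwards.
  { "$@" 2>&1 && echo 0 >"$status_file" || echo "$?" >"$status_file"; } | tee "$log_file"
  rc=$(cat "$status_file")
  rm -f "$status_file"
  return "$rc"
}

# Example, from the kubespray folder:
# run_logged install.log ansible-playbook -i inventory/inventory.ini cluster.yml
```

If the install does fall over, the log file gives you something to search through afterwards.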
While the script is running, you can expect the text output flowing to the screen to stop/start ... it will look similar to this:
When the script has completed, you will be shown a 'play recap', and the overall timings for the process. In my testing, it took 11.33 minutes, which is pretty respectable.
Confirm KubeSpray installation
Right, everything seems to be complete, so we can now carry out checks to ensure everything is as it should be. One of the main ways we interact with Kubernetes is using the KUBECTL (kube control) command line utility. KubeCtl allows us to get status information from the cluster, and also to send commands into the cluster.
First up, let's confirm our cluster is up and running and our nodes have registered and are available. We do this by using the 'get' command, with a parameter of 'nodes'
kubectl get nodes
As shown below, our cluster, consisting of our single master and four child nodes, is now connected and ready to go.
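If you want to check this from a script rather than by eye, the node listing can be filtered for anything that is not ready. This is a small sketch with a hypothetical function name; it assumes the usual layout of 'kubectl get nodes' output, where the first column is the node name and the second is its status, and it uses kubectl's --no-headers flag to drop the title row.

```shell
# Hypothetical helper: read 'kubectl get nodes --no-headers' output on stdin
# and print the name of every node whose STATUS does not start with "Ready".
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Example:
# kubectl get nodes --no-headers | not_ready_nodes
```

An empty result means every node has reported in as Ready.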
The next thing we will do is look and see exactly what containers and pods have been created by the installation script for us. To do this, we call 'kubectl get' again, passing a main param of 'pods' and also passing the optional parameter 'all-namespaces', which gives us back system containers as well as our own specific deployments. As we don't have any of our own pods or containers deployed, we simply get to see a listing of what the script has put together for us.
kubectl get pods --all-namespaces
At the start of the article I talked about dashboards - let's confirm that they are up and running by calling their container IP/port combinations. To find out what is where, we need to examine the running 'services'.
kubectl get svc
When we call the 'service' list by default, it only gives us the top level exposed services - in this case it's simply the cluster service:
However, we know there is more, and you can see now how the addition of the 'all-namespaces' parameter ('kubectl get svc --all-namespaces') extends the request to give us detail on everything, not only the top level.
The pods we are interested in are the dashboard ones (kubernetes dashboard, grafana and weave-scope), and we can see they are present and operational.
Finally, to confirm they are working, we can use the CURL command (which acts as a command-line httpClient of sorts), to connect to the dashboard using its IP and download the response.
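Rather than reading the IP and port off the service listing by eye, you can ask kubectl for them directly and feed the result to curl. The sketch below makes some assumptions that you should check against your own cluster: the function name is invented, and the service name 'kubernetes-dashboard' and namespace 'kube-system' in the example are typical values, not something confirmed for this particular branch. It uses kubectl's jsonpath output option to pull out the cluster IP and the first listed port.

```shell
# Hypothetical helper: print "<cluster-ip>:<port>" for a named service.
service_endpoint() {
  local namespace=$1 name=$2 ip port
  ip=$(kubectl get svc "$name" -n "$namespace" -o jsonpath='{.spec.clusterIP}') || return 1
  port=$(kubectl get svc "$name" -n "$namespace" -o jsonpath='{.spec.ports[0].port}') || return 1
  echo "$ip:$port"
}

# Example: print only the HTTP status code returned by the dashboard.
# curl -s -o /dev/null -w '%{http_code}\n' "http://$(service_endpoint kube-system kubernetes-dashboard)/"
```

Whether the dashboard answers on plain http or https depends on how it was deployed, so adjust the scheme in the curl call to suit.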
We have now successfully installed a Kubernetes cluster on Azure in a very pain-free way, using a method that will work for all cloud providers and also on bare metal. The next step is to expose some ports in our security area to see the dashboards in the browser, examine the benefits the dashboards give us, and expose 'persistent data volumes' that we can interact with for centralized storage. We will discuss this in the next article in this series.
I attach a shortened set of instructions to assist you in setting up your own Kubernetes cluster for downloading. As usual, if this article has been helpful, please give it a vote above!