What is Volume in Kubernetes & why we need it? – part 1
Why we need Volume?
What is Volume?
Types of Volumes?
– What is EmptyDir?
– What is hostPath?
– What is Static Volume?
* PersistentVolume
* PersistentVolumeClaim
– What is Dynamic Volume?
We have one cluster that contains two worker nodes. When we create a Deployment in the cluster, a ReplicaSet is created inside it, and this ReplicaSet creates the Pods and manages their lifecycle.
Further, we expose this Deployment with a Service.
Now, what happens if we write something in a container of this Pod by creating directories and files? We can access these files and directories at the container level. Suppose the container crashes: the kubelet restarts it. Or suppose the whole Pod is lost: the ReplicaSet creates a new Pod for us. Either way, the newly created container does not have the old content that we wrote earlier.
It means the data is lost when a container crashes. This is our first problem.
Similarly, suppose we are sharing these files of pod1 with other Pods, and pod1 crashes. All the shared data vanishes. This is our second problem.
The Kubernetes volume abstraction solves both of these problems.
First, we create a volume on the cluster and attach this volume to the container.
Now, when we write anything at the mounted path in the container, it is actually written to the volume. If this Pod or container goes down, the ReplicaSet creates a new Pod with the old configuration, and this new Pod is linked to the same volume.
So, it gets all the previous data back.
A good thing is that the storage behind a volume can live either in the cluster or at the cloud level.
Or we can use an on-premises SAN, vSAN, NetApp, or NFS server.
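What is Volume?
Although we already know what a volume is, I would like to share the official definition of a volume in Kubernetes, as per the kubernetes.io website. In essence, a volume is a directory, possibly with some data in it, that is accessible to the containers in a Pod.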
Volumes fall under two categories: Ephemeral and Durable.
Under Ephemeral, there are three types:
emptyDir, ConfigMap, and Secret.
Under Durable, there are also three types:
hostPath, Static, and Dynamic.
In Static, we manually create a PV and a PVC. PV stands for PersistentVolume and PVC stands for PersistentVolumeClaim.
In Dynamic, we first create a StorageClass and then create only a PVC. The PV is created automatically.
Let’s elaborate on emptyDir first.
An emptyDir volume is first created when a Pod is assigned to a node, and it exists as long as that Pod is running on that node. The volume is available until the Pod dies.
When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently. It means that if the Pod dies, the volume it created is deleted too. But if only a container dies, the volume remains available, because this volume is bound at the Pod level.
As the name says, the emptyDir volume is initially empty.
All containers in the Pod can read and write the same files in the emptyDir volume, though that volume can be mounted at the same or different paths in each container. In simple words, a single volume is shared by all the containers created inside the Pod, and each container can mount this volume on a different directory.
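To make the sharing idea concrete, here is a minimal sketch of one Pod with two containers that mount the same emptyDir at different paths. The Pod name, container names, volume name, paths, and the busybox commands are my own assumptions for illustration, not taken from the slides.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-emptydir-pod        # hypothetical name
spec:
  containers:
    - name: writer                 # hypothetical container that appends to a file
      image: busybox
      command: ["sh", "-c", "while true; do date >> /out/log.txt; sleep 5; done"]
      volumeMounts:
        - name: shared-data
          mountPath: /out          # the volume appears as /out in this container
    - name: reader                 # hypothetical container that reads the same file
      image: busybox
      command: ["sh", "-c", "touch /in/log.txt; tail -f /in/log.txt"]
      volumeMounts:
        - name: shared-data
          mountPath: /in           # same volume, different path
  volumes:
    - name: shared-data
      emptyDir: {}                 # one volume, shared by both containers
```

Both containers see the same file, even though one calls the directory /out and the other calls it /in.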
Kubernetes creates a directory on the node for the Pod. We do not choose its path.
This is our node, and Kubernetes creates the emptyDir on the node itself.
This directory can be created in RAM or on the hard disk.
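As a small sketch of the RAM option: the emptyDir `medium` field set to `Memory` asks Kubernetes for a RAM-backed (tmpfs) directory, and leaving the field out uses the node's disk. This is only the volumes fragment of a Pod spec, and the volume name is a hypothetical one.

```yaml
# Fragment of a Pod spec, not a complete manifest
volumes:
  - name: cache-volume     # hypothetical volume name
    emptyDir:
      medium: Memory       # RAM-backed; omit this field to use the node's disk
```

Keep in mind that files written to a RAM-backed emptyDir count against the memory of the containers that write them.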
This is our simple Pod YAML file. Here, the kind is Pod.
The name of this Pod is emptydir-pod1.
The namespace is core, which means this Pod will be created inside the core namespace.
The name of the container is emptydir-container1. Please bear in mind that this value goes inside the containers section.
We use the nginx image here.
Kindly observe that both the name and image parameters have the same indentation. This is our containers section.
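For readers who cannot see the slide, the Pod described above would look roughly like this. It is a sketch rebuilt from the narration, so the exact layout on the slide may differ.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-pod1
  namespace: core                  # the namespace must already exist
spec:
  containers:
    - name: emptydir-container1    # name and image sit at the same indentation
      image: nginx
```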
When we add the emptyDir volume, I change the container name and Pod name from 1 to 2.
Under the spec section, we create a new section for volumes. Inside this, we first add the name of the volume,
and then emptyDir with a size limit of 500 megabytes, not megabits (in the manifest this is written as a value such as 500Mi).
Further, we have to add some more parameters inside the containers section, because in the end it is the container that accesses this volume.
Inside this, we have mountPath. This option tells us on which directory inside the container the volume will be mounted. Here, I have defined /caches. If you wish, you can change the directory name.
Last, we add the name. Here, you may notice that both names are the same. They must be the same. If the names do not match, the Pod is rejected and the volume is not mounted.
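Putting these pieces together, the second Pod could look like the sketch below. The Pod name, container name, namespace, mount path, and 500Mi limit follow the narration. The volume name cache-volume is my own assumption, because the narration does not say which name the slide uses.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-pod2
  namespace: core
spec:
  containers:
    - name: emptydir-container2
      image: nginx
      volumeMounts:
        - name: cache-volume       # must match the volume name below
          mountPath: /caches       # directory inside the container
  volumes:
    - name: cache-volume           # assumed name, not from the slide
      emptyDir:
        sizeLimit: 500Mi           # 500 mebibytes
```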
In the Lab section, I will explain how we can create and delete the emptyDir volume.
In simple words, when we use the emptyDir volume, Kubernetes creates a directory on the hard disk or in RAM, as per the values we have defined.
When the container restarts, we do not lose the data, because this volume is linked to the Pod.
It is one of the ephemeral volume types.
Let’s move to our next topic.
ConfigMap and Secret I will cover in my next video, because we use these mount points for a different purpose.
Let’s talk about hostPath under the Durable volume type.
This is our hostPath, and it is not empty: it can already contain files from the node.
It always lives on the node's hard disk, whereas an emptyDir volume can be created either in RAM or on the hard disk.
Another difference between the two: with emptyDir, Kubernetes creates the directory on the node for us when it creates the Pod. With hostPath, on the other hand, we have to manually create the directory on the node first, and then we create the Pod.
As with the emptyDir volume, we have to add some volume parameters in the containers section.
Under volumeMounts, we set mountPath. In the container, we will see the /vol1 directory, which is bound to this volume. The container finds the volume through the name parameter.
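A hostPath Pod along these lines might look like the sketch below. Only the /vol1 mount path comes from the narration. The Pod name, container name, volume name, and the node directory /data/vol1 are hypothetical values that I chose for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-pod1              # hypothetical name
  namespace: core
spec:
  containers:
    - name: hostpath-container1    # hypothetical name
      image: nginx
      volumeMounts:
        - name: host-volume        # must match the volume name below
          mountPath: /vol1         # directory seen inside the container
  volumes:
    - name: host-volume
      hostPath:
        path: /data/vol1           # hypothetical directory, created on the node beforehand
        type: Directory            # the Pod fails to start if the directory is missing
```

Because the directory belongs to one node, a Pod rescheduled to another node will not see the same files.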
Let’s talk about the Static Volume under the Durable volume type.
We have different storage servers:
SAN 400 TB SSD
SAN 400 TB HDD
NFS 100 GB
We have one Kubernetes cluster, and this is our Pod.
We want to mount these storage servers on this Pod so that the application running inside the Pod can save some files. In simple words, we want a persistent volume that remains available after this container or Pod dies.
But these storage servers are outside of the cluster.
To resolve this issue, Kubernetes offers a dedicated object called PersistentVolume, or PV for short. PVs are implemented as volume plugins in Kubernetes.
We create 3 different PVs, and each one points to a particular storage server.
Now we have the link between the external storage and the PV. But there is still no link between the Pod and the PV. Therefore, Kubernetes offers one more object, the PersistentVolumeClaim, or PVC.
The PVC can communicate with both the PV and the Pod. It means that in the PVC YAML file we state which PV, or what kind of storage, we want to be connected to.
In short, in Static, if we want to use a volume in a Pod, we have to create two more objects, the PV and the PVC. Please note that a PV is dedicated to one PVC.
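Here is a minimal sketch of a static PV and the PVC that claims it, using the NFS server as the backing storage. All names, the NFS server address, the export path, and the sizes are hypothetical placeholders, so replace them with your own values.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv1                    # hypothetical name; PVs have no namespace
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  storageClassName: ""             # empty: this PV is not tied to any StorageClass
  nfs:
    server: nfs.example.internal   # hypothetical NFS server address
    path: /exports/data            # hypothetical export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc1                   # hypothetical name
  namespace: core
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""             # empty: do not trigger dynamic provisioning
  volumeName: nfs-pv1              # ask for this specific PV
  resources:
    requests:
      storage: 100Gi
```

The volumeName field is optional. Without it, Kubernetes binds the claim to any available PV that satisfies the requested size and access modes.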
So, what we understand is that a PersistentVolume is not namespaced. It means that a PersistentVolume is available to the entire cluster. However, a PVC is namespaced. It is created inside a namespace. Thus, the Pod that requests the PVC and the PVC itself must be in the same namespace.
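As a sketch of how a Pod uses a claim, the manifest below mounts a PVC through the persistentVolumeClaim volume type. The claim name nfs-pvc1, the Pod and container names, and the /data mount path are hypothetical. The important part is that the Pod and the PVC sit in the same namespace.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-pod1                   # hypothetical name
  namespace: core                  # same namespace as the PVC
spec:
  containers:
    - name: pvc-container1         # hypothetical name
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /data         # hypothetical mount path
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: nfs-pvc1        # hypothetical PVC that exists in the core namespace
```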
In the PV, we also define the reclaim policy, which decides what happens to the volume after its claim is deleted. There are three policies.
Retain: When the PersistentVolumeClaim is deleted, the PersistentVolume still exists, and the volume is considered “released”. But it is not yet available for another claim because the previous claimant’s data remains on the volume. An administrator can manually reclaim the volume.
Delete: Deletion removes both the PersistentVolume object from Kubernetes and the associated storage asset in the external infrastructure.
Recycle: The Recycle reclaim policy is deprecated. Instead, the recommended approach is to use dynamic provisioning.
Next is capacity. Here we define how much storage capacity this volume provides.
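In a PV manifest, these two settings appear as the fields shown below. This is only a fragment of the PV spec, and the 100Gi value is an example that I picked.

```yaml
# Fragment of a PersistentVolume, not a complete manifest
spec:
  capacity:
    storage: 100Gi                          # example size
  persistentVolumeReclaimPolicy: Retain     # or Delete
```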
Let’s talk about the Dynamic Volume.
Generally, in offices, there are 2 different teams. One manages the infra part, which means physical servers, VMs, the Kubernetes cluster, backups, and so on.
The 2nd manages the application part, which means everything running inside the Kubernetes cluster, such as Deployments, Services, monitoring, and so on.
I am telling you this because the 2nd team, our deployment team, complains that every time they need a volume, they have to create 2 objects, the PV and the PVC, and then bind them to the Pod.
How can we minimize this? Kubernetes comes up with a solution called Dynamic Volume. Here, the first team, which manages the Kubernetes cluster, creates a StorageClass object in advance. Now the deployment team needs to create only the PVC and mention the storageClassName in the PVC configuration file, i.e. the YAML file.
In the backend, the PV is created automatically when we create the PVC.
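The two sides of this workflow can be sketched as below: the infra team creates the StorageClass once, and the deployment team creates only the PVC. The class name, claim name, size, and especially the provisioner value are hypothetical. A real cluster needs the provisioner name of an installed storage driver.

```yaml
# Created once by the infra team
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                                   # hypothetical class name
provisioner: example.com/hypothetical-provisioner  # placeholder, not a real driver
reclaimPolicy: Delete
---
# Created by the deployment team whenever they need a volume
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                                   # hypothetical claim name
  namespace: core
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd                       # refers to the StorageClass above
  resources:
    requests:
      storage: 10Gi                                # example size
```

The deployment team then mounts app-data in their Pod, and no PV has to be written by hand.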