Docker process virtualization

by | Apr 8, 2019 | Big Data, Docker

Docker is a lightweight framework for virtualizing application processes. Instead of emulating computer hardware, which would still need an operating system to run applications, Docker takes a different approach: it presents an operating system environment to a process. This makes it possible to run multiple processes on one machine and let each process believe it is running exclusively in its own environment.

Linux distributions such as CentOS, Ubuntu or SUSE differ in file system conventions, environment variables and the naming of tools. From the point of view of a process, these visible details are what distinguish one distribution from another. Docker can present such an environment to a process without emulating an entire operating system. It is therefore possible to run a web server on a CentOS host that believes it is running on Ubuntu. The trick is that Docker only provides the parts relevant to the process and, unlike a virtual machine, does not have to emulate an entire operating system. While a virtual machine needs a guest operating system that can take up to one GB of RAM and up to 20 GB of disk space, Docker requires considerably fewer hardware resources.
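To make "the details visible to a process" more concrete, here is a minimal sketch of how a process might find out which distribution it is running on. It assumes the distribution identifies itself in the usual /etc/os-release file with KEY=value lines. The two sample texts and the function name parse_os_release are made up for illustration and stand in for what a process could read inside an Ubuntu-based and a CentOS-based container.

```python
def parse_os_release(text):
    """Parse the KEY=value format used by /etc/os-release into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        info[key] = value.strip().strip("\"'")
    return info


# Hypothetical file contents, as seen from inside two different containers.
UBUNTU_VIEW = 'NAME="Ubuntu"\nID=ubuntu\nVERSION_ID="18.04"\n'
CENTOS_VIEW = 'NAME="CentOS Linux"\nID=centos\nVERSION_ID="7"\n'

ubuntu = parse_os_release(UBUNTU_VIEW)
centos = parse_os_release(CENTOS_VIEW)

# The same code reports a different distribution depending on the
# environment that is presented to the process.
print(ubuntu["ID"], centos["ID"])
```

The process only looks at files in its own environment, so it cannot tell from this check which distribution the host itself is running.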

Docker requires a host operating system. It is installed there as a software package and runs as a service. Processes run in containers, which are created from images. An image describes the environment of a process, and a container is a running instance of that image. With Docker, it is possible to run multiple instances of a web server on one machine and make each instance think it is listening on port 8080.
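As a rough sketch of the web server example, the following Python snippet builds the docker run commands for several instances of the same image. The image name example/webserver, the container names web0, web1, ... and the starting host port 8081 are assumptions for illustration. The snippet only prints the commands and does not call Docker.

```python
import shlex

CONTAINER_PORT = 8080  # the port every instance sees inside its container


def run_command(image, name, host_port):
    """Build a `docker run` argument list for one detached instance."""
    return [
        "docker", "run",
        "--detach",        # run the container in the background
        "--name", name,    # human-readable container name
        # Map a host port to the fixed port inside the container.
        "--publish", f"{host_port}:{CONTAINER_PORT}",
        image,
    ]


def commands_for_instances(image, count, first_host_port=8081):
    """One command per instance; only the name and host port differ."""
    return [
        run_command(image, f"web{i}", first_host_port + i)
        for i in range(count)
    ]


# "example/webserver" is a hypothetical image name.
commands = commands_for_instances("example/webserver", 3)
for cmd in commands:
    print(shlex.join(cmd))
```

Each printed command differs only in the container name and the host-side port. Inside its container, every web server still listens on port 8080.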

An executable program is packaged in a Docker image, together with everything it needs at run time, such as a runtime environment and resource files. The image is the binary distribution and remains unchanged regardless of the target environment in which Docker later runs it. The image itself is immutable; in Docker jargon, it becomes a container when it is executed. Any changes made by the process are recorded in the container, not in the image. This makes it possible to start the same image multiple times in one environment. On each start, Docker creates a new instance of the image and refers to it as a container with a unique ID. Starting the image again creates another container with its own unique ID.
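The relationship between an immutable image and its containers can be illustrated with a small conceptual model. This is not how Docker is implemented. The class names, the random hex IDs and the dictionary of "files" are simplifications chosen for this sketch.

```python
import uuid
from dataclasses import dataclass, field
from types import MappingProxyType


@dataclass(frozen=True)
class Image:
    """Immutable description of an environment."""
    name: str
    files: MappingProxyType  # read-only mapping of path -> content


@dataclass
class Container:
    """Running instance of an image with its own ID and its own changes."""
    image: Image
    # Illustrative random ID, not Docker's actual ID scheme.
    container_id: str = field(default_factory=lambda: uuid.uuid4().hex[:12])
    changes: dict = field(default_factory=dict)

    def write(self, path, content):
        # Changes are recorded in the container, never in the image.
        self.changes[path] = content

    def read(self, path):
        # Prefer the container's own changes, fall back to the image.
        if path in self.changes:
            return self.changes[path]
        return self.image.files[path]


def start(image):
    """Each start yields a new container with a fresh ID."""
    return Container(image=image)


image = Image("example/webserver", MappingProxyType({"/etc/motd": "hello"}))
first = start(image)
second = start(image)
first.write("/etc/motd", "changed")

print(first.read("/etc/motd"), second.read("/etc/motd"), image.files["/etc/motd"])
```

The write in the first container is invisible to the second container and leaves the image untouched, which is the property that allows one image to be started any number of times.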

When a Docker container is started, a defined environment is created for the process running in it, so that the process always finds an identical environment at startup. Files, environment settings and network socket assignments are always the same. For example, if a web server is started that wants to publish content on port 8080, that port is always available inside the container, even if the port is already occupied on the surrounding host system or other instances of the container are running.
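The case of an already occupied host port can be sketched from the host's side as follows. The snippet occupies an arbitrary local port to stand in for a busy host, asks the operating system for a different free port, and pairs it with the fixed container port 8080. The helper names free_host_port and publish_spec are invented for this sketch, and the host:container notation is the one used by the --publish option of docker run.

```python
import socket

CONTAINER_PORT = 8080  # fixed port as seen from inside the container


def free_host_port():
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]


def publish_spec(host_port):
    """Host-to-container mapping in host:container notation."""
    return f"{host_port}:{CONTAINER_PORT}"


# Simulate a host on which some port is already taken.
occupied = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
occupied.bind(("127.0.0.1", 0))
occupied.listen()
taken_port = occupied.getsockname()[1]

# Pick a different host port; the container side stays at 8080.
chosen_port = free_host_port()
spec = publish_spec(chosen_port)
occupied.close()

print(f"host port {taken_port} is busy, publishing {spec} instead")
```

Note that the chosen port is only known to be free at the moment of the check. Another process could claim it before a container is actually started.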