March 04, 2017

Resource Isolation

Docker Resource Isolation

Cgroups and Namespaces are features of the Linux kernel. They form the basis for lightweight process virtualization. Docker uses them to allow individual "containers" to run in an isolated environment within a single Linux kernel, avoiding the overhead of starting and maintaining virtual machines.
  • Cgroups - resource allocation - limit how much of a resource can be used
  • Cgroups (control groups) limit an application to a specific set of resources (CPU, memory, disk I/O, network, etc.), which makes it possible to share the available hardware resources among containers.
    • control group
      • A Resource Management and Resource Accounting/Tracking solution
      • Provides a generic process-grouping framework
    • A cgroup limits an application to a specific set of system resources
      • allows Docker to share available system resources among containers and to enforce limits and constraints
      • e.g. you can limit the memory available to a specific container.
    • The kernel provides access to multiple controllers (also called subsystems) through the cgroup interface; for example, the "memory" controller limits memory use, "cpuacct" accounts CPU usage, etc.
    • originally developed by Google (started in 2006 as 'process containers', renamed 'Control Groups' in 2007, and merged into kernel 2.6.24 in 2008)
    • governs the isolation and usage of system resources, such as CPU and memory, for a group of processes
    • can be manipulated by modifying files and directories in the /sys/fs/cgroup directory
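
The file-based interface described above can be sketched in Python. This is a minimal illustration, not Docker's implementation: the helper names (`create_cgroup`, `set_memory_limit`, `add_process`, `read_value`) are hypothetical, and the demo writes into a temporary directory that stands in for the real hierarchy. On a real system the root would be something like /sys/fs/cgroup/memory with cgroup v1 (limit file `memory.limit_in_bytes`) or /sys/fs/cgroup with cgroup v2 (limit file `memory.max`), and writing there requires root privileges.

```python
import os
import tempfile


def create_cgroup(root, name):
    """Create a cgroup by making a directory under the hierarchy root."""
    path = os.path.join(root, name)
    os.makedirs(path, exist_ok=True)
    return path


def set_memory_limit(cgroup_path, limit_bytes, limit_file="memory.limit_in_bytes"):
    """Write a memory limit (in bytes) to the cgroup's control file.

    The default file name is the cgroup v1 one; pass "memory.max" for v2.
    """
    with open(os.path.join(cgroup_path, limit_file), "w") as f:
        f.write(str(limit_bytes))


def add_process(cgroup_path, pid):
    """Place a process in the cgroup by writing its PID to cgroup.procs."""
    with open(os.path.join(cgroup_path, "cgroup.procs"), "w") as f:
        f.write(str(pid))


def read_value(cgroup_path, filename):
    """Read one control file back as a stripped string."""
    with open(os.path.join(cgroup_path, filename)) as f:
        return f.read().strip()


if __name__ == "__main__":
    # A temporary directory stands in for the cgroup filesystem (assumption
    # made so the sketch runs without root and without touching the host).
    with tempfile.TemporaryDirectory() as fake_root:
        cg = create_cgroup(fake_root, "demo")
        set_memory_limit(cg, 256 * 1024 * 1024)  # 256 MiB
        add_process(cg, os.getpid())
        print("limit:", read_value(cg, "memory.limit_in_bytes"))
        print("procs:", read_value(cg, "cgroup.procs"))
```

On a real cgroup filesystem the kernel creates the control files when the directory is created, so the same writes configure the limit instead of creating plain files.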
  • Namespaces - resource isolation - limit what a process can see
  • Namespaces allow an application to have its own view and control of shared system resources such as the network stack, process space, mount points, etc.
    • Provides processes with their own view of the system
    • allows groups of processes to be separated so they cannot “see” each other
    • originally developed by IBM
    • a process's namespaces are represented as entries under the /proc/<PID>/ns directory
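
As a rough sketch of how those /proc/<PID>/ns entries can be inspected: each entry is a symbolic link whose target has the form `type:[inode]`, and two processes are in the same namespace when the inode numbers match. The function names below (`list_namespaces`, `share_namespace`) and the `proc_root` parameter are assumptions made for illustration.

```python
import os
import re

# Link targets under /proc/<PID>/ns look like "uts:[4026531838]".
NS_LINK = re.compile(r"^(?P<kind>\w+):\[(?P<inode>\d+)\]$")


def list_namespaces(pid="self", proc_root="/proc"):
    """Return {entry name: namespace inode} for one process."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    result = {}
    for entry in sorted(os.listdir(ns_dir)):
        target = os.readlink(os.path.join(ns_dir, entry))
        match = NS_LINK.match(target)
        if match:
            result[entry] = int(match.group("inode"))
    return result


def share_namespace(ns_a, ns_b, name):
    """True when both processes are in the same namespace of this type."""
    return name in ns_a and ns_a[name] == ns_b.get(name)


if __name__ == "__main__":
    # Only meaningful on Linux, where /proc/self/ns exists.
    if os.path.isdir("/proc/self/ns"):
        for name, inode in list_namespaces().items():
            print(f"{name:20} {inode}")
```

Comparing the result for a containerized process with the result for a host process shows which namespaces the container does not share with the host.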

    • Six Linux Namespaces (kernel 4.6 later added a seventh, the cgroup namespace, which is not covered here):
      Namespace from Kernel Description
      UTS 2.6.19 Unix Timesharing System - Isolates the hostname and domain name, the identifiers reported by commands like uname or hostname.
      The UTS namespace isolates two specific identifiers of the system: nodename and domainname. A UTS namespace, for example, allows the hostname to be changed inside a container without affecting the host.
      IPC 2.6.19 InterProcess Communication - Manages access to IPC resources (queues, semaphores, and shared memory) so that processes or groups of processes can have their own IPC resources.
      Isolating a process with an IPC namespace gives it its own interprocess communication resources, for example, System V IPC objects and POSIX message queues.
      PID 2.6.24 Process ID - Process isolation
      Historically, the Linux kernel maintained a single contiguous process tree where every process exhibited a parent-child relationship. With the introduction of Linux namespaces, it's possible to have multiple distinct process trees where each process tree can have an entirely isolated set of processes.
      For example, before namespaces, init was the first process started on a Linux system: it was the root of the process tree and had a process ID of 1. Every other process running in the OS fell somewhere in the hierarchy under init. With namespaces, there can be multiple process hierarchies, each with its own PID 1 process, and each isolated from the others.
      With a PID namespace, a new tree with its own PID 1 process can be spun off the parent tree. The process that does this remains in the parent namespace, in the original tree, and its child becomes the root of the new process tree. Processes in the child namespace have no visibility into the parent namespace. However, processes in the parent namespace have a complete view of the processes in the child namespace.
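
The visibility rule above can be illustrated with a toy model in Python. This is not kernel code and makes no system calls; `PidNamespace`, `Process`, and `visible_names` are hypothetical names. The one real behavior it mirrors is that a process receives a PID in its own namespace and in every ancestor namespace, which is why the parent can see it while it cannot see the parent's processes.

```python
class PidNamespace:
    """Toy PID namespace: hands out PIDs and remembers its parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.next_pid = 1
        self.processes = {}  # PID in this namespace -> Process

    def ancestors_and_self(self):
        """Yield this namespace, then its parent, up to the root."""
        ns = self
        while ns is not None:
            yield ns
            ns = ns.parent


class Process:
    """Toy process: gets one PID per namespace from its own up to the root."""

    def __init__(self, name, namespace):
        self.name = name
        self.namespace = namespace
        self.pids = {}  # PidNamespace -> PID as seen from that namespace
        for ns in namespace.ancestors_and_self():
            pid = ns.next_pid
            ns.next_pid += 1
            ns.processes[pid] = self
            self.pids[ns] = pid


def visible_names(namespace):
    """Names of the processes visible from inside a namespace, by PID."""
    return [proc.name for _, proc in sorted(namespace.processes.items())]


if __name__ == "__main__":
    root = PidNamespace()
    Process("init", root)
    Process("shell", root)

    child = PidNamespace(parent=root)
    container_init = Process("container-init", child)
    Process("worker", child)

    print("seen from parent:", visible_names(root))
    print("seen from child: ", visible_names(child))
    print("container-init PID in child:", container_init.pids[child])
    print("container-init PID in parent:", container_init.pids[root])
```

In the demo the container's first process is PID 1 inside the child namespace while also holding an ordinary PID in the parent, and the child's process list contains none of the parent's processes.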
      MNT 2.4.19 Mount - Manage filesystem mount points - processes can have their own root FS, conceptually close to chroot.
      Linux maintains a data structure for all the mount points on the system. With a mount namespace, this data structure can be cloned, and processes in different namespaces can change their mount points without affecting each other. A mount namespace, for example, allows creating a different file system layout or making certain mount points read-only.
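
Because each mount namespace has its own mount table, the table a process sees can be read from /proc/<PID>/mounts. The sketch below parses that fstab-like format (device, mount point, filesystem type, options) and picks out the read-only mount points. The function names and the sample text are made up for illustration; they are not taken from a real container.

```python
def parse_mounts(text):
    """Parse the content of /proc/<PID>/mounts into a list of dicts."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        mounts.append({
            "device": device,
            "mount_point": mount_point,
            "fs_type": fs_type,
            "options": options.split(","),
        })
    return mounts


def read_only_mount_points(mounts):
    """Mount points whose option list contains the 'ro' flag."""
    return [m["mount_point"] for m in mounts if "ro" in m["options"]]


# Illustrative mount table, shaped like what a container might see.
SAMPLE = """\
overlay / overlay rw,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
"""

if __name__ == "__main__":
    table = parse_mounts(SAMPLE)
    print([m["mount_point"] for m in table])
    print(read_only_mount_points(table))
```

Running `parse_mounts` on the real file for a host process and for a process in another mount namespace would show two different tables for the same machine.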
      NET 2.6.24 Networking - Manage network interfaces; IP, routes, devices, etc. - provides a logical copy of the network stack, with its own routing tables, firewall rules and network devices
      A network namespace gives its processes their own network devices (physical or virtual), iptables firewall rules, routing tables, etc. Processes see an entirely different set of network interfaces depending on the network namespace they belong to.
      Network namespaces can be connected to each other using a pair of "veth" virtual Ethernet devices, with one end placed in each namespace.
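
One way to wire two network namespaces together with a veth pair is through the iproute2 `ip` tool. The Python sketch below only builds the command lines and prints them by default; actually running them requires root and iproute2. The namespace names, interface names, and addresses are placeholders, and `veth_pair_commands` and `apply` are hypothetical helpers.

```python
import subprocess


def veth_pair_commands(ns_a, ns_b, if_a="veth0", if_b="veth1",
                       addr_a="10.0.0.1/24", addr_b="10.0.0.2/24"):
    """Build the ip(8) commands that link two namespaces with a veth pair."""
    return [
        # Create the two named network namespaces.
        ["ip", "netns", "add", ns_a],
        ["ip", "netns", "add", ns_b],
        # Create the veth pair: two interfaces joined like a virtual cable.
        ["ip", "link", "add", if_a, "type", "veth", "peer", "name", if_b],
        # Move one end into each namespace.
        ["ip", "link", "set", if_a, "netns", ns_a],
        ["ip", "link", "set", if_b, "netns", ns_b],
        # Assign addresses and bring both ends up inside their namespaces.
        ["ip", "netns", "exec", ns_a, "ip", "addr", "add", addr_a, "dev", if_a],
        ["ip", "netns", "exec", ns_b, "ip", "addr", "add", addr_b, "dev", if_b],
        ["ip", "netns", "exec", ns_a, "ip", "link", "set", if_a, "up"],
        ["ip", "netns", "exec", ns_b, "ip", "link", "set", if_b, "up"],
    ]


def apply(commands, dry_run=True):
    """Print the commands, or run them when dry_run is False (needs root)."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    apply(veth_pair_commands("ns1", "ns2"))
```

Keeping the dry run as the default means the sketch can be read and executed safely on any machine before anything is changed on the host.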
      USER 3.8 User and group IDs (UID, GID),…
      User namespace isolates the user IDs between namespaces.
      The user namespace allows a process to have root privileges within the namespace without giving it those privileges outside of the namespace.
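
The ID translation behind this can be sketched from the format of /proc/<PID>/uid_map, where each line holds three numbers: the first ID inside the namespace, the ID it maps to outside (normally in the parent user namespace), and the length of the range. The function names and the sample mapping below are assumptions chosen for illustration.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(field) for field in fields)
        entries.append((inside, outside, length))
    return entries


def to_outside_id(entries, inside_id):
    """Translate an ID inside the namespace to the ID seen outside it.

    Returns None when the ID is not covered by any mapping.
    """
    for inside, outside, length in entries:
        if inside <= inside_id < inside + length:
            return outside + (inside_id - inside)
    return None


# Illustrative mapping: IDs 0..65535 inside map to 100000..165535 outside.
SAMPLE_UID_MAP = "0 100000 65536\n"

if __name__ == "__main__":
    mapping = parse_id_map(SAMPLE_UID_MAP)
    print("root inside maps to:", to_outside_id(mapping, 0))
    print("uid 1000 inside maps to:", to_outside_id(mapping, 1000))
    print("uid 70000 inside maps to:", to_outside_id(mapping, 70000))
```

With such a mapping, UID 0 inside the namespace is an unprivileged UID outside it, which is how a process can be root in its container without being root on the host.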

      Other Isolation Technologies:
      • Hardware virtualization technologies such as Intel VT-x and AMD-V give the processor the ability to divide and isolate its computing capacity among multiple guest virtual machines and their operating systems.
      • RunC

