From 41beacd9259babc58c37c206122f755b3ae66c55 Mon Sep 17 00:00:00 2001
From: "Daniel P. Berrange"
Date: Tue, 14 May 2013 14:36:09 +0100
Subject: [PATCH] Expand documentation for LXC driver

Update the LXC driver documentation to describe the way containers
are setup by default. Also describe the common virsh commands for
managing containers and a little about the security.

Placeholders for docs about configuring containers still to be
filled in.

Signed-off-by: Daniel P. Berrange
---
 docs/drvlxc.html.in | 417 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 374 insertions(+), 43 deletions(-)

diff --git a/docs/drvlxc.html.in b/docs/drvlxc.html.in
index beff214e92..d5b003e5b8 100644
--- a/docs/drvlxc.html.in
+++ b/docs/drvlxc.html.in
@@ -3,49 +3,100 @@

LXC container driver

+ + +

-The libvirt LXC driver manages "Linux Containers". Containers are sets of processes -with private namespaces which can (but don't always) look like separate machines, but -do not have their own OS. Here are two example configurations. The first is a very -light-weight "application container" which does not have its own root image. +The libvirt LXC driver manages "Linux Containers". At their simplest, containers +can just be thought of as a collection of processes, separated from the main +host processes via a set of resource namespaces and constrained via control +groups resource tunables. The libvirt LXC driver has no dependency on the LXC +userspace tools hosted on sourceforge.net. It directly utilizes the relevant +kernel features to build the container environment. This allows for sharing +of many libvirt technologies across both the QEMU/KVM and LXC drivers, in +particular sVirt for mandatory access control, auditing of operations, and +integration with control groups, among many other features.

-

Project Links

- - - -

Cgroups Requirements

+

Control groups Requirements

-The libvirt LXC driver requires that certain cgroups controllers are -mounted on the host OS. The minimum required controllers are 'cpuacct', -'memory' and 'devices', while recommended extra controllers are -'cpu', 'freezer' and 'blkio'. The /etc/cgconfig.conf & cgconfig -init service used to mount cgroups at host boot time. To manually -mount them use: +In order to control the resource usage of processes inside containers, the +libvirt LXC driver requires that certain cgroups controllers are mounted on +the host OS. The minimum required controllers are 'cpuacct', 'memory' and +'devices', while recommended extra controllers are 'cpu', 'freezer' and +'blkio'. Libvirt will not mount the cgroups filesystem itself, leaving +this up to the init system to take care of. Systemd will do the right thing +in this respect, while for other init systems the cgconfig +init service will be required. For further information, consult the general +libvirt cgroups documentation. +
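Whether the required controllers are available, and where the init system has mounted the cgroup hierarchies, can be checked from a host shell; a quick sketch using standard Linux proc files (the exact mount layout depends on the init system):

```shell
# List the controllers the running kernel knows about; the final
# column shows whether each controller is currently enabled
cat /proc/cgroups

# Show where the cgroup hierarchies are mounted (if at all)
grep cgroup /proc/mounts
```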

+ +

Namespace requirements

+ +

+In order to separate processes inside a container from those in the +primary "host" OS environment, the libvirt LXC driver requires that +certain kernel namespaces are compiled in. Libvirt currently requires +the 'mount', 'ipc', 'pid', and 'uts' namespaces to be available. If +separate network interfaces are desired, then the 'net' namespace is +required. In the near future, the 'user' namespace will optionally be +supported. +
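A quick host-side way to see which namespace types the running kernel supports is to list the namespace links of any process (this is purely a diagnostic, not something libvirt requires):

```shell
# Each entry corresponds to a namespace type supported by the running
# kernel; libvirt LXC needs at least mnt, ipc, pid and uts, plus net
# for separate network interfaces
ls -l /proc/self/ns
```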

+ +

+NOTE: In the absence of support for the 'user' namespace, +processes inside containers cannot be securely isolated from host +processes without the use of a mandatory access control technology +such as SELinux or AppArmor. +

+ +

Default container setup

+ +

Command line arguments

+ +

+When the container "init" process is started, it will typically +not be given any command line arguments (e.g. the equivalent of +the bootloader args visible in /proc/cmdline). If +any arguments are desired, they must be explicitly set in the +container XML configuration via one or more initarg +elements. For example, to run systemd --unit emergency.service, +use the following XML

- # mount -t cgroup cgroup /dev/cgroup -o cpuacct,memory,devices,cpu,freezer,blkio
+  <os>
+    <type arch='x86_64'>exe</type>
+    <init>/bin/systemd</init>
+    <initarg>--unit</initarg>
+    <initarg>emergency.service</initarg>
+  </os>
 
-

-NB, the blkio controller in some kernels will not allow creation of nested -sub-directories which will prevent correct operation of the libvirt LXC -driver. On such kernels, it may be necessary to unmount the blkio controller. -

- - -

Environment setup for the container init

+

Environment variables

When the container "init" process is started, it will be given several useful -environment variables. +environment variables. The following standard environment variables are mandated +by the systemd container interface +to be provided by all container technologies on Linux. +

+ +
+
container
+
The fixed string libvirt-lxc to identify libvirt as the creator
+
container_uuid
+
The UUID assigned to the container by libvirt
+
PATH
+
The fixed string /bin:/usr/bin
+
TERM
+
The fixed string linux
+
+ +

+In addition to the standard variables, the following libvirt-specific +environment variables are also provided

@@ -54,9 +105,152 @@ environment variables.
LIBVIRT_LXC_UUID
The UUID assigned to the container by libvirt
LIBVIRT_LXC_CMDLINE
-
The unparsed command line arguments specified in the container configuration
+
The unparsed command line arguments specified in the container configuration. +Use of this is discouraged, in favour of passing arguments directly to the +container init process via the initarg config element.
+
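As an illustration (not part of the configuration itself), a container init script could consume these variables as sketched below; the variable names are the ones listed above, the script is hypothetical:

```shell
#!/bin/sh
# Hypothetical container init: report the identity information that
# libvirt placed in the environment. When run outside a container
# these variables are simply unset.
echo "container technology: ${container:-unset}"
echo "container UUID:       ${container_uuid:-unset}"
echo "libvirt domain UUID:  ${LIBVIRT_LXC_UUID:-unset}"
```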

Filesystem mounts

+ +

+In the absence of any explicit configuration, the container will +inherit the host OS filesystem mounts. A number of mount points will +be made read-only, or re-mounted with new instances to provide +container-specific data. The following special mounts are set up +by libvirt +

+ + + + +
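By contrast, an explicit root filesystem for the container can be configured with a filesystem element in the domain XML; a minimal sketch, where the source path is a placeholder:

```xml
<filesystem type='mount'>
  <source dir='/var/lib/libvirt/lxc/myguest/root'/>
  <target dir='/'/>
</filesystem>
```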

Device nodes

+ +

+The container init process will be started with the CAP_MKNOD +capability removed and blocked from re-acquiring it. As such it will +not be able to create any device nodes in /dev or anywhere +else in its filesystems. Libvirt itself will take care of pre-populating +the /dev filesystem with any devices that the container +is authorized to use. The current devices that will be made available +to all containers are +

+ + + +

+In addition, for every console defined in the guest configuration, +a symlink will be created from /dev/ttyN to +the corresponding /dev/pts/M pseudo TTY device. The +first console will be /dev/tty1, with further consoles +numbered incrementally from there. +

+ +

+Further block or character devices will be made available to containers +depending on their configuration. +

+ + + +

Container security

+ +

sVirt SELinux

+ +

+In the absence of the "user" namespace, containers cannot +be considered secure against exploits of the host OS. The sVirt SELinux +driver provides a way to secure containers even when the "user" namespace +is not used. The cost is that writing a policy to allow execution of +an arbitrary OS is not practical. The SELinux sVirt policy is typically +tailored to work with a simpler application confinement use case, +as provided by the "libvirt-sandbox" project. +

+ +

Auditing

+ +

+The LXC driver is integrated with libvirt's auditing subsystem, which +causes audit messages to be logged whenever there is an operation +performed against a container which has an impact on host resources. +So, for example, start/stop and device hotplug operations will all log audit +messages providing details about what action occurred and any resources +associated with it. There are three types of audit messages +

+ + + +

Device access

+ +

+All containers are launched with the CAP_MKNOD capability cleared +and removed from the bounding set. Libvirt will ensure that the +/dev filesystem is pre-populated with all devices that a container +is allowed to use. In addition, the cgroup "devices" controller is +configured to block read/write/mknod from all devices except those +that a container is authorized to use. +
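The whitelist entries used by the "devices" cgroup controller are keyed on major:minor device numbers, which can be read straight off the host's device nodes; a host-side illustration, not a libvirt command:

```shell
# The "major, minor" pair printed by ls (e.g. "1, 3" for /dev/null)
# is what the devices controller whitelist entries refer to
ls -lL /dev/null /dev/zero /dev/urandom
```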

+ +

Example configurations

Example config version 1

@@ -121,21 +315,158 @@ debootstrap, whatever) under /opt/vm-1-root: </domain> + +

Container usage / management

+

-In both cases, you can define and start a container using:

-
-virsh --connect lxc:/// define v1.xml
-virsh --connect lxc:/// start vm1
-
-and then get a console using: -
-virsh --connect lxc:/// console vm1
-
-

Now doing 'ps -ef' will only show processes in the container, for -instance. You can undefine it using +As with any libvirt virtualization driver, LXC containers can be +managed via a wide variety of libvirt-based tools. At the lowest +level the virsh command can be used to perform many +tasks, by passing the -c lxc:/// argument. As an +alternative to repeating the URI with every command, the LIBVIRT_DEFAULT_URI +environment variable can be set to lxc:///. The +examples that follow outline some common operations with virsh +and LXC. For further details about the usage of virsh, consult its +manual page.
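For example, the default URI can be exported once per shell session; the virsh invocation is shown commented since it requires a libvirtd with the LXC driver to actually respond:

```shell
# Set the default URI once, instead of passing '-c lxc:///' to every
# virsh invocation in this shell session
export LIBVIRT_DEFAULT_URI=lxc:///
echo "virsh will now connect to: $LIBVIRT_DEFAULT_URI"
# virsh list --all    # equivalent to: virsh -c lxc:/// list --all
```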

+ +

Defining (saving) container configuration

+ +

+The virsh define command takes an XML configuration +document and loads it into libvirt, saving the configuration on disk +

+
-virsh --connect lxc:/// undefine vm1
+# virsh -c lxc:/// define myguest.xml
 
+ +

Viewing container configuration

+ +

+The virsh dumpxml command can be used to view the +current XML configuration of a container. By default the XML +output reflects the current state of the container. If the +container is running, it is possible to explicitly request the +persistent configuration instead of the current live configuration, +using the --inactive flag +

+ +
+# virsh -c lxc:/// dumpxml myguest
+
+ +

Starting containers

+ +

+The virsh start command can be used to start a +container from a previously defined persistent configuration +

+ +
+# virsh -c lxc:/// start myguest
+
+ +

+It is also possible to start so-called "transient" containers, +which do not require a persistent configuration to be saved +by libvirt, using the virsh create command. +

+ +
+# virsh -c lxc:/// create myguest.xml
+
+ + +

Stopping containers

+ +

+The virsh shutdown command can be used +to request a graceful shutdown of the container. By default +this command will first attempt to send a message to the +init process via the /dev/initctl device node. +If no such device node exists, then it will send SIGTERM +to PID 1 inside the container. +

+ +
+# virsh -c lxc:/// shutdown myguest
+
+ +

+If the container does not respond to the graceful shutdown +request, it can be forcibly stopped using the virsh destroy +command

+ +
+# virsh -c lxc:/// destroy myguest
+
+ + +

Rebooting a container

+ +

+The virsh reboot command can be used +to request a graceful reboot of the container. By default +this command will first attempt to send a message to the +init process via the /dev/initctl device node. +If no such device node exists, then it will send SIGHUP +to PID 1 inside the container. +

+ +
+# virsh -c lxc:/// reboot myguest
+
+ +

Undefining (deleting) a container configuration

+ +

+The virsh undefine command can be used to delete the +persistent configuration of a container. If the guest is currently +running, this will turn it into a "transient" guest. +

+ +
+# virsh -c lxc:/// undefine myguest
+
+ +

Connecting to a container console

+ +

+The virsh console command can be used to connect +to the text console associated with a container. If the container +has been configured with multiple console devices, then the +--devname argument can be used to choose the +console to connect to +

+ +
+# virsh -c lxc:/// console myguest
+
+ +

Running commands in a container

+ +

+The virsh lxc-enter-namespace command can be used +to enter the namespaces and security context of a container +and then execute an arbitrary command. +

+ +
+# virsh -c lxc:/// lxc-enter-namespace myguest -- /bin/ls -al /dev
+
+ +

Monitoring container utilization

+ +

+The virt-top command can be used to monitor the +activity and resource utilization of all containers on a +host +

+ +
+# virt-top -c lxc:///
+
+