The Transactional Update Guide

Thorsten Kukuk

<kukuk@thkukuk.de>

Ignaz Forster

<iforster@suse.com>

Version 0.4, 28. September 2021

Abstract

This is the documentation for transactional-update and is intended for users,
administrators and packagers.

It describes how transactional-update with Btrfs works by giving an overview of
the design, what an administrator needs to know about setting up and operating
such a system and what a packager needs to know for creating compatible
packages.

For specific usage see the transactional-update man page or the list of Kubic
related commands.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

1. Introduction
    1.1. Description
    1.2. Definition
    1.3. Use Cases
2. Components
    2.1. libtukit.so
    2.2. tukit
    2.3. D-Bus Bindings
    2.4. transactional-update
3. High Level Concept
    3.1. The root file system
    3.2. Updating the correct snapshot
    3.3. Workflow
    3.4. Simplified workflow
4. System setup
    4.1. Read-only file system
    4.2. /var
    4.3. /etc
5. Files
6. Porting to other systems
7. Author/acknowledgments
8. Copyright information for this document

Chapter 1. Introduction

1.1. Description

transactional-update is an application that allows updating a Linux system and
its applications in an atomic way: The update is performed in the background,
without influencing the currently running system, and is activated by a
reboot. This concept is similar to rpm-ostree or CoreOS' previous Container
OS. However, transactional-update is not another package manager; it reuses
existing system tools such as RPM as the packaging format and zypper as the
package manager. It depends on Btrfs due to its snapshotting and copy-on-write
features.

The reason for building upon existing tools is the ability to continue using
existing packages and tool chains for delivery and application of updates.
While currently only implemented for (open)SUSE environments, the concept is
vendor independent and may also be implemented for other package managers,
package formats and file systems. It consists of the (open)SUSE specific
transactional-update script and the generic tukit library.

Conceptually, transactional-update creates a new Btrfs snapshot before
performing any update and uses that snapshot for modifications. Since Btrfs
snapshots contain only the difference between two versions and thus are usually
very small, updates done with transactional-update are very space efficient.
This also means several snapshots can be installed at the same time without a
problem.

1.2. Definition

A transactional update (also known as atomic upgrade) is an update that

  • is atomic:

      □ The update does not influence the running system.

      □ The machine can be powered off at any time. When powered on again
        either the unmodified old state or the new state is active. It is not
        possible to have a running system in an intermediate state.

  • can be rolled back:

      □ If the upgrade fails or if a newer software version turns out to not be
        compatible with your infrastructure, the system can quickly be restored
        to a previous state.

1.3. Use Cases

As Linux distributions evolve, new concepts are emerging, such as rolling
releases, containers, embedded systems or long-term support releases. While the
classical update mechanisms are probably perfectly fine for a regular desktop
user or a conventional server system, the following example use cases may give
an indication why an even more error-proof system may be desirable:

Distributions with rolling updates face the problem: how should intrusive
updates be applied to a running system - without breaking the update mechanism
itself? Examples like the migration from SysV init to systemd, a major version
update of a desktop environment while the desktop is still running or even only
a small update to D-Bus may give a good idea of the problem. The desktop
environment may simply terminate, killing the update process and leaving the
system in a broken, undefined state. If any update breaks such a system there
needs to be a quick way to roll back the system to the last working state.

On mission critical systems or embedded systems one will usually want to make
sure that no service or user behaviour interferes with the update of the
system. Moreover the update should not modify the system, e.g. by uncontrolled
restarts of services or unexpected modifications to the system in post scripts.
Potential interruptions are deferred to a defined maintenance window instead.
For really critical systems the update can be verified (e.g. using snapper diff)
or discarded before actually booting into the new system. If an update
encounters an error the new snapshot will be discarded automatically.

For cluster nodes it is important that the system is always in a consistent
state, requires no manual interaction and is able to recover itself from error
conditions. For these systems transactional-update provides automatic updates;
snapshots with failed updates will be automatically removed. Automatic reboots
can be triggered using a variety of different reboot methods (e.g. rebootmgr,
notify, kured or systemd), making the application of the updates cluster aware.

To summarize: The update should only be applied if there were no errors during
the update. If it turns out that the update is causing errors (e.g. because of
a new kernel version incompatible with the hardware) there should be a quick
and easy way to roll back to the state before the update was applied.

Chapter 2. Components

transactional-update is split into two parts: the (open)SUSE specific 
transactional-update shell script, and the generic tukit (including libtukit,
the tukit command line application and the D-Bus bindings).

2.1. libtukit.so

libtukit is a C++ library implementing the core functionality of 
transactional-update. It is responsible for snapshot management (including /etc
handling, see Section 4.3, “/etc”), preparing the environment and executing the
command to run within the update environment.

The library is designed to be a general purpose library for handling
transactional systems. It provides methods to create, modify and close
transactions as well as execute commands within a transaction. Currently 
snapper is the only implemented snapshot management option.

Applications such as package managers are expected to use this library to
support transactional systems easily. DNF, for example, supports transactional
systems via the libdnf-plugin-txnupd plugin.

The library also provides C bindings with the same functionality as the C++
library.

2.2. tukit

tukit is a utility application to call libtukit functionality from the command
line. Applications which do not support libtukit directly may use this
application as a wrapper. This command is not intended to be called by the user
directly for now, as it does not yet perform maintenance tasks such as marking
a snapshot for deletion.

2.3. D-Bus Bindings

The libtukit functionality is also available via the D-Bus interface
org.opensuse.tukit. Commands are executed asynchronously; a signal is emitted
when the command execution has finished.

2.4. transactional-update

This shell script is an (open)SUSE specific wrapper for handling the tasks
typical on a transactional system, e.g. installing packages, updating the
system or updating the bootloader. To do so it uses the tukit wrapper to call
applications such as zypper for package management.

Chapter 3. High Level Concept

3.1. The root file system

This chapter describes the handling of the root file system. In general
transactional-update will not modify any other subvolumes or file systems:
Information stored on these mounts (such as /var or /home) is usually not
supposed to be rolled back. See Chapter 4, System setup for a real world setup.
There are exceptions for a few dedicated subvolumes such as
/boot/grub2/x86_64-efi, which have to be available in order to update the
bootloader; these directories will be bind mounted into the update environment.

transactional-update is based around several concepts of the Btrfs file system,
a general purpose Copy-on-Write (CoW) file system with snapshot and subvolume
support, though support for other file systems is possible (see Chapter 6,
Porting to other systems for requirements and porting information).

3.2. Updating the correct snapshot

transactional-update wraps all binaries which will modify the file system with
tukit, which in turn uses chroot to execute the command in the new snapshot.
That way zypper, for example, will install packages into the new snapshot only.

3.3. Workflow

[Figure: List of snapshots]

In the beginning there is a list of old snapshots, each one based on the
previous one, and the newest one is the current root file system.

[Figure: List of snapshots with a read-write clone of the current root file system]

In the first step, a new snapshot of the current root file system will be
created. This snapshot is set read-write.

[Figure: List of snapshots with a read-write clone of the current root file
system, which will be updated with zypper]

In the second step the snapshot will be updated. This can be zypper up or 
zypper dup, the installation or removal of a package or any other modification
to the root file system.

[Figure: List of snapshots with the clone set read-only]

In the third step the snapshot will be changed back to read-only, so that the
data cannot be modified anymore.

[Figure: List of snapshots with the read-only clone set as the new default]

The last step is to mark the updated snapshot as the new root file system. This
is the atomic step: If power had been lost before this point, the unchanged old
system would have booted. From now on the new, updated system will boot.

[Figure: List of snapshots with the current root file system as newest at the end]

After reboot, the newly prepared snapshot is the new root file system. In case
anything goes wrong a rollback to any of the older snapshots can be performed.

[Figure: List of snapshots with a read-write clone of the current root file
system, which will be updated with zypper]

If the system is not rebooted and transactional-update is called again, a new
snapshot will be created and updated. This new snapshot is again based on the
currently running root file system, not on the new default snapshot! To stack
changes (i.e. to combine several commands in one single snapshot), the shell
command can be used to perform any number of operations. Alternatively, the
--continue option can be used to continue from the latest snapshot while
preserving a separate snapshot for each step.
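
A minimal sketch of stacking with --continue: the helper below only prints the
command lines, it does not run them. The function name stacked_updates and the
package names are hypothetical and chosen for illustration; pkg install is the
regular transactional-update command for installing packages.

```shell
# stacked_updates "CMD1" "CMD2" ...
# Print one transactional-update call per argument. The first call starts
# from the currently running system; every later call gets --continue so
# that it builds on the snapshot created by the previous call.
stacked_updates() {
    local opt="" cmd
    for cmd in "$@"; do
        printf 'transactional-update %s%s\n' "$opt" "$cmd"
        opt="--continue "
    done
}
```

For example, stacked_updates "pkg install vim" "pkg install git" prints two
command lines, the second one with --continue. Omitting --continue from the
second call would base it on the running system again, so the first package
would be missing from the resulting snapshot.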

3.4. Simplified workflow

In essence the logic of transactional-update can be summarized as follows:

  • SNAPSHOT_ID=`snapper create --read-write -p -d "Snapshot Update"`


  • zypper -R ${SNAPSHOT_DIR} up|patch|dup|...


  • btrfs property set ${SNAPSHOT_DIR} ro true


  • btrfs subvol set-default ${SNAPSHOT_DIR}


  • systemctl reboot
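
The list above sets SNAPSHOT_ID but uses SNAPSHOT_DIR. The following sketch
shows one way the two could be connected, assuming the snapper layout from
Example 4.1, where snapshot N lives in /.snapshots/N/snapshot. The helper names
run, snapshot_dir and update_snapshot and the RUN switch are hypothetical, and
commands are only printed unless RUN=1 is set. This illustrates the logic and
is not the actual transactional-update implementation.

```shell
# run COMMAND...: print the command; execute it only when RUN=1 is set.
run() {
    printf '%s\n' "$*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

# snapshot_dir ID: path of a snapper snapshot (assumed layout, see Example 4.1).
snapshot_dir() {
    printf '/.snapshots/%s/snapshot\n' "$1"
}

# update_snapshot ID ZYPPER_COMMAND...
# Update an existing read-write snapshot, seal it, make it the default
# subvolume and reboot. Each step runs only if the previous one succeeded.
update_snapshot() {
    local dir
    dir=$(snapshot_dir "$1") || return 1
    shift
    run zypper -R "$dir" "$@" &&
    run btrfs property set "$dir" ro true &&
    run btrfs subvolume set-default "$dir" &&
    run systemctl reboot
}
```

With the snapshot number returned by the snapper create call in the first step,
update_snapshot "$SNAPSHOT_ID" up prints the remaining four commands in order.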


Chapter 4. System setup

4.1. Read-only file system

transactional-update is typically used on a read-only root file system, even
though it also supports regular read-write systems.

4.2. /var

On a system with snapshot support /var should not be part of the root file
system; otherwise a rollback to a previous state would also roll back the /var
contents. On a read-only system this directory has to be writable in any case,
because variable data is stored in it.

Due to the volatile nature of /var the directory will not be mounted into the
new snapshot during the transactional-update run, as this would break
atomicity: The currently running system depends on the old state of the data
(imagine a database migration was triggered by a package). Any modifications to
/var therefore have to be made in the new system itself, i.e. modifying the
contents of /var as part of the packaging scripts is not allowed.

The only exception to this rule is directories: Those will be recreated during
the first boot into the updated system by the create-dirs-from-rpmdb.service
helper service. For all other cases please use one of the options described in
Packaging for transactional-updates and Migration / Upgrade in the Packaging
guidelines. If a package breaks this rule by installing files into a directory
which is not part of the root file system, a warning message indicating the
affected file is printed at the end of the transactional-update run.
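
Packagers may want to check a package for such files before shipping it. The
following is a rough sketch, not the check transactional-update itself
performs: the function name var_files_in_package is hypothetical, and only /var
is checked, although other mounts outside the root file system are affected in
the same way.

```shell
# var_files_in_package
# Read "<mode> <path>" lines on stdin (mode in ls style, e.g. drwxr-xr-x)
# and print every path below /var that is not a directory.
var_files_in_package() {
    awk '{
        path = substr($0, index($0, " ") + 1)   # everything after the mode column
        if ($1 !~ /^d/ && path ~ /^\/var\//) print path
    }'
}
```

Suitable input can be produced with an rpm query such as
rpm -q --qf '[%{FILEMODES:perms} %{FILENAMES}\n]' PACKAGE, where PACKAGE is a
placeholder for the package name. Directories are skipped because they are
recreated on first boot as described above.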

4.3. /etc

transactional-update also supports write operations to /etc on an otherwise
read-only file system. To do so, /etc is created as a nested Btrfs subvolume of
the root file system.

Note

In versions prior to 5.0.0 /etc was handled via an overlay file system.
Existing overlayfs based systems are converted automatically.

Sometimes files in /etc are modified after the snapshot has been taken, but
before the system is rebooted, e.g. if configuration management software is
still changing files. These files will be copied to the new system on boot, but
only if the file hasn't been modified in the new system as well.

To track the changes of both the old and the new snapshot, a temporary third
nested reference snapshot of /etc called etc.syncpoint is created in the new
snapshot's /etc directory. This snapshot will be deleted again after
synchronization. Due to the copy-on-write nature of Btrfs snapshots this
consumes only minimal space and time.

Example 4.1. Snapshot layout

        > btrfs subvolume list / -uq
        ID 268 gen 39 top level 257 parent_uuid 256c1975-3b58-9540-b43e-788e32cb6d55 uuid 06388703-67ae-804f-92b0-0bf486df0a7e path @/.snapshots/2/snapshot
        ID 269 gen 55 top level 257 parent_uuid 06388703-67ae-804f-92b0-0bf486df0a7e uuid d3d13005-95b6-1849-a28e-ba250b465c19 path @/.snapshots/3/snapshot
        ID 270 gen 68 top level 269 parent_uuid -                                   uuid f77f0c38-1b32-7744-b3fe-bc515099556f path @/.snapshots/3/snapshot/etc
        ID 275 gen 61 top level 257 parent_uuid d3d13005-95b6-1849-a28e-ba250b465c19 uuid a279eb7b-11b0-4749-a9fd-1f0ad7d65f30 path @/.snapshots/4/snapshot
        ID 276 gen 62 top level 275 parent_uuid f77f0c38-1b32-7744-b3fe-bc515099556f uuid 3e4d7e76-2207-6540-a54d-51f92ce9299a path @/.snapshots/4/snapshot/etc
        ID 277 gen 62 top level 276 parent_uuid 3e4d7e76-2207-6540-a54d-51f92ce9299a uuid c8d8ec58-eccd-7a4d-8818-cde7043ba029 path @/.snapshots/4/snapshot/etc/etc.syncpoint


Snapshot 2 is an old snapshot not using /etc subvolumes yet.

Snapshot 3 is a snapshot that has been used already. This is what it will
usually look like.

Snapshot 4 is a snapshot that has been created, but hasn't been booted yet -
the synchronization snapshot is still present.
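
To find snapshots that are still waiting for their first boot, the listing can
be filtered for etc.syncpoint subvolumes. The helper below is a sketch under
the assumption that snapshots live below .snapshots/N/snapshot as in the
listing above; the function name pending_etc_sync is hypothetical.

```shell
# pending_etc_sync
# Read `btrfs subvolume list` output on stdin and print the number of every
# snapshot whose /etc still contains an etc.syncpoint subvolume.
pending_etc_sync() {
    sed -n 's|.*[ /]\.snapshots/\([0-9][0-9]*\)/snapshot/etc/etc\.syncpoint$|\1|p'
}
```

Used as btrfs subvolume list / | pending_etc_sync, this would report snapshot 4
for the layout shown in this example.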


When the new snapshot is booted for the first time the systemd service 
transactional-update-sync-etc-state will copy changed files from the old
snapshot if required.

Warning

If a file has been modified both in the new snapshot and in the currently
running system after the snapshot was created, then the changes done in the
currently running system will be lost in the new snapshot.
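
The rule can be pictured as a three-way comparison against the syncpoint. The
following sketch is an illustration only and not the code of the
transactional-update-sync-etc-state service; the function name sync_etc_file
and its directory arguments are hypothetical. It handles modified regular files
only and ignores created or deleted files.

```shell
# sync_etc_file SYNCPOINT_DIR OLD_ETC NEW_ETC RELATIVE_PATH
# Carry a change made in the old /etc over to the new /etc, unless the new
# snapshot changed the same file as well.
sync_etc_file() {
    local base="$1/$4" old="$2/$4" new="$3/$4"
    if cmp -s "$base" "$old"; then
        # Not touched in the old system since the snapshot was taken.
        echo "unchanged: $4"
    elif cmp -s "$base" "$new"; then
        # Changed in the old system only: copy the change over.
        cp -p "$old" "$new" && echo "copied: $4"
    else
        # Changed on both sides: the new snapshot wins, the old change is lost.
        echo "kept new: $4"
    fi
}
```

The last branch corresponds to the warning above: when both sides differ from
the syncpoint, the version in the new snapshot is kept.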

/etc snapshot management is handled via a snapper plugin
(/usr/lib/snapper/plugins/50-etc). This makes it possible to use snapper
directly, e.g. to create a manual backup snapshot of the system.

Chapter 5. Files

/usr/etc/transactional-update.conf

    This is the reference configuration file for transactional-update,
    containing distribution default values. This file should not be changed by
    the administrator.

/etc/transactional-update.conf

    To change the default configuration for transactional-update copy or create
    this file and change the options accordingly. See
    transactional-update.conf(5) for a description of the configuration
    options. Values from this file will override the distribution default
    values.
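
As an illustration, a local override selecting one of the reboot methods
mentioned in Section 1.3 might look like the fragment below. The option name
REBOOT_METHOD is given as an assumption here and should be checked against
transactional-update.conf(5) for the installed version.

```shell
# /etc/transactional-update.conf
# Local override of the distribution defaults: let rebootmgr schedule the
# reboot after a successful update (option name assumed, see the man page).
REBOOT_METHOD=rebootmgr
```

Only the options that differ from the distribution defaults need to be listed
in this file.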

Chapter 6. Porting to other systems

Currently snapper is the only supported snapshot implementation; however, the
code base is prepared to support other (file) systems as long as they provide
snapshot functionality and the ability to boot from specific snapshots.

Chapter 7. Author/acknowledgments

This document was written by Thorsten Kukuk <kukuk@suse.com> with many
contributions from Ignaz Forster <iforster@suse.com>.

Chapter 8. Copyright information for this document

