openQA tests developer guide

Introduction

openQA is an automated test tool that makes it possible to test the whole installation process of an operating system. It’s free software released under the GPLv2 license. The source code and documentation are hosted in the os-autoinst organization on GitHub.

This document provides the information needed to start developing new tests for openQA or to improve the existing ones. It’s assumed that the reader is already familiar with openQA and has already read the Starter Guide, available at the official repository.

Basic

This section explains the basic layout of openQA tests and the API available to them. openQA tests are written in the Perl programming language; basic, but not in-depth, knowledge of Perl is needed to follow this document.

API

os-autoinst provides the API for the tests. The most commonly used functions are listed below; a short combined sketch follows the list:

  • get_var NAME Returns the content of a variable or undef

  • check_var NAME, CONTENT Check content of a variable, treating undef as empty.

  • send_key KEY[, WAITIDLE] Send a key press to the system under test. Call wait_idle at the end if WAITIDLE is set. (e.g. send_key "alt-o", 1)

  • type_string STRING Type the given string. (e.g. type_string "yes\n")

  • save_screenshot Capture and save a screenshot in the test results.

  • check_screen NEEDLE[, TIMEOUT] Wait until NEEDLE is seen on the screen. Return undef if not found within TIMEOUT (default 30s).

  • assert_screen NEEDLE[, TIMEOUT] Wait until NEEDLE is seen on the screen. Abort test and mark as failed if not found within TIMEOUT (default 30s).

  • record_soft_failure Record a workaround applied in the test.

  • wait_idle TIMEOUT Wait for max TIMEOUT seconds until system is idle.

  • wait_serial REGEX[, TIMEOUT] Wait until REGEX is seen on serial port. Return 0 if not found after TIMEOUT (default 90s)

  • upload_logs FILE Type in necessary shell commands to save FILE in the test results

  • assert_and_click NEEDLE[, BUTTON] Wait until NEEDLE is seen on the screen and click it in the middle of the smallest defined area

  • wait_still_screen [TIME] Wait for the screen to stop changing for TIME seconds

  • mouse_hide Move the mouse cursor to the edge of the screen so it does not show up in needles

  • script_output SCRIPT[, TIMEOUT] Returns the text output of the given bash script. The script is called with bash -xe and STDOUT is returned

  • validate_script_output SCRIPT, CODE Checks the text output of the bash script using the given callback (e.g. validate_script_output "ls -la $HOME", sub { m/.xsession-errors/ })

  • assert_shutdown [TIMEOUT] Wait for max TIMEOUT seconds for system to shut down (default 30s)
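Several of these functions are typically combined, for example to handle a dialog that only sometimes appears. A minimal sketch (the needle names used here are hypothetical):

if (check_screen "optional-popup", 10) {
    # the popup does not always appear - record it as a workaround
    record_soft_failure;
    # close the dialog
    send_key "alt-o";
}

# this screen must be reached, otherwise the test aborts and fails
assert_screen "desktop", 60;
save_screenshot;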

The following API is distribution specific and subject to change; a usage sketch follows the list.

  • ensure_installed PACKAGES…​ Check that the given packages are installed on the system; if not, try to install them from the configured repositories.

  • x11_start_program NAME Execute the graphical application NAME. How exactly depends on the desktop environment.

  • become_root Become root (in a shell)

  • script_run COMMAND[, TIMEOUT] Wrapper for type_string meant to type in shell commands. Waits up to TIMEOUT seconds for the system to become idle (default 9s)

  • script_sudo COMMAND Use sudo to execute COMMAND, handling the optional password prompt.

  • type_password Type default password (used for root and user)
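A minimal sketch of these console helpers (assuming the SUT is already sitting at a text console; the commands are only examples):

# run a single privileged command from a regular user shell;
# script_sudo answers a possible password prompt using type_password
script_sudo "touch /root/hello";

# or switch to a root shell permanently and verify it on the serial console
become_root;
script_run "id > /dev/$serialdev";
die "did not become root" unless wait_serial "uid=0", 10;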

How to write tests

openQA tests need to implement at least the run subroutine containing the actual test code, and the test needs to be loaded in the distribution’s main.pm.

The test_flags subroutine specifies what happens when the test fails.

There are several callbacks defined:

  • post_fail_hook is called to upload log files or determine the state of the machine (a sketch follows this list)

  • pre_run_hook is called before the run function - mainly useful for a whole group of tests

  • post_run_hook is run after a successful run function - mainly useful for a whole group of tests
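For example, a post_fail_hook that collects logs on failure could look like this (a minimal sketch; the log path is only an example and assumes the SUT is reachable via a shell):

sub post_fail_hook {
    my ($self) = @_;

    # capture the final state of the screen
    save_screenshot;

    # type the shell commands needed to store the file in the test results
    upload_logs '/var/log/messages';
}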

The following example is a basic test that assumes some live image that boots into the desktop when pressing enter at the boot loader:

use base "basetest";
use strict;
use testapi;

sub run {
    # wait for bootloader to appear
    assert_screen "bootloader", 30; # timeout 30 seconds

    # press enter to boot right away
    send_key "ret";

    # wait for the desktop to appear
    assert_screen "desktop", 300;
}

sub test_flags {
    # without anything - rollback to 'lastgood' snapshot if failed
    # 'fatal' - abort whole test suite if this fails (and set overall state)
    # 'milestone' - after this test succeeds, update 'lastgood'
    # 'important' - if this fails, set the overall state to 'failed'
    return { important => 1 };
}

1;

Test Case Examples

  • Console test that installs software from remote repository via zypper command

sub run {

    # change to root
    become_root;

    # output zypper repos to the serial
    script_run "zypper lr -d > /dev/$serialdev";

    # install xdelta and insert a string 'xdelta_installed' to the serial
    script_run "zypper --gpg-auto-import-keys -n in xdelta && echo 'xdelta_installed' > /dev/$serialdev";

    # check whether 'xdelta_installed' appears on the serial console within 200 seconds
    die "zypper install failed" unless wait_serial "xdelta_installed", 200;

    # check the screen against the needle 'test-zypper_in-1'
    assert_screen 'test-zypper_in-1', 3;
}
  • Typical X11 test testing kate

sub run {

    # make sure kate was installed
    # if not ensure_installed will try to install it
    ensure_installed "kate";

    # start kate
    x11_start_program "kate";

    # check that kate execution succeeded
    assert_screen 'test-kate-1', 10;

    # close kate's welcome window and wait for the system to become idle
    send_key 'alt-c', 1;

    # type a string into kate
    type_string "If you can see this text kate is working.\n";

    # check the result
    assert_screen 'test-kate-2', 5;

    # quit kate
    send_key "ctrl-q";

    # make sure kate was closed
    assert_screen 'test-kate-3', 5;
}

Variables

Test case behavior can be controlled via variables. Some basic variables like DISTRI, VERSION, ARCH are always set. Others like DESKTOP are defined by the Test suites in the openQA web UI. Check the existing tests at os-autoinst-distri-opensuse on GitHub for examples.

Variables are accessible via the get_var and check_var functions.
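For example (DESKTOP and ARCH are standard settings; the values are illustrative):

# check_var treats an unset variable as empty
if (check_var("DESKTOP", "kde")) {
    # KDE specific steps go here
}

# get_var returns undef if the variable is not set
my $arch = get_var("ARCH");
die "ARCH must be set" unless defined $arch;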

Modifying setting of an existing test

There is no interface to modify existing tests but the clone script can be used to create a new job that adds, removes or changes settings:

/usr/share/openqa/script/clone_job.pl --from localhost --host localhost 42 FOO=bar BAZ=

Using snapshots to speed up development of tests

Sometimes it’s annoying to run the full installation just to adjust some test. It would be nice to have the VM jump to a certain point. The QEMU backend provides a feature that allows a job to start from a snapshot, which can help in this situation.

Depending on the use case, there are two options to help:

  • create and preserve snapshots for every test module run (MAKETESTSNAPSHOTS)

    • offers more flexibility as test can be resumed almost at any point, however disk space requirements are high (expect more than 30GB for one job)

    • this mode is useful for fixing non-fatal issues in tests and debugging the SUT

  • create a snapshot after every successful test module, always overwriting it to preserve only the latest (TESTDEBUG)

    • allows skipping to just before the start of the first failed test module, which can be limiting, but the disk space requirements are low.

    • this mode is useful for iterative test development

In both modes there is no need to modify the tests (i.e. the milestone test flag is implied). In the latter mode every test module is also considered fatal, which means the job is aborted after the first failed test module.

Steps to enable snapshots for each test module (the former mode):

1. Run the worker with the --no-cleanup parameter. This will preserve the
hard disks after test runs.

2. Set MAKETESTSNAPSHOTS=1 on a job. This will make openQA save a
snapshot for every test module run. One way to do that is by cloning an
existing job and adding the setting:

$ /usr/share/openqa/script/clone_job.pl --from https://openqa.opensuse.org --host localhost 24 MAKETESTSNAPSHOTS=1

3. Create a job again, this time setting the SKIPTO variable to the snapshot
you need. Again, clone_job.pl comes in handy here:

$ /usr/share/openqa/script/clone_job.pl --from https://openqa.opensuse.org --host localhost 24 SKIPTO=consoletest-yast2_i

Use qemu-img snapshot -l something.img to find out which snapshots
are in the image. Snapshots are named `"test module category"-"test module name"` (e.g. `installation-start_install`).
Steps to enable storing only the last successful snapshot (the latter mode):

1. Run the worker with the --no-cleanup parameter. This will preserve the
hard disks after test runs.

2. Set TESTDEBUG=1 on a job. This will make openQA save a snapshot
after each successful test module run. Snapshots are overwritten:

$ /usr/share/openqa/script/clone_job.pl --from https://openqa.opensuse.org --host localhost 24 TESTDEBUG=1

3. Create a job again, this time setting the SKIPTO variable to the test module
which failed on the previous run. If the clone_job script is not used, the
TESTDEBUG=1 variable must also be included:

$ /usr/share/openqa/script/clone_job.pl --from https://openqa.opensuse.org --host localhost 24 TESTDEBUG=1 SKIPTO=consoletest-yast2_i

Assigning jobs to workers

By default, any worker can get any job with the matching architecture.

This behavior can be changed by setting the job variable WORKER_CLASS. Jobs with this variable set (typically via the machines or test suites configuration) are assigned only to workers that have the same variable in their configuration file (typically /etc/openqa/workers.ini).

For example, the following configuration ensures that jobs with WORKER_CLASS=desktop can be assigned only to worker instances 1 and 2:

[1]
WORKER_CLASS = desktop

[2]
WORKER_CLASS = desktop

[3]
# WORKER_CLASS is not set

Writing multi-machine tests

Scenarios requiring more than one system under test (SUT), like High Availability testing, are covered as multi-machine tests (MM tests) in this section.

openQA approaches multi-machine testing by assigning dependencies between individual jobs. This means the following:

  • everything needed for MM tests must be running as a test job (or you are on your own); even support infrastructure (custom DHCP, NFS, etc.), which in principle is not part of the actual testing, must have a defined test suite so a test job can be created

  • the openQA scheduler makes sure tests are started as a group and in the right order, cancelled as a group if some dependencies are violated, and cloned as a group if requested.

  • openQA does not synchronize individual steps of the tests.

  • openQA provides a locking server for basic synchronization of tests (e.g. waiting until services are ready for failover), but the correct usage of locks is the test designer's job (beware of deadlocks).

In short, writing multi-machine tests adds a few more layers of complexity:

  1. documenting the dependencies and order between individual tests

  2. synchronization between individual tests

  3. actual technical realization (i.e. custom networking)

Job dependencies

There are 2 types of dependencies: CHAINED and PARALLEL:

  • CHAINED describes the case when one test case depends on another and both run sequentially, i.e. the KDE test suite runs only after the Installation test suite has successfully finished, and is cancelled if the Installation suite fails.

To define a CHAINED dependency, add the variable START_AFTER_TEST with the name(s) of the test suite(s) after which the selected test suite is supposed to run. Use a comma-separated list for multiple test suite dependencies, e.g. START_AFTER_TEST="kde,dhcp-server"

  • PARALLEL describes an MM test: the test suites are scheduled to run at the same time and are managed as a group. On top of that, PARALLEL also describes dependencies between test suites, where some test suites (children) run in parallel with other test suites (parents) only while the parents are running.

To define a PARALLEL dependency, use the PARALLEL_WITH variable with the name(s) of the test suite(s) that act as parent suite(s) to the selected test suite. In other words, PARALLEL_WITH describes "I need this test suite to be running during my run". Use a comma-separated list for multiple test suite dependencies, e.g. PARALLEL_WITH="web-server,dhcp-server". Keep in mind that the parent job must be running until all children finish, otherwise the scheduler will cancel the child jobs once the parent is done.

Job dependencies are only resolved when using the iso controller to create new jobs from job templates. Posting individual jobs manually won’t work.

Job dependencies are currently only possible between tests that are scheduled for the same machine.
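For example, posting an ISO creates all jobs from the matching job templates and resolves their dependencies in one step (the same command form is used in the support server section below; the exact settings depend on your job templates):

/usr/share/openqa/script/client isos post DISTRI=opensuse VERSION=13.2 \
        ISO=openSUSE-13.2-DVD-x86_64.iso ARCH=x86_64 FLAVOR=Server-DVD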

openQA worker requirements

A CHAINED dependency requires only one worker, since the dependent jobs run only after the first one finishes. On the other hand, a PARALLEL dependency requires at least 2 workers even for simple scenarios.

Examples:

CHAINED - i.e. test basic functionality before going advanced - requires 1 worker
A <- B <- C

Define test suite A,
then define B with variable START_AFTER_TEST=A and then define C with START_AFTER_TEST=B

-or-

Define test suites A and B,
and then define C with START_AFTER_TEST=A,B
In this case, however, the start order of A and B is not specified.
But C will start only after both A and B are successfully done.

PARALLEL basic High-Availability - requires 2 workers
A
^
B

Define test suite A
and then define B with variable PARALLEL_WITH=A.
A in this case is the parent test suite of B and must be running throughout B's run.

PARALLEL with multiple parents - i.e. complex support requirements for one test - requires 4 workers
A B C
\ | /
  ^
  D

Define test suites A, B, C
and then define D with PARALLEL_WITH=A,B,C.
A, B and C run in parallel and are parent test suites of D; all must run until D finishes.

PARALLEL with one parent - i.e. running independent tests against one server - requires at least 2 workers
   A
   ^
  /|\
 B C D

Define test suite A
and then define B, C and D with PARALLEL_WITH=A.
A is the parent test suite of B, C and D (all of which can run in parallel).
Children B, C and D can run and finish at any time, but A must run until all of B, C and D finish.

Test synchronization and locking API

openQA provides a locking server through the lock API. To use the lock API, import the lockapi package (use lockapi;) in your test file. The lock API provides three functions: mutex_create, mutex_lock and mutex_unlock. Each of these functions takes one parameter: the name of the lock. Locks are associated with the caller's job - a lock can't be unlocked by a different job than the one that locked it.

mutex_lock tries to lock the mutex for the caller's job. If the lock is unavailable or held by someone else, the mutex_lock call blocks.

mutex_unlock tries to unlock the mutex. If the lock is held by a different job, the mutex_unlock call blocks until the lock becomes available and then returns without doing anything; if the lock does not exist, the call also returns without doing anything.

mutex_create creates a new mutex. A newly created mutex is unlocked. If the mutex already exists, the call returns without doing anything.

Locks are addressed by their name. This name is valid within the test group defined by the job dependencies. If several groups are running at the same time and use the same lock name, these locks are independent of each other.

The mmapi package provides wait_for_children, which the parent can use to wait for the children to complete.

Example:

parent job - wait until the login prompt appears; assume services are started

use base "basetest";
use strict;
use testapi;
use lockapi;
use mmapi;

sub run {
    # wait for bootloader to appear
    assert_screen "bootloader", 30; # timeout 30 seconds

    # wait for the login to appear
    assert_screen "login", 300;

    # services start automatically
    # unlock by creating the lock
    mutex_create('services_ready');

    # wait until all children finish
    wait_for_children;
}

child job - wait until the parent is ready, then start testing the services

use base "basetest";
use strict;
use testapi;
use lockapi;

sub run {
    # wait for bootloader to appear
    assert_screen "bootloader", 30; # timeout 30 seconds

    # wait for the login to appear
    assert_screen "login", 300;

    # this blocks until lock is created then locks and immediately unlocks
    mutex_lock('services_ready');
    mutex_unlock('services_ready');

    # login to continue
    type_string("root\n");
    sleep 1;
    type_string("secret\n");
}

Sometimes it is useful to let the parent wait until the child reaches a given point, for example to verify the server state after a completed request. In this scenario the child creates a mutex and the parent locks and unlocks it.

The child can, however, die at any time. To prevent a parent deadlock in this situation, the parent has to pass the child ID as a second parameter to mutex_lock(). If the child job with the given ID has already finished, mutex_lock() calls die.

parent job - wait until the child reaches a given point

use base "basetest";
use strict;
use testapi;
use lockapi;
use mmapi;

sub run {
    my $children = get_children();

    # let's suppose there is only one child
    my $child_id = (keys %$children)[0];

    # this blocks until the child creates the lock; passing the child ID
    # makes mutex_lock die if the child job has already finished
    mutex_lock('child_reached_given_point', $child_id);
    mutex_unlock('child_reached_given_point');

    # continue with the test
}

Getting info about parents / children

use base "basetest";
use strict;
use testapi;
use mmapi;

sub run {

    # returns a hash ref containing (id => state) for all children
    my $children = get_children();

    for my $job_id (keys %$children) {
      print "$job_id is cancelled\n" if $children->{$job_id} eq 'cancelled';
    }

    # returns an array ref with parent ids; all parents are in running state (see Job dependencies above)
    my $parents = get_parents();

    # let's suppose there is only one parent
    my $parent_id = $parents->[0];

    # any job id can be queried for details with get_job_info()
    # it returns a hash ref containing these keys:
    #   name priority state result worker_id
    #   retry_avbl t_started t_finished test
    #   group_id group settings
    my $parent_info = get_job_info($parent_id);

    # it is possible to query variables set by the openQA frontend;
    # this does not work for variables set by the backend or by the job at runtime
    my $parent_name = $parent_info->{settings}->{NAME};
    my $parent_desktop = $parent_info->{settings}->{DESKTOP};
    # !!! this does not work, VNC is set by the backend !!!
    # my $parent_vnc = $parent_info->{settings}->{VNC};
}

Support Server based tests

The idea is to have a dedicated "helper server" to allow advanced network based testing.

The support server takes advantage of the basic parallel setup described in the previous section, with the support server being the parent test A and the test needing it being the child test B. This ensures that test B always has the support server available.

Preparing the supportserver:

The support server image is created by calling a special test, based on the autoyast test:

/usr/share/openqa/script/client jobs post DISTRI=opensuse VERSION=13.2 \
    ISO=openSUSE-13.2-DVD-x86_64.iso  ARCH=x86_64 FLAVOR=Server-DVD \
    TEST=supportserver_generator MACHINE=64bit DESKTOP=textmode  INSTALLONLY=1 \
    AUTOYAST=supportserver/autoyast_supportserver.xml SUPPORT_SERVER_GENERATOR=1 \
    PUBLISH_HDD_1=supportserver.qcow2

This produces the qemu image supportserver.qcow2 containing the support server. The autoyast_supportserver.xml profile should define the correct user and password, as well as the packages and common configuration.

The specific role the support server should take is then selected when the server is run in the actual test scenario.

Using the supportserver:

In the Test suites, the supportserver is defined by setting:

HDD_1=supportserver.qcow2
SUPPORT_SERVER=1
SUPPORT_SERVER_ROLES=pxe,qemuproxy
WORKER_CLASS=server,qemu_autoyast_tap_64

where the SUPPORT_SERVER_ROLES variable defines the specific roles (see the code in tests/support_server/setup.pm for the available roles and their definitions), and the HDD_1 variable must be the name of the support server image as defined via the PUBLISH_HDD_1 variable during support server generation. If the support server is based on older SUSE versions (openSUSE 11.x, SLE11SP4, …) it may also be necessary to add HDDMODEL=virtio-blk. In the case of the qemu backend, one can also use BOOTFROM=c for a faster boot directly from the HDD_1 image.

Then, for the child test using this support server, the following additional variable must be set: PARALLEL_WITH=supportserver-pxe-tftp, where supportserver-pxe-tftp is the name given to the support server in the test suites screen. Once the tests are defined, they can be added to openQA in the usual way:

/usr/share/openqa/script/client isos post DISTRI=opensuse VERSION=13.2 \
        ISO=openSUSE-13.2-DVD-x86_64.iso ARCH=x86_64 FLAVOR=Server-DVD

where the DISTRI, VERSION, FLAVOR and ARCH correspond to the job group containing the tests. Note that the networking is provided by tap devices, so both jobs should run on machines defined by (among other settings) NICTYPE=tap and WORKER_CLASS=qemu_autoyast_tap_64.

Example - a simple tftp test:

Let’s assume that we want to test tftp client operation. For this, we set up the support server as a tftp server:

HDD_1=supportserver.qcow2
SUPPORT_SERVER=1
SUPPORT_SERVER_ROLES=dhcp,tftp
WORKER_CLASS=server,qemu_autoyast_tap_64

with the test suite name supportserver-opensuse-tftp.

The actual test child job will then have to set PARALLEL_WITH=supportserver-opensuse-tftp, as well as other variables according to the test requirements. For convenience, we have also started a dhcp server on the support server, but even without it the network could be set up manually by assigning a free IP address (e.g. 10.0.2.15) on the system of the test job.

The code in the *.pm module doing the actual tftp test could then look something like this:

use strict;
use base 'basetest';
use testapi;

sub run {
    # address of the support server; 10.0.2.1 is an assumption based on a
    # typical setup - adjust it to your network configuration
    my $server_ip = '10.0.2.1';

    my $script = "set -e -x\n";
    $script .= "echo test >test.txt\n";
    $script .= "time tftp $server_ip -c put test.txt test2.txt\n";
    $script .= "time tftp $server_ip -c get test2.txt\n";
    $script .= "diff -u test.txt test2.txt\n";
    script_output($script);
}

assuming, of course, that the tested machine was already set up with the necessary infrastructure for tftp, e.g. the network was configured, the tftp rpm installed and the tftp service started, etc. All of this can be conveniently achieved using the autoyast installation, as shown in the next section.

Example - autoyast based tftp test:

Here we will use autoyast to set up the system of the test job, together with the os-autoinst autoyast testing infrastructure. For the support server, this means using a proxy to access qemu-provided data, for downloading the autoyast profile and the tftp verify script:

HDD_1=supportserver.qcow2
SUPPORT_SERVER=1
SUPPORT_SERVER_ROLES=pxe,qemuproxy
WORKER_CLASS=server,qemu_autoyast_tap_64

The actual test child job will then be defined as:

AUTOYAST=autoyast_opensuse/opensuse_autoyast_tftp.xml
AUTOYAST_VERIFY=autoyast_opensuse/opensuse_autoyast_tftp.sh
DESKTOP=textmode
INSTALLONLY=1
PARALLEL_WITH=supportserver-opensuse-tftp

again assuming the support server’s name is supportserver-opensuse-tftp. Note that the pxe role already contains the tftp and dhcp server roles, since they are needed for pxe boot to work.

The tftp test defined in the autoyast_opensuse/opensuse_autoyast_tftp.sh file could be something like:

set -e -x
echo test >test.txt
time tftp #SERVER_URL# -c put test.txt test2.txt
time tftp #SERVER_URL# -c get test2.txt
diff -u test.txt test2.txt && echo "AUTOYAST OK"

and the rest is done automatically, using already prepared test modules in tests/autoyast subdirectory.