The configuration file

An example configuration file is provided at config/fenicsx-tests-gsoc2024.toml. To launch its experiments, run:

testbuddy-g5k --configuration config/fenicsx-tests-gsoc2024.toml launch

Generally, you are expected to write your own configuration files to declare the tests you would like to run on Grid’5000. Configuration files use the TOML format. A configuration file has two sections, [sync] and [launch]. Three keys, login, project, and name, belong to neither section and are declared at the top of the file:

login = "g5k-username"
project = "my-fenicsx-tests"
name = "my-experiment"

This will log in to Grid’5000 with the username g5k-username. The project and name keys make testbuddy-g5k create the directory ~/testbuddy-g5k/my-fenicsx-tests/my-experiment and place all declared assets inside it. The experiment is expected to store its results under ~/testbuddy-g5k/my-fenicsx-tests/my-experiment/results. Note that this is per-server, meaning that on the Grid’5000 access point server you’d find the corresponding assets under ~/site, where site is the name of the site the experiment runs on.

In a single configuration file, only one login, project, and name can be set. However, each experiment can declare its own assets, entrypoint, parameters, site, and grd_options (i.e. how many hosts to run on, what type of CPU/GPU/hardware resources to request from Grid’5000, and so on).

The [sync] section

These are the configuration options particular to the sync subcommand of testbuddy-g5k. Currently, there is only one:

results = "/local/path/to/results"

which stores all tarballs obtained by testbuddy-g5k sync locally in /local/path/to/results. This directory is on the filesystem of the computer that runs testbuddy-g5k.

Warning

For this feature to work, your experiments must store their results in tarballs under the directory SRC_DIR/results, where SRC_DIR is passed to the script by testbuddy-g5k via --src-dir= (you cannot configure this; it is the directory /home/login/testbuddy-g5k/project/name).
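As a rough sketch of what this requirement means for an entrypoint written in Python, the snippet below bundles output files into a tarball under SRC_DIR/results. The function name pack_results, the label run-0001, and the file timings.csv are all made up for illustration; a temporary directory stands in for the real SRC_DIR.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_results(src_dir: Path, output_files: list[Path], label: str) -> Path:
    """Bundle output_files into SRC_DIR/results/<label>.tar.gz and return its path."""
    results_dir = src_dir / "results"
    results_dir.mkdir(parents=True, exist_ok=True)
    tarball = results_dir / f"{label}.tar.gz"
    with tarfile.open(tarball, "w:gz") as tar:
        for path in output_files:
            # Store only the file name so the archive holds no absolute paths.
            tar.add(path, arcname=path.name)
    return tarball


# Demonstration: a temporary directory stands in for SRC_DIR.
with tempfile.TemporaryDirectory() as tmp:
    src_dir = Path(tmp)
    output = src_dir / "timings.csv"  # hypothetical output file
    output.write_text("placeholder output\n")
    tarball = pack_results(src_dir, [output], label="run-0001")
    with tarfile.open(tarball) as tar:
        print(tarball.relative_to(src_dir), tar.getnames())
```

In a real entrypoint, src_dir would come from the --src-dir option rather than from a temporary directory.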

The [launch] section

In this section we declare the experiments array, which contains all the experiments that will be executed. Each experiment is a separate job with its own resources in Grid’5000. Some options can be the same for all experiments and are defined in their own keys under [launch], while others are defined within the elements of experiments. For example:

[launch]
assets = [
  "/path/to/experiment_sources/",
  "/some/additional/directory/",
  "/some/additional/file/db.sqlite3"
]
entrypoint = "entrypoint.py"
grd_environment = "debian11-nfs"
experiments = [
  { site = "rennes", cluster = "paravance", grd_options = "hosts=8" },
  { site = "grenoble", cluster = "dahu", grd_options = "hosts=4" }
]

What we call assets here are the source code files, along with any additional files or directories, needed to run an experiment.

This is a configuration of two experiments. Both experiments share the same assets, the same entrypoint, entrypoint.py, and the same grd_environment, but they have different values for site, cluster, and grd_options. The assets are placed under site/testbuddy-g5k/project/name, and entrypoint is a file path relative to that directory, i.e. site/testbuddy-g5k/project/name/entrypoint. The entrypoint script is executed on one of the acquired hosts, and typically coordinates all hosts in a cluster to work together for the experiment.

Note that the entrypoint script is passed the option --src-dir=SRC_DIR, the directory in which it resides, so that it can find all experiment assets. This is needed because the underlying tool, grd on Grid’5000, launches the entrypoint from a different directory. Other arguments and options specified in the script_args array are passed verbatim to the entrypoint:

script_args = [
  "--weak-dof=250000",
  "--strong-dof=500000,1000000"
]

These arguments are specific to fenicsx-tests-gsoc2024: they instruct it to use 250000 degrees of freedom for weak scaling and to run two trials, with 500000 and 1000000 degrees of freedom, for strong scaling.
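If you write your own entrypoint in Python, the options above could be received roughly as follows. This is a sketch under assumptions, not how fenicsx-tests-gsoc2024 necessarily parses its arguments: the helper names comma_separated_ints and build_parser are hypothetical, and the argument list simulates what the entrypoint would receive.

```python
import argparse
from pathlib import Path


def comma_separated_ints(text: str) -> list[int]:
    """Turn a string such as "500000,1000000" into [500000, 1000000]."""
    return [int(item) for item in text.split(",") if item]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Example experiment entrypoint")
    # --src-dir is always supplied by testbuddy-g5k.
    parser.add_argument("--src-dir", type=Path, required=True)
    # The remaining options come verbatim from the script_args array.
    parser.add_argument("--weak-dof", type=int)
    parser.add_argument("--strong-dof", type=comma_separated_ints, default=[])
    return parser


# Simulated command line, combining --src-dir with the script_args shown above.
args = build_parser().parse_args([
    "--src-dir=/home/login/testbuddy-g5k/project/name",
    "--weak-dof=250000",
    "--strong-dof=500000,1000000",
])
print(args.weak_dof)    # 250000
print(args.strong_dof)  # [500000, 1000000]
```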

Keep it simple

The configuration files allow for some cleverness, but it is best to avoid it and instead maintain simple configuration files, perhaps at the cost of some redundancy.

Overriding fields

There’s an interesting interplay between command-line options and declarative configuration files: We can use one to override the other.

Suppose we have a configuration example-config.toml that defines the following experiments:

experiments = [
  { site = "rennes", cluster = "paravance" },
  { site = "lyon", cluster = "nova" },
  { site = "grenoble", cluster = "dahu", grd_options = "hosts=8" }
]

For instance, perhaps we’d like to use example-config.toml but request 2 hosts instead. We can of course edit the file or create a new configuration file, but for a one-off, we can do:

testbuddy-g5k --configuration example-config.toml launch --grd-options hosts=2

This is equivalent to defining the grd_options key in the [launch] section: the value is added to every experiment that does not already define its own grd_options key, i.e. the rennes and lyon experiments but not grenoble. If we would like to override all the grd_options keys, including those of experiments that define their own (here, grenoble), we must use --override-options.

This is generally the interplay between the [launch] section and the experiments array: you can define the fields for an experiment wholly within it, or you can define options common to several experiments in the [launch] section. The fields in the experiments array take precedence unless --override-options is used, in which case precedence is reversed and the fields in [launch] win. The site and cluster fields are the only fields that must be specified in an experiment.
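The precedence rule can be modelled with a small Python sketch. It is not testbuddy-g5k's actual implementation: the function resolve_experiment and the plain dictionaries standing in for the parsed configuration are assumptions made for illustration.

```python
REQUIRED_FIELDS = ("site", "cluster")


def resolve_experiment(launch_fields: dict, experiment: dict, override: bool = False) -> dict:
    """Merge [launch]-level fields with one experiment's own fields.

    By default the experiment's fields win; with override=True (modelling
    --override-options) the [launch]-level fields win instead.
    """
    missing = [field for field in REQUIRED_FIELDS if field not in experiment]
    if missing:
        raise ValueError(f"experiment is missing required fields: {missing}")
    if override:
        return {**experiment, **launch_fields}
    return {**launch_fields, **experiment}


# Equivalent of passing --grd-options hosts=2 with example-config.toml.
launch_fields = {"grd_options": "hosts=2"}
experiments = [
    {"site": "rennes", "cluster": "paravance"},
    {"site": "lyon", "cluster": "nova"},
    {"site": "grenoble", "cluster": "dahu", "grd_options": "hosts=8"},
]

for override in (False, True):
    resolved = [resolve_experiment(launch_fields, e, override) for e in experiments]
    print(override, [e["grd_options"] for e in resolved])
# False ['hosts=2', 'hosts=2', 'hosts=8']
# True ['hosts=2', 'hosts=2', 'hosts=2']
```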