Configuring federation

This page describes how to enable and configure federation capabilities between clusters.

For an overview of how this feature works, see the architecture section.

Configuration

To enable a cluster to communicate with other clusters, some settings need to be added to the config.yml file. Federated clusters are identified by listing the cluster-to-hostname mapping in the RemoteClusters section.

Here is an example of the settings that should be added to the /etc/arvados/config.yml file:

Clusters:
  clsr1:
    RemoteClusters:
      clsr2:
        Host: api.cluster2.example
        Proxy: true
        ActivateUsers: true
      clsr3:
        Host: api.cluster3.example
        Proxy: true
        ActivateUsers: false

Similar settings should be added on the clsr2 and clsr3 hosts, so that all clusters in the federation can talk to each other.
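As a rough sketch of what such a reciprocal configuration might look like, the fragment below shows a possible RemoteClusters section for clsr2. The hostname api.cluster1.example and the ActivateUsers values are assumptions chosen for illustration; they are not taken from a real deployment.

```yaml
Clusters:
  clsr2:
    RemoteClusters:
      clsr1:
        # Hypothetical hostname, following the naming pattern of the other clusters
        Host: api.cluster1.example
        Proxy: true
        # Illustrative choice; set according to local policy
        ActivateUsers: true
      clsr3:
        Host: api.cluster3.example
        Proxy: true
        # Illustrative choice; set according to local policy
        ActivateUsers: false
```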

The ActivateUsers setting indicates whether users from a given cluster are activated automatically or require manual activation. User activation is covered in more detail in the user activation section. In the example above, users from clsr2 are activated automatically, while users from clsr3 need an admin to activate their accounts.

Note: The Proxy setting is intended for future use, and should always be set to true.

User management

A federation of clusters can be configured to use a separate user database per cluster, or to designate a central cluster that manages the user database for the whole federation.

Peer federation

If clusters belong to separate organizations, each cluster will have its own user database for the members of that organization. Through federation, a user from one organization can be granted access to the cluster of another organization. The admin of the second cluster can control access on an individual basis by choosing to activate or deactivate accounts from other organizations.

Centralized (LoginCluster) federation

If all clusters belong to the same organization, and users in that organization should have access to all the clusters, user management can be simplified by designating a LoginCluster, which manages the user database used by all other clusters in the federation. To do this, choose one cluster in the federation to be the ‘login cluster’, then set the Login.LoginCluster configuration value on all clusters in the federation to the cluster id of the login cluster. After setting LoginCluster, restart arvados-api-server and arvados-controller.

Clusters:
  clsr2:
    Login:
      LoginCluster: clsr1

The LoginCluster configuration redirects all user logins to the LoginCluster, and the LoginCluster will issue API tokens which will be accepted by the federation. Users are activated or deactivated across the entire federation based on their status on the login cluster.

Note: tokens issued by the login cluster need to be periodically re-validated when used on other clusters in the federation. The period between revalidation attempts is configured with Login.RemoteTokenRefresh. The default is 5 minutes. A longer period reduces overhead from validating tokens, but means it may take longer for other clusters to notice when a token has been revoked or a user has changed status (being activated/deactivated, admin flag changed).
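A minimal sketch of adjusting this setting follows. It assumes the value is written as a duration string such as 5m, and uses 15m purely as an illustrative value.

```yaml
Clusters:
  clsr2:
    Login:
      LoginCluster: clsr1
      # Illustrative value only; the default is 5 minutes
      RemoteTokenRefresh: 15m
```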

To migrate users of existing clusters with separate user databases to a single LoginCluster, use arv-federation-migrate.

Groups

In order for a user to see (and be able to share with) other users, the admin needs to create a “can_read” permission link from the user to either the “All users” group, or another group that grants visibility to a subset of users.

In a peer federation, this means that a user who has joined a second cluster also needs to be added to the “All users” group on that second cluster in order to share with other users there.

In a LoginCluster federation, the LoginCluster determines which users can see and share with each other. It is not necessary to add users to “All users” on the other clusters.

Trusted clients

When a cluster is configured to use a LoginCluster, the login flow goes to the LoginCluster to log in and issue a token, then returns the user to the starting workbench. In this case, you want to configure the LoginCluster to “trust” the workbench instances associated with the other clusters.

Clusters:
  clsr1:
    Login:
      TrustedClients:
        "https://workbench.cluster2.example": {}
        "https://workbench2.cluster2.example": {}
        "https://workbench.cluster3.example": {}
        "https://workbench2.cluster3.example": {}

Testing

Following the above example, suppose clsr1 is our “home cluster”: we use our clsr1 user account as our federated identity, and both remote clusters, clsr2 and clsr3, are set up to allow users from clsr1 and to auto-activate them. The first thing to do is to log in to a remote workbench using the local user token, as follows:

1. Log into the local workbench and get the user token
2. Visit the remote workbench specifying the local user token by URL: https://workbench.cluster2.example?api_token=token_from_clsr1
3. You should now be logged into clsr2 with your account from clsr1

To further test the federation setup, create a collection on clsr2, upload some files to it, and copy its UUID. Then, from a shell node on your home cluster, you should be able to retrieve that collection by running:

user@clsr1:~$ arv collection get --uuid clsr2-4zz18-xxxxxxxxxxxxxxx

The returned collection metadata should show the local user’s UUID in the owner_uuid field. This tests that the arvados-controller service is proxying requests correctly.

As a last test, you can confirm that the keepstore services also recognize remote cluster prefixes and proxy the requests. Request the previously created collection using any of the usual tools, for example:

user@clsr1:~$ arv-get clsr2-4zz18-xxxxxxxxxxxxxxx/uploaded_file .


The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.