To support running analysis on geographically dispersed data (avoiding expensive data transfers by sending the computation to the data), and “hybrid cloud” configurations where an on-premise cluster can expand its capabilities by delegating work to a cloud-hosted cluster, Arvados supports federated workflows. In a federated workflow, different steps of a workflow may execute on different clusters. Arvados manages data transfer and delegation of credentials, so that all that is required is adding arv:ClusterTarget hints to your existing workflow.
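As a rough illustration of what such a hint can look like, the sketch below attaches `arv:ClusterTarget` to a single step of a minimal workflow. The step name `analyze`, the tool file `analyze.cwl`, and the cluster and project identifiers are placeholders chosen for illustration and are not part of the tutorial files. The `cluster_id` and `project_uuid` field names are assumed from the Arvados CWL extensions; check them against the extension reference for your Arvados version.

```yaml
cwlVersion: v1.0
class: Workflow
$namespaces:
  arv: "http://arvados.org/cwl#"   # namespace for Arvados CWL extensions
inputs:
  dataset: File
outputs:
  result:
    type: File
    outputSource: analyze/out
steps:
  analyze:
    run: analyze.cwl               # hypothetical tool, not in the tutorial files
    in:
      dataset: dataset
    out: [out]
    hints:
      arv:ClusterTarget:
        cluster_id: clsr2                            # placeholder cluster ID
        project_uuid: clsr2-j7d0g-e7r20egb8hlgn53    # placeholder project on that cluster
```

Steps without the hint would be expected to run on the cluster where the workflow was submitted.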
The tutorial files are located in the documentation section of the Arvados source repository. Clone the repository as follows, or see the example below:
~$ git clone https://github.com/arvados/arvados
~$ cd arvados/doc/user/cwl/federated
At this time, Workbench does not display the remote steps of a workflow. As a workaround, you can find the UUIDs of the remote steps in the live logs of the workflow runner (the “Logs” tab). You can then visit the remote cluster’s Workbench and enter a UUID into the search box to view the details of that remote step. This will be fixed in a future version of Workbench.
Run it like any other workflow:
~$ arvados-cwl-runner feddemo.cwl shards.cwl
You can also run a workflow on a remote federated cluster.
In the following example, an analysis task is executed on three different clusters, each holding different data, and the results are then combined to produce the final output.
Example input document:
select_column: color
select_values:
  class: File
  location: colors_to_select.txt
datasets:
  - cluster: clsr1
    file:
      class: File
      location: keep:0dcf9310e5bf0c07270416d3a0cd6a43+56/items1.csv
  - cluster: clsr2
    file:
      class: File
      location: keep:12707d325a3f4687674b858bd32beae9+56/items2.csv
  - cluster: clsr3
    file:
      class: File
      location: keep:dbff6bb7fc43176527af5eb9dec28871+56/items3.csv
intermediate_projects:
  - clsr1-j7d0g-qxc4jcji7n4lafx
  - clsr2-j7d0g-e7r20egb8hlgn53
  - clsr3-j7d0g-vrl00zoku9spnen
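To show how an input document of this shape could drive the cluster selection, here is a hedged sketch of a scattered workflow step. It pairs each entry of `datasets` with the project at the same position in `intermediate_projects` and passes both to `arv:ClusterTarget`. The step name `extract`, the tool file `extract.cwl`, and the step input name `project` are assumptions made for illustration; the actual tutorial workflow may be organized differently. The sketch also assumes that the hint fields accept CWL parameter references.

```yaml
# Fragment of a Workflow; the enclosing document is assumed to declare
#   $namespaces: { arv: "http://arvados.org/cwl#" }
# and to require ScatterFeatureRequirement.
steps:
  extract:
    run: extract.cwl                  # hypothetical per-cluster analysis tool
    in:
      select_column: select_column
      select_values: select_values
      dataset: datasets               # one {cluster, file} record per scatter job
      project: intermediate_projects  # project UUID at the same list position
    scatter: [dataset, project]
    scatterMethod: dotproduct         # pair entries by position, not all combinations
    out: [out]
    hints:
      arv:ClusterTarget:
        cluster_id: $(inputs.dataset.cluster)   # e.g. clsr1 for the first dataset
        project_uuid: $(inputs.project)         # e.g. clsr1-j7d0g-qxc4jcji7n4lafx
```

With `dotproduct`, the two lists must have the same length, which is why the example input lists exactly one intermediate project per dataset. A later, unscattered step could then combine the per-cluster outputs.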
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.