Install the Git server

  1. Introduction
  2. Install dependencies
  3. Create “git” user and storage directory
  4. Install gitolite
  5. Configure gitolite
  6. Configure git synchronization
  7. Update config.yml
  8. Update nginx configuration
  9. Install the arvados-git-httpd package
  10. Restart the API server and controller
  11. Confirm working installation

Introduction

Arvados support for git repository management enables using Arvados permissions to control access to git repositories. Users can create their own private and public git repositories and share them with others.

The git hosting setup involves three components.

  • The “arvados-git-sync.rb” script polls the API server for the current list of repositories, creates bare repositories, and updates the local permission cache used by gitolite.
  • Gitolite provides SSH access. Users authenticate by SSH keys.
  • arvados-git-httpd provides HTTPS access. Users authenticate by Arvados tokens.

Git services must be installed on the same host as the Arvados Rails API server.

Install dependencies

Alma/Red Hat/Rocky

# dnf install git perl-Data-Dumper openssh-server

Debian and Ubuntu

# apt-get --no-install-recommends install git openssh-server

Create “git” user and storage directory

Gitolite and some additional scripts will be installed in /var/lib/arvados/git, so hosted repository data will be stored in /var/lib/arvados/git/repositories. If you install gitolite in a different location, update the git_repositories_dir entry in your API server’s application.yml file accordingly. For example, if you install gitolite at /data/gitolite, then git_repositories_dir will be /data/gitolite/repositories.
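
The path rule above (the repository directory is the gitolite install location plus "repositories") can be sketched as a small shell helper. The function name repositories_dir is illustrative only and is not part of Arvados or gitolite.

```shell
# Derive the repository storage directory from the gitolite install location.
# repositories_dir is a hypothetical helper name used only for this sketch.
repositories_dir() {
    # Strip one trailing slash, if present, then append "repositories".
    printf '%s/repositories\n' "${1%/}"
}

repositories_dir /var/lib/arvados/git
repositories_dir /data/gitolite
```

The two calls print the default location and the alternative location used as an example in the text.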

A new UNIX account called “git” will own the files. This makes git URLs look familiar to users (git@[...]:username/reponame.git).

On Debian- or Red Hat-based systems:

gitserver:~$ sudo mkdir -p /var/lib/arvados/git
gitserver:~$ sudo useradd --comment git --home-dir /var/lib/arvados/git git
gitserver:~$ sudo chown -R git:git ~git

The git user needs its own SSH key. (It must be able to run ssh git@localhost from scripts.)

gitserver:~$ sudo -u git -i bash
git@gitserver:~$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
git@gitserver:~$ cp .ssh/id_rsa.pub .ssh/authorized_keys
git@gitserver:~$ ssh -o stricthostkeychecking=no localhost cat .ssh/id_rsa.pub
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7aBIDAAgMQN16Pg6eHmvc+D+6TljwCGr4YGUBphSdVb25UyBCeAEgzqRiqy0IjQR2BLtSirXr+1SJAcQfBgI/jwR7FG+YIzJ4ND9JFEfcpq20FvWnMMQ6XD3y3xrZ1/h/RdBNwy4QCqjiXuxDpDB7VNP9/oeAzoATPZGhqjPfNS+RRVEQpC6BzZdsR+S838E53URguBOf9yrPwdHvosZn7VC0akeWQerHqaBIpSfDMtaM4+9s1Gdsz0iP85rtj/6U/K/XOuv2CZsuVZZ52nu3soHnEX2nx2IaXMS3L8Z+lfOXB2T6EaJgXF7Z9ME5K1tx9TSNTRcYCiKztXLNLSbp git@gitserver
git@gitserver:~$ rm .ssh/authorized_keys

Install gitolite

Check https://github.com/sitaramc/gitolite/tags for the latest stable version. This guide was tested with v3.6.11. Versions below 3.0 are missing some features needed by Arvados, and should not be used.
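
If you script the installation, a guard along these lines can reject a tag older than 3.0 before anything is cloned. This is a minimal sketch: the function name gitolite_version_ok is an assumption, and it only inspects the major version of a tag written like v3.6.11.

```shell
# Succeeds if a gitolite tag such as "v3.6.11" has major version 3 or higher.
# gitolite_version_ok is a hypothetical helper, not a gitolite command.
gitolite_version_ok() {
    tag=${1#v}          # drop a leading "v", if any
    major=${tag%%.*}    # keep everything before the first dot
    case $major in
        ''|*[!0-9]*) return 2 ;;   # not a plain number: treat as invalid
    esac
    [ "$major" -ge 3 ]
}

if gitolite_version_ok v3.6.11; then
    echo "v3.6.11 is new enough"
fi
```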

Download and install the version you selected.

$ sudo -u git -i bash
git@gitserver:~$ echo 'PATH=$HOME/bin:$PATH' >.profile
git@gitserver:~$ . .profile
git@gitserver:~$ git clone --branch v3.6.11 https://github.com/sitaramc/gitolite
...
Note: checking out '5d24ae666bfd2fa9093d67c840eb8d686992083f'.
...
git@gitserver:~$ mkdir bin
git@gitserver:~$ gitolite/install -ln ~git/bin
git@gitserver:~$ bin/gitolite setup -pk .ssh/id_rsa.pub
Initialized empty Git repository in /var/lib/arvados/git/repositories/gitolite-admin.git/
Initialized empty Git repository in /var/lib/arvados/git/repositories/testing.git/
WARNING: /var/lib/arvados/git/.ssh/authorized_keys missing; creating a new one
    (this is normal on a brand new install)

If this didn’t go well, more detail about installing gitolite, and information about how it works, can be found on the gitolite home page.

Clone the gitolite-admin repository. The arvados-git-sync.rb script works by editing the files in this working directory and pushing them to gitolite. Here we make sure “git push” won’t produce any errors or warnings.

git@gitserver:~$ git clone git@localhost:gitolite-admin
Cloning into 'gitolite-admin'...
remote: Counting objects: 6, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 6 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (6/6), done.
Checking connectivity... done.
git@gitserver:~$ cd gitolite-admin
git@gitserver:~/gitolite-admin$ git config user.email arvados
git@gitserver:~/gitolite-admin$ git config user.name arvados
git@gitserver:~/gitolite-admin$ git config push.default simple
git@gitserver:~/gitolite-admin$ git push
Everything up-to-date

Configure gitolite

Configure gitolite to look up a repository name like username/reponame.git and find the appropriate bare repository storage directory.

Add the following lines to the top of ~git/.gitolite.rc:

my $repo_aliases;
my $aliases_src = "$ENV{HOME}/.gitolite/arvadosaliases.pl";
if ($ENV{HOME} && (-e $aliases_src)) {
    $repo_aliases = do $aliases_src;
}
$repo_aliases ||= {};

Add the following line inside the section that begins %RC = (:

    REPO_ALIASES => $repo_aliases,

Inside that section, adjust the ‘UMASK’ setting to 022, to ensure the API server has permission to read repositories:

    UMASK => 022,

Uncomment the ‘Alias’ line in the section that begins ENABLE => [:

            # access a repo by another (possibly legacy) name
            'Alias',
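
After editing the file, a quick check such as the following can confirm that the REPO_ALIASES, UMASK, and Alias changes are all in place. It is a rough sketch that matches the lines textually with grep rather than parsing Perl, and the function name check_gitolite_rc is an assumption.

```shell
# Report whether a gitolite rc file contains the three edits described above.
# check_gitolite_rc is a hypothetical helper; it matches text, not Perl syntax.
check_gitolite_rc() {
    rc_file=$1
    failed=0
    grep -Eq '^[[:space:]]*REPO_ALIASES[[:space:]]*=>' "$rc_file" \
        || { echo "missing REPO_ALIASES"; failed=1; }
    grep -Eq '^[[:space:]]*UMASK[[:space:]]*=>[[:space:]]*022,' "$rc_file" \
        || { echo "UMASK is not 022"; failed=1; }
    # The line must begin with the quoted name, i.e. it must not be commented out.
    grep -Eq "^[[:space:]]*'Alias'," "$rc_file" \
        || { echo "'Alias' is not enabled"; failed=1; }
    return $failed
}

# Example (run as the git user):
#   check_gitolite_rc ~/.gitolite.rc && echo "gitolite rc looks complete"
```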

Configure git synchronization

Create a configuration file /var/www/arvados-api/current/config/arvados-clients.yml using the following template, filling in the appropriate values for your system.

  • For arvados_api_token, use the SystemRootToken value from your cluster configuration.
  • For gitolite_arvados_git_user_key, provide the public key you generated above, i.e., the contents of ~git/.ssh/id_rsa.pub.

production:
  gitolite_url: /var/lib/arvados/git/repositories/gitolite-admin.git
  gitolite_tmp: /var/lib/arvados/git
  arvados_api_host: ClusterID.example.com
  arvados_api_token: "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz"
  arvados_api_host_insecure: false
  gitolite_arvados_git_user_key: "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7aBIDAAgMQN16Pg6eHmvc+D+6TljwCGr4YGUBphSdVb25UyBCeAEgzqRiqy0IjQR2BLtSirXr+1SJAcQfBgI/jwR7FG+YIzJ4ND9JFEfcpq20FvWnMMQ6XD3y3xrZ1/h/RdBNwy4QCqjiXuxDpDB7VNP9/oeAzoATPZGhqjPfNS+RRVEQpC6BzZdsR+S838E53URguBOf9yrPwdHvosZn7VC0akeWQerHqaBIpSfDMtaM4+9s1Gdsz0iP85rtj/6U/K/XOuv2CZsuVZZ52nu3soHnEX2nx2IaXMS3L8Z+lfOXB2T6EaJgXF7Z9ME5K1tx9TSNTRcYCiKztXLNLSbp git@gitserver"

$ sudo chown git:git /var/www/arvados-api/current/config/arvados-clients.yml
$ sudo chmod og-rwx /var/www/arvados-api/current/config/arvados-clients.yml
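
A copy-and-paste mistake in the long key line is easy to miss. The sketch below compares the configured gitolite_arvados_git_user_key with the public key file. It assumes the value is double-quoted on a single line, as in the template, and the function name key_matches is hypothetical.

```shell
# Succeeds if the gitolite_arvados_git_user_key value in the YAML file
# equals the contents of the public key file.
# Assumes the value is double-quoted on one line, as in the template above.
key_matches() {
    yml=$1
    pub=$2
    configured=$(sed -n 's/^[[:space:]]*gitolite_arvados_git_user_key:[[:space:]]*"\(.*\)"[[:space:]]*$/\1/p' "$yml")
    [ -n "$configured" ] && [ "$configured" = "$(cat "$pub")" ]
}

# Example (run as the git user, which can read both files):
#   key_matches /var/www/arvados-api/current/config/arvados-clients.yml ~/.ssh/id_rsa.pub \
#       && echo "key matches"
```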

Test configuration

$ sudo -u git -i bash -c 'cd /var/www/arvados-api/current && bin/bundle exec script/arvados-git-sync.rb production'

Enable the synchronization script

The API server package includes a script that retrieves the current set of repository names and permissions from the API, writes them to arvadosaliases.pl in a format usable by gitolite, and triggers gitolite hooks which create new empty repositories if needed. This script should run every 2 to 5 minutes.

Create /etc/cron.d/arvados-git-sync with the following content:

*/5 * * * * git cd /var/www/arvados-api/current && bin/bundle exec script/arvados-git-sync.rb production
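
The entry above runs every 5 minutes. If you prefer a shorter interval within the recommended 2 to 5 minute range, the line can be generated as sketched here. The function name git_sync_cron_line is an assumption; the command itself is copied from the entry above.

```shell
# Print an /etc/cron.d entry that runs the sync script every N minutes.
# git_sync_cron_line is a hypothetical helper; N must be between 2 and 5.
git_sync_cron_line() {
    minutes=$1
    case $minutes in
        2|3|4|5) ;;
        *) echo "interval must be between 2 and 5 minutes" >&2; return 1 ;;
    esac
    printf '*/%s * * * * git cd /var/www/arvados-api/current && bin/bundle exec script/arvados-git-sync.rb production\n' "$minutes"
}

git_sync_cron_line 2
```

The fields are the usual cron.d ones: five time fields, the user to run as (git), then the command.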

Update config.yml

Edit the cluster config at config.yml to include the following settings.

    Services:
      GitSSH:
        ExternalURL: "ssh://git@git.ClusterID.example.com"
      GitHTTP:
        ExternalURL: https://git.ClusterID.example.com/
        InternalURLs:
          "http://localhost:9001": {}
    Git:
      GitCommand: /var/lib/arvados/git/gitolite/src/gitolite-shell
      GitoliteHome: /var/lib/arvados/git
      Repositories: /var/lib/arvados/git/repositories
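
YAML does not allow tab characters for indentation, and a stray tab pasted into config.yml is hard to see. The following sketch lists any lines containing a tab. The function name yaml_has_no_tabs is an assumption, and the check is deliberately stricter than YAML itself, since it flags tabs anywhere on a line.

```shell
# Succeeds if the file contains no tab characters; otherwise prints the
# offending lines with their line numbers and fails.
# yaml_has_no_tabs is a hypothetical helper name.
yaml_has_no_tabs() {
    tab=$(printf '\t')
    if grep -n "$tab" "$1"; then
        return 1
    fi
    return 0
}

# Example, with the path to your own cluster config file:
#   yaml_has_no_tabs config.yml && echo "no tabs found"
```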

Update nginx configuration

Use a text editor to create a new file /etc/nginx/conf.d/arvados-git.conf with the following configuration. Adjust server_name and the ssl_certificate and ssl_certificate_key paths for your system.

upstream arvados-git-httpd {
  server                  127.0.0.1:9001;
}
server {
  listen                  443 ssl;
  server_name             git.ClusterID.example.com;
  proxy_connect_timeout   90s;
  proxy_read_timeout      300s;

  ssl_certificate         /YOUR/PATH/TO/cert.pem;
  ssl_certificate_key     /YOUR/PATH/TO/cert.key;

  # The server needs to accept potentially large refpacks from push clients.
  client_max_body_size 128m;

  location  / {
    proxy_pass            http://arvados-git-httpd;
  }
}

Install the arvados-git-httpd package

The arvados-git-httpd package provides HTTP access, using Arvados authentication tokens instead of passwords. It must be installed on the system where your git repositories are stored.

Alma/Red Hat/Rocky

# dnf install arvados-git-httpd

Debian and Ubuntu

# apt-get --no-install-recommends install arvados-git-httpd

Restart the API server and controller

After adding the Git services to the Services section, make sure the cluster config file is up to date on the API server host, and restart the API server and controller processes to ensure the changes are applied.

# systemctl restart nginx arvados-controller

Confirm working installation

Create ‘testrepo’ in the Arvados database, replacing username with your own Arvados username.

~$ arv --format=uuid repository create --repository '{"name":"username/testrepo"}'

The arvados-git-sync cron job will notice the new repository record and create a repository on disk. Because the job runs on a timer (every 5 minutes with the cron entry shown earlier), you may have to wait a few minutes for the repository to show up.
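
Rather than checking by hand, you can poll until the repository appears. The helper below is a generic sketch: the name wait_for_path and its defaults (a 300 second timeout, matching the 5 minute timer, and a 10 second polling interval) are assumptions. The on-disk name of the new repository is not specified here, so the path in the example comment is a placeholder.

```shell
# Wait until PATH exists, checking every INTERVAL seconds, for at most
# TIMEOUT seconds. Returns 0 once the path exists, 1 on timeout.
# wait_for_path is a hypothetical helper name.
wait_for_path() {
    path=$1
    timeout=${2:-300}
    interval=${3:-10}
    waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Example with a placeholder directory name:
#   wait_for_path /var/lib/arvados/git/repositories/EXAMPLE.git && echo "repository created"
```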

SSH

Before cloning over SSH, go to Workbench, choose SSH Keys from the menu, and upload your public key. Arvados uses the public key to identify you when you access the git repo.

~$ git clone git@git.ClusterID.example.com:username/testrepo.git

HTTP

Set up git credential helpers as described in the “Install shell server” instructions, so that the git command uses your API token instead of prompting you for a username and password.

~$ git clone https://git.ClusterID.example.com/username/testrepo.git


The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.