Many Arvados services expose a health check endpoint at /_health/ping. The health check offers a simple way to determine whether a service can be reached and lets the service self-report any problems, which makes it suitable for integration into operational alerting systems.
To access health check endpoints, services must be configured with a management token.
Health check endpoints return a JSON object with a health field whose value is either OK or ERROR. On error, the object may also include an error field with additional information. Examples:
{
  "health": "OK"
}
{
  "health": "ERROR",
  "error": "Inverted polarity in the warp core"
}
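The following is a minimal sketch of how a monitoring script might call a ping endpoint and interpret the reply. It assumes the management token is sent as a Bearer token in the Authorization header; the function names parse_health and ping, the timeout value, and the fallback messages are hypothetical and chosen only for illustration.

```python
import json
import urllib.error
import urllib.request


def parse_health(body):
    """Return (is_ok, error_message) for a health check response body."""
    try:
        doc = json.loads(body)
    except ValueError as exc:
        return False, "invalid JSON: %s" % exc
    if not isinstance(doc, dict):
        return False, "response is not a JSON object"
    if doc.get("health") == "OK":
        return True, None
    # On error, the service may include an "error" field with details.
    return False, doc.get("error", "no error details reported")


def ping(base_url, management_token, timeout=5.0):
    """Query <base_url>/_health/ping and interpret the reply.

    Assumption: the management token is passed as a Bearer token.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/_health/ping",
        headers={"Authorization": "Bearer " + management_token},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return parse_health(response.read().decode("utf-8", "replace"))
    except urllib.error.HTTPError as exc:
        # The service answered, but with a non-success HTTP status.
        return False, "HTTP status %d" % exc.code
    except OSError as exc:
        # Covers URLError, refused connections, and timeouts.
        return False, "unreachable: %s" % exc
```

A call such as ping("http://keep0:25107", "xyzzy") would return a pair like (True, None) for a healthy service, or (False, message) when the service reports a problem or cannot be reached.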
The arvados-health service performs health checks on all configured services and returns a single value of OK or ERROR for the entire cluster. It exposes the endpoint /_health/all.
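The idea of collapsing many per-service results into one cluster-wide value can be sketched as follows. This is an illustration of the concept only, not the actual arvados-health implementation; the function name aggregate, the service labels, and the shape of the returned failure map are all hypothetical.

```python
def aggregate(results):
    """Collapse per-service results into one cluster-wide status.

    results maps a service label to an (is_ok, error_message) pair.
    Returns ("OK", {}) when every service is healthy, otherwise
    ("ERROR", {label: error_message, ...}) listing the failures.
    """
    failures = {
        label: message
        for label, (is_ok, message) in results.items()
        if not is_ok
    }
    return ("ERROR" if failures else "OK"), failures


# Hypothetical per-service results, keyed by "node/service".
status, failures = aggregate({
    "keep0/keepstore": (True, None),
    "api/arvados-controller": (False, "connection refused"),
})
```

In this sketch a single failing service is enough to mark the whole cluster as ERROR, which matches the single-value behaviour described above.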
The health check aggregator uses the NodeProfile section of the cluster-wide arvados.yml configuration file. For example:
Cluster:
  # The cluster uuid prefix
  zzzzz:
    ManagementToken: xyzzy
    NodeProfile:
      # For each node, the profile name corresponds to a
      # locally-resolvable hostname, and describes which Arvados
      # services are available on that machine.
      api:
        arvados-controller:
          Listen: :8000
        arvados-api-server:
          Listen: :8001
      manage:
        arvados-node-manager:
          Listen: :8002
      workbench:
        arvados-workbench:
          Listen: :8003
        arvados-ws:
          Listen: :8004
      keep:
        keep-web:
          Listen: :8005
        keepproxy:
          Listen: :8006
        keep-balance:
          Listen: :9005
      keep0:
        keepstore:
          Listen: :25107
      keep1:
        keepstore:
          Listen: :25107
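To make the relationship between the NodeProfile section and the individual ping endpoints concrete, the sketch below walks a NodeProfile mapping and lists one URL per service. It assumes that each profile name is a resolvable hostname (as the comment in the example says), that a Listen value of the form :PORT means that port on the named node, and that the services are reached over plain http. The function name ping_targets and the scheme parameter are hypothetical, and the mapping is a hand-copied excerpt of the example rather than a parsed arvados.yml.

```python
# Excerpt of the NodeProfile example above, written as a Python dict.
NODE_PROFILE = {
    "api": {
        "arvados-controller": {"Listen": ":8000"},
        "arvados-api-server": {"Listen": ":8001"},
    },
    "keep0": {
        "keepstore": {"Listen": ":25107"},
    },
}


def ping_targets(node_profile, scheme="http"):
    """Return a list of (host, service, url) for every listed service."""
    targets = []
    for host, services in node_profile.items():
        for service, settings in services.items():
            listen = settings["Listen"]
            # A value such as ":8000" carries only a port, so the
            # profile name is used as the hostname (an assumption).
            address = host + listen if listen.startswith(":") else listen
            url = "%s://%s/_health/ping" % (scheme, address)
            targets.append((host, service, url))
    return targets


for host, service, url in ping_targets(NODE_PROFILE):
    print(host, service, url)
```

Each resulting URL could then be queried with the management token to obtain the per-service health object.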
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.