Health check endpoints are found at /_health/ping
on many Arvados services. The purpose of the health check is to offer a simple method of determining if a service can be reached and allow the service to self-report any problems, suitable for integrating into operational alert systems.
To access health check endpoints, services must be configured with a management token .
Health check endpoints return a JSON object with the field health
. This has a value of either OK
or ERROR
. On error, it may also include a field error
with additional information. Examples:
{ "health": "OK" }
{ "health": "ERROR" "error": "Inverted polarity in the warp core" }
The service arvados-health
performs health checks on all configured services and returns a single value of OK
or ERROR
for the entire cluster. It exposes the endpoint /_health/all
.
The healthcheck aggregator uses the NodeProfile
section of the cluster-wide arvados.yml
configuration file. Here is an example.
Cluster: # The cluster uuid prefix zzzzz: ManagementToken: xyzzy NodeProfile: # For each node, the profile name corresponds to a # locally-resolvable hostname, and describes which Arvados # services are available on that machine. api: arvados-controller: Listen: :8000 arvados-api-server: Listen: :8001 manage: arvados-node-manager: Listen: :8002 workbench: arvados-workbench: Listen: :8003 arvados-ws: Listen: :8004 keep: keep-web: Listen: :8005 keepproxy: Listen: :8006 keep-balance: Listen: :9005 keep0: keepstore: Listen: :25107 keep1: keepstore: Listen: :25107
The content of this documentation is licensed under the
Creative
Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the
Apache License, Version 2.0.