Logs table management

This page aims to provide insight about managing the ever growing API Server’s logs table.

Logs table purpose & behavior

This database table currently serves three purposes:

  • It’s an audit log, permitting admins and users to look up the time and details of past changes to Arvados objects via arvados.v1.logs.* endpoints.
  • It’s a mechanism for passing cache-invalidation events, used by websocket servers, the Python SDK “events” library, and arvados-cwl-runner to detect when an object has changed.
  • It’s a staging area for stdout/stderr text coming from users’ containers, permitting users to see what their containers are doing while they are still running (i.e., before those text files are written to Keep).

As a result, this table grows indefinitely, even on sites where policy does not require an audit log; making backups, migrations, and upgrades unnecessarily slow and painful.

Configuration

To solve the problem mentioned above, the AuditLogs section of config.yml offers several options to limit the amount of log information stored on the table:

    AuditLogs:
      # Time to keep audit logs. (An audit log is a row added
      # to the "logs" table in the PostgreSQL database each time an
      # Arvados object is created, modified, or deleted.)
      #
      # Currently, websocket event notifications rely on audit logs, so
      # this should not be set lower than 5 minutes.
      MaxAge: 336h

      # Maximum number of log rows to delete in a single SQL transaction,
      # to prevent surprises and avoid bad database behavior
      # (especially the first time the cleanup job runs on an existing
      # cluster with a huge backlog) a maximum number of rows to
      # delete in a single transaction.
      #
      # If MaxDeleteBatch is 0, log entries will never be
      # deleted by Arvados. Cleanup can be done by an external process
      # without affecting any Arvados system processes, as long as very
      # recent (<5 minutes old) logs are not deleted.
      #
      # 100000 is a reasonable batch size for most sites.
      MaxDeleteBatch: 0

      # Attributes to suppress in events and audit logs.  Notably,
      # specifying {"manifest_text": {}} here typically makes the database
      # smaller and faster.
      #
      # Warning: Using any non-empty value here can have undesirable side
      # effects for any client or component that relies on event logs.
      # Use at your own risk.
      UnloggedAttributes: {}

Additional consideration

Depending on the local installation’s audit requirements, the cluster admins should plan for an external backup procedure before enabling this feature, as this information is not replicated anywhere else.


Previous: Preventing container reuse Next: Metadata vocabulary

The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.