Error handling

Kopf tracks the status of the handlers (except for the low-level event handlers), catches exceptions, and processes them for each handler.

The most recent (or final) exception is stored in the object’s status and reported via the object’s events.

Note

Keep in mind that Kubernetes events are often garbage-collected quickly, e.g. in less than 1 hour, so they are visible only shortly after they are added. For persistence, the errors are also stored in the object’s status.

Temporary errors

If a raised exception inherits from kopf.TemporaryError, the current handler is postponed until the next iteration, which can happen either immediately or after some delay:

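import kopf
from typing import Any

def is_data_ready() -> bool:
    # Hypothetical placeholder for an application-specific readiness check.
    return False

@kopf.on.create('kopfexamples')
def create_fn(spec: kopf.Spec, **_: Any) -> None:
    if not is_data_ready():
        # Ask Kopf to call this handler again in 60 seconds.
        raise kopf.TemporaryError("The data is not yet ready.", delay=60)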

In that case, there is no need to sleep in the handler explicitly. An explicit sleep would block any other events, causes, and generally any other handlers on the same object from being handled (such as deletion or parallel handlers/sub-handlers).

Note

Multiple handlers and sub-handlers are implemented via this kind of error: if there are handlers left after the current cycle, a special retriable error is raised, which marks the current cycle to be retried immediately; the retry then continues with the remaining handlers.

The only difference is that this special case produces fewer logs.

Permanent errors

If a raised exception inherits from kopf.PermanentError, the handler is considered non-retriable, non-recoverable, and permanently failed.

Use this when the domain logic of the application means that retrying is pointless because the situation will not improve over time:

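import datetime
import kopf
from typing import Any

@kopf.on.create('kopfexamples')
def create_fn(spec: kopf.Spec, **_: Any) -> None:
    valid_until = datetime.datetime.fromisoformat(spec['validUntil'])
    # Assumption: a timestamp without a UTC offset is meant to be in UTC;
    # comparing naive and aware datetimes would otherwise raise TypeError.
    if valid_until.tzinfo is None:
        valid_until = valid_until.replace(tzinfo=datetime.timezone.utc)
    if valid_until <= datetime.datetime.now(datetime.timezone.utc):
        raise kopf.PermanentError("The object is not valid anymore.")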

See also: Excluding handlers forever, to prevent handlers from being invoked for future change-sets even after the operator restarts.

Regular errors

Kopf assumes that any arbitrary errors (i.e. not kopf.TemporaryError and not kopf.PermanentError) are the environment’s issues and can self-resolve after some time.

As such, by default, Kopf retries handlers that raise arbitrary errors indefinitely, until the handlers either succeed or fail permanently.

The reaction to arbitrary errors can be configured:

import kopf
from typing import Any

@kopf.on.create('kopfexamples', errors=kopf.ErrorsMode.PERMANENT)
def create_fn(spec: kopf.Spec, **_: Any) -> None:
    raise Exception()

Possible values of errors are:

  • kopf.ErrorsMode.TEMPORARY (the default).

  • kopf.ErrorsMode.PERMANENT (prevent retries).

  • kopf.ErrorsMode.IGNORED (same as in the resource watching handlers).
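
The three modes can be summarised as a small decision function. The following is only a simplified, self-contained model of the behavior described above, not Kopf's implementation: the ErrorsMode enum, the two exception classes, and classify() are local stand-ins defined for illustration, and treating IGNORED as "log the error and do not retry" is an assumption.

```python
import enum


class ErrorsMode(enum.Enum):
    """Local stand-in for kopf.ErrorsMode (illustration only)."""
    IGNORED = enum.auto()
    TEMPORARY = enum.auto()
    PERMANENT = enum.auto()


class TemporaryError(Exception):
    """Local stand-in for kopf.TemporaryError."""


class PermanentError(Exception):
    """Local stand-in for kopf.PermanentError."""


def classify(exc: Exception, errors: ErrorsMode = ErrorsMode.TEMPORARY) -> str:
    """Return 'retry', 'fail', or 'ignore' for an exception raised by a handler."""
    # Explicit temporary/permanent errors are decided by their own type.
    if isinstance(exc, TemporaryError):
        return "retry"
    if isinstance(exc, PermanentError):
        return "fail"
    # Everything else is an "arbitrary" error: the errors mode decides.
    if errors is ErrorsMode.PERMANENT:
        return "fail"
    if errors is ErrorsMode.IGNORED:
        return "ignore"  # assumption: the error is logged and not retried
    return "retry"
```

In this model, the errors mode only matters for exceptions that are neither temporary nor permanent, which mirrors the statement above that the mode configures the reaction to arbitrary errors.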

Timeouts

The overall runtime of the handler can be limited:

import kopf
from typing import Any

@kopf.on.create('kopfexamples', timeout=60*60)
def create_fn(spec: kopf.Spec, **_: Any) -> None:
    raise kopf.TemporaryError(delay=60)

If the handler has not succeeded within this time, it is considered to have fatally failed.

If the handler is an async coroutine and it is still running at that moment, an asyncio.TimeoutError is raised; there is no equivalent way of forcibly terminating synchronous functions.

By default, there is no timeout, so the retries continue forever.

Retries

The number of retries can be limited too:

import kopf
from typing import Any

@kopf.on.create('kopfexamples', retries=3)
def create_fn(spec: kopf.Spec, **_: Any) -> None:
    raise Exception()

Once the number of retries is reached, the handler fails permanently.

By default, there is no limit, so the retries continue forever.
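
How the timeout and retries limits interact can be sketched with a small helper. This is a hedged model rather than Kopf's actual code: has_given_up() and its parameter names are hypothetical, and the exact boundary conditions (>= rather than >) are an assumption.

```python
from typing import Optional


def has_given_up(
    attempts: int,
    elapsed: float,
    retries: Optional[int] = None,
    timeout: Optional[float] = None,
) -> bool:
    """Decide whether a handler should stop being retried.

    attempts: how many times the handler has already been tried.
    elapsed:  seconds since the handler was first started.
    A limit of None means "no limit", matching the defaults described above.
    """
    if retries is not None and attempts >= retries:
        return True
    if timeout is not None and elapsed >= timeout:
        return True
    return False
```

With both limits left at None, the helper never reports giving up, which corresponds to the default of retrying forever. When both are set, whichever limit is reached first ends the retries.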

Backoff

The interval between retries on arbitrary errors can be configured. This is the time given to the external environment to recover so that the handler execution can succeed:

import kopf
from typing import Any

@kopf.on.create('kopfexamples', backoff=30)
def create_fn(spec: kopf.Spec, **_: Any) -> None:
    raise Exception()

The default is 60 seconds.

Note

This only affects arbitrary errors. When kopf.TemporaryError is raised explicitly, the delay is configured with its delay=... argument.
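
The distinction between backoff and delay can be illustrated with a minimal sketch. It is a model under stated assumptions, not Kopf's implementation: the TemporaryError class here is a local stand-in for kopf.TemporaryError, and next_delay() is a hypothetical helper.

```python
DEFAULT_BACKOFF = 60.0  # seconds; the default stated above


class TemporaryError(Exception):
    """Local stand-in for kopf.TemporaryError, carrying its own delay."""

    def __init__(self, message: str, delay: float) -> None:
        super().__init__(message)
        self.delay = delay


def next_delay(exc: Exception, backoff: float = DEFAULT_BACKOFF) -> float:
    """Return how long to wait before the next attempt, in seconds."""
    # An explicit temporary error brings its own delay; backoff is not used.
    if isinstance(exc, TemporaryError):
        return exc.delay
    # Any other (arbitrary) error waits for the handler's backoff interval.
    return backoff
```

For example, with backoff=30, an arbitrary RuntimeError would be retried after 30 seconds in this model, while a TemporaryError raised with delay=5 would be retried after 5 seconds regardless of the backoff.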