States can encounter runtime errors for various reasons:

  • Transient errors, such as network outages and congestion
  • JSON path resolution errors

By default, when a state encounters a runtime error, the entire ZIS flow fails.

Action and Map states support error handling using a Catch block. You can use a Catch block to avoid failing the entire ZIS flow when a state encounters an error. See Using a Catch block.

Specific runtime errors in specific states can also trigger a retry of a failed ZIS flow. See ZIS flow retry logic.

Using a Catch block

A Catch block specifies a fallback state to run if an Action or Map state encounters a runtime error. Example:

"Zendesk.GetTickets": {  "Type": "Action",  "ActionName": "zis:INTEGRATION:action:zendesk.get_tickets",  "Catch": [    {      "ErrorEquals": ["States.ALL"],      "Next": "AnotherState",      "ResultPath": "$.error_response"    }  ],  "Next": "NextState"}

Supported properties for a Catch block

Objects in the Catch array support the following properties.

| Name | Type | Mandatory | Description |
| --- | --- | --- | --- |
| ErrorEquals | array of strings | true | Array of error names. The only valid value is "States.ALL", which matches all error names |
| Next | string | true | Fallback state to run if the Action or Map state encounters a runtime error and can't retry |
| ResultPath | string | false | Reference path used to store the state's output. Later states of the ZIS flow can access the output at this path. Defaults to "$", which replaces the state's input with its output |
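
To make the effect of ResultPath concrete, the following Python sketch models where a state's output lands for a given path. It is an illustration only, not ZIS code: `apply_result_path` is a hypothetical helper, it handles only "$" and simple dotted paths such as "$.error_response", and it assumes that missing parent objects are created.

```python
import copy

def apply_result_path(state_input, output, result_path="$"):
    """Model where a state's output lands for a given ResultPath.

    Assumption: only "$" and simple dotted paths such as "$.a.b" are handled.
    """
    if result_path == "$":
        # The default: the output replaces the state's input entirely.
        return copy.deepcopy(output)
    if not result_path.startswith("$."):
        raise ValueError(f"unsupported path: {result_path!r}")
    result = copy.deepcopy(state_input)
    keys = result_path[2:].split(".")
    node = result
    for key in keys[:-1]:
        # Assumed behavior: missing parent objects are created on the way.
        node = node.setdefault(key, {})
    node[keys[-1]] = copy.deepcopy(output)
    return result
```

With the input `{"ticket": {"id": 1}}` and the path "$.error_response", the sketch keeps the ticket data and adds the output under `error_response`. With the default "$", only the output remains.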

Catching error response codes

If a custom action's HTTP request receives a non-2xx HTTP response status code, its Action state returns a runtime error. The state also passes the following error message to the $.Cause reference path:

external action failed due to status code: {http_status_code}

For example, if a custom action's request receives a 404 HTTP status code, the $.Cause path contains the following message:

external action failed due to status code: 404

You can use a Catch block and Choice state to conditionally run different flow states based on the $.Cause path's message. For an example, see Manually retrying a ZIS flow.
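
If you inspect captured error output outside of ZIS, for example in a test script, you can extract the status code from the message. The following Python sketch assumes the message has exactly the format shown above. `status_code_from_cause` is a hypothetical helper and not part of ZIS.

```python
import re
from typing import Optional

# Message format quoted in this section; the helper name is hypothetical.
_CAUSE_PATTERN = re.compile(r"^external action failed due to status code: (\d{3})$")

def status_code_from_cause(cause: str) -> Optional[int]:
    """Return the HTTP status code embedded in a $.Cause message, or None."""
    match = _CAUSE_PATTERN.match(cause.strip())
    return int(match.group(1)) if match else None
```

The helper returns None for any message that doesn't match the format, so callers can treat unknown causes separately from known status codes.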

ZIS flow retry logic

The following table contains the states and runtime errors that trigger a retry of a failed ZIS flow. The interval between retries varies based on the state and error.

| State | Runtime error | Interval between retries |
| --- | --- | --- |
| Action | 429 rate limit error | See Retrying a ZIS flow after rate limiting |
| Action | 5xx server error | See Retrying a ZIS flow after a server error |
| Action | Any error other than the following: 429 rate limit error; 5xx server error; Action parse error; Connection missing error; Action invalid domain error; Action invalid request | Retry runs immediately |
| Map | A child state encounters a runtime error that triggers a flow retry | Uses the retry interval for the child state and error |

A Succeed or Fail state won't trigger a retry of a ZIS flow, even if it encounters a runtime error.

ZIS attempts to run a flow at most four times: the initial attempt and up to three retry attempts. During a retry, the entire flow runs again from the beginning. Ensure your use case and flow logic account for this.

Using a Catch block to capture an error that ZIS normally retries overrides ZIS's automatic retry behavior. In this case, you must manually retry the request. For example, see Manually retrying a ZIS flow.

Retrying a ZIS flow after rate limiting

A 429 HTTP status code means "too many requests." When a web server returns an error with this code, it means rate limiting has kicked in because the client is sending too many requests too quickly.

Some servers include a Retry-After header in responses with a 429 error. This header specifies how long you should wait before retrying the request. The header may provide this interval as a number of seconds to wait or a date and time after which you can retry the call.

A custom action in an Action state can send an HTTP request. If a custom action's HTTP request receives a 429 HTTP response status code and the flow fails, the flow will retry based on the Retry-After interval.

| Retry-After interval | Flow retry behavior |
| --- | --- |
| Less than 120 seconds | Retry after the Retry-After interval |
| 120 seconds or greater | Retry after 120 seconds |
| No Retry-After header | Retry after 30–35 seconds. If the retry fails, attempt a second retry after a further 60–65 seconds. If the second retry fails, attempt a third and final retry after a further 120–125 seconds. |
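
The rules in the table can be modeled with a short Python sketch. It is an approximation of the documented behavior, not ZIS's implementation. `retry_delay_seconds` is a hypothetical helper. The sketch assumes the 30–35, 60–65, and 120–125 second ranges are random waits within those bounds, and that a Retry-After date in the past means no wait.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

MAX_RETRY_AFTER_SECONDS = 120
# (min, max) wait in seconds before retry 1, 2 and 3 when no header is present.
FALLBACK_WINDOWS = [(30, 35), (60, 65), (120, 125)]

def retry_delay_seconds(retry_after: Optional[str], retry_number: int,
                        now: Optional[datetime] = None) -> float:
    """Seconds to wait before retry `retry_number` (1-3) after a 429."""
    if not 1 <= retry_number <= len(FALLBACK_WINDOWS):
        raise ValueError("retry_number must be 1, 2 or 3")
    if retry_after is None:
        # No header: assumed to be a random wait inside the documented range.
        low, high = FALLBACK_WINDOWS[retry_number - 1]
        return random.uniform(low, high)
    value = retry_after.strip()
    if value.isdigit():
        seconds = float(value)  # header given as a number of seconds
    else:
        # Header given as an HTTP date; measure the gap from `now`.
        now = now or datetime.now(timezone.utc)
        seconds = (parsedate_to_datetime(value) - now).total_seconds()
    # Cap at 120 seconds; a past date is assumed to mean "no wait".
    return min(max(seconds, 0.0), MAX_RETRY_AFTER_SECONDS)
```

For example, a Retry-After value of "45" yields a 45-second wait, while "600" is capped at 120 seconds.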

Retrying a ZIS flow after a server error

A 5xx HTTP status code means something has gone wrong with the responding web server.

A custom action in an Action state can send an HTTP request. If a custom action's HTTP request receives a 5xx HTTP response status code and the flow fails, the flow will retry after 30–35 seconds. If the retry fails, ZIS attempts a second retry after a further 60–65 seconds. If the second retry fails, ZIS attempts a third and final retry after a further 120–125 seconds.
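
Because each interval is counted from the previous failed attempt, the waits add up. The following Python sketch sums them to show how long after the first failure each retry can start. It assumes each retried run fails immediately, so it ignores the time the flow takes to run. `cumulative_retry_offsets` is a hypothetical helper.

```python
# (min, max) wait in seconds before each retry, as described above.
RETRY_WINDOWS = [(30, 35), (60, 65), (120, 125)]

def cumulative_retry_offsets(windows=RETRY_WINDOWS):
    """Return (earliest, latest) seconds since the first failure for each retry.

    Assumption: each retried run fails instantly, so run time is ignored.
    """
    offsets = []
    earliest = latest = 0
    for low, high in windows:
        earliest += low
        latest += high
        offsets.append((earliest, latest))
    return offsets

print(cumulative_retry_offsets())  # [(30, 35), (90, 100), (210, 225)]
```

Under these assumptions, the third and final retry starts between 210 and 225 seconds after the initial failure.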

Manually retrying a ZIS flow

To manually retry a flow for other types of errors, use a Catch block in an Action state to catch the error. Then use a Choice state to check the $.Cause reference path for a specific error code.

For example, your workflow might look like this:

  • GET an object
  • Catch and check for a 404 error, indicating the object doesn't exist
  • Create the object
  • Resume the rest of your workflow

Example:

{
  "StartAt": "Zendesk.DoSomething",
  "States": {
    "Zendesk.DoSomething": {
      "Type": "Action",
      "ActionName": "zis:YOUR_INTEGRATION_NAME:action:zendesk.YOUR_ACTION_NAME",
      "Parameters": {
        "ticketId.$": "{{$.input.ticket.id}}"
      },
      "Catch": [
        {
          "Comment": "ZIS only supports catching all error types, i.e. States.ALL",
          "ErrorEquals": ["States.ALL"],
          "Next": "CheckErrorType"
        }
      ],
      "ResultPath": "$.do_something_result",
      "End": true
    },
    "CheckErrorType": {
      "Comment": "Checks whether the error caught is a 404",
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.Cause",
          "StringEquals": "external action failed due to status code: 404",
          "Next": "log.errorCaught.404"
        }
      ],
      "Default": "log.errorCaught.other"
    },
    "log.errorCaught.404": {
      "Comment": "Use this branch to handle 404 error",
      "Type": "Succeed",
      "Message": "I caught a 404 error"
    },
    "log.errorCaught.other": {
      "Comment": "Use this branch to handle other errors",
      "Type": "Succeed",
      "Message": "I caught a non-404 error"
    }
  }
}
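
A flow like this one depends on every Catch, Choice, and Default target naming a state that exists. The following Python sketch is a hypothetical local check, not a ZIS tool: it loads a flow definition as a dictionary and lists transitions that point at undefined states. It inspects only the property names used in the example above.

```python
def find_dangling_targets(flow: dict) -> list:
    """Return (source, missing_target) pairs for transitions to undefined states."""
    states = flow.get("States", {})
    dangling = []

    def check(source, target):
        # Record the transition only when a target is set but not defined.
        if target is not None and target not in states:
            dangling.append((source, target))

    check("StartAt", flow.get("StartAt"))
    for name, state in states.items():
        check(name, state.get("Next"))
        check(name, state.get("Default"))
        for catcher in state.get("Catch", []):
            check(name, catcher.get("Next"))
        for choice in state.get("Choices", []):
            check(name, choice.get("Next"))
    return dangling
```

An empty list means every referenced state is defined. For example, a misspelled "Next" value in a Catch block would appear in the result as a pair of the state name and the unknown target.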

Flow circuit breaker

If more than 3,000 events occur for a flow within a 10-minute period, the circuit breaker triggers when either of the following limits is reached:

  • More than 50% of those events lead to flows failing with a retryable error
  • More than 10,000 retryable errors occur within that period

If triggered, the circuit breaker will drop subsequent events for 30 seconds.

After 30 seconds, ZIS processes the next event for the flow. If the flow succeeds, the error thresholds are reset. If it fails, ZIS waits another 30 seconds and repeats the cycle until the flow succeeds.
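
The trigger conditions can be expressed as a small check. The following Python sketch is a model of the documented thresholds, not ZIS's implementation. `breaker_should_trip` is a hypothetical helper, and the counts are assumed to cover a single 10-minute window. Failed events and retryable errors are passed separately because one event can produce more than one retryable error.

```python
EVENT_THRESHOLD = 3_000            # events within the 10-minute window
FAILURE_RATIO_THRESHOLD = 0.5      # share of events failing with a retryable error
RETRYABLE_ERROR_THRESHOLD = 10_000 # retryable errors within the window

def breaker_should_trip(events: int, failed_events: int, retryable_errors: int) -> bool:
    """Return True if the documented circuit breaker limits are exceeded."""
    if events <= EVENT_THRESHOLD:
        # The breaker only applies once more than 3,000 events have occurred.
        return False
    too_many_failed = failed_events / events > FAILURE_RATIO_THRESHOLD
    too_many_errors = retryable_errors > RETRYABLE_ERROR_THRESHOLD
    return too_many_failed or too_many_errors
```

For example, 3,001 events with 1,501 failed events exceeds the 50% limit, while 3,000 events never trips the breaker regardless of the failure count.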