Error Handling

Errors can occur at many points in the lifecycle of a request. In a networked environment, the common way to handle them is to retry on the client side. A retry mechanism makes your implementation robust against both expected and unexpected failures, and reduces the operational effort and cost for your developers.

Best practices for 503 API errors

This section shows how to use truncated exponential backoff as a retry mechanism without generating excessive load on the system.

When you receive a 503 from our APIs, your request was not accepted and you need to retry it. According to RFC 7231, the 503 (Service Unavailable) status code indicates that the server is currently unable to handle the request, a condition that will likely be alleviated after some delay. In this case, the Retry-After header in the response gives the suggested minimum amount of time the client should wait before retrying.

We highly recommend implementing a truncated exponential backoff mechanism, so that your client retries a failed request with increasing delays between attempts. Clients that retry without waiting can put a heavy load on the zDirect API servers. Backing off gives the system room to recover instead of adding load to it.
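As a minimal sketch of reading the Retry-After header, the Python helper below converts the header value into a number of seconds. The function name `retry_after_seconds` is hypothetical and not part of any zDirect client. This guide treats the value as a number of seconds; RFC 7231 also allows an HTTP-date, so the sketch accepts both forms as a precaution, and it assumes that a missing or unparseable header should count as zero seconds.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: Optional[str],
                        now: Optional[datetime] = None) -> float:
    """Return the delay requested by a Retry-After header, in seconds.

    Hypothetical helper: returns 0.0 when the header is missing or
    cannot be parsed (an assumption, not documented API behaviour).
    """
    if not header_value:
        return 0.0
    value = header_value.strip()
    # Form 1: a non-negative integer number of seconds, e.g. "120".
    if value.isdigit():
        return float(value)
    # Form 2: an HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return 0.0
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no further waiting is requested.
    return max(0.0, (when - now).total_seconds())
```

The optional `now` argument exists only so that the date form can be checked against a fixed clock.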

Example Algorithm

In short, an exponential backoff algorithm retries a failed request with exponentially increasing waiting times between retries, up to a maximum waiting time and a maximum number of retries.

1. Send the request to the API.
2. If the request fails with a 503 status code, parse the Retry-After header in the response. It tells you the minimum amount of time, in seconds, that you need to wait.
3. Start counting the number of retries from 1.
4. Wait for the waiting time, then retry the request. Calculate the waiting time as: waiting-time = max(<value-from-retry-after>, (2^retries) seconds + random_number_milliseconds).
5. If the request fails again, increase the retry counter (retries = retries + 1) and wait again, as in step 4.
6. Continue this process while the waiting time is smaller than the maximum waiting time of 64 seconds.
7. Once the client has reached the maximum waiting time, it no longer needs to increase the backoff time. It only needs to wait for the maximum waiting time before each retry, for a finite number of retries.
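The waiting-time calculation from the steps above can be sketched in Python as follows. The names `waiting_time` and `MAX_WAIT_SECONDS` are illustrative, and the `jitter_ms` parameter exists only to make the result reproducible. One assumption is made explicit: the 64-second cap is applied to the backoff term only, so a larger Retry-After value is still honoured.

```python
import random
from typing import Optional

MAX_WAIT_SECONDS = 64.0  # maximum waiting time from the steps above


def waiting_time(retries: int,
                 retry_after: float = 0.0,
                 jitter_ms: Optional[int] = None) -> float:
    """Return the number of seconds to wait before retry number `retries`."""
    if jitter_ms is None:
        # Positive random number below 1000 milliseconds.
        jitter_ms = random.randint(1, 999)
    # Exponential term plus jitter, truncated at the maximum waiting time.
    backoff = min(2 ** retries + jitter_ms / 1000, MAX_WAIT_SECONDS)
    # Assumption: never wait less than the server asked for.
    return max(retry_after, backoff)
```

With a fixed jitter of 500 ms, `waiting_time(1, jitter_ms=500)` gives 2.5 seconds, `waiting_time(3, retry_after=30, jitter_ms=500)` gives 30 seconds because the Retry-After value dominates, and `waiting_time(6, jitter_ms=500)` is truncated to 64 seconds.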

Note

The client should wait at least the number of seconds defined in the "Retry-After" header.
The client can continue retrying after it has reached the maximum waiting time. Retries after this point do not need to continue increasing backoff time.
random_number_milliseconds is a positive random number below 1000 milliseconds. Adding this jitter helps prevent all clients from retrying at once in a synchronized manner. Recalculate this random number on each retry.
The maximum number of retries should be set to a certain threshold, as clients should not keep retrying indefinitely.
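Putting the steps and notes together, a complete retry loop might look like the Python sketch below. It is not an official client: `call_with_backoff`, the `send` callable and the value of `MAX_RETRIES` are all assumptions made for illustration. `send` stands in for whatever HTTP client you use and is assumed to return the status code together with a mapping of response headers keyed exactly as "Retry-After", with the value given in seconds.

```python
import random
import time
from typing import Callable, Mapping, Tuple

MAX_WAIT_SECONDS = 64.0  # maximum waiting time from the algorithm
MAX_RETRIES = 10         # assumed threshold; pick one that suits your client

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def call_with_backoff(send: Callable[[], Response],
                      sleep: Callable[[float], None] = time.sleep,
                      max_retries: int = MAX_RETRIES) -> Response:
    """Call send() and retry on 503 using truncated exponential backoff."""
    response = send()
    retries = 0
    while response[0] == 503 and retries < max_retries:
        retries += 1  # the first retry is counted as 1
        raw = response[1].get("Retry-After", "").strip()
        # Assumption: a missing or non-numeric header means no minimum wait.
        retry_after = float(raw) if raw.isdigit() else 0.0
        # Jitter is recalculated on every retry.
        jitter = random.randint(1, 999) / 1000
        # Stop growing the backoff once the maximum waiting time is reached.
        backoff = min(2 ** retries + jitter, MAX_WAIT_SECONDS)
        sleep(max(retry_after, backoff))
        response = send()
    # Either a non-503 response, or the last 503 after max_retries retries.
    return response
```

The caller still has to check the returned status code: if it is 503, the retry budget was exhausted. Passing `sleep` as a parameter is a design choice that keeps the loop testable without real delays.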

If the failure persists after the retries, feel free to reach out to our support.

Contact Support