Operational Experience

Heartbeat, Queueing, and Circuit Breaking

Webhook listeners, like any other piece of software, will eventually become unstable or unavailable due to network or system issues. Some webhook providers, such as Bolt, Aftership, Bitbucket, and Square, recognize availability as an inevitable problem and provide resiliency features to reduce the burden on webhook consumers. Examples of interesting features include:

  • Heartbeat checks: Webhook providers regularly reach out to webhook listeners to check their status. If a listener is unavailable, the provider takes action, such as queueing notifications or sending alerts to system admins, until the listener recovers. This feature is especially helpful for webhooks that send messages infrequently, where an outage could otherwise go unnoticed for a long time.

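    app.post('/webhook', (req, res) => {
      // Assumes earlier middleware populates req.rawBody; Express does not set it by default.
      if (!req.rawBody) {
        // Empty body = heartbeat check. Return 200 or 204 (No Content).
        res.status(204).send();
      } else {
        // ... handle the webhook event
      }
    });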
    
  • Queueing: Webhook providers can keep a history of the webhook events they have sent and let consumers request redelivery of messages after failures or unavailability. Queueing can be triggered by listener errors or used in conjunction with heartbeat checks.

    Figure: Webhook events and retry mechanism

  • Circuit breaking: Some webhook providers implement circuit-breaking logic that throttles or pauses requests to a webhook listener once its error rate or response time crosses a threshold.

    Figure: Webhook circuit breaking
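
To make the circuit-breaking idea more concrete, here is a minimal sketch of how a provider might track the health of a single listener. The class name, option names, and thresholds are all hypothetical and not taken from any specific provider. It counts consecutive failures only; a response that exceeds a time limit could simply be reported as a failure by the caller.

```javascript
// Hypothetical provider-side circuit breaker for one webhook listener.
// Names and default thresholds are illustrative assumptions.
class WebhookCircuitBreaker {
  constructor({ failureThreshold = 3, cooldownMs = 30000, now = () => Date.now() } = {}) {
    this.failureThreshold = failureThreshold; // consecutive failures before opening
    this.cooldownMs = cooldownMs;             // how long to hold deliveries once open
    this.now = now;                           // injectable clock, useful for testing
    this.failures = 0;
    this.openedAt = null;                     // null means the circuit is closed
  }

  // 'closed': deliver normally. 'open': hold deliveries.
  // 'half-open': cooldown has elapsed, so a trial delivery is allowed.
  get state() {
    if (this.openedAt === null) return 'closed';
    return this.now() - this.openedAt >= this.cooldownMs ? 'half-open' : 'open';
  }

  canSend() {
    return this.state !== 'open';
  }

  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure() {
    this.failures += 1;
    if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
      this.openedAt = this.now(); // open (or reopen) and restart the cooldown
    }
  }
}

// Usage with a fake clock so the example does not need to wait.
let clock = 0;
const breaker = new WebhookCircuitBreaker({
  failureThreshold: 2,
  cooldownMs: 1000,
  now: () => clock,
});
breaker.recordFailure();
breaker.recordFailure();
console.log(breaker.state); // 'open'
clock = 1000;
console.log(breaker.state); // 'half-open'
breaker.recordSuccess();
console.log(breaker.state); // 'closed'
```

While the circuit is open, a provider would typically hold or queue notifications rather than drop them, which is where queueing and circuit breaking complement each other.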

These features help consumers keep their services resilient and recover from issues faster, without losing messages.
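
As an illustration of how queueing can prevent message loss, the sketch below shows one way a provider might hold undelivered events and replay them in order once the listener is back. The `DeliveryQueue` class and the synchronous `deliver` callback are assumptions made for brevity; a real provider would make asynchronous HTTP calls and persist the queue.

```javascript
// Hypothetical provider-side queue that keeps undelivered events for replay.
class DeliveryQueue {
  constructor(deliver) {
    // deliver(event) returns true when the listener acknowledged the event.
    // It is synchronous here only to keep the sketch short.
    this.deliver = deliver;
    this.pending = []; // events awaiting successful delivery, oldest first
  }

  // Deliver immediately when possible. If older events are still pending,
  // queue the new one behind them so ordering is preserved.
  send(event) {
    if (this.pending.length > 0 || !this.deliver(event)) {
      this.pending.push(event);
    }
  }

  // Redeliver pending events in order, stopping at the first failure.
  // Returns the number of events still waiting.
  replay() {
    while (this.pending.length > 0) {
      if (!this.deliver(this.pending[0])) break;
      this.pending.shift();
    }
    return this.pending.length;
  }
}

// Usage: the listener is down for two events, then recovers.
let listenerUp = false;
const received = [];
const queue = new DeliveryQueue((event) => {
  if (!listenerUp) return false;
  received.push(event.id);
  return true;
});

queue.send({ id: 'evt_1' });
queue.send({ id: 'evt_2' });
listenerUp = true; // e.g. a heartbeat check succeeds again
queue.replay();
console.log(received); // [ 'evt_1', 'evt_2' ]
```

Because a replayed event may reach a listener that already processed it, consumers generally benefit from handling events idempotently.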
