
API Integrations That Actually Hold Up in Production

Rate limits, retries, webhooks, and error handling — the unglamorous work that separates a demo-ready integration from a production-ready one.

Infonza Innovations·January 22, 2026·8 min read

Every API integration works in the demo. The test data is clean, the endpoints return 200s, and everything looks smooth. Then you go to production and discover that the third-party API times out under load, returns malformed JSON for certain edge cases, and has rate limits that nobody mentioned in the documentation.

Rate Limits Are Not Suggestions

We've inherited integrations that worked fine until the client ran a batch job that triggered 2,000 API calls in a minute. The third-party API throttled them, the integration had no retry logic, and silent failures corrupted data for hours before anyone noticed.

Every integration we build now has explicit rate limit handling. We track call counts with Redis, implement token bucket algorithms for sustained high-volume calls, and always build retry logic with exponential backoff.
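As a rough illustration of the token bucket idea, here is a minimal in-process sketch. It is not the Redis-backed version described above: the class name `TokenBucket`, its parameters, and the injectable `clock` are all assumptions made for this example, and a shared store would be needed once more than one worker calls the same API.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled continuously at `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock  # injectable so the bucket can be tested without sleeping
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.updated = clock()

    def _refill(self):
        # Add the tokens earned since the last update, capped at capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self, cost=1.0):
        """Take `cost` tokens if available; otherwise return False without blocking."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

    def wait_time(self, cost=1.0):
        """Seconds until `cost` tokens will be available (0.0 if available now)."""
        self._refill()
        missing = cost - self.tokens
        return max(0.0, missing / self.rate)
```

A caller would check `try_acquire()` before each outbound request and, when it returns False, sleep for `wait_time()` or push the job back onto a queue. The capacity sets how large a burst is tolerated; the rate sets the sustained throughput.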

Design for Failure, Not Success

The happy path is easy to build. What happens when the API returns a 503? What happens when it returns a 200 but the response body is empty? What happens when a required field is null? We've seen all of these from APIs that were supposedly well-documented.
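The three cases above (a bad status, an empty 200, a null required field) can all be caught by one validation step before the data goes anywhere else. The sketch below is one possible shape for that check; `BadResponse`, `parse_payload`, and the default required field `"id"` are hypothetical names chosen for illustration.

```python
import json


class BadResponse(Exception):
    """Raised when a response cannot be used, even if the status looked fine."""


def parse_payload(status, body, required=("id",)):
    """Return the decoded JSON object, or raise BadResponse explaining why not."""
    if not 200 <= status < 300:
        raise BadResponse(f"unexpected status {status}")
    # A 200 with an empty or whitespace-only body is still a failure.
    if not body or not body.strip():
        raise BadResponse("empty response body")
    try:
        data = json.loads(body)
    except json.JSONDecodeError as exc:
        raise BadResponse(f"malformed JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise BadResponse("expected a JSON object")
    # Treat a missing field and an explicit null the same way.
    for field in required:
        if data.get(field) is None:
            raise BadResponse(f"required field {field!r} is missing or null")
    return data
```

Raising a single exception type keeps the calling code simple: anything that gets past `parse_payload` is safe to store, and anything that does not is logged and never written.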

Our standard practice: every API call is wrapped in error handling that distinguishes between retryable and non-retryable failures. Network errors and 5xx responses get retried, as does a 429, which is the API asking you to slow down rather than rejecting the request. Other 4xx responses get logged with the full request payload and surfaced to the user with a meaningful message.
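A minimal sketch of that split, combined with exponential backoff, might look like the following. The names `ApiError`, `is_retryable`, and `call_with_retry` are hypothetical, as are the default delays. One assumption goes beyond the text: 408 and 429 are treated as retryable even though they are 4xx codes, because they signal a timeout and throttling rather than a bad request. The random "full jitter" on the delay is likewise a common addition, not something the article prescribes.

```python
import random
import time


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status, message=""):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status


def is_retryable(exc):
    # Network-level failures are worth another attempt.
    if isinstance(exc, (ConnectionError, TimeoutError)):
        return True
    if isinstance(exc, ApiError):
        # 5xx is the server's problem; 408 and 429 are treated as retryable (assumption).
        return exc.status >= 500 or exc.status in (408, 429)
    return False


def call_with_retry(func, max_attempts=5, base_delay=0.5, max_delay=30.0,
                    sleep=time.sleep):
    """Call func(); retry retryable failures with capped exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Give up immediately on non-retryable errors or on the last attempt.
            if not is_retryable(exc) or attempt == max_attempts:
                raise
            # The delay ceiling doubles each attempt: 0.5s, 1s, 2s, ... up to max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Sleep a random amount up to the ceiling so clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

Non-retryable errors propagate on the first attempt, so the caller's logging and user-facing message run immediately instead of after several pointless retries.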

Webhooks Are More Reliable Than Polling

If the API supports webhooks, use them. Polling is expensive, laggy, and burns through rate limits faster than anything else. Webhooks are event-driven, immediate, and kind to API quotas.

The catch: webhook endpoints need to be idempotent. Third-party services will deliver the same webhook multiple times — intentionally, for reliability. Your handler needs to detect and ignore duplicates. We use a processed event ID table to handle this.
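One way to sketch a processed event ID table is with SQLite from the Python standard library. The table name `processed_events`, the functions `open_store` and `handle_webhook`, and the assumption that every delivery carries a unique `event_id` are all illustrative; a production version would more likely sit on the application's main database.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the table that records which events were already handled."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events ("
        "event_id TEXT PRIMARY KEY, "
        "received_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def handle_webhook(conn, event_id, payload, process):
    """Run process(payload) at most once per event_id.

    Returns True if the event was processed, False if it was a duplicate.
    """
    with conn:  # one transaction: commits on success, rolls back if process() raises
        try:
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)", (event_id,)
            )
        except sqlite3.IntegrityError:
            # The primary key already exists, so this delivery is a repeat.
            return False
        process(payload)
    return True
```

Recording the ID and doing the work in the same transaction means a handler that crashes halfway leaves no row behind, so the sender's next redelivery is processed rather than discarded as a duplicate.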

Document What the Docs Don't Tell You

Every integration has quirks that aren't in the official documentation. Specific error codes that only appear in edge cases. Undocumented rate limits on specific endpoints. Response fields that are sometimes null and sometimes missing entirely.

We maintain an internal integration notes document for every API we work with — a running log of everything we've discovered that the official docs don't cover. It saves hours on every subsequent project that touches the same API.

Monitor Everything

Integration health isn't something to check only after something breaks. Set up monitoring for API response times, error rates, and rate limit consumption. Alert when error rates spike or when you're approaching 80% of your rate limit. Silent failures are the most expensive kind.
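The alert rules above can be sketched as a small rolling-window check. The class `IntegrationHealth`, the 100-call window, and the 5% error-rate and 2-second latency thresholds are assumptions chosen for the example; only the 80% rate limit threshold comes from the text. In practice these numbers would be fed into whatever metrics and alerting system is already in place.

```python
from collections import deque


class IntegrationHealth:
    """Tracks recent call outcomes and reports which alert conditions are met."""

    def __init__(self, rate_limit, window=100, max_error_rate=0.05,
                 max_avg_latency=2.0, quota_warn=0.8):
        self.rate_limit = rate_limit            # calls allowed per period by the API
        self.max_error_rate = max_error_rate    # assumed threshold
        self.max_avg_latency = max_avg_latency  # assumed threshold, in seconds
        self.quota_warn = quota_warn            # warn at 80% of the rate limit
        self.outcomes = deque(maxlen=window)    # most recent (ok, latency) pairs

    def record(self, ok, latency):
        self.outcomes.append((ok, latency))

    def alerts(self, calls_in_period):
        """Return a list of human-readable alerts; an empty list means healthy."""
        found = []
        if self.outcomes:
            total = len(self.outcomes)
            error_rate = sum(1 for ok, _ in self.outcomes if not ok) / total
            avg_latency = sum(latency for _, latency in self.outcomes) / total
            if error_rate > self.max_error_rate:
                found.append(f"error rate {error_rate:.0%} over last {total} calls")
            if avg_latency > self.max_avg_latency:
                found.append(f"average latency {avg_latency:.2f}s over last {total} calls")
        if calls_in_period >= self.quota_warn * self.rate_limit:
            found.append(f"rate limit usage at {calls_in_period}/{self.rate_limit}")
        return found
```

Calling `record()` after every request and `alerts()` on a timer is enough to turn a silent failure into a noisy one.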
