Pitchbar applies per-endpoint rate limits to protect against abuse and to keep cost-bearing endpoints (LLM, vector search) predictable. Every response from a rate-limited surface carries the three `X-RateLimit-*` headers, and a 429 response adds `Retry-After`, so integrators can pace themselves.

Headers on every response

Header	Meaning
`X-RateLimit-Limit`	Maximum requests allowed in the current window.
`X-RateLimit-Remaining`	Requests still available in the current window.
`X-RateLimit-Reset`	Unix timestamp when the window resets and budget is restored.
`Retry-After`	Only sent on 429 — seconds to wait before retrying.

Pitchbar's API-surface middleware ensures X-RateLimit-Reset ships on every response, not just on 429. That mirrors GitHub / Stripe behaviour.
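
A minimal sketch of how a consumer might read these headers into a typed value. The `RateLimitInfo` class and `parse_rate_limit_headers` function are illustrative names, not part of any Pitchbar SDK, and the sketch assumes the headers arrive as a plain name-to-value mapping.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitInfo:
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_at: int               # X-RateLimit-Reset (Unix timestamp, seconds)
    retry_after: Optional[int]  # Retry-After (seconds); only present on 429


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after")
    return RateLimitInfo(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=int(lowered["x-ratelimit-reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )
```

On a non-429 response `retry_after` is `None`, which lets callers tell "no wait requested" apart from "wait zero seconds".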

Named limiters in use

Limiter	Applies to	Limit	Keyed by
`widget-init`	`POST /api/v1/widget/init`	1000/min/IP+agent · 30000/hr/IP	Soft per-IP — absorbs NAT bursts.
`widget-session`	`POST /api/v1/widget/messages*`, events, conversation operations	300/min	Per JWT (visitor session).
`widget-leads`	`POST /api/v1/widget/leads`	30/min	Per JWT.
`wp-plugin`	All `/v1/wp/*` bulk-sync endpoints	60/min	Per workspace API token id.
Inline (`throttle:600,1`)	`POST /api/v1/widget/typing`	600/min/IP	Per IP (visitor typing pings).
Inline (`throttle:60,1`)	`POST /api/v1/widget/satisfaction`	60/min/IP	Per IP.
Inline (`throttle:120,1`)	`POST /api/v1/widget/coupon/apply`	120/min/IP	Per IP.
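
To make the "Limit" and "Keyed by" columns concrete, here is a sketch of a fixed-window counter keyed by an arbitrary string (an IP, a JWT id, an API token id). This is an assumption for illustration only: the page does not say which algorithm Pitchbar's limiters use, and `FixedWindowLimiter` is a hypothetical name.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Illustrative fixed-window counter; not Pitchbar's actual implementation."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # key -> (window start as Unix seconds, requests counted so far)
        self._windows: Dict[str, Tuple[int, int]] = {}

    def hit(self, key: str) -> Tuple[bool, int, int]:
        """Record one request and return (allowed, remaining, reset_at)."""
        now = int(self.clock())
        start, count = self._windows.get(key, (now, 0))
        if now >= start + self.window_seconds:
            # The previous window has expired, so the budget is restored.
            start, count = now, 0
        reset_at = start + self.window_seconds
        if count >= self.limit:
            self._windows[key] = (start, count)
            return False, 0, reset_at
        count += 1
        self._windows[key] = (start, count)
        return True, self.limit - count, reset_at
```

Under this model, a row such as `widget-leads` (30/min, per JWT) would correspond to `FixedWindowLimiter(30, 60)` called with the visitor's session id as the key, so two visitors never draw from each other's budget.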

What 429 looks like

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1747438932
Retry-After: 28

{"error":{"code":"rate_limited","message":"Too many widget requests for this conversation. Please slow down."}}
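
One way a client could turn a response like the one above into a typed error is sketched here. `RateLimited` and `raise_for_rate_limit` are hypothetical names, and the sketch assumes the error envelope always has the `error.code` and `error.message` fields shown in the sample.

```python
import json
from typing import Mapping


class RateLimited(Exception):
    """Raised when the API answers 429; carries the server's pacing hints."""

    def __init__(self, code: str, message: str, retry_after: int, reset_at: int):
        super().__init__(message)
        self.code = code
        self.retry_after = retry_after  # seconds, from Retry-After
        self.reset_at = reset_at        # Unix timestamp, from X-RateLimit-Reset


def raise_for_rate_limit(status: int, headers: Mapping[str, str], body: str) -> None:
    """Do nothing unless the response is a 429, in which case raise RateLimited."""
    if status != 429:
        return
    # Header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    error = json.loads(body)["error"]
    raise RateLimited(
        code=error["code"],
        message=error["message"],
        retry_after=int(lowered["retry-after"]),
        reset_at=int(lowered["x-ratelimit-reset"]),
    )
```

Callers can then catch `RateLimited` in one place and read `retry_after` instead of inspecting raw responses throughout the integration.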

Best practices for consumers

Watch `X-RateLimit-Remaining` on every response. When it drops below a small threshold (e.g. 5), pause until the time given in `X-RateLimit-Reset`.
On 429, sleep for the number of seconds in `Retry-After`, then retry. Don't back off exponentially: the window is deterministic, so the server-supplied wait is already the right one.
If you operate a high-fanout integration, segment your callers so they don't all share an IP. The per-IP cap on `widget-init` is reached quickly when all traffic leaves through a single egress address.
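
The first two practices can be sketched as a small pacing helper. Everything here is an assumption made for illustration: `pause_seconds` and `send_paced` are hypothetical names, `send` stands in for whatever function performs the HTTP call, headers are expected with lower-case names, and the threshold of 5 and the cap of 3 attempts are arbitrary defaults.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, headers with lower-case names)
Response = Tuple[int, Mapping[str, str]]


def pause_seconds(status: int, headers: Mapping[str, str],
                  now: float, threshold: int = 5) -> float:
    """Return how long to pause before the next request (0 means go ahead)."""
    if status == 429:
        # Use the server-supplied wait as-is; no exponential backoff.
        return float(headers.get("retry-after", 0))
    remaining = int(headers.get("x-ratelimit-remaining", threshold))
    if remaining < threshold:
        reset_at = float(headers.get("x-ratelimit-reset", now))
        return max(0.0, reset_at - now)
    return 0.0


def send_paced(send: Callable[[], Response], max_attempts: int = 3,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.time) -> Response:
    """Call send(), retrying after Retry-After on 429, up to max_attempts."""
    status, headers = send()
    for _ in range(max_attempts - 1):
        if status != 429:
            break
        sleep(pause_seconds(status, headers, clock()))
        status, headers = send()
    if status != 429:
        # Budget nearly spent: wait for the window to reset before returning.
        wait = pause_seconds(status, headers, clock())
        if wait > 0:
            sleep(wait)
    return status, headers
```

Injecting `sleep` and `clock` keeps the helper testable without real delays. If the last attempt still returns 429, the response is handed back to the caller rather than retried indefinitely.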