The Crawler Authentication Protocol (CAP) is the enforcement layer of Supertab Connect. It gives your CDN edge the ability to verify that a crawler holds a valid license before allowing it to reach your content. Compliant crawlers present a license token with every request; the edge validates it in real time and either forwards the request or returns a structured rejection. CAP operates entirely at the network layer. It has no footprint on your origin, no impact on human browser traffic, and no dependency on your application code.

How CAP works

The protocol follows a straightforward request-response cycle. A crawler that holds a valid license attaches it to every request using the Authorization header:
Authorization: License <token>
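On the crawler side, this amounts to one header per request. A minimal sketch in TypeScript; the function name is illustrative, and the token itself comes from the licensing flow, not from this code:

```typescript
// Builds the headers a compliant crawler attaches to every request.
// The license token is obtained and refreshed via the OLP licensing flow.
function licenseHeaders(licenseToken: string): Record<string, string> {
  return { Authorization: `License ${licenseToken}` };
}
```

A crawler would pass these headers to its HTTP client on every fetch to a CAP-protected host.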
Your CDN edge — running the Supertab Connect SDK — intercepts the request. If the request looks like bot traffic, the SDK extracts the token and verifies its JWT signature against Supertab Connect’s JWKS endpoint. A valid token allows the request through to your origin. An invalid or missing token receives a rejection with structured headers pointing the crawler to where it can obtain a license:
HTTP/1.1 401 Unauthorized
WWW-Authenticate: License error="invalid_request"
Link: <https://yourdomain.com/license.xml>; rel="license"; type="application/rsl+xml"
The status code depends on the failure: 401 for missing, expired, or invalid tokens, 403 for tokens that are valid but don’t cover the requested resource. The WWW-Authenticate header tells the crawler what went wrong. The Link header tells it where your RSL license lives, which is where it can discover how to acquire access. Non-bot traffic — browsers, APIs, anything that doesn’t present itself as a crawler — passes through unchanged. CAP only enforces against traffic your bot detection logic identifies as automated.
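The rejection logic above can be sketched as an edge-side helper. The status codes and header names follow the examples on this page, but the helper name, the failure-status type, and the license URL are illustrative:

```typescript
// Token failure categories: 401 for missing/expired/invalid tokens,
// 403 for tokens that are valid but don't cover the requested resource.
type TokenStatus = "missing" | "expired" | "invalid" | "out_of_scope";

// Builds the structured rejection described above. `licenseUrl` is the
// location of your RSL license (e.g. https://yourdomain.com/license.xml).
function buildRejection(status: TokenStatus, licenseUrl: string) {
  const code = status === "out_of_scope" ? 403 : 401;
  return {
    status: code,
    headers: {
      "WWW-Authenticate": `License error="invalid_request"`,
      Link: `<${licenseUrl}>; rel="license"; type="application/rsl+xml"`,
    },
  };
}
```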

What the token contains

Tokens are signed JWTs issued by Supertab Connect as part of the license acquisition flow. They carry the identity of the licensed crawler, the merchant system it is licensed against, and an expiry. The SDK validates the JWT signature against Supertab Connect’s JWKS (JSON Web Key Set) endpoint and checks the claims on every request — no session state or per-request call to the license server is required at your edge. Tokens are short-lived by design. A crawler must obtain and refresh tokens as part of normal operation, which means the enforcement signal is current: a valid token reflects an active license, not a historical one.
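As a sketch of what claim checking might look like at the edge: the claim names below (crawler identity, merchant system, expiry) are illustrative placeholders, since the exact schema is defined by Supertab Connect, and signature verification against the JWKS endpoint is deliberately omitted for brevity:

```typescript
interface LicenseClaims {
  sub: string; // licensed crawler identity (illustrative claim name)
  aud: string; // merchant system the license covers (illustrative claim name)
  exp: number; // expiry, seconds since the Unix epoch
}

// Decodes the payload segment of a JWT (base64url-encoded JSON) and checks
// expiry. A real deployment must also verify the JWT signature against
// Supertab Connect's JWKS endpoint before trusting any claim.
function decodeAndCheckExpiry(token: string, nowSeconds: number): LicenseClaims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const json = Buffer.from(parts[1], "base64url").toString("utf8");
  const claims = JSON.parse(json) as LicenseClaims;
  return claims.exp > nowSeconds ? claims : null;
}
```

Because tokens are short-lived, the expiry check alone rejects stale credentials; no call to the license server is needed per request.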

What CAP does not do

CAP is specifically scoped. Understanding its limits helps you integrate it cleanly rather than expecting it to cover more than it does.

CAP does not identify bots. The SDK validates tokens for traffic your bot detection logic flags as automated; you are responsible for defining what counts as a bot in your infrastructure. The SDK accepts a botDetector function for this purpose. Without one it applies its own built-in heuristics based on user-agent patterns, but custom logic is almost always more accurate.

CAP does not enforce on human traffic. Browsers do not present Authorization: License headers. Any request without that header that your bot detection logic also passes through is treated as human traffic and forwarded without token validation.

CAP does not issue tokens. Token acquisition is handled by the Open Licensing Protocol (OLP); CAP only validates tokens that already exist. A crawler that has never gone through the licensing flow will receive a 401 along with the information it needs to start that process.

CAP does not rate-limit or throttle. It validates credentials. Crawl rate control is a separate concern handled at your infrastructure level.

How CAP fits with your bot detection

The two most important operational decisions when deploying CAP are where to run it and how to identify bots.

CAP runs at the CDN edge, before your origin sees the request. This matters for two reasons: it keeps validation latency low, and it means rejected requests never reach your servers at all. For high-volume crawlers, this has meaningful infrastructure implications.

Bot detection is the input to CAP enforcement. If your bot detection is too narrow, licensed crawlers will slip through as apparent human traffic without token validation. If it is too broad, legitimate human users may be subjected to token checks they cannot pass. The SDK's built-in detection covers known crawler user-agent patterns, but any serious deployment should review that detection against its own traffic patterns.
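The SDK accepts a botDetector function, but its exact contract is SDK-defined, so the signature below is an assumption. A minimal user-agent-based detector, roughly the kind of heuristic the built-in detection approximates, might look like:

```typescript
// Assumed signature; the SDK's actual botDetector contract may differ.
type BotDetector = (request: { headers: Record<string, string> }) => boolean;

// Illustrative patterns only. A real deployment should tune this list
// against observed traffic rather than rely on generic matches.
const CRAWLER_PATTERNS = [/GPTBot/i, /ClaudeBot/i, /CCBot/i, /bot\b/i];

const botDetector: BotDetector = (request) => {
  const ua = request.headers["user-agent"] ?? "";
  return CRAWLER_PATTERNS.some((pattern) => pattern.test(ua));
};
```

Requests this function returns true for go through token validation; everything else is forwarded as human traffic.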

Enforcement modes

The SDK supports three enforcement modes.

SOFT is the default. Requests pass through, but the SDK records validation events and attaches licensing headers. It is useful for initial rollout: you can observe which requests would be blocked before enabling hard enforcement.

STRICT rejects requests from identified bots that lack a valid license token with a 401. Switch to this mode once you have validated that your bot detection logic is not catching human traffic.

DISABLED turns off enforcement entirely; requests are allowed without licensing intervention. This is useful during initial SDK integration, when you want to confirm the deployment works without any enforcement side effects.
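The mode semantics above can be sketched as a single decision function. This is an illustration of the described behavior, not the SDK's actual internals; the function and type names are assumptions:

```typescript
type EnforcementMode = "SOFT" | "STRICT" | "DISABLED";

// Decides the outcome for a request identified as a bot, given the mode
// and whether the request carried a valid license token.
// "forward" lets the request through to the origin; "reject" returns
// the structured 401 described earlier on this page.
function decide(mode: EnforcementMode, hasValidToken: boolean): "forward" | "reject" {
  if (mode === "DISABLED") return "forward";
  if (hasValidToken) return "forward";
  // SOFT records the validation event but still forwards; STRICT rejects.
  return mode === "STRICT" ? "reject" : "forward";
}
```

A typical rollout starts in SOFT to observe what would be blocked, then moves to STRICT once bot detection is trusted.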

Deploy in Your CDN

Deploy CAP enforcement at your CDN edge, with links to platform-specific guides.

Content Licensing with RSL

How your RSL license expresses what licensed crawlers are permitted to do.