Your license.xml is how you tell crawlers, AI agents, and other automated clients what they can and cannot do with your content. It is served at your domain and expressed in the RSL standard format, a machine-readable structure that covers scope, permissions, restrictions, and commercial terms.
At a practical level, an RSL file answers four questions:
Which content does this rule apply to?
Which automated uses are allowed?
Which uses, users, or geographies are restricted?
What legal or commercial conditions sit behind those rights?
That is why license.xml matters. It is not just documentation. It is the structured expression of your licensing position for search engines, AI systems, crawlers, and other automated clients.
The example below shows the same content section expressed through two separate <content> declarations. The document root uses the RSL namespace, and RSL documents are served as application/rsl+xml. In this pattern:
One content declaration allows classic search broadly
Another adds commercial AI input and AI indexing through a license-server flow
AI training remains prohibited
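A minimal sketch of the two-declaration pattern follows. It assumes the RSL 1.0 namespace URI and element names as commonly published (verify them against the specification before use). The /news/* scope, the license server URL, and the subscription payment type are hypothetical placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: namespace URI and element names are assumptions to verify against the RSL spec. -->
<rsl xmlns="https://rslstandard.org/rsl">

  <!-- Declaration 1: no server attribute, so classic search needs no licensed access. -->
  <content url="/news/*">
    <license>
      <permits type="usage">search</permits>
      <payment type="free"/>
    </license>
  </content>

  <!-- Declaration 2: same scope, with a hypothetical license server for AI uses. -->
  <content url="/news/*" server="https://license.example.com">
    <license>
      <!-- Space-separated tokens are assumed to be accepted in one element. -->
      <permits type="usage">ai-input ai-index</permits>
      <prohibits type="usage">ai-train</prohibits>
      <!-- Payment type is illustrative. -->
      <payment type="subscription"/>
    </license>
  </content>

</rsl>
```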
This split matters. Under the RSL specification, if a content declaration includes a license server, clients are expected to obtain licensed access for that content even when the applicable payment condition is free. That is why a declaration that broadly allows search should not carry a server value if you want unlicensed search access to remain possible.
Supertab Connect handles this pattern in the RSL Editor engine, so users do not risk accidentally losing search visibility and can focus on defining their terms for licensed access.
Clients do not evaluate an RSL file by reading it top to bottom, but by scope and specificity. The practical rules that matter are:
More specific content scopes take precedence over broader scopes
Prohibitions override permissions when both apply
Overlapping, unclear licenses are interpreted conservatively
That has direct commercial consequences. If you publish a broad rule for / and a narrower rule for /premium/*, the narrower rule should govern the premium section. If you publish overlapping license offers that apply to the same content and same audience without a clear distinction, a well-behaved client is likely to default to the stricter interpretation.
If two overlapping licenses both appear to apply and one permits a use while the other prohibits it, compliant clients are expected to apply the prohibition. Ambiguous drafting does not expand rights. It usually reduces licensed access.
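The broad-versus-narrow case described above could look like the sketch below. Element names and the namespace URI are assumed from RSL 1.0, and the specific permissions are chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: illustrates scope specificity, not a recommended policy. -->
<rsl xmlns="https://rslstandard.org/rsl">

  <!-- Broad rule: applies site-wide unless a narrower scope matches. -->
  <content url="/">
    <license>
      <permits type="usage">search ai-input</permits>
      <payment type="free"/>
    </license>
  </content>

  <!-- Narrower rule: expected to govern everything under /premium/. -->
  <content url="/premium/*">
    <license>
      <permits type="usage">search</permits>
      <prohibits type="usage">ai-all</prohibits>
      <payment type="free"/>
    </license>
  </content>

</rsl>
```

Under the precedence rules listed above, a client fetching a page under /premium/ would be expected to follow the second declaration, so the ai-input permission on / would not carry over to the premium section.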
<content> defines which asset, path, section, file, or resource the rules apply to. For most websites, this is the key commercial control point because it lets you distinguish public, premium, archive, research, or other sections.
| Part | Meaning |
| --- | --- |
| url | Canonical scope identifier, expressed as a path pattern |
| server | License server; clients use it to obtain and validate licensed access |
| lastmod | Freshness metadata for the scoped asset or rule set |
| <schema> | Structured metadata associated with the content |
| <alternate> | Alternative machine-friendly version of the same content |
| <copyright> | Rights holder identity and contact |
| <terms> | Human-readable legal or commercial terms |
For normal web licensing, path patterns are usually the simplest model:
/
/news/*
/articles/$
/premium/*
The main discipline is simple: use broad scopes only when the same policy genuinely applies across that whole area, and use narrower scopes when the rules differ in a meaningful way.
Short practical notes:
server matters when licensed access requires token acquisition. It connects the public license to the operational license acquisition flow.
If you want to license content without requiring token acquisition, keep that content declaration separate from token-gated licensed uses and omit the server value from the declaration that allows the use.
lastmod helps clients re-check freshness, but it does not change the rights themselves.
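To show how the parts of <content> fit together, here is a hedged sketch of one declaration using every part listed in the table. All URLs, the organization name, and the timestamp are hypothetical. The contactEmail attribute name, the timestamp format, and the order of child elements are assumptions that should be checked against the RSL specification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: attribute names and child ordering are assumptions. -->
<rsl xmlns="https://rslstandard.org/rsl">
  <content url="/articles/*"
           server="https://license.example.com"
           lastmod="2025-01-15T00:00:00Z">
    <!-- Structured metadata describing the content (hypothetical URL). -->
    <schema>https://example.com/articles/schema.json</schema>
    <!-- Machine-friendly alternate of the same content (hypothetical URL). -->
    <alternate>https://example.com/articles/feed.json</alternate>
    <!-- Rights holder and contact; contactEmail is an assumed attribute name. -->
    <copyright contactEmail="licensing@example.com">Example Media Ltd.</copyright>
    <!-- Human-readable terms page (hypothetical URL). -->
    <terms>https://example.com/licensing-terms</terms>
    <license>
      <permits type="usage">ai-input</permits>
      <payment type="subscription"/>
    </license>
  </content>
</rsl>
```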
A <content> element can contain one or more <license> elements. Each one represents a coherent set of rights and conditions for that scope.
In practice, multiple licenses make sense when the offers are genuinely different, for example:
search allowed for everyone, but AI use reserved for commercial licenses
non-commercial use under one offer, commercial use under another
one geography under one set of terms, another geography under another
What does not work well is publishing multiple licenses that apply to the same scope and same audience with only minor wording differences. That creates ambiguity without adding commercial flexibility.
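As an illustration of genuinely different offers, the sketch below places two licenses in one scope, distinguished by audience. It assumes that RSL expresses audience through a user-type permission with commercial and non-commercial values. The /research/* scope and the payment types are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the user-type tokens and payment types are assumptions. -->
<rsl xmlns="https://rslstandard.org/rsl">
  <content url="/research/*">

    <!-- Offer 1: non-commercial users, free of charge. -->
    <license>
      <permits type="usage">ai-input</permits>
      <permits type="user">non-commercial</permits>
      <payment type="free"/>
    </license>

    <!-- Offer 2: commercial users, under paid terms. -->
    <license>
      <permits type="usage">ai-input</permits>
      <permits type="user">commercial</permits>
      <payment type="subscription"/>
    </license>

  </content>
</rsl>
```

The two offers grant the same use but differ clearly in audience and payment condition, so a client can tell which one applies to it.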
The usage vocabulary is the most commercially important one because it defines what automated systems may actually do with the content.
| Token | Meaning |
| --- | --- |
| search | Traditional search indexing and search results, not AI-generated summaries |
| ai-input | Use as input to generate AI answers or summaries |
| ai-index | Storage in an AI retrieval or indexing layer |
| ai-train | Training or fine-tuning models on the content |
| ai-all | Any AI-system use across input, indexing, training, and related AI operations |
| all | Any automated processing, including AI and non-AI use |
The most important distinction is that search does not mean AI summaries. If your goal is “allow classic search, but do not allow AI grounding or training,” search is the right starting point.
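A broad permission can also be combined with a narrower prohibition. Permitting ai-all while prohibiting ai-train says: AI use is allowed in general, but training is excluded. That is much clearer than trying to imply the same thing through silence or overlapping offers.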
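A broad AI permission with training carved out could be written roughly as follows. This is a sketch that assumes the RSL 1.0 permits and prohibits elements with a usage type, with the site-wide scope chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: element names assumed from RSL 1.0. -->
<rsl xmlns="https://rslstandard.org/rsl">
  <content url="/">
    <license>
      <!-- Broad grant: AI use in general. -->
      <permits type="usage">ai-all</permits>
      <!-- Explicit carve-out: the prohibition wins where both apply. -->
      <prohibits type="usage">ai-train</prohibits>
    </license>
  </content>
</rsl>
```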
RSL can also express payment-related terms. At a protocol level, licenses may include payment conditions, references to standard or custom commercial terms, and pricing metadata.
For this overview, the important point is strategic rather than structural: RSL is designed not only to say “yes” or “no,” but also to support licensed access under commercial conditions.
The following elements do not usually change the core permission logic, but they make the license more credible, easier to interpret, and easier to operate.
| Element | What it adds |
| --- | --- |
| <schema> | Structured metadata associated with the content |
| <alternate> | Alternate machine-friendly representation of the same asset |
| <copyright> | Rights holder identity plus contact information |
| <terms> | Human-readable legal or commercial terms page |
For most publishers, this is straightforward: publish real ownership metadata, a real rights contact, and a real terms page. Placeholder metadata weakens confidence in the license even if the XML is technically valid.