Your license.xml is how you tell crawlers, AI agents, and other automated clients what they can and cannot do with your content. It is served at your domain and expressed in the RSL standard format — a machine-readable structure that covers scope, permissions, restrictions, and commercial terms. At a practical level, an RSL file answers four questions:
  1. Which content does this rule apply to?
  2. Which automated uses are allowed?
  3. Which uses, users, or geographies are restricted?
  4. What legal or commercial conditions sit behind those rights?
That is why license.xml matters. It is not just documentation. It is the structured expression of your licensing position for search engines, AI systems, crawlers, and other automated clients.
The example below shows the same content section expressed through two separate <content> declarations. The document root uses the RSL namespace, and RSL documents are served as application/rsl+xml.
<?xml version="1.0" encoding="utf-8"?>
<rsl xmlns="https://rslstandard.org/rsl">
  <content
    url="/news/*"
    lastmod="2026-03-27T15:00:00Z"
  >
    <schema>https://publisher.example.com/schema/news.jsonld</schema>
    <terms>https://publisher.example.com/licensing-terms</terms>
    <copyright
      type="organization"
      contactEmail="licensing@publisher.example.com"
      contactUrl="https://publisher.example.com/contact"
    >
      Example Publishing Group
    </copyright>

    <license>
      <permits type="usage">search</permits>
      <permits type="user">commercial non-commercial education government personal</permits>
      <legal type="warranty">ownership authority</legal>
      <legal type="disclaimer">as-is no-liability</legal>
    </license>
  </content>

  <content
    url="/news/*"
    server="https://api-connect.supertab.co/urn:stc:merchant:system:publisher-123"
    lastmod="2026-03-27T15:00:00Z"
  >
    <license>
      <permits type="usage">ai-input ai-index</permits>
      <permits type="user">commercial</permits>
      <prohibits type="usage">ai-train</prohibits>
      <legal type="attestation">true</legal>
      <legal type="contact">mailto:licensing@publisher.example.com</legal>
    </license>
  </content>
</rsl>
Read it this way:
  • /news/* is the content scope
  • one content declaration allows classic search broadly
  • another adds commercial AI input and AI indexing through a license-server flow
  • AI training remains prohibited
This split matters. Under the RSL specification, if a content declaration includes a license server, clients are expected to obtain licensed access for that content even when the applicable payment condition is free. A broadly search-allowed declaration should therefore not carry a server value if you want unlicensed search access to remain possible. The Supertab Connect RSL Editor handles this pattern for you, so you do not accidentally lose search visibility and can focus on defining your terms for licensed access.
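From the client side, one way to see this split is to parse the file and check which declarations carry a server attribute. The sketch below uses Python's standard xml.etree.ElementTree on a trimmed copy of the example above. The helper name summarize_declarations and the needs_license_server key are illustrative assumptions, not part of RSL or of Supertab Connect tooling.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the root element of the example document.
RSL_NS = {"rsl": "https://rslstandard.org/rsl"}

# Trimmed copy of the example: one open search declaration,
# one server-backed AI declaration for the same scope.
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<rsl xmlns="https://rslstandard.org/rsl">
  <content url="/news/*">
    <license>
      <permits type="usage">search</permits>
    </license>
  </content>
  <content url="/news/*"
           server="https://api-connect.supertab.co/urn:stc:merchant:system:publisher-123">
    <license>
      <permits type="usage">ai-input ai-index</permits>
      <prohibits type="usage">ai-train</prohibits>
    </license>
  </content>
</rsl>
"""


def summarize_declarations(xml_text):
    """Return one dict per <content> element with its scope, server flag, and usage tokens."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    summary = []
    for content in root.findall("rsl:content", RSL_NS):
        permits, prohibits = set(), set()
        for lic in content.findall("rsl:license", RSL_NS):
            for el in lic.findall("rsl:permits[@type='usage']", RSL_NS):
                permits.update((el.text or "").split())
            for el in lic.findall("rsl:prohibits[@type='usage']", RSL_NS):
                prohibits.update((el.text or "").split())
        summary.append({
            "url": content.get("url"),
            # A server attribute signals that licensed access is expected.
            "needs_license_server": content.get("server") is not None,
            "permits": sorted(permits),
            "prohibits": sorted(prohibits),
        })
    return summary


for entry in summarize_declarations(SAMPLE):
    print(entry)
```

The first declaration reports no license server and permits search only; the second reports a license server together with the AI input and indexing grant and the training prohibition.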

How clients evaluate scope and conflicts

Clients do not evaluate an RSL file by reading it top to bottom, but by scope and specificity. The practical rules that matter are:
  • More specific content scopes take precedence over broader scopes
  • Prohibitions override permissions when both apply
  • Overlapping, unclear licenses are interpreted conservatively
That has direct commercial consequences. If you publish a broad rule for / and a narrower rule for /premium/*, the narrower rule should govern the premium section. If you publish overlapping license offers that apply to the same content and same audience without a clear distinction, a well-behaved client is likely to default to the stricter interpretation.
If two overlapping licenses both appear to apply and one permits a use while the other prohibits it, compliant clients apply the prohibition. Ambiguous drafting does not expand rights; it usually reduces licensed access.
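The three rules can be sketched as a small decision function. This is an illustrative reading of the rules in this section, not the RSL specification's normative algorithm. The Rule tuple, the decide function, the use of literal prefix length as the specificity measure, and the "no matching rule means no grant" default are all assumptions made for the example.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    scope: str            # plain path prefix such as "/" or "/premium/" (wildcards omitted)
    permits: frozenset    # usage tokens granted by this license
    prohibits: frozenset  # usage tokens forbidden by this license


def decide(path, use, rules):
    """Return True if `use` is allowed for `path` under the sketched rules."""
    matching = [r for r in rules if path.startswith(r.scope)]
    if not matching:
        return False  # assumption: no applicable rule means no grant
    # Rule 1: the most specific (longest) matching scope governs.
    longest = max(len(r.scope) for r in matching)
    governing = [r for r in matching if len(r.scope) == longest]
    # Rule 2: a prohibition in any governing license overrides a permission.
    if any(use in r.prohibits for r in governing):
        return False
    # Rule 3: anything not clearly permitted is treated as not granted.
    return any(use in r.permits for r in governing)


rules = [
    Rule("/", frozenset({"search", "ai-input"}), frozenset()),
    Rule("/premium/", frozenset({"search", "ai-input"}), frozenset()),
    Rule("/premium/", frozenset(), frozenset({"ai-input"})),  # overlapping, conflicting offer
]

print(decide("/blog/post", "ai-input", rules))       # True: only the "/" rule applies
print(decide("/premium/report", "ai-input", rules))  # False: the prohibition wins
print(decide("/premium/report", "search", rules))    # True: permitted and not prohibited
```

The second call shows the commercial consequence described above: the overlapping premium offers reduce access rather than expand it.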

Core license elements

<content>: the scope of the policy

<content> defines which asset, path, section, file, or resource the rules apply to. For most websites, this is the key commercial control point because it lets you distinguish public, premium, archive, research, or other sections.
| Part | Meaning |
| --- | --- |
| url | Canonical scope identifier, expressed as a path pattern |
| server | License server that clients use to obtain and validate licensed access |
| lastmod | Freshness metadata for the scoped asset or rule set |
| <schema> | Structured metadata associated with the content |
| <alternate> | Alternative machine-friendly version of the same content |
| <copyright> | Rights holder identity and contact |
| <terms> | Human-readable legal or commercial terms |
For normal web licensing, path patterns are usually the simplest model:
  • /
  • /news/*
  • /articles/$
  • /premium/*
The main discipline is simple: use broad scopes only when the same policy genuinely applies across that whole area, and use narrower scopes when the rules differ in a meaningful way.
Short practical notes:
  • server matters when licensed access requires token acquisition. It connects the public license to the operational license acquisition flow.
  • if you want to license content without requiring token acquisition, keep that content declaration separate from token-gated licensed uses and do not include a server value in the allowing content declaration.
  • lastmod helps clients re-check freshness, but it does not change the rights themselves.
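To make the path patterns listed above concrete, the sketch below treats `*` as "any run of characters" and a trailing `$` as "the path must end here", with everything else matched as a prefix. Those semantics are an assumption borrowed from robots.txt-style matching, so check the RSL specification before relying on them. The function name scope_matches is illustrative.

```python
import re


def scope_matches(pattern, path):
    """Return True if `path` falls under `pattern` (assumed robots.txt-style semantics)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal characters, then let each "*" match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Anchored patterns must match the whole path; others act as prefixes.
    matcher = re.fullmatch if anchored else re.match
    return matcher(regex, path) is not None


examples = [
    ("/", "/about"),
    ("/news/*", "/news/2026/story"),
    ("/news/*", "/sports/story"),
    ("/articles/$", "/articles/"),
    ("/articles/$", "/articles/123"),
]
for pattern, path in examples:
    print(pattern, path, scope_matches(pattern, path))
```

Under these assumptions, / covers the whole site, /news/* covers everything below /news/, and /articles/$ covers only the section landing page itself.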

<license>: one coherent rule bundle

A <content> element can contain one or more <license> elements. Each one represents a coherent set of rights and conditions for that scope. In practice, multiple licenses make sense when the offers are genuinely different, for example:
  • search allowed for everyone, but AI use reserved for commercial licenses
  • non-commercial use under one offer, commercial use under another
  • one geography under one set of terms, another geography under another
What does not work well is publishing multiple licenses that apply to the same scope and same audience with only minor wording differences. That creates ambiguity without adding commercial flexibility.

<permits>: what is allowed

<permits> is the positive grant of rights. It declares which uses, user classes, or geographies are allowed. The three main permit types are:
  • usage
  • user
  • geo

type="usage"

This is the most commercially important vocabulary because it defines what automated systems may actually do with the content.
| Token | Meaning |
| --- | --- |
| search | Traditional search indexing and search results, not AI-generated summaries |
| ai-input | Use as input to generate AI answers or summaries |
| ai-index | Storage in an AI retrieval or indexing layer |
| ai-train | Training or fine-tuning models on the content |
| ai-all | Any AI-system use across input, indexing, training, and related AI operations |
| all | Any automated processing, including AI and non-AI use |
The most important distinction is that search does not mean AI summaries. If your goal is “allow classic search, but do not allow AI grounding or training,” search is the right starting point.
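The usage vocabulary can be read as a containment hierarchy: ai-all covers the individual AI tokens, and all covers everything. The sketch below encodes that reading and combines it with the rule that prohibitions override permissions. The COVERS mapping lists only the tokens named in this document, so its exact membership is an assumption, and the function names expand and is_allowed are illustrative.

```python
# Containment implied by the token table. Membership beyond the tokens
# named in this document is an assumption.
AI_USES = frozenset({"ai-input", "ai-index", "ai-train"})
COVERS = {
    "search": frozenset({"search"}),
    "ai-input": frozenset({"ai-input"}),
    "ai-index": frozenset({"ai-index"}),
    "ai-train": frozenset({"ai-train"}),
    "ai-all": AI_USES,
    "all": AI_USES | {"search"},
}


def expand(tokens):
    """Expand a space-separated token string into the concrete uses it covers.

    Unknown tokens raise KeyError so that typos are not silently ignored.
    """
    uses = set()
    for token in tokens.split():
        uses |= COVERS[token]
    return uses


def is_allowed(use, permits, prohibits=""):
    """A use is allowed only if it is permitted and not prohibited."""
    return use in expand(permits) and use not in expand(prohibits)


print(is_allowed("search", "search"))                 # True
print(is_allowed("ai-input", "search"))               # False: search does not cover AI summaries
print(is_allowed("ai-input", "ai-all", "ai-train"))   # True
print(is_allowed("ai-train", "ai-all", "ai-train"))   # False: the carve-out wins
```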

type="user"

This lets you separate rights by operator class rather than by technical use.
| Token | Meaning |
| --- | --- |
| commercial | For-profit or commercial use |
| non-commercial | Non-commercial use |
| education | Educational use |
| government | Public-sector or government use |
| personal | Individual personal use |

type="geo"

This lets you scope rights geographically, usually with ISO 3166-1 alpha-2 country or region codes. Example:
<permits type="geo">US EU</permits>

<prohibits>: what is forbidden

<prohibits> is the hard stop. When something appears in both permits and prohibits, the prohibition wins. That makes it useful for carve-outs.
<license>
  <permits type="usage">ai-all</permits>
  <prohibits type="usage">ai-train</prohibits>
</license>
This says: AI use is allowed in general, but training is excluded. That is much clearer than trying to imply the same thing through silence or overlapping offers.

<legal>: what you affirm and disclaim

The <legal> block expresses what you affirm, what you disclaim, and where counterparties can go for rights clarification.
| type | Purpose |
| --- | --- |
| warranty | Positive statements about rights, authority, or asset quality |
| disclaimer | Liability and warranty disclaimers |
| attestation | Boolean affirmation that you are authorized to make the rights statement |
| contact | Legal or rights contact point |
| proof | URI to supporting evidence of authority or rights |
The values management teams usually care about are:
  • warranties such as ownership, authority, and no-infringement
  • disclaimers such as as-is and no-liability
  • whether attestation is appropriate for your internal governance standard
Practical guidance:
  • publish warranties only when you are comfortable making them consistently and publicly
  • keep the legal contact monitored by someone who can actually answer rights questions
  • use proof links only when they point to meaningful evidence, not filler

<payment>: commercial terms

RSL can also express payment-related terms. At a protocol level, licenses may include payment conditions, references to standard or custom commercial terms, and pricing metadata. For this overview, the important point is strategic rather than structural: RSL is designed not only to say “yes” or “no,” but also to support licensed access under commercial conditions.

Supporting metadata

The following elements do not usually change the core permission logic, but they make the license more credible, easier to interpret, and easier to operate.
| Element | What it adds |
| --- | --- |
| <schema> | Structured metadata associated with the content |
| <alternate> | Alternate machine-friendly representation of the same asset |
| <copyright> | Rights holder identity plus contact information |
| <terms> | Human-readable legal or commercial terms page |
For most publishers, this is straightforward: publish real ownership metadata, a real rights contact, and a real terms page. Placeholder metadata weakens confidence in the license even if the XML is technically valid.

Common policy patterns

These are two of the most common patterns publishers use when deciding how open or restrictive to be.

Allow classic search, but not AI use

<license>
  <permits type="usage">search</permits>
  <prohibits type="usage">ai-all</prohibits>
</license>
This is clear and commercially easy to explain: search discovery is acceptable, but AI reuse is not.

Allow AI answers or retrieval, but not training

<license>
  <permits type="usage">ai-input ai-index</permits>
  <prohibits type="usage">ai-train</prohibits>
</license>
This is useful when you are open to answer-generation or retrieval use, but you do not want the content absorbed into model training.

Discovery

Publishing license.xml is not enough on its own. Clients also need a reliable way to discover it.

Primary recommendation: robots.txt

For websites, this is the simplest and most important discovery path. Add this line to your robots.txt:
License: https://yourdomain.com/license.xml
The URL must be fully qualified. If you only implement one discovery mechanism, this should usually be it.
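As a rough illustration of how a client might read this directive, the sketch below scans a robots.txt body for License lines and keeps only fully qualified http(s) URLs. The function name find_license_urls is illustrative, and matching the directive name case-insensitively and stripping `#` comments are assumptions rather than requirements stated here.

```python
from urllib.parse import urlparse


def find_license_urls(robots_txt):
    """Return the fully qualified License URLs declared in a robots.txt body."""
    urls = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "license":
            continue
        value = value.strip()
        parsed = urlparse(value)
        # Relative paths are skipped because the URL must be fully qualified.
        if parsed.scheme in ("http", "https") and parsed.netloc:
            urls.append(value)
    return urls


robots = """User-agent: *
Allow: /
License: https://yourdomain.com/license.xml
License: /license.xml
"""
print(find_license_urls(robots))
```

Only the first License line is returned here; the relative /license.xml entry is ignored because it is not fully qualified.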

Secondary discovery options

| Method | Best used for |
| --- | --- |
| HTML <link rel="license"> | HTML pages where page-level association matters |
| HTTP Link header | APIs, JSON responses, media files, and non-HTML assets |
Examples:
<link rel="license" type="application/rsl+xml" href="https://yourdomain.com/license.xml">
Link: <https://yourdomain.com/license.xml>; rel="license"; type="application/rsl+xml"
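A client could pick up both secondary signals with the standard library alone. The sketch below is a simplified illustration: LicenseLinkParser and license_from_link_header are hypothetical names, and the header parser assumes no commas or semicolons inside URLs or quoted parameter values, which a production parser would need to handle.

```python
from html.parser import HTMLParser


class LicenseLinkParser(HTMLParser):
    """Collect href values from <link rel="license"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "license" in rels and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def license_from_link_header(header_value):
    """Return targets of rel="license" entries in a Link header value (simplified)."""
    targets = []
    for part in header_value.split(","):
        segments = [s.strip() for s in part.split(";")]
        target = segments[0]
        if not (target.startswith("<") and target.endswith(">")):
            continue
        params = {}
        for segment in segments[1:]:
            key, _, value = segment.partition("=")
            params[key.strip().lower()] = value.strip().strip('"')
        if "license" in params.get("rel", "").lower().split():
            targets.append(target[1:-1])
    return targets


parser = LicenseLinkParser()
parser.feed('<link rel="license" type="application/rsl+xml" '
            'href="https://yourdomain.com/license.xml">')
print(parser.hrefs)

header = '<https://yourdomain.com/license.xml>; rel="license"; type="application/rsl+xml"'
print(license_from_link_header(header))
```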

Best practices

  • Keep the number of scopes small and commercially meaningful. A few clear boundaries are better than many overlapping ones.
  • Separate search, AI input, AI indexing, and AI training deliberately. They are different rights with different implications.
  • Avoid ambiguous overlapping license offers. If a counterparty cannot tell which license applies, you have weakened the commercial outcome.
  • Make discovery and contact simple: publish at /license.xml, advertise it in robots.txt, and use a real rights contact.

Deploy in Your CDN

Publish your license.xml at your own domain and enforce licensing at the edge.