https://www.purl.org/stefan_ram/pub/siteinfo_canonical_en (permalink) is the canonical URI of this page.
Stefan Ram

The Canonical URI

A web page is also known as a resource. The usual form of address of a web resource is called a URI. A resource can sometimes be reached via several URIs.

URIs and resource
.------------------.
|      URI 0       |-------.
'------------------'       |     .----------------.
.------------------.       '---->|    resource    |
|      URI 1       |------------>|                |
'------------------'       .---->|                |
.------------------.       |     '----------------'
|      URI 2       |-------'
'------------------'

One of the URIs of a resource might be distinguished as the canonical URI of the resource. Since there can be at most one canonical URI for a resource, the mapping between a canonical URI and its resource is 1-to-1 (provided that the resource has a canonical URI at all).

The Canonical URI
                                 .----------------.
                                 | canonical URI  |
                                 '----------------'
                                         |
.------------------.                     |
|      URI 0       |-------.             V
'------------------'       |     .----------------.
.------------------.       '---->|    resource    |
|      URI 1       |------------>|                |
'------------------'       .---->|                |
.------------------.       |     '----------------'
|      URI 2       |-------'
'------------------'
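
The relationship shown in the two diagrams can be sketched in a few lines of Python. This is only an illustration: the names `uri_to_resource`, `canonical_uri_of`, and `is_one_to_one` as well as the example.org URIs are hypothetical and do not occur anywhere on this site.

```python
# Hypothetical data: three URIs that all lead to the same resource.
uri_to_resource = {
    "http://example.org/uri0": "resource",
    "http://example.org/uri1": "resource",
    "http://example.org/uri2": "resource",
}

# A resource has at most one canonical URI, so a dict keyed by
# resource is enough to record the designation.
canonical_uri_of = {"resource": "http://example.org/uri1"}


def is_one_to_one(canonical_uri_of, uri_to_resource):
    """Check that no canonical URI is shared by two resources and
    that every canonical URI really leads to its own resource."""
    uris = list(canonical_uri_of.values())
    if len(uris) != len(set(uris)):
        return False
    return all(
        uri_to_resource.get(uri) == resource
        for resource, uri in canonical_uri_of.items()
    )


print(is_one_to_one(canonical_uri_of, uri_to_resource))
```

Keying the second dict by resource is what enforces "at most one canonical URI per resource"; the function then only has to check the opposite direction.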

The maintainer of the resource will usually take pains to keep the connection between the canonical URI and the resource permanent, while other URIs that might have led to the resource “by coincidence” at certain points in time might become invalid at any time. (Given the state of today's technology and the limits of human endeavor, even the validity of the canonical URI cannot be guaranteed absolutely and forever.)

After a resource has moved, its maintainer will re-adjust the canonical URI, if possible, while all other URIs might become invalid, so that the resource can no longer be reached by following them.

Canonical URI after a move of a resource
                                 .----------------.
                                 | canonical URI  |
                                 '----------------'
                                         |
                                         '------------.
                                                      V
                                              .----------------.
                                              |    resource    |
.------------------.                          |                |
|      URI 0       |-------> (invalid)        |                |
'------------------'                          '----------------'
.------------------.
|      URI 1       |-------> (invalid)
'------------------'
.------------------.
|      URI 2       |-------> (invalid)
'------------------'
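
A move can be modelled in the same spirit. The sketch below assumes the worst case from the diagram, namely that every non-canonical URI becomes invalid. The function `after_move` and the location names are hypothetical.

```python
def after_move(locations, canonical_uri, new_location):
    """Model a move of the resource.

    Only the canonical URI is re-adjusted by the maintainer; every
    other URI is treated as invalid (None) in this worst-case model.
    """
    return {
        uri: (new_location if uri == canonical_uri else None)
        for uri in locations
    }


# Hypothetical state before the move: all URIs lead to location A.
before = {
    "canonical URI": "location A",
    "URI 0": "location A",
    "URI 1": "location A",
    "URI 2": "location A",
}

after = after_move(before, "canonical URI", "location B")
print(after["canonical URI"])  # location B
print(after["URI 0"])          # None
```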

The canonical URI is the “official” URI of a resource and as permanent as possible.

For links and bookmarks referring to a resource, only its canonical URI should be used.

PURLs

Stefan Ram assigns a PURL to most resources to serve as their canonical URI. A PURL is a “permanent URI” that can be permanently associated with a certain resource. The PURL service is provided by the Online Computer Library Center and is also used by the W3C and the Dublin Core initiative, so one can expect this service to remain available permanently.

The following treatment is restricted to HTTP URIs, because Stefan Ram delivers his resources almost exclusively using HTTP.

The URI of a resource that is answered directly with the data of the resource (with HTTP status 200) is here called the URI of delivery for that resource. For technical or organizational reasons, the URI of delivery might have to change. This breaks the previous URI of delivery, so all links to the resource that use it become unusable. It is therefore not recommended to use a URI of delivery as the canonical URI of a resource.

This damage can be avoided by using a PURL, because a PURL can continue to lead to the resource even after the URI of delivery has changed. This is accomplished by implementing the PURL as a redirection (with HTTP status 302) to the current URI of delivery.

PURL and temporary URI (before the URI of delivery is changed)
                   .---------------------. 200   .--------------.
               .-->|   URI of delivery   |------>|   resource   |
               |   '---------------------'       '--------------'
.-----------.  |
|   PURL    |--'
'-----------' 302
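
The redirection can be simulated without any network access. The following is a minimal sketch, assuming a hypothetical in-memory table `SERVER` that stands in for the web. The URIs and the helper `resolve` are invented for illustration and are not part of any PURL software.

```python
# Hypothetical in-memory "web": each URI answers with a status code
# and either a redirect target (302) or the resource data (200).
SERVER = {
    "http://purl.example/doc": (302, "http://host.example/v1/doc.html"),
    "http://host.example/v1/doc.html": (200, "data of the resource"),
}


def resolve(uri, server, max_redirects=5):
    """Follow 302 redirects until a 200 answer is found.

    Returns the resource data, or None if the chain ends at an
    unknown URI or needs more than max_redirects redirects.
    """
    for _ in range(max_redirects + 1):
        status, payload = server.get(uri, (404, None))
        if status == 200:
            return payload
        if status != 302:
            return None
        uri = payload  # the redirect target is the next URI to ask
    return None


print(resolve("http://purl.example/doc", SERVER))  # data of the resource
```

The limit on redirects is an assumption of this sketch; it merely keeps a misconfigured redirect loop from running forever.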

The redirection target of the PURL can be re-adjusted to the new URI of delivery without changing the PURL itself. This means that all directories and links that use the PURL remain valid without modification.

PURL and temporary URI (after the URI of delivery is changed)
                   .---------------------.
                   | old URI of delivery |-------> (invalid)
                   '---------------------'
.-----------. 302
|   PURL    |--.
'-----------'  |
               |   .---------------------. 200   .--------------.
               '-->| new URI of delivery |------>|   resource   |
                   '---------------------'       '--------------'

If all links to a resource use the PURL, none of these links needs to be modified. Search engines that correctly interpret the HTTP status code will index the PURL and not the URI of delivery. A resource that links to the old URI of delivery, by contrast, would contain an invalid link.

Bad linking by resource 1
.------------.              .---------------------.
| resource 1 |------------->| old URI of delivery |--> (invalid)
'------------'              '---------------------'

Correct linking to resource 0 by resources 2, 3, and 4
.------------.
| resource 2 |-.
'------------' |         302                      200
.------------. |  .------.  .---------------------.  .------------.
| resource 3 |-+->| PURL |->|   URI of delivery   |->| resource 0 |
'------------' |  '------'  '---------------------'  '------------'
.------------. |
| resource 4 |-'
'------------'
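
The difference between the two ways of linking can be checked mechanically. The sketch below assumes a hypothetical state after the URI of delivery has changed; the table `server`, the function `is_reachable`, and all URIs are made up for this example.

```python
def is_reachable(uri, server):
    """True if uri answers 200, directly or via a single 302 redirect."""
    status, target = server.get(uri, (404, None))
    if status == 302:
        status, _ = server.get(target, (404, None))
    return status == 200


# Hypothetical state after the change: the old URI of delivery is gone
# and the PURL has been re-targeted to the new URI of delivery.
server = {
    "http://purl.example/doc": (302, "http://host.example/v2/doc.html"),
    "http://host.example/v2/doc.html": (200, "data of resource 0"),
}

# Which URI each linking resource uses to refer to resource 0.
links = {
    "resource 1": "http://host.example/v1/doc.html",  # old URI of delivery
    "resource 2": "http://purl.example/doc",
    "resource 3": "http://purl.example/doc",
    "resource 4": "http://purl.example/doc",
}

broken = [name for name, uri in links.items() if not is_reachable(uri, server)]
print(broken)  # ['resource 1']
```

Only the link that bypasses the PURL breaks; the three links that go through the PURL need no modification.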

Resources

The following resources provide additional information regarding the topics addressed on this page.

http://www.purl.org/
About the concept of a PURL
http://www.w3.org/Addressing/
About the notion “URI” and the concept of a “resource”
http://www.google.com/search?q=canonical+URI
When one searches for this page, Google usually shows the PURL of this page in the result list, because Google interprets the HTTP status code 302 correctly and most links to this page use the PURL. However, this might not always happen.

"ram@zedat.fu-berlin.de" (without the quotation marks) is the email address of Stefan Ram. Copyright 2004 Stefan Ram, Berlin. All rights reserved. This page is a publication by Stefan Ram.