uri-generic

Description

The uri-generic library contains procedures for parsing and manipulating Uniform Resource Identifiers (RFC 3986). It aims to conform closely to the RFC, and uses combinator parsing and character classes rather than regular expressions.

This library is intended as a basis for scheme-specific URI parser libraries: it parses only the generic components of a URI, and a scheme-specific library can parse the subcomponents further. For this reason, percent-encoded characters are not encoded or decoded automatically (apart from the decoding of unreserved characters described under Constructors and predicates). That work is left to the scheme-specific implementations.

Library Procedures

Constructors and predicates

As specified in section 2.3 of RFC 3986, URI constructors automatically decode percent-encoded octets in the range of unreserved characters. This means that the following holds true:

(equal? (uri-reference "http://example.com/foo-bar")
        (uri-reference "http://example.com/foo%2Dbar"))  => #t

(uri-reference STRING) => URI procedure

A URI reference is either a URI or a relative reference (RFC 3986, Section 4.1). If the given string's prefix does not match the syntax of a scheme followed by a colon separator, then the given string is parsed as a relative reference. If STRING is neither a URI nor a relative reference, uri-reference returns #f.
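
As a minimal sketch of the behaviour described above, the following parses one string with a scheme and one without. It follows the `(use uri-generic)` form used elsewhere in this document (on CHICKEN 5 the equivalent would be `(import uri-generic)`). The helper name `parse-or-fail` is hypothetical and not part of the library.

```scheme
(use uri-generic)

;; Hypothetical helper: turn the #f result for unparsable input into an error.
(define (parse-or-fail str)
  (or (uri-reference str)
      (error "not a URI reference" str)))

;; Has a scheme followed by a colon, so it is parsed as a URI.
(define full (parse-or-fail "http://example.com/foo"))
(uri-scheme full)   ; the scheme, as a symbol

;; No scheme prefix, so it is parsed as a relative reference.
(define rel (parse-or-fail "../foo"))
(uri-scheme rel)    ; => #f, a relative reference has no scheme
```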

(uri-reference? URI) => BOOL procedure

Is the given object a URI reference? All objects created by uri-generic constructors are URI references; they are either URIs or relative references. The constructors below are simply stricter versions of uri-reference. They all create URI references.

(absolute-uri STRING) => URI procedure

Parses the given string as an absolute URI, in which no fragments are allowed. If no URI scheme is found, or a fragment is detected, this raises an error.

Absolute URIs are defined by RFC 3986 as non-relative URI references without a fragment (RFC 3986, Section 4.2). Absolute URIs can be used as a base URI to resolve a relative-ref against, using uri-relative-to (see below).
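
The following sketch illustrates the error behaviour described above. The helper `try-absolute-uri` is hypothetical; it relies on CHICKEN's `handle-exceptions` form to turn the raised error into #f.

```scheme
(use uri-generic)

;; A scheme and no fragment: accepted.
(define base (absolute-uri "http://example.com/foo/"))

;; Hypothetical helper: return #f instead of raising an error.
(define (try-absolute-uri str)
  (handle-exceptions exn #f (absolute-uri str)))

(try-absolute-uri "http://example.com/foo#top") ; => #f, a fragment is present
(try-absolute-uri "/foo/bar")                   ; => #f, no scheme
```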

(make-uri #!key authority scheme path query fragment host port username password) => URI procedure

Constructs a URI from the given components.
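
A small sketch of building a URI from components. To avoid assuming anything about the internal representation of paths, the path value is borrowed from a parsed reference; passing the query as a plain string is an assumption based on the accessor signatures listed further down.

```scheme
(use uri-generic)

;; Reuse a parsed path instead of constructing the path list by hand.
(define path (uri-path (uri-reference "/foo/bar")))

(define u
  (make-uri #:scheme 'http
            #:host "example.com"
            #:port 8080
            #:path path
            #:query "q=1"))

(uri-host u)  ; => "example.com"
(uri-port u)  ; => 8080
(print (uri->string u))
```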

(absolute-uri? URI) => BOOL procedure

Is the given object an absolute URI?

(uri? URI) => BOOL procedure

Is the given object a URI? URIs are all URI references that include a scheme part. The other kind of URI reference is the relative reference.

(relative-ref? URI) => BOOL procedure

Is the given object a relative reference? Relative references are defined by RFC 3986 as URI references which are not URIs; they contain no URI scheme and can be resolved against an absolute URI to obtain a complete URI using uri-relative-to.

(uri-path-absolute? URI) => BOOL procedure

Is the URI's path component an absolute path?

(uri-path-relative? URI) => BOOL procedure

Is the URI's path component a relative path?
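
The predicates above can be compared side by side. This is a sketch; the results in the comments are what the definitions in this section imply.

```scheme
(use uri-generic)

(define with-fragment (uri-reference "http://example.com/a/b#top"))
(define no-fragment   (uri-reference "http://example.com/a/b"))
(define relative      (uri-reference "a/b"))

(uri? with-fragment)             ; => #t, it has a scheme
(absolute-uri? with-fragment)    ; => #f, absolute URIs carry no fragment
(absolute-uri? no-fragment)      ; => #t
(relative-ref? relative)         ; => #t, no scheme
(uri-path-relative? relative)    ; => #t, the path does not start with "/"
(uri-path-absolute? no-fragment) ; => #t, the path starts with "/"
```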

Attribute accessors

(uri-authority URI) => URI-AUTH procedure
(uri-scheme URI) => SYMBOL procedure
(uri-path URI) => LIST procedure
(uri-query URI) => STRING procedure
(uri-fragment URI) => STRING procedure
(uri-host URI) => STRING procedure
(uri-port URI) => INTEGER procedure
(uri-username URI) => STRING procedure
(uri-password URI) => STRING procedure
(authority? URI-AUTH) => BOOL procedure
(authority-host URI-AUTH) => STRING procedure
(authority-port URI-AUTH) => INTEGER procedure
(authority-username URI-AUTH) => STRING procedure
(authority-password URI-AUTH) => STRING procedure

If a component is not defined in the given URI, then the corresponding accessor returns #f, except for uri-path, which will always return a (possibly empty) list.
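
As an illustration, the accessors can be applied to a URI that defines every component and to a bare relative reference that defines almost none. The URI and its credentials are made up for the example.

```scheme
(use uri-generic)

(define u
  (uri-reference "http://alice:secret@example.com:8080/foo/bar?q=1#top"))

(uri-scheme u)    ; => http (a symbol)
(uri-host u)      ; => "example.com"
(uri-port u)      ; => 8080
(uri-username u)  ; => "alice"
(uri-password u)  ; => "secret"
(uri-path u)      ; a list representing /foo/bar
(uri-query u)     ; the query component, as a string
(uri-fragment u)  ; the fragment component, as a string

;; The same authority data is reachable through the authority object.
(define auth (uri-authority u))
(authority-host auth)   ; => "example.com"

;; Undefined components yield #f; the path is always a list.
(define rel (uri-reference "foo"))
(uri-host rel)          ; => #f
(list? (uri-path rel))  ; => #t
```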

(update-uri URI #!key authority scheme path query fragment host port username password) => URI procedure
(update-authority URI-AUTH #!key host port username password) => URI-AUTH procedure

Update the specified keys in the URI or URI-AUTH object in a functional way (i.e., a new copy with the modifications is returned and the original object is left unchanged).
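
A brief sketch of the functional update style: the updated copy carries the new values, while the object it was derived from keeps its old ones.

```scheme
(use uri-generic)

(define u (uri-reference "http://example.com/foo"))
(define v (update-uri u #:scheme 'https #:port 8443))

(uri-scheme v)  ; => https
(uri-port v)    ; => 8443
(uri-host v)    ; => "example.com", untouched keys carry over
(uri-scheme u)  ; => http, the original object is unchanged
(uri-port u)    ; => #f

;; The authority object can be updated in the same way.
(define auth (update-authority (uri-authority u) #:username "alice"))
(authority-username auth)  ; => "alice"
```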

String and List Representations

(uri->string URI [USERINFO]) => STRING procedure

Reconstructs the given URI as a string. The optional USERINFO argument is a procedure of two arguments, (lambda (username password) ...), that returns the string used for the userinfo part of the URI. If it is not given, the userinfo is rendered as the username followed by ":******".
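
The sketch below contrasts the default masking with a custom userinfo mapper. The procedure `show-userinfo` is a hypothetical example; it guards against a missing password on the assumption that the password is passed as #f when the URI carries only a username.

```scheme
(use uri-generic)

(define u (uri-reference "http://alice:secret@example.com/foo"))

;; Default: the password is masked as described above.
(print (uri->string u))

;; Hypothetical mapper that reveals the password.
(define (show-userinfo username password)
  (if password
      (string-append username ":" password)
      username))

(print (uri->string u show-userinfo))
```

Masking by default keeps credentials out of logs and error messages; a mapper like the one above is only appropriate when the full string is really needed.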

(uri->list URI USERINFO) => LIST procedure

Returns a list of the form (SCHEME SPECIFIC FRAGMENT); SPECIFIC is of the form (AUTHORITY PATH QUERY).

Reference Resolution

(uri-relative-to URI URI) => URI procedure

Resolve the first URI as a reference relative to the second URI, returning a new URI (RFC 3986, Section 5.2.2).

(uri-relative-from URI URI) => URI procedure

Constructs a new, possibly relative, URI which represents the location of the first URI with respect to the second URI.

(use uri-generic)
(uri->string (uri-relative-to (uri-reference "../qux") (uri-reference "http://example.com/foo/bar/")))
 => "http://example.com/foo/qux"

(uri->string (uri-relative-from (uri-reference "http://example.com/foo/qux") (uri-reference "http://example.com/foo/bar/")))
 => "../qux"
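
For a few more cases, the sketch below resolves some of the references from RFC 3986, Section 5.4.1 against the base URI used there. The helper `resolve` is hypothetical; the targets in the trailing comment are the ones listed in the RFC.

```scheme
(use uri-generic)

;; Base URI used by the examples in RFC 3986, Section 5.4.
(define base (absolute-uri "http://a/b/c/d;p?q"))

;; Hypothetical helper: resolve a reference string against the base.
(define (resolve ref)
  (uri->string (uri-relative-to (uri-reference ref) base)))

(for-each
 (lambda (ref) (print ref " -> " (resolve ref)))
 '("g" "../g" "?y" "#s"))

;; Targets listed in RFC 3986, Section 5.4.1:
;;   g    -> http://a/b/c/g
;;   ../g -> http://a/b/g
;;   ?y   -> http://a/b/c/d;p?y
;;   #s   -> http://a/b/c/d;p?q#s
```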

String encoding and decoding

(uri-encode-string STRING [CHAR-SET]) => STRING procedure

Returns the percent-encoded form of the given string. The optional CHAR-SET argument controls which characters are encoded. It defaults to the complement of char-set:uri-unreserved. This default is always safe but often overly careful: depending on the context, some of these characters may legitimately be left unencoded.

(uri-decode-string STRING [CHAR-SET]) => STRING procedure

Returns the decoded form of the given string. The optional char-set argument controls which characters should be decoded. It defaults to char-set:full.
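
A short sketch of both procedures, first with the default character sets and then with a custom set that leaves "/" unencoded. The name `encode-except-slash` is hypothetical; the char-set operations come from SRFI 14.

```scheme
(use uri-generic srfi-14)

(uri-encode-string "hello world")    ; => "hello%20world"
(uri-decode-string "hello%20world")  ; => "hello world"

;; Encode everything except unreserved characters and "/", for example
;; for text that is already a slash-separated path.
(define encode-except-slash
  (char-set-complement
   (char-set-union char-set:uri-unreserved (char-set #\/))))

(uri-encode-string "a/b c" encode-except-slash)  ; => "a/b%20c"
```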

Normalization

(uri-normalize-case URI) => URI procedure

Applies case normalization to URI and returns the resulting URI (RFC 3986, Section 6.2.2.1).

(uri-normalize-path-segments URI) => URI procedure

Applies path segment normalization to URI, i.e. removal of "." and ".." segments, and returns the resulting URI (RFC 3986, Section 6.2.2.3).
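
The two normalizations can be tried on a deliberately untidy URI. This is a sketch: the comments describe what the cited RFC sections call for rather than output verified against a particular version of the library.

```scheme
(use uri-generic)

(define messy (uri-reference "HTTP://Example.COM/a/./b/../c"))

;; Removing dot segments should turn the path /a/./b/../c into /a/c.
(define tidy-path (uri-normalize-path-segments messy))
(uri-path tidy-path)

;; Per RFC 3986, Section 6.2.2.1, the scheme and host should come out
;; lowercased.
(define lowered (uri-normalize-case messy))
(uri-scheme lowered)
(uri-host lowered)

;; Both steps combined.
(print (uri->string (uri-normalize-case tidy-path)))
```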

Character sets

As a convenience for sub-parsers or other special-purpose URI handling code, there are a couple of character sets exported by uri-generic.

char-set:gen-delims constant

Generic delimiters.

 gen-delims  =  ":" / "/" / "?" / "#" / "[" / "]" / "@"

char-set:sub-delims constant

Sub-delimiters.

 sub-delims  =  "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

char-set:uri-reserved constant

The union of gen-delims and sub-delims; all reserved URI characters.

 reserved    =  gen-delims / sub-delims

char-set:uri-unreserved constant

All unreserved characters that are allowed in a URI.

 unreserved  =  ALPHA / DIGIT / "-" / "." / "_" / "~"

Note that this is _not_ the complement of char-set:uri-reserved! There are several characters (even printable, noncontrol characters) which are not allowed at all in a URI.
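
Membership in these sets can be checked with `char-set-contains?` from SRFI 14. The sketch below also shows the point made in the note: a space is printable, yet it belongs to neither the reserved nor the unreserved set.

```scheme
(use uri-generic srfi-14)

(char-set-contains? char-set:uri-unreserved #\~)  ; => #t
(char-set-contains? char-set:uri-reserved #\~)    ; => #f
(char-set-contains? char-set:gen-delims #\?)      ; => #t
(char-set-contains? char-set:sub-delims #\&)      ; => #t
(char-set-contains? char-set:uri-reserved #\&)    ; => #t

;; A space is in neither set.
(char-set-contains? char-set:uri-reserved #\space)    ; => #f
(char-set-contains? char-set:uri-unreserved #\space)  ; => #f
```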

Requires

Version History

License

Based on the Haskell URI library by Graham Klyne <gk@ninebynine.org>.

 Copyright 2008-2014 Ivan Raikov, Peter Bex.
 All rights reserved.
 
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are
 met:
 
 Redistributions of source code must retain the above copyright
 notice, this list of conditions and the following disclaimer.
 
 Redistributions in binary form must reproduce the above copyright
 notice, this list of conditions and the following disclaimer in the
 documentation and/or other materials provided with the distribution.
 
 Neither the name of the author nor the names of its contributors may
 be used to endorse or promote products derived from this software
 without specific prior written permission.
 
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
 COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
 INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
 OF THE POSSIBILITY OF SUCH DAMAGE.
