phricken

Synopsis

phricken is a flexible Gopher server implemented on top of the gopher extension.

Overview

The phricken egg provides a high-level interface to the Gopher protocol, along with a Gopher server. Client requests are passed to a configurable chain of handlers, giving you complete control over the meaning of Gopher selectors. Handlers are included for URLs, files, directory listings, and Scheme gophermap files (an S-expression syntax for Gopher response entries). You may also define new handlers and attach them wherever you like in the selector space. Finally, detailed logging is performed by default.

Interface

Entries

(make-entry type name sel #!optional (host (host)) (port (port))) procedure
(make-info-entry . msg) procedure
(make-error-entry . msg) procedure
(make-url-entry name url) procedure

make-entry creates a record consisting of the five main fields in RFC 1436, in the fashion of gopher#make-entry. Here, the host and port fields are optional and will be filled in from the (host) and (port) module parameters. Fields may be of any type, as they are converted to strings via ->string before sending.

The other three procedures are convenience functions creating entries of type i, 3 and h respectively. As in pygopherd, info and error entries have their selectors set to "fake", hosts to "(NULL)", and ports to "0".

Examples:

(make-entry 'I "Picture of me" "/me.jpg")
(make-info-entry "There are " time-til-boom " seconds until self-destruct")

(sgm->entry expr) procedure

sgm->entry converts a Scheme gophermap entry (s-expression) to a Gopher entry record, using the sgm-rules alist to transform it. Here are a couple of example SGM entries:

(i "Blog entry for " y "-" m "-" d)
(1 "Public directory" "/pub")
(h "3e8.org hypertext service" "http://3e8.org")

Entries are meant to look exactly like you'd expect a gophermap to look if implemented using s-expressions, instead of the more typical flat file. However, the exact behavior is dictated by the sgm-rules.

sgm-rules parameter

sgm-rules is an alist mapping a type symbol to a transformer procedure. To transform, the appropriate procedure is looked up using the first field (type), and then the entire entry is passed to the procedure via apply. It is therefore natural to use the existing make-entry family of procedures here.

Here's the default definition of sgm-rules:

(define sgm-rules
  (make-parameter
   `((*default* . ,make-entry)
     (i . ,(lambda (type . msg)    (apply make-info-entry msg)))
     (3 . ,(lambda (type . msg)    (apply make-error-entry msg)))
     (h . ,(lambda (type name url) (make-url-entry name url))) )))

If the entry type is not found, the rule *default* is consulted; an error is signaled if no rules match.
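
As an illustration of extending the rules, the sketch below adds a shorthand type for the duration of one call. The pic type symbol and the names pic-rule and my-rules are hypothetical, and the (use phricken) line assumes CHICKEN 4 with the egg installed.

```scheme
(use phricken)   ; assumption: CHICKEN 4 with the phricken egg installed

;; Hypothetical shorthand: (pic "caption" "/path.jpg") expands to an
;; ordinary type I (image) entry.  The rule receives the whole SGM
;; entry as its arguments, type symbol first.
(define (pic-rule type name sel)
  (make-entry 'I name sel))

;; Put the new rule in front of whatever rules are currently in effect.
(define my-rules
  (cons (cons 'pic pic-rule) (sgm-rules)))

;; The extended rules apply only inside the parameterize body.
(parameterize ((sgm-rules my-rules))
  (sgm->entry '(pic "Me at the Apollo" "/pix/apollo.jpg")))
```

Because sgm-rules is a parameter, the existing rules, including *default*, stay in place for everything outside the parameterize form.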

(send-entry e) procedure

Sends one entry to the client. e may be an entry record or an sgm entry (s-expr), which is automatically converted into an entry record.

(send-entry `(3 "Invalid selector " ,selector))

(send-entries L) procedure

Sends multiple entries to the client, using send-entry.

(send-entries
 `((i "Chat log")
   (i "--------")
   (i)
   ,@(map (lambda (x)
            `(i ,(utc-seconds->string (car x))
                " | " ,(cadr x)))
          (read-file (chatfile)))
   (i)
   (7 "Say"     ,(request-selector req))
   (1 "Refresh" ,(request-selector req))
   (1 "Go home" "")))

Handlers

(handle-request selector extra) procedure
(request selector matches extra) record
handlers parameter

handle-request is the primary handler procedure, suitable for passing to gopher#accept.

It executes the handlers in (handlers) in order until one returns a true value. If a handler throws an exception, processing terminates immediately, a generic internal server error is sent to the user, the error is logged, and the exception is re-signaled.

Each handler is passed a request record. The selector and extra fields are taken directly from the arguments to handle-request (see gopher#accept for an explanation). The matches field is initially the empty list, but may be modified by the matcher procedures, which perform a regex match on the selector and set matches to the submatch results.
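
A custom handler might look like the following sketch. The handler name handle-hello and the "/hello" selector are made up for illustration. The sketch also assumes that send-entries does not emit the end-of-transmission line itself, so it calls send-lastline explicitly.

```scheme
(use phricken)   ; assumption: CHICKEN 4 with the phricken egg installed

;; Hypothetical handler: answers the selector "/hello" with a short menu
;; and declines (returns #f) for everything else, so that the next
;; handler in the chain gets a chance.
(define (handle-hello req)
  (and (string=? (request-selector req) "/hello")
       (begin
         (send-entries
          `((i "Hello from phricken")
            (1 "Go home" "")))
         ;; Assumption: the terminator must be sent by the handler.
         (send-lastline)
         #t)))

;; Try our handler first, then whatever was configured before.
(handlers (cons handle-hello (handlers)))
```

Returning #f for selectors the handler does not recognize is what lets handle-request fall through to the remaining handlers.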

(handle-open-dir root) procedure

Returns a handler which generates a directory listing for any directory under document root ROOT, using filenames->entries to determine how to generate an entry for each filename.

Expects to be wrapped in match-resource, which supplies the path as the second submatch.

(match-resource "/pub/www" (handle-open-dir "/var/www/myhost/pub"))
; Now selector /pub/www/files sends an index of the 
; directory /var/www/myhost/pub/files

(handle-file root) procedure

Returns a file handler for the document root at ROOT. The filename path will be taken from the request's second submatch and so is generally wrapped in a match-resource.

This handler sends every file via gopher#send-binary-file, even type 0 text files. This seems to be okay with modern (ahem) Gopher clients, which are less stringent about a terminating full stop and don't require that lines end in CRLF. Clients therefore receive a verbatim copy of the text file instead of a transformed one, much as they would over the web. If you wish to treat text files separately, you might define a similar handler which uses extension-type to distinguish between text and binary files.

(match-resource "/pub/www" (handle-file "/var/www/myhost/pub"))
; Now selector /pub/www/todo.txt sends /var/www/myhost/pub/todo.txt

(handle-sgm root) procedure
(sgm-filename "index.sgm") parameter

Serves up Scheme gophermaps. If sgm-filename exists in the directory indicated by the selector (relative to ROOT), its contents are read as a Scheme gophermap and the resulting entries are sent. The file is read with read-file.

Expects to be wrapped in match-resource, which supplies the path as the second submatch.

Example:

(match-resource "" (handle-sgm "/var/phricken/root"))
; An access to /pix will now render the
; contents of /var/phricken/root/pix/index.sgm

where index.sgm might contain:

(i "My pictures")
(i "-----------")
(I "Me at the Apollo" "/pix/apollo.jpg")
(I "Me at Carnegie Hall" "/pix/carnegie.jpg")

handle-url procedure
(url-redirect-time 0) parameter

A handler which sends a meta redirect HTML page to the user. The destination URL comes from the request's first submatch, so this is usually used with match-url.

url-redirect-time can be parameterized to set the content refresh time.

(match-url handle-url)

Matchers

The matcher procedures act as handler "gatekeepers". They take an existing handler and return a new handler which performs a regular expression match on the selector before proceeding with the original handling. If the match fails, #f is immediately returned.

This allows you to 'mount' a handler anywhere you want in selector space. The request record is also updated with any regex submatch information, which is required by some handlers.
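
The gatekeeper idea can be modeled without the egg. In the self-contained sketch below, requests are plain strings and the match is a prefix test instead of a regex. The names string-prefix-of?, match-prefix and wiki are illustrative only and are not part of phricken.

```scheme
;; Self-contained model of a matcher; no phricken needed.

;; True if PREFIX is a leading substring of S.
(define (string-prefix-of? prefix s)
  (and (<= (string-length prefix) (string-length s))
       (string=? prefix (substring s 0 (string-length prefix)))))

;; Wrap HANDLER so that it only runs when the selector starts with
;; PREFIX.  On a match the handler receives the remainder of the
;; selector; otherwise the wrapper returns #f without calling it.
(define (match-prefix prefix handler)
  (lambda (selector)
    (and (string-prefix-of? prefix selector)
         (handler (substring selector
                             (string-length prefix)
                             (string-length selector))))))

(define wiki
  (match-prefix "/wiki" (lambda (rest) (list 'wiki rest))))

(wiki "/wiki/FrontPage")   ; => (wiki "/FrontPage")
(wiki "/pub/file.txt")     ; => #f
```

The real matchers follow the same shape, except that they match with a regex and record the submatches in the request record rather than passing a substring.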

(match-selector rx handler) procedure

Returns a new handler that matches the incoming request selector against regex RX using string-match, and calls HANDLER with the request object. Any submatches will be added to the matches field of the request (i.e., it is the CDR of the result of string-match).

(match-resource resource handler) procedure

Returns a handler that matches a selector "resource" -- this is just a shortcut for match-selector, matching the directory (posix-string or SRE) you provide as "resource", plus optional subdirectory path.

For example, the resource "/wiki" is matched as "(/wiki)($|/.*)", and those two submatches are provided in the request. The handle-file and handle-open-dir handlers expect exactly this.

(match-url handler) procedure

Convenience matcher for URL:xxx selectors; the first submatch will be the URL, as expected by handle-url.

Handler helpers

These handler helpers are used by handle-open-dir, and can also be used in your own handlers.

(extension-type ext) procedure

Convert a filename extension EXT (a case-insensitive string) to a Gopher entry type (a symbol) using extension-type-map.

(extension-type (pathname-extension "me.jpg")) ; => I

extension-type-map parameter

Case-insensitive map of file extension (as symbol) to 1-character Gopher entry type (as symbol).

(define extension-type-map
 (make-parameter
  `((txt . 0) (log . 0) (scm . 0) (sgm . 0) (c . 0) (h . 0)
    (png . I) (gif . g) (jpg . I) (svg . I))))
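
To recognize more extensions, the map can be extended in the same style as the default. This is a sketch only: the md and jpeg mappings are illustrative choices, and (use phricken) assumes CHICKEN 4 with the egg installed.

```scheme
(use phricken)   ; assumption: CHICKEN 4 with the phricken egg installed

;; Prepend two illustrative mappings, keeping every default entry.
(extension-type-map
 (append '((md . 0) (jpeg . I))
         (extension-type-map)))
```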

path->entry parameter
((path->entry) dir fn dir-sel) procedure

Convert pathname into a Gopher entry. DIR is the directory on disk; FN is the file's basename; DIR-SEL is the selector corresponding to DIR.

Returns an entry object or an SGM entry; either is permissible. Generated entries need not be file entries; they might be, for example, info entries!

This is a parameter used by filenames->entries and ultimately by handle-open-dir, so override it if you would like to change how directory contents are presented to the user.

The default value is a procedure that maps directories to type 1, other files based on extension-type-map, and defaults to binary type 9. Symbolic links are currently ignored.

(filenames->entries dir basenames dir-sel) procedure

Invokes path->entry on a list of basenames instead of just one. If path->entry returns #f for any entry, it is omitted from the resulting list.

DIR is the containing directory on disk; BASENAMES are the basenames of the files, such as those provided via the (directory dir) call; DIR-SEL is the absolute selector corresponding to this directory (not relative to any resource).
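
Since filenames->entries drops any file for which path->entry returns #f, a replacement procedure can filter a listing and defer to the stock behavior for everything else. The sketch below hides dotfiles. The names default-path->entry and hide-dotfiles are hypothetical, and (use phricken) assumes CHICKEN 4 with the egg installed.

```scheme
(use phricken)   ; assumption: CHICKEN 4 with the phricken egg installed

;; Keep the stock procedure so ordinary files are still classified
;; by extension-type-map.
(define default-path->entry (path->entry))

;; Hypothetical policy: omit files whose basename starts with a dot.
;; Returning #f makes filenames->entries leave the file out.
(define (hide-dotfiles dir fn dir-sel)
  (and (not (and (> (string-length fn) 0)
                 (char=? (string-ref fn 0) #\.)))
       (default-path->entry dir fn dir-sel)))

;; Install the new policy for all subsequent directory listings.
(path->entry hide-dotfiles)
```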

Utilities

(any-handler . handlers) procedure

Returns a handler which executes HANDLERS in order and returns the first true value, or #f. Useful when you have more than one handler you'd like to try against a particular matched selector.
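
The described behavior can be modeled in a few lines of plain Scheme. This is a sketch of the semantics stated above, not the egg's source. The name any-handler* is chosen to avoid clashing with the real procedure, and requests are modeled as strings.

```scheme
;; Self-contained model of any-handler; no phricken needed.
;; Calls each handler in turn with REQ and returns the first true
;; result, or #f if every handler declines.
(define (any-handler* . handlers)
  (lambda (req)
    (let loop ((hs handlers))
      (and (pair? hs)
           (or ((car hs) req)
               (loop (cdr hs)))))))

(define h
  (any-handler* (lambda (r) #f)                             ; always declines
                (lambda (r) (and (string=? r "/a") 'second)) ; only "/a"
                (lambda (r) 'fallback)))                     ; catches the rest

(h "/a")   ; => second
(h "/b")   ; => fallback
```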

(bind-fs sel root) procedure

Utility function which 'mounts' fs ROOT on resource selector SEL with default filesystem handlers.

Handlers used are handle-sgm, handle-open-dir, handle-file.

(define (bind-fs sel root)
  (match-resource
   sel
   (any-handler (handle-sgm root)
                (handle-open-dir root)
                (handle-file root))))

(sanitize-filename fn) procedure

Sanitize filename FN; currently just removes any references to a parent directory "..".

(selector->filename s root) procedure

Converts a selector string into a filename string by prepending the ROOT path. Also confirms the file exists and the user has read permission. Returns #f on failure.

(send-line line) procedure

Send a single line to the client, and terminate it with a CRLF.

(send-lastline) procedure

Send an end-of-transmission indicator to the client, which is simply a period on a line by itself.

(utc-seconds->string seconds) procedure

Convert seconds since UNIX epoch into a UTC time string suitable for logging.

(utc-seconds->string (current-seconds))
;=> "2009-02-13 21:32:18"

Logging

logger parameter
((logger) type req . msg) procedure
(logger-port (current-error-port)) parameter

The default logger implementation logs a formatted message to (logger-port), or skips logging if the port is #f. No locking is performed. The logger seeks to the end of the port before writing, but it is still recommended that any log file be opened in #:append mode.

It is legal for REQ to be #f if a request has not yet been created--for example, upon early error, or initial connect.

TYPE can be any symbol; current types are 'connect, 'access, 'error, 'redirect. By default, types are not treated specially, just displayed in the log message.

The logger parameter can be overridden to use your own logging procedure, as long as it implements the interface above.
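
For instance, a replacement logger can wrap the default one. The sketch below suppresses 'connect messages and passes everything else through unchanged. The names default-logger and quiet-logger are hypothetical, and (use phricken) assumes CHICKEN 4 with the egg installed.

```scheme
(use phricken)   ; assumption: CHICKEN 4 with the phricken egg installed

;; Remember the logger that is currently installed.
(define default-logger (logger))

;; Hypothetical filter with the same (type req . msg) interface:
;; drop 'connect lines and delegate all other types.
(define (quiet-logger type req . msg)
  (unless (eq? type 'connect)
    (apply default-logger type req msg)))

(logger quiet-logger)
```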

Miscellaneous parameters

(host (get-host-name)) parameter
(port 70) parameter
(listen-address #f) parameter
client-ip parameter

host and port are used in make-entry as the default hostname and port, and port is used in start-server! to determine which port to listen on. Note that host must be a DNS name which clients can resolve to reach your machine.

listen-address is the IP address to listen on (bind to), as a string, or #f for the unspecified address.

client-ip is read-only; you can read it inside a handler to determine the IP address of the remote end.

Finally, the server

(start-server! #!optional (bg #f)) procedure

Starts a new threaded server on (port) using the tcp-server extension. Upon connection, control is passed to gopher#accept, which will then dispatch back to our own handle-request.

If optional BG is #t, the server will itself be started in a new thread, allowing you to debug at the REPL.

The tcp6 extension is used to permit IPv6 support, so tcp6 parameters such as tcp-bind-ipv6-only and tcp-buffer-size are applicable.
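
Putting the pieces together, a minimal server script might look like the sketch below. The host name, port number and document root are placeholders, and (use phricken) assumes CHICKEN 4 with the egg installed.

```scheme
(use phricken)   ; assumption: CHICKEN 4 with the phricken egg installed

(host "gopher.example.org")   ; placeholder; must resolve for your clients
(port 7070)                   ; unprivileged port for testing; default is 70

;; Handle URL: selectors first, then serve the filesystem tree
;; from the top of the selector space.
(handlers
 (list (match-url handle-url)
       (bind-fs "" "/var/phricken/root")))   ; placeholder document root

;; Runs in the foreground; pass #t to run in a new thread instead.
(start-server!)
```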

Examples

In addition to the inline examples in this document, there is an operational phricken server running at gopher://3e8.org and gopher://ipv6.3e8.org. It implements several custom handlers, a simple config file, listening on multiple addresses, and IPv6 support.

The example source code is also available in the egg.

Author

Jim Ursetto

Version history

License

Copyright (c) 2009-2011 Jim Ursetto.  All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

- Redistributions of source code must retain the above copyright notice,
  this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.
- Neither the name of the author nor the names of its contributors
  may be used to endorse or promote products derived from this software
  without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
