SCSH-Process
Description
A reimplementation for Chicken of SCSH's process notation.
Requirements
Documentation
This egg implements all the special forms required to implement SCSH's convenient process notation. This notation is called "EPF", which stands for extended process form.
The procedural equivalents for these special forms are also available. Where it's required or makes sense, a handful of other procedures are also provided. This egg does not strive for full SCSH compatibility; it exists only to bring the convenience of EPF notation to Chicken.
Caveats
There are a few important caveats, especially related to threading. Please read this section thoroughly before using scsh-process! If none of this makes sense yet, read the rest of the manual first, then come back. The issues are listed in order of decreasing importance.
Forking and threads
If you are using threads, keep in mind that by default only the thread that calls fork is kept alive in the child process; all other threads are killed there. This thread killing is completely insensitive to your delicate synchronization primitives or resource management. Before forking, ensure that all locks are released and that other threads hold no non-GCable memory. Any finalizers you have registered will still run, though.
Signal handling
Beware that if you set a signal/chld handler, you will need to remember the original handler and call it from your handler. If you don't, you must manually wait for all children. If you've installed a handler before scsh-process is loaded, it will be automatically chained by the handler installed by scsh-process.
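The chaining described above can be sketched as follows. This is only an illustration: it assumes that signal-handler and set-signal-handler! from (chicken process signal) see the handler that scsh-process installed, and the child-events counter is a hypothetical name used only for this example.

```scheme
(import scsh-process (chicken base) (chicken process signal))

;; Remember whatever handler is currently installed for SIGCHLD.
;; After importing scsh-process this is assumed to be the egg's own handler.
(define previous-handler (signal-handler signal/chld))

;; Hypothetical counter, only here to show that our handler ran.
(define child-events 0)

(set-signal-handler! signal/chld
  (lambda (signum)
    (set! child-events (+ child-events 1))
    ;; Chain to the original handler so child bookkeeping keeps working.
    (when (procedure? previous-handler)
      (previous-handler signum))))
```

If the original handler is not called, the manual says you have to wait for all children yourself.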
Process reaping
Loading scsh-process will transparently cause the posix process-wait procedure to be replaced so it can update the bookkeeping information about reaped child processes. See the description in the SCSH manual about how SCSH performs process reaping. Currently only the "early" reaping strategy is supported.
Macros
The special forms provided by this egg are the most convenient to use, but also slightly less powerful than the underlying procedures they use.
Basic process macros
- (run pf [redirection ...]) [syntax]
- (& pf [redirection ...]) [syntax]
- (exec-epf pf [redirection ...]) [syntax]
These forms run a new process pipeline, described by process form pf, with optional redirections of input and output descriptors indicated by any number of redirection patterns. These processes don't interact with the calling process; if you want to capture their output there are other forms, like run/string, run/port and friends. Please also beware that these forms don't do anything to indicate nonzero exit statuses in pipelines. The && and || macros might help if you need to do status checks.
The run form simply runs the process and waits for the final process in the pipeline to exit. It returns three values relating to this final process:
- Either its exit status as an integer if the process terminated normally, or the signal number that terminated/stopped the process.
- #t if the process exited normally, #f if it terminated abnormally.
- Its process id (an integer).
You'll note that these are just the same values returned by process-wait, but in a slightly different order (wait utilises this modified order, as well). This is for compatibility reasons: In SCSH this form returns only the exit status, but because Chicken accepts multiple values in single value contexts (discarding all but the first), we can provide the other ones as extra values, thereby avoiding gratuitous incompatibility.
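As a minimal sketch of consuming these values, the following binds all three with receive from (chicken base). The sh -c "exit 3" command is just an illustrative child with a known exit status, and describe-exit and status-only are names made up for this example.

```scheme
(import scsh-process (chicken base))

;; Run a child that exits with status 3 and bind all three return values.
(define (describe-exit)
  (receive (status normal? pid) (run (sh -c "exit 3"))
    (list status normal? (integer? pid))))

(print (describe-exit))

;; In a single-value context only the first value (the exit status) is kept.
(define status-only (run (sh -c "exit 3")))
```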
The & form simply runs the process in the background and returns one value: a process object representing the last process in the pipeline.
The exec-epf form never returns; it replaces the current process by the last process in the pipeline. The others are implemented in terms of this form:
    (& . epf) => (process-fork (lambda () (exec-epf . epf)) #t)
    (run . epf) => (process-wait (& . epf))
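A short sketch of the background form in use, assuming a sleep program is available on the PATH. The names child and child-status are only for illustration; wait and proc:pid are described under the procedural interface.

```scheme
(import scsh-process (chicken base))

;; Start a command in the background; & returns a process object right away.
(define child (& (sleep 1)))

;; The parent keeps running while the child sleeps.
(print "child pid: " (proc:pid child))

;; Collect the exit status later; wait accepts the process object directly.
(define child-status
  (receive (status normal? pid) (wait child)
    status))
```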
A process form followed by a set of redirections is called "extended process form" in SCSH terminology. A process form can be one of the following:
    (<program> <arg> ...)
    (pipe <pf> ...)
    (pipe+ <connect-list> <pf> ...)
    (epf <pf> <redirection> ...)
    (begin <s-expr> ...)
The basic building blocks are <program> and <begin>, which always correspond to one process in a pipeline. The other rules are ways to combine these two.
<program> gives the name of a program to run. If it starts with a slash, it's an absolute filename. If it contains a slash but doesn't start with one, it's a relative filename. If it doesn't contain a slash, it's a command name that is located in PATH. One or more <arg>s can be given as arguments to the program.
<program> and each <arg> are implicitly quasiquoted. <program> can be a string, symbol, number, or an unquote expression ,e that evaluates to a value of one of those types. Each <arg> follows the same rules as <program>, but can additionally be an unquote-splicing expression ,@e that evaluates to a list of multiple arguments where each element is a string, symbol, or number.
NOTE: An unquote-splicing expression is not recognized in <program> position, so it's not possible to obtain <program> and <arg>s from the same ,@ expression. If you have both the program name and the arguments in the same list, use car and cdr to destructure the list before passing it to run.
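The destructuring suggested in the note could look like this. The command list is a hypothetical value standing in for whatever list your program has built.

```scheme
(import scsh-process (chicken base))

;; Hypothetical command held in one list: program name first, then arguments.
(define command '("echo" "hello" "world"))

;; Unquote the program name and unquote-splice the remaining arguments.
(define echo-status
  (run (,(car command) ,@(cdr command))))
```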
The pipe rule will hook up standard output of each pf to the next pf's standard input, just like in a regular Unix shell pipeline. Standard error is not redirected; use pipe+ if you need to connect it as well.
The pipe+ rule is like pipe, but it allows you to hook up arbitrary file descriptors between two neighbouring processes. This is done through the connect-list, which is a list of fd-mappings describing how ports are connected from one process to the next. It has the form ((from-fd1 from-fd2 ... to-fd) ...). The from-fds correspond to outbound file descriptors in one process, the to-fds correspond to inbound file descriptors in the other process.
The epf rule is to get extended process forms in contexts where only process forms are accepted, like the pipe and pipe+ subforms, and in the && and || macros (so you can do file redirects here).
The begin rule allows you to write scheme code which will be run in a forked process, having its current-input-port, current-output-port and current-error-port hooked up to its neighbouring processes in the pipeline.
A redirection can be one of the following:
    (> [<fd>] <file-name>)        ; Write fd (default: 1) to the given filename
    (>> [<fd>] <file-name>)       ; Like >, but append instead of overwriting
    (< [<fd>] <file-name>)        ; Read fd (default: 0) from the filename
    (<< [<fd>] <scheme-object>)   ; Like <, but use object's printed representation
    (= <fd> <fd-or-port>)         ; Redirect fd to fd-or-port
    (- <fd-or-port>)              ; Close fd
    stdports                      ; Duplicate fd 0, 1, 2 from standard Scheme ports
The arguments to redirection rules are also implicitly quasiquoted.
To tie it all together, here are a few examples:
    (import scsh-process)

    ;; Writes "1235" to a file called "out" in the current directory.
    ;; Shell equivalent: echo 1234 + 1 | bc > out
    (run (pipe (echo "1234" + 1) ("bc")) (> out))

    (define message "hello, world")

    ;; Writes 13 to stdout, with a forked Scheme process writing the data.
    ;; Shell equivalent (sort of): echo 'hello, world' | wc -c
    (run (pipe (begin (display message) (newline)) (wc -c)))

    ;; A verbose way of doing the same, using pipe+. It connects the begin
    ;; form's standard output and standard error to standard input of wc:
    (run (pipe+ ((1 2 0)) (begin (display message) (newline)) (wc -c)))

    ;; Same as above, using redirection instead of port writing:
    (run (wc -c) (<< ,(string-append message "\n")))

    ;; Writes nothing because stdout is closed:
    (run (wc -c) (<< ,message) (- 1))

    ;; A complex toy example using nested pipes, with input/output redirection.
    ;; Closest shell equivalent:
    ;; ((sh -c "echo foo >&2") 2>&1 | cat) | cat
    (run (pipe+ ((1 0))
                (pipe+ ((2 0)) (sh -c "echo foo >&2") (cat))
                (cat)))
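One more sketch, this time contrasting > with >> and using an unquoted file name. The scratch file comes from create-temporary-file in (chicken file); log-file and log-lines are names chosen for this example only.

```scheme
(import scsh-process (chicken base) (chicken file) (chicken io))

;; Hypothetical scratch file; create-temporary-file returns its path.
(define log-file (create-temporary-file))

;; > truncates the file, >> appends to it. The path is unquoted because
;; redirection arguments are implicitly quasiquoted.
(run (echo "first") (> ,log-file))
(run (echo "second") (>> ,log-file))

;; Read the file back as a list of lines, then clean up.
(define log-lines
  (call-with-input-file log-file (lambda (port) (read-lines port))))

(delete-file log-file)
```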
Process macros for interfacing with Scheme
- (run/port pf [redirection ...]) [syntax]
- (run/file pf [redirection ...]) [syntax]
- (run/string pf [redirection ...]) [syntax]
- (run/strings pf [redirection ...]) [syntax]
- (run/sexp pf [redirection ...]) [syntax]
- (run/sexps pf [redirection ...]) [syntax]
These forms are equivalent to run, except they wire up the current process to the endpoint of the pipeline, allowing you to read the standard output from the pipeline as a whole. If you also need standard error or have even more specialized needs, take a look at the run/collecting form.
The difference between these forms is in how this output is returned, and when the call returns:
- run/port immediately returns after forking, and returns a port from which you can read.
- run/file returns after the final process exits, resulting in a string naming a temporary file that contains the process's output.
- run/string returns when the process closes its standard output (i.e., when EOF is read), collecting the standard output into a string.
- run/strings is like run/string, but returns a list of strings, split at newline characters.
- run/sexp reads one s-expression, and returns as soon as a complete s-expression has been read.
- run/sexps reads s-expressions until EOF is reached, and returns them as a list.
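The differences can be sketched with a few small commands. This assumes echo and sh are available on the PATH; all the variable names are illustrative.

```scheme
(import scsh-process (chicken base) (chicken io))

;; Whole output as one string, including the trailing newline from echo.
(define greeting (run/string (echo "hello")))

;; Output split into a list of lines.
(define lines (run/strings (sh -c "echo one; echo two")))

;; The first s-expression printed by the child.
(define datum (run/sexp (echo "(1 2 3)")))

;; All s-expressions printed by the child, up to EOF.
(define data (run/sexps (echo "a (b c) 42")))

;; A port that can be read at your own pace.
(define port (run/port (echo "from a port")))
(define first-line (read-line port))
(close-input-port port)
```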
Macros for conditional process control
- (&& pf ...) [syntax]
- (|| pf ...) [syntax]
These macros act like their counterpart shell operators; they run the given process forms in sequence and stop either on the first "false" value (nonzero exit) or on the first "true" value (zero exit), respectively.
The result value of these is #f or #t, so they act a lot like regular Scheme and and or.
Note: In Chicken's reader, the name of the || macro is really the empty symbol, whereas in SCSH's reader it reads as a symbol consisting of two pipe characters. The names of these macros may change in the future if this turns out to cause too much trouble.
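A small sketch of both macros, assuming the standard true and false programs are on the PATH. The last form uses the epf rule to attach a redirection to one of the process forms.

```scheme
(import scsh-process (chicken base))

;; && stops at the first failing command, like `true && false` in a shell.
(define both-ok (&& (true) (false)))

;; || stops at the first succeeding command, like `false || true`.
(define either-ok (|| (false) (true)))

;; epf allows redirections inside these forms.
(define quiet-ok
  (&& (epf (echo "discarded") (> "/dev/null")) (true)))
```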
Collecting multiple outputs
- (run/collecting fds pf ...) [syntax]
This form runs the pf form, redirecting each of the file descriptors in the fds list to a separate tempfile, and waits for the process to complete.
The result of this expression is (status file ...). status is the exit status of the process. Each file entry is an opened input port for the temporary file that belongs to the file descriptor at the same offset in the fds list. If you close the port, the tempfile is removed.
See the SCSH documentation for an extended rationale of why this works the way it does.
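A hedged sketch of collecting standard output and standard error separately. It assumes the results are delivered as multiple values (status first, then one port per descriptor), as in SCSH; collected is a name made up for this example.

```scheme
(import scsh-process (chicken base) (chicken io))

(define collected
  ;; Assumption: run/collecting returns status and ports as multiple values.
  (receive (status out err)
      (run/collecting (1 2)
        (sh -c "echo to-stdout; echo to-stderr >&2; exit 2"))
    (let ((result (list status (read-line out) (read-line err))))
      ;; Closing the ports also removes the temporary files.
      (close-input-port out)
      (close-input-port err)
      result)))
```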
Procedural interface
These procedures form the basis for the special forms documented above, and can be used to implement your own, more specialized macros.
Basic forking and pipeline primitives
- fork #!optional thunk continue-threads? [procedure]
- %fork #!optional thunk continue-threads? [procedure]
If thunk is provided and not #f, the child process will invoke the thunk and exit when it returns. If continue-threads? is provided and #t, all existing threads will be kept alive in the child process; by default only the current thread will be kept alive.
fork differs from the regular process-fork in its return value. Instead of a pid value, a process object representing the child is returned in the parent process. When thunk is not provided, #f (not zero!) is returned in the child process.
- wait #!optional pid-or-process nohang [procedure]
Like process-wait, but nonblocking: only the current thread is suspended until the child process described by pid-or-process (either a numerical process ID or a scsh-process object) has terminated, as detected by the UNIX system call waitpid(). If pid-or-process is not given, then this procedure waits for any child process. If nohang is given and not #f, the call returns immediately instead of waiting.
This procedure returns three values, in a different order from process-wait:
- Either the exit status, if the process terminated normally, or the signal number that terminated/stopped the process.
- #t if the process exited normally or #f otherwise.
- pid or 0
All values are #f if nohang is true and the child process has not terminated yet.
It is allowed to wait multiple times for the same process after it has completed, if you pass a process object. Process IDs can only be waited for once after they have completed and will cause an error otherwise.
This procedure is nonblocking, which means you can wait for a child in a thread and have the other threads continue to run; the process isn't suspended, but a signal handler will take care of unblocking the waiting thread.
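The following sketch combines fork and wait. It assumes a sleep program on the PATH for the nohang case; wait-status is a hypothetical helper that keeps only the first return value.

```scheme
(import scsh-process (chicken base))

;; Hypothetical helper: wait for a process and return only its status.
(define (wait-status proc . nohang)
  (receive (status normal? pid) (apply wait proc nohang)
    status))

;; Fork a child whose thunk exits with a recognisable status.
(define child (fork (lambda () (exit 7))))

(define first-result (wait-status child))
;; Waiting again is allowed because we pass a process object, not a pid.
(define second-result (wait-status child))

;; With nohang, wait returns immediately while the child is still running.
(define sleeper (& (sleep 1)))
(define pending (wait-status sleeper #t))
(define finished (wait-status sleeper))
```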
- signal-process proc signal [procedure]
Like process-signal from the POSIX unit, but accepts a process object (proc) instead of a pid. Sends signal (an integer) to the given process.
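A minimal sketch, assuming a sleep program on the PATH and signal/term from (chicken process signal). It terminates a background process and inspects how it ended; outcome is an illustrative name.

```scheme
(import scsh-process (chicken base) (chicken process signal))

;; Start a long-running child in the background.
(define child (& (sleep 60)))

;; Ask it to terminate, using the process object rather than a pid.
(signal-process child signal/term)

;; For an abnormal exit, the first value is the signal number.
(define outcome
  (receive (status normal? pid) (wait child)
    (list status normal?)))
```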
- process-sleep sec [procedure]
Put the entire process to sleep for sec seconds. Just an alias for sleep from the POSIX unit.
- process? object [procedure]
- proc? object [procedure]
Is object an object representing a process? The process? predicate is deprecated; proc? is API-compatible with SCSH.
- proc:pid proc [procedure]
Retrieve the process id (an integer) from the process object proc.
- fork/pipe #!optional thunk continue-threads? [procedure]
- %fork/pipe #!optional thunk continue-threads? [procedure]
These fork the process as per fork or %fork, but additionally they set up a pipe between parent and child. The child's standard output is set up to write to the pipe, while the parent's standard input is set up to read from the pipe. Standard error is inherited from the parent.
The return value is a process object or #f.
Currently %fork/pipe is just an alias for fork/pipe.
Important: These procedures only set up the file descriptors, not the Scheme ports. current-input-port, current-output-port and current-error-port still refer to their old file descriptors after a fork. This means that you'll need to reopen the descriptors to get a Scheme port that reads from the child or writes to the parent:
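    (import scsh-process (chicken file posix) (chicken io) (chicken port))

    ;; The child writes to the pipe through a port reopened on fd 1.
    ;; wait is used here because fork/pipe returns a process object.
    (wait
     (fork/pipe (lambda ()
                  (with-output-to-port (open-output-file* 1)
                    (lambda () (display "Hello, world.\n"))))))

    ;; The parent reads from the pipe through a port reopened on fd 0.
    (read-line (open-input-file* 0)) => "Hello, world."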
- fork/pipe+ conns #!optional thunk continue-threads? [procedure]
- %fork/pipe+ conns #!optional thunk continue-threads? [procedure]
These are like fork/pipe and %fork/pipe, except they allow you to control how the file descriptors are wired. conns is a list of lists, of the form ((from-fd1 from-fd2 ... to-fd) ...). See the description of pipe+ under the run special form for more information.
Currently %fork/pipe+ is just an alias for fork/pipe+.
Executing programs from the path
- (exec-path program [args ...]) [procedure]
- exec-path* program arg-list [procedure]
This will simply execute program with the given arguments in args or arg-list. The program replaces the currently running process. All arguments (including program) must be strings, symbols or numbers, and they will automatically be converted to strings before invoking the program.
The program is looked up in the user's path, which is simply the $PATH environment variable. There currently is no separately maintained path list like in SCSH. The difference between the two procedures is that exec-path accepts a variable number of arguments and exec-path* requires them pre-collected into a list.
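Because both procedures replace the running process, they are normally called in a forked child. The sketch below assumes echo and sh on the PATH; child-status is a hypothetical helper that forks, runs a thunk and returns the exit status.

```scheme
(import scsh-process (chicken base))

;; Hypothetical helper: run thunk in a child and return its exit status.
(define (child-status thunk)
  (receive (status normal? pid) (wait (fork thunk))
    status))

;; Variable number of arguments; symbols and numbers are converted to strings.
(define varargs-status
  (child-status (lambda () (exec-path "echo" 'hello 42))))

;; Arguments already collected in a list.
(define arguments '("-c" "exit 5"))
(define list-status
  (child-status (lambda () (exec-path* "sh" arguments))))
```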
Pipeline procedures
- run/port* thunk [procedure]
- run/file* thunk [procedure]
- run/string* thunk [procedure]
- run/strings* thunk [procedure]
- run/sexp* thunk [procedure]
- run/sexps* thunk [procedure]
These set up a pipe between the current process and a forked off child process which runs the thunk. See the "unstarred" versions run/port, run/file ... run/sexps for more information about the semantics of these procedures.
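As a sketch of building your own abstraction on top of these, the hypothetical command-output procedure below runs a program chosen at run time and returns its output as a string. It assumes echo is on the PATH.

```scheme
(import scsh-process (chicken base))

;; Hypothetical helper: run program with args in a child and capture stdout.
(define (command-output program . args)
  (run/string* (lambda () (exec-path* program args))))

(define captured (command-output "echo" "hello" "from" "a" "thunk"))
```

The thunk calls exec-path*, so the child's standard output (already connected to the pipe) is inherited by the executed program.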
Collecting multiple outputs
- run/collecting* fds thunk [procedure]
Like run/collecting, but use a thunk instead of a process form.
Changelog
- 1.6.0 - Fix permission of files created to not be executable by default, as workaround to strange default permissions in CHICKEN's file-open (#1698, thanks to Vasilij Schneidermann).
- 1.5.2 - Forgot to import chicken.fixnum in CHICKEN 5 (thanks to Evan Hanson)
- 1.5.1 - Switch to llrb-tree egg instead of srfi-69.
- 1.5.0 - Fix a race condition which caused SIGCHLD to be missed sometimes, and replace srfi-69 with llrb-tree (thanks to Jörg F. Wittenberger).
- 1.4.0 - Port to CHICKEN 5 (thanks to Vasilij Schneidermann for providing an initial patch).
- 1.3.0 - Fix > redirection to truncate existing files (thanks to Jörg F. Wittenberger).
- 1.2.2 - Fix (= 1 2) style redirection to be the correct way around, to match documentation and the original SCSH implementation (thanks to Diego "dieggsy").
- 1.2.1 - Do not connect stdout of subprocess to stdin of parent in run/file; this is not needed because only the returned file should be written to (thanks to Diego "dieggsy").
- 1.2.0 - Do not redirect stderr to stdout in fork/pipe, run/file* and run/file; instead, stderr is inherited from the parent (thanks to Jörg F. Wittenberger). This improves compatibility with scsh.
- 1.1.0 - Move signal handler into a separate thread to allow signaling the thread that was interrupted by the handler.
- 1.0.0 - Fix fork restoration of signal mask to what it was before fork rather than blindly unmasking it. Fix wait test with #f argument. Fix (conditional) unmasking of signal/chld in the child thunk after performing a fork.
- 0.9.0 - Fix race condition that sometimes caused the loss of the exit statuses of already reaped child processes, which would cause an infinite loop when waiting for the process again.
- 0.8.3 - Fix wait with a plain pid or #f so that it updates any corresponding scsh-process objects that may exist (thanks to Jörg F. Wittenberger).
- 0.8.2 - Clean up file descriptors in run/..* procedures (again, thanks to Jörg F. Wittenberger).
- 0.8.1 - Reinstall deadlock detection workaround thread after forking and killing all threads (thanks to Jörg F. Wittenberger).
- 0.8 - Add support for waiting for children from threads without blocking the entire process (thanks to Jörg F. Wittenberger).
- 0.7.1 - Fix version number in .setup file (thanks to Jörg F. Wittenberger).
- 0.7 - Actually export proc:pid. Clear pending child process table on fork (thanks to Jörg F. Wittenberger).
- 0.6 - Add signal-process, process-sleep and proc:pid. Deprecated process? in favor of proc?. Thanks to Jörg F. Wittenberger for the suggestion of adding some of these.
- 0.5 - Standard error is no longer redirected by default, making it more consistent with UNIX shells and the original SCSH. Thanks to Haochi Kiang for pointing this out and providing a patch.
- 0.4.1 - Allow the use of unquote-splicing in run macro forms (thanks to Moritz Heidkamp for pointing this out)
- 0.4 - Support continue-threads parameter for fork on newer Chickens which support kill-all-threads option in process-fork. This should make this library safe for use in threaded programs.
- 0.3.1 - Change wait result values to all be #f if nohang is #t.
- 0.3 - Fix segfault properly, revert back to using standard-extension. Add wait and do not accept scsh-process objects in POSIX process-wait.
- 0.2.1 - Workaround for segmentation fault caused by compiled version of scsh-process in 4.8.0 due to -O3 being the default compilation of a standard-extension.
- 0.2 - Fix <<-redirection and increase robustness of test for it.
- 0.1.2 - Fix setup-file to have non-bogus version identifier. Don't rely on "bc" being present in the tests.
- 0.1.1 - Fix order of values returned by process-wait to be consistent with POSIX unit.
- 0.1 - Initial release, created at the Chicken UK 2012 hacking event.
Author
Repository
http://code.more-magic.net/scsh-process
License
Copyright (c) 2012-2020, Peter Bex
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.