crunch compiles a severely restricted, statically typed subset of R5RS Scheme to C++. It can be used to generate standalone executables or code embedded into Scheme programs.
This extension is highly experimental and likely to contain many bugs and incomplete functionality.
To use crunch in your Scheme code, wrap the toplevel forms to be compiled in a (crunch ...) expression. All toplevel procedure definitions are accessible as global or local procedures (depending on the context in which the crunch form occurs) and are callable from Scheme. The crunch macro can only be used in compiled code. To use the macro, put
  (use crunch)
in your code.
Alternatively, the chicken-crunch program can be used to translate and compile code in the crunch Scheme dialect into C++. The generated code has no dependencies. Only the header file crunch.h must be available and in your C++ compiler's include path. When installing this extension with chicken-install, the file will be located in your default include path, usually $PREFIX/include.
The compiler can also be used through its procedural API, see crunch-compile. In that case, load the runtime-part of the compiler with
Crunched procedures are in every respect identical to C/C++ functions called via the usual CHICKEN foreign function interface. Crunch does not know anything about Scheme data or memory management. Translated code can call back into Scheme (see define-crunch-callback). Callbacks are usually detected automatically, and the generated Scheme wrapper function for a crunched procedure will be of the appropriate type, if required.
Crunch uses its own macro expander, a modified version of Al Petrofsky's alexpander, an R5RS-compliant implementation of syntax-rules macros.
No garbage collector is used. All dynamically allocated data (strings and number vectors) are managed using reference-counting.
The dialect of R5RS Scheme supported is extremely limited. See Bugs and limitations for more information.
Note that if you use the crunch macro in your code, you must compile the file generated by chicken in C++ mode (just pass -c++ to csc when compiling).
To get maximum performance, inlining must be enabled in the C++ compiler when compiling crunch-generated code. The default optimization options do not enable inlining unless the default C compiler options have been overridden during installation of CHICKEN. Passing -C -O3 to csc for crunched code will usually optimize the C++ code considerably.
- (crunch-compile EXPRESSION [PORT debug: DBGMODE entry-point: SYMBOL]) procedure
Compiles the toplevel expression EXPRESSION into C++ code, writing the generated code to PORT, which defaults to the value of (current-output-port). If DBGMODE is given, debugging output will be written to the current output port. DBGMODE can be a boolean or a number between 1 and 3. Debug mode 1 shows some information about each compiled procedure; debug mode 3 generates large amounts of diagnostic output about the type-inferencing process and the expanded code.
If the entry-point name SYMBOL is given, then the (normally hidden) toplevel variable of the same name holding a pointer to the associated C++ function can be accessed from C/C++ code, i.e. it is exposed under the same name. Note that the exposed variable is a pointer to a function.
Each invocation of crunch-compile creates its own private namespace, so global variables are not visible in subsequent compilation runs in the same process. Syntax definitions, however, persist across invocations.
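As a hedged illustration of the procedural API, the sketch below compiles a single definition and collects the generated C++ in a string. It assumes that the runtime part of the compiler has already been loaded as described earlier. The procedure name square and the variable generated-c++ are hypothetical, and call-with-output-string comes from CHICKEN's ports unit rather than from crunch.

```scheme
;; Sketch only: assumes crunch-compile is already available in this process
;; (load the compiler's runtime part first, as described in the text above).
(use ports)  ; CHICKEN's ports unit, for call-with-output-string

;; `square` is a hypothetical example procedure, not part of crunch.
(define generated-c++
  (call-with-output-string
    (lambda (port)
      (crunch-compile
        '(define (square x::int)     ; x::int declares the parameter as an int
           (* x::int x::int))
        port))))                     ; PORT: receives the generated C++ code

;; Show the generated code.
(display generated-c++)
```

Passing debug: 1 after the port argument would additionally write brief per-procedure information to the current output port.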
- (crunch-expand EXPRESSION) procedure
Expands all macros in the given toplevel expression and returns the expansion.
- (crunch EXPRESSION ...) syntax
Compiles the given toplevel expressions and expands into a set of function definitions and an invocation of the compiled toplevel expressions in EXPRESSION. The form can be used in a definition context but ends in a non-definition form (and so, with some macro systems, cannot be followed by other definitions). Calls to Scheme callbacks are detected automatically and generate the appropriate foreign-safe-lambda definition. The result of the executed toplevel code is unspecified.
- (define-crunch-primitives ((NAME ARGTYPE ...) -> RESULTTYPE [C-NAME]) ...) syntax
Defines additional primitives with the given names, argument types and result types. If C-NAME is given, it specifies the name of the actual C/C++ function to be called. Otherwise NAME is used.
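A minimal sketch of declaring an extra primitive follows. It assumes that the C library function hypot is declared in the generated C++ translation unit (for example through a math header pulled in by crunch.h), which the text above does not confirm. The wrapper name hyp is hypothetical, and C-NAME is omitted so that NAME itself is used as the function name.

```scheme
(use crunch)

;; Declare a primitive named hypot taking two doubles and returning a double.
;; Assumption: a C/C++ function of that name is visible to the generated code.
(define-crunch-primitives
  ((hypot double double) -> double))

(crunch
  ;; `hyp` is an illustrative wrapper that makes the primitive callable
  ;; from Scheme.
  (define (hyp a::double b::double)
    (hypot a::double b::double)))

(print (hyp 3.0 4.0))
```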
- (define-crunch-callback (NAME (ARGFTYPE1 VAR1) ...) RESULTFTYPE BODY ...) syntax
Equivalent to define-external, but makes the callback accessible in subsequent translations of crunch code.
Note that you have to pass -emit-external-prototypes-first to csc (or chicken) when you use crunch callbacks to place function prototypes for the callbacks in front of code generated by crunch.
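The following sketch shows how a callback might be combined with a crunch form. The names log-value and twice are hypothetical, and the (TYPE VAR) argument layout follows the define-external convention named above. The file would need to be compiled in C++ mode with -emit-external-prototypes-first, as described.

```scheme
(use crunch)

;; Hypothetical callback: an ordinary Scheme procedure that crunched code
;; may call. Arguments are given as (TYPE VAR) pairs, as in define-external.
(define-crunch-callback (log-value (int n)) void
  (print "value from crunched code: " n))

(crunch
  ;; `twice` is an illustrative name. It calls back into Scheme and then
  ;; returns an int to its Scheme caller.
  (define (twice k::int)
    (log-value k::int)
    (* 2 k::int)))

(print (twice 21))
```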
The program chicken-crunch can be used to generate a standalone program or module that has no CHICKEN dependencies.
usage: chicken-crunch OPTION | FILENAME ...

  -h            show this message
  -o FILENAME   set output filename
  -d            enable debug output
  -dd           enable more debug output
  -ddd          enable massive debug output
  -cc CC        select C++ compiler (default: "c++")
  -expand       only show code after expansion
  -entry NAME   set entry-point procedure
  -top NAME     set name of initialization procedure
  -translate    only generate C++, don't compile

All other options (arguments beginning with "-") are passed to the C++ compiler.
FILENAME may be "-", which reads source code from stdin.
Provided the file crunch.h is in the include path, the generated C++ code can be compiled by itself. To link, you may have to add the -lm switch to the linker, depending on the platform on which you are compiling the code.
Crunch performs type-inference to find out the types of local and global variables. It currently knows about these types:
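|Crunch type||C type||Description|
|int||int||integer number|
|short||short||integer number|
|long||long||integer number|
|float||float||floating point number|
|double||double||floating point number|
|void||void||the type of the "unspecified" value|
|blob||void *||a shapeless byte sequence|
|c-pointer||void *||an opaque pointer|
|u8vector||unsigned char *||SRFI-4 homogeneous number vector|
|s8vector||signed char *||SRFI-4 homogeneous number vector|
|u16vector||unsigned short *||SRFI-4 homogeneous number vector|
|s16vector||short *||SRFI-4 homogeneous number vector|
|u32vector||unsigned int *||SRFI-4 homogeneous number vector|
|s32vector||int *||SRFI-4 homogeneous number vector|
|f32vector||float *||SRFI-4 homogeneous number vector|
|f64vector||double *||SRFI-4 homogeneous number vector|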
Important: callbacks are likely to trigger a garbage collection, which will invalidate references to number-vectors or strings allocated in normal Scheme code. This does not apply to data allocated inside crunched code, which is not subject to garbage collection.
Variables defined with define or set! or bound with let or in a lambda list can be declared to have a particular type by suffixing them with :: followed by a typename:
  (crunch
    (let ((a::int (* 8 (sin 1))))
      (display a::int)))            ; shows "8"
Note that the name of the variable really is a::int, not a. You usually don't need these declarations, though.
Note also the absence of any other data types, in particular lists, vectors or record structures.
Crunched functions may return results of the following types:
char int short long float double c-string c-pointer
Polymorphic procedures are not supported.
The following non-standard macros are provided:
cond-expand when unless switch rec
cond-expand recognizes the feature identifiers crunch, srfi-0, highlevel-macros and syntax-rules. When code is compiled to a standalone program with chicken-crunch, the feature identifier crunch-standalone is defined as well.
All primitives take a fixed number of arguments; optional or "rest" arguments are not supported. Primitives may not be redefined. Uses of primitives in non-operator position are treated as (lambda (tmp1 ...) (<primitive> tmp1 ...)).
Argument type abbreviations:
|O O1 O2||any data object|
|N N1 N2||integer|
|K K1 K2||positive integer|
|R R1 R2||inexact number|
|S S1 S2||string|
|C C1 C2||character|
|U8 S8 U16 S16 U32 S32 F32 F64||SRFI-4 number vector|
The following R5RS procedures are provided:
(eq? O1 O2) (eqv? O1 O2) (equal? O1 O2)
(+ X Y) (- X Y) (* X Y) (/ X Y) (= X Y) (> X Y) (< X Y) (>= X Y) (<= X Y) (abs X) (acos R) (asin R) (atan R) (ceiling X) (cos R) (display O) (even? N) (exact? X) (exact->inexact X) (exp R) (expt R1 R2) (floor X) (inexact? X) (inexact->exact X) (integer? X) (log R) (max X Y) (min X Y) (modulo N1 N2) (negative? X) (odd? N) (positive? X) (quotient N1 N2) (remainder N1 N2) (round X) (sin R) (sqrt X) (tan R) (truncate X) (zero? X)
max, min and expt are not exactness preserving. expt always returns an inexact result.
(char=? C1 C2) (char>? C1 C2) (char<? C1 C2) (char>=? C1 C2) (char<=? C1 C2) (char->integer C) (char-alphabetic? C) (char-ci=? C1 C2) (char-ci>? C1 C2) (char-ci<? C1 C2) (char-ci>=? C1 C2) (char-ci<=? C1 C2) (char-downcase C) (char-lower-case? C) (char-numeric? C) (char-upper-case? C) (char-upcase C) (char-whitespace? C) (integer->char K)
(number->string X K) (make-string N C) (string=? S1 S2) (string>? S1 S2) (string<? S1 S2) (string>=? S1 S2) (string<=? S1 S2) (string->number S K) (string-ci=? S1 S2) (string-ci>? S1 S2) (string-ci<? S1 S2) (string-ci>=? S1 S2) (string-ci<=? S1 S2) (string-append S1 S2) (string-copy S) (string-fill! S1 C) (string-length S) (string-ref S K) (string-set! S K C) (substring S K1 K2)
string->number does not detect invalid numerical syntax and simply wraps strtol(3)/strtod(3). If a radix different from 10 is given, the result will always be converted with strtol(3).
number->string ignores the radix argument if the converted number is inexact.
(display X) (newline) (write-char C)
write-char, display and newline always write to stdout.
Non-R5RS procedures (see the User's Manual for more information):
(add1 X) (atan2 R1 R2) (arithmetic-shift N1 N2) (bitwise-and N1 N2) (bitwise-ior N1 N2) (bitwise-not N) (bitwise-xor N1 N2) (sub1 X)
(f32vector-length F32) (f32vector-ref F32 K) (f32vector-set! F32 K R) (f64vector-length F64) (f64vector-ref F64 K) (f64vector-set! F64 K R) (make-f32vector K R) (make-f64vector K R) (make-s16vector K N) (make-s32vector K N) (make-s8vector K N) (make-u16vector K1 K2) (make-u32vector K1 K2) (make-u8vector K1 K2) (s16vector-length S16) (s16vector-ref S16 K) (s16vector-set! S16 K N) (s32vector-length S32) (s32vector-ref S32 K) (s32vector-set! S32 K N) (s8vector-length S8) (s8vector-ref S8 K) (s8vector-set! S8 K N) (subf32vector F32 K1 K2) (subf64vector F64 K1 K2) (subs16vector S16 K1 K2) (subs32vector S32 K1 K2) (subs8vector S8 K1 K2) (subu16vector U16 K1 K2) (subu32vector U32 K1 K2) (subu8vector U8 K1 K2) (u16vector-length U16) (u16vector-ref U16 K) (u16vector-set! U16 K1 K2) (u32vector-length U32) (u32vector-ref U32 K) (u32vector-set! U32 K1 K2) (u8vector-length U8) (u8vector-ref U8 K) (u8vector-set! U8 K1 K2)
(blob->f32vector B) (blob->f32vector/shared B) (blob->f64vector B) (blob->f64vector/shared B) (blob->s16vector B) (blob->s16vector/shared B) (blob->s32vector B) (blob->s32vector/shared B) (blob->s8vector B) (blob->s8vector/shared B) (blob->string B) (blob->string/shared B) (blob->u16vector B) (blob->u16vector/shared B) (blob->u32vector B) (blob->u32vector/shared B) (blob->u8vector B) (blob->u8vector/shared B) (f32vector->blob F32) (f32vector->blob/shared F32) (f64vector->blob F64) (f64vector->blob/shared F64) (s16vector->blob S16) (s16vector->blob/shared S16) (s32vector->blob S32) (s32vector->blob/shared S32) (s8vector->blob S8) (s8vector->blob/shared S8) (string->blob S) (string->blob/shared S) (u16vector->blob U16) (u16vector->blob/shared U16) (u32vector->blob U32) (u32vector->blob/shared U32) (u8vector->blob U8) (u8vector->blob/shared U8)
The .../shared conversion procedures return data objects that share their actual storage with the argument objects, so a modification made through one object is visible through the other.
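To make the storage sharing concrete, here is a small sketch that uses only primitives listed above. The procedure name shared-demo is hypothetical. If the storage is shared as described, the value read through v is the one written through w.

```scheme
(use crunch)

(crunch
  ;; `shared-demo` is an illustrative name, not part of crunch.
  (define (shared-demo)
    (let* ((v (make-u8vector 4 0))          ; four bytes, all zero
           (b (u8vector->blob/shared v))    ; blob over v's storage
           (w (blob->u8vector/shared b)))   ; second vector over the same bytes
      (u8vector-set! w 0 42)                ; write through w ...
      (u8vector-ref v 0))))                 ; ... and read back through v

(print (shared-demo))
```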
(void) (error S) (exit N) (argc) (argv-ref K)
error shows a message and invokes abort(3). argc returns the number of arguments passed to the process (including the program name) and argv-ref returns the command line argument with the given index (or the program name, when the index is zero).
(pointer-u8-ref P N) (pointer-s8-ref P N) (pointer-u16-ref P N) (pointer-s16-ref P N) (pointer-u32-ref P N) (pointer-s32-ref P N) (pointer-f32-ref P N) (pointer-f64-ref P N) (pointer-u8-set! P N1 N2) (pointer-s8-set! P N1 N2) (pointer-u16-set! P N1 N2) (pointer-s16-set! P N1 N2) (pointer-u32-set! P N1 N2) (pointer-s32-set! P N1 N2) (pointer-f32-set! P N R) (pointer-f64-set! P N R)
- Pass -DDBGALLOC to the C++ compiler (either through chicken-crunch or to csc via -C -DDBGALLOC) to see log messages about the allocation and de-allocation of dynamic number vectors or strings.
- Runtime errors invoke abort(3) and thus cannot be caught.
- Lexical scope is not supported: only references to global variables and to variables local to the current lambda construct (including let-bound variables) are visible. Expressions of the form ((lambda (...) ...) ...) are converted into the corresponding let construct.
- Local procedures are not available.
- letrec is not supported (it makes no sense without local procedures).
- Continuations are not supported.
- Multiple values are not supported.
- Tail calls are only detected in self-recursive functions.
- Rest-arguments (dotted lambda lists) are not supported.
- Numeric overflow of fixnum operations is not detected.
- Nearly no error checks are made at runtime.
- Named let is always assumed to be a looping construct; calls to the loop variable must be in tail position.
- do and named let loops always return an unspecified value.
- The correctness of the C++ template code is unclear. C++ is insane.
- If a homogeneous number vector or string is passed from Scheme to C++ code generated by crunch, the length of the passed array is not known, and the associated ...-length primitive, as well as primitives that require the length of the vector, will abort.
- Type-related errors do not always produce particularly useful context information.
- Error messages are generally pretty bad.
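Several of the restrictions above shape how loops are written. The sketch below shows two ways of summing the integers from 1 to n that stay within them. The procedure names are hypothetical, and the definitions are meant to be wrapped in a (crunch ...) form or passed to chicken-crunch.

```scheme
;; Self-recursive call in tail position: per the list above, this is the
;; one case in which tail calls are detected.
(define (sum-from i::int n::int acc::int)
  (if (> i::int n::int)
      acc::int
      (sum-from (add1 i::int) n::int (+ acc::int i::int))))

;; A do loop returns an unspecified value, so the result is accumulated
;; in a let-bound variable and returned after the loop has finished.
(define (sum-to n::int)
  (let ((acc 0))
    (do ((i 1 (add1 i)))
        ((> i n::int))
      (set! acc (+ acc i)))
    acc))
```

A named let would not help for returning the sum directly, since such loops also return an unspecified value.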
  (use crunch)

  (crunch
    (define (string-reverse str)
      (let* ((n (string-length str))
             (s2 (make-string n #\space)))
        (do ((i 0 (add1 i)))
            ((>= i n))
          (string-set! s2 (sub1 (- n i)) (string-ref str i)))
        s2)) )

  (print (string-reverse "this is a test!"))
Copyright (c) 2007-2012, Felix L. Winkelmann
The "alexpander" is Copyright (c) 2002-2004, Al Petrofsky
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the author nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- another setup-script fix (thanks to Mario)
- added "-top" option, some bugfixes
- fixed bug in installation script (thanks to Jim Pryor)
- fixed bug related to callbacks with void result type
- removed unused test files
- two bugfixes (thanks to Jeronimo)
- fixed silly mistake
- fixed bug in setup script
- ported to CHICKEN 4
- updated to newest alexpander
- fixed buggy formatting directive
- support for libarena by Ivan Raikov
- fixed bugs in character handling [thanks to Alex Shinn]
- fixed bugs in naming of char->integer and integer->char
- initial release