
Outdated egg!

This is an egg for CHICKEN 4, the unsupported old release. You're almost certainly looking for the CHICKEN 5 version of this egg, if it exists.

If it does not exist, there may be equivalent functionality provided by another egg; have a look at the egg index. Otherwise, please consider porting this egg to the current version of CHICKEN.

string-utils

Documentation

Memoized String

Usage

(require-extension memoized-string)

make-string+

[procedure] make-string+ COUNT #!optional FILL

A tabling make-string.

FILL is any valid char, including codepoints outside of the ASCII range. As such UTF-8 strings can be memoized.

string+

[procedure] string+ #!optional CHAR...

A tabling string.

CHAR is any valid char, including codepoints outside of the ASCII range. As such UTF-8 strings can be memoized.

global-string

[procedure] global-string STR

Share common string space.

make-string* (DEPRECATED)

[procedure] make-string* COUNT #!optional FILL

String Hexadecimal

Usage

(require-extension string-hexadecimal)

string->hex

[procedure] string->hex STRING #!optional START END

Returns a hexadecimal representation of STRING. START and END are substring limits.

STRING is treated as a string of bytes, a byte-vector.

hex->string

[procedure] hex->string STRING #!optional START END

Returns the binary representation of a hexadecimal STRING. START and END are substring limits.
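
As a minimal sketch of how the two procedures pair up, the following round trip encodes a string and decodes it again. It assumes the CHICKEN 4 egg is installed; the variable names are illustrative only, and the letter case of the hexadecimal digits is not stated in this document, so no literal output is shown.

```scheme
(require-extension string-hexadecimal)

;; Encode a string byte by byte, then decode the result again.
(define encoded (string->hex "abc"))
(define decoded (hex->string encoded))

(print encoded)                   ; two hexadecimal digits per input byte
(print (string=? decoded "abc"))  ; a round trip should give back the input
```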

Unicode Utilities

The name of this extension is misleading. Only UTF-8 is currently supported.

For a better treatment of UTF-8 see the utf-8 extension.

Usage

(require-extension unicode-utils)

ascii-codepoint?

[procedure] ascii-codepoint? CHAR

unicode-char->string

[procedure] unicode-char->string CHAR

Returns a string formed from Unicode codepoint CHAR.

Note that the string-length of the result (except under utf8) may not be equal to 1, since a codepoint outside the ASCII range occupies more than one byte.

Generates an error should the codepoint be out-of-range.

unicode-string

[procedure] unicode-string #!optional CHAR...

Returns a string formed from Unicode codepoints CHAR...

Note that the string-length of the result (except under utf8) may not be equal to the number of CHAR arguments.

Generates an error should the codepoint be out-of-range.

*unicode-string

[procedure] *unicode-string CHARS

Returns a string formed from Unicode codepoints CHARS, a (list-of char).

unicode-make-string

[procedure] unicode-make-string COUNT #!optional FILL

Returns a string formed from COUNT occurrences of the Unicode codepoint FILL. The FILL default is #\space.

Note that the string-length of the result (except under utf8) may not be equal to COUNT.

Generates an error should the codepoint be out-of-range.
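
The length caveat can be observed with a non-ASCII FILL. This is a small sketch, assuming the CHICKEN 4 egg is installed and the utf8 egg is not loaded; the variable name is illustrative.

```scheme
(require-extension unicode-utils)

;; Three copies of U+03BB (Greek small letter lambda), which takes
;; two bytes when encoded as UTF-8.
(define s (unicode-make-string 3 (integer->char #x3BB)))

(print s)
;; Without utf8, string-length counts bytes, so this may exceed COUNT.
(print (string-length s))
```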

unicode-surrogate?

[procedure] unicode-surrogate? NUM

unicode-surrogates->codepoint

[procedure] unicode-surrogates->codepoint HIGH LOW

Returns the codepoint for the valid surrogate pair HIGH and LOW. Otherwise returns #f.
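
For reference, the standard UTF-16 rule for combining a surrogate pair can be sketched in plain Scheme. The helper below is hypothetical and is not the egg's implementation; it assumes HIGH and LOW are given as integers, which this document does not state.

```scheme
;; Hypothetical helper, independent of the egg: combine a UTF-16
;; surrogate pair into a codepoint, or return #f for an invalid pair.
(define (surrogates->codepoint* high low)
  (and (<= #xD800 high #xDBFF)    ; high (lead) surrogate range
       (<= #xDC00 low #xDFFF)     ; low (trail) surrogate range
       (+ #x10000
          (* (- high #xD800) #x400)
          (- low #xDC00))))

(surrogates->codepoint* #xD83D #xDE00) ; => 128512, i.e. #x1F600
(surrogates->codepoint* #x0041 #xDE00) ; => #f, #x41 is not a high surrogate
```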

String Extensions

Some multi-string replacements.

Usage

(require-extension string-utils-extensions)

string-copy-over!

[procedure] string-copy-over! FROM TO #!optional START END

Copies the substring of FROM delimited by START and END into TO. Returns the modified TO.

FROM is a string.

TO is a string.

START is a fixnum, default 0.

END is a fixnum, default string-length.

string-count*

[procedure] (string-count* PRED? [STR ...]) -> fixnum

STR is a string.

PRED? is a (#!rest char --> boolean).

string-any*

[procedure] (string-any* PRED? [STR ...]) -> (or boolean char)

STR is a string.

PRED? is a (#!rest char --> boolean).

string-every*

[procedure] (string-every* PRED? [STR ...]) -> (or boolean char)

STR is a string.

PRED? is a (#!rest char --> boolean).

String Utilities

Reexports all of the above.

Usage

(require-extension string-utils)

Bytes to Hexadecimal

A common bytevector-like object to hexadecimal string facility.

No error checking is performed!

Usage

(require-extension to-hex)

str_to_hex

[procedure] str_to_hex OUT IN OFF LEN

Writes the ASCII hexadecimal representation of IN to OUT.

IN is a nonnull-string.

OFF is the byte offset.

LEN is the number of bytes to convert, starting at OFF.

OUT is a string of length >= (* LEN 2), since each byte is written as two hexadecimal digits.
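
Because no error checking is performed, the caller is responsible for allocating OUT. A minimal sketch, assuming the CHICKEN 4 egg is installed and sizing OUT at two characters per input byte; the variable names are illustrative, and the letter case of the digits is not stated in this document.

```scheme
(require-extension to-hex)

(define in "abc")
(define len (string-length in))

;; Allocate two characters per input byte for the hexadecimal digits.
(define out (make-string (* 2 len)))

;; Offsets and lengths are not validated, so they must be correct here.
(str_to_hex out in 0 len)
(print out)
```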

blob_to_hex

[procedure] blob_to_hex OUT IN OFF LEN

Like str_to_hex except IN is a nonnull-blob.

u8vec_to_hex

[procedure] u8vec_to_hex OUT IN OFF LEN

Like str_to_hex except IN is a nonnull-u8vector.

s8vec_to_hex

[procedure] s8vec_to_hex OUT IN OFF LEN

Like str_to_hex except IN is a nonnull-s8vector.

mem_to_hex

[procedure] mem_to_hex OUT IN OFF LEN

Like str_to_hex except IN is a nonnull-c-pointer.

hex_to_str

[procedure] hex_to_str OUT IN OFF LEN

Reads the ASCII hexadecimal representation in IN and writes the decoded bytes to OUT.

IN is a nonnull-string.

OFF is the byte offset.

LEN is the length of the bytes at OFF.

OUT is a string of length >= (/ LEN 2).

hex_to_blob

[procedure] hex_to_blob OUT IN OFF LEN

Like hex_to_str except OUT is a blob of size >= (/ LEN 2).

String Interpolation

Usage

(require-extension string-interpolation)
(require-extension utf8-string-interpolation)

string-interpolate

[procedure] (string-interpolate STR [eval-tag: EVAL-TAG] [eval-env: EVAL-ENV]) -> string

Performs substitution of embedded Scheme expressions, evaluated in the EVAL-ENV, prefixed with EVAL-TAG and optionally enclosed in curly brackets. Two consecutive EVAL-TAGs are translated to a single EVAL-TAG.

Similar to the #<# multi-line string.

STR is a string.

EVAL-TAG is a character, default #\#.

EVAL-ENV is an environment, default (interaction-environment).
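
A short sketch of direct calls, assuming the CHICKEN 4 egg is installed. The first string is the one this document uses for the #"..." read syntax, where the result is given as "foo 3bar 2 baz"; the second call only illustrates the eval-tag: keyword with an arbitrary tag character, and the variable names are illustrative.

```scheme
(require-extension string-interpolation)

;; Default tag #\#: a bare expression and a curly-bracketed one.
(define plain (string-interpolate "foo #(+ 1 2)bar #{(and 1 2)} baz"))
(print plain)

;; The same mechanism with a different tag character.
(define custom (string-interpolate "total: $(* 6 7)" eval-tag: #\$))
(print custom)
```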

Usage

(require-extension string-interpolation-syntax)

set-sharp-string-interpolation-syntax

[procedure] set-sharp-string-interpolation-syntax PROC

Extends the read-syntax with #"..." where the "..." is evaluated using (PROC "..."). When PROC is #f the read-syntax is cleared. When PROC is #t then PROC is identity.

(use string-interpolation-syntax utf8-string-interpolation)

(set-sharp-string-interpolation-syntax string-interpolate)
;#"foo #(+ 1 2)bar #{(and 1 2)} baz"
;=> "foo 3bar 2 baz"

Requirements

check-errors miscmacros utf8

setup-helper test

Author

Kon Lovett

Version history

1.6.0
Add string-utils-extensions.
1.5.6
Add types.
1.5.5
1.5.4
1.5.3
memorize-string -> global-string.
1.5.2
Fix string+ & memorize-string.
1.5.1
Fix string+ unicode support.
1.5.0
Deprecate make-string* for make-string+, add memorize-string & string+.
1.4.0
Add string-interpolation modules.
1.3.1
Fix hex_to_str, hex_to_blob.
1.3.0
Add hex->string, hex_to_str, hex_to_blob.
1.2.5
Remove lookup-table.
1.2.2
Unicode string construction a little faster. Removed blob->hex.
1.2.1
Added blob->hex.
1.2.0
Added "generic" bytes to hexadecimal string.
1.1.0
Split into separate modules. Added some UTF-8 support.
1.0.0
Hello

License

Copyright (C) 2010-2017 Kon Lovett. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
