string-utils
Documentation
Memoized String
Usage
(import memoized-string)
make-string+
- make-string+ COUNT #!optional FILL [procedure]
An interning make-string.
FILL is any valid char, including codepoints outside of the ASCII range, which produce UTF-8 strings.
string+
- string+ #!optional CHAR... [procedure]
An interning string.
CHAR is any valid char, including codepoints outside of the ASCII range, which produce UTF-8 strings.
global-string
- global-string STR [procedure]
Share common string space.
String Hexadecimal
Usage
(import string-hexadecimal)
string->hex
- string->hex STRING #!optional START END [procedure]
Returns a hexadecimal representation of STRING. START and END are substring limits.
STRING is treated as a string of bytes, a byte-vector.
hex->string
- hex->string STRING #!optional START END [procedure]
Returns the binary representation of a hexadecimal STRING. START and END are substring limits.
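As a quick illustration, the two procedures should act as inverses of each other. The sketch below assumes the egg is installed and that the default START and END cover the whole string; the sample string and the variable names are arbitrary.

```scheme
(import scheme (chicken base) string-hexadecimal)

;; Encode a byte string as hexadecimal text: two hex digits per byte.
(define encoded (string->hex "abc"))

;; Decode it again; the round trip should reproduce the input.
(define decoded (hex->string encoded))

(print encoded)
(print decoded)
```

Since each byte becomes two hexadecimal digits, the encoded string is twice as long as the input.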
Hexadecimal Procedures
Usage
(import to-hex)
str_to_hex
- str_to_hex OUT IN OFF LEN [procedure]
Writes the ASCII hexadecimal representation of IN to OUT.
IN is a nonnull-string.
OFF is the byte offset.
LEN is the length of the bytes at OFF.
OUT is a string of length >= (* LEN 2).
blob_to_hex
- blob_to_hex OUT IN OFF LEN [procedure]
Like str_to_hex except IN is a nonnull-blob.
u8vec_to_hex
- u8vec_to_hex OUT IN OFF LEN [procedure]
Like str_to_hex except IN is a nonnull-u8vector.
s8vec_to_hex
- s8vec_to_hex OUT IN OFF LEN [procedure]
Like str_to_hex except IN is a nonnull-s8vector.
mem_to_hex
- mem_to_hex OUT IN OFF LEN [procedure]
Like str_to_hex except IN is a nonnull-c-pointer.
hex_to_str
- hex_to_str OUT IN OFF LEN [procedure]
Reads the ASCII hexadecimal representation of IN to OUT.
IN is a nonnull-string.
OFF is the byte offset.
LEN is the length of the bytes at OFF.
OUT is a string of length >= (/ LEN 2).
hex_to_blob
- hex_to_blob OUT IN OFF LEN [procedure]
Like hex_to_str except OUT is a blob of size >= (/ LEN 2).
Unicode Utilities
The name of this extension is misleading. Only UTF-8 is currently supported.
For a better treatment of UTF-8 see the utf-8 extension.
Usage
(import unicode-utils)
ascii-codepoint?
- ascii-codepoint? CHAR [procedure]
char->unicode-string
- char->unicode-string CHAR [procedure]
Returns a string formed from Unicode codepoint CHAR.
Note that the (string-length) (except under utf-8) may not be equal to 1.
Generates an error should the codepoint be out-of-range.
unicode-string
- unicode-string #!optional CHAR... [procedure]
Returns a string formed from Unicode codepoints CHAR...
Note that the (string-length) (except under utf-8) may not be equal to the length of CHAR....
Generates an error should the codepoint be out-of-range.
*unicode-string
- *unicode-string CHARS [procedure]
Returns a string formed from Unicode codepoints CHARS, a (list-of char).
unicode-make-string
- unicode-make-string COUNT #!optional FILL [procedure]
Returns a string formed from COUNT occurrences of the Unicode codepoint FILL. The FILL default is #\space.
Note that the (string-length) (except under utf-8) may not be equal to COUNT.
Generates an error should the codepoint be out-of-range.
unicode-surrogate?
- unicode-surrogate? NUM [procedure]
unicode-surrogates->codepoint
- unicode-surrogates->codepoint HIGH LOW [procedure]
Returns the codepoint for the valid surrogate pair HIGH and LOW. Otherwise returns #f.
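The arithmetic behind combining a surrogate pair can be sketched as follows. This is a standalone illustration of the standard UTF-16 formula, not the egg's code: the helper name surrogates->codepoint is hypothetical, and it assumes HIGH and LOW are given as integers.

```scheme
(import scheme (chicken base))

;; Hypothetical helper, not part of the egg: combine a UTF-16 surrogate
;; pair, given as integers, into one codepoint, or return #f when either
;; half lies outside its surrogate range.
(define (surrogates->codepoint high low)
  (and (<= #xD800 high #xDBFF)   ; high (lead) surrogate range
       (<= #xDC00 low #xDFFF)    ; low (trail) surrogate range
       (+ #x10000
          (* (- high #xD800) #x400)
          (- low #xDC00))))

(print (number->string (surrogates->codepoint #xD83D #xDE00) 16)) ; prints 1f600
(print (surrogates->codepoint #x0041 #xDE00))                      ; prints #f
```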
String Utilities
Usage
(import string-utils)
string-split-chars
- string-split-chars STR #!optional DELIMITERS [procedure]
Returns a list of substrings of STR & a list of the characters, from DELIMITERS, separating those substrings.
- STR
- string ; version string.
- DELIMITERS
- string ; string of version component delimiter characters, default ".,".
(string-split-chars "a.2,c" "$,.") ;=> ("a" "2" "c") (#\. #\,)
string-unzip
- string-unzip STR #!optional DELIMITERS [procedure]
Returns a list of substrings of STR & a list of the delimiters, from DELIMITERS, separating those substrings.
- STR
- string ; version string.
- DELIMITERS
- string ; string of version component delimiter characters, default ".,".
(string-unzip "a.2,c" "$,.") ;=> ("a" "2" "c") ("." ",")
string-zip
- string-zip PARTS PUNCS [procedure]
Returns a string formed from the concatenation of the PARTS and the interspersion of the PUNCS.
- PARTS
- (list-of string) ; version components.
- PUNCS
- (list-of string) ; version component separators.
(string-zip '("a" "2" "c") '("." ",")) ;=> "a.2,c"
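Taken together, string-unzip and string-zip should round-trip a delimited string. The sketch below assumes the egg is installed and that string-unzip returns its two lists as multiple values, as the example output above suggests; the variable name rebuilt is arbitrary.

```scheme
(import scheme (chicken base) string-utils)

;; Split into parts and delimiter strings, then reassemble them.
;; Assumption: string-unzip returns two values (parts, then delimiters).
(define rebuilt
  (receive (parts puncs) (string-unzip "a.2,c" "$,.")
    (string-zip parts puncs)))

(print rebuilt)
```

Going by the two documented examples, the result should be the original string "a.2,c".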
string-trim-whitespace-both
- string-trim-whitespace-both S [procedure]
Returns the string S with whitespace trimmed.
list-as-string
- list-as-string LS [procedure]
Returns the list LS written to a string.
number->padded-string
- number->padded-string N WIDTH #!optional PADCHAR BASE [procedure]
- N
- number ; source
- WIDTH
- fixnum ; field width
- PADCHAR
- char ; padding character
- BASE
- fixnum ; number conversion base
string-fixed-length
- (string-fixed-length S N [pad-char: #\space] [trailing: "..."]) -> string [procedure]
Returns the string S with the string-length fixed to N.
A shorter string is padded. A longer string is truncated, & suffixed with the trailing.
string-longest-common-prefix
- string-longest-common-prefix STRINGS [procedure]
Returns the longest common prefix of STRINGS.
- STRINGS
- (list-of string)
string-longest-common-suffix
- string-longest-common-suffix STRINGS [procedure]
Returns the longest common suffix of STRINGS.
- STRINGS
- (list-of string)
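To make "longest common prefix" concrete, here is a minimal standalone sketch of the idea. It is not the egg's implementation: the names prefix-length and common-prefix are hypothetical, and returning "" for an empty list is an assumption that may differ from what the egg does.

```scheme
(import scheme (chicken base))

;; Length of the prefix shared by two strings.
(define (prefix-length a b)
  (let ((limit (min (string-length a) (string-length b))))
    (let loop ((i 0))
      (if (and (< i limit) (char=? (string-ref a i) (string-ref b i)))
          (loop (+ i 1))
          i))))

;; Hypothetical reference version: start with the first string and
;; shrink the candidate prefix against each remaining string in turn.
(define (common-prefix strings)
  (if (null? strings)
      ""                                ; assumption: empty input gives ""
      (let loop ((prefix (car strings)) (rest (cdr strings)))
        (if (null? rest)
            prefix
            (loop (substring prefix 0 (prefix-length prefix (car rest)))
                  (cdr rest))))))

(print (common-prefix '("interpolate" "interpolation" "internal"))) ; prints inter
```

The common suffix can be found the same way by comparing characters from the end of each string instead of the start.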
string-longest-prefix
- string-longest-prefix CANDIDATE OTHERS [procedure]
Returns the member with the longest common prefix of CANDIDATE from OTHERS, or #f.
- CANDIDATE
- string
- OTHERS
- (list-of string)
string-longest-suffix
- string-longest-suffix CANDIDATE OTHERS [procedure]
Returns the member with the longest common suffix of CANDIDATE from OTHERS, or #f.
- CANDIDATE
- string
- OTHERS
- (list-of string)
String Interpolation
Extends the read-syntax with #"..." where tagged scheme expressions in the string are evaluated at runtime:
#"@ #(+ 1 2)## (#'and #1 #2) = #(and 1 2) trailing #" ;=> "@ 3# (and 1 2) = 2 trailing #"
Similar to the #<# multi-line string.
See Multiline String Constant with Embedded Expressions.
Note: support for the #{<sexpr>} subform has been dropped, so SRFI 105 can work as expected:
(import (srfi-105 extra))
#"1 + 3 = #{1 + 3}" ;=> "1 + 3 = 4"
#"An \"#{string-append(\"Hello, \" \"World\")}\" example" ;=> "An \"Hello, World\" example"
Usage
(import string-interpolation)
or using UTF8
(import utf8-string-interpolation)
Compiler Command-Line
csc -extend [utf8-]string-interpolation ...
Interpreter Command-Line
csi -require-extension [utf8-]string-interpolation ...
Activates string-interpolation #"..." syntax.
String Interpolation Syntax
Usage
(import string-interpolation-syntax)
set-sharp-string-interpolation-syntax
- set-sharp-string-interpolation-syntax PROC [procedure]
Extends the read-syntax with #"..." where the "..." is evaluated using (PROC "...").
- PROC
- #f ; read-syntax is cleared.
- PROC
- #t ; PROC is identity.
- PROC
- procedure ; interpolation function.
String Interpolator
Usage
(import string-interpolator)
or using UTF8
(import utf8-string-interpolator)
string-interpolate
- (string-interpolate STR [eval-tag: EVAL-TAG]) -> list [procedure]
Performs substitution of embedded Scheme expressions, prefixed with EVAL-TAG. Two consecutive EVAL-TAGs are translated to a single EVAL-TAG. A trailing EVAL-TAG is taken literally.
- STR
- string.
- EVAL-TAG
- character, default #\#.
Rabin Karp String Search
Usage
(import rabin-karp)
make-string-search
- make-string-search STRINGS #!optional COMPARE HASH [procedure]
- STRINGS
- (list-of string) ;
- COMPARE
- (string string --> boolean) ;
- HASH
- (string [BOUNDS []]) ; SRFI-69 hash procedure.
- SEARCHER
- (string [START [END]]) --> RESULT
- RESULT
- (or #f (STRING . (START . END))) ; success or failure result
collect-string-search
- collect-string-search SEARCHER TARGET [procedure]
Performs an exhaustive search of the TARGET, returning a list of RESULT.
- SEARCHER
- from make-string-search
- TARGET
- string ; search within
- RESULT
- (or #f (STRING . (START . END))) ; success or failure result
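A usage sketch, assuming the egg is installed and that make-string-search returns the SEARCHER procedure described above. The pattern list, target strings, and variable names are made up for illustration, and the default COMPARE and HASH are used.

```scheme
(import scheme (chicken base) rabin-karp)

;; Build a searcher for several patterns, using the default COMPARE and HASH.
(define search (make-string-search '("quick" "lazy")))

;; A single search yields #f or (STRING . (START . END)).
(define first-hit (search "the quick brown fox"))

;; An exhaustive search returns a list of such results.
(define all-hits
  (collect-string-search search "the quick brown fox jumps over the lazy dog"))

(print first-hit)
(print all-hits)
```

Building the searcher once and reusing it across targets avoids repeating the setup work for the pattern list.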
Requirements
check-errors miscmacros srfi-1 srfi-13 srfi-69 utf8
Author
Kon Lovett
Version history
- 2.7.4
- More fixnum, add default delimiter for string-split-chars/string-unzip.
- 2.7.3
- Add tests, more fixnum, fix signatures.
- 2.7.2
- Fix signatures, new test-runner.
- 2.7.1
- Fix version.
- 2.7.0
- Add rabin-karp module.
- 2.6.0
- Remove #{...} support.
- 2.5.6
- Reflow.
- 2.5.5
- Update test-runner.
- 2.5.4
- UTF8.
- 2.5.3
- Add string-split-chars.
- 2.5.2
- Fix potential buffer overflow in to-hex.
- 2.5.0
- Add string-zip & string-unzip.
- 2.4.0
- Add string-longest-common-prefix/suffix, string-longest-prefix/suffix, number->padded-string, list-as-string, string-trim-whitespace-both.
- 2.3.2
- Deprecate unicode-char->string, fixes for memoized-string & string-utils modules, ascii-codepoint? & unicode-surrogate? are not predicates.
- 2.3.1
- Minor optimization.
- 2.3.0
- Deprecate #{...} support. Add string-interpolator modules.
- 2.2.0
- Fix string-interpolation.
- 2.1.0
- Add utf8-string-interpolation.
- 2.0.0
- C5 release.
License
Copyright (C) 2010-2024 Kon Lovett. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the author nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.