Often during heavy text processing, a substring (see String Selection) is
either immediately subject to further substring calls, or discarded
completely without further processing. In these cases, it is more efficient
to use a shared substring, a type of object that functions in most ways
like a regular substring but with one important difference: it shares storage
with the parent string. Thus, creation of a shared substring can avoid the
copying step.
make-shared-substring returns a shared substring of str. The semantics are the same as for the
substring function: the shared substring returned includes all of the text from str between indexes frm (inclusive) and to (exclusive). If to is omitted, it defaults to the end of str. Both frm and to may be a negative number n, in which case they refer to the index −n counted from the end of the string.
The shared substring returned by make-shared-substring occupies the same
storage space as str.
Note that shared substrings are read-only; attempts to write to them with
string-set! result in an error. However, modifying the parent string
(assuming the parent is not itself a shared substring) effectively modifies
the shared substring.
(define parent "ABCDEFG")
(define sub (make-shared-substring parent 0 4))
parent ⇒ "ABCDEFG"
sub ⇒ "ABCD"
(string-set! sub 0 #\/)
error-->
ERROR: In procedure string-set!:
ERROR: argument is a read-only string
ABORT: (misc-error)
(string-set! parent 0 #\/)
parent ⇒ "/BCDEFG"
sub ⇒ "/BCD"
Guile provides a convenient set of procedures that take advantage of the shared substring capability. These are available by evaluating:
(use-modules (ice-9 string-fun))
This module provides the following procedures:
(split-after-char char str ret)
(split-before-char char str ret)
(split-discarding-char char str ret)
(split-after-char-last char str ret)
(split-before-char-last char str ret)
(split-discarding-char-last char str ret)
(split-before-predicate pred str ret)
(split-after-predicate pred str ret)
(split-discarding-predicate pred str ret)
(separate-fields-discarding-char ch str ret)
(separate-fields-after-char ch str ret)
(separate-fields-before-char ch str ret)
((string-prefix-predicate pred?) prefix str)
(string-prefix=? prefix str)
(sans-surrounding-whitespace s)
(sans-trailing-whitespace s)
(sans-leading-whitespace s)
(sans-final-newline str)
(has-trailing-newline? str)
String Fun: Dividing Strings Into Fields
The names of these functions are very regular.
Here is a grammar of a call to one of these:
<string-function-invocation>
  := (<action>-<separator-disposition>-<separator-determination>
      <separator-param> <str> <ret>)

<str> = The string.

<ret> = The continuation. String functions generally return
        multiple values by passing them to this procedure.

<action> = split
         | separate-fields

    "split" means to divide a string into two parts.
    <ret> will be called with two arguments.

    "separate-fields" means to divide a string into as many parts as
    possible. <ret> will be called with however many fields are found.

<separator-disposition> = before
                        | after
                        | discarding

    "before" means to leave the separator attached to
    the beginning of the field to its right.
    "after" means to leave the separator attached to
    the end of the field to its left.
    "discarding" means to discard separators.

    Other dispositions might be handy. For example, "isolate"
    could mean to treat the separator as a field unto itself.

<separator-determination> = char
                          | predicate

    "char" means to use a particular character as field separator.
    "predicate" means to check each character using a particular predicate.

    Other determinations might be handy. For example, "character-set-member".

<separator-param> = A parameter that completes the meaning of the
                    determinations. For example, if the determination
                    is "char", then this parameter says which character.
                    If it is "predicate", the parameter is the predicate.
For example:
(separate-fields-discarding-char #\, "foo, bar, baz, , bat" list)
=> ("foo" " bar" " baz" " " " bat")
(split-after-char #\- "an-example-of-split" list)
=> ("an-" "example-of-split")
As an alternative to using a determination "predicate", or to trying to
do anything complicated with these functions, consider using regular
expressions.
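The examples above all pass list as the continuation. The following is a minimal sketch of passing a custom continuation instead, and of the "predicate" determination. The helper names parse-assignment and count-fields are hypothetical and exist only for illustration. The results in the comments follow from the grammar above, assuming a Guile in which (ice-9 string-fun) behaves as described there.

```scheme
(use-modules (ice-9 string-fun))

;; "split" calls the continuation with exactly two arguments.
;; Here the continuation builds a (name . value) pair.
(define (parse-assignment line)          ; hypothetical helper
  (split-discarding-char #\= line
    (lambda (name value) (cons name value))))

(parse-assignment "key=value")           ; => ("key" . "value")

;; "separate-fields" calls the continuation with one argument per
;; field, so a rest-argument lambda receives all of them.
(define (count-fields line)              ; hypothetical helper
  (separate-fields-discarding-char #\: line
    (lambda fields (length fields))))

(count-fields "root:x:0:0")              ; => 4

;; The "predicate" determination tests each character in turn;
;; "before" keeps the matching character with the right-hand part.
(split-before-predicate char-numeric? "abc123" list)
                                         ; => ("abc" "123")
```

Because the continuation receives the parts as separate arguments, no intermediate list needs to be built when the caller only wants, say, the second half of a split.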
String Fun: String Prefix Predicates
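The definitions are short:
(define-public ((string-prefix-predicate pred?) prefix str)
  ;; True if str is at least as long as prefix and pred? accepts
  ;; prefix together with the leading part of str of the same length.
  (and (<= (string-length prefix) (string-length str))
       (pred? prefix (make-shared-substring str 0 (string-length prefix)))))
;; The common case: exact, case-sensitive prefix comparison.
(define-public string-prefix=? (string-prefix-predicate string=?))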
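As a usage sketch, string-prefix-predicate can build prefix tests from other comparison procedures in the same way that string-prefix=? is built from string=?. The name string-prefix-ci=? below is hypothetical and is not stated to be part of the module. The results in the comments follow from the definition shown above.

```scheme
(use-modules (ice-9 string-fun))

;; Hypothetical case-insensitive variant, built the same way
;; as string-prefix=? but from string-ci=?.
(define string-prefix-ci=? (string-prefix-predicate string-ci=?))

(string-prefix=? "foo" "foobar")      ; => #t
(string-prefix=? "FOO" "foobar")      ; => #f  (case differs)
(string-prefix-ci=? "FOO" "foobar")   ; => #t
(string-prefix=? "foobar" "foo")      ; => #f  (prefix longer than str)
```

Note the argument order: the prefix comes first and the string being tested comes second.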
String Fun: Strippers
<stripper> = sans-<removable-part>
<removable-part> = surrounding-whitespace
| trailing-whitespace
| leading-whitespace
| final-newline
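A brief sketch of the strippers in use follows. The text does not spell out their exact behavior, so the results in the comments are what the procedure names suggest and should be read as an assumption: whitespace is taken to mean characters satisfying char-whitespace?, and sans-final-newline is taken to remove at most one trailing newline.

```scheme
(use-modules (ice-9 string-fun))

(define padded "  hello world \t")

(sans-leading-whitespace padded)       ; => "hello world \t"
(sans-trailing-whitespace padded)      ; => "  hello world"
(sans-surrounding-whitespace padded)   ; => "hello world"

(sans-final-newline "one line\n")      ; => "one line"
(has-trailing-newline? "one line\n")   ; => #t
(has-trailing-newline? "one line")     ; => #f
```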