Next: , Previous: Base64, Up: Top


52 A String Supporting Efficient Insert/Delete

To load support for efficient insert/delete operations on strings:

     (use-modules (ice-9 gap-buffer))
A gap buffer is a structure that models a string but allows relatively
efficient insertion of text somewhere in the middle.  The insertion
location is called `point' with minimum value 1, and a maximum value the
length of the string (which is not fixed).

Specifically, we allocate a continuous buffer of characters that is
composed of the BEFORE, the GAP and the AFTER (reading L->R), like so:

                         +--- POINT
                         v
   +--------------------+--------------------+--------------------+
   |       BEFORE       |        GAP         |       AFTER        |
   +--------------------+--------------------+--------------------+

    <----- bef-sz ----->|<----- gap-sz ----->|<----- aft-sz ----->

    <-------------------|       usr-sz       |------------------->

    <-------------------------- all-sz -------------------------->

This diagram also shows how the different sizes are computed, and the
location of POINT.  Note that the user-visible buffer size `usr-sz' does
NOT include the GAP, while the allocation `all-sz' DOES.

The consequence of this arrangement is that "moving point" is simply a
matter of kicking characters across the GAP, while insertion can be viewed
as filling up the gap, increasing `bef-sz' and decreasing `gap-sz'.  When
`gap-sz' falls below some threshold, we reallocate with a larger `all-sz'.

In the implementation, we actually keep track of the AFTER start offset
`aft-ofs' since it is used more often than `gap-sz'.  In fact, most of the
variables in the diagram are for conceptualization only.

A gap buffer port is a "soft port" that wraps a gap buffer.

(The term and concept of "gap buffer" are borrowed from Emacs.  We will
gladly return them when libemacs.so is available. ;-)

A gap-buffer object has the following printed representation:

#<gap-buffer GAP-SIZE/ALLOC-SIZE:POINT-MIN:POINT:POINT-MAX>

with all fields (GAP-SIZE, ALLOC-SIZE, MINT-MIN, POINT, and POINT-MAX)
integers, and everything else as shown here.
— Scheme Procedure: gb? object

Return #t iff object is a gap buffer object.

— Scheme Procedure: make-gap-buffer [init]

Return a new gap buffer. Optional arg init is either a port to read from; a string, used to initialize the buffer contents; or an integer specifying the memory allocation (in bytes) requested. Point is left at the maximum position.

— Scheme Procedure: gb-toggle-read-only gb [arg]

Change whether gb is read-only. With arg, set read-only iff arg is positive.

Implementation note: significant performance gains for non-munging operations are possible when the gap-buffer is in read-only mode. This is primarily because in read-only mode, the gap can be reduced to zero, resulting in less wasteful data motion.

52.1 querying

— Scheme Procedure: gb-point gb

Return the position of point in gb. This is an integer starting with 1 (one).

— Scheme Procedure: gb-point-min gb

Return the minimum position possible for point in gb. At this time, this value is always 1 (one).

— Scheme Procedure: gb-point-max gb

Return the maximum position possible for point in gb. This value can be changed by inserting text into the buffer, and is limited by Guile's string implementation.

— Scheme Procedure: gb-char-after gb [pos...]

Return char after pos, or #f if there is no char there. If pos is not specified, it defaults to point.

— Scheme Procedure: gb-char-before gb [pos...]

Return char before pos, or #f if there is no char there. If pos is not specified, it defaults to point.

— Scheme Procedure: gb-bolp gb

Return #t if point in gb is at the beginning of a line.

— Scheme Procedure: gb-eolp gb

Return #t if point in gb is at the end of a line.

— Scheme Procedure: gb-bobp gb

Return #t if point is at the beginning of gb.

— Scheme Procedure: gb-eobp gb

Return #t if point is at the end of gb.

52.2 munging

— Scheme Procedure: gb-insert-string! gb string

Insert into gb a string, moving point forward as well as increasing the value that would be returned by gb-point-max.

— Scheme Procedure: gb-insert-char! gb char

Insert into gb a single char, moving point forward as well as increasing the value that would be returned by gb-point-max.

— Scheme Procedure: gb-insert gb [args...]

Insert the arguments at point. If an arg is a gap-buffer, insert its contents. If an arg is a pair, insert a string made by applying write to it. If an arg is a number, insert the result of number->string. Other types accepted: char, string, symbol. Point moves forward to end up after the inserted text.

— Scheme Procedure: gb-delete-char! gb count

In gb, delete count characters from point, forward if count is positive, backward if count is negative. (If count is zero, do nothing.) Deleting backwards moves point backwards. Deleting forwards or backwards decreases the value that would be returned by gb-point-max.

— Scheme Procedure: gb-delete-region gb beg end

Delete text between beg and end.

— Scheme Procedure: gb-erase! gb

Completely erase gb. Point is left at the minimum position possible (which happens to be also the maximum position possible since the buffer is empty).

52.3 moving

— Scheme Procedure: gb-goto-char gb new-point

In gb, move point to new-point and return it. If new-point is outside the minimum and maximum positions possible, it is adjusted to the the nearest boundary (however, the return value is new-point unchanged).

— Scheme Procedure: gb-forward-char gb n

In gap-buffer gb, move point forward n characters.

— Scheme Procedure: gb-backward-char gb n

In gap-buffer gb, move point backward n characters.

— Scheme Procedure: gb-forward-line gb [n...]

In gap-buffer gb, move point n lines forward (backward if n is negative). Precisely, if point is on line I, move to the start of line I + N. If there isn't room, go as far as possible (no error). Return the count of lines left to move. If moving forward, that is n - number of lines moved; if backward, n + number moved. With positive n, a non-empty line at the end counts as one line successfully moved (for the return value).

— Scheme Procedure: gb-beginning-of-line gb [n...]

In gap-buffer gb, move point to beginning of current line. With argument n not #f or 1, move forward n - 1 lines first. If point reaches the beginning or end of buffer, it stops there.

— Scheme Procedure: gb-end-of-line gb [n...]

In gap-buffer gb, move point to end of current line. With argument n not #f or 1, move forward n - 1 lines first. If point reaches the beginning or end of buffer, it stops there.

52.4 search (and replace)

— Scheme Procedure: gb-match-string gb n

Return string of text matched by last search. n specifies which parenthesized expression in the last regexp. Value is #f if nth pair didn't match, or there were less than n pairs. Zero means the entire text matched by the whole regexp or whole string.

— Scheme Procedure: gb-looking-at gb re-str

Return #t if text after point matches regular expression re-str. This function modifies the match data that gb-match-beginning, gb-match-end and gb-match-data access; save and restore the match data if you want to preserve them.

— Scheme Procedure: gb-match-beginning [n]

Return position of start of text matched by last search. subexp, a number, specifies which parenthesized expression in the last regexp. Value is #f if subexpth pair didn't match, or there were less than subexp pairs. Zero means the entire text matched by the whole regexp.

— Scheme Procedure: gb-match-end [n]

Return position of end of text matched by last search. subexp, a number, specifies which parenthesized expression in the last regexp. Value is nil if subexpth pair didn't match, or there were less than subexp pairs. Zero means the entire text matched by the whole regexp.

— Scheme Procedure: gb-search-forward string [bound [noerror [count]]]

Search forward from point for string. Set point to the end of the occurrence found, and return point. An optional second argument bounds the search; it is a buffer position. The match found must not extend after that position. #f is equivalent to (point-max). Optional third argument, if #t, means if fail just return #f (no error). If not #f and not #t, move to limit of search and return #f. Optional fourth argument is repeat count–search for successive occurrences.

— Scheme Procedure: gb-search-backward string [bound [noerror [repeat]]]

Search backward from point for string. Set point to the beginning of the occurrence found, and return point. An optional second argument bounds the search; it is a buffer position. The match found must not extend before that position. Optional third argument, if t, means if fail just return nil (no error). If not nil and not t, position at limit of search and return nil. Optional fourth argument is repeat count–search for successive occurrences.

— Scheme Procedure: gb-re-search-forward regexp [bound [noerror [repeat]]]

Search forward from point for regular expression regexp. Set point to the end of the occurrence found, and return point. An optional second argument bounds the search; it is a buffer position. The match found must not extend after that position. Optional third argument, if #t, means if fail just return #f (no error). If not #f and not #t, move to limit of search and return #f. Optional fourth argument is repeat count–search for successive occurrences.

regexp may be a string, or compiled regular expression made with make-regexp, in which case, it is the caller's decision whether or not to include the flag regexp/newline (normally used when regexp is a string to compile it internally).

— Scheme Procedure: gb-replace-match newtext [IGNORED [literal]]

Replace text matched by last search with newtext. The second arg is optional and ignored (for now – in the future it may specify case handling a la Emacs).

If third arg literal is non-#f, insert newtext literally. Otherwise treat \ as special:

            `\&' in NEWTEXT means substitute original matched text.
            `\N' means substitute what matched the Nth `(...)'.
                 If Nth parens didn't match, substitute nothing.
            `\\' means insert one `\'.

Leave point at end of replacement text.

52.5 misc

— Scheme Procedure: gb->port! gb port [beg [end]]

Send the contents of gb to the output port. Optional args beg and end specify a region to send. Point does not move.

— Scheme Procedure: gb->string gb

Return a new string representing the text of gb. Point does not move.

— Scheme Procedure: gb->substring gb start end

Return the region of gb from start to end as a string.

— Scheme Procedure: gb-filter! gb string-proc

Pass the string representing the text of gb to string-proc and use its return value to completely replace the contents of gb. Point is left at the maximum position.

— Scheme Procedure: gb->lines gb

Return a list of strings representing the lines of text of gb. Newlines are automatically removed. A buffer with N newlines results in a list of length N+1. Point does not move.

— Scheme Procedure: gb-filter-lines! gb lines-proc

Pass the list of strings representing the lines of text of gb to lines-proc and use its return value (another list of strings) to completely replace the contents of gb. Newlines are automatically removed and added back. Point is left at the maximum position.

— Scheme Procedure: make-gap-buffer-port gb

Return a "soft port" on gb that supports the write-character, write-string and read-character operations (flush-output and close-port are not supported). All operations move point forward. Additionally, writing operations increase the value that would be returned by gb-point-max.