I want to delete some characters at the end of a string.
I made this function :
(defun del-delimiter-at-end (string)
(cond
((eq (delimiterp (char string (- (length string) 1))) nil)
string )
(t
(del-delimiterp-at-end (subseq string 0 (- (length string) 1))) ) ) )
with this :
(defun delimiterp (c) (position c " ,.;!?/"))
But I don't understand why it doesn't work. I have the following error :
Index must be positive and not -1
Note that I want to split a string in list of strings, I already looked here :
Lisp - Splitting Input into Separate Strings
but it doesn't work if the end of the string is a delimiter, that's why I'm trying to do that.
What am I doing wrong?
Thanks in advance.
The Easy Way
Just use string-right-trim
:
(string-right-trim " ,.;!?/" s)
Your Error
If you pass an empty string to you del-delimiter-at-end
, you will be passing -1
as the second argument to char
.
Your Code
There is no reason to do (eq (delimiterp ...) nil)
; just use (delimiterp ...)
instead (and switch the clauses!)
It is mode idiomatic to use if
and not cond
when you have just two clauses and each has just one form.
You call subseq
recursively, which means that you not only allocate memory for no reason, your algorithm is also quadratic in string length.
There are really two questions here. One is more specific, and is described in the body of the question. The other is more general, and is what the title asks about (how to split a sequence). I'll handle the immediate question that's in the body, of how to trim some elements from the end of a sequence. Then I'll handle the more general question of how to split a sequence in general, and how to split a list in the special case, since people who find this question based on its title may be interested in that.
Right-trimming a sequence
sds answered this perfectly if you're only concerned with strings. The language already includes string-right-trim
, so that's probably the best way to solve this problem, if you're only concerned with strings.
A solution for sequences
That said, if you want a subseq
based approach that works with arbitrary sequences, it makes sense to use the other sequence manipulation functions that the language provides. Many functions take a :from-end
argument and have -if-not
variants that can help. In this case, you can use position-if-not
to find the rightmost non-delimiter in your sequence, and then use subseq
:
(defun delimiterp (c)
(position c " ,.;!?/"))
(defun right-trim-if (sequence test)
(let ((pos (position-if-not test sequence :from-end t)))
(subseq sequence 0 (if (null pos) 0 (1+ pos)))))
(right-trim-if "hello!" 'delimiterp) ; some delimiters to trim
;=> "hello"
(right-trim-if "hi_there" 'delimiterp) ; nothing to trim, with other stuff
;=> "hi_there"
(right-trim-if "?" 'delimiterp) ; only delimiters
;=> ""
(right-trim-if "" 'delimiterp) ; nothing at all
;=> ""
Using complement
and position
Some people may point out that position-if-not
is deprecated. If you don't want to use it, you can use complement
and position-if
to achieve the same effect. (I haven't noticed an actual aversion to the -if-not
functions though.) The HyperSpec entry on complement
says:
In Common Lisp, functions with names like xxx-if-not
are related
to functions with names like xxx-if
in that
(xxx-if-not f . arguments) == (xxx-if (complement f) . arguments)
For example,
(find-if-not #'zerop '(0 0 3)) ==
(find-if (complement #'zerop) '(0 0 3)) => 3
Note that since the xxx-if-not
functions and the :test-not
arguments have been deprecated, uses of xxx-if
functions or :test
arguments with complement are preferred.
That said, position
and position-if-not
take function designators, which means that you can pass the symbol delimiterp
to them, as we did in
(right-trim-if "hello!" 'delimiterp) ; some delimiters to trim
;=> "hello"
complement
, though, doesn't want a function designator (i.e., a symbol or function), it actually wants a function object. So you can define right-trim-if
as
(defun right-trim-if (sequence test)
(let ((pos (position-if (complement test) sequence :from-end t)))
(subseq sequence 0 (if (null pos) 0 (1+ pos)))))
but you'll have to call it with the function object, not the symbol:
(right-trim-if "hello!" #'delimiterp)
;=> "hello"
(right-trim-if "hello!" 'delimiterp)
; Error
Splitting a sequence
If you're not just trying to right-trim the sequence, then you can implement a split function without too much trouble. The idea is to increment a "start" pointer into the sequence. It first points to the beginning of the sequence. Then you find the first delimiter and grab the subsequence between them. Then find the the next non-delimiter after that, and treat that as the new start point.
(defun split (sequence test)
(do ((start 0)
(results '()))
((null start) (nreverse results))
(let ((p (position-if test sequence :start start)))
(push (subseq sequence start p) results)
(setf start (if (null p)
nil
(position-if-not test sequence :start p))))))
This works on multiple kinds of sequences, and you don't end up with non delimiters in your subsequences:
CL-USER> (split '(1 2 4 5 7) 'evenp)
((1) (5 7))
CL-USER> (split '(1 2 4 5 7) 'oddp)
(NIL (2 4))
CL-USER> (split "abc123def456" 'alpha-char-p)
("" "123" "456")
CL-USER> (split #(1 2 3 foo 4 5 6 let 7 8 list) 'symbolp)
(#(1 2 3) #(4 5 6) #(7 8))
Although this works for sequences of all types, it's not very efficient for lists, since subseq
, position
, etc., all have to traverse the list up to the start
position. For lists, it's better to use a list specific implementation:
(defun split-list (list test)
(do ((results '()))
((endp list)
(nreverse results))
(let* ((tail (member-if test list))
(head (ldiff list tail)))
(push head results)
(setf list (member-if-not test tail)))))
CL-USER> (split-list '(1 2 4 5 7) 'oddp)
(NIL (2 4))
CL-USER> (split-list '(1 2 4 5 7) 'evenp)
((1) (5 7))
Instead of member-if
and ldiff
, you could also us cut
from this answer to Idiomatic way to group a sorted list of integers?.