I have a dead simple Common Lisp question: what is the idiomatic way of removing duplicates from a list of strings?
remove-duplicates works as I'd expect for numbers, but not for strings:
* (remove-duplicates '(1 2 2 3))
(1 2 3)
* (remove-duplicates '("one" "two" "two" "three"))
("one" "two" "two" "three")
I'm guessing there's some sense in which the strings aren't equal, most likely because although "foo" and "foo" are apparently identical, they're actually pointers to different structures in memory. I think my expectation here may just be a C hangover.
You have to tell remove-duplicates how it should compare the values. By default, it uses eql, which is not sufficient for strings. Pass a :test function, as in:
(remove-duplicates your-sequence :test #'equal)
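To make that concrete, here is a minimal sketch applied to the list from the question. The variable names (*words* and friends) are made up for illustration; only remove-duplicates and its standard :test and :from-end keywords come from the language.

```lisp
;; Hypothetical sample data, matching the list in the question.
(defparameter *words* (list "one" "two" "two" "three"))

;; Compare by contents instead of by object identity.
(defparameter *deduped* (remove-duplicates *words* :test #'equal))

;; STRING= works just as well when every element is a string.
(defparameter *deduped-string=* (remove-duplicates *words* :test #'string=))

;; By default the LATER duplicate is kept; :FROM-END T keeps the first one.
(defparameter *keep-last*
  (remove-duplicates (list "a" "b" "a") :test #'equal))
(defparameter *keep-first*
  (remove-duplicates (list "a" "b" "a") :test #'equal :from-end t))

(print *deduped*)     ; ("one" "two" "three")
(print *keep-last*)   ; ("b" "a")
(print *keep-first*)  ; ("a" "b")
```

The :from-end distinction only matters if you care about the order of the survivors; the set of remaining strings is the same either way.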
(Edit to address the question from the comments): As an alternative to equal, you could use string= in this example. This predicate is (in a way) less generic than equal, and it might (could, probably, possibly, eventually...) thus be faster. A real benefit might be that string= can tell you if you pass a wrong value:
(equal 1 "foo")
happily yields nil, whereas
(string= 1 "foo")
gives a type-error condition. Note, though, that
(string= "FOO" :FOO)
is perfectly well defined (string= and its friends are defined in terms of "string designators", not strings), so type safety only goes so far here.
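The difference between the two predicates can be tried out with a small sketch like the one below. try-string= is a hypothetical helper, not a standard function; it only exists to turn a signalled error into a keyword so the example runs without dropping into the debugger. Exactly which condition is signalled for a non-designator argument may depend on your implementation, so the helper also has a catch-all clause.

```lisp
;; Hypothetical helper: call STRING= and report how the call went.
(defun try-string= (a b)
  (handler-case (if (string= a b) :equal :different)
    (type-error () :type-error)    ; what the answer above describes
    (error () :other-error)))      ; fallback, in case an implementation differs

(print (equal 1 "foo"))            ; NIL, no complaint at all
(print (try-string= "foo" "foo"))  ; :EQUAL
(print (try-string= "foo" "bar"))  ; :DIFFERENT
(print (try-string= "FOO" :foo))   ; :EQUAL, the symbol :FOO designates "FOO"
(print (try-string= 1 "foo"))      ; an error keyword, since 1 is not a string designator
```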
The standard eql predicate, on the other hand, is almost never the right way to compare strings. If you are familiar with the Java language, think of eql as using ==, while equal (or string=, etc.) calls the equals(Object) method. Though eql does some type introspection (as opposed to eq, which does not), for most Lisp types other than numbers and characters, eql boils down to something like a pointer comparison. That is not sufficient if you want to discriminate values based on what they actually contain, and not merely on where in memory they are located.
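A short sketch of that identity-versus-contents distinction follows. The variable names are hypothetical. copy-seq is used on purpose: it is guaranteed to return a fresh string, whereas two identical string literals in the same file may or may not be merged into one object by the compiler.

```lisp
;; Two distinct string objects with the same contents.
(defparameter *a* (copy-seq "foo"))
(defparameter *b* (copy-seq "foo"))

(print (eql *a* *a*))      ; T   the very same object
(print (eql *a* *b*))      ; NIL different objects, even though the contents match
(print (equal *a* *b*))    ; T   EQUAL looks inside the strings
(print (string= *a* *b*))  ; T

;; The "type introspection" in EQL: numbers and characters compare by value
;; and type rather than by identity.
(print (eql 1 1))          ; T
(print (eql 1 1.0))        ; NIL same numeric value, different type
(print (= 1 1.0))          ; T   numeric comparison ignores the type
(print (eql #\a #\a))      ; T
```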
For the more Pythonically inclined, eq (and eql for non-numeric types) is more like the is operator, whereas equal is more like ==, which calls __eq__.