Given a list,
(define ll '(a a a b c c c d e e e e))
I want to remove all non-duplicate elements and leave only one copy of the duplicate one, i.e. after removing, the result would be
(a c e)
My algorithm is:
The problem I'm having is
(define (remove-nondup lst)
(if (>= (length lst) 2)
(if (eq? (car lst) (cadr lst))
(cons (car lst) (remove-nondup (cdr lst)))
(remove-nondup (cddr lst)))
lst))
The problem that I'm having is if there are more than 3 consecutive elements, I have no way to keep track of the previous-previous one. So I wonder should I use another procedure to remove all duplicates? or I can just put them into one procedure?
So my alternative current solution was,
(define (remove-dup lst)
(if (>= (length lst) 2)
(if (eq? (car lst) (cadr lst))
(cons (car lst) (remove-dup (cddr lst)))
(cons (car lst) (remove-dup (cdr lst))))
lst))
(define (remove-nondup-helper lst)
(if (>= (length lst) 2)
(if (eq? (car lst) (cadr lst))
(cons (car lst) (remove-nondup-helper (cdr lst)))
(remove-nondup (cddr lst)))
lst))
; call the helper function and remove-dup
(define (remove-nondup lst)
(remove-dup (remove-nondup-helper lst)))
Here's my solution: first, grab bagify
(any version will do). Then:
(define (remove-singletons lst)
(define (singleton? ass)
(< (cdr ass) 2))
(map car (remove singleton? (bagify lst))))
remove
is from SRFI 1. If you're using Racket, run (require srfi/1)
first. Or, use this simple definition:
(define remove #f) ; Only needed in Racket's REPL
(define (remove pred lst)
(cond ((null? lst) lst)
((pred (car lst)) (remove pred (cdr lst)))
(else (cons (car lst) (remove pred (cdr lst))))))
Here's a way that uses only standard library functions and only tail calls, though it performs linear searches to see if an item has already been seen or put in the result:
(define remove-nondup
(λ (ls)
(reverse
(let loop ([ls ls] [found '()] [acc '()])
(cond
[(null? ls)
acc]
[(memq (car ls) found)
(loop (cdr ls)
found
(if (memq (car ls) acc)
acc
(cons (car ls) acc)))]
[else
(loop (cdr ls)
(cons (car ls) found)
acc)])))))
(remove-nondup '(a a a b c c c d e e e e)) =>
(a c e)
(remove-nondup '(a a a b c c c d e e e e f a a f)) =>
(a c e f)
The loop
is a "named let": a handy way to stick a helper procedure inside a procedure without a lot of syntactic clutter.
If you only want to shrink consecutive duplicates down to one item, and remove items only when they don't occur twice consecutively, then here's a way to "remember" the item two cells ago without searching for it, and using only tail calls:
(define remove-nonconsecdup
(λ (ls)
(reverse
(letrec (
[got1 (λ (ls prev acc)
(cond
[(null? ls)
acc]
[(eq? prev (car ls))
(got2 (cdr ls) (cons prev acc))]
[else
(got1 (cdr ls) (car ls) acc)]))]
[got2 (λ (ls acc)
(cond
[(null? ls)
acc]
[(eq? (car acc) (car ls))
(got2 (cdr ls) acc)]
[else
(got1 (cdr ls) (car ls) acc)]))])
(if (null? ls)
'()
(got1 (cdr ls) (car ls) '()))))))
(remove-nonconsecdup '(a a a b c c c d e e e e)) =>
(a c e)
(remove-nonconsecdup '(a a a b c c c d e e e e f a a f)) =>
(a c e a)
I don't like reversing lists, but calling reverse
is easy. If the extra cons'ing done by reverse
is a problem, you could do non-tail calls or stick the items at the end of the list, but that's harder to do efficiently (but easy with a non-standard library macro).