How to remove duplicate of a list? (running time is O(n log n) )
ex: '(4 6 1 1 2 3 3 5 6) => '(4 6 1 2 3 5)
(define (re-dup lst)
(cond
((empty? lst) empty)
(else
(define el (first lst))
(define el-free-lst (filter (lambda (x) (not (= el x))) (rest lst)))
(cons el (re-dup el-free-lst)))))
Is this right?
Your current solution is O(n^2)
, because filter
traverses the list once for each of the elements in the original list. It's possible to write an O(n)
solution using a helper data structure with constant time insertion and membership operations to keep track of the elements that have been found already.
In Racket, we have an out-of-the-box set
data structure which, for constant time operations, "actually requires O(log N) time for a set of size N" (see the documentation), therefore the set-member?
and set-add
procedures will be O(log n)
. So this implementation using Racket's set
is not optimal, but we achieve the O(n log n)
target:
(define (re-dup lst)
(let loop ((seen (set))
(lst lst)
(acc '()))
(cond ((null? lst)
(reverse acc))
((set-member? seen (car lst))
(loop seen (cdr lst) acc))
(else
(loop (set-add seen (car lst))
(cdr lst)
(cons (car lst) acc))))))
It works as expected, preserving the original order in the list (which is a constraint for this problem, as stated in the comments) at the cost of one additional O(n)
reverse
operation:
(re-dup '(4 6 1 1 2 3 3 5 6))
=> '(4 6 1 2 3 5)
You can get O(n log n)
behavior with the following:
- Merge sort (which is
O(n log n)
)
- Traverse and discard duplicates (which is
O(n)
Thus overall O(n log n)
.
(define (re-dup lst)
(if (null? lst)
lst
(let ((lst (list-sort < lst)))
(let thinning ((skip (car lst)) (rest (cdr lst)))
(cond ((null? rest) (list skip))
((equal? skip (car rest)) (thinning skip (cdr rest)))
(else (cons skip (thinning (car rest) (cdr rest)))))))))