I was asked to write my own implementation to remove duplicate values from an array. Here is what I have created, but after testing it with 1,000,000 elements it took a very long time to finish. Is there something I can do to improve my algorithm, or are there any bugs to remove?
I need to write my own implementation - not use Set, HashSet, etc., or any other tools such as iterators. Simply an array with its duplicates removed.
public static int[] removeDuplicates(int[] arr) {
    int end = arr.length;
    for (int i = 0; i < end; i++) {
        for (int j = i + 1; j < end; j++) {
            if (arr[i] == arr[j]) {
                // Found a duplicate: shift everything after it one position to the left
                int shiftLeft = j;
                for (int k = j + 1; k < end; k++, shiftLeft++) {
                    arr[shiftLeft] = arr[k];
                }
                end--;
                j--;
            }
        }
    }

    // Copy the de-duplicated prefix into a right-sized array
    int[] whitelist = new int[end];
    for (int i = 0; i < end; i++) {
        whitelist[i] = arr[i];
    }
    return whitelist;
}
Why does nobody check these lines from the question?
I need to write my own implementation - not use Set, HashSet, etc., or any other tools such as iterators. Simply an array with its duplicates removed.
I'm posting a very simple implementation that keeps this requirement in mind.
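The code for this answer isn't reproduced above; a minimal sketch of an order-preserving removal that sticks to plain arrays (the class name and the test input are assumptions for illustration) might look like this:

public class SimpleDedup {

    // Order-preserving duplicate removal using only arrays
    public static int[] removeDuplicates(int[] arr) {
        int[] temp = new int[arr.length];
        int size = 0;
        for (int i = 0; i < arr.length; i++) {
            boolean seen = false;
            for (int j = 0; j < size; j++) {
                if (temp[j] == arr[i]) {
                    seen = true;
                    break;
                }
            }
            if (!seen) {
                temp[size++] = arr[i];
            }
        }
        // Copy the unique values into a right-sized result array
        int[] result = new int[size];
        for (int i = 0; i < size; i++) {
            result[i] = temp[i];
        }
        return result;
    }

    public static void main(String[] args) {
        // Input assumed for illustration only
        int[] input = {5, 6, 6, 8, 8, 0, 1, 2, 2, 9, 11};
        for (int value : removeDuplicates(input)) {
            System.out.print(value + " ");
        }
        System.out.println();
    }
}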
Output: 5 6 8 0 1 2 9 11
0 1 2 5 6 8 9 11
0 1 2 5 6 8 9 11
I have just written the above code to try it out. Thanks.
You need to sort your array and then loop over it to remove duplicates. As you cannot use other tools, you need to write the code yourself.
You can easily find examples of quicksort in Java on the internet (on which this example is based).
So the process runs in three steps:
1. Sort the array - O(n log n)
2. Loop over the sorted array, skipping repeated values - O(n)
3. Copy the unique values into a right-sized result array - O(n)
This improves significantly on your O(n^3) approach, as in the sketch below.
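The original quicksort-based code from this answer isn't reproduced above; a minimal sketch of the sort-then-compact idea (class and method names are assumptions) could look like this:

public class SortedDedup {

    public static int[] removeDuplicates(int[] arr) {
        // Step 1: O(n log n) sort
        quickSort(arr, 0, arr.length - 1);

        // Step 2: O(n) pass skipping repeated (now adjacent) values
        int[] temp = new int[arr.length];
        int size = 0;
        for (int i = 0; i < arr.length; i++) {
            if (size == 0 || temp[size - 1] != arr[i]) {
                temp[size++] = arr[i];
            }
        }

        // Step 3: O(n) copy into a right-sized result array
        int[] result = new int[size];
        for (int i = 0; i < size; i++) {
            result[i] = temp[i];
        }
        return result;
    }

    // Hand-written quicksort, since library helpers are off-limits in spirit
    private static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) {
            return;
        }
        int pivot = a[lo + (hi - lo) / 2];
        int i = lo;
        int j = hi;
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int t = a[i];
                a[i] = a[j];
                a[j] = t;
                i++;
                j--;
            }
        }
        quickSort(a, lo, j);
        quickSort(a, i, hi);
    }
}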
EDIT
The OP states that the values inside the array don't really matter, but I can assume the range is between 0 and 1000. This is a classic case where an O(n) counting sort can be used.
We create an array of size range + 1, in this case 1001. We then loop over the data and increment the count at the index corresponding to each data point. We can then compact the resulting array, dropping indices that have not been incremented. This makes the values unique, as we ignore the count.
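The code for this edit isn't reproduced above either; a minimal sketch of the counting approach, assuming all values fall in the range 0-1000, might look like this:

public class CountingDedup {

    private static final int RANGE = 1000;   // assumed maximum value, per the answer

    public static int[] removeDuplicates(int[] arr) {
        // One slot per possible value
        int[] counts = new int[RANGE + 1];
        for (int value : arr) {
            counts[value]++;                  // O(n) counting pass
        }

        // Count how many distinct values occurred
        int distinct = 0;
        for (int count : counts) {
            if (count > 0) {
                distinct++;
            }
        }

        // Compact: keep each value that occurred at least once; the count itself is ignored
        int[] result = new int[distinct];
        int idx = 0;
        for (int value = 0; value <= RANGE; value++) {
            if (counts[value] > 0) {
                result[idx++] = value;
            }
        }
        return result;
    }
}

Note that the result naturally comes out sorted, since we walk the count array in index order.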
Please check this. It will work for both sorted and unsorted arrays. The complexity is O(n^2), the same as bubble sort. Yes, the complexity can be improved further by sorting first and then using binary search, but this is simple enough to work in every case except when the array contains the negative element -1, which is used as a marker. That can also be changed by using a large integer value instead of -1.
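The code for this answer isn't included above; a sketch of the described approach, using -1 as a marker for duplicates and then compacting (names are assumptions for illustration), might look like this:

public class MarkerDedup {

    private static final int MARKER = -1;   // sentinel value; assumes -1 never appears in the input

    public static int[] removeDuplicates(int[] arr) {
        // O(n^2): mark every later occurrence of a value with the sentinel
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] == MARKER) {
                continue;
            }
            for (int j = i + 1; j < arr.length; j++) {
                if (arr[j] == arr[i]) {
                    arr[j] = MARKER;
                }
            }
        }

        // Count the surviving elements
        int size = 0;
        for (int value : arr) {
            if (value != MARKER) {
                size++;
            }
        }

        // Compact into a right-sized result, preserving the original order
        int[] result = new int[size];
        int idx = 0;
        for (int value : arr) {
            if (value != MARKER) {
                result[idx++] = value;
            }
        }
        return result;
    }
}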
I feel Android Killer's idea is great, but I just wondered whether we could leverage a HashMap as well. So I did a little experiment, and I found that a HashMap seems faster than a HashSet.
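The original benchmark code and timings aren't reproduced above; a rough sketch of how such a HashMap vs. HashSet comparison could be set up (array size and value range are assumptions) might look like this:

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;

public class DedupBenchmark {

    // De-duplicate by using the values as HashMap keys
    static int[] withHashMap(int[] arr) {
        HashMap<Integer, Boolean> map = new HashMap<>();
        for (int value : arr) {
            map.putIfAbsent(value, Boolean.TRUE);
        }
        int[] result = new int[map.size()];
        int idx = 0;
        for (int key : map.keySet()) {
            result[idx++] = key;
        }
        return result;
    }

    // De-duplicate by adding the values to a HashSet
    static int[] withHashSet(int[] arr) {
        HashSet<Integer> set = new HashSet<>();
        for (int value : arr) {
            set.add(value);
        }
        int[] result = new int[set.size()];
        int idx = 0;
        for (int value : set) {
            result[idx++] = value;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (int) (Math.random() * 1000);   // assumed value range for the test
        }

        // Very rough one-shot timing; a proper comparison would use a harness like JMH
        long start = System.nanoTime();
        withHashMap(Arrays.copyOf(data, data.length));
        System.out.println("HashMap: " + (System.nanoTime() - start) / 1_000_000 + " ms");

        start = System.nanoTime();
        withHashSet(Arrays.copyOf(data, data.length));
        System.out.println("HashSet: " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }
}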