This is what I am doing:
String one = "some string"
String two = "some string"
I want to know all the characters that are in string one and two and they should come in order as they are in string one
I wrote a Java program which by using Collections performs set operations on both the collection.
What I would like to know that what is the complexity of performing set operations, is it polynomial time or linear time
My program is here
/*
* To change this template, choose Tools | Templates
* and open the template in the editor.
*/
package careercup.google;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
/**
*
* @author learner
*/
public class CharaterStringsIntersection {
private static final String one = "abcdefgabcfmnx";
private static final String two = "xbcg";
public static void main(String args[]){
List<Character> l_one = new ArrayList<Character>();
List<Character> l_two = new ArrayList<Character>();
for(int i=0; i<one.length(); i++){
l_one.add(one.charAt(i));
}
for(int j=0; j<two.length(); j++){
l_two.add(two.charAt(j));
}
l_one.retainAll(l_two);
Iterator iter = l_one.iterator();
while(iter.hasNext()){
System.out.println(" > " + iter.next());
}
}
}
Output :
run:
> b
> c
> g
> b
> c
> x
Its complexity is O(N * (N + M)), where N = one.length()
, M = two.length()
.
It works in the following way: for each character of l_one
(there are N of them) it scans l_two
(since l_two
is an ArrayList
, scan is linear, so it takes O(M) steps, see [1]) and removes item from l_one
if necessary (removing from ArrayList
takes O(N), see [1]), so you get O(N * (N + M)).
You can lower the complexity to O(N * M) by using LinkedList
for l_one
(since removing from LinkedList
is O(1), see [2]), and further
lower it to O(N * log(M)) by using TreeSet
for l_two
(since searching in TreeSet
takes O(log(M)), see [3]).
References:
- All of the other operations run in linear time (
ArrayList
javadoc)
- All of the operations perform as could be expected for a doubly-linked list, (
LinkedList
javadoc)
- This implementation provides guaranteed log(n) time cost for the basic operations (
add
, remove
and contains
) (TreeSet
javadoc)
I think the more important question you are trying to ask is what is the complexity of retainAll() function. Assuming there are no hashes or quick look up, I would assume the time complexity is O(n^2). One loop to iterate the list and one loop to compare each item.
You are using the wrong collection for your purposes..
Since you are doing it with plain ArrayList
the complexity wouldn't be good, I don't know the inner implementation but I would guess it's quadratic (since it has to scroll both lists), this unless Java internally converts two lists to sets before doing the intersection.
You should start from Set
interface (here) and choose which collection is more suitable for your needs, I think that a HashSet
(constant time) or a TreeSet
(logarithmic but able to keep natural ordering) would be best solutions.
Briefly:
ArrayList.add()
is O(1)
.
ArrayList.retainAll(other)
is O(N1 * N2)
where N1
is the size of this
and N2
is the size of other
. (The actual cost depends on the ratio of elements retained to elements removed.)
iter.hasNext()
and iter.next()
are O(1)
for an ArrayList
iterator.
This is all as you would expect, based on mentally modelling an ArrayList as a Java array.