The task is:
A non-empty zero-indexed string S is given. String S consists of N characters from the set of upper-case English letters A, C, G, T.
This string actually represents a DNA sequence, and the upper-case letters represent single nucleotides.
You are also given non-empty zero-indexed arrays P and Q consisting of M integers. These arrays represent queries about minimal nucleotides. We represent the letters of string S as integers 1, 2, 3, 4 in arrays P and Q, where A = 1, C = 2, G = 3, T = 4, and we assume that A < C < G < T.
Query K requires you to find the minimal nucleotide from the range (P[K], Q[K]), 0 ≤ P[i] ≤ Q[i] < N.
For example, consider string S = GACACCATA and arrays P, Q such that:
P[0] = 0 Q[0] = 8
P[1] = 0 Q[1] = 2
P[2] = 4 Q[2] = 5
P[3] = 7 Q[3] = 7
The minimal nucleotides from these ranges are as follows:
(0, 8) is A identified by 1,
(0, 2) is A identified by 1,
(4, 5) is C identified by 2,
(7, 7) is T identified by 4.
Write a function:
class Solution { public int[] solution(String S, int[] P, int[] Q); }
that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M characters specifying the consecutive answers to all queries.
The sequence should be returned as:
a Results structure (in C), or
a vector of integers (in C++), or
a Results record (in Pascal), or
an array of integers (in any other programming language).
For example, given the string S = GACACCATA and arrays P, Q such that:
P[0] = 0 Q[0] = 8
P[1] = 0 Q[1] = 2
P[2] = 4 Q[2] = 5
P[3] = 7 Q[3] = 7
the function should return the values [1, 1, 2, 4], as explained above.
Assume that:
N is an integer within the range [1..100,000];
M is an integer within the range [1..50,000];
each element of array P, Q is an integer within the range [0..N − 1];
P[i] ≤ Q[i];
string S consists only of upper-case English letters A, C, G, T.
Complexity:
expected worst-case time complexity is O(N+M);
expected worst-case space complexity is O(N),
beyond input storage
(not counting the storage required for input arguments).
Elements of input arrays can be modified.
My solution is:
class Solution {
public int[] solution(String S, int[] P, int[] Q) {
final char c[] = S.toCharArray();
final int answer[] = new int[P.length];
int tempAnswer;
char tempC;
for (int iii = 0; iii < P.length; iii++) {
tempAnswer = 4;
for (int zzz = P[iii]; zzz <= Q[iii]; zzz++) {
tempC = c[zzz];
if (tempC == 'A') {
tempAnswer = 1;
break;
} else if (tempC == 'C') {
if (tempAnswer > 2) {
tempAnswer = 2;
}
} else if (tempC == 'G') {
if (tempAnswer > 3) {
tempAnswer = 3;
}
}
}
answer[iii] = tempAnswer;
}
return answer;
}
}
It is not optimal, I believe it's supposed to be done within one loop, any hint how can I achieve it?
You can check quality of your solution here https://codility.com/train/ test name is Genomic-range-query.
Hope this helps.
perl 100/100 solution:
Here is my solution. Got %100 . Of course I needed to first check and study a little bit prefix sums.
pshemek's solution constrains itself to the space complexity (O(N)) - even with the 2-d array and the answer array because a constant (4) is used for the 2-d array. That solution also fits in with the computational complexity - whereas mine is O (N^2) - though the actual computational complexity is much lower because it skips over entire ranges that include minimal values.
I gave it a try - but mine ends up using more space - but makes more intuitive sense to me (C#):
In pshemek's answer - the "trick" in the second loop is simply that once you've determined you've found a range with the minimal value - you don't need to continue iterating. Not sure if that helps.
This program has got score 100 and performance wise has got an edge over other java codes listed above!
The code can be found here.
Here is the solution, supposing someone is still interested.