java codility training Genomic-range-query

2020-05-20 05:39发布

The task is:

A non-empty zero-indexed string S is given. String S consists of N characters from the set of upper-case English letters A, C, G, T.

This string actually represents a DNA sequence, and the upper-case letters represent single nucleotides.

You are also given non-empty zero-indexed arrays P and Q consisting of M integers. These arrays represent queries about minimal nucleotides. We represent the letters of string S as integers 1, 2, 3, 4 in arrays P and Q, where A = 1, C = 2, G = 3, T = 4, and we assume that A < C < G < T.

Query K requires you to find the minimal nucleotide from the range (P[K], Q[K]), 0 ≤ P[i] ≤ Q[i] < N.

For example, consider string S = GACACCATA and arrays P, Q such that:

P[0] = 0    Q[0] = 8
P[1] = 0    Q[1] = 2
P[2] = 4    Q[2] = 5
P[3] = 7    Q[3] = 7

The minimal nucleotides from these ranges are as follows:

    (0, 8) is A identified by 1,
    (0, 2) is A identified by 1,
    (4, 5) is C identified by 2,
    (7, 7) is T identified by 4.

Write a function:

class Solution { public int[] solution(String S, int[] P, int[] Q); } 

that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M characters specifying the consecutive answers to all queries.

The sequence should be returned as:

    a Results structure (in C), or
    a vector of integers (in C++), or
    a Results record (in Pascal), or
    an array of integers (in any other programming language).

For example, given the string S = GACACCATA and arrays P, Q such that:

P[0] = 0    Q[0] = 8
P[1] = 0    Q[1] = 2
P[2] = 4    Q[2] = 5
P[3] = 7    Q[3] = 7

the function should return the values [1, 1, 2, 4], as explained above.

Assume that:

    N is an integer within the range [1..100,000];
    M is an integer within the range [1..50,000];
    each element of array P, Q is an integer within the range [0..N − 1];
    P[i] ≤ Q[i];
    string S consists only of upper-case English letters A, C, G, T.

Complexity:

    expected worst-case time complexity is O(N+M);
    expected worst-case space complexity is O(N), 
         beyond input storage 
         (not counting the storage required for input arguments).

Elements of input arrays can be modified.

My solution is:

class Solution {
    public int[] solution(String S, int[] P, int[] Q) {
        final  char c[] = S.toCharArray();
        final int answer[] = new int[P.length];
        int tempAnswer;
        char tempC;

        for (int iii = 0; iii < P.length; iii++) {
            tempAnswer = 4;
            for (int zzz = P[iii]; zzz <= Q[iii]; zzz++) {
                tempC = c[zzz];
                if (tempC == 'A') {
                    tempAnswer = 1;
                    break;
                } else if (tempC == 'C') {
                    if (tempAnswer > 2) {
                        tempAnswer = 2;
                    }
                } else if (tempC == 'G') {
                    if (tempAnswer > 3) {
                        tempAnswer = 3;
                    }

                }
            }
            answer[iii] = tempAnswer;
        }

        return answer;
    }
}

It is not optimal, I believe it's supposed to be done within one loop, any hint how can I achieve it?

You can check quality of your solution here https://codility.com/train/ test name is Genomic-range-query.

30条回答
相关推荐>>
2楼-- · 2020-05-20 05:42

My C++ solution

vector<int> solution(string &S, vector<int> &P, vector<int> &Q) {

    vector<int> impactCount_A(S.size()+1, 0);
    vector<int> impactCount_C(S.size()+1, 0);
    vector<int> impactCount_G(S.size()+1, 0);

    int lastTotal_A = 0;
    int lastTotal_C = 0;
    int lastTotal_G = 0;
    for (int i = (signed)S.size()-1; i >= 0; --i) {
        switch(S[i]) {
            case 'A':
                ++lastTotal_A;
                break;
            case 'C':
                ++lastTotal_C;
                break;
            case 'G':
                ++lastTotal_G;
                break;
        };

        impactCount_A[i] = lastTotal_A;
        impactCount_C[i] = lastTotal_C;
        impactCount_G[i] = lastTotal_G;
    }

    vector<int> results(P.size(), 0);

    for (int i = 0; i < P.size(); ++i) {
        int pIndex = P[i];
        int qIndex = Q[i];

        int numA = impactCount_A[pIndex]-impactCount_A[qIndex+1];
        int numC = impactCount_C[pIndex]-impactCount_C[qIndex+1];
        int numG = impactCount_G[pIndex]-impactCount_G[qIndex+1];

        if (numA > 0) {
            results[i] = 1;
        }
        else if (numC > 0) {
            results[i] = 2;
        }
        else if (numG > 0) {
            results[i] = 3;
        }
        else {
            results[i] = 4;
        }
    }

    return results;
}
查看更多
三岁会撩人
3楼-- · 2020-05-20 05:44

Python Solution with explanation

The idea is to hold an auxiliary array per nucleotide X, with position i (ignoring zero) is how many times X has occurred as of now. And so if we need the number of occurrences of X from position f to position t, we could take the following equation:

aux(t) - aux(f)

Time complexity is:

O(N+M)

def solution(S, P, Q):
    n = len(S)
    m = len(P)

    aux = [[0 for i in range(n+1)] for i in [0,1,2]]

    for i,c in enumerate(S):
        aux[0][i+1] = aux[0][i] + ( c == 'A' )
        aux[1][i+1] = aux[1][i] + ( c == 'C' )
        aux[2][i+1] = aux[2][i] + ( c == 'G' )

    result = []

    for i in range(m):
        fromIndex , toIndex = P[i] , Q[i] +1
        if   aux[0][toIndex] - aux[0][fromIndex] > 0:
            r = 1
        elif aux[1][toIndex] - aux[1][fromIndex] > 0:
            r = 2
        elif aux[2][toIndex] - aux[2][fromIndex] > 0:
            r = 3
        else:
            r = 4

        result.append(r)

    return result
查看更多
smile是对你的礼貌
4楼-- · 2020-05-20 05:46

Here's 100% Scala solution:

def solution(S: String, P: Array[Int], Q: Array[Int]): Array[Int] = {


    val resp = for(ind <- 0 to P.length-1) yield {

      val sub= S.substring(P(ind),Q(ind)+1)


      var factor = 4

      if(sub.contains("A")) {factor=1}
      else{
        if(sub.contains("C")) {factor=2}
        else{
          if(sub.contains("G")) {factor=3}
        }
      }
      factor

    }

    return resp.toArray

  }

And performance: https://codility.com/demo/results/trainingEUR4XP-425/

查看更多
Viruses.
5楼-- · 2020-05-20 05:46

Java 100/100

class Solution {
public int[] solution(String S, int[] P, int[] Q) {
    int     qSize       = Q.length;
    int[]   answers     = new int[qSize];

    char[]  sequence    = S.toCharArray();
    int[][] occCount    = new int[3][sequence.length+1];

    int[] geneImpactMap = new int['G'+1];
    geneImpactMap['A'] = 0;
    geneImpactMap['C'] = 1;
    geneImpactMap['G'] = 2;

    if(sequence[0] != 'T') {
        occCount[geneImpactMap[sequence[0]]][0]++;
    }

    for(int i = 0; i < sequence.length; i++) {
        occCount[0][i+1] = occCount[0][i];
        occCount[1][i+1] = occCount[1][i];
        occCount[2][i+1] = occCount[2][i];

        if(sequence[i] != 'T') {
            occCount[geneImpactMap[sequence[i]]][i+1]++;
        }
    }

    for(int j = 0; j < qSize; j++) {
        for(int k = 0; k < 3; k++) {
            if(occCount[k][Q[j]+1] - occCount[k][P[j]] > 0) {
                answers[j] = k+1;
                break;
            }

            answers[j] = 4;
        }            
    }

    return answers;
}
} 
查看更多
兄弟一词,经得起流年.
6楼-- · 2020-05-20 05:48

Simple, elegant, domain specific, 100/100 solution in JS with comments!

function solution(S, P, Q) {
    var N = S.length, M = P.length;

    // dictionary to map nucleotide to impact factor
    var impact = {A : 1, C : 2, G : 3, T : 4};

    // nucleotide total count in DNA
    var currCounter = {A : 0, C : 0, G : 0, T : 0};

    // how many times nucleotide repeats at the moment we reach S[i]
    var counters = [];

    // result
    var minImpact = [];

    var i;

    // count nucleotides
    for(i = 0; i <= N; i++) {
        counters.push({A: currCounter.A, C: currCounter.C, G: currCounter.G});
        currCounter[S[i]]++;
    }

    // for every query
    for(i = 0; i < M; i++) {
        var from = P[i], to = Q[i] + 1;

        // compare count of A at the start of query with count at the end of equry
        // if counter was changed then query contains A
        if(counters[to].A - counters[from].A > 0) {
            minImpact.push(impact.A);
        }
        // same things for C and others nucleotides with higher impact factor
        else if(counters[to].C - counters[from].C > 0) {
            minImpact.push(impact.C);
        }
        else if(counters[to].G - counters[from].G > 0) {
            minImpact.push(impact.G);
        }
        else { // one of the counters MUST be changed, so its T
            minImpact.push(impact.T);
        }
    }

    return minImpact;
}
查看更多
看我几分像从前
7楼-- · 2020-05-20 05:48

I implemented a Segment Tree solution in Kotlin

import kotlin.math.*

fun solution(S: String, P: IntArray, Q: IntArray): IntArray {

    val a = IntArray(S.length)
    for (i in S.indices) {
        a[i] = when (S[i]) {
            'A' -> 1
            'C' -> 2
            'G' -> 3
            'T' -> 4
            else -> throw IllegalStateException()
        }
    }

    val segmentTree = IntArray(2*nextPowerOfTwo(S.length)-1)
    constructSegmentTree(a, segmentTree, 0, a.size-1, 0)

    val result = IntArray(P.size)
    for (i in P.indices) {
        result[i] = rangeMinQuery(segmentTree, P[i], Q[i], 0, a.size-1, 0)
    }
    return result
}

fun constructSegmentTree(input: IntArray, segmentTree: IntArray,  low: Int,  high: Int,  pos: Int) {

    if (low == high) {
        segmentTree[pos] = input[low]
        return
    }
    val mid = (low + high)/2
    constructSegmentTree(input, segmentTree, low, mid, 2*pos+1)
    constructSegmentTree(input, segmentTree, mid+1, high, 2*pos+2)
    segmentTree[pos] = min(segmentTree[2*pos+1], segmentTree[2*pos+2])
}

fun rangeMinQuery(segmentTree: IntArray, qlow:Int, qhigh:Int ,low:Int, high:Int, pos:Int): Int {

    if (qlow <= low && qhigh >= high) {
        return segmentTree[pos]
    }
    if (qlow > high || qhigh < low) {
        return Int.MAX_VALUE
    }
    val mid = (low + high)/2
    return min(rangeMinQuery(segmentTree, qlow, qhigh, low, mid, 2*pos+1), rangeMinQuery(segmentTree, qlow, qhigh, mid+1, high, 2*pos+2))
}

fun nextPowerOfTwo(n:Int): Int {
    var count = 0
    var number = n
    if (number > 0 && (number and (number - 1)) == 0) return number
    while (number != 0) {
        number = number shr 1
        count++
    }
    return 1 shl count
}
查看更多
登录 后发表回答