Moving some code from Python to C++.
BASEPAIRS = { "T": "A", "A": "T", "G": "C", "C": "G" }
Thinking maps might be overkill? What would you use?
Here's the map solution:
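A minimal sketch of what a map-based version could look like, assuming C++11 or later. The table name comes from the question; the helper `complement_map` is a hypothetical name added for illustration.

```cpp
#include <map>
#include <string>

// Complement table, mirroring the Python dict from the question.
const std::map<char, char> BASEPAIRS = {
    {'T', 'A'}, {'A', 'T'}, {'G', 'C'}, {'C', 'G'}
};

// Hypothetical helper: returns the complement of every base in seq.
// std::map::at throws std::out_of_range for a character not in the table.
std::string complement_map(const std::string& seq) {
    std::string out;
    out.reserve(seq.size());
    for (char c : seq) {
        out.push_back(BASEPAIRS.at(c));
    }
    return out;
}
```

Using `at()` rather than `operator[]` keeps the table `const` and makes bad input fail loudly instead of silently inserting a new entry.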
Maybe:
;-)
If you are into optimization, and assuming the input is always one of the four characters, the function below might be worth a try as a replacement for the map:
It relies on the fact that you are dealing with two symmetric pairs. The conditional tells the A/T pair apart from the G/C one: in ASCII, 'G' and 'C' both have the second-least-significant bit set, while 'A' and 'T' both have it clear. The remaining arithmetic performs the symmetric mapping, using the identity a = (a + b) - b, which holds for any a and b: subtracting one member of a pair from the pair's sum yields the other member.
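One way the trick described above could be written, as a sketch rather than the exact original function. The name `complement_arith` is hypothetical, and the code assumes upper-case ASCII input limited to the four bases.

```cpp
#include <cassert>

// Assumes c is exactly one of 'A', 'T', 'G', 'C' in ASCII.
inline char complement_arith(char c) {
    assert(c == 'A' || c == 'T' || c == 'G' || c == 'C');
    // Bit 1 (value 2) is set in 'G' (0x47) and 'C' (0x43),
    // and clear in 'A' (0x41) and 'T' (0x54).
    const int pair_sum = (c & 2) ? ('G' + 'C') : ('A' + 'T');
    // Subtracting one member of a pair from the pair's sum gives the other.
    return static_cast<char>(pair_sum - c);
}
```

Any other character produces a meaningless result (or trips the assertion in a debug build), so this only makes sense when the input has already been validated.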
This is the fastest, simplest, smallest space solution I can think of. A good optimizing compiler will even remove the cost of accessing the pair and name arrays. This solution works equally well in C.
base[] is a fast ASCII char to Base (i.e. an int between 0 and 3 inclusive) lookup that is a bit ugly. A good optimizing compiler should be able to handle base2(), but I'm not sure whether any do.
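A rough sketch of the table-driven idea, under several assumptions: the numbering of `Base`, the `dna` namespace, the `BaseTable` wrapper, and the helper `complement_lookup` are all hypothetical, and `base2()` is not reproduced here. Only the roles of `pair`, `name` and the ASCII lookup follow the description above.

```cpp
namespace dna {

// Assumed numbering; any order works as long as pair[] matches it.
enum Base { A = 0, C = 1, G = 2, T = 3 };

// pair[b] is the complementary base of b.
const Base pair[4] = { T, G, C, A };

// name[b] is the character used to print base b.
const char name[4] = { 'A', 'C', 'G', 'T' };

// ASCII char -> Base lookup with 256 entries.
// Entries for non-base characters are left at 0, which reads as A.
struct BaseTable {
    unsigned char v[256];
    BaseTable() : v() {
        v['A'] = A; v['C'] = C; v['G'] = G; v['T'] = T;
    }
};
const BaseTable base_table;

inline Base base(char c) {
    return static_cast<Base>(base_table.v[static_cast<unsigned char>(c)]);
}

// Hypothetical helper: char in, complementary char out.
inline char complement_lookup(char c) {
    return name[pair[base(c)]];
}

}  // namespace dna
```

In C the 256-entry table could be written with designated initializers instead of a constructor.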
You can use the following syntax:
Unless I were really concerned about performance, I would use a function that takes a base and returns its match:
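Such a function could be as simple as a switch. This is a sketch; the name `match` and the choice to throw on invalid input are assumptions.

```cpp
#include <stdexcept>

// Returns the complementary base for 'A', 'T', 'G' or 'C'.
char match(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'G': return 'C';
        case 'C': return 'G';
        default:
            throw std::invalid_argument("not a DNA base");
    }
}
```

Compilers typically turn a small switch like this into a jump table or a few comparisons, so it is unlikely to be the bottleneck.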
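If I were concerned about performance, I would define a base as a quarter of a byte, i.e. two bits: 0 would represent A, 1 would represent G, 2 would represent C, and 3 would represent T. Then I would pack 4 bases into a byte, and to get their pairs I would simply take the bitwise complement of the byte. This works because the encoding makes each pair sum to 3 (A=00 and T=11, G=01 and C=10), so flipping both bits of a base gives its match.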
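A sketch of the packed representation, using the 2-bit codes given above. The function names and the bit layout (first base in the two lowest bits) are assumptions made for the example.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// 2-bit codes from the text: A=0, G=1, C=2, T=3.
inline std::uint8_t encode(char c) {
    switch (c) {
        case 'A': return 0;
        case 'G': return 1;
        case 'C': return 2;
        case 'T': return 3;
        default: throw std::invalid_argument("not a DNA base");
    }
}

inline char decode(std::uint8_t code) {
    return "AGCT"[code & 3];
}

// Packs exactly four bases into one byte, first base in bits 0-1.
inline std::uint8_t pack4(const std::string& four) {
    if (four.size() != 4) throw std::invalid_argument("need 4 bases");
    std::uint8_t b = 0;
    for (int i = 0; i < 4; ++i) {
        b = static_cast<std::uint8_t>(b | (encode(four[i]) << (2 * i)));
    }
    return b;
}

inline std::string unpack4(std::uint8_t b) {
    std::string out(4, 'A');
    for (int i = 0; i < 4; ++i) {
        out[i] = decode(static_cast<std::uint8_t>((b >> (2 * i)) & 3));
    }
    return out;
}

// Complements all four bases at once: 0<->3 (A<->T) and 1<->2 (G<->C).
inline std::uint8_t complement4(std::uint8_t b) {
    return static_cast<std::uint8_t>(~b);
}
```

The same complement works on wider words, so a 64-bit integer would handle 32 bases per operation.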