I have already converted user input of DNA code (A, T, G, C) into RNA code (A, U, G, C). This was fairly easy:
RNA_Code=DNA_Code.replace('T','U')
Now the next thing I need to do is convert the RNA_Code into its complement strand. This means I need to replace A with U, U with A, G with C and C with G, but all simultaneously.
If I write
RNA_Code.replace('A','U')
RNA_Code.replace('U','A')
it converts all the As into Us and then all the Us into As, so every original A and U ends up as an A.
I need it to take something like AUUUGCGGCAAA and convert it to UAAACGCCGUUU.
Any ideas on how to get this done? (Python 3.3)
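To make the failure concrete, here is a minimal sketch of the two chained replacements applied to the example sequence. The variable names `step1` and `step2` are illustrative only; note that `str.replace()` returns a new string, so the intermediate results are stored explicitly.

```python
rna_code = "AUUUGCGGCAAA"

# First pass: every A becomes U, so original As and Us are now indistinguishable.
step1 = rna_code.replace("A", "U")

# Second pass: every U (original or converted) becomes A.
step2 = step1.replace("U", "A")

print(step1)  # UUUUGCGGCUUU
print(step2)  # AAAAGCGGCAAA
```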
Use a translation table:
RNA_compliment = {
ord('A'): 'U', ord('U'): 'A',
ord('G'): 'C', ord('C'): 'G'}
RNA_Code.translate(RNA_compliment)
The str.translate() method takes a mapping from codepoint (a number) to replacement character. The ord() function gives us the codepoint for a given character, making it easy to build the map.
Demo:
>>> RNA_compliment = {ord('A'): 'U', ord('U'): 'A', ord('G'): 'C', ord('C'): 'G'}
>>> 'AUUUGCGGCAAA'.translate(RNA_compliment)
'UAAACGCCGUUU'
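As an alternative to spelling out the `ord()` calls, the same kind of table can probably be built with `str.maketrans()`, which pairs up the characters of two equal-length strings. This is a sketch rather than part of the answer above; the names `rna_complement_table` and `complement` are made up for the example.

```python
# Pair each base in the first string with the base at the same
# position in the second string: A->U, U->A, G->C, C->G.
rna_complement_table = str.maketrans("AUGC", "UACG")

rna_code = "AUUUGCGGCAAA"
complement = rna_code.translate(rna_complement_table)

print(complement)  # UAAACGCCGUUU
```

Characters that are not in the table are passed through unchanged by `translate()`.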
You can use a mapping dictionary:
In [1]: dic={"A":"U","U":"A","G":"C","C":"G"}
In [2]: strs="AUUUGCGGCAAA"
In [3]: "".join(dic[x] for x in strs)
Out[3]: 'UAAACGCCGUUU'
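One thing to be aware of with the dictionary lookup is that `dic[x]` raises `KeyError` for any character that is not a key. Depending on what you want, that can be turned into a clearer error or avoided with `dict.get()`. The sketch below shows both; the function names `complement_strict` and `complement_lenient` are hypothetical, not from any library.

```python
pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}

def complement_strict(seq):
    """Return the complement, raising ValueError on any unexpected character."""
    try:
        return "".join(pairs[base] for base in seq)
    except KeyError as exc:
        raise ValueError("unexpected base: {!r}".format(exc.args[0])) from None

def complement_lenient(seq):
    """Return the complement, leaving unexpected characters unchanged."""
    return "".join(pairs.get(base, base) for base in seq)

print(complement_strict("AUUUGCGGCAAA"))  # UAAACGCCGUUU
print(complement_lenient("AUNG"))         # UANC
```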
If you're not already using it, I suggest trying out Biopython. It has all sorts of functions for dealing with biological data, including a pretty cool Seq object. Its complement() method does what you're trying to do, there is a reverse_complement() method as well, and there are a bunch more that you might not even have thought of yet. Check it out, it's a real time-saver.
>>> # Bio.Alphabet was removed in Biopython 1.78, so Seq is created without an alphabet here.
>>> from Bio.Seq import Seq
>>> my_dna = Seq("AGTACACTGGT")
>>> my_dna
Seq('AGTACACTGGT')
>>> my_dna.complement()
Seq('TCATGTGACCA')
>>> my_dna.reverse_complement()
Seq('ACCAGTGTACT')
I have a simple solution:
# Get the sequence from the user.
dna_seq = input("Please enter your sequence here: ")
# Read the sequence one nucleotide at a time and append its complement to a new string.
compliment = ""
for n in dna_seq:
    if n == "A":
        compliment = compliment + "T"
    elif n == "T":
        compliment = compliment + "A"
    elif n == "G":
        compliment = compliment + "C"
    elif n == "C":
        compliment = compliment + "G"
print(compliment)
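The loop above complements DNA (A/T pairs), whereas the question asks for the complement of the RNA strand. A minimal sketch that chains the question's own T-to-U replacement with a single-pass substitution might look like this. The function name `rna_complement_from_dna` is hypothetical, and the `.upper()` call is an added assumption so that lowercase input is handled too.

```python
def rna_complement_from_dna(dna_code):
    """Convert a DNA string to RNA, then return the RNA complement."""
    # Step 1: DNA -> RNA, as in the question.
    rna_code = dna_code.upper().replace("T", "U")
    # Step 2: swap A<->U and G<->C in a single pass.
    table = str.maketrans("AUGC", "UACG")
    return rna_code.translate(table)

print(rna_complement_from_dna("ATTTGCGGCAAA"))  # UAAACGCCGUUU
```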