I'm fairly new to the Linux world and I need your help. I need code to search for specific characters at specific positions in a text file.
For example, the file sequences.txt looks like this:
ACGTCAGTCAG**T**CAGCATC**G**ATCGACTACGACCGTAGCTAGCTATACGACT**G**ATCAGCTACGATCAGCTACGATCAGCTACGAT
ACGTCAGTCAG**A**CAGCATC**C**ATCGACCATGCTAGCCGTACGATTAGCGACT**C**ATCAGCTACGATCAGCTACGATCAGCTACGAT
ACGTCAGTCAG**T**CAGCATCATCGACTACGACTACGATCGATCGATCGGACT**G**ATCAGCTACGATCAGCTACGATCAGCTACGATG
ACGTCAGTCAG**A**CAGCATC**G**ATCGACTACGACGATCGATCGATCTACGACT**C**ATCAGCTACGATCAGCTACGATCAGCTACGAT
What I want is to split the dataset into different output files, grouping together the lines that have the same characters at the marked positions.
I hope someone can help me. All the best.
To search for "foo" at position 42:
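One way to do this with grep is to anchor the pattern at the start of the line and skip a fixed number of characters before the string. This is a minimal sketch, assuming positions are counted from 1; the helper name `match_at` is made up for illustration, and the string is interpreted as a regular expression.

```sh
# match_at POS STRING FILE (hypothetical helper)
# Print the lines of FILE in which STRING starts at column POS (1-based).
match_at() {
    pos=$1 str=$2 file=$3
    # POS-1 arbitrary characters from the start of the line, then STRING.
    grep -E "^.{$((pos - 1))}$str" "$file"
}

# For "foo" at position 42 this runs: grep -E '^.{41}foo' FILE
# match_at 42 foo sequences.txt
```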
You can run a command like this multiple times on your input:
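As a sketch of what the repeated runs might look like: one grep per character of interest, each writing to its own file. The sample lines are shortened copies of the sequences in the question, and column 12 and the output file names are assumptions chosen for the example.

```sh
# Shortened copies of the question's sequences, for illustration only;
# point "in" at the real file instead.
work=$(mktemp -d)
in="$work/sequences.txt"
printf '%s\n' \
    ACGTCAGTCAGTCAGCATCG \
    ACGTCAGTCAGACAGCATCC \
    ACGTCAGTCAGTCAGCATCA \
    ACGTCAGTCAGACAGCATCG > "$in"

# One run per character of interest: 11 arbitrary characters, then the
# character itself, so it is tested at column 12 (counting from 1).
grep -E '^.{11}T' "$in" > "$work/pos12_T.txt"
grep -E '^.{11}A' "$in" > "$work/pos12_A.txt"
```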
or as a loop:
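A loop version could look roughly like this. It is only a sketch: the helper name `split_at`, the four-letter alphabet A, C, G, T and the output naming scheme are assumptions, and positions are counted from 1.

```sh
# split_at FILE POS PREFIX (hypothetical helper)
# For each of A, C, G, T, write the lines of FILE that have that character
# at column POS (1-based) to PREFIX_<char>.txt.
split_at() {
    file=$1 pos=$2 prefix=$3
    for c in A C G T; do
        out="${prefix}_$c.txt"
        # grep exits non-zero when nothing matches; in that case remove
        # the empty output file it would otherwise leave behind.
        grep -E "^.{$((pos - 1))}$c" "$file" > "$out" || rm -f "$out"
    done
}

# Example: split_at sequences.txt 12 pos12
```

This reads the input once per character, which is fine for a handful of characters but gets slow if there are many keys.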
awk's substring operations might be useful here. The idea is to take the 3-character substring of each line starting at position 42 (awk's `substr` counts characters from 1, not 0), form an output file name "outputXYZ.txt" from that substring, and then append the line to that file.
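The awk approach described above can be sketched as follows. The wrapper name `group_by_substr` and its parameters are made up for illustration; note that awk's `substr` counts characters from 1.

```sh
# group_by_substr FILE START LEN PREFIX (hypothetical helper)
# Append each line of FILE to PREFIX<key>.txt, where <key> is the LEN-character
# substring of the line starting at column START (1-based).
group_by_substr() {
    awk -v start="$2" -v len="$3" -v prefix="$4" '{
        out = prefix substr($0, start, len) ".txt"
        print >> out   # append the whole line to the file for its key
        close(out)     # keep the number of open files small
    }' "$1"
}

# Matches the description above: group_by_substr sequences.txt 42 3 output
```

For the sequences in the question, the key could instead be built by concatenating single-character substrings, for example `substr($0, 12, 1) substr($0, 20, 1) substr($0, 52, 1)`, assuming the highlighted characters sit at columns 12, 20 and 52. Because the files are opened in append mode, remove old output files before running it again.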