This IS kind of linked to another question - Code golf: "Color highlighting" of repeated text
I'm trying to figure out a way of breaking a file into all of its groups of 'n' consecutive characters.
E.g., if a file consists of ONLY the following text:
ABCDEFGHIJ
And we want it broken into sets of 3, the output should be:
ABC
BCD
CDE
DEF
EFG
FGH
GHI
HIJ
No character in the file is to be treated differently from any other, i.e., a space is just another character and follows the rule above.
sed:
echo "ABCDEFGHIJ" | sed -n ':a;/^...$/{p;b};s/.../&\n/;P;s/.//;s/\n//;ba'
A more generalized sed version:
num=5; echo "ABCDEFGHIJ" | sed -n ":a;/^.\{$num\}\$/{p;b};s/.\{$num\}/&\n/;P;s/.//;s/\n//;ba"
Bash and ksh:
string="ABCDEFGHIJ"
for ((i=0;i<=${#string}-3;i++)); do echo "${string:i:3}"; done
zsh:
string="ABCDEFGHIJ"
for ((i=1;i<=${#string}-2;i++)); do echo $string[i,i+2]; done
sh (specifically Dash):
string='ABCDEFGHIJ'
count=$(seq $((${#string}-2)))
for i in $count; do b="$b?"; done
for i in $count; do b="${b%?}"; echo "${string%$b}"; string="${string#?}"; done
AWK:
echo "ABCDEFGHIJ" | awk -v num=4 '{for (i=1; i<=length($0)-num+1; i++) print substr($0,i,num)}'
Edit: Added a more generalized sed version and an AWK version.
Does it have to be shell based or are you open to other scripting languages? Here's a version in Python:
width = 3
data = open("file").read()
# One window starts at every index that leaves room for `width` characters
for x in range(len(data) - width + 1):
    print(data[x : x+width])
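The same sliding-window idea can be wrapped in a small function, which makes it easier to reuse and to check against the example in the question. This is only a sketch: the name `windows` is made up for illustration, and it assumes the whole input fits in memory as one string.

```python
def windows(text, width):
    """Yield every run of `width` consecutive characters in `text`."""
    # No character is special-cased: spaces and newlines count like any other.
    # If `text` is shorter than `width`, the range is empty and nothing is yielded.
    for start in range(len(text) - width + 1):
        yield text[start:start + width]


if __name__ == "__main__":
    for chunk in windows("ABCDEFGHIJ", 3):
        print(chunk)
```

Note that if the file ends with a newline, that newline will show up in the last window, which matches the "no character is special" rule. Strip it before calling the function if you don't want that.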