I have a source file with mixed tabs and spaces, and I want to convert it so that all indentation spaces are replaced by tabs for a given tab width (for example, tab = 2 spaces).
Is there an easy solution using common Unix tools (Mac OS X, bash or zsh)? A sed script or a Python command, perhaps?
Thanks,
Albert
Ok, none of the given solutions satisfied me, so I coded it myself. :)
See here:
- http://github.com/albertz/helpers/blob/master/indent-spacestotabs.py
- http://github.com/albertz/helpers/blob/master/indent-tabtospaces.py
Depending on the source language, you could try out GNU indent. It can do a large number of things relating to the indentation of source code, though it might be more complex than you need.
For example, if I give the following program to indent -di0 <inputfile>:
#include <stdio.h>
int main(int argc, char **argv)
{
    int i;
    int j;
    for (i = 0; i < 10; i++)
    {
        for (j = 0; j < 10; j++)
        {
            printf("x");
        }
    }
}
It will replace it with:
#include <stdio.h>
int
main(int argc, char **argv)
{
	int i;
	int j;
	for (i = 0; i < 10; i++) {
		for (j = 0; j < 10; j++) {
			printf("x");
		}
	}
}
Or, if you need something dead simple, there are the expand and unexpand commands: expand converts tabs to spaces, and unexpand converts spaces to tabs.
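To make the idea concrete, here is a minimal Python sketch of roughly what unexpand does to the leading whitespace of a single line. It is an approximation, not a reimplementation of the tool; the function name unexpand_leading and the default tab width of 2 are assumptions made for illustration.

```python
def unexpand_leading(line: str, tabsize: int = 2) -> str:
    """Rewrite the leading whitespace of one line as tabs plus leftover spaces."""
    # Split the line into its leading blanks and everything after them.
    stripped = line.lstrip(" \t")
    lead = line[: len(line) - len(stripped)]
    # Column width of the indentation; existing tabs advance to the next tab stop.
    width = len(lead.expandtabs(tabsize))
    # As many tabs as fit, then spaces for whatever is left over.
    tabs, spaces = divmod(width, tabsize)
    return "\t" * tabs + " " * spaces + stripped


if __name__ == "__main__":
    print(repr(unexpand_leading("    x = 1")))  # prints '\t\tx = 1'
```

Unlike a plain search-and-replace, this keeps a leftover space when the indentation width is not a multiple of the tab width, and it leaves spaces after the first non-blank character untouched.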
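You could use a regular expression to replace N spaces by a tab character. Note that this replaces every run of N spaces in the text, not only the indentation. For example, in Python with N = 4:
import re
text = "        indented line"  # sample input
text = re.sub(r' {4}', '\t', text)  # re.sub returns the new string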
Two things:
1. sed -i is your friend: sed -i 's/^[ ]\{2\}/\t/g' XXX.txt
2. You can't make a regular expression multiply the tab replacement by the number of leading spaces.
Given my AWK-fu is not strong (and I don't know if it can do what #2 can't), I will write a PHP script to calculate the spaces and replace them with tabs.
sed -r 's/ {2}/\t/g' file
Here is a possible solution in Python:
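import re
import fileinput

# Matches the run of leading spaces; m.end() is the number of spaces matched.
pat = re.compile("^( )+")
# inplace=True redirects print() output back into the file being read.
for line in fileinput.input(inplace=True):
    # Two spaces per tab; an odd leftover space is dropped.
    print(pat.sub(lambda m: "\t" * (m.end() // 2), line, 1), end="")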
This will convert leading spaces (even interspersed with tabs) into tabs. Specify the number of spaces per tab by setting the tstop variable. Stray spaces (runs shorter than tstop) will be collapsed to nothing. Spaces and tabs that appear after any character other than space or tab will not be touched.
tstop=2
sed "s/^\([[:blank:]]*\)\(.*\)/\1\n\2/;h;s/[^\n]*//;x;s/\n.*//;s/ \{$tstop\}/\t/g;s/ //g;G;s/\n//g" inputfile
Example:
[space][space][tab][tab][space][space][space][tab][space]TEXT[space][space][space]
will be converted to
[tab][tab][tab][tab][tab]TEXT[space][space][space]
If that's not exactly what you're looking for, adjustments can be made.
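For readers who find the sed one-liner hard to follow, the same steps can be sketched in Python. This is an illustrative equivalent under the behavior described above, not a tested drop-in replacement; the function name convert_leading_blanks is an assumption.

```python
import re


def convert_leading_blanks(line: str, tstop: int = 2) -> str:
    """Turn each run of tstop leading spaces into a tab and drop stray spaces."""
    # Split off the leading spaces and tabs (the [[:blank:]]* part of the sed script).
    end = re.match(r"[ \t]*", line).end()
    lead, rest = line[:end], line[end:]
    # Every tstop consecutive spaces become one tab; existing tabs are kept.
    lead = lead.replace(" " * tstop, "\t")
    # Any spaces still left are strays and are removed.
    lead = lead.replace(" ", "")
    # The rest of the line, including trailing spaces, is untouched.
    return lead + rest


if __name__ == "__main__":
    sample = "  \t\t   \t TEXT   "
    print(repr(convert_leading_blanks(sample)))  # prints '\t\t\t\t\tTEXT   '
```

The sample string is the example from above: two spaces, two tabs, three spaces, a tab and a space before TEXT become five tabs, and the three trailing spaces are preserved.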