convert txt file with mixed spaces/tabs to tabs on

Posted 2019-02-13 13:12

Question:

I have a source code file with mixed tabs and spaces, and I want to convert it so that all indentation spaces are automatically replaced by tabs for a given tab width (for example, tab = 2 spaces).

Is there an easy solution (with common Unix tools, MacOSX, bash or zsh)? Some sed script or Python command or so?

Thanks, Albert

Answer 1:

Ok, none of the given solutions satisfied me, so I coded it myself. :)

See here:

  • http://github.com/albertz/helpers/blob/master/indent-spacestotabs.py
  • http://github.com/albertz/helpers/blob/master/indent-tabtospaces.py


Answer 2:

Depending on the source language, you could try out GNU indent. It can do a large number of things relating to the indentation of source code, though it might be more complex than you need.

For example, if I give the following program to indent -di0 <inputfile>:

#include <stdio.h>

int main(int argc, char **argv)
{
  int i;
    int j;
  for (i = 0; i < 10; i++)
    {
        for (j = 0; j < 10; j++)
    {
        printf("x");
    }
  }
}

It will replace it with:

#include <stdio.h>

int 
main(int argc, char **argv)
{
    int i;
    int j;
    for (i = 0; i < 10; i++) {
        for (j = 0; j < 10; j++) {
            printf("x");
        }
    }
}

Or, if you need something dead simple, there are the expand/unexpand commands.
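For readers who prefer a script over expand/unexpand, here is a minimal Python sketch of the same idea, limited to leading indentation. The function names `retab_line` and `retab_text` and the `width` parameter are made up for this illustration. It relies on `str.expandtabs` to turn mixed indentation into a column count first.

```python
def retab_line(line: str, width: int = 2) -> str:
    """Convert the leading indentation of one line to tabs."""
    body = line.lstrip(" \t")
    indent = line[: len(line) - len(body)]
    # expandtabs() pads each tab to the next multiple of `width`,
    # so mixed indentation becomes a plain column count.
    columns = len(indent.expandtabs(width))
    tabs, spaces = divmod(columns, width)
    # Leftover spaces (fewer than `width`) are kept as spaces.
    return "\t" * tabs + " " * spaces + body


def retab_text(text: str, width: int = 2) -> str:
    """Apply retab_line to every line, preserving line endings."""
    lines = text.splitlines(keepends=True)
    return "".join(retab_line(line, width) for line in lines)
```

Called as `retab_text(source, width=2)`, this leaves everything after the first non-blank character untouched, so spaces used for alignment inside a line survive.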



Answer 3:

You could use a regular expression to replace N spaces by a tab character. For example in Python:

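import re
text = "        x = 1"  # sample input; replace with the file contents
text = re.sub('[ ]{4}', '\t', text)  # note: also matches runs of spaces that are not indentation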
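One caveat with the plain pattern is that it also rewrites runs of spaces in the middle of a line. A hedged sketch of a variant anchored to the start of each line follows. The helper name `leading_spaces_to_tabs` and the `width` parameter are invented here for illustration, and it only handles indentation made purely of spaces.

```python
import re


def leading_spaces_to_tabs(text: str, width: int = 4) -> str:
    """Replace only leading groups of `width` spaces with tabs."""
    # re.MULTILINE makes ^ match at the start of every line.
    pattern = re.compile(r"^(?: {%d})+" % width, flags=re.MULTILINE)
    # The callback emits one tab per matched group of `width` spaces.
    return pattern.sub(lambda m: "\t" * (len(m.group(0)) // width), text)


sample = "        if x:    # comment\n    y = 1\n"
naive = re.sub("[ ]{4}", "\t", sample)    # also touches the gap before '#'
careful = leading_spaces_to_tabs(sample)  # leaves that gap alone
```

Comparing `naive` and `careful` shows the difference: only the anchored version keeps the four spaces before the comment.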


Answer 4:

Two things:

  1. sed -i is your friend - sed -i 's/^[ ]\{2\}/\t/g' XXX.txt
  2. You can't make a regular expression multiply the tab replacement by the number of leading spaces.

Given my AWK-fu is not strong (and I don't know if it can do what #2 can't), I will write a PHP script to calculate the spaces and replace them with tabs.



Answer 5:

sed -r 's/ {2}/\t/g' file


Answer 6:

Here is a possible solution in Python:

import re
import fileinput

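# Match a leading run of two-space pairs (tab width = 2).
pat = re.compile("^(  )+")

# With inplace=True, printed output is written back to the file being read.
for line in fileinput.input(inplace=True):
    print(pat.sub(lambda m: "\t" * (m.end() // 2), line, count=1), end="")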


Answer 7:

This will convert leading spaces (even interspersed with tabs) into tabs. Specify the number of spaces per tab by setting the tstop variable. Stray spaces (leftover runs shorter than tstop) will be collapsed to nothing. Spaces and tabs that appear after any character other than space or tab will not be touched.

tstop=2
# Assumes GNU sed, where \t and \n are recognized in the script.
sed "s/^\([[:blank:]]*\)\(.*\)/\1\n\2/;h;s/[^\n]*//;x;s/\n.*//;s/ \{$tstop\}/\t/g;s/ //g;G;s/\n//g" inputfile

Example:

[space][space][tab][tab][space][space][space][tab][space]TEXT[space][space][space]

will be converted to

[tab][tab][tab][tab][tab]TEXT[space][space][space]
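The same steps can be sketched in Python, which may be easier to adapt than the sed one-liner. This is an illustrative port, not the answer's own code: the function name `sed_style_retab` is hypothetical, and like the description above it ignores tab stops and simply drops stray spaces.

```python
import re


def sed_style_retab(line: str, tstop: int = 2) -> str:
    """Convert leading blanks to tabs, dropping stray spaces."""
    # Split the line into its leading blanks (spaces/tabs) and the rest.
    m = re.match(r"[ \t]*", line)
    blanks, rest = line[: m.end()], line[m.end():]
    # Each run of `tstop` spaces becomes one tab; existing tabs are kept.
    blanks = blanks.replace(" " * tstop, "\t")
    # Any spaces still left are strays and are removed.
    blanks = blanks.replace(" ", "")
    # Everything after the first non-blank character is left untouched.
    return blanks + rest


example = "  \t\t   \t TEXT   "
converted = sed_style_retab(example, tstop=2)
```

Here `example` spells out the bracketed input shown above, and `converted` holds five tabs followed by `TEXT` and the three trailing spaces.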

If that's not exactly what you're looking for, adjustments can be made.