Renaming files whose names contain Ö ö Ç ç Ş ş İ ı Ğ ğ

Posted 2019-06-14 13:12

Question:

I am trying to rename files with a batch script. I want to replace the letters Ö ö Ç ç Ş ş İ ı Ğ ğ Ü ü with O o C c S s I i G g U u, but it is failing. What can I do to fix this problem?

@echo OFF
set TargetFolder=%~dp0target
setlocal enableDelayedExpansion
set srch=Ö ö Ç ç Ş ş İ ı Ğ ğ Ü ü
set rplc=O o C c S s I i G g U u
set /a n=0

for %%a in (!srch!) do set /a n+=1&set srch[!n!]=%%a
set /a n=0
for %%a in (!rplc!) do set /a n+=1&set rplc[!n!]=%%a

for /f "tokens=* delims=" %%a in ('dir /b /a-d "%TargetFolder%\*"') do (
  set NewFileName=%%~na
  for /l %%x in (1,1,!n!) do (
    for /f "tokens=* delims=" %%t in ('jrepl !srch[%%x]! !rplc[%%x]! /s NewFileName') do set "NewFileName=%%t"
  )
  ren "%TargetFolder%\%%~nxa" "!NewFileName!%%~xa"
)
endlocal
pause

PS: This code requires the JREPL.BAT file from @dbenham.

Answer 1:

The problem characters are Unicode characters that have no ASCII equivalent. The file system allows such Unicode characters, but the command line has limited support for Unicode.

It is possible to manipulate Unicode characters with JREPL by using the \uNNNN escape sequence. But even if you do it correctly, the command line corrupts the value when you attempt to rename the file.

I have written another hybrid JScript/batch utility called JREN.BAT that renames files or folders via regular expression replacements. I didn't plan for this use, but it is a perfect application for JREN.BAT. It works because JScript actually does the rename, and JScript natively works with Unicode.

In order to do the rename, you must first establish the Unicode code value of each problem character. I copied the characters into a Microsoft Word document and used the process described at http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=UTTUsingUnicodeMacros to figure out the code values.

I wrote three solutions using JREN.

1) This first version is fairly easy to follow and easy to maintain: simply add an additional "find replace" line for each needed translation. The big disadvantage is slow performance, because it renames every file repeatedly, once for each character to be translated.

@echo off
for %%A in (
  "00D6 O"
  "00F6 o"
  "00C7 C"
  "00E7 c"
  "015E S"
  "015F s"
  "0130 I"
  "0131 i"
  "011E G"
  "011F g"
  "00DC U"
  "00FC u"
) do for /f "tokens=1,2" %%B in (%%A) do call jren "\u%%B" "%%C" %*

2) This second version is a bit wicked to follow, and difficult to maintain. But it is much faster because it fully renames each file in one pass.

@echo off
call jren "(\u00D6)|(\u00F6)|(\u00C7)|(\u00E7)|(\u015E)|(\u015F)|(\u0130)|(\u0131)|(\u011E)|(\u011F)|(\u00DC)|(\u00FC)" ^
          "$1?'O':$2?'o':$3?'C':$4?'c':$5?'S':$6?'s':$7?'I':$8?'i':$9?'G':$10?'g':$11?'U':'u'" /j %*

3) This last version gives the best of both worlds. The translation list is easily maintained like the first version, but then it dynamically builds the search and replace expressions that are used by the second method. So it is able to rename all files in one pass.

@echo off
setlocal enableDelayedExpansion
set "find="
set "repl="
set /a n=0
for %%A in (
  "00D6 O"
  "00F6 o"
  "00C7 C"
  "00E7 c"
  "015E S"
  "015F s"
  "0130 I"
  "0131 i"
  "011E G"
  "011F g"
  "00DC U"
  "00FC u"
) do for /f "tokens=1,2" %%B in (%%A) do (
  set /a n+=1
  set "find=!find!|(\u%%B)"
  set "repl=!repl!$!N!?'%%C':"
)
call jren "!find:~1!" "!repl!$0" /j %*
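
To check what the third script actually hands to JREN, the same loop can be run with a shortened table and the two strings echoed instead of passed on. This is only an illustrative sketch: the two-entry table and the echo labels are my own and not part of the script above.

```batch
@echo off
setlocal enableDelayedExpansion
set "find="
set "repl="
set /a n=0
rem Shortened two-entry table, for illustration only
for %%A in (
  "00D6 O"
  "00F6 o"
) do for /f "tokens=1,2" %%B in (%%A) do (
  set /a n+=1
  rem Append one capture group per character to the search expression
  set "find=!find!|(\u%%B)"
  rem Append one ternary branch per character to the replacement expression
  set "repl=!repl!$!n!?'%%C':"
)
rem The same two arguments that would go to jren, printed instead
echo search : !find:~1!
echo replace: !repl!$0
endlocal
```

With this two-entry table the script should print the search string (\u00D6)|(\u00F6) and the replacement expression $1?'O':$2?'o':$0, which has the same shape as the hand-written arguments in the second version.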

Assuming you name any of the above scripts "fixUnicode.bat" and place it along with JREN.BAT somewhere in your PATH, you could use any of the following:

Rename all files in the current directory

fixUnicode

Rename all files in the d:\test folder

fixUnicode /p "d:\test"

Recursively rename all files and folders on the c: drive (the first command handles files; the second, with /d, handles folders)

fixUnicode /s /p "c:\"
fixUnicode /d /s /p "c:\"

There are other options you can append to specify which files and/or paths to include or exclude. Use jren /? to get help on all the options that are available to JREN. Most of them can be used with fixUnicode.bat
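
Before running a bulk rename, especially a recursive one, it may be worth previewing the result. As far as I recall, the JREN help lists a /T test-mode option that reports the planned renames without performing them. I am assuming your copy of JREN supports it, so confirm with jren /? first. Because the scripts forward their arguments with %*, the option can be appended like any other:

```batch
rem Preview only: assumes JREN's /T (test mode) option is available
fixUnicode /t /p "d:\test"
```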



Answer 2:

Change to the correct code page. Hints:

  • Which code page does the command line interpreter default to? The CHCP command will tell you. Generally it is the OEM code page (852, DOS: East Europe, on my Windows).
  • Which code page is the script saved in? Generally it is the ANSI code page (1250, Windows: East Europe, on my Windows).

The following script, cpansi.bat, is saved as ANSI:

echo OFF
setlocal enableDelayedExpansion
set "srch=Ö ö Ç ç Ş ş İ ı Ğ ğ Ü ü"
set "rplc=O o C c S s I i G g U u"
echo srch %srch%
echo rplc %rplc%
endlocal
goto :eof

Output:

Active code page: 852

d:\bat>cpansi

d:\bat>echo OFF
srch Í ÷ ă š ¬ ║ I i G g ▄ Ř
rplc O o C c S s I i G g U u

d:\bat>chcp 1250
Active code page: 1250

d:\bat>cpansi

d:\bat>echo OFF
srch Ö ö Ç ç Ş ş I i G g Ü ü
rplc O o C c S s I i G g U u
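
The experiment above suggests a way to make a script independent of whatever code page the console starts with: remember the active code page, switch to the one the script was saved in, and restore the original at the end. A minimal sketch, assuming English CHCP output of the form "Active code page: 852" (localized Windows versions may format it differently) and assuming the script was saved as code page 1250. The variable name OldCP is my own.

```batch
@echo off
setlocal
rem Take the text after the colon in the CHCP output, e.g. " 852"
for /f "tokens=2 delims=:" %%P in ('chcp') do set "OldCP=%%P"
rem Remove the leading space
set "OldCP=%OldCP: =%"

rem Switch to the code page the script file was saved in (assumed: 1250)
chcp 1250 >nul

rem Lines containing non-ASCII letters go after the switch
set "srch=Ö ö Ç ç Ş ş İ ı Ğ ğ Ü ü"
echo srch %srch%

rem Restore the code page that was active before
chcp %OldCP% >nul
endlocal
```

Note that code page 1250 has no İ ı Ğ ğ, which is presumably why they appear as plain I i G g in the output above. The Turkish ANSI code page, 1254, does contain them, so for these letters the script would have to be both saved in and run under 1254.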

Edit: For debugging, temporarily change ren to echo and add a dir command to see what happens:

echo ren "%TargetFolder%\%%~nxa" "!NewFileName!%%~xa"
dir /B "%TargetFolder%\%%~nxa"