What DVCS support Unicode filenames?

Posted 2019-02-01 07:52

I'm interested in trying out distributed version control systems. git sounds promising, but I saw a note somewhere for the Windows port of git that says "don't use non-ASCII filenames". I can't find that now, but there is this link. It's put me off git for now, but I don't know if the other options are any better.

Support for non-ASCII filenames is essential for my Japanese company. I'm looking for one that internally stores filenames as Unicode, not a platform-dependent encoding which would cause endless grief. So:

  1. What DVCS support Unicode filenames?
  2. In both Windows and Linux?
  3. Ideally, with the possibility to transfer repositories between Windows and Linux machines with minimal issues?

7 answers
等我变得足够好
Reply #2 · 2019-02-01 08:09

According to this page: Bazaar, Codendi, CVSNT, Monotone, Perforce, Rational Team Concert, Subversion, Surround SCM, Synergy. But there are lots of 'Unknowns' on that page.

疯言疯语
Reply #3 · 2019-02-01 08:16

Bazaar works with Unicode filenames internally, and it has very good Unicode support on both Linux and Windows.

淡お忘
Reply #4 · 2019-02-01 08:26

git

August 2009:

The msysgit project is busy fixing UTF-8 support for Git on Windows. It might be fixed in the next release.


Update February 2012

UTF-8 is coming for msysgit, with commits like this one: "Update less settings for UTF-8".

From the Git for Windows Google+ page:

Karsten Blees' UTF-8 patches for Git for Windows has now been merged to 'devel'.
This means the upcoming release will support Unicode filenames!


Update April 2012

It's now released in msysgit 1.7.10.

See the page Git for Windows Unicode Support.

【Aperson】
Reply #5 · 2019-02-01 08:26

Git on Windows 1.7.10 now uses UTF-8 for filenames regardless of the user's locale.

Luminary・发光体
Reply #6 · 2019-02-01 08:28

See issue 80 in the same repository. In 2009, there was a discussion on the Git mailing list (e.g. 1, 2) where the Git maintainer Junio Hamano asked some questions regarding this. I don't have it right here. By joining the thread in a constructive manner, you might help resolve the issue.

In the Java implementation JGit, we always use UTF-8 when we create textual metadata and filenames. That is the only way, but there are some things to consider.

Ridiculous、
Reply #7 · 2019-02-01 08:32

This is a really tricky problem. The trouble arises either because tools try to interpret filenames without knowing their encoding for sure, or because they translate filenames into a form that cannot handle all cases (e.g. ASCII or UTF-16). The three main operating systems do not agree on how a filename is encoded either, which makes things even harder.
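To make the failure mode concrete, here is a minimal Python sketch. The filename and the choice of cp932 as the "Japanese code page" are my own assumptions for illustration. It shows that the same name is stored as different bytes under different encodings, and that a tool guessing the wrong encoding on read either fails outright or produces mojibake.

```python
# Illustrative only: the filename and the two encodings compared are assumptions.
name = "日本語.txt"

utf8_bytes = name.encode("utf-8")
cp932_bytes = name.encode("cp932")  # Windows code page for Japanese (Shift-JIS variant)

# The same filename becomes two different byte sequences on disk.
print(utf8_bytes != cp932_bytes)


def try_decode(raw: bytes, encoding: str) -> str:
    """Decode raw filename bytes, returning a marker string instead of raising."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return "<undecodable as %s>" % encoding


# A tool that assumes UTF-8 cannot read the cp932 name at all.
print(try_decode(cp932_bytes, "utf-8"))
# A tool that assumes cp932 does not recover the original name from UTF-8 bytes.
print(try_decode(utf8_bytes, "cp932") == name)
# Only the matching encoding round-trips.
print(try_decode(utf8_bytes, "utf-8") == name)
```

The point is that the bytes alone do not say which encoding produced them, so a tool that has to guess will sooner or later guess wrong.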

For a good understanding of the issues I suggest reading Mercurial's encoding strategy page. It describes how the various platforms vary, and why Mercurial has chosen the strategy it has.

If you really need to do this, then the most basic requirement is that ALL systems be set up to use UTF-8 filenames, and not one of the many Japanese code pages. This is easier said than done, but once it is done, no system should need to translate the filenames into anything else.

No translation, no issues.
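One rough way to see how far an existing tree is from that goal is to inspect the raw bytes of each filename and check whether they are valid UTF-8. The sketch below assumes a POSIX-style filesystem where names are byte strings; on Windows, Python works with Unicode names, so this check tells you much less there. The function names `is_utf8_name` and `non_utf8_names` are mine, not part of any tool.

```python
import os


def is_utf8_name(raw: bytes) -> bool:
    """Return True if the raw filename bytes are valid UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def non_utf8_names(root: bytes = b"."):
    """Yield paths under root whose file or directory names are not valid UTF-8.

    Passing a bytes path makes os.walk return names as raw bytes,
    so nothing is translated before we inspect it.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for raw in dirnames + filenames:
            if not is_utf8_name(raw):
                yield os.path.join(dirpath, raw)
```

Calling `list(non_utf8_names(b"/path/to/working/copy"))` (a placeholder path) lists the offending entries. Note that a name that happens to be valid UTF-8 is not proof it was written as UTF-8, so treat an empty result as a sanity check rather than a guarantee.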


*: Yes, I know you can have a default system encoding, but this is not the same as a filesystem encoding. What happens when a filesystem is accessed by multiple systems or it is physically moved between systems?
