How to keep the BOM from being removed from Perforce Unicode files

Posted 2019-05-09 14:31

Question:

I have converted an entire branch of .NET and SQL sources to UTF-8 with BOM, changing their Perforce file type to Unicode in the same operation. (The encoding difference might sound confusing, but in Perforce, the Unicode file type denotes UTF-8 file content.) Later I found out that Perforce silently eliminates the BOM from UTF-8 files. Is it possible to configure Perforce to keep the UTF-8 BOM in files of the Unicode file type? I can't find it documented.

The Perforce server is switched to Unicode mode, and the connection encoding is UTF-8 without BOM (changing it to UTF-8 with BOM doesn't make any difference).

Example:

  1. check out a source file from Perforce
  2. change file type to Unicode
  3. convert file content to format "UTF-8 with BOM"
  4. submit the file (at this point the file still has the BOM in its first 3 bytes)
  5. remove the file from the workspace
  6. get the latest revision of the file (now the file no longer has the BOM at the beginning)
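
To check steps 4 and 6 without opening the file in an editor, the first three bytes can be compared against the UTF-8 BOM (EF BB BF). This is only a minimal sketch: the function name has_utf8_bom is made up for illustration, and it assumes a POSIX-like shell with head, od and tr available.

```shell
# Hypothetical helper (not part of Perforce): succeeds if the file
# starts with the UTF-8 BOM bytes EF BB BF, fails otherwise.
has_utf8_bom() {
    # Take the first three bytes and render them as hex with no separators.
    first3=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$first3" = "efbbbf" ]
}

# Usage (the file name is a placeholder):
#   has_utf8_bom Program.cs && echo "BOM present" || echo "no BOM"
```

Running it once after the submit and once after getting the latest revision should show where the BOM disappears.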

Answer 1:

OK, Hans Passant's comment encouraged me to re-examine P4CHARSET and finally, the answer has two parts:


For Perforce command-line access, the P4CHARSET variable controls the behavior. To have the BOM added to files of the Unicode type, use the command

p4 set P4CHARSET=utf8-bom

To get these files without the BOM, use

p4 set P4CHARSET=utf8

For P4V, the Perforce Visual Client, the setting can be changed via the menu Connection > Choose Character Encoding.... Use the value Unicode (UTF-8) to have the BOM added, and Unicode (UTF-8, no BOM) to suppress it.

  • if the menu item Choose Character Encoding... is disabled, make sure of the following (and then check again)
    • P4V has an open, working connection to the server
    • the pane containing the depot/workspace tree is focused (click inside it to make sure)

Notes:

  • if you regularly use both of the above ways to access Perforce, you need to apply both settings; otherwise you will keep getting mixed results
  • if you want to add or remove the BOM in existing files right away, adjust the above settings, then remove the files from the workspace and get them again (see steps 5 and 6 of the example in the question). Other server actions that change file content (integrating, merging, etc.) will have a similar effect
  • for other encoding options and their effect on the BOM, see the second table in Internationalization Notes for P4D, the Perforce Server and Perforce client applications
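
The note about removing files from the workspace and getting them again could be scripted roughly as follows. This is a sketch, not a tested recipe: resync_with_charset is a hypothetical helper name, the depot path in the usage comment is a placeholder, and it assumes the files are not opened for edit. It relies on the p4 global option -C, which overrides P4CHARSET for a single invocation, and on syncing to revision #none to remove a file from the workspace.

```shell
# Hypothetical helper: remove each file from the workspace, then fetch it
# again so the client rewrites it under the requested charset.
resync_with_charset() {
    charset=$1    # e.g. utf8-bom (with BOM) or utf8 (without BOM)
    shift
    for f in "$@"; do
        # "#none" removes the file from the workspace (step 5 of the question);
        # the plain sync fetches the latest revision again (step 6).
        p4 -C "$charset" sync "${f}#none" &&
        p4 -C "$charset" sync "$f"
    done
}

# Usage (placeholder depot path):
#   resync_with_charset utf8-bom //depot/project/Program.cs
```

Passing the charset explicitly keeps the experiment independent of whatever P4CHARSET is currently stored by p4 set, which makes it easier to compare both variants on the same file.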