I have converted an entire branch of .NET and SQL sources to UTF-8 with BOM and changed their Perforce file type to Unicode in the same operation. (The naming might sound confusing, but in Perforce the Unicode file type denotes UTF-8 file content.) Later I found out that Perforce silently removes the BOM from UTF-8 files. Is it possible to configure Perforce to keep the UTF-8 BOM in files of the Unicode file type? I can't find it documented.
The Perforce server is switched to Unicode mode, and the connection encoding is UTF-8 without BOM (changing it to UTF-8 with BOM doesn't make any difference).
Example:
- check out a source file from Perforce
- change file type to Unicode
- convert file content to format "UTF-8 with BOM"
- submit the file (at this point the file still has the BOM in its first 3 bytes)
- remove the file from the workspace
- get the latest revision of the file (now the file no longer has the BOM at the beginning)
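To verify the steps above without a hex editor, a small check of the first three bytes is enough. This is a minimal sketch, not part of Perforce: `has_utf8_bom` and `write_sample` are hypothetical helper names, and the demo uses throwaway temporary files rather than real workspace files.

```python
import codecs
import os
import tempfile


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM bytes EF BB BF."""
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def write_sample(data):
    """Hypothetical demo helper: write bytes to a temp file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".cs")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


# One sample file with a BOM, one without.
with_bom = write_sample(codecs.BOM_UTF8 + b"// sample\n")
without_bom = write_sample(b"// sample\n")

results = (has_utf8_bom(with_bom), has_utf8_bom(without_bom))
print(results)

# Clean up the throwaway files.
os.remove(with_bom)
os.remove(without_bom)
```

Running `has_utf8_bom` on the workspace file after the submit and again after the re-sync shows whether the BOM survived the round trip.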
OK, Hans Passant's comment encouraged me to re-examine P4CHARSET, and in the end the answer has two parts:
For Perforce command-line access, the setting of the P4CHARSET variable controls the behavior. To enable adding the BOM to files of the Unicode type, use the command
p4 set P4CHARSET=utf8-bom
To have these files without the BOM, use
p4 set P4CHARSET=utf8
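The visible difference between the two settings is just three bytes at the start of each synced file. The sketch below illustrates that difference with Python's own codecs as an analogy; it is an assumption that `utf-8-sig` and `utf-8` mirror the on-disk result of `utf8-bom` and `utf8`, and it says nothing about how Perforce implements the translation internally.

```python
import codecs

text = "SELECT 1;\n"

# Python's "utf-8-sig" codec prepends the BOM when encoding,
# which matches the on-disk result expected with P4CHARSET=utf8-bom.
with_bom = text.encode("utf-8-sig")

# Plain "utf-8" writes no BOM, matching P4CHARSET=utf8.
without_bom = text.encode("utf-8")

print(with_bom[:3].hex())     # the BOM bytes
print(without_bom[:3].hex())  # the first bytes of the text itself
print(with_bom == codecs.BOM_UTF8 + without_bom)
```

Apart from the leading EF BB BF, the content is identical in both cases.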
For P4V, the Perforce Visual Client, the setting can be changed via the menu Connection > Choose Character Encoding... Use the value Unicode (UTF-8) to enable adding the BOM and Unicode (UTF-8, no BOM) to suppress it.
- if the menu item Choose Character Encoding... is disabled, ensure the following (and then check again):
  - P4V has a connection to the server open and working
  - the pane containing the depot/workspace tree is focused (click inside it to make sure)
Notes:
- if you use both of the above ways to access Perforce, you need to apply both solutions; otherwise you will keep getting mixed results
- if you want to add or remove the BOM in existing files right away, adjust the above settings, then remove the files from the workspace and get them again (see steps 5 and 6 of the example posted in the question). Other server actions that change the content of files (integrating, merging etc.) will have a similar effect
- for other encoding options and their impact on the BOM, see the second table in Internationalization Notes for P4D, the Perforce Server and Perforce client applications