Is there a way to give the PTBTokenizer a set of delimiter characters on which to split a token?
I was testing the behaviour of this tokenizer and realized that for some characters, such as the vertical bar '|', the tokenizer divides a substring into two tokens, while for others, such as the slash or the hyphen, it returns a single token.
No, there isn't any simple way to do this with the PTBTokenizer. You can do some pre-processing and post-processing to get what you want, though there are two concerns worth mentioning:
(There is a similar question on customizing apostrophe tokenization behavior: Stanford coreNLP - split words ignoring apostrophe.)
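As a rough illustration of the post-processing route, here is a minimal sketch that takes the token strings a tokenizer has already produced and splits them again on a caller-supplied set of delimiter characters. The class `DelimiterSplitter` and the method `splitTokens` are hypothetical names made up for this example and are not part of Stanford CoreNLP. It uses only the Java standard library and assumes you have already copied the tokenizer's output into a `List<String>`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Stanford CoreNLP: re-splits tokens that a
// tokenizer has already produced on a custom set of delimiter characters.
final class DelimiterSplitter {

    // tokens:         token strings taken from the tokenizer output
    // delimiters:     every character in this string acts as a split point
    // keepDelimiters: if true, each delimiter is emitted as its own token
    static List<String> splitTokens(List<String> tokens, String delimiters, boolean keepDelimiters) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            StringBuilder current = new StringBuilder();
            for (int i = 0; i < token.length(); i++) {
                char c = token.charAt(i);
                if (delimiters.indexOf(c) >= 0) {
                    // Flush the piece collected so far, if any.
                    if (current.length() > 0) {
                        out.add(current.toString());
                        current.setLength(0);
                    }
                    if (keepDelimiters) {
                        out.add(String.valueOf(c));
                    }
                } else {
                    current.append(c);
                }
            }
            // Flush the trailing piece of this token.
            if (current.length() > 0) {
                out.add(current.toString());
            }
        }
        return out;
    }
}
```

For example, calling `DelimiterSplitter.splitTokens(tokens, "-/", true)` on the tokens `state-of-the-art` and `and/or` would split them at every hyphen and slash and keep those characters as separate tokens. Note that this sketch works on plain strings only, so any character offsets or other annotations attached to the original tokens are not carried over to the new pieces.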