Using cut in bash on a file with a unique delimiter

Posted 2019-04-10 13:52

Can cut be used in bash with the ¬ delimiter?

This question is an extension of the topic covered here. One interpretation of the goal in that link is to use a delimiter that cannot be found (or is very rarely found) in human text. Say we choose the 'Not Sign' (¬) as a delimiter. My question concerns using cut to pull specific columns out of a file that uses this delimiter.

For example, say that we create a file with the ¬ delimiter. The file prac.txt might look like:

$ cat prac.txt
"Billy"¬"Car"¬"Red"¬"Garage"¬"3"
"Rob"¬"Truck"¬"Blue"¬"Street"¬"14"

The following command produces an error:

$ cut -d'¬' -f1 prac.txt
cut: the delimiter must be a single character
Try `cut --help' for more information.

The correct output would be:

"Billy"
"Rob"

Possibly useful info from python:

>>> import unicodedata
>>> unicodedata.lookup('Not sign')
u'\xac'

Possibly useful character conversion link.

My guess is that the -d flag needs some representation of '¬' that I have not tried yet, or else it only works with single ASCII characters. Thanks in advance for any help.

1 answer
对你真心纯属浪费
#2 · 2019-04-10 14:14

In UTF-8, the "not sign" is encoded as two bytes, c2 ac, and cut does not handle multi-byte delimiters, which is arguably a bug. See this discussion on unix.stackexchange.
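
Until cut accepts multi-byte delimiters, one possible workaround is to let another tool do the splitting. The sketch below is not taken from the linked discussion. It assumes the script and the data are both UTF-8, recreates the two sample rows in a temporary file, and uses variable names (tmpdir, first_col, first_col_cut) made up for illustration. It first confirms the two-byte encoding, then extracts the first column with awk, and finally shows the alternative of translating ¬ to a tab so that cut can be used with its default delimiter.

```shell
#!/bin/sh
# Show the bytes behind the delimiter: in a UTF-8 script this prints c2ac.
printf '%s' '¬' | od -An -tx1 | tr -d ' \n'
echo

# Recreate the sample rows from the question, with ¬ between every field.
tmpdir=$(mktemp -d)
printf '%s\n' \
  '"Billy"¬"Car"¬"Red"¬"Garage"¬"3"' \
  '"Rob"¬"Truck"¬"Blue"¬"Street"¬"14"' > "$tmpdir/prac.txt"

# Option 1: awk accepts a field separator longer than one byte.
first_col=$(awk -F'¬' '{ print $1 }' "$tmpdir/prac.txt")
printf '%s\n' "$first_col"

# Option 2: replace ¬ with a tab (assumes the data contains no tabs),
# then use cut, whose default delimiter is the tab character.
tab=$(printf '\t')
first_col_cut=$(sed "s/¬/$tab/g" "$tmpdir/prac.txt" | cut -f1)
printf '%s\n' "$first_col_cut"

rm -rf "$tmpdir"
```

Both options print "Billy" and "Rob" (quotes included) for the sample file. The awk form is probably the simpler choice, since it needs no intermediate delimiter that might collide with the data.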
