Delimit array with different strings

Posted 2019-08-20 09:56

I have a text file that contains 3 columns of useful data that I would like to extract in Python using NumPy. The file has a *.nc extension but is NOT a netCDF4 file. It is a standard output file type for CNC machines. In my case it comes from a sort of CMM (coordinate measuring machine). The format goes something like this:

X0.8523542Y0.0000000Z0.5312869

X, Y, and Z are the coordinate axes on the machine. My question is: can I delimit an array with multiple delimiters? In this case: "X", "Y", and "Z".

3 answers
何必那么认真
#2 · 2019-08-20 10:32

You can use pandas:

import pandas as pd
from io import StringIO

#Create a mock file
ncfile = StringIO("""X0.8523542Y0.0000000Z0.5312869
X0.7523542Y1.0000000Z0.5312869
X0.6523542Y2.0000000Z0.5312869
X0.5523542Y3.0000000Z0.5312869""")

df = pd.read_csv(ncfile, header=None)

#Use regex with split to define delimiters as X, Y, Z.
df_out = df[0].str.split(r'X|Y|Z', expand=True)

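# set_axis returns a new DataFrame, so assign the result
# (recent pandas versions no longer accept the inplace argument here)
df_out = df_out.set_axis(['index','X','Y','Z'], axis=1)
print(df_out)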

Output:

  index          X          Y          Z
0        0.8523542  0.0000000  0.5312869
1        0.7523542  1.0000000  0.5312869
2        0.6523542  2.0000000  0.5312869
3        0.5523542  3.0000000  0.5312869
聊天终结者
#3 · 2019-08-20 10:32

I ended up using the Pandas solution provided by Scott. For some reason I am not 100% clear on, I cannot simply convert the array from string to float with float(array). I created an array of equal size and iterated over the size of the array, converting each individual element to a float and saving it to the other array.

Thanks all
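
On the conversion problem mentioned above: float() accepts a single value, not a whole array, which is why float(array) fails. A minimal sketch of an element-wise conversion without an explicit loop, assuming the split columns are still strings (df_out below is a made-up stand-in, not the asker's data):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the string columns produced by str.split
df_out = pd.DataFrame({
    'X': ['0.8523542', '0.7523542'],
    'Y': ['0.0000000', '1.0000000'],
    'Z': ['0.5312869', '0.5312869'],
})

# astype converts every element in one call; to_numpy gives a float array
coords = df_out[['X', 'Y', 'Z']].astype(float).to_numpy()

# The same method works on a plain NumPy array of strings
strings = np.array(['0.8523542', '0.0000000', '0.5312869'])
floats = strings.astype(float)
```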

太酷不给撩
#4 · 2019-08-20 10:47

Using the filter function that I suggested in a comment:

String sample (stand-in for file):

In [1]: txt = '''X0.8523542Y0.0000000Z0.5312869
   ...: X0.8523542Y0.0000000Z0.5312869
   ...: X0.8523542Y0.0000000Z0.5312869
   ...: X0.8523542Y0.0000000Z0.5312869'''

Basic genfromtxt use - getting strings:

In [3]: np.genfromtxt(txt.splitlines(), dtype=None,encoding=None)
Out[3]: 
array(['X0.8523542Y0.0000000Z0.5312869', 'X0.8523542Y0.0000000Z0.5312869',
       'X0.8523542Y0.0000000Z0.5312869', 'X0.8523542Y0.0000000Z0.5312869'],
      dtype='<U30')

This array of strings could be split in the same spirit as the pandas answer.
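
One way that split might look, sketched here with the standard re module rather than pandas. The array arr is a shortened stand-in for the genfromtxt result, and the names rows and coords are only illustrative:

```python
import re
import numpy as np

# Stand-in for the array of strings returned by genfromtxt
arr = np.array(['X0.8523542Y0.0000000Z0.5312869',
                'X0.7523542Y1.0000000Z0.5312869'])

# Splitting on the axis letters yields ['', x, y, z];
# the [1:] slice drops the empty field before the leading X
rows = [re.split(r'[XYZ]', s)[1:] for s in arr]

# Convert the nested list of numeric strings to a float array
coords = np.array(rows, dtype=float)
```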

Define a function to replace the delimiter characters in a line:

In [6]: def foo(aline):
   ...:     return aline.replace('X','').replace('Y',',').replace('Z',',')

The re module could be used for a prettier split.

Test it:

In [7]: foo('X0.8523542Y0.0000000Z0.5312869')
Out[7]: '0.8523542,0.0000000,0.5312869'

Use it in genfromtxt:

In [9]: np.genfromtxt((foo(aline) for aline in txt.splitlines()), dtype=float,delimiter=',')
Out[9]: 
array([[0.8523542, 0.       , 0.5312869],
       [0.8523542, 0.       , 0.5312869],
       [0.8523542, 0.       , 0.5312869],
       [0.8523542, 0.       , 0.5312869]])

With a file instead, the generator would be something like:

(foo(aline) for aline in open(afile))
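
A fuller sketch of the file case, assuming it is acceptable to open the file in a with block so it gets closed afterwards. The file name points.nc and the temporary directory are made up for the example:

```python
import os
import tempfile
import numpy as np

def foo(aline):
    # Drop the leading X and turn Y and Z into commas
    return aline.replace('X', '').replace('Y', ',').replace('Z', ',')

# Write a small sample file; the name points.nc is hypothetical
path = os.path.join(tempfile.mkdtemp(), 'points.nc')
with open(path, 'w') as f:
    f.write('X0.8523542Y0.0000000Z0.5312869\n'
            'X0.7523542Y1.0000000Z0.5312869\n')

# Feed the cleaned lines to genfromtxt while the file is open
with open(path) as f:
    data = np.genfromtxt((foo(aline) for aline in f),
                         dtype=float, delimiter=',')
```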