How to load a 2D array from a text (CSV) file into Octave

Posted 2019-08-17 11:05

Consider the following text (CSV) file:

1, Some text
2, More text
3, Text with comma, more text

How can I load this data into a 2D array in Octave? The number should go into the first column, and all text to the right of the first comma (including any further commas) should go into the second, text column.

If necessary, I can replace the first comma with a different delimiter character.

2 answers
仙女界的扛把子
#2 · 2019-08-17 11:19

AFAIK you cannot put strings of different lengths into an ordinary array. You need to create a so-called cell array.

A possible way to read the data from your question stored in a file Test.txt into a cell array is

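% Read the file line by line into a cell array of strings.
t1 = textread("Test.txt", "%s", "delimiter", "\n");
T = cell(numel(t1), 2);                 % preallocate the result
for i = 1:numel(t1)
    j = strfind(t1{i}, ",")(1);         % position of the first comma
    T{i,1} = t1{i}(1:j - 1);            % text before the first comma
    T{i,2} = strtrim(t1{i}(j + 1:end)); % everything after it, trimmed
end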

Now
T{3,1} gives you 3 (as a string; str2double(T{3,1}) converts it to a number) and
T{3,2} gives you Text with comma, more text.
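
If the first column should hold real numbers, as the question asks, the same idea can be sketched with regexp and str2double. This is only a sketch under a few assumptions: a reasonably recent Octave (for strsplit and fileread), every line containing at least one comma, and the sample data being written to a temporary file purely so that the example is self-contained.

```octave
% Write the sample data from the question to a temporary file.
fname = [tempname(), ".txt"];
fid = fopen(fname, "w");
fprintf(fid, "1, Some text\n2, More text\n3, Text with comma, more text\n");
fclose(fid);

% Read the whole file and split it into lines.
lines = strsplit(strtrim(fileread(fname)), "\n");

T = cell(numel(lines), 2);
for i = 1:numel(lines)
    % Two tokens: everything before the first comma, and everything after it.
    tok = regexp(lines{i}, '^([^,]*),(.*)$', 'tokens', 'once');
    T{i,1} = str2double(tok{1});   % numeric value (NaN if not a number)
    T{i,2} = strtrim(tok{2});      % rest of the line, later commas kept
end

delete(fname);
disp(T{3,1} + 1)   % arithmetic works because column 1 is numeric
disp(T{3,2})
```

Since the first column is numeric here, cell2mat(T(:,1)) should give an ordinary column vector if you need one.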

乱世女痞
#3 · 2019-08-17 11:30

After many long hours of searching and debugging, here's how I got it to work on Octave 3.2.4, using | as the delimiter instead of a comma.

The data file now looks like:

1|Some text
2|More text
3|Text with comma, more text

Here's how to call it: data = load_data('data/data_file.csv', NUMBER_OF_LINES);

Limitation: You need to know how many lines you want to get. If you want to get all, then you will need to write a function to count the number of lines in the file in order to initialize the cell_array. It's all very clunky and primitive. So much for "high level languages like Octave".

Note: After the unpleasant exercise of getting this to work, it seems that Octave is not very useful unless you enjoy wasting your time writing code to do the simplest things. Better choices seem to be R, Python, or C#/Java with a machine learning or matrix library.

function all_messages = load_data(filename, NUMBER_OF_LINES)
  fid = fopen(filename, "r");
  if fid < 0
    error("load_data: cannot open %s", filename);
  end

  all_messages = cell(NUMBER_OF_LINES, 2);
  counter = 1;

  line = fgetl(fid);

  % fgetl returns the number -1 at end of file, so test the type instead of
  % comparing a string with -1 (that comparison stops early on an empty line).
  while ischar(line)
      separator_index = index(line, '|');                        % First '|'
      all_messages{counter, 1} = line(1:separator_index - 1);    % Up to the separator
      all_messages{counter, 2} = line(separator_index + 1:end);  % After the separator
      counter++;

      line = fgetl(fid);
  endwhile

  fprintf("Processed %i lines.\n", counter - 1);
  fclose(fid);
end
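
The line-count limitation mentioned above may not be strictly necessary, because a cell array can grow one row at a time. The following is a minimal sketch of that idea, not a tested replacement for the function above: load_data_all is a hypothetical name, the temporary file exists only to make the example self-contained, and growing the array row by row is assumed to be fast enough for small files.

```octave
% Sample data in the '|'-delimited format shown above, written to a temp file.
fname = [tempname(), ".csv"];
fid = fopen(fname, "w");
fprintf(fid, "1|Some text\n2|More text\n3|Text with comma, more text\n");
fclose(fid);

% Hypothetical variant of load_data that needs no line count.
function all_messages = load_data_all(filename)
  fid = fopen(filename, "r");
  if fid < 0
    error("load_data_all: cannot open %s", filename);
  end

  all_messages = cell(0, 2);   % start empty; one row is appended per line
  line = fgetl(fid);
  while ischar(line)           % fgetl returns -1 (a number) at end of file
    k = index(line, "|");      % position of the first separator
    all_messages(end + 1, :) = {line(1:k - 1), line(k + 1:end)};
    line = fgetl(fid);
  end
  fclose(fid);
end

data = load_data_all(fname);
delete(fname);
printf("Loaded %d rows.\n", rows(data));
```

For large files, preallocating is still likely to be faster than appending, so counting the lines first, as described above, remains a reasonable choice.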