Apache Arrow - reading a CSV file

Posted 2019-09-21 11:15

Question:

Hi all, I'm working with Apache Arrow now.

When reading a CSV file with the arrow::csv::TableReader::Read function, I want the file to be read as having no header.

However, it treats the first row as the CSV header (the field names). Is there an option to read a CSV file with no header?

Thanks

Answer 1:

Check out ParseOptions, in particular its header_rows member (default shown):

int32_t arrow::csv::ParseOptions::header_rows = 1

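A ParseOptions instance can be passed to TableReader::Make(...). In the signature below it is the fourth argument, after the memory pool, the input stream and the ReadOptions.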

static Status Make(MemoryPool* pool, std::shared_ptr<io::InputStream> input, const ReadOptions&, const ParseOptions&, const ConvertOptions&, std::shared_ptr<TableReader>* out)

Check the documentation: https://arrow.apache.org/docs/cpp/namespacearrow_1_1csv.html

and these test files: https://github.com/apache/arrow/tree/3cf8f355e1268dd8761b99719ab09cc20d372185/cpp/src/arrow/csv



Answer 2:

You can't at the moment. You'll get an error if header_rows == 0:

if (parse_options_.header_rows == 0) {
    // TODO allow passing names and/or generate column numbers?
    return Status::Invalid("header_rows == 0 needs explicit column names");
}

(https://github.com/apache/arrow/blob/3cf8f355e1268dd8761b99719ab09cc20d372185/cpp/src/arrow/csv/reader.cc)
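
Given the check quoted above, one possible workaround is to prepend a generated header line to the CSV text before handing it to the reader, so that header_rows can stay at 1. This is only a sketch and my own assumption, not something taken from the Arrow documentation. The helper names (MakeSyntheticHeader, CountColumnsInFirstLine, PrependSyntheticHeader) and the "f0,f1,..." naming scheme are hypothetical. It uses only the standard library, and it counts columns naively, so it will miscount if the first row contains quoted delimiters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Arrow): build a header line such as
// "f0,f1,f2" for the given number of columns.
std::string MakeSyntheticHeader(std::size_t num_columns, char delimiter = ',') {
  std::string header;
  for (std::size_t i = 0; i < num_columns; ++i) {
    if (i > 0) header += delimiter;
    header += "f" + std::to_string(i);
  }
  return header;
}

// Hypothetical helper: count the fields in the first line by counting
// delimiters. Naive on purpose: quoted fields are not handled.
std::size_t CountColumnsInFirstLine(const std::string& csv, char delimiter = ',') {
  // substr(0, npos) yields the whole string when there is no newline.
  std::string first = csv.substr(0, csv.find('\n'));
  if (!first.empty() && first.back() == '\r') first.pop_back();  // CRLF input
  if (first.empty()) return 0;
  std::size_t count = 1;
  for (char c : first) {
    if (c == delimiter) ++count;
  }
  return count;
}

// Return a copy of the CSV text with a generated header row in front.
// Input whose first line is empty is returned unchanged.
std::string PrependSyntheticHeader(const std::string& csv, char delimiter = ',') {
  const std::size_t num_columns = CountColumnsInFirstLine(csv, delimiter);
  if (num_columns == 0) return csv;
  return MakeSyntheticHeader(num_columns, delimiter) + "\n" + csv;
}
```

The string returned by PrependSyntheticHeader(data) could then be wrapped in an in-memory input stream and passed to TableReader::Make with the default header_rows of 1. The whole file has to be in memory for this, so it is probably not suitable for very large inputs.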