What text format can I use to present data origina

Posted 2019-08-06 02:50

Question:

I have an Excel spreadsheet that has many people's estimates of another person's height and weight. In addition, some people have left comments on both estimate cells like "This estimate takes into account such and such".

I want to take the data from the spreadsheet (I've already figured out how to parse it), and represent it in a plain text file such that I can easily parse it back into a structured format (using Perl, ideally).

Originally I thought to use YAML:

Tom:
  Height:
    Estimate: 5
    Comment: Not that confident
  Weight:
    Estimate: 7
    Comment: Very confident
Natalia: ...

But now I'm thinking this is a bit difficult to read, and I was wondering if there is some textual tabular representation that would be easier to read and still parsable.

Something like:

PERSON      HEIGHT     WEIGHT
-----------------------------
Tom         5          7
___START_HEIGHT_COMMENT___
    We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness.  That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed [...]  
Wait, what's this project about again?
___END_HEIGHT_COMMENT___
___START_WEIGHT_COMMENT___
    We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness.  That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed [...]  
Wait, what's this project about again?
___END_WEIGHT_COMMENT___

Natalia     2          4
John        3          3

Is there a better way to do this?

Answer 1:

CSV (Comma Separated Values).

Excel can save directly to this format and open it again directly. It is also human-readable and easy to parse by machine.



Answer 2:

Normally if I want to capture data from a spreadsheet in textual form I use CSV (which Excel can read and write). It's easy to generate and parse as well as being compatible with many other tools but it doesn't rank high on the "human readable" chart. It can be read but it's awkward for anything but simple files with equal field widths.

XML is an option, but YAML is easier to read; being human-readable is one of YAML's design goals. YAML::Tiny is a nice, lightweight module for typical cases.

It looks like what you have in mind is a plain text table, or possibly a tabular format with fixed-width columns. There are some modules on CPAN that might be useful: Text::Table, Text::SimpleTable, and others. These modules can generate a representation that's easy to read, but parsing it back will be harder. (They're intended for data presentation, not storage and retrieval.) You'd probably have to build your own parser.
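To give a feel for what "build your own parser" might involve, here is a minimal sketch that reads the numeric part of the table from the question. It is written in Python purely for illustration (the question prefers Perl), and the names `TABLE` and `parse_fixed_width` are hypothetical. It assumes every value starts in the same column as its header and that the second line is the dashed rule. It does not handle the comment blocks.

```python
import re

# Sample table copied from the question, without the comment blocks.
TABLE = """\
PERSON      HEIGHT     WEIGHT
-----------------------------
Tom         5          7
Natalia     2          4
John        3          3
"""

def parse_fixed_width(text):
    """Parse a fixed-width table whose columns are aligned with the header words."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0]
    # A column starts wherever a header word starts and runs to the next start.
    starts = [m.start() for m in re.finditer(r"\S+", header)]
    ends = starts[1:] + [None]
    names = [header[s:e].strip().lower() for s, e in zip(starts, ends)]
    records = []
    for line in lines[2:]:  # skip the header and the dashed rule (assumed on line 2)
        values = [line[s:e].strip() for s, e in zip(starts, ends)]
        records.append(dict(zip(names, values)))
    return records

records = parse_fixed_width(TABLE)
for record in records:
    print(record)
```

Even this small case depends on exact column alignment, which is easy to break when editing the file by hand. That fragility is the main cost of a presentation-oriented format.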



Answer 3:

Adding to Robert's answer, you can simply put the comments in additional columns (commas will be escaped by the CSV output filter of Excel etc). More on CSV format: www.csvreader.com/csv_format.php
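As a rough illustration of comments living in extra columns, the sketch below round-trips two rows through CSV using Python's standard `csv` module (Python is used only for illustration; the question prefers Perl). The column names and sample comments are assumptions, not taken from the real spreadsheet. One comment contains a comma and another contains a line break, to show that quoting keeps both intact.

```python
import csv
import io

# Hypothetical column layout: one comment column next to each estimate.
FIELDS = ["person", "height", "height_comment", "weight", "weight_comment"]

# Hypothetical sample rows; all values are strings, as CSV has no types.
rows = [
    {"person": "Tom", "height": "5",
     "height_comment": "Not that confident, to be honest",
     "weight": "7",
     "weight_comment": "Very confident.\nMeasured twice."},
    {"person": "Natalia", "height": "2", "height_comment": "",
     "weight": "4", "weight_comment": ""},
]

def to_csv(records):
    """Serialize records to CSV text; fields with commas or newlines get quoted."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

def from_csv(text):
    """Parse CSV text back into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text, newline="")))

text = to_csv(rows)
parsed = from_csv(text)
print(text)
```

The trade-off is readability: a long, multi-line comment makes the raw file harder to scan by eye, even though it parses back without loss.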



Answer 4:

No reason you can't use XML, though I'd imagine it's overkill in this particular case.



Answer 5:

There's also Config::General for simple data, and its family of related classes.