SQLite3 Import CSV & exclude/skip header

2020-06-01 03:18发布

问题:

I'm trying to get my data files (of which there are a dozen or so) into tables within SQLite. Each file has a header and I'll be receiving them a few times over the coming year so I'd like to:

  1. Avoid editing each file to remove the header when I receive them;
  2. Avoid falling back on shell scripts or Python to do this.

I define my table and import data...

> .separator "\t"
> .headers on
> CREATE TABLE clinical(
       patid      VARCHAR(20),
       eventdate  CHAR(10),
       sysdate    CHAR(10),
       constype   INT,
       consid     INT,
       medcode    INT,
       staffid    VARCHAR(20),
       textid     INT,
       episode    INT,
       enttype    INT,
       adid           INT);
> .import "Sample_Clinical001.txt" clinical
> SELECT * FROM clinical LIMIT 10;
patid   eventdate   sysdate constype    consid  medcode staffid textid  episode enttype adid
patid   eventdate   sysdate constype    consid  medcode staffid textid  episode enttype adid
471001  30/01/1997  09/03/1997  4   68093   180 0   0   0   20  11484
471001  30/01/1997  09/03/1997  2   68093   60  0   0   0   4   11485

My first thought was to DELETE the offending row, but that didn't work as expected, instead it deleted the whole table...

> DELETE FROM clinical WHERE patid = "patid";
> SELECT * FROM clinical LIMIT 3;
>

Did I get the syntax for testing equality wrong? I'm not sure; the docs don't seem to distinguish between the two. I thought I'd try again ...

> .import "Sample_Clinical001.txt" clinical
> SELECT * FROM clinical LIMIT 3;
patid   eventdate   sysdate constype    consid  medcode staffid textid  episode enttype adid
patid   eventdate   sysdate constype    consid  medcode staffid textid  episode enttype adid
471001  30/01/1997  09/03/1997  4   68093   180 0   0   0   20  11484
471001  30/01/1997  09/03/1997  2   68093   60  0   0   0   4   11485
> DELETE FROM clinical WHERE patid == "patid";
> SELECT * FROM clinical LIMIT 3;
> 

Am I even on the correct track here or am I doing something stupid?

I would have expected there to be an easy option to skip the header row when calling .import as having header rows in text files is a fairly common situation.

回答1:

patid is a column name.
"patid" is a quoted column name.
'patid' is a string.

The condition WHERE patid = "patid" compares the value in the patid column with itself.

(SQLite allows strings with double quotes for compatibility with MySQL, but only where a string cannot be confused with a table/column name.)



回答2:

With the sqlite3 shell's .import command, if the first character of a quote-enclosed filename is a |, the rest of the filename is instead treated as a shell command that is executed to produce the data to be imported. So, what I do in this situation is:

sqlite> .import '| tail -n +2 cars.csv' car

The tail invocation will print all but the first line of the file.



回答3:

import the csv to a new table and copy the new table's data to original target table, will that work?