Convert multiple DBs to CSV

Posted 2019-02-20 19:16

Question:

I have thousands of .db files that need to be converted to CSV files. For a single file, this can be done with a simple script or batch file, e.g.:

.open "Test.db"
.mode csv
.headers on

I need the script to open the other .db files as well, which all have different names. Is there a way to do this without writing the above script for each .db file?

Answer 1:

The sqlite3 command-line shell accepts some settings as command-line arguments, so you can simply run a SELECT * on the table in each DB file:

for %%a in (*.db) do sqlite3 -csv -header "%%a" "select * from TableName" > "%%~na.csv"

(When this is run directly from the command line rather than from a batch file, replace %% with %.)
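
If a batch file is not an option, roughly the same loop can be sketched in Python with the standard sqlite3 and csv modules. This is only a minimal sketch, not part of the original answer: the function names export_table and export_all are made up for illustration, and it assumes every database contains a table with the name you pass in (TableName in the command above).

```python
import csv
import sqlite3
from pathlib import Path


def export_table(db_path, table, out_dir):
    """Write one table of one database to <db stem>.csv and return that path."""
    out_path = Path(out_dir) / (Path(db_path).stem + ".csv")
    con = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as parameters, so quote the identifier.
        cur = con.execute('SELECT * FROM "{}"'.format(table.replace('"', '""')))
        with open(out_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow([col[0] for col in cur.description])  # header row
            writer.writerows(cur)
    finally:
        con.close()
    return out_path


def export_all(folder, table):
    """Export `table` from every *.db file in `folder`, one CSV per database."""
    folder = Path(folder)
    return [export_table(p, table, folder) for p in sorted(folder.glob("*.db"))]
```

Calling export_all(".", "TableName") would then write Test.csv next to Test.db, and so on for every other .db file in the current directory.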



Answer 2:

I made a script called 'sqlite2csv' that batch-converts all SQLite database files in the current directory to CSV. It writes each table of each database to its own CSV file, so 10 files with 3 tables each yield 30 CSV files. I hope it helps, at least as a starting point for your own script.

#!/bin/bash

# USAGE EXAMPLES :
# sqlite2csv
# - Will loop all sqlite files in the current directory, take the tables of
#   each of these sqlite files, and generate a CSV file per table.
#   E.g. If there are 10 sqlite files with 3 tables each, it will generate
#        30 CSV output files, each containing the data of one table.
#   Each generated CSV file is named after the original sqlite file,
#   with the name of the table prepended.

# check for dependencies
if ! type "sqlite3" > /dev/null; then
    echo "[ERROR] SQLite binary not found."
    exit 1
fi

# define list of string tokens that an SQLite file type should contain
# the footprint for SQLite 3 is "SQLite 3.x database"
declare -a list_sqlite_tok
list_sqlite_tok+=( "SQLite" )
#list_sqlite_tok+=( "3.x" )
list_sqlite_tok+=( "database" )

# get a list of only files in current path
list_files=( $(find . -maxdepth 1 -type f) )

# loop the list of files
for f in ${!list_files[@]}; do
    # get current file
    curr_fname=${list_files[$f]}
    # get file type result
    curr_ftype=$(file -e apptype -e ascii -e encoding -e tokens -e cdf -e compress -e elf -e tar "$curr_fname")
    # loop through necessary token and if one is not found then skip this file
    curr_isqlite=0
    for t in ${!list_sqlite_tok[@]}; do
        curr_tok=${list_sqlite_tok[$t]}
        # check if 'curr_ftype' contains 'curr_tok'
        if [[ $curr_ftype =~ $curr_tok ]]; then
            curr_isqlite=1
        else
            curr_isqlite=0
            break
        fi
    done
    # test if curr file was sqlite
    if (( ! $curr_isqlite )); then
        # if not, skip the rest of this iteration and move on to the next file
        continue
    fi
    # print sqlite filename
    echo "[INFO] Found SQLite file $curr_fname, exporting tables..."
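    # get tables and views of sqlite file, one name per line
    # (".tables" prints several names per line in padded columns, which is hard to split)
    curr_query="select name from sqlite_master where type in ('table','view') and name not like 'sqlite_%' order by name;"
    # read the names into an array (mapfile requires bash 4 or later)
    mapfile -t list_tables < <(sqlite3 "$curr_fname" "$curr_query")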
    # loop array to export each table
    for t in ${!list_tables[@]}; do
        curr_table=${list_tables[$t]}
        # strip unsafe characters as well as newline
        curr_table=$(tr '\n' ' ' <<< $curr_table)
        curr_table=$(sed -e 's/[^A-Za-z0-9._-]//g' <<< $curr_table) 
        # temporarily strip './' from filename
        curr_fname=${curr_fname//.\//}
        # build target CSV filename
        printf -v curr_csvfname "%s_%s.csv" "$curr_table" "$curr_fname"
        # put back './' to filenames
        curr_fname="./$curr_fname"
        curr_csvfname="./$curr_csvfname"
        # export current table to target CSV file
        sqlite3 -header -csv "$curr_fname" "select * from $curr_table;" > "$curr_csvfname"
        # log
        echo "[INFO] Exported table $curr_table in file $curr_csvfname"
    done
done
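
For anyone who would rather not depend on bash and the file utility, the same idea (detect SQLite files, then write one CSV per table) could be sketched in Python. This is a hedged sketch rather than a port of the script above: the names is_sqlite, export_all_tables and convert_folder are hypothetical, and detection relies on the 16-byte header string "SQLite format 3\0" that SQLite 3 database files start with instead of on the output of file.

```python
import csv
import sqlite3
from pathlib import Path

SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of an SQLite 3 file


def is_sqlite(path):
    """Return True if the file starts with the SQLite 3 header string."""
    with open(path, "rb") as fh:
        return fh.read(16) == SQLITE_MAGIC


def export_all_tables(db_path):
    """Write each table or view of db_path to <table>_<db file name>.csv."""
    db_path = Path(db_path)
    written = []
    con = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name")]
        for name in names:
            # keep only letters, digits, '.', '_' and '-' for the file name
            safe = "".join(c for c in name if c.isalnum() or c in "._-")
            out_path = db_path.with_name("{}_{}.csv".format(safe, db_path.name))
            cur = con.execute('SELECT * FROM "{}"'.format(name.replace('"', '""')))
            with open(out_path, "w", newline="", encoding="utf-8") as fh:
                writer = csv.writer(fh)
                writer.writerow([col[0] for col in cur.description])  # header row
                writer.writerows(cur)
            written.append(out_path)
    finally:
        con.close()
    return written


def convert_folder(folder):
    """Export every table of every SQLite file found directly in `folder`."""
    written = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and is_sqlite(path):
            written.extend(export_all_tables(path))
    return written
```

With a file x.db containing tables t1 and t2, convert_folder(".") would write t1_x.db.csv and t2_x.db.csv, following the naming scheme of the shell script.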


Answer 3:

I prepared a short Python script that writes a single CSV file from multiple SQLite databases.

python multiple_sqlite_files_tocsv.py -d <inputFolder> -e <extension> -t <tableName>

This writes the data to output.csv.

A Jupyter notebook and the Python script are on GitHub:

https://github.com/darshanz/CombineMultipleSqliteToCsv
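
To give an idea of what such a combining script involves, here is a minimal sketch. It is not the code from the linked repository: the function name combine_to_csv and its parameters are hypothetical, chosen to mirror the folder, extension and table arguments of the command line above, and it assumes the table has the same columns in every database.

```python
import csv
import sqlite3
from pathlib import Path


def combine_to_csv(folder, extension, table, out_path):
    """Append `table` from every matching database to one CSV; return the row count."""
    total = 0
    header_written = False
    pattern = "*." + extension.lstrip(".")
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        for db_path in sorted(Path(folder).glob(pattern)):
            con = sqlite3.connect(db_path)
            try:
                cur = con.execute(
                    'SELECT * FROM "{}"'.format(table.replace('"', '""')))
                if not header_written:
                    # take the header from the first database only
                    writer.writerow([col[0] for col in cur.description])
                    header_written = True
                for row in cur:
                    writer.writerow(row)
                    total += 1
            finally:
                con.close()
    return total
```

For example, combine_to_csv("data", "db", "readings", "output.csv") would stack the readings table of every .db file in the data folder under a single header row; the folder and table names here are placeholders.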