I have been searching, but so far I have only found how to insert data into tables based on CSV files.
I have the following scenario:
Directory name = ticketID
Inside this directory I have a couple of files, like:
Description.txt
Summary.txt - contains the ticket header and has been imported successfully.
Progress_#.txt - every time a ticket gets updated, I get a new file.
Solution.txt

Importing the Issue.txt was easy, since it was actually a CSV.
Now my problem is with the Description and Progress files. I need to update the existing rows with the data from these files, something along the lines of:
update table_ticket set table_ticket.description = Description.txt where ticket_number = directoryname
I'm using PostgreSQL, and the COPY command is only valid for new data; it would also still fail due to the ',;/ special characters.
I wanted to do this using a bash script, but it seems it won't be possible:
for i in `find . -type d`
do
    update table_ticket
    set table_ticket.description = $i/Description.txt
    where ticket_number = $i
done
Of course, the above code would also have to take care of the connection to the database.
Does anyone have an idea of how I could achieve this using a shell script? Or would it be better to just write something in Java that reads the files and updates the records? I would like to avoid that approach, though.
Thanks, Alex
Thanks for the answer, but I came across this:
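psql's \set meta-command can slurp a file's contents into a variable via backtick expansion, and that variable can then be interpolated as a safely quoted literal. A sketch of the idea (the variable name content and the ticket number are placeholders):

\set content `cat Description.txt`
update table_ticket set description = :'content' where ticket_number = '1234';

The :'content' reference makes psql quote the variable's value as a string literal, so the ',;/ special characters in the file no longer break the statement.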
Putting this into a simple script, I created the following:
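A sketch of such a script, assuming the ticket directories sit under the current directory and the database is named tickets (adjust names as needed):

#!/bin/bash
# Each directory name is a ticket number; load its Description.txt into
# the matching row. Assumes a database named "tickets" (a placeholder).
for i in `find . -mindepth 1 -maxdepth 1 -type d`
do
    ticket=$(basename "$i")
    psql -d tickets <<EOF
\set content \`cat $i/Description.txt\`
update table_ticket set description = :'content' where ticket_number = '$ticket';
EOF
done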
The downside is that it will always create a new connection; later I'll change it to generate a single SQL file instead.
But it does exactly what I was looking for: putting the file contents into a single column.
psql can't read the file in for you directly unless you intend to store it as a large object, in which case you can use lo_import. See the psql command \lo_import.

Update: @AlexandreAlves points out that you can actually slurp file content in using

\set myvar `cat somefile`

then reference it as a psql variable with :'myvar'. Handy.

While it's possible to read the file in using the shell and feed it to psql, it's going to be awkward at best, as the shell offers neither a native PostgreSQL database driver with parameterised query support nor any text escaping functions. You'd have to roll your own string escaping.

Even then, you need to know that the text encoding of the input file is valid for your client_encoding, otherwise you'll insert garbage and/or get errors. It quickly ends up being easier to do it in a language with proper PostgreSQL integration, like Python, Perl, Ruby or Java.

There is a way to do what you want in bash if you really must, though: use Pg's delimited dollar quoting with a randomized delimiter to help prevent SQL injection attacks. It's not perfect, but it's pretty darn close. I'm writing an example now.
Given a problematic file:
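For example, something like this (the sample content is illustrative; any file full of shell metacharacters and quotes will do):

cat > difficult.txt <<'__END__'
Shell metacharacters like: $!(){}*?"'
SQL-significant characters like "'()
__END__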
and a sample table:
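A minimal single-column table is enough for the demonstration (the names testfile and filecontent are arbitrary):

psql -c 'CREATE TABLE testfile(filecontent text not null);'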
You can:
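Use a script along these lines (a sketch, assuming a local passwordless psql connection; the file to read is passed as the first argument):

#!/bin/bash

filetoread=$1

# Generate a random hex tag to build the dollar-quote delimiter from.
seq=$(head -c 20 /dev/urandom | od -An -tx1 | tr -d ' \n')

psql <<__END__
INSERT INTO testfile(filecontent) VALUES (
\$x${seq}\$$(cat ${filetoread})\$x${seq}\$
);
__END__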
This could be a little hard to read, and the random string generation is bash-specific, though there are probably portable approaches.
A random tag string consisting of alphanumeric characters (I used hex for convenience) is generated and stored in seq.

psql is then invoked with a here-document tag that isn't quoted. The lack of quoting is important, as <<'__END__' would tell bash not to interpret shell metacharacters within the string, whereas plain <<__END__ allows the shell to interpret them. We need the shell to interpret metacharacters, since we need to substitute seq into the here document and also need to use $(...) (equivalent to backticks) to insert the file text. The x before each substitution of seq is there because dollar-quote tags must be valid PostgreSQL identifiers, so they must start with a letter, not a number. There's an escaped dollar sign at the start and end of each tag because PostgreSQL dollar quotes are of the form $taghere$quoted text$taghere$.
So when the script is invoked as bash testscript.sh difficult.txt, the here document ends up expanding into something like the statement below, where the tags vary each time, making SQL injection exploits that rely on prematurely ending the quoting difficult.
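Something like this (the hex tag shown is just one example of a randomly generated value, and the content is the sample file from above):

INSERT INTO testfile(filecontent) VALUES (
$x8c2d9b1a4ef03756$Shell metacharacters like: $!(){}*?"'
SQL-significant characters like "'()$x8c2d9b1a4ef03756$
);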
I still advise you to use a real scripting language, but this shows that it is indeed possible.
The best thing to do is to create a temporary table, COPY the data from the files in question into it, and then run your updates from there.
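A rough sketch of that approach, assuming the per-ticket files have first been collected into a COPY-friendly CSV (the ticket_staging table and staging.csv are hypothetical names):

-- ticket_staging and staging.csv are hypothetical names for this sketch
CREATE TEMP TABLE ticket_staging (ticket_number text, description text);

-- load the pre-built CSV into the staging table
\copy ticket_staging FROM 'staging.csv' WITH (FORMAT csv)

-- one set-based update from the staged rows
UPDATE table_ticket t
SET description = s.description
FROM ticket_staging s
WHERE t.ticket_number = s.ticket_number;

Updating from the temp table lets PostgreSQL do a single set-based UPDATE instead of one round trip per file.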
Your secondary option would be to create a function in a language like pl/perlu and do this in a stored procedure, but you will lose a lot of the performance optimizations that you can do when you update from a temp table.