Merge multiple jpg into single pdf in Linux

Posted 2019-03-08 02:42

I used the following command to convert and merge all the jpg files in a directory to a single pdf file.

convert *.jpg file.pdf

The files in the directory are numbered from 1.jpg to 123.jpg. The conversion went fine, but afterwards the pages were all mixed up. I want the pdf to have its pages in the same order as the file names, from 1.jpg to 123.jpg. I also tried the following commands:

cd 1 
FILES=$( find . -type f -name "*jpg" | cut -d/ -f 2)
mkdir temp && cd temp 
for file in $FILES; do 
    BASE=$(echo $file | sed 's/.jpg//g');
    convert ../$BASE.jpg $BASE.pdf; 
    done && 
pdftk *pdf cat output ../1.pdf && 
cd .. 
rm -rf temp

But still no luck. Operating platform Linux.

6 answers
Bombasti
#2 · 2019-03-08 03:22

The problem is that your shell expands the wildcard in purely lexicographic order, and because the numbers have different lengths, the resulting order is not numeric:

$ echo *.jpg
1.jpg 10.jpg 100.jpg 101.jpg 102.jpg ...

The solution is to pad the filenames with zeros as required so they're the same length before running your convert command:

$ for i in *.jpg; do num=`expr match "$i" '\([0-9]\+\).*'`;
> padded=`printf "%03d" $num`; mv -v "$i" "${i/$num/$padded}"; done

Now the files will be matched by the wildcard in the correct order, ready for the convert command:

$ echo *.jpg
001.jpg 002.jpg 003.jpg 004.jpg 005.jpg 006.jpg 007.jpg 008.jpg ...
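
The hard-coded "%03d" above only suits names with up to three digits. As a minimal sketch, assuming names that start with a number, the padding step can be isolated in a small helper so the width becomes a parameter. The name pad_name is hypothetical and not part of the original answer.

```shell
# pad_name NAME WIDTH (hypothetical helper): print NAME with its leading
# number zero-padded to WIDTH digits. Names without a leading number are
# printed unchanged.
pad_name() {
    local name=$1 width=$2 num rest stripped
    num=${name%%[!0-9]*}        # leading digits, e.g. "12" from "12.jpg"
    rest=${name#"$num"}         # remainder, e.g. ".jpg"
    if [ -z "$num" ]; then
        printf '%s\n' "$name"
        return 0
    fi
    # Drop existing leading zeros so printf does not read "08" as octal.
    stripped=$(printf '%s' "$num" | sed 's/^0*//')
    printf "%0${width}d%s\n" "${stripped:-0}" "$rest"
}

# Dry run: show the renames without performing them.
# for f in *.jpg; do echo mv -v -- "$f" "$(pad_name "$f" 4)"; done
```

Remove the echo in the dry-run line once the printed commands look right.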
不美不萌又怎样
#3 · 2019-03-08 03:37

Or just read the ls manual, which says:

-v natural sort of (version) numbers within text

So we can do what we need in a single command:

convert `ls -v *.jpg` foobar.pdf
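
The backtick form splits the ls output on whitespace, so it breaks for names containing spaces. A possible variant, sketched here under the assumption that GNU sort (with its -V version sort) is available, keeps each name intact. The function name natural_sorted is hypothetical.

```shell
# natural_sorted (hypothetical helper): print the arguments one per line
# in natural (version) order, so 2.jpg comes before 10.jpg.
natural_sorted() {
    printf '%s\n' "$@" | sort -V
}

# Usage with bash 4+ (commented out because it needs ImageMagick):
# mapfile -t files < <(natural_sorted *.jpg)
# convert "${files[@]}" foobar.pdf
```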

Have fun ;) F.

Summer. ? 凉城
#4 · 2019-03-08 03:37

You could use

convert '%d.jpg[1-123]' file.pdf

via https://www.imagemagick.org/script/command-line-processing.php:

Another method of referring to other image files is by embedding a formatting character in the filename with a scene range. Consider the filename image-%d.jpg[1-5]. The command

magick image-%d.jpg[1-5]

causes ImageMagick to attempt to read images with these filenames:

image-1.jpg image-2.jpg image-3.jpg image-4.jpg image-5.jpg

See also https://www.imagemagick.org/script/convert.php
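
The upper bound of the scene range has to match the highest-numbered file. As a rough sketch, assuming the files are named N.jpg and numbered from 1 without gaps, the range could be derived from the directory contents. The helper scene_spec is hypothetical.

```shell
# scene_spec DIR (hypothetical helper): print a scene-range spec such as
# %d.jpg[1-123], using the highest-numbered N.jpg file found in DIR.
scene_spec() {
    local max=0 f n
    for f in "$1"/*.jpg; do
        n=${f##*/}              # strip the directory part
        n=${n%.jpg}             # strip the extension
        case $n in
            ''|*[!0-9]*) continue ;;   # skip names that are not purely numeric
        esac
        [ "$n" -gt "$max" ] && max=$n
    done
    printf '%%d.jpg[1-%s]\n' "$max"
}

# Usage from inside the image directory (commented out, needs ImageMagick):
# convert "$(scene_spec .)" file.pdf
```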

太酷不给撩
#5 · 2019-03-08 03:37

All of the above answers failed for me, when I wanted to merge many high-resolution jpeg images (from a scanned book).

ImageMagick tried to load all the files into RAM, so I used the following two-step approach instead:

find -iname "*.JPG" | xargs -I'{}' convert {} {}.pdf
pdfunite *.pdf merged_file.pdf

Note that with this approach, you can also use GNU parallel to speed up the conversion:

find -iname "*.JPG" | parallel -I'{}' convert {} {}.pdf
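
Note that pdfunite *.pdf relies on the shell wildcard again, so the per-page files are merged in lexicographic order (1, 10, 100, ...). The sketch below is one way to keep the two-step idea while merging in natural order. It assumes bash 4+, GNU find and sort, and file names without newlines. The function merge_jpgs and the CONVERT and MERGE override variables are hypothetical, added only so the tools can be swapped (for example magick on ImageMagick 7).

```shell
#!/bin/bash
# merge_jpgs DIR OUT.pdf (hypothetical helper)
# Convert each JPG in DIR to its own PDF, one image per invocation, then
# merge the PDFs in natural numeric order.
merge_jpgs() {
    local dir=$1 out=$2 tmp jpg status
    local -a jpgs=() pdfs=()
    # Natural (version) sort so that 2.jpg precedes 10.jpg.
    mapfile -t jpgs < <(find "$dir" -maxdepth 1 -type f -iname '*.jpg' -printf '%f\n' | sort -V)
    [ "${#jpgs[@]}" -gt 0 ] || return 1
    tmp=$(mktemp -d) || return 1
    for jpg in "${jpgs[@]}"; do
        # Only one image is loaded at a time, which keeps memory use bounded.
        "${CONVERT:-convert}" "$dir/$jpg" "$tmp/${jpg%.*}.pdf" || { rm -rf "$tmp"; return 1; }
        pdfs+=("$tmp/${jpg%.*}.pdf")
    done
    "${MERGE:-pdfunite}" "${pdfs[@]}" "$out"
    status=$?
    rm -rf "$tmp"               # remove the intermediate per-page PDFs
    return "$status"
}

# Usage (commented out, needs ImageMagick and poppler-utils):
# merge_jpgs . merged_file.pdf
```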
啃猪蹄的小仙女
#6 · 2019-03-08 03:43

This is how I do it:
The first line converts every jpg file to a pdf using the convert command.
The second line merges all the pdf files into a single pdf, one image per page, using gs (Ghostscript, a PostScript and PDF language interpreter and previewer).

for i in $(find . -maxdepth 1 -name "*.jpg" -print); do convert $i ${i//jpg/pdf}; done
gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=merged_file.pdf -dBATCH `find . -maxdepth 1 -name "*.pdf" -print`
疯言疯语
#7 · 2019-03-08 03:44

Combining the original idea from the question with the answers above, I think this script may be satisfactory:

jpgs2pdf.sh

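#!/bin/bash
# Usage: jpgs2pdf.sh DIRECTORY

cd "$1" || exit 1
FILES=$( find . -type f -name "*jpg" | cut -d/ -f 2)
mkdir -p temp
cd temp || exit 1

# Convert each image to its own one-page pdf.
for file in $FILES; do
 # Strip only the trailing .jpg extension (dot escaped, pattern anchored).
 BASE=$(echo "$file" | sed 's/\.jpg$//');
 convert "../$BASE.jpg" "$BASE.pdf";
done &&

pdftk `ls -v *pdf` cat output "../$(basename "$1").pdf"
cd ..
rm -rf temp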