Merging PDF files with Linux

Almost everything I write here is related to problems I face daily, and this time I came across something new and not very common :D.

A client of mine asked me to find a solution that would allow him to combine several PDF files into a single one without the “complicated” steps of Windows software. I think there is no free solution for that, and the one I know, Adobe Reader, only offers this feature with a paid licence.

The physical scenario involves a Linux file server reachable by a network of Windows machines acting as SMB/CIFS clients, and the procedure to merge the files looks like this:

– The file server shares, with read/write permissions, a directory (folder) called zip2pdf.
– If a user wants to combine a series of files, he just needs to zip them and copy the compressed file into the shared directory.
– After a certain time, proportional to the size of the documents to be merged, a PDF file containing the combined PDFs will “appear” in the shared directory.

How did I do this?

– I used a filesystem watcher which notifies me of every change in a target directory; in my case, file creation.
– I unzipped the file, sorted the output and merged the PDFs with Ghostscript.
– I put it all in a bash script and ran it in the background.

[bash]
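#!/bin/bash
# Usage: ./z2p.sh /path/to/watched/directory
# Watches the directory given as $1 and merges the PDFs inside every new zip file.
inotifywait --monitor --event create --format "%w%f" "$1" | while IFS= read -r file
do
    # Only react to files whose name ends in .zip
    MATCH=$(echo "$file" | grep -P '\.zip$')
    if [ "$MATCH" != "" ] && [ -e "$file" ];
    then
        sleep 3 # wait an arbitrary time to "ensure" the file was closed.
                # There are other ways to check this, using lsof for instance,
                # but they would complicate the script a lot

        STAMP=$(date +"%Y-%m-%d_%N")
        DIR="uncompressed_$STAMP/"

        mkdir -p "$DIR"

        unzip "$file" -d "$DIR"

        # Sorted list of the extracted PDFs (matched case-insensitively)
        FILES=$(find "$DIR" -type f -iname '*.pdf' | sort)
        #echo "LIST: $FILES"
        # $FILES is left unquoted so that it splits into one argument per file;
        # this breaks on file names containing spaces
        gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE="$STAMP.pdf" -dBATCH $FILES

        mv "$STAMP.pdf" "$1"
        rm -rf "$DIR"
    else
        echo NOOOOOOOOOOO
    fi
done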
[/bash]
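
The `sleep 3` in the script is only a guess at how long the copy takes. One possible alternative, sketched below, is to let inotifywait report the file when the writer closes it (`close_write`) or when it is renamed into the directory (`moved_to`). The function names `is_zip_name` and `watch_closed_zips` are hypothetical helpers, not part of the script above, and I am assuming that the Samba server writes each copied file in a single open/close cycle, which I have not verified for every Windows client.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script):
# succeeds only for names ending in .zip or .ZIP.
is_zip_name() {
    case "$1" in
        *.zip|*.ZIP) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical variant of the watch loop: prints the path of each zip file
# once it has been completely written.
#   close_write : a process that had the file open for writing closed it
#   moved_to    : the file was renamed/moved into the watched directory
watch_closed_zips() {
    local dir=$1
    inotifywait --quiet --monitor \
        --event close_write --event moved_to \
        --format '%w%f' "$dir" |
    while IFS= read -r file
    do
        if is_zip_name "$file" && [ -e "$file" ]; then
            echo "$file"
        fi
    done
}
```

Calling `watch_closed_zips /home/carlos/Desktop/zip2pdf/ | while IFS= read -r zip; do ...; done` would then feed each finished zip file to the unzip and merge steps, without the arbitrary wait.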

[bash]
$ ./z2p.sh /home/carlos/Desktop/zip2pdf/
[/bash]
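
One limitation of the script as written: `$FILES` is passed to `gs` unquoted, so a PDF with a space in its name is split into several arguments. A minimal sketch of a way around that, assuming bash 4 or newer (for `mapfile`) and file names without newlines, is to collect the sorted list into an array. `list_pdfs_sorted` and `merge_pdfs` are hypothetical names introduced only for this illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the PDFs found under a directory,
# one per line, sorted by path (byte order, independent of the locale).
list_pdfs_sorted() {
    local dir=$1
    find "$dir" -type f -iname '*.pdf' | LC_ALL=C sort
}

# Hypothetical helper: merge the PDFs found under $1 into the file $2.
merge_pdfs() {
    local dir=$1 out=$2
    local -a files
    # mapfile stores each line as one array element,
    # so spaces inside a file name are preserved.
    mapfile -t files < <(list_pdfs_sorted "$dir")
    if [ "${#files[@]}" -eq 0 ]; then
        echo "no PDF files found in $dir" >&2
        return 1
    fi
    # "${files[@]}" expands to exactly one argument per file
    gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="$out" "${files[@]}"
}
```

Inside the main loop, `merge_pdfs "$DIR" "$STAMP.pdf"` could stand in for the `FILES=...` and `gs ...` lines.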

It works and it was pretty simple 😀
