I have a very simple application. A user uploads a PDF file to a PostgreSQL database via the web front end. That PDF should then be rendered in the browser via PDF.js.
I'm fairly certain my issue is an encoding one, but I don't think I understand encoding well enough to answer this on my own.
My model:
    class Lesson(Base):
        __tablename__ = 'lessons'
        lesson_order = db.Column(db.Enum(LessonIndexes), nullable=False)
        # Name of the lesson
        name = db.Column(db.String(128), nullable=False)
        summary = db.Column(db.String(500))
        lesson_plan_id = db.Column(db.Integer(), ForeignKey('lesson_plans.id'), nullable=False)
        pdf = db.Column(db.LargeBinary())
My Controller:
    @mod_lp.route('/<lesson_plan_id>/create_lesson', methods=["POST"])
    def create_lesson(lesson_plan_id):
        form = LessonForm()
        file = request.files['pdf']  # type: FileStorage
        if form.validate_on_submit():
            file = request.files['pdf']
            lesson = Lesson(form.lesson_order.data, form.name.data, form.summary.data, lesson_plan_id,
                            pdf=file.read()  # this line here
                            )
            db.session.add(lesson)
            db.session.commit()
        return redirect(url_for('lesson_plan.show', lesson_plan_id=lesson_plan_id))
The stored data looks something like this:
%PDF-1.4
%����
1 0 obj
<</Creator (Mozilla/5.0 \(Macintosh; Intel Mac OS X 10_12_6\) AppleWebKit/537.36 \(KHTML, like Gecko\) Chrome/60.0.3112.113 Safari/537.36)
/Producer (Skia/PDF m60)
/CreationDate (D:20170916222407+00'00')
/ModDate (D:20170916222407+00'00')>>
endobj
2 0 obj
<</Filter /FlateDecode
/Length 1370>> stream
x���ݎ�4��<������� qq$8�@%`aB�H�_�����T�E���ړ�c'�t�Z��[������}�{�I���@���
(etc...)
My JavaScript (taken from the PDF.js "hello world" example):
    var pdfString = "{{ pdf_data}}";
    var pdfData = atob(pdfString);
    if (pdfData) {
        var loadingTask = PDFJS.getDocument({data: pdfData});
        loadingTask.promise.then(function (pdf) {
            console.log('PDF loaded');
            // Fetch the first page
            var pageNumber = 1;
            pdf.getPage(pageNumber).then(function (page) {
                console.log('Page loaded');
                var scale = 1.5;
                var viewport = page.getViewport(scale);
                // Prepare canvas using PDF page dimensions
                var canvas = document.getElementById('pdf-canvas');
                var context = canvas.getContext('2d');
                canvas.height = viewport.height;
                canvas.width = viewport.width;
                // Render PDF page into canvas context
                var renderContext = {
                    canvasContext: context,
                    viewport: viewport
                };
                var renderTask = page.render(renderContext);
                renderTask.then(function () {
                    console.log('Page rendered');
                });
            });
        }, function (reason) {
            // PDF loading error
            console.error(reason);
        });
    }
The current error I have is:
6:108 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.
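For context, atob only accepts base64 text, so whatever ends up in pdf_data has to be a base64 string rather than raw bytes. Below is a minimal Python sketch of producing such a string directly from bytes. The helper name to_base64_text and the sample bytes are assumptions made for illustration; they are not part of my application.

```python
import base64


def to_base64_text(raw: bytes) -> str:
    """Return raw bytes as an ASCII base64 string (hypothetical helper)."""
    # b64encode works on bytes and returns bytes; base64 output is pure
    # ASCII, so decoding it as ASCII is safe.
    return base64.b64encode(raw).decode("ascii")


# Stand-in for the first bytes of a PDF, including non-ASCII bytes.
sample = b"%PDF-1.4\n%\xd3\xeb\xe9\xe1\n"
encoded = to_base64_text(sample)

# Decoding the base64 text gives back exactly the original bytes.
assert base64.b64decode(encoded) == sample
print(encoded)
```

The point of the sketch is that the bytes never pass through a text decode step before being base64-encoded.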
Things I've tried:
    file.stream.getvalue()
    file.stream.getvalue().decode("latin-1")  # for whatever reason, this was the only 'decode' that didn't throw an error
    file.stream.getvalue().decode("latin-1").encode()
    base64.b64encode(file.stream.getvalue().decode("latin-1").encode())
But these all failed in various ways.
UPDATE:
If I send the binary data from the database to my template:
    pdf_data = lesson.pdf
and forget about calling atob on it:
    var pdfData = pdfString;
    if (pdfData) {
    ...
I get this error:
Error: Invalid XRef stream header
pdf.worker.js:340 at error (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:340:17)
at XRef_readXRef [as readXRef] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:20943:13)
at XRef_parse [as parse] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:20613:28)
at PDFDocument_setup [as setup] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:26445:17)
at PDFDocument_parse [as parse] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:26336:12)
at http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:36120:28
at Promise (<anonymous>)
at LocalPdfManager_ensure [as ensure] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:36115:14)
at LocalPdfManager.BasePdfManager_ensureDoc [as ensureDoc] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:36067:19)
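PDF cross-reference data relies on byte offsets, so any step that changes the number of bytes could plausibly lead to an error like this one. The Python sketch below shows how a latin-1 decode followed by a default encode(), as in my earlier attempts, alters the bytes. The sample bytes are a stand-in, not a real PDF, and this is only a guess at what might be going wrong.

```python
raw = b"%PDF-1.4\n%\xd3\xeb\xe9\xe1\n"  # stand-in bytes, not a real PDF

# latin-1 maps every byte value to exactly one character, which is why
# this decode never raises an error.
text = raw.decode("latin-1")

# str.encode() defaults to UTF-8, which needs two bytes for each
# character in the range U+0080 to U+00FF, so the output grows.
round_tripped = text.encode()

print(len(raw), len(round_tripped))   # lengths differ
print(round_tripped == raw)           # the bytes are no longer identical

# Encoding with the same codec used for decoding restores the original.
print(text.encode("latin-1") == raw)
```

If this is what is happening, every non-ASCII byte in the file would shift the content that follows it.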