Convert PDF to Google Docs
I have a script that automatically converts PDF attachments to Google Docs format. The problem is that the PDF files contain images, and after conversion the Google Doc has only text and no images. I believe this is happening because of OCR. Is it possible to automate the script so that the images in the PDF are carried over to the Google Doc as well?
Here is the script in question:
GmailToDrive('0BxwJdbZfrRZQUmhldGQ0b3FDTjA', '"Test Email"');

function GmailToDrive(folderID, gmailSubject) {
  // Gmail query for matching threads that are not yet labeled as imported
  var threads = GmailApp.search('subject: ' + gmailSubject + ' -label: Imported');
  for (var i = 0; i < threads.length; i++) {
    var messages = threads[i].getMessages(); // all messages of each thread returned by the query
    for (var j = 0; j < messages.length; j++) {
      var attachments = messages[j].getAttachments(); // all attachments of each message
      var timestamp = messages[j].getDate(); // timestamp of the message
      var date = Utilities.formatDate(timestamp, "MST", "yyyy-MM-dd"); // timestamp formatted as a date string
      for (var k = 0; k < attachments.length; k++) {
        var fileType = attachments[k].getContentType();
        Logger.log(fileType);
        if (fileType === 'application/pdf') { // only PDF attachments are converted to a Google Doc
          var fileBlob = attachments[k].copyBlob().setContentType('application/pdf');
          var resource = {
            title: fileBlob.getName(),
            mimeType: fileBlob.getContentType()
          };
          var options = {
            ocr: true
          };
          var docFile = Drive.Files.insert(resource, fileBlob, options);
        }
      }
    }
  }
}
The ocr option is intended for reading characters from images and PDF documents. It will not include images in the uploaded result.
Look at the convert parameter.
The API documentation provides an interactive panel on the right-hand side where you can quickly test each parameter.
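As a rough illustration of the suggestion above, the upload could be tried with convert in place of ocr. This is only a sketch: the helper name convertPdfToDoc and its driveService parameter are hypothetical (the Drive advanced service is passed in so the function can be exercised with a stub), and whether embedded images actually survive the conversion is not confirmed here and should be checked against real files.

```javascript
// Sketch: upload a PDF blob and ask Drive (API v2) to convert it to a Google Doc.
// convertPdfToDoc and driveService are hypothetical names; in Apps Script,
// pass the Advanced Drive Service object `Drive` as driveService.
function convertPdfToDoc(fileBlob, driveService) {
  var resource = {
    title: fileBlob.getName(),
    mimeType: fileBlob.getContentType()
  };
  var options = {
    convert: true // request conversion to the Docs format instead of OCR
  };
  return driveService.Files.insert(resource, fileBlob, options);
}
```

In the script from the question, the call would replace the existing insert, for example var docFile = convertPdfToDoc(fileBlob, Drive); the ocr option can be added back to the options object to compare the two results.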