Convert Octet-stream To Pdf


Tools that I'm using for this:

Chrome, Notepad++, Sublime Text 3, Fiddler, WinMerge, Adobe Acrobat Reader X

Synopsis

I have downloaded a PDF twice: once through Chrome as an experimental control, and once through a raw GET request via Fiddler, which returns an octet-stream. So far, I can save the octet-stream as a PDF and get the proper page count and some of the page headers and numbers, but very little of the body content loads. When I open my file in Adobe Reader X, I get an error that it

Cannot extract the embedded font 'LFIDTH+ArialMT'. Some characters may not display or print correctly

and I cannot work through why it can be extracted from the 'true' pdf but cannot from the one I am saving.

Details

For my manual pull of the file, I sent this request header:

Accept: application/pdf, application/x-pdf, application/x-gzpdf, application/x-bzpdf

The server sent me back an application/octet-stream response with a Content-Disposition of attachment.
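The manual pull can also be scripted, which takes text editors out of the path between the response and the disk. The following is a minimal Python sketch, not what Fiddler does internally: the function names `build_pdf_request`, `save_body` and `download` are hypothetical, and the only points illustrated are the Accept header from the request above and writing the body in binary mode.

```python
import urllib.request

# Accept header copied from the request described above.
ACCEPT = "application/pdf, application/x-pdf, application/x-gzpdf, application/x-bzpdf"


def build_pdf_request(url: str) -> urllib.request.Request:
    """Build a GET request carrying the Accept header used in the question."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT}, method="GET")


def save_body(body: bytes, path: str) -> int:
    """Write the body untouched; "wb" avoids newline or charset translation."""
    with open(path, "wb") as handle:
        return handle.write(body)


def download(url: str, path: str) -> int:
    """Fetch url and store the raw bytes at path (performs a real network call)."""
    with urllib.request.urlopen(build_pdf_request(url)) as response:
        return save_body(response.read(), path)
```

Calling `download("https://example.invalid/Foo.pdf", "Foo.pdf")` (a placeholder URL) would store the bytes exactly as the server sent them, whatever Content-Type the response declares.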

So to recap:

  1. Valid Foo.pdf sitting on my hard drive
  2. HTTP Response with an octet-stream version of same file, in UTF-8 encoding (I assume)

Here is what I know:

I pulled the message body of the response from the server and dropped it to a file. I then ran a WinMerge comparison of it against the contents of the PDF, and every line mismatched on line endings. I re-encoded the EOLs for Unix and the diff shrank to ~1k lines out of 160k. A close inspection of the mismatches indicates that the valid PDF contains what looks like a NUL (00) character in places where my octet-stream contains literal spaces. Also, the 'true' PDF is reporting EOL: LF 1252 Mixed through WinMerge. My 'raw' PDF is reporting 1252 Unix. When I homogenize the 'true' PDF to 1252 Unix, I get the same issue as I described for the 'raw' one.
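Line-based diffing is an awkward fit here because much of a PDF is binary stream data, so a byte-level comparison gives a more direct view of what changed. The sketch below is a hypothetical helper (the name `summarize_byte_diff` does not come from any tool mentioned above). It assumes both files were read in binary mode and tallies which byte values were swapped for which.

```python
from collections import Counter


def summarize_byte_diff(reference: bytes, candidate: bytes, limit: int = 5):
    """Compare two byte strings position by position.

    Returns (pairs, offsets, length_delta):
      pairs        - Counter of (reference_byte, candidate_byte) substitutions
      offsets      - the first `limit` offsets at which the bytes differ
      length_delta - len(reference) - len(candidate)

    Only the overlapping prefix is compared, so the tally is most meaningful
    when length_delta is 0 (no bytes inserted or dropped, e.g. by CRLF changes).
    """
    pairs = Counter()
    offsets = []
    for offset, (ref_byte, cand_byte) in enumerate(zip(reference, candidate)):
        if ref_byte != cand_byte:
            pairs[(ref_byte, cand_byte)] += 1
            if len(offsets) < limit:
                offsets.append(offset)
    return pairs, offsets, len(reference) - len(candidate)
```

With the bytes of the known-good file and of the saved response body as inputs, a tally dominated by `(0, 32)` (NUL replaced by space) would match the observation above and would suggest the body passed through a text-oriented step somewhere before it reached the disk.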

Is there anything I can do to get this mess of an octet-stream straightened out?

Note that the PDF that was downloaded through Chrome is historic. I have it on my machine, but I downloaded it 'sometime in the past' and the request headers used when processing that GET are no longer available. Attempting to download through the browser 'now' results in an error, but an explicit GET request against the resource through Fiddler returns the PDF as an octet-stream.

K. Alan Bates

1 Answer

Well now....

In Fiddler Session,

Right click the HTTP Response with the application/octet-stream body > Save > Response > Response Body

If Content-Disposition: attachment; filename has been set on the response, the File Save dialog will be prepopulated with that filename.
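How Fiddler reads that header is not shown here, but extracting the filename parameter is simple to illustrate. The sketch below uses Python's standard `email.message.Message` class, whose `get_filename()` method reads the `filename` parameter of a Content-Disposition header. The wrapper name `filename_from_content_disposition` is hypothetical.

```python
from email.message import Message
from typing import Optional


def filename_from_content_disposition(header_value: str) -> Optional[str]:
    """Return the filename parameter of a Content-Disposition value, if any."""
    message = Message()
    message["Content-Disposition"] = header_value
    # get_filename() returns None when no filename parameter is present.
    return message.get_filename()
```

For a header value of `attachment; filename="Foo.pdf"` this returns `Foo.pdf`, which is the name a save dialog could offer by default.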

Easy after you know it's there.

K. Alan Bates


I am using Mozilla Firefox with a PDF viewer plug-in. In the settings, the plug-in has been correctly associated with Adobe Reader files so that they are viewed in the browser.

I would like to be able to view PDF files in Firefox rather than downloading them. This already works correctly when a web server indicates that a file has the Content-Type of application/pdf. However, some web servers provide other Content-Types for PDFs, such as application/octet-stream. (See this example of a PDF served with a non-pdf Content-Type.)

I have looked at Firefox's mimeTypes.rdf file, and it appears to only support mapping applications based on file extensions for non-Internet-based files. (It looks like it only uses Content-Type to map Internet-based files.)

How can I have Firefox view all PDF documents in-browser rather than only the ones with the application/pdf Content-Type?

Sam


2 Answers

Firefox has no content inspection code (like the Linux file command) to detect the actual content type, so it relies on the Content-Type header.
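To make "content inspection" concrete: tools like file look at the leading bytes of the data rather than at any declared type. A PDF begins with the marker `%PDF-`, so a minimal check can be sketched in Python as below. The function names and the 1024-byte search window are assumptions of this sketch (some readers are reported to tolerate a little leading junk before the marker), and none of this is something Firefox is claimed to do.

```python
PDF_MARKER = b"%PDF-"


def looks_like_pdf(data: bytes, window: int = 1024) -> bool:
    """Return True if the PDF marker appears within the first `window` bytes."""
    return PDF_MARKER in data[:window]


def effective_content_type(declared: str, data: bytes) -> str:
    """Prefer application/pdf over a generic octet-stream when the bytes say so."""
    generic = declared.split(";")[0].strip().lower() == "application/octet-stream"
    if generic and looks_like_pdf(data):
        return "application/pdf"
    return declared
```

A response declared as `application/octet-stream` whose body starts with `%PDF-1.4` would be reported as `application/pdf`, while any other declared type is passed through unchanged.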

Workaround: mozplugger extension

See man 7 mozplugger.


Workaround: human interaction

Save the file and open it in the file explorer ;-)

Workaround: misconfiguration

An additional workaround is to hack mimeTypes.rdf and assign application/octet-stream to the same value as application/pdf.

I don't advise this workaround.

user86064

You can use the Force Content-Type extension for PDF files served with the wrong Content-Type response header.

For example, if the web server provides Content-Type: application/octet-stream, you can add a rule that transforms it to Content-Type: application/pdf based on the PDF file URL.
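The extension's own rule syntax is not shown here. As a rough illustration of what such a rule does, the following Python sketch rewrites the header only when the declared type is application/octet-stream and the URL path ends in .pdf. The function name `force_content_type`, the regular expression, and the example URL are all hypothetical.

```python
import re
from urllib.parse import urlsplit

# Assumed rule: match URLs whose path ends in ".pdf" (case-insensitive).
PDF_PATH = re.compile(r"\.pdf$", re.IGNORECASE)


def force_content_type(url: str, headers: dict) -> dict:
    """Return a copy of headers with Content-Type forced to application/pdf
    when the server sent application/octet-stream for a .pdf URL."""
    path = urlsplit(url).path  # ignore the query string and fragment
    declared = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if declared == "application/octet-stream" and PDF_PATH.search(path):
        return {**headers, "Content-Type": "application/pdf"}
    return dict(headers)
```

Matching on the URL rather than on every octet-stream response keeps other binary downloads, such as archives or installers, from being handed to the PDF viewer.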

ks1322
