Saving a responsive site to a PDF

One feature that’s cropped up a few times when working with clients is their need to save a webpage to a pdf. Instead of simply telling them no, or making them save it through chrome as a PDF, it’s nice to be able to give them what they want. This however tends to be a much less simple process than you would imagine. Rolling your own pdf writing software is often a huge ordeal, and even the libraries aren’t particularly up to scratch.

So what to do? Why simply use one of the many “Save a page to pdf” services that are available, where you put a link to their site on your page, and they use the http referer to grab your page and print it.

Simple, right?

Well not really.

##The Problem

These services all tend to have their quirks and issues, and given that they’re all paid for products, you’d expect to be able to print a responsive site as a desktop view. This isn’t the case. The way these services work is by using a VM to view your website, with a fixed, generally small resolution. So while you’d expect to be able to just get the desktop view of your site, you end up with a tablet or even mobile view of the site, stretched onto 15-20 pages of a pdf, since that’s how it looks for a mobile view.

One option is to use the print media type with these services, but that’s another thing entirely from a pdf view of a desktop site, which is what we want.

So are we completely out of luck? Well that depends on the supplier. For pdfcrowd, pretty much. There’s no way of specifying resolution, no way of specifying a url, and very very few options.

pdfmyurl on the other hand gives a small glimmer of hope. You can pass through the parameter of css_media_type which in theory you could pass through something unused (i.e not print) like embossed, or speech, and have a separate style for that media type without responsiveness. Except you can’t. It only really seems to support print, which means, again, we’re out of luck.

##Proving It

To prove all this I hacked together a small proof of concept page that outputs some of the details that the VMs use, so here’s what a few of them end up looking like:

Here’s some of the css I used to test what the VM browsers end up reporting:

Pdfmyurl with the above css and media type embossed

alt text

web2pdf

alt text

pdfcrowd

alt text

htm2pdf was kind of interesting since I didn’t actually specify embossed at all with their link, but it rendered out as if I had.

alt text

##The Solution

After a lot of work, it seems the only real way to handle it is to pass over a parameter with the url to print, pick this up in javascript, and display some different css based on that parameter. This completely rules out most of the suppliers except for pdf my url.

So how do we test this? Well, setting up a trial account for pdf my url is pretty simple, then it’s just a case of swapping out the CSS based on a parameter.

Another option is to detect user agent and do something similar, but there’s no way of knowing if these will change, and even then a lot simply report being a standard browser.

Any better ideas? Let me know!