PDF Snapshot


A common request I have heard a number of times is to take a snapshot of a web site and store it securely, so that you can later prove what the site stated on a given date. I know a number of EPiServer customers have already implemented solutions for this, but so far there hasn’t been an easy, generic solution to the problem. Until now :-)

Here is a small, handy scheduled task that you can set up to take a snapshot of the entire website as JPG and/or PDF every day. The snapshots can then be stored either on the hard drive or in a virtual path and accessed later. It uses ExpertPDF's HTML to PDF converter, for which EPiServer has bought a redistributable license.

The installation is fairly straightforward: just copy the two assemblies in this zip into your bin folder and it should appear as a scheduled job. Before you run it for the first time, go to the Plug-in Manager in Admin mode and fill in the configuration:

  • Author name – the PDF meta-field for the Author.
  • Which page size to use. Americans often use “Letter” and Europeans tend to go for “A4”.
  • If some of your pages require a login to view, you can provide a user name for the job to impersonate when retrieving those pages.
  • The folder where you want your snapshots to be stored. Either a local physical folder (make sure that the IIS user has access) or a virtual path like “~/Global/Snapshots”.
  • Finally, check whether you want PDF files, JPGs, or both to be generated, and whether the PDFs should have a header and footer indicating where they are from and when they were generated.
  • Set the starting point for the generation and you are done!
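To make the settings above concrete, here is a minimal sketch of how they could be modeled and sanity-checked. It is written in Python purely as an illustration and is not the plug-in's actual code. All field names are hypothetical, and limiting the page size to “A4” and “Letter” is an assumption based only on the two examples mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SnapshotSettings:
    """Hypothetical model of the job's configuration (names are assumptions)."""
    author: str                                # PDF meta-field for the author
    page_size: str = "A4"                      # "A4" or "Letter" (assumed set)
    impersonate_user: Optional[str] = None     # user for login-protected pages
    output_folder: str = "~/Global/Snapshots"  # physical folder or virtual path
    generate_pdf: bool = True
    generate_jpg: bool = False
    header_and_footer: bool = True             # only relevant for PDFs
    start_page: str = "/"                      # starting point for generation

    def uses_virtual_path(self) -> bool:
        # Assumption: virtual paths are the ones written with a leading "~/".
        return self.output_folder.startswith("~/")

    def problems(self) -> List[str]:
        """Return human-readable configuration problems (empty if none)."""
        issues = []
        if self.page_size not in ("A4", "Letter"):
            issues.append(f"Unknown page size: {self.page_size}")
        if not self.output_folder.strip():
            issues.append("No output folder configured")
        if not (self.generate_pdf or self.generate_jpg):
            issues.append("Neither PDF nor JPG output is enabled")
        return issues


settings = SnapshotSettings(author="Jane Doe", page_size="Letter")
for problem in settings.problems():
    print(problem)
```

Checking the configuration before the first run is useful because a missing folder or output format would otherwise only surface after a slow crawl of the whole site.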

The generated files are placed in a folder hierarchy similar to the site structure, and files are generated for all languages. Since this can be a rather slow process, the job runs in its own separate thread, which reports back to the Scheduled Task log when it is done. If you try to start a new instance of the job manually while one is already running, it simply reports the progress of the existing job.
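The "one worker at a time, otherwise report progress" behaviour can be sketched roughly as follows. This is a simplified illustration in Python, not the job's real implementation. The class name, the `render` callback and the status messages are all assumptions made for the example.

```python
import threading


class SnapshotJob:
    """Hypothetical stand-in for the scheduled job (not the plug-in's code)."""

    def __init__(self, pages, render):
        self._pages = list(pages)
        self._render = render          # callback that snapshots one page
        self._done = 0
        self._lock = threading.Lock()
        self._worker = None

    def _run(self):
        for page in self._pages:
            self._render(page)         # slow part: convert the page to PDF/JPG
            with self._lock:
                self._done += 1

    def execute(self):
        """Start a worker thread, or report progress if one is still running."""
        with self._lock:
            if self._worker is not None and self._worker.is_alive():
                return f"Already running: {self._done} of {len(self._pages)} pages done"
            self._done = 0
            self._worker = threading.Thread(target=self._run, daemon=True)
            self._worker.start()
            return "Snapshot started"

    def progress(self):
        """Return (pages done, total pages)."""
        with self._lock:
            return self._done, len(self._pages)

    def join(self, timeout=None):
        """Wait for the current worker, if any, to finish."""
        worker = self._worker
        if worker is not None:
            worker.join(timeout)


job = SnapshotJob(["/", "/news", "/about"], render=lambda page: None)
print(job.execute())
job.join()
print(job.progress())
```

The lock guards both the progress counter and the check for a live worker, so two manual starts arriving at nearly the same time cannot both spawn a thread.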

Here are a couple of examples of generated snapshots: PDF and JPG.

Note that this is released as a research prototype. No guarantees or promises – use AS-IS. Suggestions for improvements and bug reports can be left in a comment or tweeted to me (@athraen).
