Friday, July 24, 2009

Converting XHTML, HTML to PDF with CSS styles

I wanted to convert HTML with CSS styles to PDF. Even after some googling, I could not find good guidance on this.
So I thought I would publish the information I came across.

There are actually several options for converting an HTML file or a web page into a PDF.

1. Adobe Acrobat

Yes, of course, Adobe Acrobat is one of the best options for converting HTML into a PDF;
most of the time it preserves the WYSIWYG appearance.
The problem is that you need to pay for it. Adobe Acrobat is not free.

2. iText Library

The "iText" library is free and open source. The project is mainly in Java, but there is now a C# iText library (iTextSharp) too.

The library is very rich and can do many things with PDF files (create, manipulate, and so on), but here I am considering only HTML to PDF conversion. You can convert HTML into PDF fairly easily with iText.

The "HtmlParser" class supports limited XHTML to PDF conversion. To apply CSS styles you need to use the "HTMLWorker" class. Sample code for these classes is available in Chapter 14 of the book iText in Action. You can get the sample code for free from the iText website.

However, even this is not a perfect XHTML to PDF conversion when CSS styles are applied in the HTML.

3. Flying Saucer & iText Library

The third option, "Flying Saucer" with "iText", is much better: it converts most XHTML with CSS styles into PDF nicely. It is also a free and open-source library, released under the LGPL. "Flying Saucer" bundles the iText library in its binary distribution. The source distribution also includes a set of demos and sample code that make the API easy to understand.

It takes only a few simple lines to convert XHTML into PDF.

Input xhtml:


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My First Document</title>
<style type="text/css"> b { color: green; } </style>
</head>

<body>
<p>
<b>Greetings Earthlings!</b>
We've come for your Java.
</p>
</body>

</html>
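
Flying Saucer parses its input as XML, so markup that is not well-formed (for example an unclosed <br>) will fail before any PDF is produced. Below is a minimal sketch of a pre-check that uses only the JDK's XML parser. The class name XhtmlCheck and the method isWellFormed are my own hypothetical names, not part of Flying Saucer or iText, and the check covers well-formedness only, not validity against the XHTML DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of Flying Saucer or iText.
class XhtmlCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Answer DTD lookups with an empty stream so the check
            // does not fetch the XHTML DTD over the network.
            builder.setEntityResolver(
                    (publicId, systemId) -> new InputSource(new StringReader("")));
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }
}
```

Calling XhtmlCheck.isWellFormed(...) on the file contents before rendering gives a clear yes/no answer instead of a parser stack trace from deep inside the renderer. Because the DTD is skipped, named entities such as &nbsp; would be reported as errors by this sketch.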



Java code to convert XHTML into PDF:

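package flyingsaucerpdf;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import com.lowagie.text.DocumentException;
import org.xhtmlrenderer.pdf.ITextRenderer;

public class FirstDoc {

    public static void main(String[] args)
            throws IOException, DocumentException {
        String inputFile = "samples/firstdoc.xhtml";
        // The renderer is given a URL, so convert the file path first.
        String url = new File(inputFile).toURI().toURL().toString();
        String outputFile = "firstdoc.pdf";

        // try-with-resources closes the stream even if rendering fails.
        try (OutputStream os = new FileOutputStream(outputFile)) {
            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocument(url);  // load and parse the XHTML
            renderer.layout();          // apply the CSS and lay out the pages
            renderer.createPDF(os);     // write the PDF to the stream
        }
    }
}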
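
The code above passes a URL string to setDocument rather than a plain file path. As far as I understand, this lets the renderer treat the document's location as the base for relative links such as stylesheets and images. The sketch below isolates just that conversion step using only the JDK, so you can see what the string looks like. DocumentUrl and toUrlString are hypothetical names of my own.

```java
import java.io.File;
import java.net.MalformedURLException;

// Hypothetical helper for illustration; only standard JDK classes are used.
class DocumentUrl {

    // Converts a local file path (relative or absolute) into an
    // absolute file: URL string, the same way the sample above does.
    static String toUrlString(String path) {
        try {
            return new File(path).toURI().toURL().toString();
        } catch (MalformedURLException e) {
            // Not expected for a file: URI, but the checked exception must be handled.
            throw new IllegalArgumentException("Cannot convert path: " + path, e);
        }
    }
}
```

A relative path such as "samples/firstdoc.xhtml" is resolved against the current working directory, so the result depends on where the program is started.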

I believe even SaaS clouds such as www.salesforce.com render PDFs using these packages.

4. Online HTML to PDF conversion

Apart from these options, there are a number of websites that provide online HTML to PDF conversion, which I will not discuss here.

Here are some useful links:

Flying Saucer : https://xhtmlrenderer.dev.java.net/
iText : http://www.lowagie.com/iText/
Adobe Acrobat : http://www.adobe.com/products/acrobat/?promoid=BPDDU

Thank you, and have a nice day!
