html2ps/pdf FAQ

How would I report a bug?

Use the support forum of tufat.com.

Please, provide the following:

This will greatly reduce the time required to solve your issue. Thank you for understanding.

Installation.

Does html2ps require any external utilities like ghostscript?
No. PHP with the GD extension is sufficient to run the conversion. You may, however, install additional extensions or utilities to use alternative output methods or to speed up conversion slightly.
Can I call this script from the command line?
Probably yes; check whether your PHP build supports the command-line interface. Also, consider reading this article on php.net: Using PHP from the command line
How can I determine the script version?
Look in config.inc.php for the HTML2PS_VERSION_MAJOR, HTML2PS_VERSION_MINOR and HTML2PS_SUBVERSION constants. The full version number is MAJOR.MINOR.SUBVERSION. If you cannot find these constants, you're using a very old html2ps release.

No output at all. Broken output.

I'm getting "Warning: DOMDocument::loadXML() [function.DOMDocument-loadXML]: Empty string supplied as input" error message in PHP 5.2.0 when attempting to convert some files
A new configuration parameter, pcre.backtrack_limit, was introduced in PHP 5.2.0. html2ps uses regular expressions heavily, so it is recommended to increase the pcre.backtrack_limit value to 1000000.
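As a minimal sketch, the limit can be raised at runtime with PHP's ini_set() before the conversion starts. Where exactly to place this call in your own script is an assumption; setting the same value in php.ini works as well.

```php
<?php
// Raise the PCRE backtracking limit for the current request only.
// ini_set() returns the previous value, or false on failure.
$previous = ini_set('pcre.backtrack_limit', '1000000');

if ($previous === false) {
    echo "Could not change pcre.backtrack_limit; set it in php.ini instead.\n";
}
```

If the host disables ini_set(), the php.ini route is the only option.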
Warning: DOMDocument::loadXML() [function.DOMDocument-loadXML]: Input is not proper UTF-8, indicate encoding ....
The page you're trying to convert specifies UTF-8 encoding in its header / meta tag, but in fact uses a different encoding. To convert such a page, you need to switch from "Autodetect" encoding to the real one (in most cases iso-8859-1 will do).
HTML2PS returns a blank page. There are some strange messages in the PHP error log, for example:
Parent: child process exited with status 3221225477 -- Restarting.
I'm using PHP 4.4.2
This is PHP 4.4.2 bug #36017; there are no workarounds except changing the PHP version or writing your own fetcher without 'fopen' function calls. I would recommend either downgrading to an earlier 4.4.x version or installing PHP 5.
All I'm getting is a blank page; no error messages in PHP error log. What happened?
The script is probably running out of memory or execution time. Try increasing the values of the max_execution_time and/or memory_limit PHP configuration variables. Recommended values are 120 seconds and 32 megabytes. If you're using VERY big images, you'll probably need to increase these values even more.
Another cause may be a JavaScript or META redirect on the page you're trying to convert. As the HTML2PS script is not designed as an interactive user agent, it will not follow such redirects for you. Open the URL in question in your browser and check whether the URL changes when the page finishes loading. If it does, just supply the final URL to the script.
Also, please note that domain.com and www.domain.com may point to different sites. In the worst case, domain.com (without the 'www' part) may just ignore HTTP requests. Popular browsers, on the other hand, try to guess the correct URL; for example, when you enter 'something' in the address bar, they may try to get something.com or www.something.com. This may lead to a problem similar to the one described in the previous paragraph; the solution is the same: open the URL in a browser and check whether it changes.
Yet another cause may be a built-in browser timeout; in particular, Safari for Windows has a built-in timeout of 60 seconds. In this case you will not be able to get the PDF file if the conversion takes more than a minute.
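The recommended limits can be sketched with PHP's ini_set(), assuming your host allows changing them at runtime; otherwise set the same values in php.ini. If your current limits are already higher, leave them as they are.

```php
<?php
// Allow up to 120 seconds of execution time for this request.
ini_set('max_execution_time', '120');

// Allow up to 32 megabytes of memory for this request.
// Pages with very big images may need a larger value.
ini_set('memory_limit', '32M');
```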
I got the following error message: Fatal error: Allowed memory size of … bytes exhausted (tried to allocate … bytes) in …
The script is running out of memory. Please refer to the memory_limit documentation on PHP.net for how to increase the memory limit.
The script just hangs when converting a page containing images! With the "render images" option disabled it works!
There were reports on this problem on Windows recently. A quick investigation showed that for some reason PHP 4.4.0 sometimes hangs indefinitely inside the 'fsockopen' call. Consider upgrading your PHP version in this case.
I've increased limits, but still sometimes get a blank page immediately after the script starts! Some sites are parsed, though...
Some users encountered this problem using the GD library bundled with PHP. While it matched the GD version requirement, it sometimes caused PHP to silently die on some images. The problem is solved by recompiling the PHP using the external (recent enough) GD library. Note that NOT ALL PHP configurations are subject to this problem.
I'm getting "PDF doesn't start with "%PDF-" message from Acrobat Reader. Nevertheless, when I save file to my hard drive, it opens perfectly. I'm using Firefox.
There were user reports on issues related to Firefox/Acrobat Reader plugin incompatibility. In particular, this problem appeared with Firefox 1.0.7 and Reader 6.0.2 PL. You may consider upgrading your software to latest versions in this case.
Some characters are displayed incorrectly or missing.

If you've installed, removed or changed font files, you may need to clear the cache subdirectory. HTML2PS stores information extracted from font files there to reduce script initialization overhead. See also "I've installed/updated True-Type fonts, but it seems that ... (some mysterious problem) ... happens"

Another cause of this problem may be an incorrect source encoding; when the encoding is not explicitly specified, html2ps tries to take it from HTTP headers and META tags. If no encoding information is found, html2ps assumes iso-8859-1.

I cannot fetch files from local hard disk using 'file' protocol.

First of all, please ensure you're providing a URL, not a file path; for example, if you're trying to use an image from c:\foo\bar\baz.gif, you'll need to use the following URL: file:///C:/foo/bar/baz.gif.
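The path-to-URL conversion can be sketched as follows. The path_to_file_url() helper is hypothetical and not part of html2ps; it only swaps backslashes for forward slashes and adds the file:// prefix, and it does not percent-encode spaces or other special characters.

```php
<?php
// Hypothetical helper (not part of html2ps): turn an absolute
// Windows or Unix path into a file:// URL.
function path_to_file_url($path) {
    // URLs use forward slashes only.
    $path = str_replace('\\', '/', $path);

    // Windows paths start with a drive letter, so add the leading slash;
    // absolute Unix paths already have one.
    if (strncmp($path, '/', 1) !== 0) {
        $path = '/' . $path;
    }

    return 'file://' . $path;
}

echo path_to_file_url('C:\\foo\\bar\\baz.gif'), "\n"; // file:///C:/foo/bar/baz.gif
```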

Second, for security reasons, files accessed via the 'file' protocol are limited to the html2ps directory by default. This restriction is controlled by the FILE_PROTOCOL_RESTRICT constant in the config.inc.php file. Note that this constant contains a file path prefix; for example, to use files from the C:\images directory you'll need to store the C:\images\ value in this constant.
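A hedged sketch of what such a setting might look like, assuming config.inc.php sets the constant with define(). The sketch_path_allowed() function is hypothetical and only illustrates what "path prefix" means; the check html2ps actually performs may differ.

```php
<?php
// Assumption: config.inc.php sets the constant with define().
// In a single-quoted PHP string each "\\" is one literal backslash,
// so the stored value is C:\images\ (note the trailing backslash).
if (!defined('FILE_PROTOCOL_RESTRICT')) {
    define('FILE_PROTOCOL_RESTRICT', 'C:\\images\\');
}

// Hypothetical illustration of a prefix check, not html2ps code.
function sketch_path_allowed($path) {
    $prefix = FILE_PROTOCOL_RESTRICT;
    return strncmp($path, $prefix, strlen($prefix)) === 0;
}

var_dump(sketch_path_allowed('C:\\images\\logo.gif')); // inside the prefix
var_dump(sketch_path_allowed('C:\\other\\logo.gif'));  // outside the prefix
```

The trailing backslash matters for a prefix match: without it, a directory such as C:\images-private would also match.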

Broken layout.

Some characters are missing in my PDFs on some Acrobat Reader versions / different OSes
Try enabling font embedding (set the 'embed' property in html2ps.config to 1 for the fonts used in your documents).
Sites are cut off on the right side when I'm using 640 pixels page width. What can I do?
Nothing. Treat this as a feature. Just increase the page width. Most sites are NOT designed for such small resolutions and will cause a horizontal scrollbar to appear in the browser in such cases.
I've disabled the "Keep screen pixel/point ratio" option and the page layout is completely broken! What can I do?
Nothing. Treat this as a feature. If you want to get the layout close to the image rendered by the browser, never disable this option. The only time you'll need it is when you need to render text having the exact size specified in points.
Some images are rendered inside black rectangles!
PNG images with an alpha channel are NOT supported. Switch to single-color transparency if you need it.
Horizontal lines (e.g. line under the text) look like they consist of several parts with slightly different width.
Try disabling antialiasing in your PDF reader.
My absolute-positioned content is cut at the last page
Note that absolute and fixed-positioned content does not generate page breaks (see CSS 2.1, section 13.2.3, Content outside the page box). The simplest workaround is to add a static-positioned div with a fixed height.

Customizing output.

How can I make an explicit page break?
You may use one of the following HTML2PS script-specific commands:
<!--NewPage-->
<pagebreak/>
<?page-break>
Or CSS page-break-after property:
<div style="page-break-after: always">
... some content ...
</div>
How should I add headers or footers to generated Postscript / PDF files?
You may use one of the following options:

Note that when you use PreTreeFilterHeaderFooter or the Header/Footer fields in the web interface, the content is implicitly placed in a fixed-positioned div; you may think of this as follows:
...
<body>
<!--header starts-->
<div style="position: fixed; ....">...your header content...</div>
<!--header ends-->
...
your HTML content
...
<!--footer starts-->
<div style="position: fixed; ....">...your footer content...</div>
<!--footer ends-->
</body>
...

Important note: HTML code added via PreTreeFilterHeaderFooter should be (almost) valid XHTML (see XHTML 1.0: Differences with HTML 4); in particular, all tags / attributes should be in lower case. "Almost" means that you don't need to specify the wrapping html, head and body tags for the header content.

I've added headers and footers to my HTML pages, but how can I prevent them from showing up in the browser?
Use @media CSS rules that set 'display: none' or 'display: block' for the header/footer blocks on different media.
Is there a possibility to create pdf documents with more than 72dpi using html2ps?
You may make a page with high-resolution images and set their on-page size using the height and width attributes. HTML2PS does not resample images; it just outputs them to the PDF and provides the scaling factor.
Can I control media size / margins via CSS?
Yes. Use the CSS 3 'size' and 'margin' properties. In addition, html2ps provides the '-html2ps-pixels' property, which allows you to override the 'pixels' value specified in the pipeline configuration code.
##PAGES## directive always displays 1 in batch mode!
Yes, it is a documented feature. ##PAGES## always refers to the number of pages in the file being processed.

API

How could I convert an HTML file from my local drive?
Use the example file samples/sample.simplest.from.file.php as a starting point.
How could I convert HTML code contained in a variable?
Use the example file samples/sample.simplest.from.memory.php as a starting point.
Can I convert a page using some authentication mechanism using the html2ps web interface?
Out-of-the-box, no. Depending on the type of authentication, you may override the fetcher object with a custom one able to bypass authentication. Still, the recommended approach is to use the html2ps API; in this case, you store your HTML code in a PHP variable instead of outputting it to the browser and call the conversion engine directly.
I'm using the API to convert files, and images and / or CSS files seem to be ignored.
Most likely, you're using relative URLs and, at the same time, converting either an HTML string from memory or a local file. In this case the script doesn't know the base URL to use while resolving relative paths, so these URLs are ignored. You have two options in this case:

Fonts. National symbols.

How can I use fonts other than standard (Times, Helvetica and Courier)?
Follow these instructions
Euro symbol is not displayed
First of all, check that you provided correct information on the file encoding to html2ps; encoding vectors containing the euro symbol are 'iso-8859-15', 'windows-1250', 'windows-1251' and 'windows-1252'. Alternatively, you may use UTF-8 or the HTML entities &euro; or &#8364;.
Cyrillic symbols are not displayed in PS output
Install sharatype-fonts package to your Ghostscript; the script is configured to use these fonts out-of-the-box.
Greek symbols with tonos are not displayed in PS output; all other greek symbols rendered normally.
Chinese (Japanese, Arabic, etc...) symbols do not show on the page. What do I need to do?
First of all, you'll need fonts containing these symbols; in most cases the default fonts bundled with Ghostscript or PDFLIB contain only Western/Central European symbols. After you find fonts containing the characters you need, install them in place of the standard fonts, following the answer to the question «How can I use fonts other than standard (Times, Helvetica and Courier)?»
I've installed/updated True-Type fonts, but it seems that ... (some mysterious problem) ... happens
First of all, clean the "parsed fonts" cache in the 'fpdf/font' subdirectory (just remove all files). This solves most font-related issues.

Interactive forms

When I try to submit the form, Acrobat responds with a "Cannot handle content type: …" message.
Every time I submit the form, I get a strange-looking result page in my browser.
PDF interactive forms are not like HTML forms; you MUST modify the server-side script so that it returns an FDF file instead of normal HTML in this case. See PDF Reference, v 1.6, page 1026, par. 134 for further information. Also, you may check for a brief outline of PDF forms.

Frames

I have a page with frames containing a lot of text, but generated PDF contains only 1 page. Where's my content?
As the produced PDFs are static, there is no way to scroll frame content. Thus, only the initially visible frame content will be available. It is a feature.
Some links inside the frames are not active even when I enable the "Render Hyperlinks" option.
As stated previously, the script may render only a part of the frame content. So, if the rendered part contains a local hyperlink pointing to a non-rendered part, this hyperlink will be disabled, as it points nowhere.

Miscellaneous

Is it possible to reduce the size of the output PDF file?
Yes. By default HTML2PS embeds the fonts used during conversion in the generated PDF. You may disable this by setting the 'embed' attribute to '0' for these fonts in html2ps.config. Note that this will probably cause problems with national symbols on older versions of Acrobat Reader; it also assumes that users have all fonts used in the PDF files on their machines. Also, refer to the description of the FONT_EMBEDDING_MODE configuration constant.
Is it possible to use a custom file name when outputting the pdf file? As of right now, the filename is a long ugly string and doesn't look very clean. Can I pass the script a variable such as &saveas=thispdffile.pdf and use that for the file name when saving in the browser?
Yes. If you're using the web interface (the html2ps.php file from the distribution), you would need to replace $g_baseurl with $_REQUEST['saveas'] in the following piece of code near the end of html2ps.php:
switch ($g_config['output']) {
case 0:
   $pipeline->destination = new DestinationBrowser($g_baseurl);
   break;
case 1:
   $pipeline->destination = new DestinationDownload($g_baseurl);
   break;
case 2:
   $pipeline->destination = new DestinationFile($g_baseurl);
   break;
}; 
Also please note that by default the output file name can contain only Latin letters, digits, '-' and '_' signs; any other symbols will be replaced by underscores. You may change this behavior by modifying the filename_escape function in destination._interface.class.php.
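The replacement rule described above can be approximated with a single regular expression. This is a sketch of the documented behavior, not the actual filename_escape code; sketch_filename_escape() and the 'output' fallback name are hypothetical.

```php
<?php
// Hypothetical approximation of the documented rule: keep Latin letters,
// digits, '-' and '_'; replace every other character with an underscore.
function sketch_filename_escape($name) {
    return preg_replace('/[^A-Za-z0-9_-]/', '_', $name);
}

// Read the requested name from the query string, with a fallback
// ('output' is an arbitrary placeholder chosen for this sketch).
$requested = isset($_REQUEST['saveas']) ? $_REQUEST['saveas'] : 'output';

echo sketch_filename_escape($requested), "\n";
echo sketch_filename_escape('this pdf file!'), "\n"; // this_pdf_file_
```

Under this rule a dot is replaced as well, so a value such as thispdffile.pdf would come out as thispdffile_pdf.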

If you're using the API, refer to the DestinationBrowser/DestinationDownload/DestinationFile class documentation.