Java PDF Blog

PDF solutions for big and small customers

Java and PDF development - our personal experiences and discoveries

Understanding the PDF file format - Text, shapes and images

Posted by Mark Stephens on Wed, May 26, 2010 @ 02:50 AM

Recently I have been looking at an issue for a potential client which required generating different views of a page. This is interesting because it lets me show you the internal workings of the PDF file format rather elegantly. It is also an increasingly common request from our clients these days, as they build web applications to display PDFs and need to separate out text and images.

What is in a PDF

A PDF can contain bitmapped images, vector graphics and text (which can be vector or bitmapped depending on the font used). Sometimes you may be surprised at what you find. While a PDF may look like it contains text, the lettering may actually be part of an image (as in a scan) or shapes (where the text was converted to paths). Here is a rather nice PDF page showing what is going on...

Here is the complete page [image], which consists of images [image], text and vector graphics [image], and just the text [image]. (The white text is invisible on a default white background.)

The white text in particular illustrates how dependent on each other the layers are - we could generate it as a transparent image and add a coloured background if we wanted to highlight the text layer on its own. 
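As a rough illustration of that last idea, here is a minimal sketch in plain Java 2D. It assumes you already have the text layer as a transparent ARGB BufferedImage; the class and method names are made up for this example and are not part of JPedal, and a filled rectangle stands in for the white text.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration only (not a JPedal class).
public class TextLayerHighlight {

    /** Paints a transparent ARGB layer on top of a solid background colour. */
    public static BufferedImage overBackground(BufferedImage layer, Color background) {
        BufferedImage out = new BufferedImage(
                layer.getWidth(), layer.getHeight(), BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = out.createGraphics();
        try {
            // Fill the whole canvas with the highlight colour first...
            g.setColor(background);
            g.fillRect(0, 0, out.getWidth(), out.getHeight());
            // ...then draw the layer over it; transparent pixels let the colour show through.
            g.drawImage(layer, 0, 0, null);
        } finally {
            g.dispose();
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for a rendered text layer: fully transparent with a white block as "text".
        BufferedImage layer = new BufferedImage(200, 50, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = layer.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(10, 10, 120, 20);
        g.dispose();

        BufferedImage highlighted = overBackground(layer, Color.BLUE);
        System.out.println("Background pixel: " + Integer.toHexString(highlighted.getRGB(0, 0)));
        System.out.println("Text pixel:       " + Integer.toHexString(highlighted.getRGB(20, 15)));
    }
}
```

On a white background the white block would vanish, just like the white text above; on a coloured background it becomes visible.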

Creating your own separations

If you would like to create your own separations, there is a new support page explaining how to use the feature in our JPedal PDF library - you will need version 4.20 or later. 

 

Convert PDF to grayscale or black and white

Posted by Mark Stephens on Fri, Mar 12, 2010 @ 02:16 AM

PDFs are designed to use full colour with transparency. We generally use ARGB to provide the best quality display of PDF content and we are always trying to improve our quality of output. So it was something of a culture shock to have several inquiries about making the output WORSE. They wanted the PDF displayed or printed in grayscale or black and white...

Taking a PDF and making the output grayscale or black and white does result in some image deterioration, which varies from file to file. On a page of black and white text it is negligible, whereas it can have a big effect on images. Here is an example.

The original... [image: original PDF]

...as grayscale [image: grayscale PDF]

and in black and white. [image: black and white PDF]

As you can see in this example, grayscale works well, whereas black and white noticeably affects the appearance of the image.

The advantages of having a PDF in grayscale or black and white are that it allows for a smaller image size and allows its use in faxes (which do not seem to be as extinct as I had believed). It also means that people who do lots of printing can use the cheaper print modes.

There are several ways to convert a PDF to grayscale or black and white. If you are using JPedal and want to use the feature we have added, there is a new tutorial here.
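One of those ways, if you already have the page rendered as a BufferedImage, is simply to redraw it into an image of a different type using standard Java 2D. The sketch below is not the JPedal feature mentioned above; the class and method names are hypothetical, and it assumes the page image comes from whatever renderer you use.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration only (not a JPedal class).
public class GrayscaleSketch {

    /** Redraws the source into a new image of the given BufferedImage type. */
    public static BufferedImage redraw(BufferedImage src, int imageType) {
        BufferedImage out = new BufferedImage(src.getWidth(), src.getHeight(), imageType);
        Graphics2D g = out.createGraphics();
        try {
            // Start from white so transparent areas of an ARGB page do not turn black.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, out.getWidth(), out.getHeight());
            // Java 2D converts the colours to the target image type as it draws.
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return out;
    }

    /** 8-bit grayscale: one byte per pixel. */
    public static BufferedImage toGrayscale(BufferedImage page) {
        return redraw(page, BufferedImage.TYPE_BYTE_GRAY);
    }

    /** 1-bit black and white: every pixel ends up either black or white. */
    public static BufferedImage toBlackAndWhite(BufferedImage page) {
        return redraw(page, BufferedImage.TYPE_BYTE_BINARY);
    }
}
```

The grayscale version keeps 256 shades, which is why photographs survive it well, while the 1-bit version has to force every pixel to black or white, which is where the visible deterioration in images comes from.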
