Java PDF Blog

Java and PDF development - our personal experiences and discoveries

Why the TrueType hinting patent expiration matters

Posted by Sam Howard on Thu, Aug 05, 2010 @ 12:12 PM

A while back we put a lot of effort into implementing some of the hinting technology in the TrueType specification. This system runs small programs in a stack-based environment to manipulate the set of points that define the contours of a glyph.
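To give a feel for what "small programs in a stack-based environment" means, here is a minimal sketch of a toy stack machine that moves a glyph point. The opcodes (`PUSH`, `ADD`, `SHIFT_POINT_Y`) and the class name are invented for illustration; they are not real TrueType instructions and this is not our renderer's code.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Toy stack machine. The opcodes are hypothetical, NOT real TrueType opcodes. */
final class ToyHintingVm {
    static final int PUSH = 0;          // push the next value in the program
    static final int ADD = 1;           // pop b, pop a, push a + b
    static final int SHIFT_POINT_Y = 2; // pop distance, pop point index, move that point vertically

    /** Runs the program against the glyph's y coordinates, modifying them in place. */
    static void run(int[] program, int[] pointsY) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (pc < program.length) {
            int op = program[pc++];
            switch (op) {
                case PUSH:
                    stack.push(program[pc++]);
                    break;
                case ADD: {
                    int b = stack.pop();
                    int a = stack.pop();
                    stack.push(a + b);
                    break;
                }
                case SHIFT_POINT_Y: {
                    int distance = stack.pop();
                    int index = stack.pop();
                    pointsY[index] += distance;
                    break;
                }
                default:
                    throw new IllegalArgumentException("Unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        int[] pointsY = {0, 100, 250};
        // Push point index 2, compute 4 + 2 on the stack, then shift point 2 by the result.
        int[] program = {PUSH, 2, PUSH, 4, PUSH, 2, ADD, SHIFT_POINT_Y};
        run(program, pointsY);
        System.out.println(Arrays.toString(pointsY));
    }
}
```

The real instruction set is far larger and carries a lot of extra state, but the basic shape is the same: values go onto a stack, and instructions pop them to decide which points to move and by how much.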

Up until recently, some of the instructions available in the language were covered by a set of patents belonging to Apple, meaning anyone wishing to actually execute those particular instructions would need a licence from Apple. Unfortunately the patented instructions were the most frequently used instructions for moving points, meaning that simply executing the rest of the instructions does more harm than good. Now that those patents have expired, this is no longer an issue.

So what does this actually mean? These patents have stood for 20 years and the world hasn't fallen apart.

Well, that's true. New font technologies have been developed with different hinting mechanisms, tools for rendering TrueType have created their own automatic hinting algorithms, and antialiasing technologies have vastly improved. To some extent this has reduced the need for the original hinting instructions, but the fact remains that a font hand-hinted by an expert will always look clearer than any automatically hinted rendering.

However, what really clinches it for me is the fact that numerous Chinese fonts actually construct their glyphs by defining a range of glyphs as simple strokes, then using them in composite glyphs which are heavily manipulated by hinting instructions in order to form the final characters. There is no way to work around this, because performing the relevant shifts and alignments uses the methods described by the patents. Now that the patents have expired there is no need to buy a licence from Apple, meaning many products can now add (or enable) the functionality used to display these fonts, including open source packages like FreeType, which is used for font rendering in many Linux distributions.

Individual stroke component glyphs

Unhinted composite glyph outline

Final hinted glyph

There's a lot of discussion on this topic over at Slashdot, and some more details about what was patented can be found at the home of FreeType.


Software Development: Are we listening carefully?

Posted by Sam Howard on Tue, May 25, 2010 @ 01:53 AM

The quality of documentation and tools in an area of software development has an incredible impact on us developers. They both influence how fast we can work, the quality of the software we write, and even how we feel about our work.

I've recently been working on adding some support for hinting into our TrueType font renderer. I was delighted to find a number of excellent tools for working with TrueType fonts, such as Microsoft's font tools and FontForge, but unfortunately found the TrueType documentation rather lacking.

Instructed font hinting, like that used in TrueType fonts, is a complicated subject. Each glyph contains a small (ish!) program written in TrueType byte code, which is run in order to manipulate the glyph's points. This manipulation was initially designed primarily for grid fitting, a process which improved the appearance of text on screen before anti-aliasing became a feasible option, but it is now also used extensively in foreign language fonts to move and change the shape of the components of a compound glyph.
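As a rough illustration of grid fitting, the sketch below snaps a coordinate to the nearest whole pixel. It assumes coordinates are held in 26.6 fixed point (64 units per pixel), which is the format TrueType hinting works in; the class and method names are made up for this example, and real glyph programs choose between several rounding modes rather than always rounding to the nearest pixel.

```java
/** Hypothetical helper showing round-to-grid on 26.6 fixed-point coordinates. */
final class GridFit {
    static final int ONE_PIXEL = 64; // 26.6 fixed point: 6 fractional bits

    /** Converts a pixel value to 26.6 fixed point. */
    static int toFixed(double pixels) {
        return (int) Math.round(pixels * ONE_PIXEL);
    }

    /** Converts a 26.6 fixed-point value back to pixels. */
    static double toPixels(int fixed) {
        return fixed / (double) ONE_PIXEL;
    }

    /** Snaps a 26.6 coordinate to the nearest whole pixel by adding half a pixel and clearing the fraction bits. */
    static int roundToGrid(int fixed) {
        return (fixed + ONE_PIXEL / 2) & ~(ONE_PIXEL - 1);
    }

    public static void main(String[] args) {
        // Two edges of a vertical stem that fall between pixel boundaries.
        for (double edge : new double[] {1.3, 2.6}) {
            int fitted = roundToGrid(toFixed(edge));
            System.out.println(edge + " px -> " + toPixels(fitted) + " px");
        }
    }
}
```

Snapping stem edges like this is what keeps thin strokes from smearing across two pixel columns at small sizes.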

 

Chinese characters before and after their glyph programs are run.


As can be expected with such a complex field, mistakes and ambiguities have crept into the documentation, and even the tools, while generally very good, have some flaws.

TrueType was initially developed by Apple in response to Adobe's rather restrictively licensed Type 1 font technology, but was later licensed by Microsoft, eventually making it the de facto standard on desktop computers. As a result, there are two primary sources of documentation – Apple and Microsoft. While they supposedly define exactly the same system, there are occasional direct contradictions in what they say! In fact, in Apple's guide the definition and example given for one of the most important byte codes are completely wrong.

This wouldn't be surprising in a new document – as I said, it's a complex topic – but these guides were written in the mid-nineties! I can't be the first to have found these mistakes, but since in both cases there is no obvious way of contacting the authors, they've stayed incorrect for almost 15 years.

This, to me, highlights the need for a clear and direct line of communication between those writing specifications and those who use them – something we've been trying hard to achieve. Anything unclear? Let us know! We're here to help.


Embedded PDF Truetype fonts are always MAC encoded unless they are not

Posted by Mark Stephens on Wed, Jan 13, 2010 @ 06:32 AM

As the PDF file specification has evolved, it has developed some 'quirks' - areas where it does not always work as documented. One of the most annoying of these is Truetype font encoding. It is one of those features which is broken but which it is now too late to fix.

Inside a PDF file, all text data is stored as numeric values, and each value is decoded into the actual glyph (e.g. the value 65 is converted into the text value 'A'). Because the PDF file format is 'multiplatform', there are several possible standard encodings to use for this conversion (e.g. WinAnsi for Windows, and MacRoman for standard MAC values). This is because Windows and MAC originally evolved with different character sets and values. Most of the time the values are identical (A is value 65 in both MAC and WIN encoding) but certain accented characters have different values. So value 132 is Ntilde (letter N with a wavy line above) in MAC encoding but quotedblbase (double quotes at the bottom of the line) on Windows. So long as we know which translation table to use, this is not a problem of course....
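A small sketch of that lookup, using just a handful of entries. The values 65 and 132 are the ones discussed above; the entry for 233 (eacute in WinAnsi, Egrave in MacRoman) is an extra example added here, and the class and method names are hypothetical. The real tables cover the full code range.

```java
import java.util.Map;

/** Tiny excerpts of the two standard encodings, mapping character codes to glyph names. */
final class EncodingTables {
    static final Map<Integer, String> WIN_ANSI =
            Map.of(65, "A", 132, "quotedblbase", 233, "eacute");
    static final Map<Integer, String> MAC_ROMAN =
            Map.of(65, "A", 132, "Ntilde", 233, "Egrave");

    /** Looks up the glyph name for a code, or ".notdef" if this excerpt has no entry for it. */
    static String glyphName(int code, boolean macEncoded) {
        Map<Integer, String> table = macEncoded ? MAC_ROMAN : WIN_ANSI;
        return table.getOrDefault(code, ".notdef");
    }

    public static void main(String[] args) {
        for (int code : new int[] {65, 132, 233}) {
            System.out.println(code + ": MAC=" + glyphName(code, true)
                    + " WIN=" + glyphName(code, false));
        }
    }
}
```

Picking the wrong table is harmless for 65 but turns every Ntilde into a low double quote, which is exactly the kind of breakage described in this post.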

The issue comes with embedded Truetype fonts, because PDF files will always list them as MAC encoded (which is what the specification says they should be) even when they are actually WIN encoded. Using the wrong look-up table does not matter for most values (as the results are identical) but it does break certain letters.

So what you need to do is to figure out for yourself whether the font is actually WIN or MAC encoded, and ignore the setting in the PDF file. There is (of course) no documented way to do this, and several values can legitimately appear with different meanings in either encoding...

What we did was to develop some heuristics to work it out, which we continually test against known files and tweak as needed. They look at the actual font values present, see whether WIN or MAC encoding gives a 'better fit', and check certain key values. They also need to factor in the fact that the font may be subsetted, so only a selection of values will be present.
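One way such a 'better fit' test could look is sketched below. This is not our actual heuristic: it assumes you already have the set of character codes used in the text and the set of glyph names present in the (possibly subsetted) font, and it simply counts how many codes resolve to a glyph the font really contains under each encoding. The tables are tiny excerpts and all names are hypothetical.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical 'better fit' scoring between WIN and MAC encoding. */
final class EncodingGuess {
    enum Encoding { WIN, MAC }

    // Excerpts only; real tables cover the whole code range.
    static final Map<Integer, String> WIN_TABLE =
            Map.of(65, "A", 132, "quotedblbase", 233, "eacute");
    static final Map<Integer, String> MAC_TABLE =
            Map.of(65, "A", 132, "Ntilde", 233, "Egrave");

    /** Counts the used codes whose decoded glyph name actually exists in the font. */
    static int score(Map<Integer, String> table, Set<Integer> usedCodes, Set<String> glyphsInFont) {
        int hits = 0;
        for (int code : usedCodes) {
            String name = table.get(code);
            if (name != null && glyphsInFont.contains(name)) {
                hits++;
            }
        }
        return hits;
    }

    /** Picks the better-scoring encoding, falling back to the PDF's declared value on a tie. */
    static Encoding guess(Set<Integer> usedCodes, Set<String> glyphsInFont, Encoding declared) {
        int win = score(WIN_TABLE, usedCodes, glyphsInFont);
        int mac = score(MAC_TABLE, usedCodes, glyphsInFont);
        if (win == mac) {
            return declared;
        }
        return win > mac ? Encoding.WIN : Encoding.MAC;
    }

    public static void main(String[] args) {
        Set<Integer> usedCodes = Set.of(65, 132, 233);
        Set<String> glyphsInFont = Set.of("A", "quotedblbase", "eacute"); // subsetted font
        System.out.println(guess(usedCodes, glyphsInFont, Encoding.MAC));
    }
}
```

Because a subsetted font only contains the glyphs the document uses, a code that decodes to a missing glyph is a strong hint that the wrong table is being applied. Codes shared by both encodings, like 65, score equally and so do not influence the result.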

So if you get some odd characters when working with PDF files containing Truetype fonts, this may well be the reason. And if you come across a file displayed in our PDF viewer which has some odd characters, please do send us the file so we can continue to improve our code.

 
