HOCRTextBox

* I believe computing the textBox is rather expensive. The results should be
  cached for better performance.
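
A minimal sketch of the caching idea, not taken from the actual code base:
the box is computed on first access and stored in a mutable member. The
names Rect, CachedBox, computeBox and computeCount are hypothetical
stand-ins, and the sketch assumes C++17 (std::optional) and single-threaded
access.

```cpp
#include <optional>

// Hypothetical stand-in for the bounding box type; not the real HOCRTextBox API.
struct Rect {
    int x0 = 0, y0 = 0, x1 = 0, y1 = 0;
};

class CachedBox {
public:
    explicit CachedBox(Rect raw) : m_raw(raw) {}

    // Returns the box, running the expensive computation only on the first call.
    const Rect &box() const {
        if (!m_cache.has_value()) {
            m_cache = computeBox();
        }
        return *m_cache;
    }

    // Number of times the expensive computation actually ran.
    int computeCount() const { return m_computeCount; }

private:
    // Placeholder for the expensive computation (assumption: it is pure,
    // so its result can safely be reused).
    Rect computeBox() const {
        ++m_computeCount;
        return m_raw;
    }

    Rect m_raw;
    mutable std::optional<Rect> m_cache;   // empty until box() is first called
    mutable int m_computeCount = 0;
};
```

If the underlying data can change after construction, the cache would have
to be reset in every mutating method. If the object is shared between
threads, access to the cache would need a lock.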


PDFAWriter

* In PDFAWriter::addPages(const JBIG2Document &, …) we should consider the case
  that the PDFDataChunk might be empty when the JBIG2 was generated using the
  generic encoder.  For performance reasons, we should avoid generating empty
  PDF objects.

* The generated PDF code can probably be made a little more compact, e.g.,
  by avoiding unnecessary spaces in dictionary definitions.

* We should avoid constructing identical objects twice.  This might happen,
  e.g., when a large document contains several empty pages.

* Compete with Google's "pdfsizeopt" for the shortest possible PDF files.
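
The items on empty and duplicate objects could be handled in one place. The
following is only a sketch under assumptions: ObjectTable and its add()
method are hypothetical and do not mirror the real PDFAWriter internals.
Empty payloads produce no object at all, and a payload that was seen before
returns the number of the existing object instead of creating a new one.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical object table; it does not mirror the real PDFAWriter internals.
class ObjectTable {
public:
    // Returns the 1-based object number for the payload, or std::nullopt
    // if the payload is empty and therefore no object was created.
    std::optional<std::size_t> add(const std::string &payload) {
        if (payload.empty()) {
            return std::nullopt;          // never emit an empty object
        }
        auto it = m_index.find(payload);
        if (it != m_index.end()) {
            return it->second;            // identical object already exists
        }
        m_objects.push_back(payload);
        const std::size_t number = m_objects.size();
        m_index.emplace(payload, number);
        return number;
    }

    // Number of objects that were actually stored.
    std::size_t size() const { return m_objects.size(); }

private:
    std::vector<std::string> m_objects;           // payloads in object order
    std::map<std::string, std::size_t> m_index;   // payload -> object number
};
```

Keying the map on the full payload keeps a second copy of every object in
memory. For large streams, a map keyed on a hash of the payload, with a full
comparison on collision, would be cheaper. Which kinds of PDF objects may
actually be shared between pages would still need to be checked. Content
streams and image data seem the most likely candidates.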
