Thursday, February 5, 2009

Defining the Undefined

Despite being an enormous specification, ISO 32000-1 (PDF) still has some holes in it. Each time I encounter one, I'm going to write about it and start to lock down the behavior for PDF/D. Please correct me if I've missed something and the scenario I'm describing is actually defined.


Empty Object
The specification does not mention the meaning of empty indirect objects like:

10 0 obj
endobj

I've tried to read between the lines to fathom the meaning of this emptiness, but it simply is not defined. An obvious choice would be to treat such an object as the null object. I believe Acrobat Reader does this.
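As a rough illustration of the "treat it as null" choice, here is a minimal Python sketch. It is not how Acrobat or any real reader works; the names `read_indirect_objects`, `INDIRECT_OBJ` and `PDF_WHITESPACE` are made up for this example, and the scan deliberately ignores streams, comments and strings that might contain the word endobj.

```python
import re

# White-space characters as listed in ISO 32000-1: NUL, TAB, LF, FF, CR, SPACE.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "

# Hypothetical, simplified pattern: "<objnum> <gen> obj ... endobj".
# The body is captured lazily up to the first "endobj" keyword.
INDIRECT_OBJ = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj\b", re.DOTALL)


def read_indirect_objects(data: bytes) -> dict:
    """Map (object number, generation) to the raw body bytes.

    Assumption for this sketch: an empty body is normalised to b"null",
    which is one possible reading of the undefined case, not spec text.
    """
    objects = {}
    for match in INDIRECT_OBJ.finditer(data):
        key = (int(match.group(1)), int(match.group(2)))
        body = match.group(3).strip(PDF_WHITESPACE)
        objects[key] = body if body else b"null"
    return objects
```

With this normalisation, an empty object such as 10 0 above and an explicit null object end up indistinguishable to the rest of the reader.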

Variations on this theme that are defined include an indirect object containing an empty dictionary or an indirect object that is simply the null object:

11 0 obj
<<>>
endobj

12 0 obj
null
endobj

In addition, an indirect reference to an undefined object is treated as a reference to the null object (7.3.10), and a dictionary entry whose value is null shall be treated the same as if the entry does not exist (7.3.7).
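Those two defined rules are easy to express in code. The sketch below assumes a reader that has already parsed objects into Python values, with `None` standing in for the PDF null object; `resolve`, `lookup` and the `Ref` alias are hypothetical names used only for illustration.

```python
from typing import Any, Dict, Tuple

Ref = Tuple[int, int]  # (object number, generation number)


def resolve(objects: Dict[Ref, Any], ref: Ref) -> Any:
    """Return the referenced object; an undefined object resolves to null (None)."""
    return objects.get(ref)


def lookup(entries: Dict[str, Any], key: str, default: Any = None) -> Any:
    """Dictionary lookup where an explicit null value behaves like a missing entry."""
    value = entries.get(key)
    return default if value is None else value
```

A caller cannot tell a missing /Key from /Key null through `lookup`, which is the behavior 7.3.7 asks for.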

PDF/D
To simplify the specification and reduce unnecessary bloat, the only null object shall be:

null

Illegal in PDF/D:
  • indirect references to undefined objects
  • empty indirect objects

Best Practices in PDF/D:
  • omit an entry from a dictionary rather than include an entry with a null value
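The PDF/D rules above could be checked mechanically. The following is only a sketch under stated assumptions: objects are given as a mapping from (object number, generation) to raw body bytes, the references found in the file are supplied as a separate list, and `pdfd_violations` and `omit_null_entries` are hypothetical helper names rather than part of any existing tool.

```python
from typing import Any, Dict, Iterable, List, Tuple

Ref = Tuple[int, int]  # (object number, generation number)

# White-space characters as listed in ISO 32000-1: NUL, TAB, LF, FF, CR, SPACE.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "


def pdfd_violations(objects: Dict[Ref, bytes], references: Iterable[Ref]) -> List[str]:
    """List empty indirect objects and references to undefined objects."""
    problems = []
    for (num, gen), body in sorted(objects.items()):
        if not body.strip(PDF_WHITESPACE):
            problems.append(f"empty indirect object {num} {gen}")
    for num, gen in references:
        if (num, gen) not in objects:
            problems.append(f"reference to undefined object {num} {gen} R")
    return problems


def omit_null_entries(entries: Dict[str, Any]) -> Dict[str, Any]:
    """Best practice when writing: drop entries whose value is null (None)."""
    return {key: value for key, value in entries.items() if value is not None}
```

A writer would call `omit_null_entries` before serialising a dictionary, while a validator would reject any file for which `pdfd_violations` returns a non-empty list.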

"Zero" Object
Another illegal behavior I've seen in customer files is indirect references that look like this:

... 0 0 R ...

Sometimes I've also seen this object actually defined like:

0 0 obj
...
endobj

ISO 32000-1 clearly states that this is illegal, which means it is obviously illegal for PDF/D too. In 7.3.10 the object number is defined as a positive integer. Last time I checked, 0 is not one of the positive integers.
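Files with this problem can be spotted with a crude scan. The Python sketch below is an assumption-laden illustration: `find_zero_objects`, `REFERENCE` and `DEFINITION` are invented names, and the regular expressions look at raw bytes without tokenizing, so matches inside strings or uncompressed streams would be false positives.

```python
import re
from typing import List, Tuple

# Simplified patterns for "<objnum> <gen> R" and "<objnum> <gen> obj".
# The lookbehind stops a match from starting in the middle of a number.
REFERENCE = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+R\b")
DEFINITION = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")


def find_zero_objects(data: bytes) -> List[Tuple[str, int]]:
    """Return (kind, byte offset) for each reference to or definition of object 0."""
    hits = []
    for kind, pattern in (("reference", REFERENCE), ("definition", DEFINITION)):
        for match in pattern.finditer(data):
            if int(match.group(1)) == 0:  # object numbers must be positive
                hits.append((kind, match.start()))
    return hits
```

An empty result means the scan found no use of object number 0; it says nothing about whether the rest of the file is valid.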

4 comments:

  1. Good catch! I've added this to my list of corrigendum items for 32000 - but please also submit it yourself via the standard ISO process.

    Leonard Rosenthol
    PDF Standards Architect
    Adobe Systems

  2. FWIW: Empty Objects are invalid according to ISO 32000 since there is nothing in there...and Acrobat will treat it as if there is no object present.

  3. Is there a clause in ISO 32000 that specifies this? I would expect something explicit in 7.3.10 or at least a "Note".

    Thanks,
    Michael
