PDF ISO-32000 has a note in clause 12.6.2 that is just dying to get the PDF/D Best Practices treatment:
Thursday, February 26, 2009
Wednesday, February 25, 2009
Inspired by the Isartor test set for validating PDF/A compliance, we are working on a similarly styled set of negative tests for basic XMP compliance (PDF/A XMP TechNotes).
While it is clear that this work needs to be done, nobody appears to be tackling it. PDF/A (ISO 19005-1) is now heading into its third year, so we're attempting to fill this gap.
While each vendor will obviously implement their own XMP validator for PDF/A validation and conversion, there are some areas where we can easily collaborate. We believe that it is in all of our interests to openly share an RDF- and PDF/A-compliant XMP implementation of the pre-defined schemas required to validate PDF/A files.
Monday, February 23, 2009
As promised, we've posted more tools for standardized compliance testing.
Friday, February 20, 2009
The RDF specification clearly uses "Bag", "Alt" and "Seq" for the names of these container elements. This is a requirement for the names of these array container elements:
In building our new and improved validator we decided to use the pdfaExtension schema (and friends) to define all the schemas we are validating including all the pre-defined schemas. This process of eating our own dogfood has exposed numerous holes in both the XMP Specification and the PDF/A Specification.
Sunday, February 15, 2009
I've been working on building a better XMP validator. My idea was to define all the pre-defined schemas as pdfaExtension schemas and pre-load them into my validator. With this approach, I only need one validator (one that validates pdfaExtension schemas) to validate all the pre-defined schemas as well as any user-defined schemas.
- status: Closed Choice of Text - required|prohibited|restricted|recommended|ignored
- constraint: Text - regular expression for constraining simple literal fields for PDF/A compliance.
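To make the single-validator idea concrete, here is a minimal sketch of how property definitions carrying a status and a constraint might drive validation. The `PropertyDef` class, the `validate` function and the `ex:` property names are hypothetical, and the way each status is enforced is an assumption (only required and prohibited are acted on here); constraints are applied with a full regular-expression match.

```python
import re
from dataclasses import dataclass
from typing import Optional

# The closed choice of status values listed above.
STATUSES = {"required", "prohibited", "restricted", "recommended", "ignored"}


@dataclass
class PropertyDef:
    """Hypothetical in-memory form of one schema property definition."""
    name: str
    status: str = "ignored"
    constraint: Optional[str] = None  # regular expression for simple literal values

    def __post_init__(self):
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")


def validate(schema, metadata):
    """Return a list of problems found in metadata (a dict of name -> literal)."""
    problems = []
    for prop in schema:
        value = metadata.get(prop.name)
        if value is None:
            # Assumption: only 'required' makes absence an error.
            if prop.status == "required":
                problems.append(f"{prop.name}: required property is missing")
            continue
        if prop.status == "prohibited":
            problems.append(f"{prop.name}: prohibited property is present")
            continue
        # The constraint must match the whole literal, not just a prefix.
        if prop.constraint is not None and re.fullmatch(prop.constraint, value) is None:
            problems.append(f"{prop.name}: value {value!r} violates constraint")
    return problems
```

Because pre-defined and user-defined schemas would both be expressed as lists of such definitions, the same `validate` call would serve for either.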
Monday, February 9, 2009
Earlier I discussed Numbers in a general post about improving PDF for easier parsing.
Sunday, February 8, 2009
Resources for a Page's Contents entry are defined in the Resources dictionary of that Page, or inherited from one of the ancestor nodes of that Page in the page node tree.
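The lookup described above can be sketched as a walk up the Parent chain. This is only an illustration: page tree nodes are modelled as plain Python dicts with indirect references already resolved, and `effective_resources` is a hypothetical helper name. The nearest Resources dictionary is returned whole; nothing is merged across ancestors.

```python
def effective_resources(page):
    """Return the Resources dictionary that applies to a page, or None.

    Starts at the page itself and follows Parent links until a node
    carrying a Resources entry is found.
    """
    node = page
    seen = set()
    while node is not None:
        # Guard against a malformed tree whose Parent links form a loop.
        if id(node) in seen:
            raise ValueError("cycle in page tree")
        seen.add(id(node))
        if "Resources" in node:
            return node["Resources"]
        node = node.get("Parent")
    return None


# A page without its own Resources inherits from its grandparent.
root = {"Type": "Pages", "Resources": {"Font": {"F1": "font object"}}}
kids = {"Type": "Pages", "Parent": root}
page = {"Type": "Page", "Parent": kids}
print(effective_resources(page))
```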
The last time a content operator was added to PDF was with PDF 1.2
Thursday, February 5, 2009
Despite being such an enormous specification, PDF ISO 32000-1 still has some holes in it. Each time I encounter such a scenario I'm going to write about it and start to lock down behavior for PDF/D. Please correct me if I miss something and if the scenario I'm describing is actually defined.
- indirect references to undefined objects
- empty indirect objects
Tuesday, February 3, 2009
That didn't take long! I've been urged to compromise on legacy features already.
- only a single xref table (no Prev field in Trailer)
- no hybrid files (no XRefStm in trailer)
- no deleted objects (no f type in the xref table except for the first entry)
- generation numbers always zero
- only one section (implies consecutive object numbers starting at 1)
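As a rough illustration, the restrictions above could be checked against a classic cross-reference table like this. The function name `check_simple_xref` and its inputs are hypothetical: it takes the text from the `xref` keyword up to the `trailer` keyword, plus the set of key names found in the trailer dictionary. It assumes object 0 stays as the single free entry (with whatever generation number it carries) and that every other entry must be in use with generation zero.

```python
import re

# One 20-byte table entry: 10-digit offset, 5-digit generation, n or f.
ENTRY = re.compile(r"^(\d{10}) (\d{5}) ([nf])\s*$")
# Subsection header: first object number and entry count.
SUBSECTION = re.compile(r"^(\d+) (\d+)\s*$")


def check_simple_xref(xref_text, trailer_keys):
    """Return a list of violations of the restricted xref layout."""
    problems = []
    if "Prev" in trailer_keys:
        problems.append("trailer has Prev: more than one xref table")
    if "XRefStm" in trailer_keys:
        problems.append("trailer has XRefStm: hybrid file")

    lines = [ln for ln in xref_text.splitlines() if ln.strip()]
    if not lines or lines[0].strip() != "xref":
        return problems + ["missing xref keyword"]

    headers = 0
    index = 0  # object number of the next entry
    for line in lines[1:]:
        sub = SUBSECTION.match(line)
        if sub:
            headers += 1
            index = int(sub.group(1))
            if headers == 1 and index != 0:
                problems.append("first subsection does not start at object 0")
            continue
        entry = ENTRY.match(line)
        if entry is None:
            problems.append(f"malformed line: {line!r}")
            continue
        gen, kind = int(entry.group(2)), entry.group(3)
        if index == 0:
            if kind != "f":
                problems.append("object 0 must be the free entry")
        else:
            if kind == "f":
                problems.append(f"object {index} is deleted")
            if gen != 0:
                problems.append(f"object {index} has generation {gen}")
        index += 1

    if headers != 1:
        problems.append(f"expected exactly one subsection, found {headers}")
    return problems
```

An empty result means the table fits the restricted layout; each string in a non-empty result names one rule that was broken.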