PDF ISO-32000 has a note in clause 12.6.2 that is just dying to get the PDF/D Best Practices treatment:
Thursday, February 26, 2009
Anomalous Situations - Best Practices
Posted by Michael Cartwright at 2:31 PM, 0 comments
Labels: ISO 32000, PDF, PDF/D, Undefined Behavior
Preferred prefix for Colorant Basic Value Type

Posted by Michael Cartwright at 10:45 AM, 0 comments
Labels: XMP
Wednesday, February 25, 2009
Open Source PDF/A RDF Schemas
Inspired by the Isartor test set for validating PDF/A compliance, we are working on a similarly styled set of negative tests for basic XMP compliance (PDF/A XMP TechNotes).
While it is clear that this work needs to be done, nobody appears to be tackling it. PDF/A-1 (ISO 19005-1) is now heading into its third year, so we're attempting to fill this gap.
While each vendor will obviously implement its own XMP validator for PDF/A validation and conversion, there are some areas where we can easily collaborate. We believe that it is in all of our interests to openly share an RDF- and PDF/A-compliant XMP implementation of the pre-defined schemas required to validate PDF/A files.
Today we released our first version of the PDF/A pre-defined schemas in RDF form. You can find these resources at the PDF/D website.
Posted by Michael Cartwright at 1:30 PM, 0 comments
Monday, February 23, 2009
Isartor Truth
As promised, we've posted more tools for standardized compliance testing.
Posted by Michael Cartwright at 2:28 PM, 0 comments
Labels: Compliance Reports, PDF/A
Friday, February 20, 2009
XMP: bag vs Bag, seq vs Seq
The RDF specification clearly uses "Bag", "Alt" and "Seq" for the names of these container elements. This is a requirement for the names of these array container elements:
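To illustrate the capitalization point, here is a minimal sketch (not taken from any validator mentioned on this blog) that scans an XMP packet for elements in the RDF namespace whose names match a container name only when case is ignored. The function name `find_miscased_containers` and the sample packet are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VALID_CONTAINERS = {"Bag", "Seq", "Alt"}


def find_miscased_containers(xmp_xml: str) -> list[str]:
    """Return RDF-namespace element names that are container names with the wrong case."""
    lowered = {name.lower() for name in VALID_CONTAINERS}
    bad = []
    for element in ET.fromstring(xmp_xml).iter():
        # ElementTree spells qualified tags as "{namespace}local".
        if not element.tag.startswith("{" + RDF_NS + "}"):
            continue
        local = element.tag.split("}", 1)[1]
        if local not in VALID_CONTAINERS and local.lower() in lowered:
            bad.append(local)
    return bad


# Hypothetical packet: dc:subject uses the wrong case, dc:creator the right one.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <dc:subject><rdf:bag><rdf:li>pdf</rdf:li></rdf:bag></dc:subject>
    <dc:creator><rdf:Seq><rdf:li>A. Author</rdf:li></rdf:Seq></dc:creator>
  </rdf:Description>
</rdf:RDF>"""

print(find_miscased_containers(SAMPLE))
```

Because XML names are case-sensitive, a generic RDF parser would not treat `rdf:bag` as a container at all, which is why a validator may want to report it explicitly instead of silently skipping it.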
Posted by Michael Cartwright at 11:56 AM, 1 comment
Labels: XMP
XMP pdfaValidate Schema
In building our new and improved validator, we decided to use the pdfaExtension schema (and friends) to define all the schemas we are validating, including all the pre-defined schemas. This process of eating our own dogfood has exposed numerous holes in both the XMP Specification and the PDF/A Specification.
Posted by Michael Cartwright at 8:51 AM, 0 comments
Labels: XMP
Sunday, February 15, 2009
XMP Validator
I've been working on building a better XMP validator. My idea was to define all the pre-defined schemas as pdfaExtension schemas and pre-load them into my validator. With this approach, I only need one validator (one that validates pdfaExtension schemas) to validate all the pre-defined schemas as well as any user-defined schemas.
- status: Closed Choice of Text - required|prohibited|restricted|recommended|ignored
- constraint: Text - regular expression for constraining simple literal fields for PDF/A compliance.
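A rough sketch of how the two fields above might drive a single generic check. Everything here is an assumption for illustration: the `SCHEMA` layout, the `validate` function, the `ex:legacyField` property, and the reading of each status value (in particular, "recommended" and "restricted" are treated as "optional, but the constraint applies when the property is present").

```python
import re

STATUSES = {"required", "prohibited", "restricted", "recommended", "ignored"}

# Hypothetical schema description; ex:legacyField is an invented property.
SCHEMA = {
    "pdfaid:part": {"status": "required", "constraint": r"1"},
    "pdfaid:conformance": {"status": "required", "constraint": r"[AB]"},
    "ex:legacyField": {"status": "prohibited", "constraint": None},
}


def validate(properties: dict, schema: dict = SCHEMA) -> list[str]:
    """Check simple literal properties against status and constraint fields."""
    problems = []
    for name, rule in schema.items():
        status, pattern = rule["status"], rule["constraint"]
        if status not in STATUSES:
            raise ValueError(f"unknown status {status!r} for {name}")
        value = properties.get(name)
        if value is None:
            if status == "required":
                problems.append(f"{name}: required but missing")
            continue
        if status == "ignored":
            continue
        if status == "prohibited":
            problems.append(f"{name}: prohibited but present")
            continue
        # Assumption: the regular expression must match the whole value.
        if pattern is not None and re.fullmatch(pattern, value) is None:
            problems.append(f"{name}: value {value!r} does not match {pattern!r}")
    return problems


print(validate({"pdfaid:part": "1", "pdfaid:conformance": "C"}))
```

The appeal of this shape is that the pre-defined schemas become data rather than code: adding a schema means adding entries to the table, not writing another validator.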
Posted by Michael Cartwright at 11:22 AM, 0 comments
Labels: XMP
Monday, February 9, 2009
More on Numbers
Earlier I discussed Numbers in a general post about improving PDF for easier parsing.
Posted by Michael Cartwright at 12:10 PM, 2 comments
Sunday, February 8, 2009
Resources
Resources for a Page's Contents entry are defined in the Resources dictionary of that Page or inherited from one of the ancestor nodes of that Page in the page node tree.
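The lookup described above can be sketched as a walk up the Parent chain. This is a simplified model, assuming page tree nodes are plain Python dicts with already-resolved `Parent` links; the function name `effective_resources` is made up for the example.

```python
def effective_resources(page: dict):
    """Return the page's own Resources, or the nearest ancestor's, or None."""
    node = page
    seen = set()
    while node is not None:
        # Guard against a malformed tree whose Parent links form a loop.
        if id(node) in seen:
            raise ValueError("cycle in page tree")
        seen.add(id(node))
        if "Resources" in node:
            return node["Resources"]
        node = node.get("Parent")
    return None


# Hypothetical three-level tree: root -> intermediate -> two pages.
root = {"Type": "Pages", "Resources": {"Font": {"F1": "Helvetica"}}}
middle = {"Type": "Pages", "Parent": root}
page_a = {"Type": "Page", "Parent": middle}
page_b = {"Type": "Page", "Parent": middle, "Resources": {"Font": {"F2": "Courier"}}}

print(effective_resources(page_a))
print(effective_resources(page_b))
```

Note that the nearest dictionary wins as a whole: `page_b` sees only its own `F2`, and nothing from the root is merged in. That is how I read inheritance of page attributes in the specification, and it is part of why the feature is easy to get wrong.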
Posted by Michael Cartwright at 9:03 AM, 1 comment
Labels: Content Streams, Obsolete, PDF/D
BX and EX
The last time a content operator was added to PDF was with PDF 1.2.
Posted by Michael Cartwright at 8:56 AM, 2 comments
Labels: Content Streams, Parser, PDF/D
Thursday, February 5, 2009
Defining the Undefined
Despite being such an enormous specification, PDF ISO 32000-1 still has some holes in it. Each time I encounter such a scenario, I'm going to write about it and start to lock down behavior for PDF/D. Please correct me if I miss something and the scenario I'm describing is actually defined.
- indirect references to undefined objects
- empty indirect objects
Posted by Michael Cartwright at 2:16 PM, 4 comments
Labels: ISO 32000, PDF, PDF/D, Undefined Behavior
Tuesday, February 3, 2009
XRef stream vs xref
That didn't take long! I've been urged to compromise on legacy features already.
- only a single xref table (no Prev field in Trailer)
- no hybrid files (no XRefStm in trailer)
- no deleted objects (no f type in the xref table except for the first entry)
- generation numbers always zero
- only one section (implies consecutive object numbers starting at 1)
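The restrictions listed above are simple enough to check mechanically. Here is a minimal sketch, assuming the trailer has already been parsed into a dict and the classic xref table into a list of `(first_object_number, entries)` sections, where each entry is an `(offset, generation, kind)` tuple. The function name `check_xref` and this data layout are assumptions. One more assumption: entry 0 is exempt from the generation rule, since the head of the free list conventionally carries generation 65535.

```python
def check_xref(trailer: dict, sections: list) -> list[str]:
    """Report violations of the restricted xref rules for an already-parsed table."""
    problems = []
    if "Prev" in trailer:
        problems.append("trailer has Prev (more than one xref table)")
    if "XRefStm" in trailer:
        problems.append("trailer has XRefStm (hybrid file)")
    if len(sections) != 1:
        problems.append(f"expected 1 xref section, found {len(sections)}")
    for first, entries in sections:
        for index, (offset, generation, kind) in enumerate(entries):
            number = first + index
            if number == 0:
                # Assumption: the first entry may be free with generation 65535.
                continue
            if kind == "f":
                problems.append(f"object {number} is free (deleted)")
            if generation != 0:
                problems.append(f"object {number} has generation {generation}")
    return problems


good = [(0, [(0, 65535, "f"), (15, 0, "n"), (74, 0, "n")])]
print(check_xref({"Size": 3}, good))

bad = [(0, [(0, 65535, "f"), (15, 1, "n"), (0, 0, "f")])]
print(check_xref({"Size": 3, "Prev": 408}, bad))
```

With a single section that starts at object 0 and contains no free entries after the first, the object numbers in use are necessarily consecutive from 1, so no separate check for gaps is needed.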
Posted by Michael Cartwright at 11:04 AM, 2 comments
Labels: File Structure, PDF/D
Sunday, February 1, 2009
Parsing PDF/D
Posted by Michael Cartwright at 5:18 PM, 0 comments
Labels: File Structure, Parser, PDF/D