PDF/A-3 and e-Invoicing

Manuel Polling

e-Invoicing standards are cropping up everywhere and are increasingly becoming mandatory. A number of those standards are based on PDF/A-3 with an XML attachment. Starting with version 2019.1, OL Connect comes with features for creating PDF/A-3 files and turning them into ZUGFeRD and Factur-X conforming e-invoices. PDF/A-3 is not only for e-invoicing: it is an archive format after all, so other applications are possible as well. This article explains how to use this functionality in OL Connect.

Why PDF/A-3

Anyone not yet familiar with PDF/A-3 might wonder what the benefits are beyond the older PDF/A-1, or even regular PDF. These can be summarized as follows:

  • PDF/A is a standardized subset of PDF that focuses on archiving. A PDF/A file is intended to still be readable decades after it has been produced; for instance, by requiring all fonts to be embedded so there are no dependencies on other files.
  • PDF/A-3 allows transparency, OpenType fonts, and other things that were not yet allowed in PDF/A-1, because the A-3 version of the standard is based on a later version of PDF (as is PDF/A-2).
  • PDF/A-3 specifically allows embedding of any file as an attachment in a PDF/A file.

The ability to have a single file that is a human readable document (and that remains readable in the future), that can also contain a machine readable version, the original source, or any other related data, opens up interesting new possibilities. More detailed information about PDF/A-3 can be found at the bottom of this article.

PDF/A-3 output in OL Connect

OL Connect lets you create PDF/A-3 files from the Output Creation step. To embed files inside these PDFs, a new task “PDF/A-3 attachments” is available in OL Connect Workflow. Together, these functionalities allow you to produce PDF/A-3 files for archiving, e-invoicing, or any other application where it is convenient to combine PDF with other file formats.

Because the PDF/A-3 functionality is part of Output Creation, PDF/A-3 files can be created with any OL Connect product: PrintShop Mail Connect, PlanetPress Connect, and PReS Connect. Adding attachments to these PDFs, however, is done in OL Connect Workflow and is therefore restricted to PlanetPress Connect and PReS Connect. The PDF/A-3 Attachments task also requires an OL Connect Imaging license.

Create PDF/A-3 conforming files

Creating a PDF/A-3 file with OL Connect is just as easy as creating a PDF/A-1 file. In the Output Preset Wizard, you simply select “Generic PDF”, press Next, and then choose PDF/A-3b in the drop-down of the PDF Options page.

Set PDF document information

The PDF Options page offers controls to set the basic metadata for PDFs. When creating PDFs for archiving purposes, it can be important to have proper values for Title, Author, Description, and Keywords. This document information will work for any kind of PDF that can be created with Output Creation, and it can include variables or metadata from the Job Creation step.

Embed files in PDF/A-3 conforming files

Once you have a PDF/A-3, you can use OL Connect Workflow to add one or more attachments to it with the PDF/A-3 Attachments task (located in the Actions plugin category of the OL Connect Workflow Configuration tool).

The task will allow you to embed as many files as you like inside your PDF/A-3 file and it accepts either the PDF or one of the intended attachments as its input. The output will always be the modified PDF/A-3 with the attachments in it. Most settings in this task allow dynamic values, which makes it very flexible.

This task also allows you to add the proper information to create ZUGFeRD or Factur-X conforming PDF/A-3 files.

Using OL Connect’s PDF/A-3 for e-Invoicing

ZUGFeRD 1.0, and Factur-X 1.0

Germany’s ZUGFeRD and France’s Factur-X (which is part of Chorus Pro) are both essentially PDF/A-3 with an XML file embedded using a specific method. Part of this method is specifying metadata that makes it possible to identify the PDF/A-3 as an e-invoice according to those standards. The PDF/A-3 Attachments task has an additional tab (labeled e-Invoicing) that lets you simply choose the metadata for the standard you need, and set any configurable properties.
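To give an idea of what this identifying metadata looks like, here is a minimal Python sketch that builds the kind of XMP description element an e-invoice carries. It is not what OL Connect does internally. The namespace URI, the `fx` prefix, and the four property names are taken from my reading of the Factur-X 1.0 specification and should be verified against the specification itself; the function name `einvoice_xmp_description` is made up for this illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: namespace as published in the Factur-X 1.0 specification.
FX_NS = "urn:factur-x:pdfa:CrossIndustryDocument:invoice:1p0#"


def einvoice_xmp_description(file_name, level, version="1.0", doc_type="INVOICE"):
    """Build an rdf:Description element holding the e-invoice properties."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("fx", FX_NS)
    desc = ET.Element(f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": ""})
    for name, value in (
        ("DocumentType", doc_type),
        ("DocumentFileName", file_name),  # must match the attachment name
        ("Version", version),
        ("ConformanceLevel", level),
    ):
        ET.SubElement(desc, f"{{{FX_NS}}}{name}").text = value
    return desc


desc = einvoice_xmp_description("factur-x.xml", "BASIC")
print(ET.tostring(desc, encoding="unicode"))
```

In OL Connect you never write this by hand: the e-Invoicing tab of the PDF/A-3 Attachments task takes care of it.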

ZUGFeRD 2.0, and X-Rechnung

Both of these standards are identical to the French Factur-X standard, designed jointly by France and Germany, so naturally they are supported as well.

Other PDF-based e-invoicing standards

While the French and German standards are supported out of the box in OL Connect, other similar standards are already being worked on across the globe and will be easy to add in future versions. Although the mechanism to add other standards is not documented, it should be no problem to add support for similar PDF-based e-invoicing standards on a case-by-case basis. If you run into such a standard, contact OL: once an actual case has provided clarity on that particular standard, it will be fairly simple to reliably support it out of the box.

Create ZUGFeRD and Factur-X conforming files

When creating PDF/A-3 files conforming to Factur-X or ZUGFeRD, you have to pick the right extension schema on the e-Invoicing metadata tab, and make sure that the attachment name of your conforming XML file matches the DocumentFileName property of the metadata. The screenshots below show all these settings.

The last thing to set in the PDF/A-3 Attachments task is the conformance level of the attached XML. Each standard specifies a number of different levels a file can conform to. For ZUGFeRD 1.0, these are BASIC, COMFORT, and EXTENDED. Factur-X has a few more choices in conformance levels: MINIMUM, BASIC WL (“WL” is short for Without Lines), BASIC, EN 16931, and EXTENDED. The conformance level you need is essentially defined by the receiving party, so make sure to obtain that information from whoever is meant to be the recipient of the e-invoice.
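When these settings are filled in dynamically, it can help to check them before they reach the task. The following Python sketch is only an illustration of such a check, using the conformance levels listed above. The dictionary keys and the function `check_einvoice_settings` are hypothetical and are not part of OL Connect.

```python
# Conformance levels as listed in this article. The keys are illustrative
# labels, not identifiers used by OL Connect.
CONFORMANCE_LEVELS = {
    "ZUGFeRD 1.0": ("BASIC", "COMFORT", "EXTENDED"),
    "Factur-X 1.0": ("MINIMUM", "BASIC WL", "BASIC", "EN 16931", "EXTENDED"),
}


def check_einvoice_settings(standard, level, attachment_name, document_file_name):
    """Return a list of problems; an empty list means the settings agree."""
    problems = []
    levels = CONFORMANCE_LEVELS.get(standard)
    if levels is None:
        problems.append(f"unknown standard: {standard}")
    elif level not in levels:
        problems.append(f"{level} is not a {standard} conformance level")
    # The attachment name has to match the DocumentFileName metadata property.
    if attachment_name != document_file_name:
        problems.append(
            f"attachment name {attachment_name!r} does not match "
            f"DocumentFileName {document_file_name!r}"
        )
    return problems


print(check_einvoice_settings("ZUGFeRD 1.0", "MINIMUM", "invoice.xml", "other.xml"))
```

The comparison of the two file names is exact (case-sensitive) here, which is a cautious assumption rather than a documented requirement.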

How to create conforming XML files

With the ability to create PDF/A-3 files with OL Connect, and an easy way to embed XML files and turn the PDFs into e-invoicing compliant files, only one challenge remains: how to create a standards-conforming XML file. The answer is simple, although its implementation may not be: if you don’t already have a conforming file to begin with, you have to implement your own data transformation mechanism. When starting from the result of data mapping, this can be achieved through an XSLT transformation in OL Connect Workflow using the Open XSLT task.

Even though creating XSLT transformations is not trivial, this does give us a solution for creating e-invoices from any input file that can be handled by the OL Connect Data Mapper.

Obviously, we would all prefer an out-of-the-box transformation feature, but that is simply not possible: just for ZUGFeRD and Factur-X, there are potentially 8 different formats to implement due to the different conformance levels (and other standards are likely to be different) but more importantly, the input data is likely to be different for every type of document. So there currently is no generic method for achieving this transformation.

What about the existing ZUGFeRD plugin?

Since we now have a way to support all ZUGFeRD conformance levels in a more generic way, this plugin is no longer needed. In cases where it is already in use, there is no reason to change anything at this time, but for new implementations the new approach is recommended.

Sample OL Connect Workflow processes

Putting all this functionality together requires a relatively simple OL Connect Workflow process. Given that the PDF/A-3 Attachments task can work with a PDF/A-3 job file, or an intended attachment, an OL Connect Workflow process for creating a PDF/A-3 with attachments can take two shapes. Here are examples of both.

A ZUGFeRD invoice with a PDF job data file

This process creates a single ZUGFeRD-conforming PDF/A-3 from an input file. The input for the process is a single record that gets data mapped, so this process is a recipe for handling any data that can be processed by OL Connect’s DataMapper.

  1. Capture the data file
  2. The Data Mapper task returns a record and record set id that is needed later for Content Creation (step 8)
  3. The XML invoice data file is created first on a branch, so it will still be available after Output Creation has completed
  4. The record id is used to retrieve all record data as JSON
  5. The JSON is converted to XML, so it can be processed with an XSLT transformation
  6. The XML from the Connect Data Model is transformed into one that complies with a specific ZUGFeRD conformance level
  7. The XML file is saved in OL Connect Workflow’s temporary folder for this process, so it will get cleaned up automatically once the process is done
  8. The All in One performs the Content Creation, Job Creation, and Output Creation, and results in a PDF/A-3 output file.
  9. The intermediate XML file is embedded into the PDF/A-3 job file.
  10. The resulting ZUGFeRD file is saved.
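To make steps 4 and 5 more tangible, here is a rough Python sketch of the idea behind converting a record's JSON into XML that an XSLT transformation can then work on. In the actual process this is done by OL Connect Workflow tasks, not by custom code. The record structure and field names below are invented for illustration and do not reflect the exact JSON that OL Connect returns.

```python
import json
import xml.etree.ElementTree as ET


def to_xml(tag, value):
    """Recursively convert a decoded JSON value into an Element named `tag`."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            elem.append(to_xml(key, child))
    elif isinstance(value, list):
        for item in value:
            elem.append(to_xml("item", item))
    elif value is not None:
        elem.text = str(value)
    return elem


# Hypothetical record; the field names are made up for this example.
record_json = (
    '{"fields": {"InvoiceNumber": "INV-001", "Total": 119.0},'
    ' "lines": [{"Description": "Widget", "Amount": 100.0}]}'
)
record_xml = to_xml("record", json.loads(record_json))
print(ET.tostring(record_xml, encoding="unicode"))
```

Note that this sketch assumes every JSON key is also a valid XML element name; real data may need its keys sanitized first.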

A ZUGFeRD invoice with an attachment as the job data file

If the input file is already an XML file, it can be more convenient to create the PDF/A-3 in a branch on the side. This is especially true when the incoming XML is already ZUGFeRD-conforming or, conversely, when the XML needs lots of processing while the PDF/A-3 creation is fairly simple, as it is in this example.

  1. Capture the XML file
  2. Set the PDF file name and location because it is needed both in the branch and the main steps
  3. Start a branch in which the PDF/A-3 file will be created
  4. Create the PDF/A-3 file. In this case the All in One can handle this in a single step
  5. Temporarily save the PDF in Workflow’s temporary folder for this process, so it will get cleaned up automatically when the process is done
  6. Transform the incoming XML into one that complies with a certain ZUGFeRD conformance level. As mentioned already, this step can be skipped if the incoming XML is already conforming
  7. Pick up the intermediate PDF/A-3 file and embed the XML job file, resulting in a conforming ZUGFeRD PDF
  8. Save the resulting file

About PDF/A-3

The PDF/A-3 archive standard is a successor of PDF/A-1 and PDF/A-2. They are all detailed in the ISO 19005 standard, as Part 1, 2, and 3 respectively. Both PDF/A-2 and PDF/A-3 are based on PDF 1.7 (a.k.a. ISO 32000-1), allowing for features such as transparency effects and layers, and embedding of OpenType fonts, among others.

PDF/A-3 differs from PDF/A-2 in that it allows embedding of arbitrary file formats. This opens up applications that can have the best of both worlds:

  • Hybrid archiving, where, for instance, an MS Excel source file is embedded in the PDF that is to be archived; the Excel file is likely to offer the best viewing experience in the short term, while the PDF version provides a stable view of the same data in the long term, when there may no longer be a compatible Excel version available (anything can happen in 10, 20, or 30 years). Having one embedded in the other underlines the dependency relationship and reduces the risk of losing that connection.
  • Combining a human readable version of a document with a machine readable version. This is why e-invoicing standards like the German ZUGFeRD and French Factur-X use PDF/A-3 as their file format. Both specify XML-based standards for invoicing data, which is then embedded in a PDF/A-3 file. The XML can be used for straight-through processing of the invoice, while the PDF provides a human readable version of the invoice.

When adding a file as an attachment to a PDF/A-3, it is required to specify the relationship between the PDF and the attachment. This relationship can be between the embedded file and the entire PDF document, but it can also be related to a specific part or object in the PDF. OL Connect only supports relationships between the entire PDF document and the attachments. Five types of relationships are supported:

Alternative
means that the attachment is an alternative representation of the PDF document itself. For instance, an XML version of the exact same invoice as shown by the PDF.
Data
is for files that contain the information from which a visual representation in the PDF is derived. For instance, a CSV with the detail lines of an invoice. Another example could be the data that was used to create a graph, although OL Connect will not let you associate that file with just that graph (it will associate it with the entire PDF file).
Source
is used when the embedded file is the source that the PDF was created from. In the case of OL Connect content creation, it is debatable whether the input data or the template itself should be considered the source. But when a PDF is created from Microsoft Word, the Word file would be the source.
Supplement
is used when the file represents a supplemental representation of the PDF content that, for instance, may be easier to consume for some. A plain text version of an invoice might be a good example of this.
Unspecified
is a fallback to use when the relationship is not known or cannot be described using one of the other values.

Which relationship to use is mostly a question of semantics and user choice; it makes sense to only use Source if the PDF was actually created from that file. And with Alternative, one would assume the file to have the same information, so you can assume it’s fine to use just the attachment for further processing.
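The five values and the guidance above can be summed up in a small Python sketch. In the PDF itself, this relationship is recorded under a key named AFRelationship. The helper `choose_relationship` and the order in which it checks its arguments are assumptions made for this illustration; they are not rules from the standard or behavior of OL Connect.

```python
from enum import Enum


class AFRelationship(Enum):
    """The five relationship types described above."""
    ALTERNATIVE = "Alternative"
    DATA = "Data"
    SOURCE = "Source"
    SUPPLEMENT = "Supplement"
    UNSPECIFIED = "Unspecified"


def choose_relationship(created_pdf_from_it=False, same_information=False,
                        data_behind_visual=False, extra_representation=False):
    """Pick a relationship; the priority order is an illustrative assumption."""
    if created_pdf_from_it:
        return AFRelationship.SOURCE
    if same_information:
        return AFRelationship.ALTERNATIVE
    if data_behind_visual:
        return AFRelationship.DATA
    if extra_representation:
        return AFRelationship.SUPPLEMENT
    return AFRelationship.UNSPECIFIED


# An XML version of the exact same invoice as shown by the PDF.
print(choose_relationship(same_information=True).value)
```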

All PDF/A versions define different conformance levels. Level a requires the PDF to be structured with tags (i.e., Tagged PDF) and to have full Unicode information. Level b does not have these requirements and is therefore easier to produce, especially when coming from print data. PDF/A-2 and PDF/A-3 also introduce conformance level u, which is equivalent to level a but without the tags (so only the Unicode information).
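As a compact summary of these levels, the following Python sketch maps each PDF/A part to the conformance levels it defines and reports what a given level requires. It only restates the description above; the names `PDFA_LEVELS` and `pdfa_requirements` are invented for this example.

```python
# Conformance levels per PDF/A part, as described above:
# level u only exists from PDF/A-2 onwards.
PDFA_LEVELS = {1: ("a", "b"), 2: ("a", "b", "u"), 3: ("a", "b", "u")}


def pdfa_requirements(part, level):
    """Return the tagging and Unicode requirements for a PDF/A part and level."""
    level = level.lower()
    if level not in PDFA_LEVELS.get(part, ()):
        raise ValueError(f"PDF/A-{part} has no conformance level {level!r}")
    return {
        "tagged": level == "a",          # only level a requires Tagged PDF
        "unicode": level in ("a", "u"),  # levels a and u require Unicode info
    }


print(pdfa_requirements(3, "b"))
```

For PDF/A-3b, the level chosen earlier in the Output Preset Wizard, neither requirement applies, which is why it is the easiest one to produce from print data.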
