Naming convention for Data Fields

Philippe Fontan

January 12th, 2022

As a general rule, software engineers love to give users of their applications as much flexibility as possible. If you ask them for an option allowing you to pick one of the seven rainbow colors, they’ll usually respond with some way of picking any of the roughly 16.7 million colors that a typical 24-bit display is capable of. After all, they say, it’s no more work for them to do so and it gives you, the user, much greater flexibility. But that flexibility may come at a price you’re not prepared to pay.

Sometimes, less is more

If your application is a virtual coloring book for young children, 16.7 million colors might be a bit much. How are toddlers meant to remember which seven of those are typically associated with rainbows and unicorns? The point is: having a slew of options doesn’t mean you should always use all of them.

Field names in the data model

Leaving the joyful world of coloring books and turning our attention to OL Connect, let’s take a look at the data model fields that identify the data elements being merged onto documents.

It’s only natural to give those fields meaningful names. “CustomerNumber” or “ShippingAddress” are much more explicit names than “Field1” or “Column17” because your brain doesn’t have to go through the abstract process of associating seemingly random names with their corresponding data elements.

But sometimes, it’s tempting to use a more elaborate naming convention to make things even more explicit. You could, for instance, create fields like “Address (shipping)” and “Address (billing)”. Names like these have the advantage of being plainly readable to the template designer. However, what you gain in readability you start losing in functionality: because of the spaces and parentheses in the field names, your scripts can no longer refer to those fields with basic dot notation (e.g. record.fields.ShippingAddress). You have to use the slightly more convoluted bracket notation instead (e.g. record.fields["Address (shipping)"]).
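To make the difference concrete, here is a minimal sketch that uses a plain JavaScript object as a stand-in for a record. The mock record object and its values are assumptions made for illustration only; in a real script, the record is supplied by OL Connect.

```javascript
// Hypothetical stand-in for a data record (illustration only).
const record = {
  fields: {
    ShippingAddress: "12 Main Street",
    "Address (billing)": "PO Box 42",
  },
};

// A name made only of letters, digits and underscores works with dot notation.
const shipping = record.fields.ShippingAddress;

// A name containing spaces or parentheses is only reachable with bracket notation.
const billing = record.fields["Address (billing)"];

console.log(shipping);
console.log(billing);
```

Writing record.fields.Address (billing) would not read the field at all: JavaScript would parse it as a call to a function named Address.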

Security and stability

Granted, syntax adjustments in scripts might be a very small price to pay for added readability. There is a point, however, at which you might unknowingly introduce security risks and stability issues through the naming of your data fields. For instance, if some of your field names contain JavaScript or SQL punctuation characters, you might be opening the door to remote code execution or SQL injection attacks from malicious users. This is unlikely to happen, because OL Connect will usually error out before any harm can be caused, but it is still a concern. More likely, the job will simply fail (hence the stability issue) and you may have a difficult time tracing the source of the error.

We are not going to examine the ways in which the field names could be used to cause issues with your OL Connect jobs – we don’t want to give anyone any ideas! The goal here is to raise awareness about potential issues and to prevent them from happening.

Best practice

The simplest way to avoid potential issues is to adhere to a well-defined naming convention. The most consistent and established of these conventions is arguably the XML Naming Scheme (a summary of which can be found here), which defines how elements must be named for an XML file to be well-formed. Since OL Connect does not store its field names inside an XML structure, some of the XML rules don’t apply to it, so we can pretty much use a simple subset of those rules:

Field names must start with a letter or an underscore
Field names may contain letters, digits, and underscores
Spaces and other punctuation are not allowed

Going back to the examples we used earlier, proper field names could therefore be: Address_Shipping, ShippingAddress, shipping_address, etc.
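The three rules above can be sketched as a single regular expression. The helper name isValidFieldName is hypothetical and not part of OL Connect, and the sketch assumes that “letters” means the ASCII letters A to Z and a to z, which is narrower than what XML itself allows.

```javascript
// Hypothetical helper, not an OL Connect API.
// Assumption: "letters" means ASCII A-Z and a-z only.
// First character: letter or underscore; the rest: letters, digits, underscores.
const FIELD_NAME_PATTERN = /^[A-Za-z_][A-Za-z0-9_]*$/;

function isValidFieldName(name) {
  return typeof name === "string" && FIELD_NAME_PATTERN.test(name);
}

const candidates = [
  "Address_Shipping",
  "ShippingAddress",
  "shipping_address",
  "Address (shipping)",
  "2ndAddress",
];

for (const name of candidates) {
  console.log(name + ": " + (isValidFieldName(name) ? "ok" : "does not follow the rules"));
}
```

The first three names pass; “Address (shipping)” fails because of the space and parentheses, and “2ndAddress” fails because it starts with a digit.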

It’s important to note that in the case of XML, database and CSV data, OL Connect automatically uses the names of the elements, fields and columns already defined in the data to name the data model fields. In most cases, this produces field names that adhere to the rules laid out above. But occasionally – especially with DB and CSV input data – column names may include characters that result in a field name that doesn’t match those rules (for instance “Employee’s benefits” contains a space and a single quote, neither of which is allowed in the XML Naming Scheme). You should adjust those names to fit the rules (e.g. EmployeeBenefits or employee_benefits).
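If many column names need adjusting, a small helper can propose compliant names for you to review. The following is only a sketch under stated assumptions: toFieldName is a hypothetical function, not an OL Connect feature, and the replacement choices (dropping apostrophes, turning other disallowed characters into underscores, prefixing an underscore when the name would start with a digit) are one possible policy among many.

```javascript
// Hypothetical helper, not an OL Connect API.
// Assumption: "letters" means ASCII A-Z and a-z only.
const VALID_NAME = /^[A-Za-z_][A-Za-z0-9_]*$/;

function toFieldName(columnName) {
  const original = String(columnName);
  // Names that already follow the rules are returned unchanged.
  if (VALID_NAME.test(original)) {
    return original;
  }
  let name = original
    .replace(/['’]/g, "")             // drop straight and curly apostrophes
    .replace(/[^A-Za-z0-9_]+/g, "_")  // any other disallowed run becomes one underscore
    .replace(/^_+|_+$/g, "");         // trim underscores left at either end
  // A name must start with a letter or an underscore (and must not be empty).
  if (!/^[A-Za-z_]/.test(name)) {
    name = "_" + name;
  }
  return name;
}

console.log(toFieldName("Employee’s benefits"));
console.log(toFieldName("Address (shipping)"));
console.log(toFieldName("2nd address"));
```

With this policy, “Employee’s benefits” becomes Employees_benefits rather than the hand-picked EmployeeBenefits, so treat the output as a suggestion. Two different column names can also collapse into the same result, which is worth checking before renaming anything.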

Should existing data models be revisited?

The short answer is yes. At some point in the future, OL Connect may implement stricter naming rules for fields. Those new rules would likely restrict usage of reserved characters. You don’t have to start converting your data models right away – there aren’t yet any firm plans to implement stricter rules – but whenever you are making changes to a data model or to a template, you should take the opportunity to also change the field names so they follow the above rules. This will immediately make your system more secure and stable, and it will also ensure that if/when OL Connect starts implementing stricter naming rules for fields, your resources will be ready.

Important note

These naming restrictions apply not only to data model fields but also to properties (which can be assigned to any entity in OL Connect, be it a data record, a content item, a job set, etc.). You should therefore follow the same naming pattern when using properties.

