Breaking out of loops in DataMapper

Loops are commonplace in data mapping configurations. Unless all your data is static, you have to use them in order to extract repeating items. However, it can sometimes be challenging to loop precisely on a specific set of conditions, especially when dealing with compound conditions. Let’s see how OL Connect 2021.1 can help us streamline loops.

Note: all examples shown use resources that are attached at the bottom of this article.

A simplistic example

Let’s say you are given the list of chemical elements found in the human body. Since dozens of elements are present only in trace amounts, you are tasked with extracting into a main table only the top elements, those that together make up at most 99% of the body’s composition. The rest can be discarded. Fortunately, the list you are given is already sorted in descending order:

Oxygen,65.0
Carbon,18.5
Hydrogen,9.5
Nitrogen,3.2
Calcium,1.5
Phosphorus,1.0
Potassium,0.4
Sulfur,0.3
Sodium,0.2
Chlorine,0.2
Magnesium,0.1

If we add up the elements above, we can see that going from Oxygen to Phosphorus, we reach a total of 98.7%. Adding Potassium would make us go over the 99% mark. So our (simple) challenge is to extract the first 6 elements in a loop.
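To double-check that arithmetic, here is a small sketch in plain Python (not DataMapper syntax) that prints the running total next to each line of the sample data. The names ELEMENTS, LIMIT and cutoff are made up for this illustration, and Decimal is used so that the one-decimal percentages add up exactly.

```python
from decimal import Decimal
from itertools import accumulate

# Sample data from the article, kept as strings so Decimal parses them exactly.
ELEMENTS = [
    ("Oxygen", "65.0"), ("Carbon", "18.5"), ("Hydrogen", "9.5"),
    ("Nitrogen", "3.2"), ("Calcium", "1.5"), ("Phosphorus", "1.0"),
    ("Potassium", "0.4"), ("Sulfur", "0.3"), ("Sodium", "0.2"),
    ("Chlorine", "0.2"), ("Magnesium", "0.1"),
]
LIMIT = Decimal("99")  # maximum cumulative percentage allowed

# Running total after each line, in file order.
running_totals = list(accumulate(Decimal(pct) for _, pct in ELEMENTS))

# The list is sorted and the totals only grow, so counting the totals
# that stay within the limit gives the number of leading lines to keep.
cutoff = sum(1 for total in running_totals if total <= LIMIT)

for (name, pct), total in zip(ELEMENTS, running_totals):
    marker = "" if total <= LIMIT else "  <- over the limit"
    print(f"{name:<11}{pct:>5}{total:>7}{marker}")

print(f"Keep the first {cutoff} elements ({running_totals[cutoff - 1]}%)")
```

Potassium is the first line whose running total (99.1) goes over the limit, so the cutoff lands right after Phosphorus.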

Classic DataMapper

The easy way to do this is to loop through all elements and simply ignore elements once we’ve reached the maximum allowed. The following DM config should work just fine:

However, there are a number of issues with it. First of all, the loop ran 11 times, which is a waste of time. Second, the loop extracted Sodium after having skipped over Potassium and Sulfur. That’s because at that point it had reached a total of 98.7%, so it skipped those two elements, but then Sodium came next and adding it to our total would still not exceed our 99% limit. That is incorrect: we were asked to add the top elements, up to 99%, not to pick and choose elements until we get as close as possible to 99%. The last issue is that at the end of the loop, we are at the bottom of our data page, so if we were planning on doing any further processing, we don’t know where we actually reached our maximum percentage.
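The behaviour described above can be reproduced outside the DataMapper with a rough Python stand-in. This is only a sketch: classic_loop and ROWS are hypothetical names, and the exact comparison used in the configuration is an assumption. Since the run skips Sulfur, which would bring the total to exactly 99.0, the sketch only accepts an element when the new total stays strictly below 99.

```python
from decimal import Decimal

# Sample data from the article, kept as strings so Decimal parses them exactly.
ROWS = [
    ("Oxygen", "65.0"), ("Carbon", "18.5"), ("Hydrogen", "9.5"),
    ("Nitrogen", "3.2"), ("Calcium", "1.5"), ("Phosphorus", "1.0"),
    ("Potassium", "0.4"), ("Sulfur", "0.3"), ("Sodium", "0.2"),
    ("Chlorine", "0.2"), ("Magnesium", "0.1"),
]
LIMIT = Decimal("99")


def classic_loop(rows, limit=LIMIT):
    """Visit every line and ignore the ones that would reach the limit."""
    total = Decimal("0")
    extracted = []
    iterations = 0
    for name, pct in rows:  # loop on all elements: no early exit
        iterations += 1
        value = Decimal(pct)
        if total + value < limit:  # assumed comparison (strictly below 99)
            total += value
            extracted.append(name)
        # otherwise: ignore this line and move on to the next one
    return extracted, total, iterations


extracted, total, iterations = classic_loop(ROWS)
print(f"iterations: {iterations}")
print(f"extracted:  {', '.join(extracted)}")
print(f"total:      {total}%")
```

Under that assumption the sketch visits all 11 lines and picks up Sodium after skipping Potassium and Sulfur, which is exactly the unwanted pick-and-choose result.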

Fortunately, we can fix it easily: let’s change the loop so that it stops as soon as the next element would make the total exceed 99%. To do so, we convert the loop from “On all elements” to a “While statement is true” loop and have it check a simple value that gets set in the false branch of the embedded condition. This works better: it brings the number of iterations down to 7 and the total percentage of the top elements to 98.7% (both highlighted in blue), which is what we want:

However, look at the position of our data (highlighted in red): we are now positioned at the Sulfur line, which means we skipped over the Potassium line (which is the one that would have made our percentage total exceed 99%).

So if we have further processing to do, we’ll start that processing one line too far. The problem is with the Goto step in the loop: it gets executed even though our loop is, in theory, finished. You could try putting it inside a condition, but then the loop will complain of a possible infinite loop. There are a few ways to get around this issue, but I will leave that as an exercise for the reader. Let’s just say that however you get around it, it’s a PITA and a waste of valuable time.
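To see why the data pointer ends up one line too far, here is a minimal Python model of that “While statement is true” loop. It is an illustration only, not DataMapper syntax: while_loop, keep_going and position are made-up names, with position standing in for the current line of the data and the final increment standing in for the Goto step.

```python
from decimal import Decimal

# Sample data from the article, kept as strings so Decimal parses them exactly.
ROWS = [
    ("Oxygen", "65.0"), ("Carbon", "18.5"), ("Hydrogen", "9.5"),
    ("Nitrogen", "3.2"), ("Calcium", "1.5"), ("Phosphorus", "1.0"),
    ("Potassium", "0.4"), ("Sulfur", "0.3"), ("Sodium", "0.2"),
    ("Chlorine", "0.2"), ("Magnesium", "0.1"),
]
LIMIT = Decimal("99")


def while_loop(rows, limit=LIMIT):
    """Loop while a flag is true; the Goto still runs on the last pass."""
    total = Decimal("0")
    iterations = 0
    position = 0       # index of the current data line
    keep_going = True  # the value checked by the loop condition
    while keep_going and position < len(rows):
        iterations += 1
        value = Decimal(rows[position][1])
        if total + value > limit:
            keep_going = False  # false branch: ask the loop to stop
        else:
            total += value
        position += 1  # Goto step: executed on every pass, even the last
    return total, iterations, position


total, iterations, position = while_loop(ROWS)
print(f"iterations: {iterations}, total: {total}%")
print(f"data pointer is now on: {ROWS[position][0]}")
```

The flag is only tested at the top of the next pass, so the pass that meets Potassium still moves the pointer forward and leaves it on Sulfur.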

2021.1 DataMapper

With version 2021.1, the DataMapper implements a method to break out of a loop and immediately jump to the next task following the current loop:

Here, we go back to looping through all elements, so we don’t have to set a condition on our loop. But when the element we are processing makes our total exceed 99%, the false branch of the condition now executes the new Break out of repeat loop action, which immediately transfers control to the first task following the loop. That way, we avoid executing that extra Goto step inside the loop and our data pointer is now positioned exactly where it should be: at the Potassium line.
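The effect of breaking out can be sketched with Python’s own break statement, which is used here as a stand-in for the Break out of repeat loop action. Again, this is not DataMapper syntax: break_loop and position are hypothetical names, and the final increment plays the role of the Goto step.

```python
from decimal import Decimal

# Sample data from the article, kept as strings so Decimal parses them exactly.
ROWS = [
    ("Oxygen", "65.0"), ("Carbon", "18.5"), ("Hydrogen", "9.5"),
    ("Nitrogen", "3.2"), ("Calcium", "1.5"), ("Phosphorus", "1.0"),
    ("Potassium", "0.4"), ("Sulfur", "0.3"), ("Sodium", "0.2"),
    ("Chlorine", "0.2"), ("Magnesium", "0.1"),
]
LIMIT = Decimal("99")


def break_loop(rows, limit=LIMIT):
    """Loop on all lines, but leave as soon as the limit would be exceeded."""
    total = Decimal("0")
    extracted = []
    position = 0  # index of the current data line
    while position < len(rows):
        name, pct = rows[position]
        value = Decimal(pct)
        if total + value > limit:
            break  # leave the loop now: the Goto below is not executed
        total += value
        extracted.append(name)
        position += 1  # Goto step: only reached when the line was kept
    return extracted, total, position


extracted, total, position = break_loop(ROWS)
print(f"extracted {len(extracted)} elements, total: {total}%")
print(f"data pointer is now on: {ROWS[position][0]}")
```

Because the break happens before the pointer moves, any processing after the loop starts on the Potassium line.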

Conclusion

You will find that this feature simplifies your data mapping configurations as it avoids having to specify extra conditions for the loops or inside the loops. It also ensures that the number of iterations is kept to a minimum, thereby making the process more efficient.

Resources
