Multi-Tasking 4: Custom parallel processing for output engines

Manuel Polling

When Parallel Processing presets do not perform as expected, you can choose your own custom settings. This article goes over all Parallel Processing settings for output engines.

This article is part of a series. The previous article in this series explained the settings for merge engines: Multi-Tasking 3: Custom parallel processing for merge engines.

The differences between the Parallel Processing presets are explained in the second article in the series: Multi-Tasking 2: Parallel Processing Presets.

Output speed

Settings for Parallel Processing are available in the Connect Server Configuration tool under Parallel Processing. The settings for output engines are found on the Output Creation tab. Note that the terms Output Engine and Weaver Engine are equivalent.

The options on the Content Creation tab and Output Creation tab may look the same at first sight, but they are in fact very different.

  • The content creation settings control the distribution of merge engines across all content creation tasks: print, email and web.
  • The output creation settings only affect output engines, and output engines only produce print output from the intermediate print content created by merge engines. This includes PDF output, even if it is not meant to be printed but saved to a file.

But there is another difference, which has to do with the license. The license not only caps the number of tasks that can be performed simultaneously, it also sets a maximum output speed. (For the exact numbers, see the first article in this series: Multi-Tasking 1: Engines.) This limit plays no role in the content creation settings, but it does in the output creation settings.

If only one output creation task can run at a time (as is the case in PrintShop Mail Connect, or when the OL Connect Server is set to launch a single output engine), then this engine gets all tasks and always runs up to the maximum speed allowed by the license. But when several output engines run tasks in parallel, the tasks and the licensed speed must be divided amongst the engines.

And that is what output creation settings are all about.

Output Creation tab

On the Output Creation tab you can reserve output engines for small, medium and large jobs and control their performance by setting a target speed in pages per minute (ppm).

If the Output Creation tab is disabled, it means only a single output engine is specified, in which case there is obviously no need for setting how engines work in parallel. (How many engines you need is the subject of the first article in this series: Multi-Tasking 1: Engines.)

Now let us go over the individual options on the Output Creation tab.

Job sizes

The size of a job (a print output task) is measured by its page count.

The upper limit for small jobs and the lower limit for large jobs can result in 3 size classes for Output Creation tasks: small, medium, and large. If the minimum size for large jobs is set to exactly 1 page more than the maximum size for small jobs, medium jobs are effectively eliminated; jobs can then only be small or large.

The medium size may be needed where single documents and larger batch jobs are processed simultaneously and some of those batch jobs are exceptionally large: for example, most batch jobs are between 100 and 2,000 pages, but some are well over 50,000 pages.
By reserving engines and setting speed targets for each job size, you can then ensure that regular batch jobs come out the same day, that small jobs (which are likely to be on-demand jobs) are not blocked, and that the exceptionally large jobs keep going until they are completed.

Make sure the job sizes match the typical size of jobs in your production environment. Remember that the key value is the number of pages in a job, not the number of records: a single-record job might produce 2,000 pages (think of a telecom invoice, for instance), so it wouldn’t be considered small for output creation. In fact, with the default setting of 10 pages, it would not be considered small even if it only had 11 pages!

For instance, let’s say you want to use invoices as the basis for determining job sizes, and you know that your average invoice is 5 pages and your average batch contains 250 invoices (i.e. 1,250 pages on average). You could set the maximum size of a small job to 8 pages, leaving a little room for invoices with 6, 7 or 8 pages. You would then set the minimum number of pages for large jobs to 1,000, so that the system still considers a smaller-than-usual batch of 200 invoices a large job. Anything in between would be a medium job.
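The size classes described above can be sketched in a few lines of Python. This is only an illustration of the rule, not code from OL Connect: the function name classify_job and its parameters are hypothetical, and the thresholds are the ones from the invoice example.

```python
def classify_job(pages: int, small_max: int, large_min: int) -> str:
    """Return the size class of a print job, based on its page count.

    small_max: maximum number of pages for a small job (inclusive).
    large_min: minimum number of pages for a large job (inclusive).
    Both names are hypothetical; they mirror the two job size settings.
    """
    if large_min <= small_max:
        raise ValueError("large_min must be greater than small_max")
    if pages <= small_max:
        return "small"
    if pages >= large_min:
        return "large"
    return "medium"


# Thresholds from the invoice example: small up to 8 pages, large from 1,000.
print(classify_job(5, small_max=8, large_min=1000))     # a single invoice
print(classify_job(1250, small_max=8, large_min=1000))  # an average batch
print(classify_job(200, small_max=8, large_min=1000))   # anything in between
```

Setting large_min to small_max + 1 leaves no page count that can fall in between, which is how the medium class is eliminated.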

Reserved engines

In some environments, large output creation tasks can take hours to finish. To ensure other tasks can still run while these large jobs are running, it is possible to reserve engines for small and medium tasks. If on-demand jobs cannot wait for an output engine to be reassigned to them, you should reserve output engines for them.
This is not necessary for large tasks, because tasks are picked up in the order they arrive, and smaller tasks finish relatively fast anyway.

Reserved engines cannot be used for jobs of a different size. Again, make sure that the job size settings are correct.

Suppose you started out with the “On demand print” preset to get many small documents processed in parallel. With this preset, all output engines except one are reserved for small jobs. If your small documents are typically longer than 10 pages, then with the default setting they will not be considered small, and the only output engine not reserved for small jobs will have to do all the work!
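To see why this matters, here is a minimal Python sketch that counts how many output engines a job of a given size class may use. It assumes that an engine that is not reserved for any size class is open to jobs of every size; the function name eligible_engines and the figure of 4 engines are hypothetical and only serve the illustration.

```python
def eligible_engines(job_class: str, total_engines: int, reserved: dict) -> int:
    """Count the output engines that a job of the given size class may use.

    reserved maps a size class ("small", "medium") to the number of engines
    reserved for it. Assumption: unreserved engines accept jobs of any size.
    """
    unreserved = total_engines - sum(reserved.values())
    if unreserved < 0:
        raise ValueError("more engines reserved than available")
    return reserved.get(job_class, 0) + unreserved


# Hypothetical setup resembling the "On demand print" preset:
# 4 engines, all but one reserved for small jobs.
reserved = {"small": 3}
print(eligible_engines("small", 4, reserved))   # jobs that count as small
print(eligible_engines("medium", 4, reserved))  # e.g. a 12-page document
```

In this setup a document that falls just outside the small class can only ever use the single unreserved engine.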

Target speed

When multiple tasks run in parallel, the licensed speed must be divided between them. By default, all tasks are given the same speed, regardless of their size. In many scenarios this is perfectly fine, such as:

  • when all jobs are the same size
  • when jobs have different sizes, but the system is not continuously busy with small jobs.

However, with a mixed-size job load on a busy system, smaller jobs may limit the throughput of larger jobs. If at least one small job is always running alongside a larger job, the maximum speed for that larger job will remain at half the licensed speed at best. This is when setting the target speed becomes important.

Note that the target speed is not a guaranteed actual speed, but it helps the OL Connect Server in determining how to distribute the licensed speed proportionally across tasks. To make sure large batch jobs get sufficient speed during output creation, set a lower target speed for small jobs. This will automatically allow more speed for the large jobs. However, large jobs don’t always need to run at top speed either (maybe the batch only has to be ready by the next morning). So you may very well keep a lower target speed for large jobs to maximize the speed of other jobs during the day.

What if there is “unused” speed?
When multiple tasks run at the same time, each task first gets its target speed.
If the tasks together do not reach the speed allowed by the license, the “unused speed” is divided among the running tasks. If there is a difference in target speed between two tasks, this means that one task should run faster than the other. The unused speed is therefore divided proportionally, based on target speeds.

For example, if a small job has a target speed of 50 PPM and a large job has a target speed of 1000 PPM, then the unused speed (1950 PPM, in the case of PlanetPress Connect) is divided in a 1:20 ratio: the small job gets 1/21 of it and the large job gets 20/21. The speed limit for the small job becomes roughly 142.9 PPM, while the large job’s speed limit is set to roughly 2857.1 PPM.
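The proportional division can be checked with a short Python sketch. It is an illustration under stated assumptions, not the server's actual implementation: the function name share_proportionally is hypothetical, and the licensed speed of 3000 PPM is the figure implied by the 1950 PPM of unused speed in the example.

```python
def share_proportionally(licensed_speed: float, targets: list) -> list:
    """Give every task its target speed plus a share of the unused speed.

    The unused speed is divided in proportion to the target speeds.
    Only valid when the targets together fit within the licensed speed.
    """
    total = sum(targets)
    if total <= 0:
        raise ValueError("at least one target speed must be positive")
    if total > licensed_speed:
        raise ValueError("targets exceed the licensed speed")
    unused = licensed_speed - total
    return [t + unused * t / total for t in targets]


# One small job (50 PPM target) and one large job (1000 PPM target).
limits = share_proportionally(3000, [50, 1000])
print([round(speed, 1) for speed in limits])
```

With a strictly proportional 1:20 split, the small job ends up at about 142.9 PPM and the large job at about 2857.1 PPM, and the two limits add up to the licensed speed.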

As long as you do not set the target speeds too high, the system will normally exceed them.

What if there is not enough speed to meet the targets?
If the target speeds together exceed the licensed speed, the speed distribution mechanism changes to prevent overly greedy tasks from taking all the available speed and leaving nothing for other tasks. In this case it works as follows:

  1. Small ones go first. The server first looks at how much speed each task would get if the licensed speed were distributed equally. Any task whose target speed is equal to or less than that equal share is immediately assigned its target speed.
  2. Any unused speed will then be used to try to satisfy the “greedy tasks”.
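The two steps above can be sketched in Python as follows. This is a reading of the description rather than the server's actual code: the function name share_when_oversubscribed is hypothetical, the step is repeated so that a greedy task can still be satisfied once the modest tasks are out of the way, and the remainder is split equally between the tasks that stay greedy. The text does not spell out those last two points, so treat them as assumptions.

```python
def share_when_oversubscribed(licensed_speed: float, targets: list) -> list:
    """Distribute speed when the targets together exceed the licensed speed.

    Tasks whose target fits within an equal share get their target; the
    speed they leave unused goes to the remaining ("greedy") tasks.
    """
    remaining_speed = float(licensed_speed)
    limits = [0.0] * len(targets)
    pending = list(range(len(targets)))  # indexes of tasks not yet assigned
    while pending:
        equal_share = remaining_speed / len(pending)
        satisfied = [i for i in pending if targets[i] <= equal_share]
        if not satisfied:
            # Assumption: the remaining greedy tasks split what is left equally.
            for i in pending:
                limits[i] = equal_share
            break
        for i in satisfied:
            limits[i] = float(targets[i])
            remaining_speed -= targets[i]
        pending = [i for i in pending if i not in satisfied]
    return limits


# Illustrative numbers: 10,000 PPM licensed, two 50 PPM tasks, two 9000 PPM tasks.
print(share_when_oversubscribed(10000, [50, 50, 9000, 9000]))
```

With these numbers the equal share is 2500 PPM, so both 50 PPM tasks get their target and the two greedy tasks share the remaining 9900 PPM.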

Example

Take, for example, a PReS Connect installation that has 4 output engines and can therefore run 4 print output creation tasks in parallel at most. PReS Connect has a licensed speed of 10,000 PPM. With these numbers, target speeds up to 2500 PPM can always be achieved. Now suppose that the system is usually running many small tasks that must go directly to a 50 PPM office printer, and a few large tasks each day. A target speed for small jobs of 50 PPM will get them delivered on time. This could lead to setting the target speed for large jobs to 9000 PPM, which can easily be met even when 3 small jobs are running:

  • When 2 small tasks run in parallel with 1 large task, each can get their target speed, and the remaining unused speed is divided proportionally with a 1:180 ratio, resulting in 55 PPM for the small tasks, and 9890 PPM for the large task.

However, if 2 large jobs start at the same time, giving 9000 PPM to both is impossible, because this would exceed the licensed speed. With 4 tasks running, an equal distribution would give each task 2500 PPM. In this scenario, any task with a target speed of 2500 PPM or less gets its target speed, and tasks with a target speed above 2500 PPM get at least 2500 PPM:

  • When 2 small tasks and 2 large tasks run in parallel, the small tasks get their target speed of 50 PPM, while the large tasks each get 4950 PPM instead of the target speed of 9000 PPM.

When setting the target speed, keep in mind that:

  • Some tasks may never reach the target speed, since the output speed also depends on the content to process, and on what needs to be done with it.
  • If the target speeds are not too “greedy”, tasks will usually run faster than their target speed.
  • There is no need to be precise. It is unlikely that a task will run at exactly the target speed.

Tagged in: Engine, Multi-tasking, Parallel, Presets


