Profiling by Sampling

When this method of measurement is activated, CODESYS generates an additional task. This task interrupts the application task to be measured at random times and determines its current call tree.

Sampling is supported on multicore systems only. You have to assign the automatically generated profiling task to a separate task group in the task configuration. This task group should run on a different core than the application task to be measured.

The recorded call trees of the task to be measured are transferred cyclically to the development system for processing. For that reason, this method works only when the development system is in online mode.

Sampling is not suitable for determining outliers of task runtimes. Use the method when you want to determine over a longer period of time which functions take a lot of time and which ones take very little time. The result is a random collection over many task cycles. One-time effects in individual cycles cannot be detected.

One advantage of this measurement method is that its influence on the task runtime is comparatively low and that the measurement can be switched on and off at any time during runtime.

Functionality of the measurement

The profiling task runs in an infinite loop and at high priority. At random times, the profiling task checks whether or not the application task to be measured is currently running. If it is running, then it is stopped and the current call tree is determined. The determined call tree is entered into a list (array).

This list of call trees is transferred cyclically to the development system and processed there with the previous measurements. Therefore, sampling runs only as long as the development system is connected to the runtime.

The sampling method is used to determine a statistical distribution of the execution of POUs. The runtimes displayed in the result view are not measured directly, but are the result of a calculation. The calculation is based on the assumption that POUs which are found often in the sampled call trees also need a longer execution time. The share of samples in which a POU was found, relative to the total number of samples, is converted into the share of the task cycle time that the POU call consumes.
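The assumption behind this calculation can be checked with a small simulation. The following Python sketch is not CODESYS code. The timeline, the POU names "A" and "B", and all numeric values are hypothetical and chosen only for illustration. It takes one sample at a random time within each sampling interval and converts the sample counts into time shares of the cycle.

```python
import random
from collections import Counter

# Hypothetical timeline of one task cycle (all values are assumptions):
# the task runs POU "A" from 0 to 3 ms, POU "B" from 3 to 5 ms, then idles.
CYCLE_MS = 10.0
TIMELINE = [("A", 0.0, 3.0), ("B", 3.0, 5.0)]
SAMPLING_INTERVAL_MS = 10.0


def pou_at(t_ms):
    """Return the POU active at time t_ms, or "Idle" if the task is not running."""
    phase = t_ms % CYCLE_MS
    for name, start, end in TIMELINE:
        if start <= phase < end:
            return name
    return "Idle"


def run_sampling(n_samples, seed=0):
    """Take one sample at a random time within each sampling interval."""
    rng = random.Random(seed)
    counts = Counter()
    for k in range(n_samples):
        t_ms = (k + rng.random()) * SAMPLING_INTERVAL_MS
        counts[pou_at(t_ms)] += 1
    return counts


counts = run_sampling(20_000)
total = sum(counts.values())
# Share of samples per POU, scaled to the cycle time
estimates = {name: CYCLE_MS * n / total for name, n in counts.items()}
print(estimates)
```

With enough samples, the estimates should approach the durations in the assumed timeline (about 3 ms for "A", 2 ms for "B", and 5 ms idle), although no single cycle was ever timed directly.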

Example: In a task T1, two programs P1 and P2 are called and the task cycle time is 20 ms. The Profiler task takes 100 samples and determines the following:

T1 halts in the program P1 20 times.
T1 halts in the program P2 50 times.
T1 does not run 30 times.

The Profiler then calculates the following times by applying each share of the samples to the cycle time, and displays them in the inline view:

P1: 4 ms
P2: 10 ms
Idle: 6 ms
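The arithmetic of this example can be reproduced in a few lines. The following Python sketch is only an illustration of the proportional calculation described above. The function name estimate_times is hypothetical and not part of CODESYS.

```python
def estimate_times(sample_counts, cycle_time_ms):
    """Convert per-category sample counts into estimated time per task cycle."""
    total = sum(sample_counts.values())
    if total == 0:
        raise ValueError("no samples recorded")
    return {
        name: cycle_time_ms * count / total
        for name, count in sample_counts.items()
    }


# Values taken from the example: 100 samples, task cycle time 20 ms
counts = {"P1": 20, "P2": 50, "Idle": 30}
times = estimate_times(counts, cycle_time_ms=20.0)

for name, ms in times.items():
    print(f"{name}: {ms:g} ms")
# P1: 4 ms
# P2: 10 ms
# Idle: 6 ms
```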

Each measurement inevitably extends the runtime of the task. This increase is not constant, but depends on the depth of the call tree. Depending on the platform, an extension of the runtime in the range of 10 µs to 100 µs has to be expected, possibly even more.
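To get a feeling for the order of magnitude, the following Python sketch estimates the added runtime per task cycle. It rests on assumptions that are not part of the original description: every sample is assumed to hit the running task (worst case), and each hit is assumed to cost between 10 µs and 100 µs. The function name overhead_bounds is hypothetical.

```python
def overhead_bounds(cycle_time_ms, sampling_interval_ms, per_sample_us=(10.0, 100.0)):
    """Estimate the runtime added per task cycle by sampling.

    Assumption: every sample hits the running task, and each hit costs
    between per_sample_us[0] and per_sample_us[1] microseconds.
    """
    samples_per_cycle = cycle_time_ms / sampling_interval_ms
    low_us, high_us = (samples_per_cycle * cost for cost in per_sample_us)
    cycle_us = cycle_time_ms * 1000.0
    return {
        "added_us": (low_us, high_us),
        "share_of_cycle_percent": (
            100.0 * low_us / cycle_us,
            100.0 * high_us / cycle_us,
        ),
    }


# Sampling interval equal to a 20 ms task cycle time: one sample per cycle
result = overhead_bounds(cycle_time_ms=20.0, sampling_interval_ms=20.0)
print(result)
```

Under these assumptions, one sample per 20 ms cycle adds between 10 µs and 100 µs, which is between 0.05 % and 0.5 % of the cycle time. On platforms where a single measurement costs more than 100 µs, the share is correspondingly higher.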

Measurements of the "Failed Samples" category:

Important: Errors can occur when determining the call tree. Possible causes:

The length of the array where the call tree is stored is too short. In this case, no call tree is determined at all. The user can change the length of the array in the Profiler settings (Maximum depth of call stack).
The task is in an unfavorable state, for example in a "lock" because it is currently attempting to operate I/Os (to access hardware). The measurement fails.

The number of failed measurements is shown in the online view in the Failed Samples category.

If the number of such failed measurements is very high even though the list (array) for the call tree is sufficiently long, then consider using another measurement method.
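The first error cause can be illustrated with a short Python sketch. It is not CODESYS code. The POU names and the function names record_sample and failed_share are hypothetical. A call tree that is deeper than the configured maximum depth is not stored at all and counts as a failed sample.

```python
def record_sample(call_stack, max_depth):
    """Return the call tree as a tuple, or None if it does not fit (failed sample)."""
    if len(call_stack) > max_depth:
        return None
    return tuple(call_stack)


def failed_share(stacks, max_depth):
    """Return the fraction of samples that fail for the given maximum depth."""
    results = [record_sample(stack, max_depth) for stack in stacks]
    return sum(result is None for result in results) / len(results)


# Hypothetical call trees observed at four sampling moments
observed_stacks = [
    ["PLC_PRG", "P1"],
    ["PLC_PRG", "P2", "FB_Filter", "F_Scale"],
    ["PLC_PRG", "P2"],
    ["PLC_PRG", "P2", "FB_Filter", "F_Scale", "F_Clamp"],
]

print(failed_share(observed_stacks, max_depth=3))  # 0.5
print(failed_share(observed_stacks, max_depth=5))  # 0.0
```

In this sketch, increasing the maximum depth removes all failed samples. Failures caused by an unfavorable task state would remain regardless of the array length.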

Measurements of the IDLE category:

Measurements for which the task is currently not running are displayed in the online view in the IDLE category. Background: The runtime of a cyclic task is usually shorter than its configured cycle time. As a result, there is a time period within the cycle in which the task does not run.

Missing samples:

Missing samples are measurement recordings on the controller that are not transferred to the development system.

Because of a large call tree array (large required call tree depth) and/or a high sampling rate (small Sampling interval), it can happen that not all recorded call trees are transferred to the development system. However, because lost measurements are distributed over the cycle in the same way as the transferred measurements, the result is not distorted. For this reason, the number of missing samples is displayed only in the Online – Overview view (Number of missing samples), not as a separate category in the result views (like the failed samples). A high number of missing samples nevertheless indicates that the sampling density may be too high, which also increases the cycle time unnecessarily. In this case, adjust the Maximum depth of call stack setting or the Sampling interval setting.
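The claim that missing samples do not distort the result can be illustrated with a small Python simulation. The loss model is an assumption made for this sketch: each recorded call tree is lost with the same probability (40 % here), independent of its content. The category names and weights are taken from the example with P1, P2, and idle time.

```python
import random
from collections import Counter


def shares(samples):
    """Return the share of each category in a list of samples."""
    counts = Counter(samples)
    total = len(samples)
    return {name: counts[name] / total for name in sorted(counts)}


rng = random.Random(1)

# Samples recorded on the controller (simulated with fixed weights)
recorded = rng.choices(["P1", "P2", "Idle"], weights=[20, 50, 30], k=50_000)

# Assumed loss model: every sample is lost with the same probability
transferred = [sample for sample in recorded if rng.random() >= 0.4]

print("recorded:   ", shares(recorded))
print("transferred:", shares(transferred))
print("missing samples:", len(recorded) - len(transferred))
```

The shares of the transferred samples should stay close to the shares of all recorded samples. Only the number of samples, and with it the statistical confidence, decreases.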

Notes about the settings

The following settings are specific to the sampling method. They are located on the Settings tab of the Profiler editor, in the Sampling parameters group:

Profiler task group: Task group that contains the Profiler task.

Sampling interval: Time period within which one sample is taken at a random time (the call tree is recorded and stored).

Maximum depth of call stack: Maximum nesting depth for which the call tree should be determined during the sampling.

Activating and running the profiling operation by sampling

Requirements:

A CODESYS project application with multiple POUs is open in offline mode.
The connection to a multicore controller is configured in the communication settings, and the controller is running.
The Profiler task for sampling runs (ideally as the only task) on a different core than the application task to be measured.
In the task configuration, a "Profiler" task group is therefore created on its own core (ideally with the "FixedPinned" property). The automatically generated Profiler Task is the only one assigned to this group. The task group for the task of your IEC application to be measured is located on another core.
The application is the active application and can be compiled without errors.

In the application, you can create a Boolean variable that can be used to activate and deactivate profiling programmatically. This is optional. Profiling can also be switched on and off at runtime by clicking the Pause/Start button in the control panel of the Online view.
Click Add Object to add a Profiler object below the application in the device tree.
Double-click the object.
The object editor opens. The Settings tab is in focus.
Set the Method to Sampling.
In the Recording group, select the Task of your application for whose POU calls you want to perform the time measurements.
For Condition, click the button and select the Boolean variable from your application that you want to use to switch the value recording on and off. Note: Using this kind of variable is optional. If you leave the field blank, then every cycle is recorded.
Configure the following settings in the Sampling parameters group:
- Profiler task group: Name of the task group that contains the automatically generated task for profiling (see above: requirements in the task configuration).
- Sampling interval: Time between the measurements (recommended: value of the task cycle time)
- Maximum depth of call stack: Maximum nesting depth of the call tree that should be determined for each sample.
Under Snapshot appearance, select the Time format for displaying the recording.
Click Online → Login to download the application to the controller.
The display is shown in the status bar of CODESYS Profiler.
Click Online → Start to start the application.
The project runs and you see the current variable values in the usual monitoring view.
If you have configured a Boolean variable as a condition to start profiling, then now set this variable to TRUE.
Now look at the sampling results. Open the editor of the Profiler object and select its Online tab. Click the Refresh snapshot button, and click it again after some time.
You see the call tree of the task to be measured. For the individual blocks, the respective number of samples (measurements) and the determined total time that the block calls consume are displayed.