Examples
The Generic String Base library is provided to make string processing as simple as possible. However, it uses the VAR_GENERIC CONSTANT compiler feature and can therefore only be used as of CODESYS V3.5 SP19 Patch 5.
Library: Generic String Base
Here, an instance (myString) of GSB.UTF8String with a capacity of 128 bytes is created, and its string segment is initialized with the value of a STRING literal (the Roman numeral for 1968). The methods of STR.IString are available on the instance.
STRING to IString
    VAR
        myString : GSB.UTF8String<128> := (sValue := UTF8#'𝕄CMℒ✖Ⅷ'); // Roman numeral 1968
        psString : POINTER TO STRING;
        udiSize, udiLength : UDINT;
        xASCII, xOk : BOOL;
    END_VAR

    // Conversion back to the STRING data type
    psString := myString.GetSegment(udiSize=>udiSize, udiLength=>udiLength, xASCII=>xASCII);
    xOk := (
        myString.IsValid() AND          // A valid UTF-8 encoding is present
        udiSize = 128 AND               // The capacity of the string in bytes
        myString.Len() = 17 AND         // The current length of the string in bytes
        STR.RuneCount(myString) = 6     // The current number of characters in the string
    );
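The check above distinguishes between the length in bytes (Len) and the number of characters (STR.RuneCount). The following is a minimal sketch of why these two values differ for UTF-8 data. It is written in plain Structured Text without library calls. CountRunes is a hypothetical helper introduced only for illustration; it is not how STR.RuneCount is implemented, and it assumes that the buffer already contains valid UTF-8.

```iecst
// Hypothetical helper, not part of any library.
// Assumption: pbyData points to udiSize bytes of valid UTF-8.
FUNCTION CountRunes : UDINT
VAR_INPUT
    pbyData : POINTER TO BYTE; // start of the UTF-8 byte sequence
    udiSize : UDINT;           // number of bytes to inspect
END_VAR
VAR
    udiIndex : UDINT;
END_VAR

CountRunes := 0;
IF pbyData = 0 OR udiSize = 0 THEN
    RETURN;
END_IF

FOR udiIndex := 0 TO udiSize - 1 DO
    // Continuation bytes have the bit pattern 10xxxxxx.
    // Every other byte starts a new character.
    IF (pbyData[udiIndex] AND 16#C0) <> 16#80 THEN
        CountRunes := CountRunes + 1;
    END_IF
END_FOR
```

An ASCII character occupies one byte, whereas a character such as Ⅷ (U+2167) occupies three bytes in UTF-8 but still counts as a single rune. For this reason, the byte length of a string is equal to its rune count only for pure ASCII content.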
    VAR
        myString : GSB.UTF8String<20> := (sValue := UTF8#'𝕄CMℒ✖Ⅷ'); // Roman numeral 1968
        sValue : STRING := 'wurden in Mexico-Stadt die';
        wsValue : WSTRING := "ⅩⅨ.";
        diSpace : STR.RUNE := 32;
        myValue : GSB.UTF8String<128> := (sValue := UTF8#'Ѻℓƴμρ☤ṧḉнεη $$ρї℮łℯ α♭ℊεℌαʟ⊥℮ᾔ.');
        myBuilder : GSB.Builder<(*udiInitialCapacity*) 64, (*usiExtensionFactor*) 50> := (itfString:=myString);
        myResult : GSB.UTF8String<128>;
        {attribute 'monitoring_encoding' := 'UTF-8'}
        sResult : STRING(128) := UTF8#'𝕄CMℒ✖Ⅷ wurden in Mexico-Stadt die ⅩⅨ. Ѻℓƴμρ☤ṧḉнεη $$ρї℮łℯ α♭ℊεℌαʟ⊥℮ᾔ.';
        psResult : POINTER TO STRING;
        udiLength : UDINT;
        xOk : BOOL;
    END_VAR

    myBuilder.WriteRune(diSpace);
    myBuilder.WriteString(sValue);
    myBuilder.WriteRune(diSpace);
    myBuilder.WriteWString(wsValue);
    myBuilder.WriteRune(diSpace);
    myBuilder.WriteIString(myValue);
    udiLength := myBuilder.Len();     // The number of bytes occupied in the builder.
    myBuilder.ToIString(myResult);    // The individual parts of the string are copied together to myResult.
    psResult := myResult.GetSegment();
    xOk := (psResult^ = sResult);     // Both memory areas should have the same content.
In the example above, an instance of the builder is created with an initial capacity of 64 bytes (udiInitialCapacity) and an extension factor of 50 (usiExtensionFactor). The string declared above (myString) is also passed in the declaration, so the builder is initially filled with this string (UTF8#'𝕄CMℒ✖Ⅷ'). The usiExtensionFactor parameter causes the builder to grow by 50% when its current capacity is used up.
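The effect of the extension factor can be sketched with a small calculation. The following is an illustration only: NextCapacity is a hypothetical helper, and it assumes that the percentage refers to the current capacity. The library may calculate the growth differently, for example relative to the initial capacity.

```iecst
// Hypothetical helper, not part of the library: models the assumed growth rule.
FUNCTION NextCapacity : UDINT
VAR_INPUT
    udiCapacity : UDINT;        // current capacity in bytes
    usiExtensionFactor : USINT; // growth in percent of the current capacity (assumption)
END_VAR

// Widen to ULINT so that the multiplication cannot overflow for large capacities.
NextCapacity := udiCapacity + TO_UDINT((TO_ULINT(udiCapacity) * usiExtensionFactor) / 100);
```

Under this assumption, a builder declared with 64 bytes and a factor of 50 would grow to 96 bytes on the first extension and to 144 bytes on the second. A factor of 0 would leave the capacity unchanged in this model.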
    VAR
        sPath : STRING := 'myFilePath';
        hFile : RTS_IEC_HANDLE := RTS_INVALID_HANDLE;
        myBuilder : GSB.Builder<(*udiInitialCapacity*) 16#10000, (*usiExtensionFactor*) 50>;
        abyBuffer : ARRAY[0..4095] OF BYTE;
        pbyData : POINTER TO BYTE;
        udiSize : UDINT;
        udiCount : UDINT;
        eEncoding : SCV.ENCODING;
        eErrorID : SCV.ERROR;
        udiResult : RTS_IEC_RESULT;
    END_VAR

    hFile := SysFileOpen(sPath, ACCESS_MODE.AM_READ, ADR(udiResult));
    IF udiResult <> ERRORS.ERR_OK THEN
        // handle error condition
        RETURN;
    END_IF

    REPEAT // fake loop - We need the EXIT feature
        // Read the first block of the file
        pbyData := ADR(abyBuffer);
        udiSize := TO_UDINT(SysFileRead(hFile, pbyData, XSIZEOF(abyBuffer), ADR(udiResult)));
        IF udiResult <> ERRORS.ERR_OK THEN
            // handle error condition
            EXIT;
        END_IF

        // Determination of the file encoding
        udiCount := SCV.DecodeBOM(pbyData, udiSize, eEncoding=>eEncoding, eErrorID=>eErrorID);
        IF eErrorID <> 0 THEN
            // handle error condition
            EXIT;
        END_IF

        // Skip the bytes of the byte order mark
        pbyData := pbyData + udiCount;
        udiSize := udiSize - udiCount;

        WHILE udiSize > 0 DO
            // Convert file content to UTF-8 and copy it to the builder content
            udiCount := myBuilder.WriteMemSegment(pbyData, udiSize, eEncoding, eErrorID=>eErrorID);
            IF eErrorID <> 0 THEN
                // handle error condition
                EXIT;
            END_IF

            // Read the next block of the file
            pbyData := ADR(abyBuffer);
            udiSize := TO_UDINT(SysFileRead(hFile, pbyData, XSIZEOF(abyBuffer), ADR(udiResult)));
            IF udiResult <> ERRORS.ERR_OK THEN
                // handle error condition
                EXIT;
            END_IF
        END_WHILE
    UNTIL TRUE
    END_REPEAT

    IF hFile <> RTS_INVALID_HANDLE THEN
        SysFileClose(hFile);
        hFile := RTS_INVALID_HANDLE;
    END_IF
    VAR
        myRange : SBD.Range := (itfBuilder := myBuilder);
        diRune : STR.RUNE;
        eError : STR.ERROR;
    END_VAR

    // myBuilder and udiCount are the variables from the previous example.
    udiCount := 0; // udiCount is reused as a counter, so it is reset first
    myRange.Reset();
    WHILE (diRune := myRange.GetNextRune(eErrorID=>eError)) <> 0 AND_THEN eError = 0 DO
        IF UC.IsSpace(diRune) THEN
            // The characters in the builder which are considered as spaces
            // according to Unicode are counted.
            udiCount := udiCount + 1;
        END_IF
    END_WHILE
When passing on UTF-8 encoded content, no intermediate buffer is needed for encoding conversion because the data in the builder is already UTF-8 encoded. Therefore, the contents of the builder's segments can be sent directly, for example over a TCP/IP connection.
    VAR
        itfConnection : NBS.IConnection;
        pbySegment : POINTER TO BYTE;
        udiSize : UDINT;
        eError : NBS.ERROR;
    END_VAR

    (* Provide an active itfConnection *)

    pbySegment := myBuilder.GetFirstSegment(udiSize=>udiSize, eErrorID=>eErrorID);
    WHILE pbySegment <> 0 AND eErrorID = 0 DO
        eError := itfConnection.Write(pbySegment, udiSize, udiCount=>udiCount);
        IF eError <> 0 OR udiCount <> udiSize THEN
            // Handle Error
            EXIT;
        END_IF
        pbySegment := myBuilder.GetNextSegment(pbySegment, udiSize=>udiSize, eErrorID=>eErrorID);
    END_WHILE

    (* e.g. Close itfConnection *)
Working with the StringPool and RangePool function blocks
The following code shows how to use dynamic IString instances from a StringPool. A StringPool or a RangePool is well suited for being passed to subordinate parts of a program. These parts can then obtain instances from the respective pool as needed, work with them, and return them to the pool afterwards.
StringPool and RangePool
    VAR
        myString : GSB.UTF8String<256> := (sValue:=UTF8#'Was du nicht willst, dass man dir tu’, das füg auch keinem andern zu.');
        myRange : STR.Range := (itfString:=myString);
        myStringPool : GSB.StringPool<(*udiStringSize*) 30, (*udiInitialCapacity*) 25, (*usiExtensionFactor*) 0>;
        myRangePool : GSB.RangePool<GSB.RANGE_TYPE.ISTRING, (*udiInitialCapacity*) 10, (*usiExtensionFactor*) 0>;
        diRune : STR.RUNE;
        eErrorID : STR.ERROR;
        itfSubString : STR.IString;
        liStart, liEnd : LINT;
        udiCount : UDINT;
    END_VAR

    myRange.Reset();
    // Decompose myString into substrings and analyze them via a subroutine.
    WHILE (diRune:=myRange.GetNextRune(eErrorID=>eErrorID)) <> 0 AND eErrorID = 0 DO
        IF diRune = 16#2C (*,*) OR diRune = 16#2E (*.*) THEN
            // Get a free string instance from the pool
            itfSubString := myStringPool.GetString();
            IF itfSubString = 0 THEN
                (* Handle Error *)
                EXIT;
            END_IF

            myString.ToIString(itfSubString, liStart+1, liEnd, eErrorID=>eErrorID);
            IF eErrorID <> 0 THEN
                (* Handle Error *)
                EXIT;
            END_IF

            // Analyze the substring using the pools.
            // Analyse releases itfSubString when it is done.
            udiCount := Analyse(itfSubString, myStringPool, myRangePool);
            (* ... Handle Result ... *)

            IF diRune = 16#2E (*.*) THEN
                EXIT;
            END_IF

            // Skip a single space that follows the separator
            diRune:=myRange.GetNextRune(eErrorID=>eErrorID);
            IF diRune = 16#20 (* space *) AND eErrorID = 0 THEN
                liEnd := liEnd + 1;
            ELSE
                myRange.UngetLastRune();
            END_IF
            liStart := liEnd + 1;
        END_IF
        liEnd := liEnd + 1;
    END_WHILE
Working with the character categories of Unicode
The Unicode standard aims to digitally capture all characters worldwide and describe their properties. To do this, the characters are combined into groups (categories). The Unicode library provides functions that check whether a character belongs to a given category. These functions return TRUE if the passed character belongs to the respective category; otherwise, they return FALSE.
Name | Function |
---|---|
| Recognizes general control characters |
| Recognizes letters in the broader sense |
| Recognizes combining characters, for example, diacritical characters |
| Recognizes decimal digits |
| Recognizes lowercase letters |
| Recognizes digits and characters which apply to numbers |
| Recognizes only printable characters (including different types of space characters) |
| Recognizes uppercase letters |
| Recognizes punctuation characters |
| Recognizes only printable characters (considers only |
| Recognizes uppercase letters for headers |
| Detects spaces of different width, line breaks, etc. |
| Recognizes symbols in a broader sense, for example, mathematical symbols and currency symbols. |
The contents of an IString or IBuilder instance can be analyzed character by character using a suitable function block of type Range. The functions from the Unicode library can be very useful for this analysis.
    VAR
        myString : GSB.UTF8String<50> := (sValue:='Hello World!');
        myBuilder : GSB.Builder<100, 0> := (itfString:=myString);
        mySRange : STR.Range := (itfString:=myString);
        myBRange : SBD.Range := (itfBuilder:=myBuilder);
        diSRune, diBRune : STR.RUNE;
        eErrorID : STR.ERROR;
        udiCount : UDINT;
    END_VAR

    // Iterate over the string and the builder in parallel
    WHILE (diSRune:=mySRange.GetNextRune(eErrorID=>eErrorID)) <> 0 AND eErrorID = 0 DO
        diBRune := myBRange.GetNextRune();
        IF diSRune <> diBRune THEN
            (* Should not occur *)
        END_IF

        // Count the characters that Unicode classifies as spaces
        IF UC.IsSpace(diSRune) THEN
            udiCount := udiCount + 1;
        END_IF
    END_WHILE
Conversion of characters
Convert letters to uppercase (UC.ToUpper)
Convert letters to lowercase (UC.ToLower)
Convert letters to title case (UC.ToTitle)
    VAR
        diRuneA, diRuneB : STR.RUNE;
    END_VAR

    diRuneA := 16#1F3;               // U+01F3 = ǳ
    diRuneB := UC.ToUpper(diRuneA);  // U+01F1 = Ǳ
    diRuneA := UC.ToLower(diRuneB);  // U+01F3 = ǳ
    diRuneB := UC.ToTitle(diRuneA);  // U+01F2 = ǲ
Comparison of strings
Case-sensitive (STR.Compare)
Not case-sensitive (UC.EqualFold)
    VAR
        myFirstString : GSB.UTF8String<50> := (sValue:='test');
        mySecondString : GSB.UTF8String<50> := (sValue:='Test');
        myThirdString : GSB.UTF8String<50> := (sValue:='CoDeSys');
        myFourthString : GSB.UTF8String<50> := (sValue:='CODESYS');
        diResult : DINT;
        xEqual : BOOL;
    END_VAR

    /// Comparing two Strings lexicographically
    /// diResult = 1 --> myFirstString > mySecondString
    diResult := STR.Compare(myFirstString, mySecondString);

    /// Unicode defined simple case folding
    /// xEqual = TRUE --> myThirdString == myFourthString
    xEqual := UC.EqualFold(
        ADR(myThirdString.sValue), myThirdString.Len(),
        ADR(myFourthString.sValue), myFourthString.Len()
    );