Problem with to_mixed

Vlad_Ghitulescu · ‎05-24-2023

Hey,

I have a text containing strings in capitals like this:

AGGSTALL
BAD KÖTZTING
RIED A.HAIDSTEIN
SCHMID I.LEHEN
TAUFKIRCHEN (VILS)
BAD KÖNIGSHOFEN I.GRABFELD

I have to transform them to:

Aggstall
Bad Kötzting
Ried a.Haidstein
Schmid i.Lehen
Taufkirchen (Vils)
Bad Königshofen i.Grabfeld

While googling I learned about to_mixed and found a trick that solves the second string BUT not the rest.

Do you have an idea how I can solve this?

Thanks!

Regards,
Vlad

Former Member · ‎05-26-2023

Hello, try this code:

data(lv_output_string_1) = to_mixed(
  val = |{ replace( val = cv_input_string_1 sub = `.` with = `..` occ = 0 ) }|
  sep = '.'
  min = ( find( val = cv_input_string_1 sub = '.' ) + 1 ) ).

Works for me.

Former Member · ‎05-26-2023

Hmm, I've just realized that you have strings with more than two words, and also strings with no dot but with parentheses...

It only works for strings with two words and a dot inside, so it's not a solution.

Maybe you can build your own to_mixed function, something like this (but you have to handle the special cases yourself).

Perhaps my code is not the best way to do it, but it works for your examples.

class up_and_low_mixer definition.
  public section.
    class-methods mix importing str      type string
                      returning value(r) type string.
endclass.

class up_and_low_mixer implementation.
  method mix.

    data table_for_string type standard table of string.
    data temp_str type string.
    data(working_str) = str.

    working_str = to_lower( working_str ).
    split working_str at ` ` into table table_for_string.

    loop at table_for_string reference into data(line).

      temp_str = cond #(
        when line->* ca `.` " for case a.Aaaa
          then to_mixed(
            val = |{ replace( val = line->* sub = `.` with = `..` ) }| sep = `.` min = 2 )
            
        when line->* ca `(` " for case (Aaa)
          then to_mixed(
            val = |{ replace( val = line->* sub = `(` with = `((` ) }| sep = `(` )
            
        else                " normal 
          to_mixed( val = line->* case = 'A' ) ).

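      " join the words with single blanks, without leaving a trailing blank
      r = cond #( when r is initial then temp_str else |{ r } { temp_str }| ).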

    endloop.
  endmethod.
endclass.

" And you call the class method like this
  data(lv_output_string_1) = up_and_low_mixer=>mix( cv_input_string_1 ).
  data(lv_output_string_2) = up_and_low_mixer=>mix( cv_input_string_2 ).
  ...

Vlad_Ghitulescu · ‎05-26-2023

lou.meron that does it! Thanks!

Could you please explain to me how each of the two replace calls works (separately and inside the cond)?

Thanks again!

Former Member · ‎05-29-2023

Hi vladghitulescu, glad it works for your needs.

For example, for the first replace function, the input is "a.haidstein" and the output will be "a..haidstein".

For the second, you may have input like "(vils)" and the output will be "((vils)".

After that, everything is done in the to_mixed function. Pay attention to the "sep" and "min" parameters. The documentation explains them like this:

The function to_mixed transforms all letters in the character string to lowercase letters from the second position. It then removes occurrences of the first character specified in sep from the character string (from left to right from the second position) and transforms the next letter to an uppercase letter. The default value for separator sep is an underscore (_). If case is not specified, the first character of the string remains unchanged. If case is specified and the first character of case is an uppercase letter, the first character in the string is also uppercase and then lowercase in all other occurrences. A positive number can be passed to min to specify a minimum number of characters that must appear before a separator (from the start of the string or since the last replacement) before the separator becomes effective. The default value for min is 1.

to_upper, to_lower, to_mixed, from_mixed - ABAP Keyword Documentation (sap.com)
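To make the quoted description concrete, here is a minimal sketch that replays the two replace/to_mixed steps from the mix method on single words. The variable names are made up for illustration, and the expected values in the comments are the ones reported as working earlier in this thread.

```abap
" Illustrative sketch; all variable names are hypothetical.

" Case 1: single letter + dot. Double the dot first ...
DATA(doubled_dot) = replace( val = `a.haidstein` sub = `.` with = `..` ).
" doubled_dot = `a..haidstein`

" ... then let to_mixed consume one of the two dots as the separator.
" min = 2: the first dot has only 1 character in front of it, so it is kept;
" the second dot has 2, so it is removed and the next letter is capitalized.
DATA(dot_result) = to_mixed( val = doubled_dot sep = `.` min = 2 ).
" expected: `a.Haidstein`

" Case 2: opening bracket. Separators are only searched from the second
" position, so the first bracket is kept and the second one is removed.
DATA(doubled_bracket) = replace( val = `(vils)` sub = `(` with = `((` ).
" doubled_bracket = `((vils)`
DATA(bracket_result) = to_mixed( val = doubled_bracket sep = `(` ).
" expected: `(Vils)`
```

In both cases the doubled character is what lets one copy survive: to_mixed always removes the separator it acts on.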

Vlad_Ghitulescu · ‎05-29-2023

Ah, very cool trick with sep & min! Thanks again lou.meron !

Sandra_Rossi · ‎05-26-2023

You should indicate which exact rule you are looking for.

Any character before dot must be lower case?

Any character after opening bracket must be upper case?

What else?

Vlad_Ghitulescu · ‎05-26-2023

Thanks Lou, I'll check this as soon as my system is alive again (right now it is down for maintenance) and report back.

Vlad_Ghitulescu · ‎05-26-2023

sandra.rossi you're right, I'll specify more.

I have these 6 examples:

AGGSTALL
BAD KÖTZTING
RIED A.HAIDSTEIN
SCHMID I.LEHEN
TAUFKIRCHEN (VILS)
BAD KÖNIGSHOFEN I.GRABFELD

AGGSTALL (= 1) is solved by genuine to_mixed.

BAD KÖTZTING (= 2) is solved by to_mixed with the replace - see the screenshot.

All others (= 3...6) are NOT solved as expected.
I stated at the beginning what I would like as output.
You can see in the screenshot from the debugger the values of lv_output_string_3...6.

Now to the rules:

every single letter followed by a dot should be lower case
every character after a dot should be upper case
every character after an opening bracket should be upper case
every character after a space should be upper case

I hope this describes the problem better.

Vlad_Ghitulescu · ‎05-26-2023

A necessary correction of the last rule above: Every char after a space should be upper case EXCEPT when immediately followed by a dot (because chars followed by a dot should be lower case, see first rule)

Sandra_Rossi · ‎05-29-2023

You are using to_mixed as a nice shortcut to solve your issue, although it was not really designed for what you're trying to achieve (it's meant for transforming technical names like client_number to ClientNumber, and not much more).
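For comparison, the use case to_mixed was designed for looks roughly like this. This is only a sketch, and the variable name is illustrative:

```abap
" Intended use of to_mixed: technical name -> mixed case.
" The default separator is the underscore; case = 'A' capitalizes the first letter.
DATA(mixed_name) = to_mixed( val = `client_number` case = 'A' ).
" expected: `ClientNumber`
```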

If you run ABAP 7.55 or higher, PCRE can be used.

Otherwise, it's probably better not to use to_mixed and instead write code that goes straight to the point.

Based on the rules you defined (I added the last two):

1. every single letter followed by a dot should be lower case
2. every character after a dot should be upper case
3. every character after an opening bracket should be upper case
4. every character after a space should be upper case
5. the first character should be upper case
6. all other characters should be lower case

I see that some of the rules conflict, so let's assume that rule 1 takes priority over the others.

The example below shows code for PCRE (ABAP >= 7.55) and classic code with REDUCE (ABAP >= 7.40 SP 8).

EDIT: I have added comments for the PCRE code.

CLASS ltc_main DEFINITION
      FOR TESTING
      DURATION SHORT
      RISK LEVEL HARMLESS.
  PRIVATE SECTION.
    METHODS run_all_test_cases FOR TESTING.
    METHODS run_one_test_case
      IMPORTING
        input         TYPE csequence
        exp           TYPE csequence
      RETURNING
        VALUE(result) TYPE string.
    METHODS with_pcre " ABAP >= 7.55
      IMPORTING
        input         TYPE csequence
      RETURNING
        VALUE(result) TYPE string.
    METHODS without_pcre
      IMPORTING
        input         TYPE csequence
      RETURNING
        VALUE(result) TYPE string.
ENDCLASS.
CLASS ltc_main IMPLEMENTATION.
  METHOD run_all_test_cases.
    run_one_test_case( input = `AGGSTALL`                   exp = `Aggstall` ).
    run_one_test_case( input = `BAD KÖTZTING`               exp = `Bad Kötzting` ).
    run_one_test_case( input = `RIED A.HAIDSTEIN`           exp = `Ried a.Haidstein` ).
    run_one_test_case( input = `SCHMID I.LEHEN`             exp = `Schmid i.Lehen` ).
    run_one_test_case( input = `TAUFKIRCHEN (VILS)`         exp = `Taufkirchen (Vils)` ).
    run_one_test_case( input = `BAD KÖNIGSHOFEN I.GRABFELD` exp = `Bad Königshofen i.Grabfeld` ).
    run_one_test_case( input = `.A.b`                       exp = `.a.B` ). " rule 1 has priority over rule 2
    run_one_test_case( input = `(a.`                        exp = `(a.` ). " rule 1 has priority over rule 3
    run_one_test_case( input = ` a.`                        exp = ` a.` ). " rule 1 has priority over rule 4
    run_one_test_case( input = `a.`                         exp = `a.` ). " rule 1 has priority over rule 5
  ENDMETHOD.
  METHOD run_one_test_case.
    cl_abap_unit_assert=>assert_equals( act = with_pcre( input = input ) exp = exp msg = |Input: { input } -> Exp: { exp }| ).
    cl_abap_unit_assert=>assert_equals( act = without_pcre( input = input ) exp = exp ).
  ENDMETHOD.
  METHOD with_pcre.
    result = replace( val  = input 
                      pcre = `(?:`
                               & `(\w\.)`        " subgroup $1: either word char followed by dot
                               & `|(?<=\.)(\w)`  " subgroup $2: or word char preceded by dot
                               & `|(?<=\()(\w)`  " subgroup $3: or word char preceded by start bracket
                               & `|(?<=\ )(\w)`  " subgroup $4: or word char preceded by space
                               & `|^(\w)`        " subgroup $5: or word char at start
                               & `|(.)`          " subgroup $6: or any character
                           & `)`
                      with = `\l$1`  " convert character of subgroup $1 to lower case
                           & `\u$2`  " convert character of subgroup $2 to upper case
                           & `\u$3`  " convert character of subgroup $3 to upper case
                           & `\u$4`  " convert character of subgroup $4 to upper case
                           & `\u$5`  " convert character of subgroup $5 to upper case
                           & `\l$6`  " convert character of subgroup $6 to lower case
                      occ  = 0 ).
  ENDMETHOD.
  METHOD without_pcre.
    result = reduce #( init t = ``
                            prev = ``
                            char = cond string( when input <> `` then substring( val = input off = 0 len = 1 ) )
                       for off = 0 while off < strlen( input )
                            let next = cond string( when off + 1 < strlen( input ) then substring( val = input off = off + 1 len = 1 ) )
                            in
                       next t = t && COND string(
                                      WHEN next = '.' THEN to_lower( char )
                                      WHEN prev = '.' THEN to_upper( char )
                                      WHEN prev = '(' THEN to_upper( char )
                                      WHEN prev = ` ` THEN to_upper( char )
                                      WHEN off = 0    THEN to_upper( char )
                                      ELSE                 to_lower( char ) )
                            prev = char
                            char = next ).
  ENDMETHOD.
ENDCLASS.

Former Member · ‎05-29-2023

Love your reduce part. I hadn't thought of doing that with reduce, very cool !

Vlad_Ghitulescu · ‎05-30-2023

Thanks sandra.rossi!
I'll use this one as an incentive to finally apply some RegEx in ABAP AND as a chance to implement some useful Unit Tests 🙂

I'm not quite sure though about "ABAP >= 7.55":

I'll try it anyway 😉 and report back.

Thanks again!

Vlad_Ghitulescu · ‎05-30-2023

No PCRE yet 😞 so REDUCE is the way to go until then.

I have one question:

I inserted the whole class into Eclipse's "Global Class" tab, but there's a "Test Classes" tab as well. How would one spread the methods over these tabs? Should I put the definition and implementation of the run_all_test_cases method in "Test Classes" and everything else in "Global Class"? Or only its implementation?

Thanks!

Sandra_Rossi · ‎05-30-2023

SAP_BASIS 750 0017/SAP_ABAP 750 0017 means ABAP 7.50 SP 17 (so it's < ABAP 7.55).

In my system, I have SAP_BASIS 757 0000 / SAP_ABA 75H 0000, so it's ABAP 7.57 (don't ask me what 75H is; I use SAP_BASIS to determine the actual ABAP version).

Sandra_Rossi · ‎05-30-2023

Yes, the whole test class (definition + implementation) goes in the "Test Classes" tab.

My code was just for quick testing, I put it directly in an executable program right below the line "REPORT ztestprogram". Don't bother creating a class pool just to run the test.
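As a minimal sketch of that setup, a local test class can sit directly in the report. The class name ltc_sketch and the single test case below are only illustrative; the full ltc_main class from above would go in the same place.

```abap
REPORT ztestprogram.

" Local test class placed directly in the report; ltc_sketch is a hypothetical name.
CLASS ltc_sketch DEFINITION
      FOR TESTING
      DURATION SHORT
      RISK LEVEL HARMLESS.
  PRIVATE SECTION.
    METHODS single_word FOR TESTING.
ENDCLASS.

CLASS ltc_sketch IMPLEMENTATION.
  METHOD single_word.
    " One-word case from the thread, which plain to_mixed already handles.
    cl_abap_unit_assert=>assert_equals(
      act = to_mixed( val = `AGGSTALL` case = 'A' )
      exp = `Aggstall` ).
  ENDMETHOD.
ENDCLASS.
```

In ADT the tests can then be started from the report via Run As > ABAP Unit Test.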

PS: bravo for using Eclipse ADT!