5 Use Cases of GROUP BY for Internal Tables

nomssi · ‎11-30-2017

ABAP developers target a moving Netweaver platform (7.0, or 7.02 or 7.31 or 7.40 or 7.5x or the Cloud) where the state of the art method to shoot yourself in the foot is a contest between vintage idioms and idioms using recent additions to the language.

* Add line to internal table LT_ENTRY

ls_entry-action = shoot.

ls_entry-aim = foot.

APPEND ls_entry TO lt_entry.



* Version 2 using field-symbol

APPEND INITIAL LINE TO lt_entry ASSIGNING <ls_entry>.

<ls_entry>-action = shoot.

<ls_entry>-aim = foot.



* Version 3

APPEND VALUE #( action = shoot

                aim = foot ) TO lt_entry.



* Version 4

lt_entry = VALUE #( BASE lt_entry 

                    ( action = shoot aim = foot ) ).

ABAP developers struggle in a popularity contest that drives the adoption of no nonsense features but pushes powerful features available in Netweaver out of fashion because they are or seem complex, like

LOOP AT WHERE

A Graph Library

A Lexer

table expression's DEFAULT / OPTIONAL parameter

GROUP BY for internal tables.

Case in Point

GROUP BY for internal tables is nothing new under the sun. Horst Keller announced it 3 years ago and then produced a sequel last year explaining binding. Then Kerem Koseoglu called it cool. I think it outmodes the following idioms

SORT + DELETE ADJACENT DUPLICATE.

COLLECT for unique table entries.

LOOP AT NEW

and it is simple enough so everyone should be using it, today.

So what is GROUP BY for internal tables?

Read the introduction courtesy Horst Keller and (his) official documentation. I will motivate with before and after samples below and perfect the confusion by saying grouping is

classification: creation of groups and unique assignment of table entries to a group

dynamic mapping while defining a group key for sorting

native: trying to implement a dynamic sorting logic once made me miss native priority queues in ABAP

I was not using it because...

It does not work on my Netweaver stack, we still have 7.31 around and downgrading is a pain...

Once every second week, I have an epic coding moment when I really grasp GROUP BY. But then, it passes...

It is arcane, I didn't have to, was not forced use it to and could live without it.

It must hurt, it isn't called bleeding edge for nothing...

I am now using it because...

7.40 is not cutting edge (anymore), and compared to the procedural to objects conundrum, GROUP BY is a low-hanging fruit with an impressive documentation.

I abstract its power with a word: grouping. Grouping is a recurring problem in data processing. It is widely applicable, it is a pattern. While refactoring, GROUP BY reveals the intent of the design better than the alternative ABAP constructs.

With a native idiom available, it is easier to talk about design. We will hear ABAP developer bragging about grouping soon:

Do not DELETE ADJACENT DUPLICATES, GROUP BY a given key

Do not LOOP.. AT NEW, GROUP BY

Do not sort the table according to an alternate key, GROUP BY varying keys.

Fireside Chat with CAPRA

CAPRA: Okay, so you are an ABAP developer? That don't impress me much. Don't get me wrong, I think you're alright, but which subset of the 700+ keywords do you really know and use?

me: I can GROUP BY with representative binding!

CAPRA: wow! are you now a member of a group WITHOUT MEMBERS?

me: Stop kidding me!

How am I using it?

Change Purchase Orders

To update a list of purchase order items via BAPI_PO_CHANGE each BAPI call is done with the collected list of items belonging to a single purchase order. I used to do it like this

METHOD update.

  DATA ls_item LIKE LINE OF it_item.

  DATA lt_item TYPE tt_item.



  LOOP AT it_item INTO ls_item.

    AT NEW ebeln.

      CLEAR lt_item.

    ENDAT.



    APPEND ls_item TO lt_item.



    AT END OF ebeln.

      update_po( lt_item ).

    ENDAT.

  ENDLOOP.

ENDMETHOD.                    "update

I can now create a GROUP at Purchase Order level:

  METHOD update.

    LOOP AT it_item INTO DATA(ls_item) GROUP BY ls_item-ebeln INTO DATA(lv_ebeln).

      update_po( VALUE #( FOR g IN GROUP lv_ebeln ( g ) ) ).

    ENDLOOP.

  ENDMETHOD.

Priority Queue for Messages

The BAPI call returns messages of type A, E, X, I, W, S. I have pick one to be displayed in an ALV output, I do not care which info/warning/status message is selected, but if error messages (A, E, X) should be preferred.

The design is to order the return messages according to type and pick up the message with the highest priority for display. In a first approach, I assume type X does not occur so lexical ordering AEISWX is appropriate:

METHOD get_current_message.

  DATA lt_return LIKE it_return.

  DATA ls_return TYPE bapiret2.

* Priority

  SORT lt_return BY type.

  LOOP AT lt_return INTO ls_return.

    EXIT.

  ENDLOOP.

  rv_message = ls_return-message.

ENDMETHOD.                    "get_current_message

With GROUP BY it would be:

 METHOD get_current_message.

*   Priority

    LOOP AT it_return INTO DATA(ls_return) GROUP BY ls_return-type.

      rv_message = ls_return-message.

      RETURN.

    ENDLOOP.

  ENDMETHOD.

And it is now easy to roll out my custom ordering logic

  METHOD priority.

    rv_prio = SWITCH #( iv_type WHEN 'A' THEN 0

                                WHEN 'X' THEN 1

                                WHEN 'E' THEN 2

                                WHEN 'W' THEN 3

                                WHEN 'S' THEN 4

                                ELSE 5 ).

  ENDMETHOD.



  METHOD get_current_message.

    LOOP AT it_return INTO DATA(ls_return) GROUP BY priority( ls_return-type ).

      rv_message = ls_return-message.

      RETURN.

    ENDLOOP.

  ENDMETHOD.

Wait, I could use the standard translate( ) function here

  METHOD get_current_message.

    LOOP AT it_return INTO DATA(ls_return) 

      GROUP BY translate( val = ls_return-type from = `AXEWS ` to = `012345` ).

      rv_message = ls_return-message.

      RETURN.

    ENDLOOP.

  ENDMETHOD.

Well, the code have two bugs, thanks christian.guenter and sdfraga

Message type 'I' is not handled

the ASCENDING or DESCENDING addition must be used to reorder the groups

This is an opportunity to write failing ABAP unit tests before applying this correction and check if the tests now pass:

  METHOD get_current_message.

    LOOP AT it_return INTO DATA(ls_return) GROUP BY 

      translate( val = ls_return-type from = `AXEWSI ` to = `0123456` ) ASCENDING.

      rv_message = ls_return-message.

      RETURN.

    ENDLOOP.

  ENDMETHOD.

Filter IDocs by Segment

To filter out entries containing a given segment from a list of IDocs, I could do

* we check witch idoc message types contains the required segment

  lt_edidc[] = ct_edidc[].

  SORT lt_edidc BY idoctp cimtyp.

  DELETE ADJACENT DUPLICATES FROM lt_edidc COMPARING idoctp cimtyp.



  LOOP AT lt_edidc ASSIGNING <ls_edidc>.

    REFRESH lt_syntax.

    CALL FUNCTION 'EDI_IDOC_SYNTAX_GET'

      EXPORTING

        pi_idoctyp      = <ls_edidc>-idoctp

        pi_cimtyp       = <ls_edidc>-cimtyp

      TABLES

        pt_syntax_table = lt_syntax

      EXCEPTIONS

        OTHERS          = 0.

    READ TABLE lt_syntax TRANSPORTING NO FIELDS WITH KEY segtyp = iv_segnam.

    IF sy-subrc NE 0.

      DELETE ct_edidc WHERE idoctp = <ls_edidc>-idoctp

                        AND cimtyp = <ls_edidc>-cimtyp.

    ENDIF.

  ENDLOOP.

But now that I think in groups, I will GROUP BY:

METHOD filter_by_segment.

    DATA lt_syntax TYPE STANDARD TABLE OF edi_iapi06.



    LOOP AT ct_edidc INTO DATA(ls_edidc)

      GROUP BY ( idoctp = ls_edidc-idoctp

                 cimtyp = ls_edidc-cimtyp )

      WITHOUT MEMBERS ASSIGNING FIELD-SYMBOL(<ls_group>).



      CLEAR lt_syntax[].

      CALL FUNCTION 'EDI_IDOC_SYNTAX_GET'

        EXPORTING

          pi_idoctyp      = <ls_group>-idoctp

          pi_cimtyp       = <ls_group>-cimtyp

        TABLES

          pt_syntax_table = lt_syntax

        EXCEPTIONS

          OTHERS          = 0.



      CHECK NOT line_exists( lt_syntax[ segtyp = iv_segnam ] ).

      DELETE ct_edidc WHERE idoctp = <ls_group>-idoctp

                        AND cimtyp = <ls_group>-cimtyp.

    ENDLOOP.



  ENDMETHOD.

The WITHOUT MEMBERS addition is needed because entries in the GROUP are changed.

Report Contract Release Documentation

From a list of purchasing contract items I compare the release history with the sum of invoices. A s those are value contracts an aggregation of the items is made to compare at contract header level. Without GROUP BY, LOOP AT NEW helps:

  METHOD add_data.

    DATA lv_logsy TYPE logsys.

    DATA ls_totals TYPE lcl_contract=>ts_totals.

    DATA lv_error TYPE flag.

    DATA lo_contract TYPE REF TO lcl_contract.

    FIELD-SYMBOLS <ls_ekab> TYPE ts_ekab.



    CHECK mt_ekab IS NOT INITIAL.



    lv_logsy = get_logsys( ).

    CREATE OBJECT lo_contract

      EXPORTING

        io_ekab = me

        io_rfc  = io_rfc.



*   Value Contracts

    LOOP AT mt_ekab ASSIGNING <ls_ekab>. 



      AT NEW konnr.

        lo_contract->consumption( EXPORTING iv_konnr = <ls_ekab>-konnr

                                  IMPORTING es_totals = ls_totals

                                            ev_error = lv_error ).

      ENDAT.



      <ls_ekab>-logsy = lv_logsy.

      <ls_ekab>-wrbtr = read_history( <ls_ekab> ).

      IF <ls_ekab>-erekz EQ abap_true.

        <ls_ekab>-total = <ls_ekab>-wrbtr.

      ELSE.

        <ls_ekab>-total = <ls_ekab>-netwr.

      ENDIF.

      IF lv_error EQ abap_false.

        <ls_ekab>-rewwb = ls_totals-value.

      ENDIF.

    ENDLOOP.



  ENDMETHOD.

But grouping with GROUP BY is simpler to grasp

  METHOD add_data.

    CHECK mt_ekab IS NOT INITIAL.

    DATA(lv_logsy) = get_logsys( ).

    DATA(lo_contract) = NEW lcl_contract( io_ekab = me

                                          io_rfc = io_rfc ).

    LOOP AT mt_ekab INTO DATA(ls_ekab) GROUP BY ls_ekab-konnr INTO DATA(lv_konnr).



      lo_contract->consumption( EXPORTING iv_konnr = lv_konnr

                                IMPORTING es_totals = DATA(ls_totals)

                                          ev_error = DATA(lv_error) ).



      LOOP AT GROUP lv_konnr ASSIGNING FIELD-SYMBOL(<ls_ekab>).

        <ls_ekab>-logsy = lv_logsy.

        <ls_ekab>-wrbtr = read_history( <ls_ekab> ).

        <ls_ekab>-total = SWITCH #( <ls_ekab>-erekz WHEN 'X' THEN <ls_ekab>-wrbtr 

                                                             ELSE ls_ekab>-netwr ).

        IF lv_error EQ abap_false.

          <ls_ekab>-rewwb = ls_totals-value.

        ENDIF.

      ENDLOOP.

    ENDLOOP.

  ENDMETHOD.

Aggregate Function SUM like SQL's GROUP BY

Next - building a subtotal like the aggregate GROUP BY for database tables

  METHOD convert.



    CLEAR et_comm.

    LOOP AT it_ekab ASSIGNING FIELD-SYMBOL(<ls_ekab>) 

        GROUP BY ( ekgrp2 = <ls_ekab>-ekgrp2

                   konnr =  <ls_ekab>-konnr

                   logsy = <ls_ekab>-logsy ) INTO DATA(ls_group).

      

      DATA(ls_comm) = VALUE ts_comm( ekgrp2 = ls_group-ekgrp2

                                     konnr = ls_group-konnr

                                     logsy = ls_group-logsy ).

      LOOP AT GROUP ls_group INTO DATA(ls_ekab).

        ls_comm-ktwrt = ls_ekab-ktwrt.        " Contract target value

        ls_comm-total = ls_ekab-rewwb.        " Contract sum of releases

        ls_comm-kwaers = ls_ekab-kwaers.      " Contract currency



        ADD ls_ekab-total TO ls_comm-wrbtr.   " Sum invoices or PO value if no EREKZ

      ENDLOOP.

      ls_comm-exceeded = xsdbool( ls_comm-wrbtr + ls_ekab-rewwb > ls_comm-ktwrt ).



      APPEND ls_comm TO et_comm.

    ENDLOOP.



  ENDMETHOD.

Now, downgrading is pain,

  METHOD convert.

    DATA ls_comm TYPE ts_comm.

    FIELD-SYMBOLS <ls_ekab> TYPE ts_ekab.



    CLEAR et_comm.



    TYPES: BEGIN OF ts_key,

             ekgrp2 TYPE ekgrp,

             konnr TYPE konnr,

             logsy TYPE logsys,

           END OF ts_key.

    DATA lt_key TYPE SORTED TABLE OF ts_key WITH UNIQUE KEY ekgrp2 konnr logsy.

    DATA ls_key LIKE LINE OF lt_key.



    LOOP AT mt_ekab ASSIGNING <ls_ekab>.

      MOVE-CORRESPONDING <ls_ekab> TO ls_key.

      INSERT ls_key INTO TABLE lt_key.

    ENDLOOP.



    LOOP AT lt_key INTO ls_key.



      LOOP AT it_ekab ASSIGNING <ls_ekab> WHERE ekgrp2 = ls_key-ekgrp2

                                            AND konnr = ls_key-konnr

                                            AND logsy = ls_key-logsy.

        AT FIRST.

          MOVE-CORRESPONDING <ls_ekab> TO ls_comm.

        ENDAT.

        ls_comm-ktwrt = <ls_ekab>-ktwrt.        " Contract target value

        ls_comm-total = <ls_ekab>-rewwb.        " Contract sum of releases

        ls_comm-kwaers = <ls_ekab>-kwaers.      " Contract currency



        ADD <ls_ekab>-total TO ls_comm-wrbtr.   " Sum invoices or PO value if no EREKZ

        AT LAST.

          DATA lv_sum LIKE ls_comm-wrbtr.



          lv_sum = ls_comm-wrbtr + <ls_ekab>-rewwb.

          IF lv_sum > ls_comm-ktwrt.

            ls_comm-to_be_checked = abap_true.

          ELSE.

            ls_comm-to_be_checked = abap_false.

          ENDIF.

          APPEND ls_comm TO et_comm.

        ENDAT.

      ENDLOOP.

    ENDLOOP.



  ENDMETHOD.                    "convert

Expressions

Modern ABAP introduced expressions, creating a conflict almost by design as new features were hitherto added to ABAP as new statements.

A statement does something: statements work on data; they are imperative and applied in sequence.

An expression returns a value: pure functions (without side effect) are composable. But note ABAP does not try to be a functional language, the type system does not support complex/algebraic data types

So GROUP BY also has an expression oriented form.

  METHOD convert.

    et_comm = VALUE #( FOR GROUPS ls_group OF <ls_ekab> IN it_ekab

                     GROUP BY ( ekgrp2 = <ls_ekab>-ekgrp2

                                konnr =  <ls_ekab>-konnr

                                logsy = <ls_ekab>-logsy



                                ktwrt = <ls_ekab>-ktwrt    " target value

                                total = <ls_ekab>-rewwb    " sum of releases

                                kwaers = <ls_ekab>-kwaers  " Contract currency

                                exceeded = <ls_ekab>-exceeded



                                rewwb = <ls_ekab>-rewwb

                                count = GROUP SIZE )

             ( VALUE #( LET lv_sum = REDUCE wrbtr( INIT lv_val TYPE wrbtr  

                                                   FOR ls_ekab IN GROUP ls_group

                                                   NEXT lv_val = lv_val + ls_ekab-total )

               IN BASE CORRESPONDING ts_comm( ls_group )

               wrbtr = lv_sum  " Sum invoices or PO value if no EREKZ

               to_be_checked = xsdbool( lv_sum + ls_group-rewwb > ls_group-ktwrt ) ) ) ).

  ENDMETHOD.

Are you using it?

Discern wisdom from myths and you will know when GROUP BY should be used. Should it be more widely used? I say yes. You can check grouping now and add your comment, or write yet another blog about it.

SuhaSaha · ‎11-30-2017

IMO, the GROUP BY for internal tables makes the loop control statements(esp. AT NEW…ENDAT) obsolete.

I personally use it it do the subtotals (your last use case) but have occasionally used it instead of the loop control statements.

I found your example of sorting the RETURN message table based on the message type really interesting & will try to implement it when need arises

BR Suhas

PS – Honestly i’m not yet comfortable with the FOR GROUP BY function, i tend to use the verbose LOOP…GROUP BY!

DanielG · ‎12-01-2017

I lately try to use proper sorted/hashed tables instead of standard tables and manual sorting, so I quite frequently use the GROUP BY feature in constructor expressions when filling said tables. It does the job, but frankly, I'd prefer if there were syntatically simpler ways to do get a duplicate free collection of items.

SuhaSaha · ‎12-01-2017

I lately try to use proper sorted/hashed tables instead of standard tables and manual sorting … if there were syntatically simpler ways to do get a duplicate free collection of items

Can’t you use COLLECT to fill the records?

Cf. https://help.sap.com/doc/abapdocu_751_index_htm/7.51/en-US/index.htm?file=abencollect_guidl.htm

DanielG · ‎12-01-2017

Thanks for the suggestion.

It's not really an option for me. For one, I like to use the expressions for creating values instead of doing it imperatively (as long as it can be reasonably done), and also:

If a row with the relevant primary key already exists in the target table, the values of numeric components are added to the values in the existing table row.

If the items I want to collect are structures with numeric attributes, this is in 99.9% of cases an undesired side effect.

ChristianGünter · ‎12-01-2017

Nice work Jacques.

I have a question about your examples ‘Priority Queue for Messages’ with GROUP BY. Do they rely also on the sort order of the internal table?

Sandra_Rossi · ‎12-01-2017

I'm sorry to say that my current Customer is using 7.40 SP 7. Did they do it on purpose so that we developers can't use GROUP BY and REDUCE !? 🙂

nomssi · ‎12-01-2017

Thanks Christian.

It is yes, the messages are in the same order as in the internal table.

From the help:

The default order of the groups in the group loop plus the order of the members within a group is defined by the processing order of the LOOP in the first phase:

The default order of the groups is based on the time their group key is first created, which itself can be overridden using the additions ASCENDING or DESCENDING.

If the assignment of the rows to their group is defined, the order of the rows of a group is based on the time they are assigned to the group. This defines, in particular, the first row of each group, used as a representative in the representative binding.

nomssi · ‎12-01-2017

Thanks for your feedback Daniel.

How could this ignore duplicates option be implemented? Maybe in the FILTER operator, with representative binding as in GROUP BY.

ChristianGünter · ‎12-03-2017

Hm ok. But wouldn’t it give incorrect results in this case?

...



DATA(lt_return) = VALUE bapiret2_tab( ( type = 'I' message = 'Type I' )

                                      ( type = 'X' message = 'Type X' ) ).





DATA(lv_message) = get_current_message( lt_return ).



...



  METHOD get_current_message.

    LOOP AT it_return INTO DATA(ls_return) 

      GROUP BY translate( val = ls_return-type from = `AXEWS ` to = `012345` ).

      rv_message = ls_return-message.

      RETURN.

    ENDLOOP.

  ENDMETHOD.

Or am I getting something wrong?

nomssi · ‎12-03-2017

I see the bug now, thanks for the report Christian.

while the priority( ) mapping would anything not specified to 5, I missed a mapping for type 'I' in the translate( ) mapping. I would need something like

translate( val = ls_return-type from = `AXEWSI ` to = `0123456` ).

I guess 'I' is larger than any numbers so the behavior is still as expected, but this will be a warning to create an ABAP unit test here.

JNN

sdfraga · ‎12-05-2017

Hello Jacques,

This is a very nice blog showcasing the usage of GROUP BY. Thanks.

Regarding the message example, I have tryed to implement this but I always get the first entry on the table:

    data(return) = value bapiret2_tab( ( type = 'I' message = 'Type I' )

                                       ( type = 'A' message = 'Type A' )

                                       ( type = 'X' message = 'Type X' )  ).



    loop at return into data(ls_return)"assigning field-symbol(<return>)

        group by translate( val = ls_return-type from = 'AXEWSI ' to = '0123456' ).

      r_message = ls_return-message.

      return.

    endloop.

r_message will always have the "Type I" value.

As far as I understand the idea fo the group by and translate is to return the message converning the order on the from right?

In the example we should get the message A first.

Am I missing something?

Thanks again for the excelent blog!

Sérgio Fraga

nomssi · ‎12-05-2017

Hello Sergio,

thanks for the test case, I found the bug: you must add the ASCENDING keyword after GROUP BY clause.

LOOP AT return INTO DATA(ls_return)

    GROUP BY translate( val = ls_return-type from = `AXEWSI `

                                               to = `0123456` ) ASCENDING.

And make sure to use the ` separator for string with trailing space.

JNN

Jelena · ‎12-06-2017

Nice blog, thanks for sharing! These specific use cases are very helpful and I also liked the RETURN table idea especially. This is exactly what I'm looking for when trying new things. "Because Horst (or Paul) told us so" just doesn't always work as a motivation. 🙂

7.4 is still beyond cutting edge for us but I'm sure I'll come back to this as soon as we get there. And it's a one more item I can bring up as "look, this is why we need to upgrade".

Thanks again! Well done.

shais · ‎01-09-2018

cool 😉

peter_collins3 · ‎04-20-2020

Let's say I have an internal table with 2 fields, MATNR and MENGE. I can get a unique list of materials in T_SUM with the following:

TYPES: BEGIN OF ty_mat,

         matnr TYPE matnr,

         menge TYPE menge_d,

       END OF ty_mat,

       ty_t_mat TYPE STANDARD TABLE OF ty_mat WITH DEFAULT KEY.



DATA: t_mat TYPE ty_t_mat,

      t_sum TYPE ty_t_mat.



* (t_mat is already populated)



  t_sum = VALUE #(

            FOR GROUPS mat OF <fs_mat>

            IN t_mat

            GROUP BY ( matnr = <fs_mat>-matnr ) ASCENDING

            ( matnr = mat-matnr ) ).

How do I amend this statement to also include in T_SUM the total quantity of each material?

Obviously I can do this in various ways with the COLLECT statement, but I want to know how to do it this way (irrespective of whether or not it's the best way!)

The final example in the blog looks like it might be doing something similar.

Thanks!

peter_collins3 · ‎04-20-2020

Actually, as soon as I posted this I've got something working:

t_sum = VALUE #(

          FOR GROUPS mat OF <fs_mat>

          IN t_mat

          GROUP BY ( matnr = <fs_mat>-matnr ) ASCENDING

          ( matnr = mat-matnr

            menge = REDUCE menge_d( INIT lv_menge TYPE menge_d

                                    FOR ls_mat IN GROUP mat

                                    NEXT lv_menge = lv_menge + ls_mat-menge ) ) ).

It does produce the correct result, but any comments? I like the fact that it is one statement (rather than LOOP ... COLLECT ... ENDLOOP)

nomssi · ‎04-20-2020

Comment? This is exactly how we do it: use REDUCE to compute any aggregate.

The blog's own CONVERT( ) method uses this approach (and LET to reuse the result in a condition).

regards,

JNN

adityaw · ‎11-29-2022

I know it has been 5 years, but have you found a simpler way yet? I too have started using sorted tables and find it annoying to remove duplicates.

nomssi · ‎12-20-2022

The DISCARDING DUPLICATE option of the CORRESPONDING operator might help:

DATA(sorted_table) = CORRESPONDING sorted_table_type( std_table DISCARDING DUPLICATES ).

adityaw · ‎01-04-2023

Thanks, this works!

gustavo_lobo2 · ‎02-02-2023

It is very good and so helpful. Well done.