SAP Knowledge Base Article - Public

2366796 - Best Practices for Modelling Data Sources and Reports - Performance

Symptom

You are creating many data sources and reports and you would like to know what the best practices are to ensure the best performance. This is especially relevant if you handle a large data volume.

Please note: Generally, the performance of data sources and reports cannot be compared between 2 systems, even if the data source / report is the same. There are many factors that influence the performance. Below you will find some guidelines on the relevant data source / report settings.

Environment

  • SAP Business ByDesign
  • SAP Hybris Cloud for Customer

Resolution

Best Practices for Modelling Data Sources

Fundamental understanding:

  • An inner join returns only those rows from the left table that have a matching row in the right table based on the join criteria.
  • A left outer join returns all rows from the left table even if no matching rows where found in the right table. Any values selected out of the right table will be null for those rows where no matching row is found in the right table.

For a detailed explanation of the join types, please see the help document called Data Sources.

  • A Cloud Data Source can be used in order to upload external data into your solution. Please note the following restrictions for cloud data sources:
    • No access context is currently supported for cloud data sources, so the access context will always be Unrestricted. This means that whoever has access to reports based on this data source will see the complete data set.
    • If you use a cloud data source as part of a join and then create a report based on this join, the Add Fields button will not show the fields from the cloud data source.

Best Practices:

  • Combined data sources make sense where all involved data sources have many characteristics in common, but have different key figures. If there are only very few matches between the characteristics, a join might return better results. For example, you would like to see the combined revenue of a company, irrespective of whether it is coming from sales orders or service orders. So you can create a combined data source on sales orders and service orders to see the aggregated revenue of a company.
  • Combined data sources tend to be better in terms of performance. In a join, the data volume is multiplied MxN, depending on the join conditions. Hence it is also advisable to reduce the number of join conditions if possible.
  • When joining data sources, inner joins are the preferred approach
  • If left outer joins are unavoidable, please note the following:
    • The data is fetched from the left to the right join members
    • For this reason, to improve performance, the anchor of your data source should be the one containing least records
    • If there are more than 2 join members, sort them by increasing data volume, i.e. the smallest data source should be the anchor, the next bigger one should be to the right of it, and the largest data source should be to the very right of the table.
    • For performance, it is preferrable to have a master data data source as an achor, instead of a transactional data source.
  • Avoid numerous "layers" of data sources, for example scenarios like the example below. These are complex joins and their calculation may take a very long time.
    • Join 1: several standard data sources are joined
    • Join 2: Join 1 and one or several standard data sources are joined
    • Join 3: Join 2 and more standard or customized data sources are joined
    • and so on...
    • Such scenarios should be avoided.
  • The join conditions should be set on UUID fields. Joins on ID fields should be avoided because IDs can have different lengths in different data sources and the join will fail in those cases.

Best Practices for Modelling Reports:

  • Avoid the Show Master Data setting (step 4 in the report wizard) unless absolutely necessary. If this setting is required, it should be limited to 1 characteristic.
  • If possible, mandatory selection variables should be set up in the steps 4 and 5 in the report wizard.
  • When setting variables in the report, this should be done via the report wizard, not via the Add Fields button in the report. End users can use the Add Fields button if they need to add one or two variables, whereas if you designing a report as a key user please set the variables in step 4 of the report wizard. Otherwise, the system first fetches the data for all characteristics and key figures that are part of the report, and then another call is made to get the data for the fields added via Add Fields. This will impact the report performance negatively.
  • When choosing fields for variables, try to use those fields that come from the anchor data source. Setting filters on fields from the anchor data source is always preferable.
  • Exclude selections take up more system resources than Include. Try to avoid the Exclude operator if possible, however if they are necessary, set them on fields of type ID, rather than UUID. So for example, if you need a negative selection (IS NOT) on the Employee field, make use of the field Employee Responsible ID, rather than Employee Responsible.

 

Other Factors

There are other factors that influence the performance of Analytical objects in systems, such as overall data volume in the system, system usage, background jobs and their frequency etc. For this reason, a comparison of the performance of data sources and reports between different systems is not feasible, even if the above settings are the exact same.

Keywords

report cannot be opened, time out, report loading time, dump , KBA , best practices , performance report , AP-RC-ANA , C4C, ByD, Travel: Analytics Tools & Frameworks , How To

Product

SAP Business ByDesign all versions ; SAP C4C CORE all versions ; SAP Cloud for Customer all versions