Table.AddRankColumn

Table.AddRankColumn is the latest addition to the M language which lets you create ranks in Power Query directly. No more buffer-sort-add index column trick which was computationally expensive and could be less accurate. With this function you can specify one or more columns to rank by and also the order (ascending/descending). This is not a replacement for the RANKX function in DAX. If your rank order is not going to change or need it for validation purposes, this is a great function.

Read the official documentation for all the function parameters. But Reza Rad has the most comprehensive overview of the function with examples so definitely check it out.

Note: You need to upgrade to June 2022 version of Power BI to use it in Desktop.

Subgroup Rank

In the example below, we have three products, their colors and the Units. Rank1 column shows the rank by number of units for all the products in the ascending order. What we are looking for is the subgroup rank, i.e rank within each product type as shown in Rank2 column.

import pandas as pd
df = (pd.read_clipboard()
      .assign(Units = lambda s:s['Units'].astype('int'))
      .assign(Rank1 = lambda s:s['Units'].rank().astype('int'))
      .assign(Rank2 = lambda s:s.groupby('Product')['Units'].transform('rank').astype('int'))
      .sort_values(['Product','Rank2'])
           
     )
df
Product Color Units Rank1 Rank2
1 Product A Red 4549 6 1
2 Product A Green 5562 10 2
0 Product A Black 7635 11 3
3 Product A Yellow 15000 12 4
4 Product B Black 1513 2 1
5 Product B Red 2969 3 2
8 Product B Black 3860 4 3
7 Product B Yellow 5409 8 4
6 Product B Green 5491 9 5
11 Product C Yellow 600 1 1
10 Product C Green 4264 5 2
9 Product C Red 4839 7 3

Here is my walkthrough of how you can create subgroup ranks in Power Query.

Important Notes:

  1. As I mention in the video, this function is not query foldable so be sure to plan your steps to utilize folding first. If you don't know what is query folding, watch my in-depth session on query folding.

  2. When you create the rank column, it will always sort the table by the newly created rank column. This may not always be desired. For example if your table is sorted by date or a key column when imported, that sort order will be disrupted when you create the rank column. I am not sure if there is a way to prevent that from happening. This also has an unintended effect that vertipaq compression may not be optimal. As you know, cardinality of the column matters in vertipaq compression but distribution of values matters as well. If the sort order is disrupted it may increase the size of the dataset. Be sure to check before and after.

  3. I haven't tested this on a large dataset but be sure to check the refresh time and dataset size after creating the rank column.

  4. Look at the RankKind parameter to specify how the ranking should be done.