Selecting Columns by Data Type in Power Query
This blog shows how to select columns by data type. This is helpful if you have hundreds of columns and want to see, for example, only numerical or date columns.
It's not uncommon to have a table with hundreds of columns with different column types such as numerical, text, date, percentage etc. You can select columns by using the UI option "Choose Columns" and then selecting which columns to keep. But if you have have tens of columns of the same type, you will have to manually go through the list of columns and select the columns to keep. As far as I know, there is no direct way to only keep, for example, numerical columns. In Pandas, you can use
select_dtypes method and pass which columns to choose based on the data types. No such option or function is available in Power Query.
In this example, I will show how I achieved it. You can customize it or turn it into a function based on your use case.
I am using two M functions to achieve the desired output -
Value.Type returns the data type of the value. For example,
number as type, while
text as type. There is not much about it in the official documentation here
Type.Is is a logical function that you can use to get a True/False if the type equals defined column type.
For example, to check if number 123 is a text you would write
Type.Is(Value.Type(123), Text.Type). This will return
Type.Is(Value.Type(123), Number.Type) returns
TRUE. Here we first obtained the type of the value using
Value.Type and then compared it with
Number.Type. Since both match, the result is
TRUE. You can see a list of all data types in M here.
We can combine these two functions with parameters to dynamically select columns based on selected data type.
First create a parameter list as shown below. Include all data types you want to select. In this case, I am only selecting text, numeric and date type. I have also included 'All' to return the entire table.
You can see the code below, but at a high level here are the steps:
- Create a parameter list
- Ensure all columns have correct data types assigned. This is important because that's how M will idenify the columns to select
- Unpivot the table to create two columns - 'ColumnName' & 'Value'
Type.Isand create a new column that shows the data type of the value
- Link parameter list with the filter to select data type
- Get filtered column list and pass it to
You can see it in action below:
To apply this to your table, you only need to make sure
SourceTable2 is your table with all columns with correctly assigned data types. You could convert it to a function to take table and data type as arguments. I will leave it to you.
While thinking more about ways to make this function better, I actually found two more ways. One is to use
Table.ColumnsofType (documentation) and other is to use
Table.Schema gives you metadata of the entire table along with the column types. I can use the same approach as above, but just use
Table.Schema instead to get data type of each column. This is faster, better, cleaner!!! Here is the code: