Power Query for Azure tables and updating column names
I recently did some PowerBI work using Azure Table Storage. In the query editor, it seemed that each table I pulled in from Azure Tables had a very similar process.:
- Get the list of tables available for the storage account.
- Select a table
- Select the Content of the table (excluding the RowKey, PartitionKey, and timestamp columns)
- Expand the columns in the Content
- Rename the columns
Azure table column names don't allow spaces, but they are case sensitive, so I found that if I was careful to use PascalCasing in my Azure Table columns (which I am careful to do in any storage container), I could adapt the steps above to make a function that would do all those things for me, just by providing the name of the account and the name of the table.
let GetAzTable = (account as text, table as text) =>
let
Source = AzureStorage.Tables(account),
Table1 = Source{[Name=table]}[Data],
ContentOnly = Table.SelectColumns(Table1,{"Content"}),
ColumnName = Table.ColumnNames(ContentOnly){0},
Content = Table.Column(Table.FirstN(ContentOnly,100),ColumnName),
ColumnsToExpand = List.Distinct(List.Combine(List.Transform(Content,
each if _ is record then Record.FieldNames(_) else {}))),
NewColumnNames = List.Transform(ColumnsToExpand, each NewColumnName(_)),
ExpandedContent = Table.ExpandRecordColumn(ContentOnly, ColumnName, ColumnsToExpand,NewColumnNames)
in
ExpandedContent
in GetAzTable
In this code, I reference another function that generates the new "nice" column name based on the column name in the Azure table. This function doesn't do the name formatting, but it determines if a column should get a nice name or not. I use a naming convention for columns that are meant to be used internally - either as a join key or for a sort column, or for a measure calculation. This function returns the column name "as is" if it follows the naming convention for internal columns, or calls the function GetPascalName to make it a "nice" name.
let NewColumnName = (colName as text) as text =>
let
Source = if Text.EndsWith(colName, "Key") or Text.EndsWith(colName,"Num") or Text.EndsWith(colName, "Sort") or Text.EndsWith(colName,"SortOrder") or colName = Text.Upper(colName) then colName else GetPascalName(colName)
in
Source
in NewColumnName
Here is the cool code:
let PascalName = (pascalName as text) as text =>
let
Source = Text.ToList(pascalName),
Caps = { "A" .. "Z" },
NewSource = List.Combine(List.Transform(Source, (x) => if List.PositionOf(Caps, x) >= 0 then {" ",x} else {x})),
result = Text.Trim(Text.Combine(NewSource, ""))
in
result
in PascalName
This code takes a text value and turns it into a list (of individual characters). Then it recombines the list of characters, adding a space before each capital letter. Finally, it trims the leading space.
This isn't perfect by all means - it could be improved by only putting a space in front of the first capital in a series of capitals. A column named "FTPUrl" will come back as "F T P Url". (Note that I've already caught the case of a column with ALLCAPS in the NewColumnName function.) Just looking at this edge case, I think the algorithm would need to put a space before the first and the last capital in a series of capitals (" FTP Url", trimmed to "FTP Url").
Care to put your improved algorithm below in the comments?