Springwater 05 | Creating the Site dimension

Springwater 05 | Creating the Site dimension

Introduction

In previous posts, I created simple dimensions in Power Query for the Revenue Type and Cost Type dimensions in my Power BI data model. In this post I’m going to tackle something more complex – the Site dimension.

The original OLTP database stores site information in six tables in the client schema:

site-oltp

For simplicity in the Power BI data model, I want to combine all of the data in these six tables into a singe Site dimension table.

Source data

As with the Revenue Type and Cost Type dimensions, the first step is to extract the source data from the OLTP database. For this I’ll use a query called client_Site in Power Query:

// client Site (data source query)

let
    q = Sql.Databases ( "SWITCH" )
        { [ Name = "Springwater" ] } [ Data ]
        { [ Schema = "client", Item = "Site" ] } [ Data ],
    Result = if buffer then Table.Buffer ( q ) else q
in
    Result

Remember that I’ve enabled the option in Power Query to import relationships from the source database on first load, so this query returns not only the rows in the Site table, but also the related values from the Client and City tables and the related rows from the Set table:

The Set table appears twice in the query because there are two relationships between the Site table and the Set table in the OLTP database – I’ll explain why in a later post. For the purposes of the Site dimension I don’t need the related Set rows, so I’ll be discarding these two columns in any case.

The Site dimension query

The query that defines the Site dimension has six steps. Firstly, and straightforwardly, we need to reference the source data query:

// Site (dimension query)

let
    Source = client_Site,

Expanding required columns

The next step is to expand the columns needed for the dimension. This will involve a number of nested expansions. In the code below, I expand the client.Client and client.City columns. The client.Client column contains only one attribute of interest, the client name, which is expanded as the Client attribute in the resulting dimension query. However, when I expand the client.City column, this exposes the client.Country table (see the OLTP diagram earlier in this post). Therefore, I need to expand client.Country as well. This expansion in turn exposes the client.Forecast Group and client.Region tables, which also require expansion. The resulting code contains five nested expansions, which I’ve grouped together in a single Power Query step:

    // Expand all required columns currently represented as related values
    ExpandColumns = 
    Table.ExpandRecordColumn (
        Table.ExpandRecordColumn (
            Table.ExpandRecordColumn (
                Table.ExpandRecordColumn (
                    Table.ExpandRecordColumn (
                        Source, 
                        "client.Client", 
                        { "Client Name" }, 
                        { "Client" }
                    ), 
                    "client.City", 
                    { "City Name", "client.Country", "Latitude", "Longitude" },
                    { "City Name", "Country", "Latitude", "Longitude" }
                ), 
                "Country", 
                {
                    "Country Code", 
                    "Country Name", 
                    "client.Forecast Group", 
                    "client.Region"
                }, 
                { "Country Code", "Country Name", "Forecast Group", "Region" }
            ), 
            "Forecast Group", 
            { "Forecast Group Name" }, 
            { "Forecast Group" }
        ), 
        "Region", 
        { "Region Name" }, 
        { "Region" }
    ),

Following this step, my query produces the following intermediate output (so that you can see what is going on, I’ve captured just a single record from the query):

Deleting columns not required

I don’t need all of these columns, so let’s delete those not required (including those Set columns identified earlier). Specifically, we don’t need the ClientID and CityID columns because we’ve now included the Client name and City directly in the dimension:

    // Remove columns not required in the final query
    RemoveColumns = 
    Table.RemoveColumns (
        ExpandColumns,
        { "ClientID", "CityID", "set.Set(SiteID)", "set.Set(SiteID) 2" }
    ),

Final steps

We’re almost there now. If you look at the record above, you can see that the Internal Site attribute is represented as either a 1 or 0. I want to convert this to a boolean type (true or false):

    // Convert binary columns to logical (boolean)
    ChangeType = 
    Table.TransformColumnTypes (
        RemoveColumns,
        { { "Internal Site", Logical.Type } }
    ),

The Site Postcode / ZIP attribute is blank, meaning that it actually contains an empty string rather than a null value, so I’ll tidy this up as well:

    // Replace any blank text with nulls
    ReplaceBlanks = 
    Table.ReplaceValue (
        ChangeType,
        "", null,
        Replacer.ReplaceValue,
        { "Site Postcode / ZIP" }
    ),

The final step is to convert the boolean value of Internal Site to a Yes/No attribute, which will be easier than a true/false value for business users to work with:

    // Convert Internal Site to a Yes/No column
    Result = 
    Table.RemoveColumns (
        Table.AddColumn (
            Table.RenameColumns (
                ReplaceBlanks,
                { { "Internal Site", "_Internal Site" } }
            ), 
            "Internal Site", 
            each if   [ _Internal Site ] 
                then "Yes" 
                else "No", 
            Text.Type
        ),
        { "_Internal Site" }
    )

Here’s the complete code for the query including all of the steps described above:

// Site (dimension query)

let
    Source = client_Site,

    // Expand all required columns currently represented as related values
    ExpandColumns = 
    Table.ExpandRecordColumn (
        Table.ExpandRecordColumn (
            Table.ExpandRecordColumn (
                Table.ExpandRecordColumn (
                    Table.ExpandRecordColumn (
                        Source, 
                        "client.Client", 
                        { "Client Name" }, 
                        { "Client" }
                    ), 
                    "client.City", 
                    { "City Name", "client.Country", "Latitude", "Longitude" },
                    { "City Name", "Country", "Latitude", "Longitude" }
                ), 
                "Country", 
                {
                    "Country Code", 
                    "Country Name", 
                    "client.Forecast Group", 
                    "client.Region"
                }, 
                { "Country Code", "Country Name", "Forecast Group", "Region" }
            ), 
            "Forecast Group", 
            { "Forecast Group Name" }, 
            { "Forecast Group" }
        ), 
        "Region", 
        { "Region Name" }, 
        { "Region" }
    ),

    // Remove columns not required in the final query
    RemoveColumns = 
    Table.RemoveColumns (
        ExpandColumns,
        { "ClientID", "CityID", "set.Set(SiteID)", "set.Set(SiteID) 2" }
    ),

    // Convert binary columns to logical (boolean)
    ChangeType = 
    Table.TransformColumnTypes (
        RemoveColumns,
        { { "Internal Site", Logical.Type } }
    ),

    // Replace any blank text with nulls
    ReplaceBlanks = 
    Table.ReplaceValue (
        ChangeType,
        "", null,
        Replacer.ReplaceValue,
        { "Site Postcode / ZIP" }
    ),

    // Convert Internal Site to a Yes/No column
    Result = 
    Table.RemoveColumns (
        Table.AddColumn (
            Table.RenameColumns (
                ReplaceBlanks,
                { { "Internal Site", "_Internal Site" } }
            ), 
            "Internal Site", 
            each if   [ _Internal Site ] 
                then "Yes" 
                else "No", 
            Text.Type
        ),
        { "_Internal Site" }
    )
in
    Result

And here’s the final record format for the dimension:

Progress summary

So far, I’ve imported the data for Revenue Type, Cost Type and Site, creating a dimension for each of them. In the next post, I’ll take a look at the Set dimension.

Leave a Reply