How to… Consolidate Quarterly or Monthly Data Using Power Query

Do you get regular data files that you need to consolidate/compile together for a report or dashboard? This method will help you keep your reporting files lightweight and portable, while making updating their data as easy as pressing Refresh. The basics can be done in five easy steps. For our example, we’ll be using the quarterly NHS CYPED (Children and Young People Eating Disorders) data, specifically the ‘Provider and CCG’ files for 2021/22, if you’d like to follow along at home.

https://www.england.nhs.uk/statistics/statistical-work-areas/cyped-waiting-times/

Step 1.

Collect all your raw data files, separated by month/quarter/day, into a single folder.

Example: Three Quarters of Raw Data Files Stored in the Same Folder

Step 2.

In Excel, go to the ribbon (the taskbar at the top of the application), and enter the ‘Data’ tab. From there, select Get Data -> From File -> From Folder, and then navigate to the folder you’ve stored your data in from step 1.

Example: Location of the Get Data From Folder Process.

Step 3.

In the dialogue boxes that open, select ‘Combine & Transform Data’.

Example: The Power Query Wizard for Extracting Data From a Folder

Step 4.

Select a sample file to transform (the default option is the first file, which should be fine in most cases). Then, select the sheet that contains the raw data you need. Hit OK.

Example: Selecting the file and sheet we want to extract using Power Query.

Step 5.

In the now open Power Query editor, select ‘Transform Sample File’ from the left pane. You can then transform the data as required (if no transformation is required, just hit ‘Close & Load’).

Example: The Power Query Editor
Example: The Transform Sample File Query

Continuing the Example

We will now show how to transform data step-by step using NHS CYPED (Children and Young People Eating Disorder) data.

https://www.england.nhs.uk/statistics/statistical-work-areas/cyped-waiting-times/

In this example, we will need to transform the data in two steps. Once for the Urgent case data, and another for the Routine case data.

First, we will work on extracting the Urgent Case Data.

Step 1. Removing Rows

In the Transformation Sample File, remove the top 14 rows of data, as this contains information regarding the data set, not the data itself. To do this, use the button in the ribbon called ‘Remove Rows’ -> ‘Remove Top Rows’.

Example: Removing the Top Rows from the Transformation Sample File.

Step 2. Removing Unneeded Columns

Now select columns 2-7. These columns contain the required raw data for Urgent CYPED cases. Right click and select ‘Remove Other Columns’.

NB: We are not interested in column 8 or 9 of this data as they are summary statistics that are derived from columns 4-7. They can easily be calculated after we are done.

Example: Removing Unneeded Columns.

Step 3. Making Our Headers

Select ‘Use First Row as Headers’ from the ribbon to make the first row of data into our new table headings.

Example: Selecting the ‘Use First Row as Headers’ button from the ribbon.

Step 4. Unpivoting Columns

Select the ‘>0-1 Week’ to ’12 plus’ columns. Right click and select ‘Unpivot Columns’. This will transform the data into an easily manipulated and pivotable data format known as ‘Tidy Data’.

NB: We do this as these columns describe different categories of the same independent variable, ‘Time to Start Treatment’. Ideally, all columns will either be independent variables (what you are changing; e.g. categories, dates, names), or dependent variables (what you are measuring; e.g. counts, temperature, and other data).

Example: Unpivoting Columns

Rename the new columns to better describe the variables. Do this by double clicking in the heading for each column.

‘Attribute’ = ‘Time to Start Treatment’

‘Value’ = ‘Number of Patients’

Example: The result of unpivoting columns
Example: New columns given descriptive names.

Step 5. Remove Redundant and Erroneous steps.

Now, navigate to the ‘CYPED Data Files’ query in the left pane.

Example: Selecting the CYPED Data Files Query from the ‘Queries’ pane.

Remove the ‘Changed Type’ step by clicking on the ‘X’ alongside it within the ‘Applied Steps’ pane on the right of the Power Query editor.

This will have been causing problems as it was trying to change types of columns that we have since removed.

Example: Finding the ‘X’ to remove the ‘Changed Type’ step

Step 6. Adding Date Column

Select the ‘Removed Other Columns1’ step.

Example: The data held in the ‘Removed Other Columns1’ step.

Here, we can add another column to describe the quarter of data each file refers to.

Select the ‘Source.Name’ column, then in the ribbon, go to the ‘Add Column’ tab and select ‘Column from Examples’ -> ‘From Selection’.

Example: Where to find the From Selection add column button.

You will then be met with an ‘Insert Step’ dialogue box. Insert the step. This shouldn’t break anything as this new column will not be interacted with by any future step).

Example: Insert Step Dialogue Box

Fill in the ‘Custom’ column on the right of the main pane, denoting the date of the data for each file.

Example: Entering the date for each data file in the ‘Custom’ column.

You can then rename the ‘Custom’ column to ‘Quarter’ by double clicking its header.

Example: The newly renamed ‘Quarter’ column.

Reorder the columns so that the ‘Quarter’ column is on the left by dragging and dropping its header.

You will be asked to insert a step again, click insert.

Remove the ‘Source.Name’ column as it’s no longer needed. Right click its heading and select ‘Remove’. Its descriptive property has been inherited by the new ‘Quarter’ column. Insert the step when prompted.

Example: Removing the ‘Source.Name’ column after moving the ‘Quarter’ column left.

Step 7. Viewing the Finished Query, and Renaming the Query

Example: The newly transformed, collated, Urgent case data.

Clicking on the final step of the ‘Applied Steps’ pane will show us the completed, transformed data set.

In the ‘Queries’ pane on the left, rename the query to better reflect the data it contains, e.g. ‘Urgent CYPED Data’.

Example: Renaming the Query.

Step 8. Loading the Data into Excel

Go to ‘Close & Load’ in the ribbon’s home tab. This will load the data into Excel.

Example: The ‘Close & Load’ option in the ribbon.
Example: The loaded data within the Excel spreadsheet.

Step 9. Extracting the Routine CYPED data.

Repeat the above steps, but during step 2 (removing unneeded columns), select columns 2-3 and columns 11-14 (containing our routine data).

Follow the remaining steps, but on step 7 (viewing the final query and renaming it), rename the query to ‘Routine CYPED Data’ instead of ‘Urgent CYPED Data’.

Example: Completed Routine CYPED Data query

Step 10. Merging both queries

10a. Add a custom column to both queries at the ‘Removed Columns’ step.

Example: Where to Find the ‘Custom Column’ button.

Name the column ‘Patient Type’ and assign it a string by entering:

For the routine data: = "Routine"

For the urgent data: = "Urgent"

Example: New Column ‘Patient Type’ with string ‘Routine’ for the routine data query.

10b. Reorder the column by dragging and dropping it to the left.

Example: Reordered columns

10c. Once 10a and 10b completed for both queries, go to ‘Append Queries’ -> ‘Append Queries as New’ in the home tab of the ribbon. Select the two tables and hit OK.

Rename the new query as ‘All CYPED Data’.

Example: Renaming the new ‘Append1’ query.

Step 11. Load data into Excel

Example: All the data is available in Excel, each table on a different sheet.

If any new data is made available, you will need to add it to your data folder and hit ‘Refresh All’ to load the data into the consolidated data file.

Example: Where to locate the ‘Refresh All’ button.
Advertisement