Exploratory Analysis of Unplanned Admission Data (NHS Wales)

Having recently taken a career break, I’ve been applying for work back in the NHS as an analyst. One of the roles required me to analyse unplanned admission data and give a 10-minute presentation at interview, a fairly standard process for data analyst roles. The basic premise of these tasks is to carry out exploratory analysis on a set of data you’ve likely never seen before, with a short brief about what to focus on. My task was to identify what was placing the most unplanned burden on the Welsh health service, broken down by geographical location, age group, and reason for admission.

In the presentation above, you can see each step of my analysis, which follows the basic tenets of good data viz and uses many of the techniques described in both Storytelling with Data by Cole Nussbaumer Knaflic and Data Visualisation by Andy Kirk. The presentation just about fit into the ten minutes I was allotted, and provided a jumping-off point for discussion with the interview panel.


Some of the questions I was asked as a result were:

In slide 2, why might the 70+ population have the largest admission numbers, setting aside any assumption about their potentially poorer health?

This was simply the largest age grouping, including everyone aged 70 and above, while the other groups each covered only a ten-year cohort.

In slide 7, what are some of the potential problems with using an average (mean) length of stay for each condition group?

Means can be skewed heavily by individual data points that are far outside the normal range. If I had the data at patient level, I might choose to highlight these potentially anomalous data points, or instead calculate a median, which is less likely to be skewed by extremely low or high lengths of stay. As an example of individual data points causing potential skew, in slide 10 the grouping for F70-F79 conditions had a very low number of total bed days but an incredibly high average length of stay, suggesting that only a few patients make up the cohort, but one or more of them ended up in hospital for a long time.
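To make the point about skew concrete, here is a minimal sketch in Python. The lengths of stay are made-up numbers for illustration only (I did not have patient-level data), chosen so that one very long stay sits among several short ones.

```python
from statistics import mean, median

# Hypothetical patient-level lengths of stay in days (NOT from the real dataset).
# Five short stays and one very long stay, mimicking a small, skewed cohort.
lengths_of_stay = [2, 3, 3, 4, 5, 180]

mean_los = mean(lengths_of_stay)      # pulled upwards by the single 180-day stay
median_los = median(lengths_of_stay)  # middle of the sorted values, barely affected

print(f"Mean length of stay:   {mean_los:.1f} days")
print(f"Median length of stay: {median_los:.1f} days")
```

With these invented values the mean comes out at roughly 33 days while the median is 3.5 days, so the mean describes none of the six patients particularly well.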

Where should we target our efforts, as a result of your analysis?

This question arose because I had not included a conclusion, which I would make sure to do in future so that it doesn't need to be asked. In this case, Aneurin Bevan University Local Health Board and Betsi Cadwaladr should be targeted in particular, as the localities with the greatest potential impact. Cardiff and Vale, and Cwm Taf, should have their high admission counts for the 0-9 age group investigated. Learning disabilities and autism within the community should be a focus, to bring down the number of highly complex cases presenting unplanned to hospital. Consideration should also be given to looking further into Hywel Dda health board, due to the high total bed days of its unplanned admissions.


If you’d like to take a look at my Excel workbook to see how I went about performing the analysis, you can download it here or click the link below.

I mostly used pivot tables and pivot charts to create the visualisations. These are ‘quick and dirty’ methods for exploratory analysis, and the work could be expanded upon by more rigorous statistical analysis using Python or R. In particular, modelling the data might be useful to determine a baseline prediction for different scenarios: no intervention vs. a variety of targeted interventions as identified in the exploratory analysis.
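As a rough idea of what the first step of that might look like, the sketch below reproduces a pivot-table style summary in plain Python. The rows, board names, and column layout are all hypothetical stand-ins, not the real workbook, and it assumes the data arrives already aggregated into admission counts and bed days.

```python
from collections import defaultdict

# Hypothetical aggregated rows: (health_board, age_group, admissions, bed_days).
# Names and numbers are invented for illustration only.
rows = [
    ("Board A", "0-9", 120, 240),
    ("Board A", "70+", 300, 2700),
    ("Board B", "0-9", 80, 120),
    ("Board B", "70+", 150, 1800),
    ("Board A", "70+", 100, 500),
]

# Sum admissions and bed days for each (health board, age group) cell,
# which is what a pivot table with two row fields and two value fields does.
pivot = defaultdict(lambda: {"admissions": 0, "bed_days": 0})
for board, age_group, admissions, bed_days in rows:
    cell = pivot[(board, age_group)]
    cell["admissions"] += admissions
    cell["bed_days"] += bed_days

for (board, age_group), cell in sorted(pivot.items()):
    # Mean length of stay from aggregates: total bed days / total admissions.
    avg_los = cell["bed_days"] / cell["admissions"]
    print(f"{board:8} {age_group:4} admissions={cell['admissions']:4} "
          f"bed_days={cell['bed_days']:5} mean_los={avg_los:.1f}")
```

In practice a library such as pandas would be the more usual choice for this; the standard-library version is used here only to keep the example dependency-free.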

