Examples of AHS employee carelessnessAHS Workers could care less about accurate data gathering

view

Step 1: Go to https://open.alberta.ca/opendata/deaths-cause-by-gender-and-age

Step 2: Download a file named deaths-by-gender-and-age00.csv by clicking the bottom download button or the included link.

 

 

 

 

 

 

 

 

 

 

Step 3: Open in  a spreadsheet program like Excel. I opened it using OpenOffice, which is open source, so it’s free.

The first thing I saw that every thing was laid out neatly and it was easy to spot the system.

Columns run from A to Z. Perfect. Very Organized.

year, cause, gender, 19 age categories spanning 5 years each with the exception of the 90+ category, which covers everyone who makes it there.

Then there’s people who they didn’t know how old they were when they died,

whether they died at home or not

and a totals column.

Top of AHS deaths data spreadsheet
The AHS data seems logically organized and easy to understand

Step 4: Looking to the rows, every year for every cause they’ve recorded 3 values for each cause of death.

  1. Male total
  2. female total
  3. Grandtotal

with each value recorded under the appropriate age column.

Step 5: But all is not as it seems. Note that the government had appended a category number to each cause name and separated it from the cause name with a space. Given that .csv in the filename stands for comma separated values, this means the creator of the file did not want the the ID# and the description of each cause to be separate.

Step 6: Find out why. This is why we downloaded the .csv version of the data so we can import it in a better database program for analysis. I imported the file into a mySql database that looks like this:

Plan for the design of a mySQL database
The .csv version of the data can be imported into mySQL

Step 7: Once it’s been imported into mySQL, you can start querying the data. First finding: There are 2290 “unique” category names. This can be reduced by just eliminating extra spaces, leaving us with 1903 unique causes of death.

Step 8: This is where you’ll realize that the cause ID#’s do not remain the same throughout the data.

Screenshot showing AHS constantly changes the cause ID#s for various causes of death
The cause ID#’s used by AHS change from year to year.

Step 8: Separate the cause ID# from its name. This still leaves us with well over 1000 unique causes of death. Now we can start fixing some of the carelessness of those responsible for creating this document.

Examples of AHS employee carelessness
AHS Workers could care less about accurate data gathering

This reduces the total # of causes of death to 439.

439 Unique causes of death revealed by fixing AHS staff errors
439 Unique causes of death revealed by fixing AHS staff errors

Step 9: Deal with naming inconsistencies across years. I started by finding out how many categories there were per year.

Database query showing AHS had 356 causes of death in 2001
Alberta Health Services (AHS) used 356 different Causes to categorize their mortality data in 2001.
Database query showing AHS had 358 causes of death in 2004
In 2002 AHS used 356 different Causes to categorize their mortality data in 2002. Same as in 2001.
Database query showing AHS had 358 causes of death in 2006
Database query showing AHS had 357 causes of death in 2003. This is one more than in the first 2 years of this dataset.
Database query showing AHS had 358 causes of death in 2004
Database query showing AHS had 358 causes of death in 2004, up 1 from the previous year and 2 from the first 2 years of the data table.

Database query showing AHS had 358 causes of death in 2005

Database query showing AHS had 358 causes of death in 2006
And so it was 358 for the next couple of years when suddenly:
Database query showing AHS had 346 causes of death in 2007
Alberta Health Services (AHS) used only 346 different Causes to categorize their mortality data in 2007. This means 12 of the old causes just got blended somehow…

Step 10: Discover only two different names being used for unknown( I think – which we’ll get to in step 11)

Before:

 

After: (Shows columns updated appropriately)

Final Version of blended category
Final Version of blended category

Step 11: Establish a public forum where any medically astute truth-seekers can take a look at somehow standardizing….click here