The aggregated data values

Post by asimd23 »

We ran into an issue when loading the data: it wasn’t always clear what a dash meant within the data files. Sometimes ‘–’ was used to indicate an impossible cell, e.g. married people aged 0 to 4, which should load as a null value. In other cases ‘–’ was used to indicate that there could be a value but for some reason one wasn’t supplied. Because we had told our system that the cell contents should be a number, the dashes meant we had to preprocess the data files before we could load them.
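As an illustration only, preprocessing of this kind might look like the sketch below. It is not the code used for the actual load. The function names and the set of dash characters are assumptions, and it treats every dash as null because the files themselves do not say which of the two meanings applies.

```python
from typing import List, Optional

# Assumption: placeholder cells contain a single hyphen, en dash, or em dash.
DASHES = {"-", "\u2013", "\u2014"}


def parse_cell(raw: str) -> Optional[int]:
    """Convert one raw cell to an integer, or None for a dash or empty cell."""
    text = raw.strip()
    if text == "" or text in DASHES:
        # The file does not distinguish "impossible" from "not supplied",
        # so both cases end up as null here.
        return None
    return int(text)


def parse_row(cells: List[str]) -> List[Optional[int]]:
    """Apply parse_cell to every cell in a row."""
    return [parse_cell(cell) for cell in cells]
```

Telling an impossible cell apart from a missing one would need extra information from outside the file, such as a list of category combinations known to be impossible.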

During this data aggregation step, we identified some further challenges with data from Scotland. The aggregated data values weren’t matching the figures for Council Areas or the country total. On investigation, this was because Scotland released their data at Super Output Area equivalent using 2001 definitions. Due to changes in the makeup of the Council Areas between 2001 and 2011, some of the data was being either undercounted or overcounted. Luckily, around this time in the process, Scotland rereleased their data using the newly created 2011 Super Output Areas for Scotland. We were then able to swap out the 2001 Super Output Area data for the 2011 Super Output Area data and rerun the data aggregation exercise, which greatly reduced the inconsistency in the data. This then highlighted a number of cells where the data had been corrected, so we also needed to swap out the Scotland totals.
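A consistency check of the kind described can be sketched as follows. This is a hypothetical example, not the process actually used: the function names, the lookup structure, and the area codes in the usage note are all made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Tuple


def aggregate(values: Dict[str, int], lookup: Dict[str, str]) -> Dict[str, int]:
    """Sum small-area values into their parent areas.

    values maps a small-area code to its count; lookup maps the same
    small-area code to the parent area (e.g. a Council Area) it belongs to.
    """
    totals: Dict[str, int] = defaultdict(int)
    for area, count in values.items():
        totals[lookup[area]] += count
    return dict(totals)


def find_mismatches(
    aggregated: Dict[str, int], published: Dict[str, int]
) -> Dict[str, Tuple[int, int]]:
    """Return {parent: (aggregated, published)} for every parent that differs."""
    mismatches = {}
    for parent, expected in published.items():
        actual = aggregated.get(parent, 0)
        if actual != expected:
            mismatches[parent] = (actual, expected)
    return mismatches
```

With made-up inputs, `aggregate({"A1": 10, "A2": 5, "B1": 7}, {"A1": "A", "A2": "A", "B1": "B"})` gives `{"A": 15, "B": 7}`, and comparing that against published totals of `{"A": 15, "B": 9}` flags only `"B"`. A lookup built on outdated boundaries would show up in the same way, as parents whose aggregated value is above or below the published figure.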