Data fascinates me. Ironically, much as I enjoyed mathematics in high school, I did not enjoy Statistics and Probability – strands that haunt me particularly in my professional life. Another ironic thing is that while I enjoyed working with databases in my previous career in IT (I love SQL), I’ve not really ‘taught’ databases in high school computing in a way that genuinely shares this joy.
This post is not about Stats, Probability, or databases but rather something more fundamental.
This post is about data and, in particular, some novel ideas I’ve heard/read lately about what data is…and I’m fascinated even more!
3 ways to spot a bad statistic
In this TED talk, Mona Chalabi is charmingly entertaining as she unpacks the problem with polls and averages, as well as people’s perception of data. The point about averages is not new to me, but I love her (new to me) rationale for her hand-drawn visualisations, which strengthens her strategies for spotting dodgy data. I teach these data concepts but I love the (new to me) way of framing them.
Can you see uncertainty?
There’s usually a big emphasis on data accuracy and precision (oh, the beauty and irony of floating point representation in computers). Chalabi points out the problem with this, especially when dealing with human behavioural data (ha! Recall my action-research on well-being). And truly, sometimes we get hung up on quantification and numbers, losing sight of the real story the data is trying to tell.
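For anyone who has not met the floating point irony first-hand, here is a minimal Python sketch. It is only an illustration of how computers store decimal numbers, not something from Chalabi's talk, and the variable names are made up for the example.

```python
import math
from decimal import Decimal

# 0.1 and 0.2 cannot be stored exactly in binary floating point,
# so their sum is not exactly the float closest to 0.3.
total = 0.1 + 0.2
exactly_equal = (total == 0.3)

# Comparing within a tolerance is the usual way around this.
close_enough = math.isclose(total, 0.3)

# Decimal(0.1) reveals the value the computer actually stored for 0.1.
stored_value = Decimal(0.1)

print(total)
print(exactly_equal, close_enough)
print(stored_value)
```

The numbers look precise to many decimal places, yet the "precision" on screen is partly an artefact of the representation.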
Can you see yourself in the data?
The second point is interesting because it’s not just about whether data is personally relevant; rather, it’s to do with granularity and visualisation techniques, particularly when only aggregated data is shown. People cannot be summed into one data point, so it makes sense to look for other data points, perspectives or axes, e.g. over time or split by gender.
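To make the granularity point concrete, here is a small Python sketch. The scores, the group labels and the `average_by` helper are all invented for illustration; the idea is simply that one overall average can hide very different stories once the same data is split along another axis.

```python
from statistics import mean

# Hypothetical well-being scores (1-10), made up for this example.
responses = [
    {"group": "A", "term": 1, "score": 8},
    {"group": "A", "term": 2, "score": 9},
    {"group": "B", "term": 1, "score": 5},
    {"group": "B", "term": 2, "score": 2},
]

def average_by(rows, key):
    """Return the mean score for each distinct value of `key`."""
    buckets = {}
    for row in rows:
        buckets.setdefault(row[key], []).append(row["score"])
    return {value: mean(scores) for value, scores in buckets.items()}

# One aggregated data point for everyone...
overall = mean(row["score"] for row in responses)

# ...versus the same data seen along two other axes.
by_group = average_by(responses, "group")
by_term = average_by(responses, "term")

print(overall)   # 6
print(by_group)  # {'A': 8.5, 'B': 3.5}
print(by_term)   # {1: 6.5, 2: 5.5}
```

The overall average of 6 looks unremarkable, but nobody in this made-up data actually scored near 6: group A is doing well and improving, while group B is struggling and getting worse.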
How was the data collected?
If there’s anything I learnt from working with decision-support systems, it’s this: be careful what questions you ask, as the answers you get may not be what you’re after in the first place…and the good ol’ “rubbish in, rubbish out”. This point is certainly about the source and process of data collection but, more importantly, it is about data integrity.
Lupi urges us to ‘Embrace complexity’. Here I am, schooled in the idea of ‘keep it simple so the audience gets it’, but:
We can write rich and dense stories with data. We can educate the reader’s eye to become familiar with visual languages that convey the true depth of complex stories.
And here’s me scratching my head over why school reports show a bunch of numbers and letters, maybe some written comments, which quite often fail to tell the ‘true depth of complex stories’. Context is usually missing because (I know) it’s in the ‘too hard basket’.
So, here is the challenge I’m setting for myself:
Data, if properly contextualized, can be an incredibly powerful tool to write more meaningful and intimate narratives.
I’ve got such a long, arduous and exciting learning journey ahead of me!