Last week I received a #blimage challenge from @debsnet aka the édu flâneuse. When I came to the photo she had posted to inspire her challengees, it only took me a moment to link those overflowing hands with the data we researchers love to gather.
Data is a Latin plural word meaning ‘things that are given’, though it is used in English as plural or singular (e.g. ‘a piece of data’). In English it refers to information of various kinds: numbers, words, facts, opinions, pictures, tweets – the list is long. Social scientists can amuse themselves for hours by arguing about what constitutes data. There is a popular saying that ‘anecdote is not data’ although, when a qualitative researcher collects anecdotes from interview and focus group participants, data is exactly what they become.
Different types of researchers have different ideas about what constitutes data. To an anthropologist, an interview transcript may be interesting only for its textual content, while for a conversation analyst, the length of the pauses may be a fascinating aspect of that data. Some researchers treat focus group data just like interview data, while others see the interactions between people in the focus group as an enriching layer of extra data. For some people, data is collected; for others, it is constructed. I use ‘gathered’ when I want to encompass both perspectives.
Then there is ‘big data’: data generated by national governments, or by technology, which is so copious that it requires whole new methods of analysis and new words to describe its size, such as exabyte, zettabyte, or snakebyte (I might have made up one of those). Big data challenges the etymological suggestion that data is ‘given’ because big data is often a by-product of other activity, such as using social media or loyalty cards. This is an ethical minefield. For example, people may not realise that their data is of value to the companies running the facilities they use, and it can be difficult to track individuals down to seek consent for their data to be used in research.
You can do all sorts of things with data. For example, you can prepare data, code data, analyse data, synthesise data, visualise data, present data – and, if you’re like me, you can love data. In fact, I adore data! A new dataset to explore is so exciting because I never know what I might discover. I guess it’s the same feeling an archaeologist gets when they’re starting a dig, or an antiques dealer opening a box from a house clearance. There might be treasure in here! And even if there isn’t, even if there are only mundane things, I will still have seen something I hadn’t seen before, and maybe learned something new, or at least increased my experience.
You can also abuse and misuse data, by picking out the parts that support the argument you want to make, rather than preparing, coding, and analysing data as rigorously and honestly as possible. We are all susceptible to biases such as confirmation bias and hindsight bias, and there is only so much any one of us can do to counteract these. This is part of the reason for the scholarly peer review process, where others can scrutinise your work to check for bias. It is also why researchers encourage each other to track the links in our writing from research design, through data collection and analysis, to findings and conclusions, so that our processes and influences are clear to readers and they can make up their own minds about any biases they may perceive in our work.
It’s not only individual researchers, though, who abuse and misuse data. Research commissioners in every sector regularly bury data-based findings that don’t align with their political or organisational aims. And the media is notorious for putting spin on such findings. This has led to the establishment of independent fact-checking organisations such as FactCheck.org in the US and Full Fact in the UK.
It is easy to develop conspiracy theories about the ways in which governments, corporations, and the media use and misuse data. It is harder to do the tough research work necessary to counteract this, as far as we can, by producing firm findings, based on enough good-quality data, and presenting those findings in clear and understandable ways. To do that, we have to gather our data carefully, with a solid rationale for why we gathered it in the ways we did, so that we can be confident about the status and limitations of our data and about the findings we draw from its analysis. This is not easy – but it is possible, and it is our responsibility as researchers to do this work to the best of our abilities.
Now, a #blimage challenge for Naomi Barnes: I look forward to seeing what she makes from this picture. And if anyone else would like to use it for inspiration: help yourself!