If you’re going through all the trouble of collecting data, then hopefully you’re making good use of it.
Typical uses of data include gaining a deeper understanding of an organisation or the world in general, deriving insights that help you make better decisions, building new products and services, and improving existing offerings. To do this, data might be analysed, processed, modified, or even shared with people outside your organisation.
Data usage covers a wide range of activities and raises numerous ethical issues, particularly around transparency, fairness and bias. These issues can be especially problematic when algorithms are used to inform or automate decision making.
The Data Protection Act 2018 says that people responsible for using data must make sure it is used fairly, lawfully and transparently. It must also be used for specified purposes only, and in a way that is adequate, relevant and limited to what is necessary.
Here are our top tips for things to think about to make sure you’re using data in an ethical way.
Before you begin using, analysing and interrogating your data, have you considered the following?
1) Being open: are you transparent about the use of your data?
Transparency is a fundamental principle in GDPR, and with the increasing use of algorithms in data-driven automated decision making, it has become even more important to explain how an algorithm arrived at its outcomes.
You might have heard many algorithms described as black boxes, meaning it's impossible to know how they really arrive at specific outputs. However, methods that help explain their inner workings (so-called explainable AI) are increasingly being used.
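To make that concrete, here's a minimal sketch of one explainability technique, feature attribution, using the open-source shap library. The model, loan-style data and feature names are all illustrative assumptions rather than a recommended setup:

```python
# A minimal explainability sketch using the open-source `shap` library.
# The loan-style dataset and feature names are purely illustrative.
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier

# Hypothetical applicants and outcomes (1 = approved).
X = pd.DataFrame({
    "income": [28_000, 54_000, 41_000, 73_000, 19_000, 62_000],
    "age":    [23, 45, 31, 52, 20, 39],
    "debt":   [12_000, 3_000, 8_000, 1_000, 15_000, 2_000],
})
y = [0, 1, 1, 1, 0, 1]

model = RandomForestClassifier(random_state=0).fit(X, y)

# TreeExplainer attributes each prediction to the input features,
# so you can point to which features drove a specific decision.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
print(shap_values)  # per-feature contributions for each applicant
```

Attributions like these don't make a model fair on their own, but they give you something concrete to share when someone asks why a decision went the way it did.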
As well as stating an intended purpose up front, organisations should also be ready to clearly communicate how the data they collect or acquire is being used. It's really important to be able to explain how data is driving a decision-making process, or how a particular decision was made from it.
Ask yourself:
- How open are you about data collection, processing and usage?
- Are you able to easily articulate and explain how you are using data and for what specific purpose?
- Are you transparent about how you protect and secure the data you collect from consumers or service users?
- If you’re using algorithms and AI, are you using methods to help explain their inner workings and how they arrived at an outcome?
2) Being equitable: is your data use fair?
In order to be considered fair, the collection and use of data should be proportionate to the challenge or issue it sets out to resolve. Sometimes it may be difficult to collect data from all groups, but could this lead to unfair outcomes?
The data should also be of high enough quality that the insights it generates are fair, and any impact on a particular group can be justified rather than being detrimental to them. It should be complete enough to drive fair outcomes when analysed or processed.
Ask yourself:
- Are you offering customers value in exchange for using their data? For example, is it being used to improve a product or service they use?
- Should you be using personal data at all? Is it fair to do so?
- Are you collecting data in a fair way so that all groups are represented? Are there any gaps?
3) Being honest: have you acknowledged the existence of bias in your data?
Linked to fairness in data usage is bias. While we acknowledge that humans are often biased and naturally prejudiced, it’s a common misconception that machines make cold and objective decisions. The reality is that they are as susceptible to bias as any human.
When you think about it, this makes sense, as whether people are analysing data manually or using algorithms that are created by humans, processing data can create, reinforce and perpetuate real-world biases. Humans are involved at every stage: generating data, collecting data, making key modelling decisions, and observing and using the outputs. For that reason, human biases are present at every stage too.
When bias is present in your data pipeline, whether through algorithmic bias, data bias, model bias or human bias, it can lead to discrimination. It's therefore extremely important to be aware of biases and to take steps to mitigate them wherever possible.
Ask yourself:
- Can I identify the points at which biases may have crept into the data, the outputs of analytics and AI, or the human processes using insights to make decisions?
- Are there steps I can take to reduce this bias?
- What level of bias is tolerable? While completely eliminating bias is impossible, how can I strive for the least amount of bias in the system?
- What tools and assistance are available to help minimise bias? Can I explain the decisions my AI models reach? (See the sketch below.)
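One concrete starting point is to break your model's metrics down by group and quantify the gaps. Here's a minimal sketch using the open-source fairlearn library; the labels, predictions and group attribute are made up purely for illustration:

```python
# A minimal bias-measurement sketch using the open-source `fairlearn`
# library. Labels, predictions and groups are illustrative only.
from fairlearn.metrics import MetricFrame, selection_rate
from sklearn.metrics import accuracy_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
group  = ["A", "A", "A", "A", "B", "B", "B", "B"]  # e.g. a demographic attribute

# Break headline metrics down by group: large gaps between groups
# are a signal the model may be treating them differently.
mf = MetricFrame(
    metrics={"accuracy": accuracy_score, "selection_rate": selection_rate},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=group,
)
print(mf.by_group)      # per-group accuracy and selection rate
print(mf.difference())  # largest gap between groups for each metric
```

A large gap in selection rate between groups doesn't prove discrimination by itself, but it tells you exactly where to look harder.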
How predictive policing is entrenching bias within the justice system
There has recently been a growth in the use of predictive analytics within the legal system and law enforcement. Increasingly, AI and machine learning are steering decisions around which neighbourhoods or areas police should focus their attention on (also known as predictive policing).
However, as a report by Durham Constabulary highlights, these algorithms can end up discriminating against certain groups, particularly those from poorer socioeconomic backgrounds. By focusing on particular groups, these AI systems perpetuate biases as target groups become increasingly profiled, gathering more and more data that then reinforces future predictions.
Our tips to reduce bias in your data
- Review the data, algorithms and human processes involved in decision making to help identify and assess potential sources of bias. This can go hand in hand with implementing ways to measure and mitigate potential biases.
- Consider representation both pre- and post-processing. Are certain groups underrepresented in the data used to train an algorithm? (See the sketch after this list.)
- Are there any inconsistencies or poor predictions for certain demographics or groups in the insights generated?
- Consult the UK Government's Data Ethics Framework, which covers using AI responsibly and defining appropriate governance structures.
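On the representation point, a simple pre-processing check is to compare group proportions in your training data against a reference distribution for the population you serve. A minimal pandas sketch, in which the age_band column and reference proportions are illustrative assumptions:

```python
# A minimal representation check using pandas. The column name and
# the census-style reference proportions are illustrative assumptions.
import pandas as pd

train = pd.DataFrame(
    {"age_band": ["18-34", "18-34", "35-54", "18-34", "55+", "35-54"]}
)

# Share of each group actually present in the training data...
observed = train["age_band"].value_counts(normalize=True)

# ...versus the share in the population you want to serve.
expected = pd.Series({"18-34": 0.30, "35-54": 0.40, "55+": 0.30})

# Negative values flag under-represented groups; groups missing
# from the data entirely show up as fully under-represented.
gap = observed.reindex(expected.index, fill_value=0) - expected
print(gap.sort_values())
```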
Our next article will look at data sharing — how to share data in a way that’s fair, safe and transparent.