After seeing the work IF did with the London School of Economics and Political Science (LSE) on Understanding Automated Decisions, Matt Jones at Google AI asked us to look at explainability in personalised machine learning, particularly in the context of federated machine learning—an intriguing new technology that has recently been open-sourced by Google. Together with Harry, I spent a week on this with Alison Powell and her team from LSE.

The LSE team shared an annotated reading list about the different stages of machine learning—excerpts from this are at the bottom of this post. Harry and I then ran a design sprint for a week, bringing in the rest of the IF team to share their feedback, expertise and ideas. Since then, we’ve been sharing what we prototyped with some of the relevant Google teams, who have given us really helpful perspectives (thank you!).

At IF, we want technology to be in service to people. We’re already seeing debates about filter bubbles and personalised information creating or exacerbating socio-economic divisions. Add machine learning to that equation and both the potential benefits and risks of technology are higher.

If devices present a personalised view on the world, people should be able to understand how and when personalisation happened. But how do you explain that in a way that people can actually digest, especially when it comes to complex scenarios like federated learning? And can you take the leap from explaining what’s happening at certain points, to creating interfaces where people can stop or influence what’s happening? Those are some of the questions we prototyped in this project.

Making machines that learn worth trusting

This project looks at how explainability could work in a user interface. Unlike our data permissions catalogue, it’s not meant to be a set of patterns that product teams can use directly. What we arrived at is more abstract than that.

Being able to understand, inspect and control what the machine learning model is doing is critical for developing understanding and trust. It’s important to choose meaningful moments for explanations. Too much information at once is overwhelming and impossible to take in.

How does federated learning work?

A lot of this work is relevant to machine learning more generally, but we wanted to include new approaches, such as federated learning, in this project. It’s a relatively recent technology that promises to protect people’s privacy while keeping many of the benefits of centralised machine learning. It’s also complex enough that explaining it in a way that people intuitively understand is a really interesting challenge.

In federated learning, models are trained collaboratively. Unlike centralised machine learning, federated learning is based on the relationship between a global model and what can be learned on local devices. The data stays on those devices, which helps protect users’ privacy.

Simplified diagram comparing centralised machine learning and federated machine learning (María Izquierdo/IF: CC-BY)

An initial global model is distributed over several devices. Based on how individuals interact with their devices, the global model is improved upon locally on the device:

  • The device shares these improvements with the provider of the global model: Google, for example.
  • These local improvements travel across a secure channel from individual devices to the provider.
  • They are securely aggregated with improvements from other devices.
  • The aggregated improvements (from all the devices included in the collaborative training system) are merged with the original global model.

This creates a new global model, which can then be iteratively improved again, or pushed out for use to individual devices. For a more detailed explanation, this paper outlines exactly how Google implements federated learning at scale, and this blog post looks at the introduction of federated learning into a software library called TensorFlow.
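
For readers who find code easier to follow than diagrams, the round described above can be sketched in a few lines of Python. This is a toy illustration under our own assumptions, not Google's implementation: the model is a straight line (y = w*x + b), the aggregation is a plain average with no security, and every name (local_improvement, aggregate, federated_round) is hypothetical.

```python
from typing import List, Tuple

Weights = List[float]          # [slope, intercept] of a toy model y = w*x + b
Example = Tuple[float, float]  # an (x, y) pair that never leaves the device


def local_improvement(global_weights: Weights, examples: List[Example],
                      learning_rate: float = 0.1) -> Weights:
    """Train on one device and return only the change to the weights."""
    w, b = global_weights
    for x, y in examples:
        error = (w * x + b) - y
        w -= learning_rate * error * x
        b -= learning_rate * error
    return [w - global_weights[0], b - global_weights[1]]


def aggregate(improvements: List[Weights]) -> Weights:
    """Average the improvements from all devices (plain average, no security)."""
    count = len(improvements)
    return [sum(values) / count for values in zip(*improvements)]


def federated_round(global_weights: Weights,
                    devices: List[List[Example]]) -> Weights:
    """One round: local training, aggregation, merge into the global model."""
    improvements = [local_improvement(global_weights, data) for data in devices]
    averaged = aggregate(improvements)
    return [g + a for g, a in zip(global_weights, averaged)]


# Three devices, each holding a few private points from the line y = 2x + 1.
devices = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (1.0, 3.0)],
    [(-1.0, -1.0), (0.0, 1.0)],
]
model = [0.0, 0.0]
for _ in range(200):
    model = federated_round(model, devices)
print(model)  # approaches [2.0, 1.0], although no device shared its raw points
```

The point to notice is what crosses the boundary of local_improvement: the provider only ever receives weight changes, never the examples themselves.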

Federated learning can protect people’s privacy

Because the input data stays on individual devices, federated learning is better at protecting people’s privacy than centralised machine learning. And due to the collaborative training process, the global model can be created based on a much larger dataset than you’d find on any one person’s phone.

At the moment, federated machine learning is used for mobile keyboard predictions. Here, the details of what a person has typed stay on the device, and aren’t shared with the cloud-based machine learning provider. The provider (like Google) can see securely aggregated summaries of what’s been typed and corrected, across many devices. But they can’t see the contents of what you’ve typed. This protects individual people’s privacy, while improving predictions for everyone. This approach is also compatible with additional personalised learning that occurs on device.
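
One way to picture "securely aggregated" is pairwise masking: each pair of devices agrees on a random number, one adds it to its update and the other subtracts it. Each masked update looks like noise on its own, but the masks cancel when everything is summed. The sketch below is a heavily simplified illustration of that idea only. It assumes updates have already been converted to whole numbers, it uses a seeded random generator as a stand-in for secrets that pairs of devices would really agree on privately, and it ignores devices dropping out. The function names are hypothetical.

```python
import random
from typing import Dict, List

MODULUS = 2 ** 32  # all arithmetic wraps around at this value


def pairwise_masks(device_ids: List[str], length: int,
                   seed: int = 0) -> Dict[str, List[int]]:
    """Give every device a mask so that all masks sum to zero (mod MODULUS)."""
    rng = random.Random(seed)  # stand-in for secrets shared between each pair
    masks = {device: [0] * length for device in device_ids}
    for i, first in enumerate(device_ids):
        for second in device_ids[i + 1:]:
            shared = [rng.randrange(MODULUS) for _ in range(length)]
            for k in range(length):
                masks[first][k] = (masks[first][k] + shared[k]) % MODULUS
                masks[second][k] = (masks[second][k] - shared[k]) % MODULUS
    return masks


def mask_update(update: List[int], mask: List[int]) -> List[int]:
    """What a device actually sends: its update hidden behind its mask."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate(masked_updates: List[List[int]]) -> List[int]:
    """What the provider computes: the sum, in which the masks cancel out."""
    return [sum(values) % MODULUS for values in zip(*masked_updates)]


updates = {"phone_a": [3, 1], "phone_b": [5, 0], "phone_c": [2, 4]}
masks = pairwise_masks(list(updates), length=2)
masked = {d: mask_update(updates[d], masks[d]) for d in updates}
total = aggregate(list(masked.values()))
print(total)  # [10, 5], the same as summing the raw updates
```

The provider sees only the masked values and the total, which is why it can learn from many devices without reading any single device's contribution.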

Explaining personalised machine learning

Explaining all of this in words is complex. So we looked at how you might do this visually instead. We looked at four different moments in personalised machine learning systems (federated or otherwise) where there are opportunities to show people how machines learn. With some of these, we went a step further, suggesting ways that people can change how the models use data about them.

1. Comparing to understand

In this example, people who have their devices close together can compare what local models have learned by using data about them. We experience software through interfaces. How do you explain why the interfaces on one person’s phone are different to those on someone else’s, even though they’ve gone to the same page?

People should be able to see the differences and similarities between what the models on their devices are doing. Here, people can see the difference between what they see on their device and what someone else sees. And they might transfer comparable information between devices.

Two people compare local models to see what their different models have learnt (IF: CC-BY)

2. Retracing steps

In this moment, people can identify why and when a model changed by retracing the steps in a local model’s training log. We also suggested an interaction for restoring the local model to a previous version. This sort of interaction gives people more control over what their devices do, and means they can correct anything that isn’t accurate.

A person retraces the steps in a local model’s training log and can restore it to a previous version (IF: CC-BY)
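
Behind an interface like this, the device would need to keep a history of the local model. A minimal sketch of such a training log, assuming the model is just a short list of numbers and that every change comes with a human-readable reason, could look like the following. The class and method names are hypothetical and not taken from any existing library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LogEntry:
    version: int
    reason: str            # a plain-language note on why the model changed
    weights: List[float]   # snapshot of the model after the change


@dataclass
class LocalModel:
    weights: List[float]
    log: List[LogEntry] = field(default_factory=list)

    def __post_init__(self) -> None:
        self._record("initial model from provider")

    def _record(self, reason: str) -> None:
        self.log.append(LogEntry(len(self.log), reason, list(self.weights)))

    def learn(self, change: List[float], reason: str) -> None:
        """Apply a local change and note when and why it happened."""
        self.weights = [w + c for w, c in zip(self.weights, change)]
        self._record(reason)

    def restore(self, version: int) -> None:
        """Go back to an earlier snapshot; the restore itself is logged too."""
        self.weights = list(self.log[version].weights)
        self._record(f"restored to version {version}")


model = LocalModel([0.0, 0.0])
model.learn([1.0, 0.5], "learned from messages typed on Monday")
model.learn([-2.0, 0.0], "learned from a borrowed phone session")
model.restore(1)  # undo the change that did not reflect the owner
for entry in model.log:
    print(entry.version, entry.reason, entry.weights)
```

Logging the restore as a new entry, rather than deleting history, keeps the trail complete, so a person can still see that a change happened and was later undone.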

3. The freedom to opt out

At IF, we argue that one of the characteristics of a trusted system is that you can get out of it, gracefully. We should design for moments where people can stop the learning on their device from being used to improve a global model, without having to stop using the global model on their device. Like using Wikipedia, without having to contribute to it.

Yes, we acknowledge that in order to train large-scale deep learning networks using federated learning, especially without bias, data from a substantial fraction of the users of the model is needed. But we believe that people should also be able to stop using a global model that doesn’t work for them. They need to be able to see how different global models perform on their device, too.

With this in mind, we suggested ways people could try out different global models on their device and reject them if they don’t like them. We also investigated approaches for stopping data from being fed into the global model.

A person tries a global model and is able to stop using it if they don’t like how it works (IF: CC-BY)

4. Setting boundaries, together

Services today often leave individuals to make decisions about privacy. But people often don’t have the knowledge or resources to make meaningful and informed decisions. A lot of data is relational, drawing on information about different people. Shouldn’t how we manage data reflect that? Also, the power imbalance between technology companies and individuals is huge. That’s where designing for different kinds of relationships, such as between machines, or between a group of people and machines, can be a powerful approach.

One way to do that is to let third parties manage the boundaries of what models can learn. They wouldn’t be able to see the underlying data, but could see when, and how, a decision about that data has been made.

We imagine individuals would grant third parties the authority to act on their behalf, as part of a collective.

To achieve this, a model’s boundaries need to be open and interpretable for third parties acting for the collective. That’s why this moment shows a third party acting on behalf of many people, defining the limits of what different local and global models can learn.

A third party acts on behalf of many to set limits around what local and global models can learn (IF: CC-BY)

This is just the beginning

Explaining automated decisions that are taken as a result of personalised machine learning is an important first step in making sure they’re in service to people. That needs to happen in ways that people can both understand and care about.

We think the next step would be testing these “moments” across a wider range of user groups, to see how intuitive these visual explanations are. So much of explainability is about context: when is it helpful for people to know what’s going on with the underlying technology, and when is it just in the way?

This research raised lots of interesting questions around rights and privacy, like what happens when models cannot be explained, or how to explain personalisation on shared devices. Federated learning is intriguing in this context because it offers the usability of global machine learning while protecting individuals’ privacy. We’re keen to see more use cases and, critically, more user research now that the technology has been made open source and is therefore more widely available.


Thanks to Matt Jones, Rob Marchant, Alex Ingerman and Brendan McMahan at Google AI, and IF’s Sarah Gold, Ella Fitzsimmons and Felix Fischer for their contributions to this post. Thanks also to Alison Powell and her team at LSE. Further edits and suggestions by David Marques, Grace Annan-Callcott, Jess Holland and James Barrington-Wells.



Federated learning and machine learning in academic literature

“Federated Learning: strategies for improving communication efficiency”, Konečný, McMahan, Ramage & Richtárik, 2016

“Federated Learning: Collaborative Machine Learning without Centralized Training Data”, McMahan & Ramage, 2017

“Machine Learning and Knowledge Extraction”, p1-8, Holzinger, Kieseberg, Weippl, Tjoa, 2018

Retracing in academic literature

“Practical Secure Aggregation for Privacy-Preserving Machine Learning”, Bonawitz, Ivanov, Kreuter, Marcedone, McMahan, Patel & Seth, 2017

“Technological due process”, Citron, 2007

The freedom to opt out in academic literature

“The Politics of Possibility”, Amoore, 2013

“Smartness and agency: Intricate entanglements of law and technology”, Hildebrandt, 2015

Setting boundaries in academic literature

“Algorithmic Impact Assessments: a practical framework for public agency accountability”, Reisman, Schultz, Crawford, Whittaker, 2018

“Smartness and agency: Intricate entanglements of law and technology”, Hildebrandt, 2015