Privacy protection in federated computing

Federated learning offers several privacy benefits out of the box. Following the principle of data minimization, the raw data stays on the device; the updates sent to the server are focused on a specific computation and are aggregated as early as possible. In particular, no non-aggregated data is retained on the server: data in transit is protected by encryption, and decryption keys and decrypted values are held only transiently in memory. Machine learning engineers and analysts who interact with the system see only aggregated data.

Because aggregation is central to federated approaches, it is natural to limit the influence of any single client on the output. If the goal is to provide more formal guarantees such as differential privacy, however, the algorithms must be carefully designed.

Basic federated learning approaches have been shown to work and have seen considerable adoption, but they are still far from being used by default: inherent tensions with fairness, accuracy, development velocity, and computational cost can stand in the way of data-minimization and anonymization approaches.

Therefore, we need composable privacy-enhancing techniques.

Ultimately, decisions about deploying privacy technologies are made by product or service teams in consultation with privacy, policy, and legal experts in the relevant domains. A capable federated learning system lets products offer stronger privacy protections and, perhaps more importantly, helps policy experts strengthen privacy definitions and requirements over time.

When considering the privacy properties of a federated system, it is useful to think in terms of access points and threat models. Does a party have access to the physical devices or the network? Root or physical access to the servers running the federated learning service? Access to the models and metrics released to machine learning engineers? Access to the final deployed model? The number of potentially malicious parties differs widely across these access points as information flows through the system, so privacy claims must be evaluated for the complete end-to-end system.

If appropriate security measures are not in place to protect the raw data on the device or the intermediate computation state in transit, then guarantees that the final deployed model does not memorize user data may matter little.

Data minimization addresses potential threats to devices, networks, and servers by improving security and minimizing retention of data and intermediate results.

When models and metrics are released to machine learning engineers or deployed into production, anonymous aggregation protects individuals' data from parties with access to these released outputs.

3.1 Data Minimization for Aggregation

At several points in a federated computation, participants expect one another to take the appropriate actions, and only those actions. For example, the server expects clients to execute their preprocessing steps faithfully; clients expect the server to keep their individual updates private until they have been aggregated; and both clients and the server expect that neither data analysts nor users of the deployed model can extract individual data, and so on.

Privacy-preserving technologies structurally enforce these expectations across components, preventing participants from deviating. Indeed, the federated system itself can be viewed as a privacy-preserving technology: it structurally prevents the server from accessing any client data that is not included in the updates the client submits.

Take the aggregation stage as an example. An idealized system would use a fully trusted third party that aggregates the clients' updates and reveals only the final aggregate to the server. In practice, no such mutually trusted third party usually exists to play this role, but various technologies allow a federated learning system to simulate one under different sets of assumptions.

For example, the server can run the aggregation procedure inside a secure enclave: specially built hardware that can not only attest to clients which code it is running, but also ensure that no one can observe or tamper with the execution of that code.

However, the availability of secure enclaves is currently limited, whether in the cloud or on consumer devices, and the enclaves that are available may implement only some of the desired properties. Furthermore, even when available and fully featured, secure enclaves impose additional constraints, including severely limited memory or speed; exposure of data through side channels (for example, cache-timing attacks); difficulty in verifying correctness; and reliance on attestation services provided by the hardware manufacturer (including the secrecy of its keys).

Alternatively, distributed cryptographic protocols for secure multi-party computation can be used to simulate such a trusted third party without special hardware, as long as enough of the participants behave honestly. While secure multi-party computation of arbitrary functions remains computationally prohibitive in most cases, specialized secure aggregation protocols for vector summation in federated settings have been developed that preserve privacy even against an adversary who observes the server and controls a large fraction of the clients. Cryptographically secure aggregation protocols of this kind have been deployed widely in commercial federated computing systems.
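To make the idea concrete, the following is a minimal sketch of the pairwise-masking trick that underlies many secure aggregation protocols; it is illustrative only (production protocols add key agreement, secret sharing, and dropout handling), and all names and parameters are ours. Each pair of clients derives a shared random mask that one adds and the other subtracts, so the masks cancel in the sum and the server learns only the aggregate.

    import numpy as np

    # Minimal sketch of pairwise-masked secure aggregation (illustrative only):
    # each pair of clients (i, j) derives a shared mask from a common seed;
    # the lower-indexed client adds it, the higher-indexed one subtracts it,
    # so all masks cancel in the server-side sum.

    DIM = 4
    NUM_CLIENTS = 3
    rng = np.random.default_rng(0)
    updates = [rng.normal(size=DIM) for _ in range(NUM_CLIENTS)]

    def pairwise_mask(i, j, dim):
        """Mask shared by clients i and j (in practice derived via key agreement)."""
        seed = hash((min(i, j), max(i, j))) % (2**32)
        return np.random.default_rng(seed).normal(size=dim)

    def masked_update(i, update):
        masked = update.copy()
        for j in range(NUM_CLIENTS):
            if j == i:
                continue
            mask = pairwise_mask(i, j, DIM)
            masked += mask if i < j else -mask  # masks cancel pairwise in the sum
        return masked

    # The server sees only masked updates; their sum equals the true sum.
    server_sum = sum(masked_update(i, u) for i, u in enumerate(updates))
    assert np.allclose(server_sum, sum(updates))
    print(server_sum)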

Beyond private aggregation, privacy-preserving technologies can protect other parts of the federated system. For example, secure enclaves or cryptographic techniques (such as zero-knowledge proofs) can give the server confidence that clients have performed their preprocessing faithfully.

Even the model-broadcast stage can benefit: for many learning tasks, an individual client may hold data relevant to only a small part of the model. In that case, the client can privately retrieve just the part of the model it needs for training, again using secure enclaves or cryptographic techniques to ensure that the server learns nothing about which part of the model the client has relevant training data for.

3.2 Computing and Verifying Anonymous Aggregates

While secure enclaves and private aggregation techniques strengthen data minimization, they are not specifically designed to produce anonymous aggregates, for example by limiting a user's influence on the model being trained. Indeed, the learned model can in some cases leak sensitive information.

The gold-standard approach to data anonymization is differential privacy. For a generic procedure that aggregates records in a database, differential privacy requires bounding the contribution any single record can make to the aggregate and then adding an appropriately scaled random perturbation. For example, in differentially private stochastic gradient descent (DP-SGD), per-example gradients are clipped to a maximum norm, the clipped gradients are aggregated, and Gaussian noise is added at each training step.
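Here is a minimal sketch of one DP-SGD step as just described, written in NumPy for a toy linear model; the clipping norm, noise multiplier, learning rate, and loss are illustrative choices, not values prescribed by the text.

    import numpy as np

    def dp_sgd_step(weights, X, y, clip_norm=1.0, noise_multiplier=1.1, lr=0.1):
        """One differentially private SGD step for linear regression (illustrative).

        Per-example gradients are clipped to `clip_norm`, summed, and Gaussian
        noise scaled to the clipping norm is added before averaging.
        """
        rng = np.random.default_rng()
        grad_sum = np.zeros_like(weights)
        for xi, yi in zip(X, y):
            # Per-example gradient of squared error for a linear model.
            g = 2.0 * (xi @ weights - yi) * xi
            # Clip the per-example gradient to bound each example's contribution.
            g = g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
            grad_sum += g
        # Add Gaussian noise calibrated to the sensitivity (the clipping norm).
        noise = rng.normal(scale=noise_multiplier * clip_norm, size=weights.shape)
        grad = (grad_sum + noise) / len(X)
        return weights - lr * grad

    # Toy usage.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(64, 3))
    y = X @ np.array([0.5, -1.0, 2.0]) + 0.1 * rng.normal(size=64)
    w = np.zeros(3)
    for _ in range(200):
        w = dp_sgd_step(w, X, y)
    print(w)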

Differentially private algorithms are necessarily randomized, so we can consider the distribution over models that the algorithm produces when run on a particular dataset. Intuitively, differential privacy requires that when the algorithm is run on two input datasets that differ in a single record, the resulting distributions over models are similar. In cross-device federated learning, a record is defined as all of the training examples of a single user/client, so the guarantee is user-level rather than example-level differential privacy.
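For reference, this similarity requirement is usually formalized as (ε, δ)-differential privacy; the statement below uses standard notation and is our addition rather than a quotation from the text.

    % A randomized mechanism M is (\epsilon, \delta)-differentially private if,
    % for all datasets D and D' differing in a single record (here, one user's
    % entire data) and for every set S of possible outputs,
    \[
      \Pr[M(D) \in S] \;\le\; e^{\epsilon}\,\Pr[M(D') \in S] + \delta .
    \]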

Federated learning algorithms are a natural fit for training with user-level privacy guarantees, even when run in a centralized setting, because they compute a single model update from all of a user's data, which makes it easier to bound each user's total influence on the model update.

Providing formal guarantees can be particularly challenging in cross-device federated learning systems, because the set of eligible users is dynamic and not known in advance, and participating users may drop out at any point during training. Building an end-to-end protocol that addresses this and is suitable for production federated learning systems remains an important open problem.

In cross-organization (cross-silo) federated learning, the unit of privacy can take on a different meaning. For example, if the participating institutions want to ensure that a party with access to the model iterates or the final model cannot determine whether a particular institution's dataset was used in training, then a record can be defined as all of the examples in a data silo. User-level differential privacy still makes sense in the cross-organization setting, but enforcing it can be more challenging when multiple institutions hold records from the same user.

Historically, differentially private data analysis has mostly assumed a central or trusted aggregator: raw data is collected by a trusted service provider, which then runs the differentially private algorithm. Local differential privacy avoids the need for a fully trusted aggregator, but it comes at a steep cost in accuracy.
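As a concrete illustration of local differential privacy and its accuracy cost, here is a minimal randomized-response sketch; the parameters and population are illustrative. Each client flips its private bit with a probability determined by ε before reporting, and the server debiases the noisy reports; the estimate gets markedly noisier as ε shrinks.

    import numpy as np

    def randomized_response(bit, epsilon, rng):
        """Report the true bit with prob e^eps / (1 + e^eps), else the flipped bit."""
        p_truth = np.exp(epsilon) / (1.0 + np.exp(epsilon))
        return bit if rng.random() < p_truth else 1 - bit

    def estimate_rate(reports, epsilon):
        """Debias the observed mean to estimate the true fraction of 1s."""
        p = np.exp(epsilon) / (1.0 + np.exp(epsilon))
        return (np.mean(reports) - (1.0 - p)) / (2.0 * p - 1.0)

    rng = np.random.default_rng(0)
    true_bits = (rng.random(100_000) < 0.3).astype(int)  # 30% of clients hold a 1
    for eps in (0.5, 2.0, 8.0):
        reports = [randomized_response(b, eps, rng) for b in true_bits]
        print(eps, estimate_rate(reports, eps))  # noisier for small eps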

To recover the utility of central differential privacy without relying on a fully trusted central server, several emerging approaches can be used, commonly referred to as distributed differential privacy. The goal is to make the outputs differentially private before the server can see them in the clear. Under distributed differential privacy, each client first computes a minimal application-specific report, perturbs it slightly with random noise, and submits it to a private aggregation protocol; the server then has access only to the output of that protocol. The noise added by any single client is usually not enough to provide a meaningful local differential privacy guarantee on its own. After private aggregation, however, the protocol's output provides a much stronger differential privacy guarantee based on the sum of the noise added across all clients, and this holds even with respect to a party with access to the server, under the security assumptions required by the private aggregation protocol.
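The sketch below illustrates the noise-splitting idea behind distributed differential privacy; it is illustrative only (real systems use discrete noise and an actual cryptographic secure aggregation protocol, and calibrate the noise to a target (ε, δ)). Each client clips its vector and adds just 1/n of the total noise variance, so no individual report is well protected on its own, but the aggregated sum carries noise equivalent to the central mechanism.

    import numpy as np

    DIM = 8
    NUM_CLIENTS = 100
    CLIP = 1.0          # per-client L2 bound (the sensitivity of the sum)
    CENTRAL_STD = 4.0   # noise standard deviation we want on the aggregate

    rng = np.random.default_rng(0)
    client_vectors = [rng.normal(size=DIM) for _ in range(NUM_CLIENTS)]

    def clip(v, c):
        return v * min(1.0, c / (np.linalg.norm(v) + 1e-12))

    def local_report(v):
        # Each client adds Gaussian noise with variance CENTRAL_STD**2 / n,
        # far too little to protect this single report by itself.
        return clip(v, CLIP) + rng.normal(scale=CENTRAL_STD / np.sqrt(NUM_CLIENTS),
                                          size=DIM)

    # Stand-in for a secure aggregation protocol: the server is assumed to see
    # only this sum, never the individual reports.
    aggregate = sum(local_report(v) for v in client_vectors)

    # The summed noise has standard deviation CENTRAL_STD, matching the
    # central-DP mechanism.
    print(aggregate)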

For an algorithm to provide a formal user-level privacy guarantee, not only must the model's sensitivity to each user's data be bounded, but noise proportional to that sensitivity must also be added. While enough random noise must be added to make the ε in the differential privacy definition small enough to yield a strong formal guarantee, in practice limiting sensitivity and adding even a small amount of noise can significantly reduce memorization.

This is because differential privacy assumes a worst-case adversary with unlimited computation and access to arbitrary side information, assumptions that are often unrealistic in practice. Training with differentially private algorithms that limit each user's influence therefore offers substantial advantages, and designing practical federated learning and federated analytics algorithms that achieve small ε guarantees remains an important area of research.

Model auditing techniques can be used to further quantify the benefit of training with differential privacy. They include quantifying the degree to which a model memorizes or over-learns rare training examples, and quantifying the extent to which it is possible to infer whether a user's data was used during training. These auditing techniques are useful even when the formal guarantee is weak, because they quantify the gap between differential privacy's worst-case adversary and realistic adversaries with limited computational power and side information. They can also serve as a complementary stress test: unlike the formal mathematical statement of differential privacy, these audits apply to the complete end-to-end system and can catch software bugs or poor parameter choices.
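As one concrete example of such an audit, here is a minimal loss-threshold membership-inference sketch; the model, its loss function, and all names are hypothetical stand-ins supplied by the caller. It measures how much better than random guessing an attacker can distinguish training members from held-out non-members, a gap that should shrink when training bounds per-user influence and adds noise.

    import numpy as np

    def membership_advantage(model_loss, train_examples, heldout_examples):
        """Simple loss-threshold membership-inference audit.

        `model_loss(example)` returns the trained model's loss on one example
        (a hypothetical callable supplied by the caller). The attack guesses
        "member" when the loss falls below a threshold; the returned advantage
        is the best excess over random guessing across thresholds.
        """
        train_losses = np.array([model_loss(e) for e in train_examples])
        held_losses = np.array([model_loss(e) for e in heldout_examples])
        best = 0.0
        for t in np.quantile(np.concatenate([train_losses, held_losses]),
                             np.linspace(0.0, 1.0, 101)):
            tpr = np.mean(train_losses <= t)   # members correctly flagged
            fpr = np.mean(held_losses <= t)    # non-members wrongly flagged
            best = max(best, tpr - fpr)
        return best  # 0 = no leakage detected, 1 = perfect membership inference

    # Toy usage with a fake loss function that "memorizes" training data.
    rng = np.random.default_rng(0)
    fake_loss = lambda e: e  # pretend lower loss on memorized (training) examples
    train = rng.normal(loc=0.5, scale=0.2, size=200)
    held = rng.normal(loc=1.0, scale=0.2, size=200)
    print(membership_advantage(fake_loss, train, held))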

Federated Analytics

Beyond learning machine learning models, data analysts are often interested in applying data science methods to data that lives locally on user devices. For example, analysts may be interested in aggregate model metrics, popular trends and activities, or geospatial heatmaps. All of these can be computed using federated analytics.

Similar to federated learning, federated analytics works by running local computations on each device’s data and providing only aggregated results.

Unlike federated learning, however, federated analytics is designed to support basic data science needs such as counts, averages, histograms, quantiles, and other SQL-like queries.

Consider an application in which an analyst wants to use federated analytics to learn the ten most played songs in a music library shared by many users. This task can be performed with the federation and privacy technologies discussed above. For example, clients can encode the songs they have listened to as a binary vector whose length equals the size of the library, and distributed differential privacy can ensure that the server sees only a noisy sum of these vectors: a differentially private histogram of how many users played each song.
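A minimal end-to-end sketch of this example follows; the library size, noise scale, the synthetic popularity distribution, and the use of plain Gaussian noise in place of a full distributed-DP plus secure-aggregation stack are all illustrative assumptions.

    import numpy as np

    LIBRARY_SIZE = 1000
    NUM_CLIENTS = 5000
    NOISE_STD = 20.0  # illustrative; a real deployment calibrates this to (eps, delta)

    rng = np.random.default_rng(0)
    # Popularity skewed toward low song ids, just to make a clear "top 10".
    popularity = 1.0 / (np.arange(LIBRARY_SIZE) + 1.0)
    popularity /= popularity.sum()

    def client_vector():
        """Binary vector marking which songs this client has played."""
        played = rng.choice(LIBRARY_SIZE, size=20, replace=False, p=popularity)
        v = np.zeros(LIBRARY_SIZE)
        v[played] = 1.0
        return v

    # Stand-in for distributed DP + secure aggregation: the server is assumed to
    # see only this noisy sum, i.e., a differentially private histogram of how
    # many clients played each song.
    noisy_histogram = sum(client_vector() for _ in range(NUM_CLIENTS))
    noisy_histogram += rng.normal(scale=NOISE_STD, size=LIBRARY_SIZE)

    top_10 = np.argsort(noisy_histogram)[-10:][::-1]
    print(top_10)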

However, federated analytics tasks differ from federated learning tasks in several ways. Federated analytics algorithms are usually non-interactive and involve rounds with large numbers of clients; in other words, unlike federated learning applications, there are no diminishing returns to having more clients in a round. Applying differential privacy is therefore less challenging in federated analytics, because each round can contain a large number of clients and fewer rounds are needed. Furthermore, there is no need for the same clients to participate again in later rounds; in fact, clients that participate repeatedly can skew the results of the algorithm. Federated analytics tasks are therefore best served by an infrastructure that limits the number of times any individual can participate.

Federated analytics tasks are also often sparse, which makes efficient private sparse aggregation a particularly important topic. It is worth noting that while limits on client participation and sparse aggregation are especially relevant for federated analytics, both can also be applied to federated learning problems.

Federated learning is being applied to an ever-wider range of data types and problem domains, and it has even come to be regarded as an important form of privacy-preserving computation: a privacy-protection approach oriented toward AI. For hands-on practice with federated learning, TensorFlow Federated is a good starting point.
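To tie the earlier pieces together, here is a framework-free sketch of the federated averaging pattern that libraries such as TensorFlow Federated implement; every function name and parameter below is illustrative and is not any library's actual API. Clients train locally and return only model deltas, and the server updates the global model from their average.

    import numpy as np

    def local_update(weights, X, y, epochs=5, lr=0.1):
        """Client-side training on local data for a toy linear model (illustrative)."""
        w = weights.copy()
        for _ in range(epochs):
            grad = 2.0 * X.T @ (X @ w - y) / len(X)
            w -= lr * grad
        return w - weights  # return only the model delta, never the raw data

    def federated_averaging(clients, rounds=50, dim=3, clients_per_round=10):
        """Server loop: broadcast, collect client deltas, aggregate, update."""
        rng = np.random.default_rng(0)
        weights = np.zeros(dim)
        for _ in range(rounds):
            sampled = rng.choice(len(clients), size=clients_per_round, replace=False)
            deltas = [local_update(weights, *clients[i]) for i in sampled]
            weights += np.mean(deltas, axis=0)  # only the aggregate reaches the model
        return weights

    # Toy population of clients, each holding its own local dataset.
    rng = np.random.default_rng(1)
    true_w = np.array([0.5, -1.0, 2.0])
    clients = []
    for _ in range(100):
        X = rng.normal(size=(32, 3))
        clients.append((X, X @ true_w + 0.1 * rng.normal(size=32)))
    print(federated_averaging(clients))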