Core Module References

federatedscope.core.fed_runner

class federatedscope.core.fed_runner.BaseRunner(data, server_class=<class 'federatedscope.core.workers.server.Server'>, client_class=<class 'federatedscope.core.workers.client.Client'>, config=None, client_configs=None)[source]

This class is a base class to construct an FL course, which includes _set_up() and run().

Parameters
  • data – The data used in the FL courses, which are formatted as {'ID':data} for standalone mode. More details can be found in federatedscope.core.auxiliaries.data_builder.

  • server_class – The server class used for instantiating a (customized) server.

  • client_class – The client class used for instantiating a (customized) client.

  • config – The configurations of the FL course.

  • client_configs – The clients’ configurations.

data

The data used in the FL courses, which are formatted as {'ID':data} for standalone mode. More details can be found in federatedscope.core.auxiliaries.data_builder.

server

The instantiated server.

client

The instantiated client(s).

cfg

The configurations of the FL course.

client_cfgs

The clients’ configurations.

mode

The run mode for FL, distributed or standalone

gpu_manager

manager of GPU resource

resource_info

information of resource

abstract _get_client_args(client_id, resource_info)[source]

Get the args for instantiating the client.

Parameters
  • client_id – ID of client

  • resource_info – information of resource

Returns

data which client holds; kwargs dict to instantiate the client.

Return type

(client_data, kw)

abstract _get_server_args(resource_info, client_resource_info)[source]

Get the args for instantiating the server.

Parameters
  • resource_info – information of resource

  • client_resource_info – information of client’s resource

Returns

None or data which server holds; model to be aggregated; kwargs dict to instantiate the server.

Return type

(server_data, model, kw)

abstract _set_up()[source]

Set up and instantiate the client/server.

_setup_client(client_id=-1, client_model=None, resource_info=None)[source]

Set up and instantiate the client.

Parameters
  • client_id – ID of client

  • client_model – model of client

  • resource_info – information of resource

Returns

The instantiated client.

_setup_server(resource_info=None, client_resource_info=None)[source]

Set up and instantiate the server.

Parameters
  • resource_info – information of resource

  • client_resource_info – information of client’s resource

Returns

The instantiated server.

check()[source]

Check the completeness of Server and Client.

abstract run()[source]

Launch the FL course

Returns

best results during the FL course

Return type

dict
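The _set_up()/run() template described for BaseRunner can be illustrated with a small stand-in. This is only a sketch and does not import FederatedScope: ToyBaseRunner and ToyStandaloneRunner are hypothetical names, calling _set_up() from the constructor is an assumption, and reserving ID 0 for the server is also an assumption made for the example.

```python
from abc import ABC, abstractmethod


class ToyBaseRunner(ABC):
    """Hypothetical stand-in mirroring the _set_up()/run() template."""

    def __init__(self, data, config=None):
        self.data = data          # {'ID': data}, as in standalone mode
        self.cfg = config or {}
        self.server = None
        self.client = {}
        self._set_up()            # assumption: set-up happens on construction

    @abstractmethod
    def _set_up(self):
        """Instantiate the server and client(s)."""

    @abstractmethod
    def run(self):
        """Launch the course and return the best results as a dict."""


class ToyStandaloneRunner(ToyBaseRunner):
    def _set_up(self):
        # ID 0 is reserved for the server in this sketch (assumption).
        self.server = {"ID": 0, "data": self.data.get(0)}
        self.client = {
            cid: {"ID": cid, "data": d}
            for cid, d in self.data.items() if cid != 0
        }

    def run(self):
        # Placeholder "training": report how many samples each client holds.
        return {cid: len(c["data"]) for cid, c in self.client.items()}


runner = ToyStandaloneRunner(data={0: None, 1: [1, 2, 3], 2: [4, 5]})
best = runner.run()
```

Subclasses only fill in the two abstract methods, which is the same division of labour the abstract methods listed above suggest.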

class federatedscope.core.fed_runner.DistributedRunner(data, server_class=<class 'federatedscope.core.workers.server.Server'>, client_class=<class 'federatedscope.core.workers.client.Client'>, config=None, client_configs=None)[source]
_get_client_args(client_id, resource_info)[source]

Get the args for instantiating the client.

Parameters
  • client_id – ID of client

  • resource_info – information of resource

Returns

data which client holds; kwargs dict to instantiate the client.

Return type

(client_data, kw)

_get_server_args(resource_info, client_resource_info)[source]

Get the args for instantiating the server.

Parameters
  • resource_info – information of resource

  • client_resource_info – information of client’s resource

Returns

None or data which server holds; model to be aggregated; kwargs dict to instantiate the server.

Return type

(server_data, model, kw)

_set_up()[source]

To set up server or client for distributed mode.

run()[source]

Launch the FL course

Returns

best results during the FL course

Return type

dict

class federatedscope.core.fed_runner.FedRunner(data, server_class=<class 'federatedscope.core.workers.server.Server'>, client_class=<class 'federatedscope.core.workers.client.Client'>, config=None, client_configs=None)[source]

This class is used to construct an FL course, which includes _set_up and run.

Parameters
  • data – The data used in the FL courses, which are formatted as {'ID':data} for standalone mode. More details can be found in federatedscope.core.auxiliaries.data_builder.

  • server_class – The server class used for instantiating a (customized) server.

  • client_class – The client class used for instantiating a (customized) client.

  • config – The configurations of the FL course.

  • client_configs – The clients’ configurations.

Warning

FedRunner will be removed in the future, consider using StandaloneRunner or DistributedRunner instead!

_handle_msg(msg, rcv=-1)[source]

To simulate the message handling process (used only for the standalone mode)

_setup_client(client_id=-1, client_model=None, resource_info=None)[source]

Set up the client

_setup_for_distributed()[source]

To set up server or client for distributed mode.

_setup_for_standalone()[source]

To set up server and client for standalone mode.

_setup_server(resource_info=None, client_resource_info=None)[source]

Set up the server

check()[source]

Check the completeness of Server and Client.

run()[source]

To run an FL course, which is called after the server/client has been set up. For the standalone mode, a shared message queue is set up to simulate receiving messages.

class federatedscope.core.fed_runner.StandaloneRunner(data, server_class=<class 'federatedscope.core.workers.server.Server'>, client_class=<class 'federatedscope.core.workers.client.Client'>, config=None, client_configs=None)[source]
_get_client_args(client_id=-1, resource_info=None)[source]

Get the args for instantiating the client.

Parameters
  • client_id – ID of client

  • resource_info – information of resource

Returns

data which client holds; kwargs dict to instantiate the client.

Return type

(client_data, kw)

_get_server_args(resource_info=None, client_resource_info=None)[source]

Get the args for instantiating the server.

Parameters
  • resource_info – information of resource

  • client_resource_info – information of client’s resource

Returns

None or data which server holds; model to be aggregated; kwargs dict to instantiate the server.

Return type

(server_data, model, kw)

_handle_msg(msg, rcv=-1)[source]

To simulate the message handling process (used only for the standalone mode)
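A minimal sketch of simulating message handling with a shared queue, assuming a simplified message tuple (sender, receiver, msg_type, content) and a server with ID 0. The handler functions and the message layout are hypothetical and are not FederatedScope's Message class.

```python
from collections import deque

# Hypothetical message format: (sender, receiver, msg_type, content).
shared_queue = deque()
log = []


def server_handle(msg):
    sender, _, msg_type, _ = msg
    log.append(("server", msg_type, sender))
    if msg_type == "join_in":
        # Reply to the client that just joined.
        shared_queue.append((0, sender, "model_para", {"w": 0.0}))


def client_handle(msg):
    _, receiver, msg_type, _ = msg
    log.append((f"client{receiver}", msg_type, 0))


def handle_msg(msg):
    """Dispatch a message to the server (ID 0) or to a client."""
    receiver = msg[1]
    if receiver == 0:
        server_handle(msg)
    else:
        client_handle(msg)


# Clients announce themselves, then the loop drains the queue.
for cid in (1, 2):
    shared_queue.append((cid, 0, "join_in", None))
while shared_queue:
    handle_msg(shared_queue.popleft())
```

Because every worker lives in one process, popping from the queue stands in for receiving a message over the network.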

_run_simulation()[source]

Run the standalone simulation (without online aggregation)

_run_simulation_online()[source]

Run for online aggregation. Any broadcast operation is executed client-by-client to avoid having #clients messages in existence at the same time. Currently, only the centralized topology is considered.
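The memory benefit of aggregating online can be sketched with a running weighted average: each client update is folded into the aggregate as soon as it arrives, so only one update is held at a time. This is a generic illustration with hypothetical names, not the aggregator FederatedScope uses.

```python
def online_weighted_average(updates):
    """Fold (num_samples, params) pairs into a running weighted mean.

    Only the running average is kept in memory, not all client updates.
    """
    total = 0
    avg = None
    for num_samples, params in updates:
        total += num_samples
        if avg is None:
            avg = dict(params)
            continue
        ratio = num_samples / total
        for name, value in params.items():
            avg[name] += ratio * (value - avg[name])
    return avg


def client_updates():
    # A generator, so updates are produced one client at a time.
    yield 10, {"w": 1.0, "b": 0.0}
    yield 30, {"w": 3.0, "b": 4.0}


result = online_weighted_average(client_updates())
```

The result equals the sample-weighted mean of all updates, here (10·1 + 30·3)/40 = 2.5 for w and (10·0 + 30·4)/40 = 3.0 for b.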

_set_up()[source]

To set up server and client for standalone mode.

run()[source]

Launch the FL course

Returns

best results during the FL course

Return type

dict

federatedscope.core.workers

class federatedscope.core.workers.BaseClient(ID, state, config, model, strategy)[source]
_register_default_handlers()[source]

Register the default handler dict to handle messages, each of which includes sender, receiver, state, and content. More details can be found in federatedscope.core.message.

Note

The default message handlers and their callback functions are shown below:

Message type          Callback function
--------------------  ---------------------------------
assign_client_id      callback_funcs_for_assign_id()
ask_for_join_in_info  callback_funcs_for_join_in_info()
address               callback_funcs_for_address()
model_para            callback_funcs_for_model_para()
ss_model_para         callback_funcs_for_model_para()
evaluate              callback_funcs_for_evaluate()
finish                callback_funcs_for_finish()
converged             callback_funcs_for_converged()

abstract callback_funcs_for_address(message)[source]

The handling function for receiving other clients’ IP addresses, which is used for constructing a complex topology

Parameters

message – The received message

abstract callback_funcs_for_assign_id(message)[source]

The handling function for receiving the client_ID assigned by the server (during the joining process), which is used in the distributed mode.

Parameters

message – The received message

abstract callback_funcs_for_converged(message)[source]

The handling function for receiving the signal that the FL course converged

Parameters

message – The received message

abstract callback_funcs_for_evaluate(message)[source]

The handling function for receiving an evaluation request

Parameters

message – The received message

abstract callback_funcs_for_finish(message)[source]

The handling function for receiving the signal of finishing the FL course.

Parameters

message – The received message

abstract callback_funcs_for_join_in_info(message)[source]

The handling function for receiving the request for join-in information (such as batch_size, num_of_samples) during the joining process.

Parameters

message – The received message

abstract callback_funcs_for_model_para(message)[source]

The handling function for receiving model parameters, which triggers the local training process. This handling function is widely used in various FL courses.

Parameters

message – The received message

register_handlers(msg_type, callback_func, send_msg=[None])[source]

To bind a message type with a handling function.

Parameters
  • msg_type (str) – The defined message type

  • callback_func – The handling functions to handle the received message
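Binding a message type to a handling function amounts to maintaining a registry keyed by msg_type. The sketch below is a simplified, hypothetical worker: ToyWorker, msg_handlers and handle are illustrative names, messages are plain dicts, and the send_msg argument from the signature above is left out.

```python
class ToyWorker:
    """Hypothetical worker keeping a msg_type -> callback registry."""

    def __init__(self):
        self.msg_handlers = {}

    def register_handlers(self, msg_type, callback_func):
        # Later registrations for the same type replace earlier ones.
        self.msg_handlers[msg_type] = callback_func

    def handle(self, message):
        msg_type = message["msg_type"]
        if msg_type not in self.msg_handlers:
            raise KeyError(f"No handler registered for {msg_type!r}")
        return self.msg_handlers[msg_type](message)


worker = ToyWorker()
worker.register_handlers("model_para", lambda m: f"train on {m['content']}")
worker.register_handlers("finish", lambda m: "stop")

reply = worker.handle({"msg_type": "model_para", "content": "round-1 weights"})
```

A customized worker would register additional message types in the same way instead of editing the dispatch logic.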

abstract run()[source]

To listen to messages and handle them accordingly (used for distributed mode)

class federatedscope.core.workers.BaseServer(ID, state, config, model, strategy)[source]
_register_default_handlers()[source]

Register the default handler dict to handle messages, each of which includes sender, receiver, state, and content. More details can be found in federatedscope.core.message.

Note

The default message handlers and their callback functions are shown below:

Message type  Callback function
------------  ----------------------------
join_in       callback_funcs_for_join_in()
join_in_info  callback_funcs_for_join_in()
model_para    callback_funcs_model_para()
metrics       callback_funcs_for_metrics()

abstract callback_funcs_for_join_in(message)[source]

The handling function for receiving the join-in information. The server might request some information (such as num_of_samples) if necessary and assigns IDs to the clients. Once all the clients have joined in, the training process is triggered.

Parameters

message – The received message

abstract callback_funcs_for_metrics(message)[source]

The handling function for receiving the evaluation results, which triggers check_and_move_on (perform aggregation when enough feedback has been received).

Parameters

message – The received message

abstract callback_funcs_model_para(message)[source]

The handling function for receiving model parameters, which triggers check_and_move_on (perform aggregation when enough feedback has been received). This handling function is widely used in various FL courses.

Parameters

message – The received message.

register_handlers(msg_type, callback_func, send_msg=[None])[source]

To bind a message type with a handling function.

Parameters
  • msg_type (str) – The defined message type

  • callback_func – The handling functions to handle the received message

abstract run()[source]

To start the FL course, listen and handle messages (for distributed mode).

class federatedscope.core.workers.Client(ID=-1, server_id=None, state=-1, config=None, data=None, model=None, device='cpu', strategy=None, is_unseen_client=False, *args, **kwargs)[source]

The Client class, which describes the behaviors of client in an FL course. The behaviors are described by the handling functions (named as callback_funcs_for_xxx)

Parameters
  • ID – The unique ID of the client, which is assigned by the server when joining the FL course

  • server_id – (Default) 0

  • state – The training round

  • config – The configuration

  • data – The data owned by the client

  • model – The model maintained locally

  • device – The device to run local training and evaluation

ID

ID of worker

state

the training round index

model

the model maintained locally

cfg

the configuration of FL course, see federatedscope.core.configs

mode

the run mode for FL, distributed or standalone

monitor

monitors the FL course and records metrics, see federatedscope.core.monitors.monitor.Monitor

trainer

instantiated trainer, see federatedscope.core.trainers

best_results

best results ever seen

history_results

all evaluation results

early_stopper

determine when to early stop, see federatedscope.core.monitors.early_stopper.EarlyStopper

ss_manager

secret sharing manager

msg_buffer

dict buffer for storing message

comm_manager

manager for communication, see federatedscope.core.communication

callback_funcs_for_address(message: Message)[source]

The handling function for receiving other clients’ IP addresses, which is used for constructing a complex topology

Parameters

message – The received message

callback_funcs_for_assign_id(message: Message)[source]

The handling function for receiving the client_ID assigned by the server (during the joining process), which is used in the distributed mode.

Parameters

message – The received message

callback_funcs_for_converged(message: Message)[source]

The handling function for receiving the signal that the FL course converged

Parameters

message – The received message

callback_funcs_for_evaluate(message: Message)[source]

The handling function for receiving an evaluation request

Parameters

message – The received message

callback_funcs_for_finish(message: Message)[source]

The handling function for receiving the signal of finishing the FL course.

Parameters

message – The received message

callback_funcs_for_join_in_info(message: Message)[source]

The handling function for receiving the request for join-in information (such as batch_size, num_of_samples) during the joining process.

Parameters

message – The received message

callback_funcs_for_model_para(message: Message)[source]

The handling function for receiving model parameters, which triggers the local training process. This handling function is widely used in various FL courses.

Parameters

message – The received message

join_in()[source]

To send join_in message to the server for joining in the FL course.

run()[source]

To listen to messages and handle them accordingly (used for distributed mode)

run_standalone()[source]

Run in standalone mode

class federatedscope.core.workers.Server(ID=-1, state=0, config=None, data=None, model=None, client_num=5, total_round_num=10, device='cpu', strategy=None, unseen_clients_id=None, **kwargs)[source]

The Server class, which describes the behaviors of the server in an FL course. The behaviors are described by the handling functions (named as callback_funcs_for_xxx).

Parameters
  • ID – The unique ID of the server, which is set to 0 by default

  • state – The training round

  • config – the configuration

  • data – The data owned by the server (for global evaluation)

  • model – The model used for aggregation

  • client_num – The (expected) client num to start the FL course

  • total_round_num – The total number of the training round

  • device – The device to run local training and evaluation

ID

ID of worker

state

the training round index

model

the model maintained locally

cfg

the configuration of FL course, see federatedscope.core.configs

mode

the run mode for FL, distributed or standalone

monitor

monitors the FL course and records metrics, see federatedscope.core.monitors.monitor.Monitor

trainer

instantiated trainer, see federatedscope.core.trainers

best_results

best results ever seen

history_results

all evaluation results

early_stopper

determine when to early stop, see federatedscope.core.monitors.early_stopper.EarlyStopper

aggregators

a protocol for aggregating all clients’ model(s), see federatedscope.core.aggregators

sample_client_num

number of clients aggregated in each round

msg_buffer

dict buffer for storing message

staled_msg_buffer

list buffer for storing stale messages

comm_manager

manager for communication, see federatedscope.core.communication

_merge_and_format_eval_results()[source]

The behaviors of server when receiving enough evaluating results

_perform_federated_aggregation()[source]

Perform federated aggregation and update the global model

_start_new_training_round(aggregated_num=0)[source]

The behaviors for starting a new training round

broadcast_client_address()[source]

To broadcast the communication addresses of clients (used for additive secret sharing)

broadcast_model_para(msg_type='model_para', sample_client_num=-1, filter_unseen_clients=True)[source]

To broadcast the message to all clients or sampled clients

Parameters
  • msg_type – ‘model_para’ or other user defined msg_type

  • sample_client_num – the number of clients sampled for the broadcast; sample_client_num = -1 means broadcasting to all the clients.

  • filter_unseen_clients – whether to filter out the unseen clients, i.e. those that do not contribute to the FL process by training on their local data and uploading their local model updates. The split is useful for checking the participation generalization gap described in [ICLR’22, What Do We Mean by Generalization in Federated Learning?]. You may want to set it to False in the evaluation stage.
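How sample_client_num and filter_unseen_clients interact can be sketched as a receiver-selection step. The function below is hypothetical (pick_receivers, its seed argument and the use of random.Random are assumptions for the example) and only mirrors the behaviour described by the parameters above.

```python
import random


def pick_receivers(client_ids, sample_client_num=-1,
                   unseen_clients_id=(), filter_unseen_clients=True,
                   seed=0):
    """Choose which clients receive a broadcast (illustrative only)."""
    candidates = list(client_ids)
    if filter_unseen_clients:
        unseen = set(unseen_clients_id)
        candidates = [c for c in candidates if c not in unseen]
    if sample_client_num > 0:
        rng = random.Random(seed)
        k = min(sample_client_num, len(candidates))
        return sorted(rng.sample(candidates, k))
    # sample_client_num = -1 means "send to every remaining client".
    return candidates


all_clients = [1, 2, 3, 4, 5]
train_receivers = pick_receivers(all_clients, 2, unseen_clients_id=[5])
eval_receivers = pick_receivers(all_clients, unseen_clients_id=[5],
                                filter_unseen_clients=False)
```

During training the unseen client 5 is never sampled, while the evaluation call with filter_unseen_clients=False reaches every client.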

callback_funcs_for_join_in(message: Message)[source]

The handling function for receiving the join-in information. The server might request some information (such as num_of_samples) if necessary and assigns IDs to the clients. Once all the clients have joined in, the training process is triggered.

Parameters

message – The received message

callback_funcs_for_metrics(message: Message)[source]

The handling function for receiving the evaluation results, which triggers check_and_move_on (perform aggregation when enough feedback has been received).

Parameters

message – The received message

callback_funcs_model_para(message: Message)[source]

The handling function for receiving model parameters, which triggers check_and_move_on (perform aggregation when enough feedback has been received). This handling function is widely used in various FL courses.

Parameters

message – The received message.

check_and_move_on(check_eval_result=False, min_received_num=None)[source]

To check the message buffer. When enough messages have been received, some events (such as performing aggregation, evaluation, and moving to the next training round) are triggered.

Parameters
  • check_eval_result (bool) – If True, check the message buffer for evaluation; otherwise, check the message buffer for training.

  • min_received_num – the minimum number of received messages, used for async mode

check_and_save()[source]

To save the results and save model after each evaluation, and check whether to early stop.
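The early-stop check mentioned above can be illustrated with a patience-based rule: stop once the tracked metric has not improved for a given number of consecutive evaluations. ToyEarlyStopper is a hypothetical class written for this sketch; the actual criteria are defined by federatedscope.core.monitors.early_stopper.EarlyStopper and may differ.

```python
class ToyEarlyStopper:
    """Hypothetical patience-based stopper for a loss-like metric."""

    def __init__(self, patience=2):
        self.patience = patience
        self.best = float("inf")
        self.bad_rounds = 0

    def track_and_check(self, value):
        """Record a new value; return True when training should stop."""
        if value < self.best:
            self.best = value
            self.bad_rounds = 0
        else:
            self.bad_rounds += 1
        return self.bad_rounds >= self.patience


stopper = ToyEarlyStopper(patience=2)
decisions = [stopper.track_and_check(v) for v in (0.9, 0.7, 0.8, 0.75)]
```

Here the loss improves twice and then fails to beat 0.7 for two evaluations in a row, so only the last check asks to stop.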

check_buffer(cur_round, min_received_num, check_eval_result=False)[source]

To check the message buffer

Parameters
  • cur_round (int) – The current round number

  • min_received_num (int) – The minimum number of received messages

  • check_eval_result (bool) – If True, check evaluation results; otherwise, check training results

Returns

bool: Whether enough messages have been received or not
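A minimal sketch of such a buffer check, assuming a hypothetical buffer layout of buffer[kind][round][client_id] = content. The real message buffer may be organised differently; only the parameter names are taken from the signature above.

```python
def check_buffer(msg_buffer, cur_round, min_received_num,
                 check_eval_result=False):
    """Return True once enough messages for cur_round have arrived."""
    kind = "eval" if check_eval_result else "train"
    received = msg_buffer.get(kind, {}).get(cur_round, {})
    return len(received) >= min_received_num


# Hypothetical layout: buffer[kind][round][client_id] = content.
msg_buffer = {"train": {0: {1: "update-a", 2: "update-b"}}, "eval": {}}

ready_for_aggregation = check_buffer(msg_buffer, 0, min_received_num=2)
ready_for_eval_merge = check_buffer(msg_buffer, 0, 2, check_eval_result=True)
```

Two training updates are enough to aggregate round 0, whereas no evaluation results have been received yet.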

check_client_join_in()[source]

To check whether all the clients have joined in the FL course.

eval()[source]

To conduct evaluation. When cfg.federate.make_global_eval=True, a global evaluation is conducted by the server.

merge_eval_results_from_all_clients()[source]

Merge evaluation results from all clients, update best, log the merged results and save them into eval_results.log

Returns

the formatted merged results

run()[source]

To start the FL course, listen and handle messages (for distributed mode).

save_best_results()[source]

To save the best evaluation results.

save_client_eval_results()[source]

Save the evaluation results of each client when the FL course is early stopped or terminated

terminate(msg_type='finish')[source]

To terminate the FL course

trigger_for_feat_engr(trigger_train_func, kwargs_for_trigger_train_func={})[source]

Interface for feature engineering, the default operation is none

trigger_for_start()[source]

To start the FL course when the expected number of clients have joined

trigger_for_time_up(check_timestamp=None)[source]

The handler for time up: modify the current timestamp and check the trigger condition

class federatedscope.core.workers.Worker(ID=-1, state=0, config=None, model=None, strategy=None)[source]

The base worker class, the parent of BaseClient and BaseServer

Parameters
  • ID – ID of worker

  • state – the training round index

  • config – the configuration of FL course

  • model – the model maintained locally

ID

ID of worker

state

the training round index

model

the model maintained locally

cfg

the configuration of FL course

mode

the run mode for FL, distributed or standalone

monitor

monitors the FL course and records metrics

federatedscope.core.trainers

class federatedscope.core.trainers.BaseTrainer(model, data, device, **kwargs)[source]
print_trainer_meta_info()[source]

Returns: A string containing meta information of the Trainer.

class federatedscope.core.trainers.Context(model, cfg, data=None, device=None)[source]

Record and pass variables among different hook functions.

Parameters
  • model – training model

  • cfg – config

  • data (dict) – a dict containing the train/val/test datasets or dataloaders

  • device – running device

  • init_dict (dict) – a dict used to initialize the instance of Context

  • init_attr (bool) – if set up the static variables

Note

  • The variables within an instance of class Context can be set/get as an attribute.

`ctx.${NAME_VARIABLE} = ${VALUE_VARIABLE}` where ${NAME_VARIABLE} and ${VALUE_VARIABLE} are the name and value of the variable.

  • To manage lifecycles automatically, you can wrap the variable with CtxVar and a lifecycle parameter as follows

`ctx.${NAME_VARIABLE} = CtxVar(${VALUE_VARIABLE}, ${LIFECYCLE})` The parameter ${LIFECYCLE} can be chosen from LIFECYCLE.BATCH, LIFECYCLE.EPOCH and LIFECYCLE.ROUTINE. The variable ctx.${NAME_VARIABLE} will then be deleted at the end of the corresponding stage

  • LIFECYCLE.BATCH: the variables will be deleted after running a batch

  • LIFECYCLE.EPOCH: the variables will be deleted after running an epoch

  • LIFECYCLE.ROUTINE: the variables will be deleted after running a routine

For more details, please refer to our [tutorial](https://federatedscope.io/docs/trainer/).
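The lifecycle idea in the note above can be sketched without FederatedScope. ToyCtxVar and ToyContext are hypothetical stand-ins, the lifecycle tags are plain strings, and the explicit clear() call is an assumption about how a stage end might be signalled; the real Context and CtxVar may be implemented differently.

```python
class ToyCtxVar:
    """Hypothetical wrapper pairing a value with a lifecycle tag."""

    def __init__(self, value, lifecycle):
        self.value = value
        self.lifecycle = lifecycle


class ToyContext:
    """Attribute store that forgets variables when a stage ends."""

    def __init__(self):
        object.__setattr__(self, "_lifecycles", {})

    def __setattr__(self, name, value):
        if isinstance(value, ToyCtxVar):
            # Remember the lifecycle, store only the unwrapped value.
            self._lifecycles[name] = value.lifecycle
            value = value.value
        object.__setattr__(self, name, value)

    def clear(self, lifecycle):
        # Delete every variable registered under this lifecycle.
        for name, tag in list(self._lifecycles.items()):
            if tag == lifecycle:
                delattr(self, name)
                del self._lifecycles[name]


ctx = ToyContext()
ctx.loss_batch = ToyCtxVar(0.42, "batch")
ctx.num_samples = ToyCtxVar(0, "routine")
ctx.model_name = "convnet"          # plain attribute, never auto-deleted

ctx.clear("batch")                  # end of a batch
```

After the batch ends, ctx.loss_batch is gone while the routine-scoped and plain attributes remain available.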

We classify and show the default attributes below:

Data-related attributes
  • ctx.data: the raw data (not split) the trainer holds

  • ctx.num_samples: the number of samples used in training

  • ctx.train_data, ctx.val_data, ctx.test_data: the split data the trainer holds

  • ctx.train_loader, ctx.val_loader, ctx.test_loader: the DataLoader of each split data

  • ctx.num_train_data, ctx.num_val_data, ctx.num_test_data: the number of samples of the split data

Model-related attributes
  • ctx.model: the model used

  • ctx.models: the multiple models, if used

  • ctx.mirrored_models: the mirrored models

  • ctx.trainable_para_names: the trainable parameter names of the model

Optimizer-related attributes
  • ctx.optimizer: see torch.optim

  • ctx.scheduler: decays the learning rate of each parameter group

  • ctx.criterion: loss/criterion function

  • ctx.regularizer: regular terms

  • ctx.grad_clip: gradient clipping

Mode-related attributes
  • ctx.cur_mode: mode of trainer, which is one of ['train', 'val', 'test']

  • ctx.mode_stack: stack of mode, only used for switching mode

  • ctx.cur_split: split of data, which is one of ['train', 'val', 'test'] (Note: using train data in test mode is allowed)

  • ctx.split_stack: stack of split, only used for switching data split

Metric-related attributes
  • ctx.loss_batch_total: Loss of current batch

  • ctx.loss_regular_total: Loss of regular term

  • ctx.y_true: true label of batch data

  • ctx.y_prob: output of the model with batch data as input

  • ctx.ys_true: true label of data

  • ctx.ys_prob: output of the model

  • ctx.eval_metrics: evaluation metrics calculated by ctx.monitor

  • ctx.monitor: used for monitoring the trainer’s behavior and statistics

Other (statistics) attributes (@property, query from cfg if not set)
  • ctx.cfg: configuration of FL course

  • ctx.device: current device, such as cpu or gpu0.

  • ctx.num_train_batch_last_epoch, ctx.num_total_train_batch: the number of batches

  • ctx.num_train_epoch, ctx.num_val_epoch, ctx.num_test_epoch: the number of epochs in each data split

  • ctx.num_train_batch, ctx.num_val_batch, ctx.num_test_batch: the number of batches in each data split

class federatedscope.core.trainers.FedEMTrainer(model_nums, models_interact_mode='sequential', model=None, data=None, device=None, config=None, base_trainer: Optional[Type[GeneralTorchTrainer]] = None)[source]

An implementation of FedEM, from “Federated Multi-Task Learning under a Mixture of Distributions” (NeurIPS 2021), based on Algorithm 1 of the paper and the official code: https://github.com/omarfoq/FedEM

_hook_on_batch_end_gather_loss(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute                   Operation
--------------------------  -----------
ctx.all_losses_model_batch  Gather loss

_hook_on_batch_forward_weighted_loss(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute       Operation
--------------  -----------------------------------
ctx.loss_batch  Multiply by weights_internal_models

_hook_on_batch_start_track_batch_idx(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute                       Operation
------------------------------  ---------
ctx.optimizer_for_global_model  False

_hook_on_fit_end_ensemble_eval(ctx)[source]

Ensemble evaluation

Note

The modified attributes and corresponding operations are shown below:

Attribute             Operation
--------------------  --------------------------------------
ctx.ys_prob_ensemble  Ensemble ys_prob
ctx.ys_true           Concatenate results
ctx.ys_prob           Concatenate results
ctx.eval_metrics      Get evaluated results from ctx.monitor
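One common way to ensemble the predictions of several internal models is a weighted average of their class probabilities. The sketch below shows that idea for a single sample; ensemble_probs and the weights are hypothetical, and whether FedEMTrainer combines ys_prob exactly this way is an assumption.

```python
def ensemble_probs(probs_per_model, weights):
    """Weighted average of per-model class probabilities for one sample."""
    if len(probs_per_model) != len(weights):
        raise ValueError("one weight per model is required")
    num_classes = len(probs_per_model[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, probs_per_model))
        for c in range(num_classes)
    ]


ys_prob_ensemble = ensemble_probs(
    probs_per_model=[[0.5, 0.5], [1.0, 0.0]],
    weights=[0.5, 0.5],
)
```

With equal weights the second model's confident prediction shifts the ensemble towards the first class.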

_hook_on_fit_end_flop_count(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute    Operation
-----------  -----------------
ctx.monitor  Count total_flops

_hook_on_fit_start_flop_count(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute    Operation
-----------  -----------------
ctx.monitor  Count total_flops

_hook_on_fit_start_mixture_weights_update(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute  Operation
---------  ---------
ctx.mode   Evaluate

register_multiple_model_hooks()[source]

customized multiple_model_hooks, which is called in the __init__ of GeneralMultiModelTrainer

class federatedscope.core.trainers.GeneralMultiModelTrainer(model_nums, models_interact_mode='sequential', model=None, data=None, device=None, config=None, base_trainer: Optional[Type[GeneralTorchTrainer]] = None)[source]
_run_routine(mode, hooks_set, dataset_name=None)[source]

Run the hooks_set and maintain the mode for multiple internal models

Parameters

mode – running mode of client, chosen from train/val/test

Note

Considering that evaluation could be in `hooks_set["on_epoch_end"]`, there could be two data loaders in self.ctx, so we must tell the running hooks which data_loader to call and which num_samples to count

get_model_para()[source]

Return the parameters of multiple models.

init_multiple_models()[source]

Initialize multiple models and optimizers. The default implementation uses copy initialization.

Extension

Users can override this function according to their own requirements.

register_multiple_model_hooks()[source]

By default, all internal models adopt the same hook_set.

Extension

Users can override this function to register customized hooks for different internal models.

Note

  • for sequential mode, users can append interact_hook on begin/end triggers such as “ -> (on_fit_end, _interact_to_other_models) -> ”

  • for parallel mode, users can append interact_hook on any trigger they want such as “ -> (on_xxx_point, _interact_to_other_models) -> ”

  • we must tell the running hooks which data_loader to call and which num_samples to count
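Registering hooks on triggers, as the notes above describe, can be sketched with a small trainer that fires every hook attached to a trigger in order. ToyTrainer, its trigger names and the three hook functions are hypothetical and simplified; the dict-based ctx stands in for the Context object.

```python
from collections import defaultdict


class ToyTrainer:
    """Hypothetical trainer that runs registered hooks at named triggers."""

    TRIGGERS = ("on_fit_start", "on_epoch_start", "on_batch_start",
                "on_batch_forward", "on_batch_end", "on_fit_end")

    def __init__(self):
        self.hooks = defaultdict(list)

    def register_hook(self, trigger, hook):
        if trigger not in self.TRIGGERS:
            raise ValueError(f"Unknown trigger: {trigger}")
        self.hooks[trigger].append(hook)

    def _fire(self, trigger, ctx):
        for hook in self.hooks[trigger]:
            hook(ctx)

    def run(self, ctx, batches):
        self._fire("on_fit_start", ctx)
        self._fire("on_epoch_start", ctx)
        for batch in batches:
            ctx["data_batch"] = batch
            for trigger in ("on_batch_start", "on_batch_forward",
                            "on_batch_end"):
                self._fire(trigger, ctx)
        self._fire("on_fit_end", ctx)
        return ctx


def init_counters(ctx):
    ctx["num_samples"] = 0


def count_samples(ctx):
    ctx["num_samples"] += len(ctx["data_batch"])


def interact_to_other_models(ctx):
    # Placeholder for an interaction hook appended at on_fit_end.
    ctx["interacted"] = True


trainer = ToyTrainer()
trainer.register_hook("on_fit_start", init_counters)
trainer.register_hook("on_batch_end", count_samples)
trainer.register_hook("on_fit_end", interact_to_other_models)
ctx = trainer.run({}, batches=[[1, 2], [3, 4, 5]])
```

Appending the interaction hook at on_fit_end corresponds to the sequential-mode pattern; a parallel-mode variant would attach it to another trigger.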

update(model_parameters, strict=False)[source]
Parameters
  • model_parameters (list[dict]) – The state_dict objects of multiple PyTorch Modules.

class federatedscope.core.trainers.GeneralTFTrainer(model, data, device, config, only_for_eval=False, monitor=None)[source]
_hook_on_batch_end(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute               Operation
----------------------  ----------------------
ctx.num_samples         Add ctx.batch_size
ctx.loss_batch_total    Add batch loss
ctx.loss_regular_total  Add batch regular loss
ctx.ys_true             Append ctx.y_true
ctx.ys_prob             Append ctx.y_prob

_hook_on_batch_forward(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute       Operation
--------------  ---------------------------------
ctx.optimizer   Initialize optimizer
ctx.batch_size  Calculate batch size
ctx.loss_batch  Calculate batch loss
ctx.model       Forward propagation
ctx.y_true      Get y_true from batch
ctx.y_prob      Forward propagation to get y_prob

_hook_on_batch_start_init(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute       Operation
--------------  ---------------------
ctx.data_batch  Initialize batch data

_hook_on_epoch_start(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute               Operation
----------------------  ---------------------
ctx.{cur_split}_loader  Initialize DataLoader

_hook_on_fit_end(ctx)[source]

Evaluate metrics.

Note

The modified attributes and corresponding operations are shown below:

Attribute         Operation
----------------  --------------------------------------
ctx.ys_true       Convert to numpy.array
ctx.ys_prob       Convert to numpy.array
ctx.monitor       Evaluate the results
ctx.eval_metrics  Get evaluated results from ctx.monitor

_hook_on_fit_start_init(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute               Operation
----------------------  ------------------
ctx.model               Move to ctx.device
ctx.loss_batch_total    Initialize to 0
ctx.loss_regular_total  Initialize to 0
ctx.num_samples         Initialize to 0
ctx.ys_true             Initialize to []
ctx.ys_prob             Initialize to []

parse_data(data)[source]

Populate “{}_data”, “{}_loader” and “num_{}_data” for different modes

update(model_parameters, strict=False)[source]

Called by the FL client to update the model parameters

Parameters
  • model_parameters (dict) – {model_name: model_val}

  • strict (bool) – ensure the key-value pairs are strictly the same
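The effect of the strict flag can be sketched with plain dicts. merge_parameters is a hypothetical helper: in this sketch, strict=True requires the incoming keys to match the local ones exactly, while strict=False copies only the keys the local model already has. The trainer's actual handling of mismatched keys may differ.

```python
def merge_parameters(current, incoming, strict=False):
    """Return current parameters overwritten by incoming ones.

    With strict=True the two key sets must match exactly.
    """
    if strict and set(current) != set(incoming):
        missing = sorted(set(current) - set(incoming))
        unexpected = sorted(set(incoming) - set(current))
        raise KeyError(f"missing={missing}, unexpected={unexpected}")
    merged = dict(current)
    # Non-strict mode ignores keys the local model does not know.
    merged.update({k: v for k, v in incoming.items() if k in current})
    return merged


local = {"fc.weight": 0.1, "fc.bias": 0.0}
from_server = {"fc.weight": 0.5, "extra.scale": 2.0}

updated = merge_parameters(local, from_server, strict=False)
```

Non-strict merging is convenient when only part of the model is shared, since local-only parameters keep their values.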

class federatedscope.core.trainers.GeneralTorchTrainer(model, data, device, config, only_for_eval=False, monitor=None)[source]
_hook_on_batch_backward(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute      Operation
-------------  --------------------
ctx.optimizer  Update by gradient
ctx.loss_task  Backward propagation
ctx.scheduler  Update by gradient

_hook_on_batch_end(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute               Operation
----------------------  ----------------------
ctx.num_samples         Add ctx.batch_size
ctx.loss_batch_total    Add batch loss
ctx.loss_regular_total  Add batch regular loss
ctx.ys_true             Append ctx.y_true
ctx.ys_prob             Append ctx.y_prob

_hook_on_batch_forward(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute       Operation
--------------  ---------------------------------
ctx.y_true      Move to ctx.device
ctx.y_prob      Forward propagation to get y_prob
ctx.loss_batch  Calculate the loss
ctx.batch_size  Get the batch_size

_hook_on_batch_forward_flop_count(ctx)[source]

The monitoring hook to calculate the FLOPs during the FL course

Note

For customized cases in which the forward process is not based only on ctx.model, please override this function (inheritance case) or replace this hook (plug-in case)

The modified attributes and corresponding operations are shown below:

Attribute    Operation
-----------  -------------------
ctx.monitor  Track average flops

_hook_on_batch_forward_regularizer(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute         Operation
----------------  ---------------------------------
ctx.loss_regular  Calculate the regular loss
ctx.loss_task     Sum ctx.loss_regular and ctx.loss

_hook_on_batch_start_init(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute        Operation
ctx.data_batch   Initialize batch data

_hook_on_epoch_start(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute                    Operation
ctx.{ctx.cur_split}_loader   Initialize DataLoader

_hook_on_fit_end(ctx)[source]

Evaluate metrics.

Note

The modified attributes and corresponding operations are shown below:

Attribute          Operation
ctx.ys_true        Convert to numpy.array
ctx.ys_prob        Convert to numpy.array
ctx.monitor        Evaluate the results
ctx.eval_metrics   Get evaluated results from ctx.monitor

_hook_on_fit_start_calculate_model_size(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute     Operation
ctx.monitor   Track model size

_hook_on_fit_start_init(ctx)[source]

Note

The modified attributes and corresponding operations are shown below:

Attribute                Operation
ctx.model                Move to ctx.device
ctx.optimizer            Initialize by ctx.cfg
ctx.scheduler            Initialize by ctx.cfg
ctx.loss_batch_total     Initialize to 0
ctx.loss_regular_total   Initialize to 0
ctx.num_samples          Initialize to 0
ctx.ys_true              Initialize to []
ctx.ys_prob              Initialize to []
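
The accumulator attributes listed for the hooks follow an initialize-then-accumulate pattern: the fit-start hook resets them and the batch-end hook updates them after every batch. The following is a minimal pure-Python sketch of that pattern, not the library code. The names on_fit_start and on_batch_end are hypothetical, a SimpleNamespace stands in for the trainer context, and weighting the batch loss by the batch size is an assumption.

```python
from types import SimpleNamespace


def on_fit_start(ctx):
    """Reset the accumulators, mirroring the fit-start table."""
    ctx.loss_batch_total = 0.0
    ctx.loss_regular_total = 0.0
    ctx.num_samples = 0
    ctx.ys_true = []
    ctx.ys_prob = []


def on_batch_end(ctx):
    """Fold the per-batch values into the running totals."""
    ctx.num_samples += ctx.batch_size
    # Assumption: the batch loss is a mean, so it is weighted by batch size.
    ctx.loss_batch_total += ctx.loss_batch * ctx.batch_size
    ctx.loss_regular_total += ctx.loss_regular
    ctx.ys_true.append(ctx.y_true)
    ctx.ys_prob.append(ctx.y_prob)


ctx = SimpleNamespace()
on_fit_start(ctx)
# Two hypothetical batches: (labels, probabilities, mean loss, regular loss).
batches = [([0, 1], [0.2, 0.9], 0.5, 0.01), ([1], [0.7], 0.25, 0.01)]
for y_true, y_prob, loss, loss_regular in batches:
    ctx.y_true, ctx.y_prob = y_true, y_prob
    ctx.loss_batch, ctx.loss_regular = loss, loss_regular
    ctx.batch_size = len(y_true)
    on_batch_end(ctx)

avg_loss = ctx.loss_batch_total / ctx.num_samples
```

Keeping totals rather than per-batch averages means the final average loss is correct even when the last batch is smaller than the others.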

discharge_model()[source]

Discharge the model from GPU device

get_model_para()[source]
Returns

model_parameters (dict): {model_name: model_val}

parse_data(data)[source]

Populate “${split}_data”, “${split}_loader” and “num_${split}_data” for different data splits

setup_data(ctx)[source]

Initialize data by cfg.

update(model_parameters, strict=False)[source]

Called by the FL client to update the model parameters

Parameters

model_parameters (dict) – PyTorch Module object’s state_dict.

class federatedscope.core.trainers.Trainer(model, data, device, config, only_for_eval=False, monitor=None)[source]

Register, organize and run the train/test/val procedures

_param_filter(state_dict, filter_keywords=None)[source]

Filter model parameters when they are transmitted between local and global models, which is useful in personalization. E.g., setting cfg.personalization.local_param=['bn', 'norms'] corresponds to the implementation of “FedBN: Federated Learning on Non-IID Features via Local Batch Normalization, ICLR 2021”, which can be found at https://openreview.net/forum?id=6YEQUn0QICG

Parameters

state_dict (dict) – PyTorch Module object’s state_dict.

Returns

remove the keys that match any of the given keywords.

Return type

state_dict (dict)
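
As an illustration of keyword-based filtering, here is a minimal sketch that uses plain dictionaries in place of a PyTorch state_dict. The function name filter_state_dict and the parameter names are hypothetical, and matching by substring is an assumption about how keywords are compared with parameter names.

```python
def filter_state_dict(state_dict, filter_keywords=None):
    """Drop entries whose name contains any of the given keywords."""
    if not filter_keywords:
        return dict(state_dict)
    return {
        name: value
        for name, value in state_dict.items()
        if not any(keyword in name for keyword in filter_keywords)
    }


# Hypothetical parameter names; plain lists stand in for tensors.
state = {
    "conv1.weight": [0.1, 0.2],
    "bn1.weight": [1.0],
    "bn1.running_mean": [0.0],
    "fc.bias": [0.3],
}
shared = filter_state_dict(state, filter_keywords=["bn", "norms"])
```

With the keywords ['bn', 'norms'], the batch-normalization entries stay local and only the remaining parameters would be shared.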

Populate ${split}_data, ${split}_loader and num_${split}_data for different data splits, and setup init var in ctx.

get_model_para()[source]
Returns

model_parameters (dict): {model_name: model_val}

parse_data(data)[source]

Populate ${split}_data, ${split}_loader and num_${split}_data for different data splits

print_trainer_meta_info()[source]

Print some meta info for code users, e.g., the model type and the parameter names that will be filtered out.

setup_data(ctx)[source]

Initialize data by cfg.

update(model_parameters, strict=False)[source]

Called by the FL client to update the model parameters

Parameters
  • model_parameters (dict) – {model_name: model_val}

  • strict (bool) – ensure the k-v pairs are strictly the same

federatedscope.core.trainers.wrap_DittoTrainer(base_trainer: Type[GeneralTorchTrainer]) Type[GeneralTorchTrainer][source]

Build a DittoTrainer in a plug-in manner by registering new functions into a specific BaseTrainer.

The Ditto implementation, “Ditto: Fair and Robust Federated Learning Through Personalization (ICML 2021)”, is based on Algorithm 2 in the paper and the official code: https://github.com/litian96/ditto

federatedscope.core.trainers.wrap_fedprox_trainer(base_trainer: Type[GeneralTorchTrainer]) Type[GeneralTorchTrainer][source]

The implementation of FedProx follows Federated Optimization in Heterogeneous Networks [Tian Li, et al., 2020]

(https://proceedings.mlsys.org/paper/2020/file/38af86134b65d0f10fe33d30dd76442e-Paper.pdf)

federatedscope.core.trainers.wrap_nbafl_server(server)[source]

Register noise injector for the server

federatedscope.core.trainers.wrap_nbafl_trainer(base_trainer: Type[GeneralTorchTrainer]) Type[GeneralTorchTrainer][source]

The implementation of NbAFL follows Federated Learning with Differential Privacy: Algorithms and Performance Analysis [et al., 2020]

(https://ieeexplore.ieee.org/abstract/document/9069945/)

Parameters
  • mu – the factor of the regularizer

  • epsilon – the distinguishable bound

  • w_clip – the threshold to clip weights

federatedscope.core.trainers.wrap_pFedMeTrainer(base_trainer: Type[GeneralTorchTrainer]) Type[GeneralTorchTrainer][source]

Build a pFedMeTrainer in a plug-in manner by registering new functions into a specific BaseTrainer.

The pFedMe implementation, “Personalized Federated Learning with Moreau Envelopes (NeurIPS 2020)”, is based on Algorithm 1 in the paper and the official code: https://github.com/CharlieDinh/pFedMe

federatedscope.core.data

class federatedscope.core.data.base_data.ClientData(client_cfg, train=None, val=None, test=None, **kwargs)[source]

ClientData converts split data to DataLoader.

Parameters
  • loader – Dataloader class or data dict which has been built

  • client_cfg – client-specific CfgNode

  • data – raw dataset, which will stay raw

  • train – train dataset, which will be converted to Dataloader

  • val – valid dataset, which will be converted to Dataloader

  • test – test dataset, which will be converted to Dataloader

Note

Key {split}_data in ClientData is the raw dataset. Key {split} in ClientData is the dataloader.

setup(new_client_cfg=None)[source]

Set up DataLoader in ClientData with new configurations.

Parameters

new_client_cfg – new client-specific CfgNode

Returns

Status for indicating whether the client_cfg is updated

Return type

Bool

class federatedscope.core.data.base_data.StandaloneDataDict(datadict, global_cfg)[source]

StandaloneDataDict maintains several ClientData instances. It is only used in standalone mode, where it is passed to the Runner, and it conducts several preprocessing steps based on global_cfg; see preprocess() for details.

Parameters
  • datadict – Dict with client_id as key, ClientData as value.

  • global_cfg – global CfgNode

attack(datadict)[source]

Apply attack to StandaloneDataDict.

preprocess(datadict)[source]

Preprocess for:

  1. Global evaluation (merge test data).

  2. Global mode (train with centralized setting, merge all data).

  3. Apply data attack algorithms.

Parameters

datadict – dict with client_id as key, ClientData as value.

resetup(global_cfg, client_cfgs=None)[source]

Set up ClientData again with new configs when the configs change, which might be used in HPO.

Parameters
  • global_cfg – enable new config for ClientData

  • client_cfgs – enable new client-specific config for ClientData

class federatedscope.core.data.base_translator.BaseDataTranslator(global_cfg, client_cfgs=None)[source]

Translator is a tool to convert a centralized dataset to StandaloneDataDict, which is the input data of runner.

Notes

The Translator consists of several stages:

Dataset -> ML split (split_train_val_test()) -> FL split (split_to_client()) -> StandaloneDataDict

split(dataset)[source]

Perform ML split and FL split.

Returns

dict of ClientData with client_idx as key to build StandaloneDataDict

split_to_client(train, val, test)[source]

Split dataset to clients and build ClientData.

Returns

dict of ClientData with client_idx as key.

Return type

dict

split_train_val_test(dataset, cfg=None)[source]

Split dataset to train, val, test if not provided.

Returns

List of split dataset, like [train, val, test]

Return type

List

class federatedscope.core.data.dummy_translator.DummyDataTranslator(global_cfg, client_cfgs=None)[source]

DummyDataTranslator converts datadict to StandaloneDataDict. Compared to core.data.base_translator.BaseDataTranslator, it does not perform the FL split.

split(dataset)[source]

Perform ML split

Returns

dict of ClientData with client_idx as key to build StandaloneDataDict

federatedscope.core.data.utils.convert_data_mode(data, config)[source]

Convert StandaloneDataDict to ClientData in distributed mode.

Parameters
  • data – StandaloneDataDict

  • config – configuration of FL course, see federatedscope.core.configs

Returns

StandaloneDataDict in standalone mode, or ClientData in distributed mode.

federatedscope.core.data.utils.download_url(url: str, folder='folder')[source]

Downloads the content of a URL to a folder. Modified from https://github.com/pyg-team/pytorch_geometric/tree/master/torch_geometric

Parameters
  • url (string) – The url of target file.

  • folder (string) – The target folder.

Returns

File path of downloaded files.

Return type

string

federatedscope.core.data.utils.filter_dict(func, kwarg)[source]

Keeps only the keys of kwarg that are also arguments of func.

Parameters
  • func – function to be filtered

  • kwarg – dict to filter

Returns

Filtered dict of arguments of the function.

federatedscope.core.data.utils.get_func_args(func)[source]

Get the set of arguments that the function expects.

Parameters

func – function to be analyzed

Returns

Arguments that the function expects
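
The two helpers filter_dict and get_func_args fit together naturally: one reads the parameter names a function declares, and the other keeps only the matching entries of a dict. The sketch below shows one way to do this with inspect.signature. It is an illustration under that assumption, and the names get_func_args_sketch, filter_dict_sketch, and make_loader are hypothetical.

```python
import inspect


def get_func_args_sketch(func):
    """Return the set of parameter names that func declares."""
    return set(inspect.signature(func).parameters)


def filter_dict_sketch(func, kwarg):
    """Keep only the entries of kwarg that func accepts by name."""
    accepted = get_func_args_sketch(func)
    return {key: value for key, value in kwarg.items() if key in accepted}


def make_loader(dataset, batch_size=1, shuffle=False):
    """Hypothetical loader factory used only for this illustration."""
    return {"dataset": dataset, "batch_size": batch_size, "shuffle": shuffle}


raw_cfg = {"batch_size": 32, "shuffle": True, "num_walks": 10}
loader = make_loader([1, 2, 3], **filter_dict_sketch(make_loader, raw_cfg))
```

Filtering first lets a single config dict feed several constructors without raising TypeError on unexpected keyword arguments.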

federatedscope.core.data.utils.load_dataset(config, client_cfgs=None)[source]

Loads the dataset for the given config from branches

Parameters

config – configurations for FL, see federatedscope.core.configs

Note

See https://federatedscope.io/docs/datazoo/ for all available data.

federatedscope.core.data.utils.load_external_data(config=None)[source]

Based on the configuration file, this function imports external datasets and applies the train/valid/test split.

Parameters

config – CN from federatedscope/core/configs/config.py

Returns

tuple of ML split dataset, and CN from federatedscope/core/configs/config.py, which might be modified in the function.

Return type

(data, modified_config)

federatedscope.core.data.utils.merge_data(all_data, merged_max_data_id=None, specified_dataset_name=None)[source]

Merge data from client 1 to merged_max_data_id contained in given all_data.

Parameters
  • all_data – StandaloneDataDict

  • merged_max_data_id – max merged data index

  • specified_dataset_name – split name to be merged

Returns

Merged data.

federatedscope.core.data.utils.save_local_data(dir_path, train_data=None, train_targets=None, test_data=None, test_targets=None, val_data=None, val_targets=None)[source]

Save data to disk. Source: https://github.com/omarfoq/FedEM/blob/main/data/femnist/generate_data.py

Parameters
  • train_data – x of train data

  • train_targets – y of train data

  • test_data – x of test data

  • test_targets – y of test data

  • val_data – x of validation data

  • val_targets – y of validation data

Note

save (`train_data`, `train_targets`) in {dir_path}/train.pt, (`val_data`, `val_targets`) in {dir_path}/val.pt and (`test_data`, `test_targets`) in {dir_path}/test.pt

federatedscope.core.splitters

class federatedscope.core.splitters.BaseSplitter(client_num)[source]

This is an abstract base class for all splitters; its __call__() is not implemented.

client_num

Divide the dataset into client_num pieces.

class federatedscope.core.splitters.generic.IIDSplitter(client_num)[source]

This splitter splits the dataset into independent and identically distributed (IID) pieces.

Parameters

client_num – the dataset will be split into client_num pieces
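
A minimal sketch of an IID split, assuming the dataset is an indexable sequence: shuffle the sample indices once and deal them out evenly. The function name iid_split and the seed parameter are hypothetical and are not part of the documented class.

```python
import random


def iid_split(dataset, client_num, seed=0):
    """Shuffle sample indices and deal them into client_num nearly equal pieces."""
    indices = list(range(len(dataset)))
    random.Random(seed).shuffle(indices)
    # Slicing with a stride deals the shuffled indices round-robin.
    return [
        [dataset[i] for i in indices[client::client_num]]
        for client in range(client_num)
    ]


pieces = iid_split(list(range(10)), client_num=3)
```

Because every sample is assigned uniformly at random, each piece has the same expected label distribution as the full dataset.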

class federatedscope.core.splitters.generic.LDASplitter(client_num, alpha=0.5)[source]

This splitter splits the dataset with LDA.

Parameters
  • client_num – the dataset will be split into client_num pieces

  • alpha (float) – Partition hyperparameter in LDA; a smaller alpha generates a more extremely heterogeneous scenario (see np.random.dirichlet)
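
To make the role of alpha concrete, the sketch below partitions sample indices class by class, drawing each class's per-client proportions from a symmetric Dirichlet distribution. It is an illustration rather than the splitter's actual code: the names dirichlet and lda_split are hypothetical, and the Dirichlet draw is built from normalized Gamma samples so that only the standard library is needed.

```python
import random


def dirichlet(alpha, size, rng):
    """Sample a symmetric Dirichlet vector by normalizing Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def lda_split(labels, client_num, alpha=0.5, seed=0):
    """Assign sample indices to clients with per-class Dirichlet proportions."""
    rng = random.Random(seed)
    clients = [[] for _ in range(client_num)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        proportions = dirichlet(alpha, client_num, rng)
        # Turn the proportions into cut points over this class's samples.
        start, cumulative = 0, 0.0
        for client, p in enumerate(proportions):
            cumulative += p
            if client == client_num - 1:
                end = len(idx)
            else:
                end = round(cumulative * len(idx))
            clients[client].extend(idx[start:end])
            start = end
    return clients


labels = [i % 4 for i in range(200)]
parts = lda_split(labels, client_num=5, alpha=0.5)
```

With a small alpha the proportion vector concentrates on a few clients, so each client sees only some of the classes; a large alpha approaches an even split.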

class federatedscope.core.splitters.graph.Analyzer(raw_data: Data, split_data: List[Data])[source]

Analyzer for raw graph and split subgraphs.

Parameters
  • raw_data (PyG.data) – raw graph.

  • split_data (list) – the list for subgraphs split by splitter.

average_clustering()[source]
Returns

the average clustering coefficient for the raw G and split G

fl_adj()[source]
Returns

the adj for missing edge ADJ.

fl_data()[source]
Returns

the split edge index.

hamming()[source]
Returns

the average hamming distance of feature for the raw G, split G and missing edge G

hamming_distance_graph(data)[source]
Returns

calculate the hamming distance of graph data

homophily()[source]
Returns

the homophily for the raw G and split G

homophily_value(edge_index, y)[source]
Returns

calculate homophily_value
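
The documentation does not spell out the formula. A common definition, assumed here, is the edge-homophily ratio: the fraction of edges whose two endpoints share a label. The sketch uses plain lists for a COO edge index, and the name edge_homophily is hypothetical.

```python
def edge_homophily(edge_index, y):
    """Fraction of edges whose two endpoints carry the same label."""
    rows, cols = edge_index
    if not rows:
        return 0.0
    same = sum(1 for u, v in zip(rows, cols) if y[u] == y[v])
    return same / len(rows)


# Hypothetical 4-node graph in COO form: edges 0-1, 1-2, 2-3, 3-0.
edge_index = ([0, 1, 2, 3], [1, 2, 3, 0])
labels = [0, 0, 1, 1]
ratio = edge_homophily(edge_index, labels)
```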

missing_data()[source]
Returns

the graph data built by missing edge index.

num_missing_edge()[source]
Returns

the number of missing edge and the rate of missing edge.

portion_ms_node()[source]
Returns

the proportion of nodes that miss edges.

class federatedscope.core.splitters.graph.LouvainSplitter(client_num, delta=20)[source]

Split Data into small data via louvain algorithm.

Parameters
  • client_num (int) – Split data into client_num of pieces.

  • delta (int) – The gap between the number of nodes on each client.

class federatedscope.core.splitters.graph.RandChunkSplitter(client_num)[source]

Split graph-level dataset via random chunk strategy.

Parameters

dataset (List or PyG.dataset) – The graph-level datasets.

class federatedscope.core.splitters.graph.RandomSplitter(client_num, sampling_rate=None, overlapping_rate=0, drop_edge=0)[source]

Split Data into small data via random sampling.

Parameters
  • client_num (int) – Split data into client_num of pieces.

  • sampling_rate (str) – Samples of the unique nodes for each client, e.g., '0.2,0.2,0.2'

  • overlapping_rate (float) – Additional samples of overlapping data, e.g., '0.4'

  • drop_edge (float) – Drop edges (drop_edge / client_num) for each client within overlapping part.

class federatedscope.core.splitters.graph.RelTypeSplitter(client_num, alpha=0.5, realloc_mask=False)[source]

Split Data into small data via dirichlet distribution to generate non-i.i.d data split.

Parameters
  • client_num (int) – Split data into client_num of pieces.

  • alpha (float) – Partition hyperparameter in LDA; a smaller alpha generates a more extremely heterogeneous scenario (see np.random.dirichlet)

class federatedscope.core.splitters.graph.ScaffoldLdaSplitter(client_num, alpha)[source]

First adopt scaffold splitting and then assign the samples to clients according to Latent Dirichlet Allocation.

Parameters
  • dataset (List or PyG.dataset) – The molecular datasets.

  • alpha (float) – Partition hyperparameter in LDA; a smaller alpha generates a more extremely heterogeneous scenario (see np.random.dirichlet)

Returns

data_list of split dataset via scaffold split.

Return type

List(List(PyG.data))

class federatedscope.core.splitters.graph.ScaffoldSplitter(client_num)[source]

Split molecules via scaffold. This splitter sorts all molecules and splits them into several parts.

Parameters

client_num (int) – Split data into client_num of pieces.

federatedscope.core.configs

class federatedscope.core.configs.CN(init_dict=None, key_list=None, new_allowed=False)[source]

An extended configuration system based on yacs (https://github.com/rbgirshick/yacs). The two-level tree structure consists of several internal dict-like containers to allow simple key-value access and management.

assert_cfg(check_cfg=True)[source]

Check the validity of the configuration instance.

Parameters

check_cfg – whether to enable checks

check_required_args()[source]

Check required arguments.

clean_unused_sub_cfgs()[source]

Clean the unused secondary-level CfgNode, i.e., those whose .use attribute is False

clear_aux_info()[source]

Clears all the auxiliary information of the CN object.

de_arguments()[source]

Some config values are managed via the Argument class. This function strips the Argument wrapper from these values so that type-specific methods work correctly, e.g., len(cfg.federate.method) for a string config.

freeze(inform=True, save=True, check_cfg=True)[source]
  1. make the cfg attributes immutable;

  2. if save==True, save the frozen cfg_check_funcs into self.outdir/config.yaml for better reproducibility;

  3. if self.wandb.use==True, update the frozen config

merge_from_file(cfg_filename, check_cfg=True)[source]

Load configs from a yaml file, another cfg instance, or a list that stores the keys and values.

Parameters
  • cfg_filename – file name of yaml file

  • check_cfg – whether to enable assert_cfg()

merge_from_list(cfg_list, check_cfg=True)[source]

Load configs from a list that stores the keys and values. Modified from merge_from_list in yacs.config.py to allow adding new keys if is_new_allowed() returns True.

Parameters
  • cfg_list – list of pairs of cfg name and value

  • check_cfg – whether to enable assert_cfg()
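
To illustrate how such a list can address a two-level config, here is a minimal sketch that applies dotted keys to a nested dict. It assumes the flat alternating [key1, value1, key2, value2, ...] layout used by yacs. The function name merge_from_list_sketch, the new_allowed flag as a function argument, and the config keys are hypothetical.

```python
def merge_from_list_sketch(cfg, cfg_list, new_allowed=False):
    """Apply a flat [key1, value1, key2, value2, ...] list to a nested dict."""
    if len(cfg_list) % 2 != 0:
        raise ValueError("cfg_list must contain key/value pairs")
    for full_key, value in zip(cfg_list[0::2], cfg_list[1::2]):
        *parents, leaf = full_key.split(".")
        node = cfg
        for part in parents:
            if part not in node:
                if not new_allowed:
                    raise KeyError(f"unknown config key: {full_key}")
                node[part] = {}
            node = node[part]
        if leaf not in node and not new_allowed:
            raise KeyError(f"unknown config key: {full_key}")
        node[leaf] = value
    return cfg


cfg = {"federate": {"method": "fedavg", "client_num": 5}, "data": {"type": "toy"}}
merge_from_list_sketch(cfg, ["federate.client_num", 10, "data.type", "femnist"])
```

Rejecting unknown keys by default catches typos in command-line overrides; allowing new keys is opt-in.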

merge_from_other_cfg(cfg_other, check_cfg=True)[source]

load configs from another cfg instance

Parameters
  • cfg_other – other cfg to be merged

  • check_cfg – whether to enable assert_cfg()

print_help(arg_name='')[source]

print help info for a specific given arg_name or for all arguments if not given arg_name

Parameters

arg_name – name of specific args

ready_for_run(check_cfg=True)[source]

Checks and cleans up the internal state of cfg and saves cfg.

Parameters

check_cfg – whether to enable assert_cfg()

register_cfg_check_fun(cfg_check_fun)[source]

Register a function that checks the configuration node.

Parameters

cfg_check_fun – function for validating the correctness of cfg.

federatedscope.core.configs.init_global_cfg(cfg)[source]

This function sets the default config value.

  1. Note that for an experiment, only part of the arguments will be used. The remaining unused arguments won’t affect anything, so feel free to register any argument in graphgym.contrib.config

  2. We support more than one level of configs, e.g., cfg.dataset.name

federatedscope.core.monitors

class federatedscope.core.monitors.EarlyStopper(patience=5, delta=0, improve_indicator_mode='best', the_larger_the_better=True)[source]

Track the history of a metric (e.g., validation loss) and check whether the (training) process should stop when the metric has not improved after a given patience.

Parameters
  • patience (int) – (Default: 5) How long to wait after last time the monitored metric improved. Note that the actual_checking_round = patience * cfg.eval.freq

  • delta (float) – (Default: 0) Minimum change in the monitored metric to indicate an improvement.

  • improve_indicator_mode (str) – Early stop when there is no improvement over the last patience rounds; one of ['mean', 'best']

__track_and_check_best(history_result)

Tracks the best result and checks whether the patience is exceeded.

Parameters

history_result – results of all evaluation round

Returns

whether stop

Return type

Bool

__track_and_check_dummy(new_result)

Dummy stopper, always return false

Parameters

new_result

Returns

False

track_and_check(new_result)[source]

Checks the new result and if it improves it returns True.

Parameters

new_result – new evaluation result

Returns

whether stop

Return type

Bool
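
The patience logic can be sketched as follows. This is a simplified illustration that covers only a 'best'-style mode and is not the class's actual implementation. The class name PatienceStopper is hypothetical, and treating patience=0 as "never stop" is an assumption.

```python
class PatienceStopper:
    """Stop once the metric has failed to improve for `patience` checks."""

    def __init__(self, patience=5, delta=0.0, the_larger_the_better=True):
        self.patience = patience
        self.delta = delta
        self.larger_better = the_larger_the_better
        self.best = None
        self.no_improve = 0

    def track_and_check(self, new_result):
        """Record new_result and return True when training should stop."""
        if self.best is None:
            improved = True
        elif self.larger_better:
            improved = new_result > self.best + self.delta
        else:
            improved = new_result < self.best - self.delta
        if improved:
            self.best = new_result
            self.no_improve = 0
        else:
            self.no_improve += 1
        # Assumption: a patience of 0 disables early stopping.
        return self.patience > 0 and self.no_improve >= self.patience


stopper = PatienceStopper(patience=2, the_larger_the_better=False)
val_losses = [0.9, 0.7, 0.71, 0.72, 0.5]
decisions = [stopper.track_and_check(loss) for loss in val_losses]
```

In this run the stop signal fires at the fourth check, after two consecutive evaluations without a lower validation loss.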

class federatedscope.core.monitors.MetricCalculator(eval_metric: Union[Set[str], List[str], str])[source]

Initializes the metric functions for the monitor. Use eval(ctx) to get evaluation results.

Parameters

eval_metric – set of metric names

_check_and_parse(ctx)[source]

Check the format of the prediction and labels

Parameters

ctx – context of trainer, see core.trainers.context

Returns

The ground truth labels; the prediction categories for classification task; the output of the model.

Return type

(y_true, y_pred, y_prob)

get_metric_funcs(eval_metric)[source]

Build metrics for evaluation.

Parameters

eval_metric – set of metric names

Returns

A metric calculator dict, such as {'loss': (eval_loss, False), 'acc': (eval_acc, True), ...}

Note

The built-in metrics, their source functions, and their the_larger_the_better flags are shown below:

Metric name    Source                                      The larger the better
loss           monitors.metric_calculator.eval_loss        False
avg_loss       monitors.metric_calculator.eval_avg_loss    False
total          monitors.metric_calculator.eval_total       False
correct        monitors.metric_calculator.eval_correct     True
acc            monitors.metric_calculator.eval_acc         True
ap             monitors.metric_calculator.eval_ap          True
f1             monitors.metric_calculator.eval_f1_score    True
roc_auc        monitors.metric_calculator.eval_roc_auc     True
rmse           monitors.metric_calculator.eval_rmse        False
mse            monitors.metric_calculator.eval_mse         False
loss_regular   monitors.metric_calculator.eval_regular     False
imp_ratio      monitors.metric_calculator.eval_imp_ratio   True
std            None                                        False
hits@{n}       monitors.metric_calculator.eval_hits        True

class federatedscope.core.monitors.Monitor(cfg, monitored_object=None)[source]

Provide monitoring functionalities such as formatting the evaluation results into diverse metrics. Besides prediction-related performance, the monitor can also track efficiency-related metrics for a worker.

Parameters
  • cfg – a cfg node object

  • monitored_object – object to be monitored

log_res_best

best ever seen results

outdir

output directory

use_wandb

whether use wandb

wandb_online_track

whether use wandb to track online

monitored_object

object to be monitored

metric_calculator

metric calculator, see core.monitors.metric_calculator

round_wise_update_key

key to decide which result of evaluation round is better

add_items_to_best_result(best_results, new_results, results_type)[source]

Add a new key: value item (results-type: new_results) to best_result

calc_model_metric(last_model, local_updated_models, rnd)[source]
Parameters
  • last_model (dict) – the state of last round.

  • local_updated_models (list) – each element is (data_size, model).

Returns

model_metric_dict

Return type

dict

compress_raw_res_file()[source]

Compress the raw res file to be written to disk.

convert_size(size_bytes)[source]

Convert bytes to human-readable size.
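
A minimal sketch of such a conversion, assuming binary (1024-based) units and single-letter suffixes; the exact output format of the method is not documented here, and the name convert_size_sketch is hypothetical.

```python
def convert_size_sketch(size_bytes):
    """Format a byte count with a binary-prefix suffix (K, M, G, ...)."""
    if size_bytes <= 0:
        return str(size_bytes)
    units = ("", "K", "M", "G", "T", "P", "E", "Z", "Y")
    value = float(size_bytes)
    index = 0
    # Divide by 1024 until the value fits the current unit.
    while value >= 1024 and index < len(units) - 1:
        value /= 1024
        index += 1
    return f"{round(value, 2)}{units[index]}"


sizes = [convert_size_sketch(n) for n in (0, 512, 1536, 1024 ** 2)]
```

Repeated division avoids the floating-point edge cases that a logarithm-based unit lookup can hit at exact powers of 1024.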

eval(ctx)[source]

Evaluates the given context with metric_calculator.

Parameters

ctx – context of trainer, see core.trainers.context

Returns

Evaluation results.

finish_fed_runner(fl_mode=None)[source]

Finish the Fed runner.

finish_fl()[source]

When FL finished, write system metrics to file.

format_eval_res(results, rnd, role=-1, forms=None, return_raw=False)[source]

Format the evaluation results from trainer.ctx.eval_results

Parameters
  • results (dict) – a dict to store the evaluation results {metric: value}

  • rnd (int|string) – FL round

  • role (int|string) – the output role

  • forms (list) – format type

  • return_raw (bool) – return either raw results, or other results

Returns

round_formatted_results, a formatted results with different forms and roles

Return type

dict

Note

Example of return value:

    {
        'Role': 'Server #',
        'Round': 200,
        'Results_weighted_avg': {
            'test_avg_loss': 0.58, 'test_acc': 0.67, 'test_correct': 3356,
            'test_loss': 2892, 'test_total': 5000
        },
        'Results_avg': {
            'test_avg_loss': 0.57, 'test_acc': 0.67, 'test_correct': 3356,
            'test_loss': 2892, 'test_total': 5000
        },
        'Results_fairness': {
            'test_total': 33.99, 'test_correct': 27.185,
            'test_avg_loss_std': 0.433551,
            'test_avg_loss_bottom_decile': 0.356503,
            'test_avg_loss_top_decile': 1.212492,
            'test_avg_loss_min': 0.198317, 'test_avg_loss_max': 3.603567,
            'test_avg_loss_bottom10%': 0.276681, 'test_avg_loss_top10%': 1.686649,
            'test_avg_loss_cos1': 0.8679, 'test_avg_loss_entropy': 5.1641,
            'test_loss_std': 13.686828, 'test_loss_bottom_decile': 11.8220,
            'test_loss_top_decile': 39.727236, 'test_loss_min': 7.337724,
            'test_loss_max': 100.899873, 'test_loss_bottom10%': 9.618685,
            'test_loss_top10%': 54.96769, 'test_loss_cos1': 0.880356,
            'test_loss_entropy': 5.175803, 'test_acc_std': 0.123823,
            'test_acc_bottom_decile': 0.676471, 'test_acc_top_decile': 0.916667,
            'test_acc_min': 0.071429, 'test_acc_max': 0.972973,
            'test_acc_bottom10%': 0.527482, 'test_acc_top10%': 0.94486,
            'test_acc_cos1': 0.988134, 'test_acc_entropy': 5.283755
        },
    }

global_converged()[source]

Calculate wall time and round when global convergence has been reached.

local_converged()[source]

Calculate wall time and round when local convergence has been reached.

merge_system_metrics_simulation_mode(file_io=True, from_global_monitors=False)[source]

Average the system metrics recorded in system_metrics.json by all workers

save_formatted_results(formatted_res, save_file_name='eval_results.log')[source]

Save formatted results to a file.

track_avg_flops(flops, sample_num=1)[source]

Update the average FLOPs for forwarding each data sample. For most models and tasks, the averaging is not needed because the input shape is fixed.

track_download_bytes(bytes)[source]

Track the number of bytes downloaded.

track_model_size(models)[source]

Calculate the total model size given the models held by the worker/trainer.

Parameters

models – torch.nn.Module or list of torch.nn.Module

track_upload_bytes(bytes)[source]

Track the number of bytes uploaded.

update_best_result(best_results, new_results, results_type)[source]

Update the best evaluation results. By default, the update is based on validation loss with round_wise_update_key="val_loss".

federatedscope.core.aggregators

class federatedscope.core.aggregators.Aggregator[source]

Abstract class of Aggregator.

abstract aggregate(agg_info)[source]

Aggregation function.

Parameters

agg_info – information to be aggregated.

class federatedscope.core.aggregators.AsynClientsAvgAggregator(model=None, device='cpu', config=None)[source]

The aggregator used in asynchronous training, which discounts stale model updates.

aggregate(agg_info)[source]

Perform aggregation.

Parameters

agg_info (dict) – the feedbacks from clients

Returns

the aggregated results

Return type

dict

discount_func(staleness)[source]

As an example, we discount a model update with staleness tau by (1.0 / ((1.0 + tau) ** factor)), which has been used in previous studies such as FedAsync (Asynchronous Federated Optimization) and FedBuff (Federated Learning with Buffered Asynchronous Aggregation).
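
The discount formula can be written directly. In this short sketch the function name discount and the default factor value are illustrative choices, not the library's configuration.

```python
def discount(staleness, factor=1.0):
    """Weight for an update that is `staleness` rounds old."""
    return 1.0 / ((1.0 + staleness) ** factor)


# Fresh, slightly stale, and very stale updates with factor 0.5.
weights = [discount(tau, factor=0.5) for tau in (0, 3, 8)]
```

A fresh update (tau = 0) keeps full weight, and the weight decays polynomially as staleness grows; a larger factor penalizes stale updates more strongly.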

class federatedscope.core.aggregators.BulyanAggregator(model=None, device='cpu', config=None)[source]

The implementation of Bulyan follows The Hidden Vulnerability of Distributed Learning in Byzantium [Mhamdi et al., 2018] (http://proceedings.mlr.press/v80/mhamdi18a/mhamdi18a.pdf)

It combines the MultiKrum aggregator and the trimmed-mean aggregator.

aggregate(agg_info)[source]

Perform aggregation with the Bulyan aggregation rule.

Parameters

agg_info (dict) – the feedbacks from clients

Returns

the aggregated results

Return type

dict

class federatedscope.core.aggregators.ClientsAvgAggregator(model=None, device='cpu', config=None)[source]

The implementation of vanilla FedAvg follows ‘Communication-efficient learning of deep networks from decentralized data’ [McMahan et al., 2017] http://proceedings.mlr.press/v54/mcmahan17a.html

aggregate(agg_info)[source]

Perform aggregation.

Parameters

agg_info (dict) – the feedbacks from clients

Returns

the aggregated results

Return type

dict

update(model_parameters)[source]
Parameters

model_parameters (dict) – PyTorch Module object’s state_dict.
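
The core of FedAvg aggregation is a sample-size-weighted average of the client models. The sketch below uses flat lists of floats in place of tensors and assumes each client's feedback is a (sample_size, model) pair. The function name fedavg and the parameter names are hypothetical.

```python
def fedavg(client_feedback):
    """Average client models, weighting each by its number of samples."""
    total = sum(sample_size for sample_size, _ in client_feedback)
    keys = client_feedback[0][1].keys()
    averaged = {}
    for key in keys:
        length = len(client_feedback[0][1][key])
        averaged[key] = [
            sum(size * model[key][i] for size, model in client_feedback) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical feedback: (sample_size, {param_name: flat list of floats}).
feedback = [
    (10, {"w": [1.0, 2.0], "b": [0.0]}),
    (30, {"w": [3.0, 6.0], "b": [4.0]}),
]
global_model = fedavg(feedback)
```

The client with 30 samples contributes three times the weight of the client with 10, so the result sits closer to its parameters.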

class federatedscope.core.aggregators.FedOptAggregator(config, model, device='cpu')[source]

The implementation of FedOpt follows Adaptive Federated Optimization [Reddi et al., 2021] (https://openreview.net/forum?id=LkFG3lB13U5)

aggregate(agg_info)[source]

Perform FedOpt aggregation.

class federatedscope.core.aggregators.KrumAggregator(model=None, device='cpu', config=None)[source]

The implementation of Krum/multi-Krum follows Machine learning with adversaries: Byzantine tolerant gradient descent [Blanchard P et al., 2017] (https://proceedings.neurips.cc/paper/2017/hash/f4b9ec30ad9f68f89b29639786cb62ef-Abstract.html)

aggregate(agg_info)[source]

Perform aggregation with the Krum aggregation rule.

Parameters

agg_info (dict) – the feedbacks from clients

Returns

the aggregated results

Return type

dict
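
For intuition, the Krum rule from the cited paper scores each update by the sum of squared distances to its n - f - 2 nearest neighbours (n updates, f assumed Byzantine) and selects the update with the lowest score. The sketch below works on flat lists of floats; the name krum_select and the argument byzantine_num are hypothetical, and it does not reproduce the aggregator's actual code.

```python
def krum_select(updates, byzantine_num):
    """Return the index of the update with the lowest Krum score."""
    n = len(updates)
    closest = n - byzantine_num - 2  # neighbours counted in each score
    if closest < 1:
        raise ValueError("need n > byzantine_num + 2 updates")

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    scores = []
    for i, u in enumerate(updates):
        dists = sorted(sq_dist(u, v) for j, v in enumerate(updates) if j != i)
        scores.append(sum(dists[:closest]))
    return min(range(n), key=scores.__getitem__)


# Four similar updates and one outlier (hypothetical flat update vectors).
updates = [[1.0, 1.0], [1.2, 1.0], [0.8, 1.0], [1.0, 1.5], [10.0, -10.0]]
chosen = krum_select(updates, byzantine_num=1)
```

The outlier is far from every other update, so its score is large and it is never selected; the most central update wins.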

class federatedscope.core.aggregators.MedianAggregator(model=None, device='cpu', config=None)[source]

The implementation of median follows Byzantine-robust distributed learning: Towards optimal statistical rates [Yin et al., 2018] (http://proceedings.mlr.press/v80/yin18a/yin18a.pdf)

It computes the coordinate-wise median of the updates received from clients.

The code is adapted from https://github.com/bladesteam/blades

aggregate(agg_info)[source]

Perform aggregation with the Median aggregation rule.

Parameters

agg_info (dict) – the feedbacks from clients

Returns

the aggregated results

Return type

dict
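
A coordinate-wise median can be sketched in a few lines when each update is a flat list of floats. The name coordinate_median and the sample values are illustrative.

```python
from statistics import median


def coordinate_median(updates):
    """Take the median of every coordinate across the client updates."""
    return [median(column) for column in zip(*updates)]


# Three honest clients and one outlier (hypothetical flat update vectors).
updates = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [100.0, -100.0]]
robust = coordinate_median(updates)
```

Unlike a mean, the median of each coordinate is barely moved by the single extreme update.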

class federatedscope.core.aggregators.NoCommunicationAggregator(model=None, device='cpu', config=None)[source]

Clients do not communicate. Each client works locally.

aggregate(agg_info)[source]

Aggregation function.

Parameters

agg_info – information to be aggregated.

update(model_parameters)[source]
Parameters

model_parameters (dict) – PyTorch Module object’s state_dict.

class federatedscope.core.aggregators.NormboundingAggregator(model=None, device='cpu', config=None)[source]

The server clips each update to reduce the negative impact of malicious updates.

aggregate(agg_info)[source]

Perform aggregation with the normbounding aggregation rule.

Parameters

agg_info (dict) – the feedbacks from clients

Returns

the aggregated results

Return type

dict
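
One way to picture norm bounding: scale down any update whose L2 norm exceeds a bound, then average. The sketch below is a simplification under stated assumptions: it uses an unweighted average, flat lists of floats, and the hypothetical names clip_update, normbounded_average, and norm_bound.

```python
import math


def clip_update(update, norm_bound):
    """Scale the update down so its L2 norm does not exceed norm_bound."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, norm_bound / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def normbounded_average(updates, norm_bound):
    """Clip every client update, then average them coordinate-wise."""
    clipped = [clip_update(u, norm_bound) for u in updates]
    return [sum(column) / len(clipped) for column in zip(*clipped)]


updates = [[3.0, 4.0], [0.3, 0.4]]  # L2 norms 5.0 and 0.5
aggregated = normbounded_average(updates, norm_bound=1.0)
```

The large update keeps its direction but is shrunk to norm 1.0, while the small update passes through unchanged.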

class federatedscope.core.aggregators.OnlineClientsAvgAggregator(model=None, device='cpu', src_device='cpu', config=None)[source]

Implementation of online aggregation of FedAvg.

aggregate(agg_info)[source]

Returns the aggregated value

inc(content)[source]

Increment the model weight by the given content.

reset()[source]

Reset the state of the model to its initial state
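
Online aggregation keeps a running weighted mean so that client models need not be stored until the end of the round. The sketch below mirrors the inc/reset/aggregate method names on a hypothetical OnlineAverager class that works on flat lists; the (sample_size, model) layout of content is an assumption.

```python
class OnlineAverager:
    """Keep a running sample-weighted mean without storing client models."""

    def __init__(self, dim):
        self.dim = dim
        self.reset()

    def reset(self):
        """Return to the initial state: zero weights and no samples seen."""
        self.mean = [0.0] * self.dim
        self.cnt = 0

    def inc(self, content):
        """Fold one (sample_size, model) pair into the running mean."""
        sample_size, model = content
        total = self.cnt + sample_size
        self.mean = [
            (self.cnt * m + sample_size * x) / total
            for m, x in zip(self.mean, model)
        ]
        self.cnt = total

    def aggregate(self):
        """Return the current aggregated value."""
        return list(self.mean)


agg = OnlineAverager(dim=2)
agg.inc((10, [1.0, 2.0]))
agg.inc((30, [3.0, 6.0]))
result = agg.aggregate()
```

The result equals the batch weighted average of the same two models, but memory use stays constant in the number of clients.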

class federatedscope.core.aggregators.ServerClientsInterpolateAggregator(model=None, device='cpu', config=None, beta=1.0)[source]

conduct aggregation by interpolating global model from server and local models from clients

aggregate(agg_info)[source]

Returns the aggregated value

class federatedscope.core.aggregators.TrimmedmeanAggregator(model=None, device='cpu', config=None)[source]

The implementation of trimmed mean follows Byzantine-robust distributed learning: Towards optimal statistical rates [Yin et al., 2018] (http://proceedings.mlr.press/v80/yin18a/yin18a.pdf)

The code is adapted from https://github.com/bladesteam/blades

aggregate(agg_info)[source]

Perform aggregation with the trimmedmean aggregation rule.

Parameters

agg_info (dict) – the feedbacks from clients

Returns

the aggregated results

Return type

dict
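
A coordinate-wise trimmed mean drops the largest and smallest values of each coordinate before averaging. In this sketch the name trimmed_mean and the trim_ratio argument are hypothetical, and updates are flat lists of floats.

```python
def trimmed_mean(updates, trim_ratio=0.25):
    """Average each coordinate after dropping the extremes on both sides."""
    if not 0 <= trim_ratio < 0.5:
        raise ValueError("trim_ratio must be in [0, 0.5)")
    k = int(len(updates) * trim_ratio)  # values cut from each end
    result = []
    for column in zip(*updates):
        kept = sorted(column)[k:len(column) - k]
        result.append(sum(kept) / len(kept))
    return result


updates = [[1.0, 2.0], [3.0, 2.0], [2.0, 2.0], [100.0, -50.0]]
robust = trimmed_mean(updates, trim_ratio=0.25)
```

With four clients and a ratio of 0.25, one value is removed from each end of every coordinate, which discards the outlier's contribution.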

federatedscope.core.auxiliaries

federatedscope.core.auxiliaries.aggregator_builder.get_aggregator()[source]

This function builds an aggregator, which is a protocol for aggregating all clients’ model(s).

Parameters
  • method – key to determine which aggregator to use

  • model – model to be aggregated

  • device – where to aggregate models (cpu or gpu)

  • online – True or False, whether to use the online aggregator.

  • config – configurations for FL, see federatedscope.core.configs

Returns

An instance of aggregator (see core.aggregators for details)

Note

The key-value pairs of method and aggregators:

Method        Aggregator
tensorflow    cross_backends.FedAvgAggregator
local         core.aggregators.NoCommunicationAggregator
global        core.aggregators.NoCommunicationAggregator
fedavg        core.aggregators.OnlineClientsAvgAggregator or core.aggregators.AsynClientsAvgAggregator or ClientsAvgAggregator
pfedme        core.aggregators.ServerClientsInterpolateAggregator
ditto         core.aggregators.OnlineClientsAvgAggregator or core.aggregators.AsynClientsAvgAggregator or ClientsAvgAggregator
fedsageplus   core.aggregators.OnlineClientsAvgAggregator or core.aggregators.AsynClientsAvgAggregator or ClientsAvgAggregator
gcflplus      core.aggregators.OnlineClientsAvgAggregator or core.aggregators.AsynClientsAvgAggregator or ClientsAvgAggregator
fedopt        core.aggregators.FedOptAggregator

federatedscope.core.auxiliaries.criterion_builder.get_criterion()[source]

This function builds an instance of a loss function. The criterion_type is chosen from the loss functions listed at https://pytorch.org/docs/stable/nn.html#loss-functions.

Parameters
  • criterion_type – loss function type

  • device – move to device (cpu or gpu)

Returns

An instance of the chosen loss function.

federatedscope.core.auxiliaries.data_builder.get_data()[source]

Instantiate the data and update the configuration accordingly if necessary.

Parameters
  • config – a cfg node object

  • client_cfgs – dict of client-specific cfg node object

Returns

The dataset object and the updated configuration.

Note

The available values of data.type are shown below:

Data type                       Domain
FEMNIST                         CV
Celeba                          CV
${DNAME}@torchvision            CV
Shakespeare                     NLP
SubReddit                       NLP
Twitter (Sentiment140)          NLP
${DNAME}@torchtext              NLP
${DNAME}@huggingface_datasets   NLP
Cora                            Graph (node-level)
CiteSeer                        Graph (node-level)
PubMed                          Graph (node-level)
DBLP_conf                       Graph (node-level)
DBLP_org                        Graph (node-level)
csbm                            Graph (node-level)
Epinions                        Graph (link-level)
Ciao                            Graph (link-level)
FB15k                           Graph (link-level)
FB15k-237                       Graph (link-level)
WN18                            Graph (link-level)
MUTAG                           Graph (graph-level)
BZR                             Graph (graph-level)
COX2                            Graph (graph-level)
DHFR                            Graph (graph-level)
PTC_MR                          Graph (graph-level)
AIDS                            Graph (graph-level)
NCI1                            Graph (graph-level)
ENZYMES                         Graph (graph-level)
DD                              Graph (graph-level)
PROTEINS                        Graph (graph-level)
COLLAB                          Graph (graph-level)
IMDB-BINARY                     Graph (graph-level)
IMDB-MULTI                      Graph (graph-level)
REDDIT-BINARY                   Graph (graph-level)
HIV                             Graph (graph-level)
ESOL                            Graph (graph-level)
FREESOLV                        Graph (graph-level)
LIPO                            Graph (graph-level)
PCBA                            Graph (graph-level)
MUV                             Graph (graph-level)
BACE                            Graph (graph-level)
BBBP                            Graph (graph-level)
TOX21                           Graph (graph-level)
TOXCAST                         Graph (graph-level)
SIDER                           Graph (graph-level)
CLINTOX                         Graph (graph-level)
graph_multi_domain_mol          Graph (graph-level)
graph_multi_domain_small        Graph (graph-level)
graph_multi_domain_biochem      Graph (graph-level)
cikmcup                         Graph (graph-level)
toy                             Tabular
synthetic                       Tabular
quadratic                       Tabular
${DNAME}openml                  Tabular
vertical_fl_data                Tabular (vertical)
VFLMovieLens1M                  Recommendation
VFLMovieLens10M                 Recommendation
HFLMovieLens1M                  Recommendation
HFLMovieLens10M                 Recommendation
VFLNetflix                      Recommendation
HFLNetflix                      Recommendation

federatedscope.core.auxiliaries.dataloader_builder.get_dataloader()[source]

Instantiate a DataLoader via config.

Parameters
  • dataset – dataset from which to load the data.

  • config – configs containing batch_size, shuffle, etc.

  • split – current split (default: train). If split is test, cfg.dataloader.shuffle is set to False, and in PyG the test split uses NeighborSampler by default.

Returns

Instance of specific DataLoader configured by config.

Note

The key-value pairs of dataloader.type and DataLoader:

dataloader.type   Source
raw               No DataLoader
base              torch.utils.data.DataLoader
pyg               torch_geometric.loader.DataLoader
graphsaint-rw     torch_geometric.loader.GraphSAINTRandomWalkSampler
neighbor          torch_geometric.loader.NeighborSampler
mf                federatedscope.mf.dataloader.MFDataLoader
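The rule that shuffling is disabled for the test split can be illustrated without any framework. The sketch below is a stand-in, not the library's loader: make_batches is a hypothetical function that batches a plain Python list, and its shuffle argument plays the role of cfg.dataloader.shuffle.

```python
import random


def make_batches(dataset, batch_size, shuffle, split="train", seed=0):
    """Return a list of batches; shuffling is ignored for the test split."""
    if split == "test":
        # Evaluation order stays fixed, whatever the configuration says.
        shuffle = False
    indices = list(range(len(dataset)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    return [
        [dataset[i] for i in indices[start:start + batch_size]]
        for start in range(0, len(indices), batch_size)
    ]


test_batches = make_batches(list(range(5)), batch_size=2, shuffle=True, split="test")
train_batches = make_batches(list(range(5)), batch_size=2, shuffle=True, split="train")
```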

federatedscope.core.auxiliaries.metric_builder.get_metric()[source]

This function returns a dict in which each key is a metric name and each value is a pair: the function that calculates the metric, and a bool indicating whether a larger value is better.

Parameters

types – list of metric names

Returns

A metric calculator dict, such as {'loss': (eval_loss, False), 'acc': (eval_acc, True), ...}

Note

The built-in metrics, their related functions, and their the_larger_the_better flags are shown below:

Metric name    Source                                      The larger the better
loss           monitors.metric_calculator.eval_loss        False
avg_loss       monitors.metric_calculator.eval_avg_loss    False
total          monitors.metric_calculator.eval_total       False
correct        monitors.metric_calculator.eval_correct     True
acc            monitors.metric_calculator.eval_acc         True
ap             monitors.metric_calculator.eval_ap          True
f1             monitors.metric_calculator.eval_f1_score    True
roc_auc        monitors.metric_calculator.eval_roc_auc     True
rmse           monitors.metric_calculator.eval_rmse        False
mse            monitors.metric_calculator.eval_mse         False
loss_regular   monitors.metric_calculator.eval_regular     False
imp_ratio      monitors.metric_calculator.eval_imp_ratio   True
std            None                                        False
hits@{n}       monitors.metric_calculator.eval_hits        True
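The shape of the returned dict, {name: (function, the_larger_the_better)}, can be mimicked with a small stand-alone sketch. Everything named here (toy_acc, toy_mse, TOY_METRICS, get_metric_sketch, is_improved) is hypothetical, and the toy functions take plain label lists rather than whatever arguments the built-in eval_* functions expect.

```python
def toy_acc(y_true, y_pred):
    """Fraction of positions where prediction and label agree."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def toy_mse(y_true, y_pred):
    """Mean squared error between labels and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# name -> (metric function, the_larger_the_better)
TOY_METRICS = {
    "acc": (toy_acc, True),
    "mse": (toy_mse, False),
}


def get_metric_sketch(types):
    """Keep only the requested metrics that are known."""
    return {name: TOY_METRICS[name] for name in types if name in TOY_METRICS}


def is_improved(name, new, best, metrics):
    """Use the bool flag to decide which direction counts as better."""
    _, larger_is_better = metrics[name]
    return new > best if larger_is_better else new < best


metrics = get_metric_sketch(["acc", "mse"])
acc_fn, _ = metrics["acc"]
acc_value = acc_fn([1, 0, 1, 1], [1, 0, 0, 1])
```

Storing the direction next to the function lets code such as early stopping or best-model tracking compare results without hard-coding which metrics are losses.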

federatedscope.core.auxiliaries.model_builder.get_model()[source]

This function builds an instance of model to be trained.

Parameters
  • model_config – cfg.model, a submodule of cfg

  • local_data – the model to be instantiated is responsible for the given data

  • backend – chosen from torch and tensorflow

Returns

the instantiated model.

Return type

model (torch.nn.Module)

Note

The key-value pairs of built-in model and source are shown below:

Model type                       Source
lr                               core.lr.LogisticRegression or cross_backends.LogisticRegression
mlp                              core.mlp.MLP
quadratic                        tabular.model.QuadraticModel
convnet2, convnet5, vgg11        cv.model.get_cnn()
lstm                             nlp.model.get_rnn()
{}@transformers                  nlp.model.get_transformer()
gcn, sage, gpr, gat, gin, mpnn   gfl.model.get_gnn()
vmfnet, hmfnet                   mf.model.model_builder.get_mfnet()

federatedscope.core.auxiliaries.optimizer_builder.get_optimizer()[source]

This function returns an instantiated optimizer to optimize the model.

Parameters
Returns

An instantiated optimizer

federatedscope.core.auxiliaries.regularizer_builder.get_regularizer()[source]

This function builds an instance of regularizer to regularize training.

Parameters

reg_type – type of regularizer; see https://pytorch.org/docs/stable/optim.html for details

Returns

An instantiated regularizer.

federatedscope.core.auxiliaries.runner_builder.get_runner()[source]

Instantiate a runner based on a configuration file

Parameters
  • server_class – server class

  • client_class – client class

  • config – configurations for FL, see federatedscope.core.configs

  • client_configs – client-specific configurations

Returns

An instantiated FedRunner to run the FL course.

Note

The key-value pairs of built-in runner and source are shown below:

Mode                          Source
standalone                    core.fed_runner.StandaloneRunner
distributed                   core.fed_runner.DistributedRunner
standalone (process_num>1)    core.auxiliaries.parallel_runner.StandaloneMultiGPURunner
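The mapping in the table can be read as a simple dispatch on the mode and on process_num. The sketch below only returns class names as strings and does not import the library; pick_runner_name is a hypothetical helper, and how the real builder reads these two values from the configuration is not shown here.

```python
def pick_runner_name(mode, process_num=1):
    """Return the runner class name that the table associates with a mode."""
    mode = mode.lower()
    if mode == "standalone":
        # More than one process selects the multi-GPU standalone runner.
        if process_num > 1:
            return "StandaloneMultiGPURunner"
        return "StandaloneRunner"
    if mode == "distributed":
        return "DistributedRunner"
    raise ValueError(f"unknown mode: {mode}")


single = pick_runner_name("standalone")
multi = pick_runner_name("standalone", process_num=4)
```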

federatedscope.core.auxiliaries.sampler_builder.get_sampler()[source]

This function builds a sampler for sampling clients who should join the aggregation per communication round.

Parameters
  • sample_strategy – Sampling strategy of sampler

  • client_num – total number of clients joining the FL course

  • client_info – client information

  • bins – size of bins for group sampler

Returns

An instantiated Sampler to sample during aggregation.

Note

The key-value pairs of built-in sampler and source are shown below:

Sampling strategy   Source
uniform             core.sampler.UniformSampler
group               core.sampler.GroupSampler
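Uniform client sampling can be sketched in a few lines. This is an illustration, not core.sampler.UniformSampler: the class name UniformSamplerSketch, the seed argument, the choice of client IDs 1 to client_num, and sampling without replacement are all assumptions.

```python
import random


class UniformSamplerSketch:
    """Pick clients uniformly at random, without replacement."""

    def __init__(self, client_num, seed=0):
        # Client IDs are assumed to run from 1 to client_num.
        self.client_ids = list(range(1, client_num + 1))
        self.rng = random.Random(seed)

    def sample(self, size):
        # Every client has the same chance of joining this round.
        return self.rng.sample(self.client_ids, size)


picked = UniformSamplerSketch(client_num=10).sample(size=3)
```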

federatedscope.core.auxiliaries.scheduler_builder.get_scheduler()[source]

This function builds an instance of scheduler.

Parameters
  • optimizer – optimizer to be scheduled

  • type – type of scheduler

  • **kwargs – kwargs dict

Returns

An instantiated scheduler.

Note

Please follow contrib.scheduler.example to implement your own scheduler.

federatedscope.core.auxiliaries.splitter_builder.get_splitter()[source]

This function builds a splitter to generate simulated federated datasets from a non-FL dataset.

Parameters

config – configurations for FL, see federatedscope.core.configs

Returns

An instance of splitter (see core.splitters for details)

Note

The key-value pairs of cfg.data.splitter and domain:

Splitter type   Domain
lda             Generic
iid             Generic
louvain         Graph (node-level)
random          Graph (node-level)
rel_type        Graph (link-level)
scaffold        Molecular
scaffold_lda    Molecular
rand_chunk      Graph (graph-level)
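To make the idea of a splitter concrete, the sketch below shows one simple way an IID split could work: shuffle the sample indices and deal them out to the clients in turn. It is not the library's iid splitter; iid_split and its seed argument are hypothetical, and the dataset is a plain list.

```python
import random


def iid_split(dataset, client_num, seed=0):
    """Shuffle the indices and deal them out round-robin to the clients."""
    indices = list(range(len(dataset)))
    random.Random(seed).shuffle(indices)
    return [
        [dataset[i] for i in indices[c::client_num]]
        for c in range(client_num)
    ]


samples = list(range(10))
parts = iid_split(samples, client_num=3)
```

Each sample lands on exactly one client, and the client datasets differ in size by at most one sample.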

federatedscope.core.auxiliaries.trainer_builder.get_trainer()[source]

This function builds an instance of trainer.

Parameters
  • model – model used in FL course

  • data – data used in FL course

  • device – where to train model (cpu or gpu)

  • config – configurations for FL, see federatedscope.core.configs

  • only_for_eval – True or False; if True, the train routine will be removed from this trainer

  • is_attacker – True or False, determines whether this client is an attacker

  • monitor – an instance of federatedscope.core.monitors.Monitor to observe the evaluation and system metrics

Returns

An instance of trainer.

Note

The key-value pairs of cfg.trainer.type and trainers:

Trainer Type             Source
general                  core.trainers.GeneralTorchTrainer and core.trainers.GeneralTFTrainer
cvtrainer                cv.trainer.trainer.CVTrainer
nlptrainer               nlp.trainer.trainer.NLPTrainer
graphminibatch_trainer   gfl.trainer.graphtrainer.GraphMiniBatchTrainer
linkfullbatch_trainer    gfl.trainer.linktrainer.LinkFullBatchTrainer
linkminibatch_trainer    gfl.trainer.linktrainer.LinkMiniBatchTrainer
nodefullbatch_trainer    gfl.trainer.nodetrainer.NodeFullBatchTrainer
nodeminibatch_trainer    gfl.trainer.nodetrainer.NodeMiniBatchTrainer
flitplustrainer          gfl.flitplus.trainer.FLITPlusTrainer
flittrainer              gfl.flitplus.trainer.FLITTrainer
fedvattrainer            gfl.flitplus.trainer.FedVATTrainer
fedfocaltrainer          gfl.flitplus.trainer.FedFocalTrainer
mftrainer                federatedscope.mf.trainer.MFTrainer
mytorchtrainer           contrib.trainer.torch_example.MyTorchTrainer

Wrapper functions are shown below:

Wrapper Functions   Source
nbafl               core.trainers.wrap_nbafl_trainer
sgdmf               mf.trainer.wrap_MFTrainer
pfedme              core.trainers.wrap_pFedMeTrainer
ditto               core.trainers.wrap_DittoTrainer
fedem               core.trainers.FedEMTrainer
fedprox             core.trainers.wrap_fedprox_trainer
attack              attack.trainer.wrap_benignTrainer and attack.auxiliary.attack_trainer_builder.wrap_attacker_trainer
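The general wrapper pattern, a function that takes an existing trainer, adds behaviour to it, and returns it, can be sketched as follows. The sketch is generic and does not reproduce any of the wrappers listed above: ToyTrainer, its hooks list, and the two wrap_* functions are hypothetical.

```python
class ToyTrainer:
    """Minimal trainer that runs extra hooks after its own training step."""

    def __init__(self):
        self.hooks = []

    def train(self):
        log = ["train"]
        for hook in self.hooks:
            log.append(hook())
        return log


def wrap_with_proximal_term(trainer):
    # Add behaviour to an existing trainer and hand the same object back.
    trainer.hooks.append(lambda: "add proximal term")
    return trainer


def wrap_with_noise(trainer):
    trainer.hooks.append(lambda: "inject noise")
    return trainer


# Wrappers compose: each one extends the trainer returned by the previous one.
trainer = wrap_with_noise(wrap_with_proximal_term(ToyTrainer()))
steps = trainer.train()
```

Because a wrapper returns a trainer, several algorithm-specific behaviours can be layered on top of one base trainer instead of writing a subclass for each combination.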

federatedscope.core.auxiliaries.transform_builder.get_transform()[source]

This function builds the transforms to apply to a dataset.

Parameters
  • config – CN from federatedscope/core/configs/config.py

  • package – one of ['torchvision', 'torch_geometric', 'torchtext', 'torchaudio']

Returns

Dict of transform functions.

federatedscope.core.auxiliaries.worker_builder.get_client_cls()[source]

This function returns a client class.

Parameters

cfg – configurations for FL, see federatedscope.core.configs

Returns

A client class decided by cfg.

Note

The key-value pairs of client type and source:

Client type   Source
local         core.workers.Client
fedavg        core.workers.Client
pfedme        core.workers.Client
ditto         core.workers.Client
fedex         autotune.fedex.FedExClient
vfl           vertical_fl.worker.vFLClient
fedsageplus   gfl.fedsageplus.worker.FedSagePlusClient
gcflplus      gfl.gcflplus.worker.GCFLPlusClient
gradascent    attack.worker_as_attacker.active_client

federatedscope.core.auxiliaries.worker_builder.get_server_cls()[source]

This function returns a server class.

Parameters

cfg – configurations for FL, see federatedscope.core.configs

Returns

A server class decided by cfg.

Note

The key-value pairs of server type and source:

Server type   Source
local         core.workers.Server
fedavg        core.workers.Server
pfedme        core.workers.Server
ditto         core.workers.Server
fedex         autotune.fedex.FedExServer
vfl           vertical_fl.worker.vFLServer
fedsageplus   gfl.fedsageplus.worker.FedSagePlusServer
gcflplus      gfl.gcflplus.worker.GCFLPlusServer
attack        attack.worker_as_attacker.server_attacker.PassiveServer and attack.worker_as_attacker.server_attacker.PassivePIAServer
backdoor      attack.worker_as_attacker.server_attacker.BackdoorServer