Core Module References

federatedscope.core.configs

class federatedscope.core.configs.CN(init_dict=None, key_list=None, new_allowed=False)[source]

An extended configuration system based on [yacs](https://github.com/rbgirshick/yacs). The two-level tree structure consists of several internal dict-like containers to allow simple key-value access and management.

assert_cfg()[source]

Check the validity of the configuration instance.

clean_unused_sub_cfgs()[source]

Clean the unused secondary-level CfgNode entries, i.e., those whose .use attribute is False.

freeze(inform=True, save=True)[source]
1. make the cfg attributes immutable;
2. save the frozen config into "self.outdir/config.yaml" for better reproducibility;
3. if self.wandb.use=True, update the frozen config.

merge_from_file(cfg_filename)[source]

Load configs from a yaml file, another cfg instance, or a list that stores the keys and values.

Parameters

cfg_filename (string) –

merge_from_list(cfg_list)[source]

Load configs from a list that stores the keys and values. This modifies merge_from_list in yacs' config.py to allow adding new keys if is_new_allowed() returns True.

Parameters

cfg_list (list) –

merge_from_list_yacs(cfg_list)[source]

Merge config (keys, values) in a list (e.g., from command line) into this CfgNode. For example, cfg_list = [‘FOO.BAR’, 0.5].

merge_from_other_cfg(cfg_other)[source]

Load configs from another cfg instance.

Parameters

cfg_other (CN) –

federatedscope.core.configs.init_global_cfg(cfg)[source]

This function sets the default config values.
1. Note that for an experiment, only part of the arguments will be used; the remaining unused arguments won't affect anything, so feel free to register any argument in graphgym.contrib.config.
2. We support at most two levels of configs, e.g., cfg.dataset.name.

Returns

the configuration used by the experiment.
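
A minimal usage sketch of the configuration flow; the yaml file name and the option keys are placeholders, and the import path of global_cfg is an assumption that may differ across versions:

```python
from federatedscope.core.configs.config import global_cfg

# Clone the global defaults so that they stay untouched.
cfg = global_cfg.clone()

# Merge options from a yaml file and from a command-line style list
# (file name and keys below are illustrative only).
cfg.merge_from_file("my_fl_config.yaml")
cfg.merge_from_list(["federate.total_round_num", 50, "train.optimizer.lr", 0.01])

# Validate the configuration and freeze it; freeze() also dumps the
# frozen config into cfg.outdir/config.yaml for reproducibility.
cfg.assert_cfg()
cfg.freeze()
```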

federatedscope.core.monitors

class federatedscope.core.monitors.EarlyStopper(patience=5, delta=0, improve_indicator_mode='best', the_smaller_the_better=True)[source]

Track the history of a metric (e.g., validation loss) and check whether the (training) process should be stopped if the metric doesn't improve within the given patience.
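
A hedged sketch of driving the early stopper round by round; the per-step method name track_and_check and the loss values are assumptions used only for illustration:

```python
from federatedscope.core.monitors import EarlyStopper

# Stop when the validation loss has not improved for 5 consecutive checks.
stopper = EarlyStopper(patience=5, delta=0,
                       improve_indicator_mode='best',
                       the_smaller_the_better=True)

val_losses = [0.9, 0.7, 0.65, 0.64, 0.64, 0.64, 0.64, 0.64, 0.64]  # illustrative
for rnd, val_loss in enumerate(val_losses):
    # track_and_check is assumed to record the new value and return True
    # once the patience is exhausted.
    if stopper.track_and_check(val_loss):
        print(f"Early stopping triggered at round {rnd}")
        break
```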

class federatedscope.core.monitors.Monitor(cfg, monitored_object=None)[source]

Provide monitoring functionalities such as formatting the evaluation results into diverse metrics. Besides prediction-related performance, the monitor can also track efficiency-related metrics for a worker.

calc_blocal_dissim(last_model, local_updated_models)[source]
Parameters
  • last_model (dict) – the model state of the last round.

  • local_updated_models (list) – the locally updated models received from the clients in the current round.

Returns

the measurements proposed in "Federated Optimization in Heterogeneous Networks" (Tian Li, Anit Kumar Sahu, Manzil Zaheer, et al.).

Return type

b_local_dissimilarity (dict)

format_eval_res(results, rnd, role=-1, forms=None, return_raw=False)[source]

Format the evaluation results from trainer.ctx.eval_results.

Parameters
  • results (dict) – a dict to store the evaluation results {metric: value}

  • rnd (int|string) – FL round

  • role (int|string) – the output role

  • forms (list) – format type

  • return_raw (bool) – whether to return the raw results or the formatted results

Returns

a formatted results dict with different forms and roles, e.g.,

{
  'Role': 'Server #',
  'Round': 200,
  'Results_weighted_avg': {
    'test_avg_loss': 0.58, 'test_acc': 0.67, 'test_correct': 3356,
    'test_loss': 2892, 'test_total': 5000},
  'Results_avg': {
    'test_avg_loss': 0.57, 'test_acc': 0.67, 'test_correct': 3356,
    'test_loss': 2892, 'test_total': 5000},
  'Results_fairness': {
    'test_correct': 3356, 'test_total': 5000,
    'test_avg_loss_std': 0.04, 'test_avg_loss_bottom_decile': 0.52,
    'test_avg_loss_top_decile': 0.64,
    'test_acc_std': 0.06, 'test_acc_bottom_decile': 0.60,
    'test_acc_top_decile': 0.75,
    'test_loss_std': 214.17, 'test_loss_bottom_decile': 2644.64,
    'test_loss_top_decile': 3241.23}
}

Return type

round_formatted_results (dict)
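
A hedged sketch of formatting raw results; the metric names, values, and the 'raw' form are illustrative, and the Monitor is constructed from a cloned default config only for demonstration:

```python
from federatedscope.core.configs.config import global_cfg
from federatedscope.core.monitors import Monitor

cfg = global_cfg.clone()
monitor = Monitor(cfg, monitored_object=None)

# Raw results as they might appear in trainer.ctx.eval_results.
raw_results = {'test_avg_loss': 0.58, 'test_acc': 0.67, 'test_total': 5000}

formatted = monitor.format_eval_res(results=raw_results,
                                    rnd=200,
                                    role='Client #3',
                                    forms=['raw'])
print(formatted)
```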

merge_system_metrics_simulation_mode(file_io=True, from_global_monitors=False)[source]

Average the system metrics recorded in "system_metrics.json" over all workers.

track_avg_flops(flops, sample_num=1)[source]

Update the average FLOPs for forwarding each data sample. For most models and tasks, the averaging is not needed as the input shape is fixed.

Parameters
  • flops – the FLOPs consumed by the tracked forward step(s)

  • sample_num – the number of data samples processed in the tracked step(s)

track_model_size(models)[source]

Calculate the total model size given the models held by the worker/trainer.

Parameters

models – torch.nn.Module or list of torch.nn.Module


update_best_result(best_results, new_results, results_type, round_wise_update_key='val_loss')[source]

Update the best evaluation results. By default, the update is based on the validation loss with round_wise_update_key="val_loss".

federatedscope.core.fed_runner

class federatedscope.core.fed_runner.FedRunner(data, server_class=<class 'federatedscope.core.worker.server.Server'>, client_class=<class 'federatedscope.core.worker.client.Client'>, config=None, client_config=None)[source]

This class is used to construct an FL course, which includes _set_up and run.

Parameters
  • data – The data used in the FL courses, which are formatted as {'ID': data} for the standalone mode. More details can be found in federatedscope.core.auxiliaries.data_builder.

  • server_class – The server class used for instantiating a (customized) server.

  • client_class – The client class used for instantiating a (customized) client.

  • config – The configurations of the FL course.

  • client_config – The clients' configurations.

run()[source]

To run an FL course, which is called after the server/client has been set up. For the standalone mode, a shared message queue will be set up to simulate receiving messages.
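
A minimal end-to-end sketch of constructing and running an FL course in standalone mode; the yaml path is a placeholder, and the data-builder call (get_data returning the data dict plus a modified config) is an assumption based on federatedscope.core.auxiliaries.data_builder:

```python
from federatedscope.core.configs.config import global_cfg
from federatedscope.core.auxiliaries.data_builder import get_data
from federatedscope.core.fed_runner import FedRunner

cfg = global_cfg.clone()
cfg.merge_from_file("my_fl_config.yaml")  # placeholder path

# Build the data; in standalone mode it is a dict {client_ID: data}.
data, modified_cfg = get_data(config=cfg.clone())
cfg.merge_from_other_cfg(modified_cfg)

# Use the default Server/Client classes; pass customized classes if needed.
runner = FedRunner(data=data, config=cfg.clone())
runner.run()
```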

federatedscope.core.worker

class federatedscope.core.worker.Client(ID=-1, server_id=None, state=-1, config=None, data=None, model=None, device='cpu', strategy=None, is_unseen_client=False, *args, **kwargs)[source]

The Client class, which describes the behaviors of a client in an FL course. The behaviors are described by the handling functions (named callback_funcs_for_xxx).

Parameters
  • ID – The unique ID of the client, which is assigned by the server when joining the FL course

  • server_id – (Default) 0

  • state – The training round

  • config – The configuration

  • data – The data owned by the client

  • model – The model maintained locally

  • device – The device to run local training and evaluation

  • strategy – redundant attribute

callback_funcs_for_address(message: federatedscope.core.message.Message)[source]

The handling function for receiving other clients’ IP addresses, which is used for constructing a complex topology

Parameters

message – The received message

callback_funcs_for_assign_id(message: federatedscope.core.message.Message)[source]

The handling function for receiving the client_ID assigned by the server (during the joining process), which is used in the distributed mode.

Parameters

message – The received message

callback_funcs_for_converged(message: federatedscope.core.message.Message)[source]

The handling function for receiving the signal that the FL course converged

Parameters

message – The received message

callback_funcs_for_evaluate(message: federatedscope.core.message.Message)[source]

The handling function for receiving the request of evaluating

Parameters

message – The received message

callback_funcs_for_finish(message: federatedscope.core.message.Message)[source]

The handling function for receiving the signal of finishing the FL course.

Parameters

message – The received message

callback_funcs_for_join_in_info(message: federatedscope.core.message.Message)[source]

The handling function for receiving the request of join in information (such as batch_size, num_of_samples) during the joining process.

Parameters

message – The received message

callback_funcs_for_model_para(message: federatedscope.core.message.Message)[source]

The handling function for receiving model parameters, which triggers the local training process. This handling function is widely used in various FL courses.

Parameters
message – The received message, which includes sender, receiver, state, and content. More detail can be found in federatedscope.core.message

join_in()[source]

To send ‘join_in’ message to the server for joining in the FL course.

register_handlers(msg_type, callback_func)[source]

To bind a message type with a handling function.

Parameters
  • msg_type (str) – The defined message type

  • callback_func – The handling function to handle the received message
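
A hedged sketch of binding a custom message type to a handling function; the message type 'greeting' and the handler body are purely illustrative, and client is assumed to be an already-instantiated Client:

```python
def callback_funcs_for_greeting(message):
    # `message` follows federatedscope.core.message.Message and carries
    # sender, receiver, state, and content.
    print(f"Received greeting from sender {message.sender}: {message.content}")

# Bind the illustrative message type to the handler on an existing client.
client.register_handlers('greeting', callback_funcs_for_greeting)
```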

run()[source]

To listen to the messages and handle them accordingly (used for the distributed mode).

class federatedscope.core.worker.Server(ID=-1, state=0, config=None, data=None, model=None, client_num=5, total_round_num=10, device='cpu', strategy=None, unseen_clients_id=None, **kwargs)[source]

The Server class, which describes the behaviors of the server in an FL course. The behaviors are described by the handling functions (named callback_funcs_for_xxx).

Parameters
  • ID – The unique ID of the server, which is set to 0 by default

  • state – The training round

  • config – the configuration

  • data – The data owned by the server (for global evaluation)

  • model – The model used for aggregation

  • client_num – The (expected) client num to start the FL course

  • total_round_num – The total number of training rounds

  • device – The device to run local training and evaluation

  • strategy – redundant attribute

broadcast_client_address()[source]

To broadcast the communication addresses of clients (used for additive secret sharing)

broadcast_model_para(msg_type='model_para', sample_client_num=-1, filter_unseen_clients=True)[source]

To broadcast the message to all clients or sampled clients

Parameters
  • msg_type – ‘model_para’ or other user defined msg_type

  • sample_client_num – the number of sampled clients in the broadcast behavior. And sample_client_num = -1 denotes to broadcast to all the clients.

  • filter_unseen_clients – whether to filter out the unseen clients, which do not contribute to the FL process by training on their local data and uploading their local model updates. Such a split is useful for checking the participation generalization gap described in [ICLR'22, What Do We Mean by Generalization in Federated Learning?]. You may want to set it to False during the evaluation stage.
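
A hedged sketch of typical broadcast calls; server is assumed to be an instantiated federatedscope.core.worker.Server, and the 'evaluate' message type mirrors the evaluation request handled by callback_funcs_for_evaluate on the client side:

```python
# Broadcast the current global model to 10 sampled clients for training.
server.broadcast_model_para(msg_type='model_para', sample_client_num=10)

# Broadcast an evaluation request to all clients (-1 = all), keeping the
# unseen clients in the receiver list.
server.broadcast_model_para(msg_type='evaluate',
                            sample_client_num=-1,
                            filter_unseen_clients=False)
```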

callback_funcs_for_join_in(message: federatedscope.core.message.Message)[source]

The handling function for receiving the join-in information. The server might request some information (such as num_of_samples) if necessary and assign IDs for the clients. If all the clients have joined in, the training process will be triggered.

Parameters

message – The received message

callback_funcs_for_metrics(message: federatedscope.core.message.Message)[source]

The handling function for receiving the evaluation results, which triggers check_and_move_on (perform aggregation when enough feedback has been received).

Parameters

message – The received message

callback_funcs_model_para(message: federatedscope.core.message.Message)[source]
The handling function for receiving model parameters, which triggers check_and_move_on (perform aggregation when enough feedback has been received). This handling function is widely used in various FL courses.

Parameters

message – The received message, which includes sender, receiver, state, and content. More detail can be found in federatedscope.core.message

check_and_move_on(check_eval_result=False, min_received_num=None)[source]

To check the message_buffer. When enough messages have been received, some events (such as performing aggregation, evaluation, and moving on to the next training round) will be triggered.

Parameters
check_eval_result (bool) – If True, check the message buffer for evaluation; otherwise, check the message buffer for training

check_and_save()[source]

To save the results and the model after each evaluation.

check_buffer(cur_round, min_received_num, check_eval_result=False)[source]

To check the message buffer

Parameters
  • cur_round (int) – The current round number

  • min_received_num (int) – The minimal number of received messages

  • check_eval_result (bool) – whether to check evaluation results instead of training results

Returns

Whether enough messages have been received or not

Return type

bool

check_client_join_in()[source]

To check whether all the clients have joined in the FL course.

eval()[source]

To conduct evaluation. When cfg.federate.make_global_eval=True, a global evaluation is conducted by the server.

merge_eval_results_from_all_clients()[source]

Merge the evaluation results from all clients, update the best results, log the merged results, and save them into eval_results.log.

Returns

the formatted merged results

register_handlers(msg_type, callback_func)[source]

To bind a message type with a handling function.

Parameters
  • msg_type (str) – The defined message type

  • callback_func – The handling function to handle the received message

run()[source]

To start the FL course, listen and handle messages (for distributed mode).

save_best_results()[source]

To save the best evaluation results.

save_client_eval_results()[source]

Save the evaluation results of each client when the FL course is early-stopped or terminated.

terminate(msg_type='finish')[source]

To terminate the FL course

trigger_for_start()[source]

To start the FL course when the expected number of clients have joined

trigger_for_time_up(check_timestamp=None)[source]

The handler for time up: modify the current timestamp and check the trigger condition.

class federatedscope.core.worker.Worker(ID=-1, state=0, config=None, model=None, strategy=None)[source]

The base worker class.

federatedscope.core.trainers

class federatedscope.core.trainers.Context(model, cfg, data=None, device=None, init_dict=None, init_attr=True)[source]

Record and pass variables among different hook functions.

Parameters
  • model – training model

  • cfg – config

  • data (dict) – a dict containing the train/val/test datasets or dataloaders

  • device – running device

  • init_dict (dict) – a dict used to initialize the instance of Context

  • init_attr (bool) – whether to set up the static variables

Note

  • The variables within an instance of class Context can be set/get as attributes: ctx.${NAME_VARIABLE} = ${VALUE_VARIABLE}, where ${NAME_VARIABLE} and ${VALUE_VARIABLE} are the name and value of the variable.

  • To achieve automatic lifecycle management, you can wrap the variable with CtxVar and a lifecycle parameter as follows: ctx.${NAME_VARIABLE} = CtxVar(${VALUE_VARIABLE}, ${LIFECYCLE}). The parameter ${LIFECYCLE} can be chosen from LIFECYCLE.BATCH, LIFECYCLE.EPOCH and LIFECYCLE.ROUTINE. Then the variable ctx.${NAME_VARIABLE} will be deleted at the end of the corresponding stage (a sketch is given after this note):

  • LIFECYCLE.BATCH: the variables will be deleted after running a batch

  • LIFECYCLE.EPOCH: the variables will be deleted after running an epoch

  • LIFECYCLE.ROUTINE: the variables will be deleted after running a routine

For more details, please refer to our [tutorial](https://federatedscope.io/docs/trainer/).

  • Context also maintains some special variables across different routines, such as:

  • cfg

  • model

  • data

  • device

  • ${split}_data: the dataset object of the data split named ${split}

  • ${split}_loader: the data loader object of the data split named ${split}

  • num_${split}_data: the number of examples within the dataset named ${split}
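
A hedged sketch of lifecycle-managed context variables inside a hook function; ctx is the trainer's Context, the variable names are illustrative, and the import paths for CtxVar and LIFECYCLE are assumptions that may vary across versions:

```python
from federatedscope.core.trainers.context import CtxVar
from federatedscope.core.trainers.enums import LIFECYCLE

def _hook_on_batch_start_init(ctx):
    # A batch-level variable: deleted automatically once the batch finishes.
    ctx.batch_start_time = CtxVar(0.0, LIFECYCLE.BATCH)

    # A routine-level variable: kept until the train/val/test routine ends.
    ctx.custom_sample_counter = CtxVar(0, LIFECYCLE.ROUTINE)
```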

class federatedscope.core.trainers.FedEMTrainer(model_nums, models_interact_mode='sequential', model=None, data=None, device=None, config=None, base_trainer: Optional[Type[federatedscope.core.trainers.torch_trainer.GeneralTorchTrainer]] = None)[source]

The FedEM implementation, "Federated Multi-Task Learning under a Mixture of Distributions" (NeurIPS 2021), is based on Algorithm 1 in the paper and the official codes: https://github.com/omarfoq/FedEM

register_multiple_model_hooks()[source]

customized multiple_model_hooks, which is called in the __init__ of GeneralMultiModelTrainer

class federatedscope.core.trainers.GeneralMultiModelTrainer(model_nums, models_interact_mode='sequential', model=None, data=None, device=None, config=None, base_trainer: Optional[Type[federatedscope.core.trainers.torch_trainer.GeneralTorchTrainer]] = None)[source]
get_model_para()[source]

Return multiple model parameters.

init_multiple_models()[source]

Init multiple models and optimizers; the default implementation uses the copy-init manner.

Extension: users can override this function according to their own requirements.

register_multiple_model_hooks()[source]

By default, all internal models adopt the same hook_set.

Extension: users can override this function to register customized hooks for different internal models.

Note

For sequential mode, users can append an interact_hook on begin/end triggers, such as

" -> (on_fit_end, _interact_to_other_models) -> "

For parallel mode, users can append an interact_hook on any trigger they want, such as

" -> (on_xxx_point, _interact_to_other_models) -> "

As the internal models share self.ctx, we must tell the running hooks which data_loader to call and which num_samples to count.

update(model_parameters, strict=False)[source]
Parameters
model_parameters (list[dict]) – Multiple PyTorch Module objects' state_dicts.

class federatedscope.core.trainers.GeneralTorchTrainer(model, data, device, config, only_for_eval=False, monitor=None)[source]
discharge_model()[source]

Discharge the model from GPU device

get_model_para()[source]
Returns

model_parameters (dict): {model_name: model_val}

parse_data(data)[source]

Populate "${split}_data", "${split}_loader" and "num_${split}_data" for different data splits.

update(model_parameters, strict=False)[source]

Called by the FL client to update the model parameters

Parameters

model_parameters (dict) – PyTorch Module object’s state_dict.

class federatedscope.core.trainers.Trainer(model, data, device, config, only_for_eval=False, monitor=None)[source]

Register, organize and run the train/test/val procedures

get_model_para()[source]
Returns

model_parameters (dict): {model_name: model_val}

print_trainer_meta_info()[source]

Print some meta info for code users, e.g., the model type; the parameter names will be filtered out, etc.

update(model_parameters, strict=False)[source]

Called by the FL client to update the model parameters

Parameters
  • model_parameters (dict) – {model_name: model_val}

  • strict (bool) – ensure the k-v pairs are strictly the same
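
A hedged sketch of the round-level interplay between update and local training; trainer is assumed to be a constructed GeneralTorchTrainer, global_state a state_dict received from the server, and train() is assumed as the local training routine:

```python
# Load the newly received global parameters into the local model ...
trainer.update(global_state, strict=False)

# ... run local training, then read back the updated parameters
# so they can be sent to the server for aggregation.
trainer.train()
updated_state = trainer.get_model_para()
```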

federatedscope.core.trainers.wrap_DittoTrainer(base_trainer: Type[federatedscope.core.trainers.torch_trainer.GeneralTorchTrainer]) Type[federatedscope.core.trainers.torch_trainer.GeneralTorchTrainer][source]

Build a DittoTrainer in a plug-in manner, by registering new functions into the specific BaseTrainer.

The Ditto implementation, "Ditto: Fair and Robust Federated Learning Through Personalization" (ICML 2021), is based on Algorithm 2 in the paper and the official codes: https://github.com/litian96/ditto
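
A hedged sketch of the plug-in wrapping pattern shared by these personalization trainers; model, data, device, and cfg are assumed to be prepared elsewhere (e.g., via the corresponding builders):

```python
from federatedscope.core.trainers import GeneralTorchTrainer, wrap_DittoTrainer

# Build a vanilla trainer first ...
base_trainer = GeneralTorchTrainer(model=model, data=data,
                                   device=device, config=cfg)

# ... then register the Ditto-specific hooks onto it in a plug-in manner.
ditto_trainer = wrap_DittoTrainer(base_trainer)
```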

federatedscope.core.trainers.wrap_fedprox_trainer(base_trainer: Type[federatedscope.core.trainers.torch_trainer.GeneralTorchTrainer]) Type[federatedscope.core.trainers.torch_trainer.GeneralTorchTrainer][source]

Implementation of FedProx; refer to "Federated Optimization in Heterogeneous Networks" [Tian Li, et al., 2020] (https://proceedings.mlsys.org/paper/2020/file/38af86134b65d0f10fe33d30dd76442e-Paper.pdf)

federatedscope.core.trainers.wrap_nbafl_server(server)[source]

Register noise injector for the server

federatedscope.core.trainers.wrap_nbafl_trainer(base_trainer: Type[federatedscope.core.trainers.torch_trainer.GeneralTorchTrainer]) Type[federatedscope.core.trainers.torch_trainer.GeneralTorchTrainer][source]

Implementation of NbAFL; refer to "Federated Learning with Differential Privacy: Algorithms and Performance Analysis" [Wei et al., 2020] (https://ieeexplore.ieee.org/abstract/document/9069945/)

Parameters
  • mu – the factor of the regularizer

  • epsilon – the distinguishable bound

  • w_clip – the threshold to clip weights

federatedscope.core.trainers.wrap_pFedMeTrainer(base_trainer: Type[federatedscope.core.trainers.torch_trainer.GeneralTorchTrainer]) Type[federatedscope.core.trainers.torch_trainer.GeneralTorchTrainer][source]

Build a pFedMeTrainer in a plug-in manner, by registering new functions into the specific BaseTrainer.

The pFedMe implementation, "Personalized Federated Learning with Moreau Envelopes" (NeurIPS 2020), is based on Algorithm 1 in the paper and the official codes: https://github.com/CharlieDinh/pFedMe