FedChain: A Collaborative Framework for Building Artificial Intelligence Models using Blockchain and
Federated Learning
Tran Duc Luong∗†, Vuong Minh Tien∗†, Hoang Tuan Anh∗†, Ngan Van Luyen∗†, Nguyen Chi Vy∗†,
Phan The Duy∗†, Van-Hau Pham∗†
∗Information Security Laboratory, University of Information Technology, Ho Chi Minh city, Vietnam
†Vietnam National University, Ho Chi Minh city, Vietnam
{19521815, 19522346, 18520446, 18521074, 18521681}@gm.uit.edu.vn, {duypt, haupv}@uit.edu.vn
Abstract—Machine learning (ML) has drawn attention from both academia and industry thanks to its outstanding advances and its potential in many fields. Nevertheless, data collection for training models is a difficult task, since many concerns about privacy and data breaches have been reported recently. Data owners or holders are usually hesitant to share their private data. Also, the benefits from analyzing user data are not distributed to users. In addition, due to the lack of an incentive mechanism for sharing data, ML builders cannot leverage the massive data from many sources. Thus, this paper introduces a collaborative approach for building artificial intelligence (AI) models, named FedChain, to encourage many data owners to cooperate in the training phase without sharing their raw data. It helps data holders ensure privacy preservation for the collaborative training right on their premises, while reducing the computation load compared with centralized training. More specifically, we utilize federated learning (FL) and the Hyperledger Sawtooth Blockchain to set up a prototype framework that enables many parties to join, contribute, and receive rewards transparently from their training task results. Finally, we conduct experiments of our FedChain in a cyber threat intelligence context, where an AI model is trained across many organizations, each on its own private datastore, and then used for detecting malicious actions in the network. Experimental results with the CICIDS-2017 dataset prove that the FL-based strategy can help create effective privacy-preserving ML models while taking advantage of diverse data sources from the community.
Index Terms—Federated Learning, Privacy Preservation,
Blockchain, Generative Adversarial Networks
I. INTRODUCTION

Currently, more and more countries have been promoting the strategy of building Smart Cities to catch up with the growth of the Fourth Industrial Revolution, where AI takes a leading role. In reality, AI models leverage the advances in ML to build more efficient models, which requires a large amount of data for training. The centralized approach to building ML models today is to gather all the training data on a particular server, usually in the cloud, and then train the model on the collected data. However, this approach is slowly becoming unfeasible in practice for privacy and security reasons when collecting data. The risk of sensitive data loss during storage, transmission, and sharing has raised concerns about the privacy of data owners. At the same time, data providers are also unwilling to share data because their data contributions would not earn them any rewards in the traditional ML training method. Therefore, for the sake of comprehensive smart ecosystems, a new training mechanism must be devised and implemented that can resolve the issues of data security and of incentives for training a mutual model.
In this context, Federated Learning (FL) emerges as a decentralized learning technique that ensures both the high performance of ML models and data privacy. Contrary to centralized learning, this method allows the global model to be trained right on the local parties, which transfer only the model parameters to the central server; the server is responsible for aggregating the received model updates to construct an improved global model. Finally, the participants download the global update from the aggregator and train it on their own datasets for the next local model. The training process occurs iteratively until the global model is optimized. By utilizing the computing strength of distributed clients, the FL approach can enhance ML model quality and reduce user privacy leakage. Nevertheless, FL in practice has to face non-independent and identically distributed (Non-IID) data, i.e., data distributions that are unbalanced in size, labels, etc. among collaborative workers. Many researchers have indicated that a degradation in the accuracy and performance of FL appears almost inevitable with Non-IID data. With the power of generating synthetic samples, Generative Adversarial Networks (GAN) can be used as a data augmentation technique to mitigate this imbalance issue in FL. The operating principle of GAN can be visualized as a zero-sum game between two opposing neural networks, the Generator (G) and the Discriminator (D). G is trained to output new adversarial samples given some noise source, whereas D is responsible for classifying data samples as either real (from the original dataset) or fake (generated by G). The game goes on continuously until the D model can no longer distinguish real from fake samples, meaning the G model is generating plausible data. Then, in the FL scheme, each client is equipped with a GAN architecture which is used for data augmentation in the case of Non-IID data.
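To make the label-skew form of Non-IID data concrete, the sketch below (illustrative code of ours, not part of any implementation described in this paper) partitions a labeled dataset among clients so that each client over-represents a few labels:

```python
import numpy as np

def non_iid_partition(labels, num_clients, shards_per_client=2, seed=0):
    """Label-skew partition: sort sample indices by label, cut them into
    shards, and deal a few shards to each client, so that per-client
    label distributions differ (Non-IID)."""
    rng = np.random.default_rng(seed)
    order = np.argsort(labels)                  # indices grouped by label
    num_shards = num_clients * shards_per_client
    shards = np.array_split(order, num_shards)  # contiguous label blocks
    shard_ids = rng.permutation(num_shards)
    parts = []
    for c in range(num_clients):
        ids = shard_ids[c * shards_per_client:(c + 1) * shards_per_client]
        parts.append(np.concatenate([shards[i] for i in ids]))
    return parts

labels = np.repeat([0, 1, 2], 100)              # toy dataset with 3 labels
parts = non_iid_partition(labels, num_clients=3)
```

With 3 labels and 2 shards per client, each client ends up holding at most 2 of the 3 labels, which is exactly the kind of skew the GAN-based augmentation is meant to counteract.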
In building ML models, most works assume that all devices in the system will engage in FL tasks unconditionally when requested, and will be totally honest. This is, however, unrealistic, because there are incurred expenses and many dishonest participants during the model-building process. Therefore, to create effective models, it is important that the system ensure the honesty of the participants and have an incentive mechanism that matches the contributed resources with appropriate rewards. With Blockchain, all contributions are stored transparently, which means that each node in the Blockchain can check the validity of a contribution. From there, a framework may be built to help manage, control honesty, and encourage data owners to be involved in building models through a reward mechanism in accordance with contributions. Besides, the application of the InterPlanetary File System (IPFS) can help the system achieve complete decentralization. With distributed data storage, IPFS helps the system strengthen fault tolerance and eliminate the drawbacks of centralized data storage systems. This also allows the system to significantly optimize costs in terms of time and data transmission.
Based on the above-mentioned analysis, in this paper we propose a new privacy-preserving framework named FedChain that uses FL and GAN to train AI models efficiently, even on Non-IID data. Also, Blockchain and IPFS are integrated to provide transparency and honesty along with an incentive mechanism for collaborative learning. Finally, to evaluate the feasibility and effectiveness of the FedChain framework, an Intrusion Detection System (IDS) is selected as the ML model due to the urgency of cybersecurity factors in smart city development.
The rest of this paper is organized as follows. In Section II, we introduce the related works on FL, GAN, and Blockchain. In Section III, we present an overview and describe the detailed operating procedure of the FedChain system. The environmental setup and performance evaluation results are shown in Section IV. Finally, we conclude the paper and propose future works in Section V.
II. RELATED WORKS

A. Federated Learning and GAN
Federated Learning is a distributed collaborative AI mechanism that has appealed to the attention of many researchers in various fields. With the capability of training ML models in a distributed manner, FL addresses critical issues about privacy and security of data left open by the centralized approach. Some previous papers [1], [2], [10], [12], [13] have studied the applications of FL in the context of IIoT. Specifically, Dinh C. Nguyen et al. [1] carried out a comprehensive FL survey that discussed the role of FL in a wide range of IoT services such as IoT data sharing, data offloading and caching, attack detection, mobile crowd-sensing, and IoT privacy and security. The authors also proved the flexibility of FL in areas such as smart healthcare, smart transportation, Unmanned Aerial Vehicles (UAVs), etc. On the side of network security, Shaashwat Agrawal et al. [2] introduced an FL-based framework for Intrusion Detection Systems (IDS) with the aim of enhancing anomaly detection accuracy as well as user privacy. The paper then presented further challenges and potential solutions for FL implementations in IDS.
Recently, many researchers have been devoted to implementing Generative Adversarial Networks (GAN) as a data generation scheme in FL scenarios. However, instead of drawing on any systematic approach to the beneficial aspects of GAN, a variety of authors utilized this generative method to conduct causative attacks [3], [4] in the context of FL. Jiale Zhang et al. [3] proposed a poisoning attack strategy in which unfriendly participants aim to deteriorate the performance of the global model by fabricating malicious adversarial data with a GAN architecture.
B. Federated Learning with Blockchain in IIoT
On the issue of building a Blockchain-based FL framework for IoT devices, there have been many research efforts. A few previous papers [5], [6] have studied the building of a secure FL framework empowered by Blockchain technology. The prominent investigation of Rui Wang et al. [5] is the first to integrate a Blockchain framework and MEC technology into the FL scenario to ensure privacy, quality, and communication overhead for the IoV system. The authors also proposed an algorithm to prevent malicious updates in order to protect FL, and designed an incentive mechanism based on trained model weights. In previous research on strategies to enhance training and privacy in FL, Swaraj Kumar et al. [7] used the InterPlanetary File System (IPFS) as a data store to build a fully decentralized system. They also created a value-driven incentive mechanism using Ethereum smart contracts.
Besides that, Yufeng Zhan et al. [8] presented a classification of existing incentive mechanisms for federated learning and then evaluated and compared them. The authors pointed out incentive mechanisms that have been ignored or are currently not linked to the algorithms in current models. The authors suggested that building an incentive mechanism also requires constructing a more secure system so that users can feel safe participating in model training. Nonetheless, they only raised ideas on building incentive mechanisms in FL without an implementation.
III. SYSTEM DESIGN: FEDCHAIN

A. Overview of Architecture

The overall architecture of FedChain is illustrated in Fig. 1. The system consists of 3 network layers that leverage Blockchain to link clients and servers in FL. At the lowest layer, participants download the global model from the server and train the model locally on their collected data. Concerning Non-IID data, GANs are used to generate adversarial data as a supplement to the client dataset. The next layer includes Blockchain management and the IPFS database. Following identity verification, the client can register nodes and task
Fig. 1. The architecture of the FedChain system.
training using Blockchain. The system allows combining multiple tasks to provide a variety of task training. All data at this tier is stored in a distributed fashion via IPFS to make our system fully decentralized. The highest layer is the aggregation server, whose responsibility is to aggregate the models received from the lower layer after executing smart contracts.
All operations in the system are recorded to the Blockchain as transactions for aggregation and evaluation. Blockchain helps to ensure transparency, security, immutability, and auditability in the payment of rewards and distribution of profits after completing the FL tasks.
Our system is designed to ensure a secure collaborative framework for training ML models by combining FL with Blockchain and IPFS. The system can ensure privacy, transparency, and low communication overhead, as well as encourage data owners to participate in the FL process to build mutual ML models.
B. Training ML Model with Federated Learning
The workflow of training an ML model with the FL approach consists of four steps, as described in Algorithm 1. Firstly, the aggregation server initializes the global model with parameters w_0. The number of rounds, epochs, clients, etc. of the model is defined. These parameters are sent to training devices from the many organizations which take part in the training process. Next, these devices train the local model on their self-collected datasets and send the new parameters w_r^k back to the aggregation server. This is conducive to preserving the data privacy of local organizations, since no raw data is sent. Thirdly, the server aggregates the received models based on the FedAVG algorithm [9]. The global model parameters w_r are calculated depending on the data contribution rate of each individual participant. Finally, the aggregation server distributes the global model to the participants so that they can continue to update the model. The process goes back to step 2 and repeats until the specified number of rounds is reached or the model has converged.
Algorithm 1 Federated Learning-based ML model training process
Input:
- Aggregation Server AS;
- Clients C = {C_k; k = 1, 2, ..., n} that want to join the training process;
- The number of exchange rounds R between server and participants.
Output: The efficient machine learning model.
1: 1. Initialization:
2: - At Aggregation Server: a machine learning model M is generated with initial parameters w_0.
3: - At Local Server: clients C receive model M with parameter w_0 and related parameters.
4: Procedure:
5: r ← 1
6: snod ← specified number of datasets
7: while r ≤ R do
8:   2. Training local machine learning model:
9:   k ← 1
10:  while k ≤ n do
11:    cd ← size of client C_k's dataset
12:    if cd < snod then
13:      while cd < snod do
14:        GAN generates and provides datasets
15:      end while
16:    end if
17:    Client C_k trains model M_k using its private dataset
18:    Send parameter w_r^k back to AS
19:    k ← k + 1
20:  end while
21:  3. Aggregation Server builds new model:
22:  - AS receives all parameters w_r^k from all clients
23:  - AS uses the FedAVG algorithm to generate a new model with parameter w_r
24:  4. Aggregation Server sends new model to participants:
25:  k ← 1
26:  while k ≤ n do
27:    Client C_k updates model M_k with parameter w_r
28:    k ← k + 1
29:  end while
30:  r ← r + 1
31: end while
32: return Machine Learning Model with parameter w_R
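The FedAVG aggregation in step 3 above amounts to a data-size-weighted average of the client parameters; a minimal NumPy sketch of that step (ours, not the paper's implementation):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client model parameters (FedAVG).
    client_weights: one list of layer arrays per client.
    client_sizes: number of training samples held by each client."""
    total = sum(client_sizes)
    num_layers = len(client_weights[0])
    aggregated = []
    for layer in range(num_layers):
        # Each client's layer is weighted by its share of the total data.
        agg = sum((n / total) * w[layer]
                  for w, n in zip(client_weights, client_sizes))
        aggregated.append(agg)
    return aggregated

# Two toy clients with a single 2-element "layer"
w1 = [np.array([0.0, 0.0])]
w2 = [np.array([1.0, 1.0])]
global_w = fedavg([w1, w2], client_sizes=[1, 3])
```

Here the second client holds 3 of the 4 samples, so the aggregated layer lands three quarters of the way toward its parameters, matching the data-contribution weighting described above.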
C. Incentive Mechanism for Collaborative Training using Blockchain

The architecture we provide is autonomous, serving the connection between those who have data and those who want to create ML models from that data.
Blockchain is characterized by transparency; once data is written to the blockchain, it cannot be deleted or changed by any individual or organization. Based on that feature, we decided to use Blockchain to record user behavior and use it as evidence for the rewarding process. In this way, the payout is fair and transparent.
The main user behaviors handled within the
blockchain-based system are described as follows:
• System membership registration: During the development of the functions, we use asymmetric encryption to authenticate the user's identity. Therefore, for the member registration task, in addition to the necessary information, the user needs to generate a key pair. The public key is sent with the request.
• Register to open a new training task: When an individual or organization wants to open a new training assignment, they need to submit identification and information about the task they want to open. When a task is successfully registered, everyone in the system can see it. Those with the right dataset can register and contribute to this task.
• Register to contribute to a specific task: Users choose a task published on the system that matches the dataset they own to participate in the contribution process for that task. To perform this function, users need to submit information about their hardware resources and the characteristics of the dataset they own. The registration information is stored for tracking, and a global model is sent back to the user for training.
• Upload local model: After receiving the training task, users go through the training process on their dataset. Then, the results of the training process are sent to the system with the task name. To reduce the load on the Blockchain system, we use IPFS, a distributed database, to store uploaded files. In this way, it is possible to reduce communication costs and overcome the disadvantages of traditional centralized data storage. When the model is uploaded, it is evaluated and the results are stored in the Blockchain. These contributions are taken into account to compensate users accordingly.
• Collect model: This function is performed by the task publisher. They ask the system to retrieve the models related to the task that they have previously registered. The system, after receiving the request, verifies the identity and returns the models corresponding to the task.
All the above activities occurring in the system are recorded in the Blockchain, ensuring transparency in the operation process, which increases the incentive for participants to contribute.
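The key property exploited here is that a recorded contribution can be serialized and hashed deterministically, so any node can verify it. A minimal sketch of such a record (the field names below are hypothetical; the actual FedChain transaction family for Hyperledger Sawtooth is not specified in this section):

```python
import hashlib
import json

def contribution_record(task_id, client_pubkey, ipfs_cid, score):
    """Build a contribution record and its SHA-256 digest.
    All field names are illustrative, not FedChain's real schema."""
    payload = {
        "task_id": task_id,        # training task the model belongs to
        "client": client_pubkey,   # registered public key of the contributor
        "model_cid": ipfs_cid,     # IPFS content identifier of the upload
        "score": score,            # evaluation result stored for rewarding
    }
    # Canonical serialization (sorted keys) so every node derives
    # the same digest for the same contribution.
    blob = json.dumps(payload, sort_keys=True).encode()
    return payload, hashlib.sha256(blob).hexdigest()

rec, digest = contribution_record("ids-task-1", "02ab...", "Qm...", 0.92)
```

Because the digest is a pure function of the record, any node replaying the transaction computes the same value, which is what makes the reward evidence tamper-evident.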
D. Data Augmentation for Clients using GAN
The GAN architecture in the FedChain framework is deployed at each client and is responsible for automatically supplying new data records in the case of Non-IID data. After the training phase described in Algorithm 2, the well-trained generator G can generate adversarial examples from a given noise vector.
Algorithm 2 GAN training process. The generator G and discriminator D are trained in a parallel manner.
Input:
- Original dataset x including benign samples and malicious samples;
- The noise vector z for the generator;
Output: The optimized generator G and discriminator D.
Procedure:
1: for number of training iterations do
2:   D training:
3:   - Sample a minibatch of m noise samples {z_1, z_2, ..., z_m} from the noise distribution p_z(z)
4:   - Sample a minibatch of m real samples {x_1, x_2, ..., x_m} from the data-generating distribution p_data(x)
5:   - Update D by ascending its stochastic gradient of
       (1/m) * Σ_{k=1}^{m} [log D(x_k) + log(1 − D(G(z_k)))]
6:   G training:
7:   - Sample a minibatch of m noise samples {z_1, z_2, ..., z_m} from the noise distribution p_z(z)
8:   - Update G by descending its stochastic gradient of
       (1/m) * Σ_{k=1}^{m} log(1 − D(G(z_k)))
9: end for
10: return GAN model
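The two minibatch objectives in Algorithm 2 can be evaluated directly. The sketch below (illustrative only, with toy stand-ins for D and G) computes them exactly as written:

```python
import numpy as np

def d_objective(D, G, x, z):
    """Discriminator objective: (1/m) * sum[log D(x_k) + log(1 - D(G(z_k)))].
    D ascends this; it rewards scoring real samples high and fakes low."""
    return np.mean(np.log(D(x)) + np.log(1.0 - D(G(z))))

def g_objective(D, G, z):
    """Generator objective: (1/m) * sum[log(1 - D(G(z_k)))].
    G descends this; it shrinks when D mistakes fakes for real."""
    return np.mean(np.log(1.0 - D(G(z))))

# Toy stand-ins: D squashes a scalar to (0, 1), G shifts the noise.
D = lambda v: 1.0 / (1.0 + np.exp(-v))
G = lambda z: z + 1.0
x = np.array([2.0, 3.0])   # "real" minibatch, m = 2
z = np.array([0.0, 0.0])   # noise minibatch, m = 2
```

In an actual training loop, both objectives would be differentiated with respect to the network parameters, but the values themselves show the zero-sum structure: D's ascent step and G's descent step act on the same log(1 − D(G(z))) term.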
IV. EXPERIMENT
In fact, the FedChain framework can support the construction of AI models in various fields. Within the scope of this paper, the framework is utilized to build ML models in the context of network intrusion detectors, which is conducive to the aspect of cybersecurity. This part presents the experimental results of the ML-IDS model trained through FedChain to prove the feasibility of our approach.
A. Environmental Settings

Environment for FL and GAN: TensorFlow is used for the FL implementation, while the GAN is implemented in Keras on TensorFlow. The hardware configuration used for training FL and GAN is CPU: Intel® Xeon® E5-2660 v4 (16 cores – 1.0 GHz), RAM: 16 GB, OS: Ubuntu 16.04.

Environment for Blockchain system: We use the Hyperledger Sawtooth platform with the PBFT consensus algorithm deployed on 5 nodes to build the Blockchain system with Docker. The machine configuration used for this deployment is CPU: Intel® Core™ i5-9300HQ (4 cores – 8 threads – 3.5 GHz), RAM: 16 GB, OS: Ubuntu 16.04.
B. Dataset and Preprocessing

Dataset: This study utilizes a recent dataset named CICIDS-2017, provided by the Canadian Institute for Cybersecurity, for the evaluation of the FedChain scheme. It contains over 2.8 million network flows spanning 8 CSV files, including benign samples and 14 types of up-to-date attacks. However, we only use the data from 3 files (Tuesday, Wednesday, Thursday-Afternoon), with a total of more than 1.3 million network records that describe typical cyberattacks such as DoS Hulk, DoS GoldenEye, DoS slowloris, DoS Slowhttptest, Heartbleed, FTP-Patator, SSH-Patator, and Infiltration. More specifically, the collected records in each file are divided into 2 sub-datasets with a ratio of 80:20, where the larger part is the training set and the other is used for the testing phase. Finally, the training and testing sets from the 3 files are combined into the CICIDS2017 Train and CICIDS2017 Test files, respectively.
Data Preprocessing: According to the study in [11], Kurniabudi et al. showed that the top 16 features of the CICIDS2017 dataset are believed to be easy to extract and observe, as well as having a great influence on detecting basic network attacks. These selected features are Destination Port, Flow Duration, Packet Length Std, Total Length of Bwd Packet, Subflow Bwd Bytes, Packet Length Variance, Bwd Packet Length Mean, Bwd Segment Size Avg, Bwd Packet Length Max, Total Length of Fwd Packets, Packet Length Mean, Max Packet Length, Subflow Fwd Bytes, Average Packet Size, Init Win bytes backward, and Init Win bytes forward.
After feature selection, we remove all redundant features and delete non-numeric fields (NaN) and infinity values (Inf). Next, the values of the label column are converted to binary format, where label 0 represents benign samples and label 1 is assigned to the others. In our experiments, we rescale the 16 features to the interval [0, 1] via Min-Max normalization, which has the following formula:
x_rescaled = (x − x_min) / (x_max − x_min)

where x is the feature value before normalization and x_rescaled is the value after normalization. Besides, x_max and x_min represent the maximum and the minimum value of this feature in the dataset, respectively.
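The preprocessing steps above can be sketched as follows (illustrative code; the label string "BENIGN" follows the CICIDS2017 convention but is an assumption here):

```python
import numpy as np

def preprocess(features, labels):
    """Illustrative version of the preprocessing described above:
    drop rows with NaN/Inf, binarize labels, min-max scale to [0, 1]."""
    mask = np.isfinite(features).all(axis=1)        # remove NaN/Inf rows
    X, y = features[mask], labels[mask]
    y = (y != "BENIGN").astype(int)                 # 0 = benign, 1 = attack
    xmin, xmax = X.min(axis=0), X.max(axis=0)
    span = np.where(xmax > xmin, xmax - xmin, 1.0)  # avoid divide-by-zero
    return (X - xmin) / span, y

X = np.array([[1.0, 10.0], [2.0, 20.0], [np.inf, 30.0], [3.0, 40.0]])
y = np.array(["BENIGN", "DoS Hulk", "BENIGN", "SSH-Patator"])
Xs, yb = preprocess(X, y)
```

The constant-column guard (`span = 1.0` when `xmax == xmin`) is our own addition: without it, a feature with a single value would produce a division by zero under the formula above.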
C. Performance Evaluation

GAN performance in generating adversarial data: This experiment shows the ability of the GAN to automatically generate new network traffic that closely resembles the input. We construct the GAN architecture with the following hyperparameters: epochs = 5000, batch size = 256, using the Adam optimizer with learning rate = 0.0002. Also, the generator G and discriminator D are designed with 5 and 6 layers, respectively. Both of them utilize LeakyReLU (Leaky Rectified Linear Unit) and sigmoid as activation functions.
Following the GAN training procedure, we proceed to utilize the generator G to generate new synthetic samples from the CICIDS2017 Train dataset. A glance at Fig. 2 reveals the resemblance between the original CICIDS2017 Train flows and the generated network flows in the four features shown.
Fig. 2. Similarity between original and generated data in 4 features (from left to right): FlowDuration, TotalLengthBwdPacket, PacketLengthMean, PacketLengthStd.
Comparison of FL model performance on Non-IID data without and with GAN: The ML-IDS model is implemented in the FL scenario based on an LSTM model with the following layers: an LSTM layer with 64 internal units and a Dense layer with 16 internal units. The model has an input of size (16, 1), and the output is the result after passing through a Dense layer with the sigmoid activation function. We conduct FL training for 3, 6, 9, and 12 rounds (R) with 3 clients (K).
TABLE I
THE RESULT OF THE EXPERIMENT IN THE NON-IID DATA CASE

              Round   Precision   Recall   F1-score   Accuracy
Without GAN     3      0.9035     0.6342    0.7452     0.9564
                6      0.9482     0.5910    0.7281     0.9557
               12      0.9413     0.5696    0.7097     0.9532
With GAN        3      0.9457     0.9021    0.9233     0.9331
               12      0.9590     0.9007    0.9289     0.9581
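The four metrics reported in Table I follow the standard confusion-matrix definitions; for reference, a minimal sketch with toy counts (not taken from the experiment):

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F1-score, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)                       # of flagged flows, how many are attacks
    recall = tp / (tp + fn)                          # of attacks, how many were flagged
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# Toy counts: 80 attacks caught, 20 missed, 10 false alarms, 890 true negatives
p, r, f1, acc = binary_metrics(tp=80, fp=10, fn=20, tn=890)
```

Note how accuracy can stay high while recall drops when benign flows dominate, which is exactly the pattern visible in the "Without GAN" rows of Table I.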
Table I compares the performance of the model before and after using GAN in terms of the four metrics above. In the beginning, the input data ratio among the 3 clients is 5:2:3, whereas the size of the input datasets becomes balanced after using GAN. As a result, the training model with GAN sees a surge in recall and F1-score of approximately 39% (R = 9) and 22% (R = 12), respectively. The other metrics remain stable at a high level, about 95%.

Blockchain performance in FL context:
We evaluate the performance of the Blockchain system through CPU resource consumption and request processing time in some specific contexts.
To determine the CPU resource consumption of the system during operation, we conducted a test with 10 users continuously submitting requests to register a task and measured the consumption at intervals of 0.1 seconds. The results are presented in Fig. 3.
Trang 60 0.5 1 1.5
10
15
20
25
30
Time (second)
Fig 3 CPU consumption results
To measure the system's processing-time performance, we measured the processing time in the contexts of 10, 20, and 50 users continuously sending task registration requests to the system. Measurements were repeated three times to ensure accuracy. The average processing-time results are presented in Table II.
TABLE II
RESULTS OF MEASURING THE SYSTEM'S PROCESSING TIME (S)

            Round 1      Round 2    Round 3
10 Users    0.06023118   0.059109   0.059569
20 Users    0.103943     0.101376   0.112528
50 Users    0.336669     0.384335   0.297349
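A measurement loop of the kind used for Table II can be sketched as follows (illustrative; the paper does not specify its actual load generator, so `send_request` below is a stand-in):

```python
import time

def average_processing_time(send_request, num_users, repeats=3):
    """Fire num_users requests back-to-back and average the wall-clock
    time per request, repeating the whole run `repeats` times."""
    runs = []
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(num_users):
            send_request()
        elapsed = time.perf_counter() - start
        runs.append(elapsed / num_users)
    return runs

# Stand-in for a task-registration request against the REST API
fake_request = lambda: time.sleep(0.001)
runs = average_processing_time(fake_request, num_users=10)
```

Repeating the whole run three times and reporting each round separately, as Table II does, exposes run-to-run variance that a single averaged number would hide.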
Next, we conduct performance testing in the context of processing the model-upload task. The obtained results show that the system takes an average of 25.19693 seconds to process a request. To handle this request, the system processes it in 2 steps: the first step is to upload the model to IPFS; the second step is to write the data to the Blockchain. We found that almost all of the time is spent on uploading the model to IPFS (25.19693 seconds/request in the above context), while the Blockchain system performs very well (0.018419 seconds/request). However, uploading the model to IPFS depends entirely on the connection speed of the system, from which it can be concluded that our system can still work well. From the above results, we can see that the performance of the Blockchain is good regarding both processing time and CPU resource consumption.
V. CONCLUSION AND FUTURE WORK

In the evolution of the artificial intelligence industry, data sharing plays an essential role in building intelligent ecosystems and applications. In this paper, we have proposed a collaborative framework named FedChain to help ensure data privacy and security, along with an incentive and transparent mechanism. In addition, FedChain helps to optimize resources, investment, and operating costs by combining practical and effective technologies such as FL, GAN, Blockchain, and IPFS with high applicability and flexibility. The experimental results of our analysis in a cyber threat intelligence context with the CICIDS-2017 dataset have demonstrated that the proposed FedChain enables data sharing securely and efficiently while taking advantage of diverse data sources from the community.
In the future, we plan to integrate mobile edge computing (MEC) into the FedChain framework to reduce the system's communication and data transmission costs.
ACKNOWLEDGEMENT

Phan The Duy was funded by Vingroup Joint Stock Company and supported by the Domestic Master/PhD Scholarship Programme of Vingroup Innovation Foundation (VINIF), Vingroup Big Data Institute (VINBIGDATA), code VINIF.2020.TS.138.
REFERENCES

[1] Dinh C. Nguyen, Ming Ding, Pubudu N. Pathirana, Aruna Seneviratne, Jun Li, and H. Vincent Poor, "Federated Learning for Internet of Things: A Comprehensive Survey," IEEE Communications Surveys & Tutorials, 2021.
[2] Shaashwat Agrawal, Sagnik Sarkar, Ons Aouedi, Gokul Yenduri, Kandaraj Piamrat, Sweta Bhattacharya, Praveen Kumar Reddy Maddikunta, and Thippa Reddy Gadekallu, "Federated Learning for Intrusion Detection System: Concepts, Challenges and Future Directions," arXiv:2106.09527, 2021.
[3] Jiale Zhang, Junjun Chen, Di Wu, Bing Chen, and Shui Yu, "Poisoning Attack in Federated Learning using Generative Adversarial Nets," 2019 18th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, 2019.
[4] Vale Tolpegin, Stacey Truex, Mehmet Emre Gursoy, and Ling Liu, "Data Poisoning Attacks Against Federated Learning Systems," Computer Security – ESORICS 2020, 2020.
[5] Rui Wang, Heju Li, and Erwu Liu, "Blockchain-Based Federated Learning in Mobile Edge Networks with Application in Internet of Vehicles," arXiv:2103.01116, 2021.
[6] Yang Zhao, Jun Zhao, Linshan Jiang, Rui Tan, Dusit Niyato, Zengxiang Li, Lingjuan Lyu, and Yingbo Liu, "Privacy-Preserving Blockchain-Based Federated Learning for IoT Devices," IEEE Internet of Things Journal, vol. 8, no. 3, pp. 1817-1829, 2021.
[7] Swaraj Kumar, Sandipan Dutta, Shaurya Chatturvedi, and M. P. S. Bhatia, "Strategies for Enhancing Training and Privacy in Blockchain Enabled Federated Learning," IEEE Sixth International Conference on Multimedia Big Data (BigMM), New Delhi, India, 2020.
[8] Yufeng Zhan, Jie Zhang, Zicong Hong, Leijie Wu, Peng Li, and Song Guo, "A Survey of Incentive Mechanism Design for Federated Learning," IEEE Transactions on Emerging Topics in Computing, 2021.
[9] H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas, "Communication-Efficient Learning of Deep Networks from Decentralized Data," arXiv:1602.05629, 2017.
[10] Nguyen Chi Vy, Nguyen Huu Quyen, Phan The Duy, and Van-Hau Pham, "Federated learning-based intrusion detection in the context of IIoT networks: Poisoning Attack and defense," 15th International Conference on Network and System Security (NSS 2021), Tianjin, China, 2021.
[11] Kurniabudi, Deris Stiawan, Darmawijoyo, Mohd Yazid Bin Idris, Alwi M. Bamhdi, and Rahmat Budiarto, "CICIDS-2017 Dataset Feature Analysis With Information Gain for Anomaly Detection," IEEE Access, vol. 8, pp. 132911-132921, 2020.
[12] Parimala M, Swarna Priya R M, Quoc-Viet Pham, Kapal Dev, Praveen Kumar Reddy Maddikunta, Thippa Reddy Gadekallu, and Thien Huynh-The, "Fusion of Federated Learning and Industrial Internet of Things: A Survey," arXiv:2101.00798, 2021.
[13] Phan The Duy, Huynh Nhat Hao, Huynh Minh Chu, and Van-Hau Pham, "A Secure and Privacy Preserving Federated Learning Approach for IoT Intrusion Detection System," 15th International Conference on Network and System Security (NSS 2021), Tianjin, China, 2021.