A Modified FCN-Based Method for Left Ventricle Endocardium and Epicardium Segmentation with New Block Modules
Do-Hai-Ninh Nham
School of Applied Mathematics and Informatics, Hanoi University of Science and Technology
ninh.ndh182714@sis.hust.edu.vn

Minh-Nhat Trinh
School of Electrical and Electronic Engineering, Hanoi University of Science and Technology
minhnhattrinh312@gmail.com

Tien-Thanh Tran
Smart Health Center, Vingroup Big Data Institute
trantienthanh081298@gmail.com

Van-Truong Pham
School of Electrical and Electronic Engineering, Hanoi University of Science and Technology
truong.phamvan@hust.edu.vn

Thi-Thao Tran
School of Electrical and Electronic Engineering, Hanoi University of Science and Technology
thao.tranthi@hust.edu.vn
Abstract—Cardiac segmentation of medical magnetic resonance images has become crucial owing to its necessity for diagnosing cardiac problems. Given the increasing demand for advanced procedures for cardiac disease diagnosis, and inspired by the structure of the receptive field block, in this paper we propose a new block module and assemble it into a deep fully convolutional neural network for automated left ventricle segmentation. With only one learning stage, our proposed model is trained end-to-end, pixels-to-pixels, and validated on two popular cardiac MRI benchmarks, the ACDC and SunnyBrook datasets. Several experiments show that our new model architecture performs better than previous segmentation methods, with enhanced feature discriminability and robustness, despite having far fewer training parameters.
Index Terms—kernel size; dilation rate; fully convolutional neural network; cardiac MRI segmentation
I. INTRODUCTION
To interpret cardiovascular function, cardiac magnetic resonance imaging (MRI) has been extensively utilized. Segmenting the left ventricle (LV) from these MRI images is an imperative step in cardiac problem diagnosis, as information such as systolic and diastolic abnormalities can be observed earlier [1]. Thus, there is a pressing need for improved LV segmentation methods. As clinical approaches can be tedious and sometimes carry the risk of human error, automated approaches such as deep learning-based (DL) approaches [2], [3] and active contour model (ACM) approaches [4], [5] have proved their efficiency.

As regards ACM methods, the level set-based active contour model is shown to be effective at enabling the curve to update its topology throughout the segmentation process. Despite these benefits, there are certain deficiencies: unless the contour initialization is well constructed, the final result may carry less acceptable object boundary information. Remarkable success in LV segmentation has also been reported for deep learning-based methods. Tran [6] employed a deep fully convolutional neural network (FCN) for cardiac MRI segmentation; in that paper, traditional fully connected layers are replaced with convolutional layers to seize a coarse inference map, which is then deconvolved with fractional-stride convolution to yield temporal coherence and discriminative features, obtaining prominent results. Notable as well are methods using U-Net [7]: thanks to skip connections, which allow deep, coarse, semantic feature maps from the encoder to propagate to the corresponding layers in the decoder, the output segmentation map is proved to be more proper. Trinh et al. [3] combined the vanilla U-Net with the Swish activation [8] and the Squeeze-and-Excite block [9] for better parameter tuning and for highlighting the most salient boundary features. Tran et al. [10] formulated the issue as a two-stage problem and built a hybrid algorithm, where the first phase hypothesizes raw segmentation features of all masks with a vanilla U-Net; in the second phase, these features are used as initial level set functions for the multiphase active contour model (MP-ACM), yielding persistently outstanding cardiac MRI contour prediction performance. However, all these previous deep network methods suffer from high computational cost, low inference speed and borderline data insufficiency as a consequence of stride and dilated-convolution defects.

Motivated by these limitations, in this work, we develop and validate a new approach for automatic contour detection in cardiac MR images. Our main contributions could be summarized as follows:
• We propose new block modules to optimize the receptive field on feature maps, aiming to enhance deep features, improve feature detection, and correct occasional misalignment under the sensitive contour-initialization conditions of cardiac MRI segmentation.
• Using the FCN backbone, we replace FCN's convolutional block modules with our proposed modules before training the model end-to-end.
• We show that our new approach achieves state-of-the-art results on the ACDC and SunnyBrook datasets at real-time processing speed with only one training stage, and demonstrate the effectiveness of our notably light-weight model compared to other models.
II. RELATED WORK
Fully Convolutional Neural Network (FCN). FCN has previously been exploited in many proposals and applications. With fully convolutional inference, a convolutional network was exemplified for raw multi-class segmentation of particular tissues [11]. Also within a convolutional network, multi-scale and sliding-window techniques can be constructed for classification, localization and even detection [12]. Evolving from [12], FCN was adopted as a segmentation method [13], which includes FCN-8s, FCN-16s and FCN-32s. FCN-8s and FCN-16s are validated to have more outstanding segmentation performance, as the FCN-32s outcome is rougher due to the loss of spatial location and local information. Furthermore, the FCN in [6] was the first to be employed for pixel-wise labelling, or per-pixel classification, benchmarked on cardiac MRI datasets.
Receptive Field. In neuroscience, a receptive field is a region in the sensory periphery within which stimuli can influence the electrical activity of sensory cells [15]. Similarly, in a deep learning context, the Receptive Field (RF) is basically defined as a measure of the association of an output feature of any layer with the input region [16]. The Inception architecture [17] overcame the hurdles of large variation in location, overfitting and expensive computational cost by putting multiple RF sizes (1 × 1, 3 × 3, 5 × 5) into use; thus the network structure essentially becomes a bit wider rather than deeper. Inception variants [18], [19] factorized convolutions of filter size n × n into a combination of 1 × n and n × 1 convolutions, thereby attaining promising performance on object detection and classification. In the Deformable Convolutional Network (DCN) [20], several RF sizes of deformable filters are correlated with object scales and shapes, indicating that the deformation is efficiently learned from the input image or feature maps. Compared with the larger but fixed dilation value of atrous convolution [21], deformable convolution applies different dilation values to each point in the grid during convolution. In addition, DCN is also reported to be an extremely light-weight Spatial Transformer Network [22]; hence DCN finished as the runner-up in the COCO Detection and Segmentation Challenges.
III. METHODOLOGY
A. The Proposed Block Module
In general, assuming that the input shape is n_h × n_w and the convolutional kernel shape is k_h × k_w, the output shape will be (n_h − k_h + 1) × (n_w − k_w + 1). Accordingly, the output shape of the convolutional layer is affected by the shape of the input and the shape of the convolution kernel. Convolutional kernels with odd height and width values (such as 1, 3, 5, or 7) are frequently utilized, for which the spatial dimensionality can be preserved by padding with the same number of rows on top and bottom, and the same number of columns on left and right. This usage of odd kernels with padding that precisely preserves dimensionality has a useful property: for any two-dimensional tensor X, when the kernel size is odd and the number of padding rows and columns on all sides is the same, so that the output has the same height and width as the input, the output y[i, j] is determined by cross-correlation of the input and the convolutional kernel with the window centered on x[i, j] [23].
The question is: how do we decide the most optimized kernels for the convolutional operation? Salient parts of the input image can vary substantially in shape and size. Owing to this, adopting the precise kernel size for the convolutional operation becomes a tough issue: a smaller kernel is suited to locally distributed information, while a larger one is preferred for more global features. A very deep network makes it hard for the gradient to be passed through the whole network during updates; also, naively stacking large convolutional operations is computationally expensive. Consequently, convolutional filters of several shapes on the same level need to be taken into account.
As noted above, an n × n convolution can be performed as the integration of asymmetric 1 × n and n × 1 convolutions. For instance, a 3 × 3 convolution is equivalent to a 1 × 3 convolution followed by a 3 × 1 convolution, and the factorized form is about 33% cheaper computationally. Thus, instead of employing two consecutive 3 × 3, 5 × 5, 7 × 7 and 9 × 9 operations, we consider mergers of 1 × 3 and 3 × 1, 1 × 5 and 5 × 1, 1 × 7 and 7 × 1, and 1 × 9 and 9 × 1 respectively, applying asymmetric convolution operations.
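As an illustration, the factorization can be sketched as follows (a minimal PyTorch example with an assumed channel count; the 2n weights of the factor pair replace the n² weights of the square kernel):

```python
import torch.nn as nn

def asymmetric_conv(channels: int, n: int) -> nn.Sequential:
    """Replace an n x n convolution by a 1 x n then an n x 1 convolution."""
    return nn.Sequential(
        nn.Conv2d(channels, channels, kernel_size=(1, n), padding=(0, n // 2)),
        nn.Conv2d(channels, channels, kernel_size=(n, 1), padding=(n // 2, 0)),
    )

# For n = 3: 2 * 3 = 6 weights per channel pair instead of 3 * 3 = 9,
# i.e. roughly 33% cheaper, with the same 3 x 3 receptive field.
block = asymmetric_conv(channels=16, n=3)
```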
One primary obstacle with the above modules is that even a modest number of 5 × 5 and 7 × 7 convolutions can be prohibitively expensive on top of a convolutional layer with a large number of filters. A 1 × 1 convolutional layer is judiciously exploited to overcome this issue, as it offers channel-wise pooling, also called feature map pooling or a projection layer. Such a simple technique with low-dimensional embeddings can be utilized for dimensionality reduction while retaining salient features, generating a one-to-one projection of a stack of feature maps to pool features across channels after conventional pooling layers.
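A short sketch of this projection layer (the channel counts are assumed for the example):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 256, 64, 64)          # a wide stack of feature maps

# A 1 x 1 convolution acts as channel-wise pooling: a per-pixel linear
# projection shrinking 256 channels to 64 before any costly n x n kernels,
# leaving height and width untouched.
project = nn.Conv2d(256, 64, kernel_size=1)
print(project(x).shape)                   # torch.Size([1, 64, 64, 64])
```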
To obtain promising performance on MRI segmentation, convolutional procedures have to analyze both globally and locally distributed features. If familiar discrete convolutions are applied throughout the network architecture, it becomes necessary to assign large kernels in order to achieve a global view, which makes the parameter count surge. Therefore, we adopt dilated convolutions [24] in the asymmetric convolution layers, as they support exponential RF expansion without loss of resolution while the number of parameters increases only linearly, to control the features' eccentricities.
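The trade-off can be sketched as follows: a 3 × 3 kernel with dilation rate d covers a (2d + 1) × (2d + 1) window with the same nine weights, so the receptive field grows while the parameter count does not (illustrative PyTorch code):

```python
import torch.nn as nn

# Both layers hold 3 x 3 = 9 weights per channel pair, but the dilated one
# spans a 5 x 5 window: a larger receptive field at no extra parameter cost.
dense   = nn.Conv2d(16, 16, kernel_size=3, padding=1, dilation=1)
dilated = nn.Conv2d(16, 16, kernel_size=3, padding=2, dilation=2)

n_dense   = sum(p.numel() for p in dense.parameters())
n_dilated = sum(p.numel() for p in dilated.parameters())
assert n_dense == n_dilated  # identical parameter counts
```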
Between two consecutive layers, an activation function and a Mean-Variance Normalization (MVN) layer [6] are used to tame the pixel distribution shift right after a convolutional operation. Compared with Batch Normalization [14], which likewise reduces internal covariate shift and accelerates gradient flow through the network, MVN remains effective while being much simpler, as it primarily centers and standardizes a single batch at a time. As for the activation, instead of ReLU, Swish [8] is selected as the only activation function throughout training.
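A minimal sketch of the two components, following the definitions in [6] and [8] (illustrative code; the exact placement follows Fig. 1):

```python
import torch
import torch.nn as nn

class MVN(nn.Module):
    """Mean-Variance Normalization [6]: center each feature map to zero
    mean and scale it to unit variance, one sample at a time."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mean = x.mean(dim=(2, 3), keepdim=True)
        std = x.std(dim=(2, 3), keepdim=True)
        return (x - mean) / (std + 1e-9)

def swish(x: torch.Tensor) -> torch.Tensor:
    """Swish activation [8]: x * sigmoid(x)."""
    return x * torch.sigmoid(x)
```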
Overall, our proposed block modules are briefly described in Fig. 1. We create 4 block modules. In block module i (i ∈ [1, 4]), there are 4 convolutional blocks whose kernel sizes are (1, 1), (1, n), (n, 1) and (n, n) respectively, with n = 2i + 1. All dilation rates are set equal to (1, 1) except for the last one, which is (n − 2, n − 2).

Fig. 1. Our proposed block module (n = 2i + 1; i ∈ [1, 4]).
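In code, block module i could be sketched as below: a hedged reading of Fig. 1 that assumes the four branches act on the same input and are fused by channel concatenation (the exact fusion in the module may differ from this sketch):

```python
import torch
import torch.nn as nn

class BlockModule(nn.Module):
    """Sketch of block module i (n = 2i + 1, i in [1, 4]): four convolutions
    with kernel sizes (1,1), (1,n), (n,1), (n,n); all dilation rates are
    (1,1) except the (n,n) branch, which uses (n-2, n-2)."""
    def __init__(self, in_ch: int, out_ch: int, i: int):
        super().__init__()
        n = 2 * i + 1
        d = n - 2
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, kernel_size=1),
            nn.Conv2d(in_ch, out_ch, kernel_size=(1, n), padding=(0, n // 2)),
            nn.Conv2d(in_ch, out_ch, kernel_size=(n, 1), padding=(n // 2, 0)),
            nn.Conv2d(in_ch, out_ch, kernel_size=n,
                      padding=(n // 2) * d, dilation=d),
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumed fusion: concatenate the four receptive-field views,
        # giving 4 * out_ch output channels.
        return torch.cat([branch(x) for branch in self.branches], dim=1)
```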
B. Model Architecture
Assembling the new module onto the FCN backbone [6], we construct an advanced one-stage segmentation model. Even with such a lightweight module, our modified FCN delivers results comparable to some state-of-the-art approaches while retaining a very fast speed. Our new 16-layer model architecture is displayed in Fig. 2.

Fig. 2. Our modified FCN based on the new receptive field block module in Fig. 1.
C. Loss Function
Sigmoid activation is employed to form the loss and the binary image from the output layer, which contains c = 2 labels standing for the total number of classes into which each pixel is to be classified. Denote by N the total number of pixels in both prediction and ground truth, and let P and L be the predicted set and the ground truth set respectively, with |P| = |L| = N. P_ic and L_ic are the elements of P and L in order, with i ∈ {1, 2, ..., N} and c ∈ {0, 1}; P_ic ∈ [0, 1] and L_ic ∈ {0, 1} denote the predicted label probability and the ground truth label respectively.
Dice Loss is the loss function used in the training process; it is calculated as:

$$\text{Dice Loss} = \frac{\sum_{i=1}^{N}\left(P_{ic} + L_{ic} - 2P_{ic}L_{ic}\right)}{\sum_{i=1}^{N}\left(P_{ic} + L_{ic}\right) + \epsilon} \quad (1)$$
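Eq. (1) translates directly into code; a minimal sketch, assuming predictions and ground truth arrive as same-shaped tensors of per-pixel values:

```python
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor,
              eps: float = 1e-15) -> torch.Tensor:
    """Dice Loss of Eq. (1); pred holds probabilities in [0, 1],
    target holds ground-truth labels in {0, 1}."""
    pred, target = pred.flatten(), target.flatten()
    numerator = (pred + target).sum() - 2.0 * (pred * target).sum()
    denominator = (pred + target).sum() + eps
    return numerator / denominator
```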
D. Evaluation Metrics
In medical image analysis, the Dice Similarity Coefficient (DSC) is a statistical tool for measuring the similarity between segmentation maps, while the Intersection over Union index (IoU) is a statistical tool for gauging the similarity and diversity of sample pixel sets. They are determined by:

$$DSC(P, L) = \frac{\sum_{i=1}^{N} 2P_{ic}L_{ic} + \epsilon}{\sum_{i=1}^{N}\left(P_{ic} + L_{ic}\right) + \epsilon} \quad (2)$$

$$IoU(P, L) = \frac{\sum_{i=1}^{N} P_{ic}L_{ic} + \epsilon}{\sum_{i=1}^{N}\left(P_{ic} + L_{ic} - P_{ic}L_{ic}\right) + \epsilon} \quad (3)$$
The smoothing coefficient ϵ is provided to prevent division by zero; in our experiments we set ϵ = 1e−15.

In a diagnostic test, sensitivity (SEN) is the rate of people diagnosed as positive among those who actually have the condition, while specificity (SPE) is the rate of people diagnosed as negative among those who do not. Mathematically, with TP, TN, FP and FN denoting the numbers of true positives, true negatives, false positives and false negatives:

$$SEN = \frac{TP}{TP + FN} \quad (4)$$

$$SPE = \frac{TN}{TN + FP} \quad (5)$$

Accuracy (ACC) is also used as a statistical measure of how well a binary classification test correctly identifies or excludes a condition:

$$ACC = \frac{TP + TN}{TP + TN + FP + FN} \quad (6)$$
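For reference, Eqs. (2)-(6) can be computed together in a few lines (an illustrative sketch; the confusion-matrix counts assume binarized predictions):

```python
import torch

def evaluate(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-15):
    """DSC, IoU, SEN, SPE and ACC of Eqs. (2)-(6) for one binary mask."""
    p, l = pred.flatten(), target.flatten()
    tp = (p * l).sum()                 # true positives
    tn = ((1 - p) * (1 - l)).sum()     # true negatives
    fp = (p * (1 - l)).sum()           # false positives
    fn = ((1 - p) * l).sum()           # false negatives
    dsc = (2 * tp + eps) / (p.sum() + l.sum() + eps)
    iou = (tp + eps) / (p.sum() + l.sum() - tp + eps)
    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    acc = (tp + tn) / (tp + tn + fp + fn)
    return dsc, iou, sen, spe, acc
```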
IV. EXPERIMENTAL RESULTS
A. Benchmarks
The Sunnybrook dataset [25] (2009 Cardiac MR Left Ventricle Segmentation Challenge data) contains 45 cine-MRI images comprising a mix of patients and pathologies: healthy, hypertrophy, heart failure with infarction and heart failure without infarction. While a part of this dataset was first used in the automated myocardium segmentation challenge from short-axis MRI held by a MICCAI workshop in 2009, the entire dataset is now available in the CAP database under a public domain license. There are three subsets of 15 cases each, training, validation and online, each with ground truth. The training set is utilized to train the model for LV endocardium and epicardium segmentation, while the validation and online sets are used for evaluation.
Released by the "Automatic Cardiac Diagnosis Challenge (ACDC)" workshop held in conjunction with the 20th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), the public ACDC dataset [26] comprises 100 patient 4D cine-CMR scans, each with segmentation masks for the left ventricle (LV), the myocardium (Myo) and the right ventricle (RV) at the end-systolic (ES) and end-diastolic (ED) phases of each patient. The training database containing the manual segmentation masks is separated into a training set and a test set with ratio 8:2 to assess the proposed model, before comparing the performance of several automatic methods on segmentation of the left ventricular endocardium and epicardium for both end-diastolic and end-systolic phase instances.
B. Training
We apply our customized FCN to segment cardiac left ventricles in MRIs. Note that for each segmentation task on each dataset, we break the task down into two sub-tasks, for endocardium and epicardium contour prediction respectively. Our models are trained end-to-end; cost minimization over various numbers of epochs (depending on the case) is performed with the NADAM optimizer [27] with an initial learning rate of 0.001. The learning rate is multiplied by 0.5 every 10 epochs until it reaches 0.00001, after which it is kept constant for the remainder of training. All images are rigidly pre-processed by center cropping and normalization, then augmented by random horizontal and vertical flips and rotations within π radians, before being trained on an NVIDIA Tesla P100 16GB GPU.
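The schedule above can be sketched as follows (illustrative PyTorch; the stand-in model and epoch count are placeholders, not our actual configuration):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(1, 2, kernel_size=3, padding=1)  # stand-in for the FCN
optimizer = torch.optim.NAdam(model.parameters(), lr=1e-3)

def lr_at_epoch(epoch: int) -> float:
    """Initial rate 0.001, halved every 10 epochs, floored at 0.00001."""
    return max(1e-3 * 0.5 ** (epoch // 10), 1e-5)

for epoch in range(100):
    for group in optimizer.param_groups:
        group["lr"] = lr_at_epoch(epoch)
    # ... one training epoch over the cropped, normalized, augmented MRIs ...
```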
C. Evaluation
TABLE I
ENDOCARDIUM OF SUNNYBROOK DATASET

Method        Params   DSC      IoU      SEN      SPE      ACC
Trinh [3]     32.4M    0.7850   0.6996   0.8991   0.9920   0.9746
FCN [1] [6]   10.9M    0.9156   0.8537   0.9336   0.9934   0.9866
U-Net [7]     31.0M    0.6123   0.5438   0.7294   0.9914   0.9731
Ours          9.1M     0.9196   0.8580   0.9314   0.9942   0.9870

[1] w/o finetune and Xavier unit.

TABLE II
EPICARDIUM OF SUNNYBROOK DATASET

Method        Params   DSC      IoU      SEN      SPE      ACC
Trinh [3]     32.4M    0.8620   0.7940   0.9277   0.9843   0.9726
FCN [1] [6]   10.9M    0.9466   0.9009   0.9542   0.9886   0.9814
U-Net [7]     31.0M    0.6896   0.6347   0.7150   0.9921   0.9568
Ours          9.1M     0.9505   0.9100   0.9507   0.9910   0.9826

[1] w/o finetune and Xavier unit.
We compare our algorithm with several others to evaluate the efficiency of our new approach and report the mean value of each metric in four tables. The ablation results recorded in Table I and Table II show that our proposed method is the best on endocardium segmentation of the SunnyBrook dataset according to the test measurements of DSC (0.9196), IoU (0.8580), SPE (0.9942) and ACC (0.9870), and likewise on epicardium segmentation of the SunnyBrook dataset with DSC (0.9505), IoU (0.9100), SPE (0.9910) and ACC (0.9826). This indicates that our proposed method is more confident in the predicted regions and provides more accurate boundary maps, even with an exceptionally light-weight backbone (9.1 million trainable parameters), compared with Tran et al. [10] (a two-stage architecture) and other popular architectures.
TABLE III
ENDOCARDIUM OF ACDC DATASET

Method        Params   DSC      IoU      SEN      SPE      ACC
FCN [1] [6]   10.9M    0.8576   0.8047   0.9272   0.9810   0.9798
Trinh [3]     32.4M    0.9253   0.8964   0.9661   0.9966   0.9940
SegNet [28]   29.5M    0.8192   0.7470   0.9300   0.9829   0.9783
U-Net [7]     31.0M    0.8810   0.8225   0.9504   0.9909   0.9920
Ours          9.1M     0.8798   0.8360   0.9512   0.9852   0.9823

[1] w/o finetune and Xavier unit. [2] with finetune and Xavier unit.

TABLE IV
EPICARDIUM OF ACDC DATASET

Method        Params   DSC      IoU      SEN      SPE      ACC
FCN [1] [6]   10.9M    0.9167   0.8803   0.9631   0.9947   0.9918
SegNet [28]   29.5M    0.8896   0.8321   0.9526   0.9942   0.9907
U-Net [7]     31.0M    0.9189   0.8712   0.9618   0.9945   0.9918
Ours          9.1M     0.9273   0.8978   0.9656   0.9969   0.9943

[1] w/o finetune and Xavier unit. [2] with finetune and Xavier unit.
To present a fairer verification of our proposed approach, the corresponding outcomes of ours and five different approaches are displayed in Table III and Table IV on the ACDC benchmark. From Table IV, our proposed method outperforms the other compared methods on epicardium segmentation of the ACDC dataset according to the test measurements of DSC (0.9273), IoU (0.8978), SPE (0.9969) and ACC (0.9943), which is very comparable to some of the state-of-the-art approaches. Nevertheless, with regard to the endocardium segmentation in Table III, Trinh et al. [3] obtained the best results with a Dice score of 0.9253, while our result stays modest with a Dice score of 0.8798.
D. Representative Results
In this section, we present qualitative results produced by our proposed model for difficult input cases. As indicated in Figs. 3-5, our simple model can achieve better results than some existing methods, which verifies that the strength and light weight of our proposed method yield quite comparable performance on inputs with high complexity.
From the results in the figures and tables, we make two further observations:
• U-Net-based networks seem to perform worse than FCN-based networks on the SunnyBrook dataset; in some particular cases the vanilla U-Net could not provide an acceptable prediction, as Fig. 3 and Fig. 4 illustrate.
• In contrast, on the ACDC dataset FCN-based methods seem to have more mediocre results than U-Net-based methods. However, our FCN-based performance on the epicardium of the ACDC dataset verifies our approach's ability to provide comparable results on two different datasets (which very few algorithms, and no light-weight models, could do), as shown in Fig. 5, where our light-weight network still outperforms the vanilla U-Net on one side of the ACDC dataset segmentation.

Fig. 3. Endocardium contour prediction on the SunnyBrook dataset between different models in some complex cases.
Fig. 4. Epicardium contour prediction on the SunnyBrook dataset between different models in some complex cases.
Fig. 5. Epicardium contour prediction on the ACDC dataset between different models in some complex cases.
V. CONCLUSION
In this paper, we have introduced a new light-weight deep-learning framework for cardiac MRI contour prediction. Our experimental outcomes evidently show that our new model achieves prominent performance on the two popular datasets SunnyBrook and ACDC. These are, however, rather small datasets by deep learning standards. Therefore, in the future we will devote more effort to complex MRI segmentation datasets, to minimize the segmentation errors of deep-learning approaches and to maximize the speed, efficiency and reliability of the deep-learning network.
ACKNOWLEDGEMENT
This research is funded by the Hanoi University of Science and Technology (HUST) under project number T2021-PC-005. Minh-Nhat Trinh was funded by Vingroup JSC and supported by the Master, PhD Scholarship Programme of Vingroup Innovation Foundation (VINIF), Institute of Big Data, code VINIF.2021.ThS.33.
REFERENCES
[1] C. Miller, K. Pearce, P. Jordan, R. Argyle, D. Clark, M. Stout, S. Ray and M. Schmitt, "Comparison of real-time three-dimensional echocardiography with cardiovascular magnetic resonance for left ventricular volumetric assessment in unselected patients," European Heart Journal Cardiovascular Imaging, vol. 13, pp. 187-195, doi: 10.1093/ejechocard/jer248, November 2011.
[2] M. R. Avendi, A. Kheradvar and H. Jafarkhani, "A Combined Deep-Learning and Deformable-Model Approach to Fully Automatic Segmentation of the Left Ventricle in Cardiac MRI," arXiv preprint arXiv:1512.07951, 2015.
[3] M. N. Trinh, N. T. Nguyen, T. T. Tran and V. T. Pham, "A Deep Learning-based Approach with Image-driven Active Contour Loss for Medical Image Segmentation," The 2nd International Conference on Data Science and Applications (ICDSA 2021).
[4] M. Lynch, O. Ghita and P. Whelan, "Left-ventricle myocardium segmentation using a coupled level-set with a priori knowledge," Computerized Medical Imaging and Graphics, vol. 30, no. 4, pp. 255-262, doi: 10.1016/j.compmedimag.2006.03.009, July 2006.
[5] V. T. Pham and T. T. Tran, "Active Contour Model and Nonlinear Shape Priors with Application to Left Ventricle Segmentation in Cardiac MR Images," Optik - International Journal for Light and Electron Optics, vol. 127, doi: 10.1016/j.ijleo.2015.10.162, November 2015.
[6] P. V. Tran, "A Fully Convolutional Neural Network for Cardiac Segmentation in Short-Axis MRI," arXiv preprint arXiv:1604.00494, 2017.
[7] O. Ronneberger, P. Fischer and T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," LNCS, vol. 9351, pp. 234-241, doi: 10.1007/978-3-319-24574-4_28, October 2015.
[8] P. Ramachandran, B. Zoph and Q. Le, "Swish: a Self-Gated Activation Function," October 2017.
[9] J. Hu, L. Shen, S. Albanie, G. Sun and E. Wu, "Squeeze-and-Excitation Networks," arXiv preprint arXiv:1709.01507, 2019.
[10] T. T. Tran, T. T. Tran, Q. C. Ninh, M. D. Bui and V. T. Pham, "Segmentation of Left Ventricle in Short-Axis MR Images Based on Fully Convolutional Network and Active Contour Model," pp. 49-59, doi: 10.1007/978-3-030-62324-1_5, October 2020.
[11] F. Ning, D. Delhomme, Y. LeCun, F. Piano, L. Bottou and P. Barbano, "Toward automatic phenotyping of developing embryos from videos," IEEE Transactions on Image Processing, vol. 14, pp. 1360-1371, doi: 10.1109/TIP.2005.852470, October 2005.
[12] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus and Y. LeCun, "OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks," International Conference on Learning Representations (ICLR), Banff, December 2013.
[13] J. Long, E. Shelhamer and T. Darrell, "Fully convolutional networks for semantic segmentation," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3431-3440, doi: 10.1109/CVPR.2015.7298965, June 2015.
[14] S. Ioffe and C. Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift," arXiv preprint arXiv:1502.03167, 2015.
[15] J. B. Levitt, "Receptive field," accessed: 01.10.2021, https://www.britannica.com/science/receptive-field.
[16] N. Adaloglou, "Understanding the receptive field of deep convolutional networks," accessed: 02.07.2020, https://theaisummer.com/receptive-field/.
[17] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke and A. Rabinovich, "Going Deeper with Convolutions," arXiv preprint arXiv:1409.4842, 2014.
[18] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens and Z. Wojna, "Rethinking the Inception Architecture for Computer Vision," arXiv preprint arXiv:1512.00567, 2015.
[19] C. Szegedy, S. Ioffe, V. Vanhoucke and A. Alemi, "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning," arXiv preprint arXiv:1602.07261, 2016.
[20] J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu and Y. Wei, "Deformable Convolutional Networks," arXiv preprint arXiv:1703.06211, 2017.
[21] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy and A. L. Yuille, "DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs," arXiv preprint arXiv:1606.00915, 2017.
[22] S.-H. Tsang, "Review: DCN / DCNv1 — Deformable Convolutional Networks, 2nd Runner Up in 2017 COCO Detection (Object Detection)," accessed: 02.02.2019, https://towardsdatascience.com/review-dcn-deformable-convolutional-networks-2nd-runner-up-in-2017-coco-detection-object-14e488efce44.
[23] A. Zhang, Z. C. Lipton, M. Li and A. J. Smola, "Dive into Deep Learning," https://d2l.ai/chapter_convolutional-neural-networks/padding-and-strides.html.
[24] L.-C. Chen, G. Papandreou, F. Schroff and H. Adam, "Rethinking Atrous Convolution for Semantic Image Segmentation," arXiv preprint arXiv:1706.05587, 2017.
[25] A. Krasnobaev and A. Sozykin, "An Overview of Techniques for Cardiac Left Ventricle Segmentation on Short-Axis MRI," ITM Web of Conferences, vol. 8, p. 01003, doi: 10.1051/itmconf/20160801003, January 2016.
[26] O. Bernard, A. Lalande, C. Zotti, F. Cervenansky, X. Yang, P.-A. Heng, I. Cetin, K. Lekadir, O. Camara, M. A. González Ballester, G. Sanroma, S. Napel, S. Petersen, G. Tziritas, G. Ilias, M. Khened, V. Kollerathu, G. Krishnamurthi, M.-M. Rohé and P.-M. Jodoin, "Deep Learning Techniques for Automatic MRI Cardiac Multi-Structures Segmentation and Diagnosis: Is the Problem Solved?," IEEE Transactions on Medical Imaging, doi: 10.1109/TMI.2018.2837502, May 2018.
[27] T. Dozat, "Incorporating Nesterov Momentum into Adam," 2016.
[28] V. Badrinarayanan, A. Kendall and R. Cipolla, "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation," arXiv preprint arXiv:1511.00561, 2016.
[29] S. Queirós, D. Barbosa, B. Heyde, P. Morais, J. Vilaça, D. Friboulet, O. Bernard and J. D'hooge, "Fast Automatic Myocardial Segmentation in 4D cine CMR datasets," Medical Image Analysis, vol. 18, doi: 10.1016/j.media.2014.06.001, October 2014.
[30] H. Hu, H. Liu, Z. Gao and L. Huang, "Hybrid segmentation of left ventricle in cardiac MRI using Gaussian-mixture model and region restricted dynamic programming," Magnetic Resonance Imaging, vol. 31, doi: 10.1016/j.mri.2012.10.004, December 2012.
[30] Hu, Huaifei and Liu, Haihua and Gao, Zhiyong and Huang, Lu, Hybrid segmentation of left ventricle in cardiac MRI using Gaussian-mixture model and region restricted dynamic programming, Magnetic resonance imaging, vol 31, doi 10.1016/j.mri.2012.10.004, December 2012.