Lung Nodule Classification: Computed Tomography (CT) Scans
ABSTRACT
This paper proposes a novel framework for the classification
of lung nodules using computed tomography (CT) scans. The proposed
framework integrates (i) geometric
shape features, in terms of the construction error of modeled
spherical harmonics (SHs); (ii) an appearance feature, in terms of
Gibbs energy modeled using the 7th-order MGRF; and (iii) a
size feature using the k-nearest neighbor (k-NN) classifier. The
final classification is obtained by using the deep autoencoder neural
networks. The geometric feature is extracted by calculating
the construction error between the original nodule mesh and the
SHs-based constructed ones. To calculate this error curve at each
point, the surface mesh for each nodule is modeled using different
numbers of SHs, from 1 to 70, and the difference is calculated as the error.
Secondly, the appearance feature is modeled using the novel
7th-order Markov-Gibbs random field (MGRF) model, and the size
feature is modeled using a k-NN classifier. Finally, a deep autoencoder
(AE) classifier is applied to distinguish between the malignant and
benign nodules. To evaluate the proposed framework, we used
the publicly available data from the Lung Image Database Consortium
(LIDC). We used a total of 116 nodules that were collected
from 60 patients. By achieving a classification accuracy of
96.00%, the proposed system demonstrates promise to be a valuable
tool for the detection of lung cancer.
Index Terms: CT, 7th-order MGRF, Spherical Harmonics,
Lung nodules
1. INTRODUCTION
Lung cancer is considered the leading cause of cancer death
among both genders in the United States with about 1 out of 4
cancer deaths resulting from lung cancer [1]. Although there are
several imaging modalities used for the diagnosis of lung cancer,
e.g., magnetic resonance imaging (MRI), chest radiograph
(X-ray), and many other modalities, computed tomography (CT)
imaging is the most common and appropriate modality for examining
the lung tissues due to its high resolution and clear contrast
compared to other techniques [2]. Recently, the number of lung
cancer cases has increased exponentially, and early detection
can increase the chance of survival [3]. Furthermore, an automated
assistive tool for radiologists is of great importance in
analyzing the large amount of data available from
CT scans. Thus, computer-aided diagnosis (CADx) systems
are of great interest and high importance. Recently, a plethora of
methods for automated diagnosis of pulmonary nodules in CT
scans have been introduced. Various researchers have used image
processing and data mining techniques to diagnose the pulmonary
nodules. For example, Macedo et al. [4] have proposed the use of
different classifiers, such as the support vector machine (SVM)
and a rule-based system, to distinguish between malignant and
benign lung nodules. They used texture, shape, and appearance
features that were extracted from the histogram of oriented gradients
(HOG) of the region of interest (ROI). Kumar et al. [5]
used deep features extracted from multi-layer autoencoders for
the classification of lung nodules. Although their experiments
demonstrated the effectiveness of extracting high-level features
from the input data, they disregarded morphological information,
e.g., the perimeter, skewness, and circularity of the nodule,
which could not be extracted by conventional deep models.
Jia et al. [6] have proposed a rule-based classification system
based on growth rate changes and a registration technique. Lee et
al. [7] have proposed a lung nodule classification system using
a random forest classifier aided by clustering. First, they merged
all the data, divided it into two clusters, and then divided
each cluster into two groups, nodule and non-nodule, based on the
training set labels. Finally, a random forest classifier was trained
for each cluster to distinguish between benign and malignant
nodules. Farahani et al. [8] have proposed an ensemble-based
system to classify each pulmonary nodule by integrating multiple
classifiers such as SVM, k-nearest neighbors, and neural networks.
The classifiers were trained on five morphological features, and
their outputs are combined using majority voting.
Huang et al. [9] have proposed a system to differentiate malignant
from benign pulmonary nodules based on fractal texture features
from a fractional Brownian motion (FBM) model, using an SVM.
Elsayed et al. [10] have proposed a system that uses different
classifiers, e.g., Linear, Quadratic, Parzen, Neural Networks, and
their different combinations such as mean, median, maximum,
minimum, and voting, to enhance the performance of the classification
of malignant and benign pulmonary nodules. Kim et
al. [11] have proposed a system which uses a deep neural network
to extract abstract information inherent in raw hand-crafted imaging
features. Then, the learned representation is used with the
raw imaging features to train the classifier. Narayanan et al. [12]
also used a deep neural network to classify the pulmonary nodules
after training on morphological features. Bayanati et al. [13] tried
to identify which features on CT images could differentiate between
malignant and benign nodules. They used both texture and
shape analysis features and found an improvement in accuracy
but no significant change in the false positives.
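The combination rules mentioned above for multi-classifier systems (mean,
median, maximum, minimum, and majority voting, as in [8] and [10]) can be
illustrated with a minimal sketch. The function name combine_scores, the 0.5
decision threshold, the tie-breaking toward "malignant", and the example scores
are assumptions made for illustration; they are not taken from the cited works.

```python
from collections import Counter
from statistics import mean, median


def combine_scores(scores, rule="mean", threshold=0.5):
    """Fuse per-classifier malignancy probabilities for a single nodule.

    scores:    one probability in [0, 1] per base classifier (hypothetical input).
    rule:      "mean", "median", "max", "min", or "vote".
    threshold: assumed decision threshold applied to a probability.
    """
    if not scores:
        raise ValueError("at least one classifier score is required")

    if rule == "vote":
        # Each classifier casts one hard vote; the majority label wins.
        votes = Counter(
            "malignant" if s >= threshold else "benign" for s in scores
        )
        # Assumption: a tie is resolved toward "malignant".
        return "malignant" if votes["malignant"] >= votes["benign"] else "benign"

    # Soft rules: reduce the probabilities to one value, then threshold it.
    reducers = {"mean": mean, "median": median, "max": max, "min": min}
    if rule not in reducers:
        raise ValueError(f"unknown rule: {rule}")
    fused = reducers[rule](scores)
    return "malignant" if fused >= threshold else "benign"


# Hypothetical outputs of three base classifiers for one nodule.
example = [0.9, 0.4, 0.45]
results = {r: combine_scores(example, r) for r in ("mean", "median", "max", "min", "vote")}
```

For these example scores the rules disagree: the mean and maximum rules label
the nodule malignant, while the median, minimum, and voting rules label it
benign, so the choice of combiner can change the final decision.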
The existing methods for the classification of lung nodules have