Exploring the application of artificial intelligence techniques in data analysis for cancer detection: a systematic analysis

DOCUMENT INFORMATION

Basic information

Title: Exploring the application of artificial intelligence techniques in data analysis for cancer detection: a systematic analysis
Author: Thanh Nhat Hoang
Supervisors: Ph.D. Dinh-Toi Chu, MSc. Hue Vu Thi
School: Vietnam National University, Hanoi International School
Major: Business Data Analytics
Type: Graduation project
Year of publication: 2024
City: Hanoi
Format
Number of pages: 92
File size: 2.09 MB

Structure

  • CHAPTER 1. LITERATURE REVIEW
    • 1.1. Artificial intelligence (AI) technology and applications in health
    • 1.2. Cancer, cancer data analysis, and the limitations
    • 1.3. AI technology aids in the detection and treatment of cancer
      • 1.3.1. AI Algorithms and Models in Medical Data and Cancer Simulation
      • 1.3.2. Used in diagnosis to detect early signs of cancer
      • 1.3.3. Training Data
    • 1.4. Systematic Review Method for Medical and Cancer Data Analysis
      • 1.4.1. Systematic review analysis in medicine
      • 1.4.2. Systematic review analysis in the field of cancer
      • 1.4.3. Reasons for conducting a systematic review study
      • 1.4.4. Process of Systematic Review
  • CHAPTER 2. METHOD
    • 2.1. Research Content
    • 2.2. Research Method
      • 2.2.1. Research Design
      • 2.2.2. Study Subjects
      • 2.2.3. Selection and exclusion standards
      • 2.2.4. Data sources and techniques for searching
      • 2.2.5. Research screening and selection process
    • 2.3. Evaluation Metrics
  • CHAPTER 3. RESULT
    • 3.1. AI in Detection of Cancer: A Systematic Review
      • 3.1.1. Search Results
      • 3.1.2. Characteristics of research subjects
    • 3.2. Effectiveness of AI in Early Cancer Detection
      • 3.2.1. AI Application in Cancer Detection Using Existing Models
      • 3.2.2. AI Detection in Cancer: Analysis of DL and NN Approaches
      • 3.2.3. AI Detection in Cancer: Analysis of Hybrid Approaches
    • 3.3. AI model Advantages and Disadvantages
      • 3.3.1. Advantages and disadvantages of Existing AI Models
      • 3.3.2. Advantages and disadvantages of DL and NN model
      • 3.3.3. Advantages and disadvantages of the Hybrid approach
    • 3.4. Additional research potential on the use of AI in oncology
  • CHAPTER 4. CONCLUSION AND RECOMMENDATIONS
    • 4.1. Conclusion
    • 4.2. Recommendation

Content

LITERATURE REVIEW

Artificial intelligence (AI) technology and applications in health

Artificial Intelligence (AI) refers to advanced techniques that use computers and technology to simulate human behavior, thinking, and intelligence, with the aim of enhancing human capabilities across many aspects of life. The concept was first introduced by Alan Turing in 1950 through the "Turing Test," which explored the relationship between intelligent computer behavior and human cognition. Turing envisioned AI as a more complex version of the human brain. The term "AI" was officially coined by John McCarthy in 1956 during a conference dedicated to the subject. Since then, AI has evolved significantly, branching into areas such as machine learning and deep learning, driven by global interest and research. Today, AI is widely implemented in applications including personal virtual assistants, automated systems, and software, as well as in critical fields such as disease diagnosis and treatment.

AI is revolutionizing the medical field by enhancing disease diagnosis, healthcare, and treatment through technologies such as artificial neural networks and hybrid intelligence systems. It can be divided into two main parts: virtual components, such as healthcare software and machine learning algorithms, and physical components, such as robots and smart medical devices. AI's applications are broad, spanning disease diagnosis, treatment support, administrative tasks, and patient record management. As data sources in healthcare become increasingly abundant, the role of AI is expected to grow, highlighting its significance in augmenting human capabilities. However, this rapid integration also raises ethical and social concerns, particularly regarding information security and patient safety, making effective management of AI crucial to uphold research integrity and protect patient privacy.

The application of artificial intelligence in cancer diagnosis is a major focus for researchers and medical professionals, offering significant potential to enhance diagnostic accuracy. By enabling early detection of cancer, AI technology plays a crucial role in improving patient survival rates.

The rapid advancement of AI has spurred significant research into its applications in cancer diagnosis and treatment, resulting in improved efficiency and accuracy. AI technologies, particularly those based on deep learning and machine learning, are now employed for early detection of various cancers, including breast and cervical cancer. These innovations optimize diagnostic processes and reduce the time required for disease identification compared with traditional methods. AI applications not only facilitate early diagnosis but also assist in assessing disease conditions, enabling healthcare professionals to deliver timely and appropriate treatments. However, the effective implementation of AI in cancer diagnosis requires access to extensive data for thorough analysis and evaluation, alongside careful consideration of ethical implications. Overall, AI holds great promise for enhancing cancer diagnosis and is poised for further advancement.

Cancer, cancer data analysis, and the limitations

Traditionally, data on cancer were generated from common imaging techniques (radiological scans, ultrasound, computerized scans) [11], histological tests and stains [12], and analysis of bodily fluids (blood, urine, liquid biopsy), which tests for tumor signals. Cytology techniques are employed to identify characteristics that suggest tumor presence, supported by text-based medical records. Jiang et al. identified five key categories of oncologic data, including imaging data, phenotypic analysis, molecular interactions, and textual information.

Recent advancements in cancer databases have incorporated "omics" data, which offer crucial molecular insights for cancer research and detection, including genomics, epigenomics, proteomics, transcriptomics, and metabolomics. These developments, coupled with improvements in testing methodologies, data science, and storage capabilities, have led to the generation of vast amounts of "big data" that cannot feasibly be analyzed manually. Additionally, human interpreters often struggle with accuracy and attention to detail when engaged in long hours of repetitive tasks.

AI is increasingly used to analyze vast oncologic datasets and to support drug development and dosage management. Learning models can be trained on annotated datasets using complex neural networks and deep learning algorithms, streamlining tasks such as oncologic image interpretation while also enabling functions such as cancer risk prediction. The adoption of AI in medicine gained momentum around 2017 with the FDA's approval of CardioAI for radiological cardiac image processing. Since then, AI tools have advanced rapidly, with ongoing development focused on improving sensitivity and specificity.

The integration of AI in cancer diagnosis and treatment remains contentious: some studies suggest that AI performance matches that of human clinicians, while others argue that AI should play only a supportive role in clinical settings. Until a clear consensus is established, the place of AI in healthcare remains ambiguous. The introduction of AI also raises concerns about liability in cases of misdiagnosis and may foster distrust among certain patient groups. Finally, data security poses a significant risk: although there have been no major data breaches to date, the potential for future incidents remains a concern.

While data analysis and AI tools hold promise for the future of medical diagnosis, they face significant challenges, including the need for complete and accurate data to avoid biased results, which is critical in clinical settings. Unlike human physicians, who can filter and recheck information, machine learning models lack this ability, making them vulnerable to errors from unorganized or incomplete data. The heterogeneity of data, such as variations in tumor images, complicates analysis further. To build trust among professionals and the public, it is essential to rigorously test, verify, and secure these models and algorithms, as any shortcomings could lead to a loss of confidence that may take years to restore.

AI technology aids in the detection and treatment of cancer

1.3.1 AI Algorithms and Models in Medical Data and Cancer Simulation

AI algorithms are revolutionizing medical imaging by enabling accurate analysis of medical data and images. Key types of AI technologies used in this field include machine learning and deep learning neural networks.

Machine learning algorithms play a crucial role in the medical field by identifying and diagnosing conditions such as tumors, lesions, and other abnormalities. These algorithms analyze medical images, including MRI and CT scans, to assess disease progression, treatment effectiveness, and patient prognosis. Notably, one study demonstrated the successful application of machine learning in detecting breast cancer from mammograms and diagnosing lung cancer from chest X-ray images. The same study also showed that machine learning can segment brain tumors from MRI images, highlighting its potential to enhance diagnostic accuracy in healthcare.

Deep learning algorithms play a crucial role in diagnosing medical conditions such as tumors and anatomical irregularities, while also assessing disease progression and treatment responses. They have been effectively applied to histopathological images for cell type classification and for identifying cancer cells. Notably, a study demonstrated that CT scans could predict high PD-L1 expression in non-small cell lung cancer, indicating the presence of cancer cells. This approach has proven effective in multiple cancer diagnosis tasks, including classification, reconstruction, detection, segmentation, alignment, and generation of medical images.

Convolutional Neural Networks (CNNs) play a vital role in medical image analysis by automatically identifying key features in complex images, facilitating the accurate detection of abnormalities, tumor diagnosis, and organ segmentation. Research demonstrates that CNNs excel in various medical imaging tasks, including image classification, segmentation, and object identification, often outperforming human experts. Their effectiveness has been particularly noted in analyzing medical images related to brain, lung, and breast cancers, achieving high accuracy in assessing severity and comprehensiveness. CNNs are also the predominant architecture in AI-driven medical image analysis for diagnosing a range of diseases, including cancer, Alzheimer's, and cardiovascular conditions.

1.3.2 Used in diagnosis to detect early signs of cancer

AI is revolutionizing cancer diagnosis and treatment, showing significant potential in predicting liver cancer recurrence through methods such as CS-SVM. This article explores the application of AI techniques in cancer detection, particularly for individuals at high risk who may not meet current screening criteria. AI algorithms analyze medical records and images, enhancing predictive accuracy. Notably, a January 2020 study described an AI system based on Google DeepMind that surpassed experts in breast cancer detection. Deep neural networks are also being used to identify and classify lung nodules, distinguishing between benign and malignant cases. This application of deep learning, which mimics human decision-making through multi-layer neural networks, is paving the way for automatic classification models in lung cancer detection. Overall, the integration of AI in cancer research holds great promise for future advancements and standardization.

The application of AI in cancer diagnosis is a promising field that is worthy of further research and development [30].

1.3.3 Training Data

Training data plays a vital role in the development of AI models for medical data analysis, medical imaging, and cancer cell simulation. To create accurate and reliable AI models, it is essential to ensure diversity and representation within data sets. Big data analytics tools, particularly those that integrate cloud-based mobile computing, can support the development of interconnected healthcare systems and yield valuable insights into patient data. High-quality medical datasets are crucial for training machine learning algorithms to predict health outcomes effectively, and maintaining data quality is key to achieving accurate results. Addressing privacy concerns and ethical considerations is also imperative when developing AI and machine learning models from medical datasets. Synthetic patient data can further be used to simulate machine learning-enabled health systems, ultimately improving the effectiveness of healthcare delivery.

Data-driven simulations can also be used to train AI-based segmentation models for medical imaging, ensuring accurate automation of data processing [36].

Systematic Review Method for Medical and Cancer Data Analysis

1.4.1 Systematic review analysis in medicine

A systematic review in medicine is a comprehensive study that compiles all relevant empirical data meeting specific eligibility criteria in order to address a defined research question. The process begins with a meticulous literature search, followed by a critical assessment and synthesis of the findings from the selected studies. These reviews are considered high-quality sources of evidence, particularly valuable for shaping clinical guidelines and healthcare decisions, because they aim to minimize bias through structured methodologies.

1.4.2 Systematic review analysis in the field of cancer

AI is revolutionizing cancer research and personalized treatment by leveraging high-dimensional datasets, advanced computing power, and innovative deep learning methods. Its applications encompass cancer detection and classification, tumor molecular characterization, drug development and repurposing, and patient outcome prediction. AI also enhances early cancer diagnosis by swiftly identifying high-risk factors through pathology, imaging, lab tests, and biomarkers, facilitating timely detection and screening of tumors. In addition, AI can predict treatment prognosis, likelihood of disease recurrence, and survival duration based on various criteria.

1.4.3 Reasons for conducting a systematic review study

In today's fast-paced scientific landscape, the sheer volume of health studies published daily makes it challenging for readers to sift through the information. To evaluate and summarize this research effectively, systematic reviews of individual studies have become essential. However, decision-makers are often faced with a large number of these reviews, which can vary significantly in quality and focus. This has led to the emergence of systematic reviews of reviews, which allow findings to be compared across multiple reviews. By consolidating evidence from various reviews, this methodology gives clinical decision-makers the insights needed to make informed choices in healthcare.

A systematic review is essential for addressing any qualitative research issue, as it allows for a comprehensive investigation and accurate evaluation of existing research findings. According to Petticrew and Roberts (2005), a systematic review involves meticulously locating, reassessing, and synthesizing all relevant works on a specific topic. This process not only enhances the reliability of the research but also provides readers with a thorough overview of the subject matter.

1.4.4 Process of Systematic Review

According to Petticrew and Roberts (2005) [41], a systematic review can be carried out in seven phases. These steps include:

1. Clearly state the hypothesis or research question

2. Determine the kinds of research required to carry out the investigation

3. Carry out an exhaustive literature search to find relevant studies

4. Screen studies to see whether they fit the selection criteria; if not, do additional analysis

5. Reexamine papers critically before incorporating them into the systematic review

6. Compile research and evaluate its consistency

METHOD

Research Content

This study was conducted to analyze the application of AI in cancer data analysis with the goal of early detection.

A systematic review was conducted to evaluate the role of AI in early cancer detection, focusing on existing research and evidence and on identifying gaps in current knowledge. The analysis involved extensive data collection from several databases, followed by a rigorous filtering process. Selected articles were examined to assess the effectiveness of different AI models and methods used in biomedicine, particularly in cancer diagnosis and treatment. This review highlighted significant research gaps in the application of AI for data analysis in early cancer detection, as illustrated in the study design diagram (Figure 2.1).

Research Method

This study adheres to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, which are the most widely recognized standards for conducting systematic reviews. By following PRISMA, researchers can maintain a high level of objectivity and accuracy throughout the systematic review process, from the formulation of research questions through data collection, evaluation, synthesis, and analysis.

This study focuses on publications regarding the application of AI in cancer identification, sourced from two primary databases: PubMed and ScienceDirect. The research methodology followed a systematic process for searching, collecting, and selecting relevant articles.

The following criteria were used to determine whether a study was selected for inclusion in the analysis (PICOTS) [43]:

- P – Population:

• Scientific articles in two databases, PubMed and ScienceDirect

• Full text of the article is available

• Articles on the topic of AI application in cancer data analysis

- I – Intervention: With or without intervention. No restrictions were placed on study design; however, literature reviews, systematic reviews, and meta-analyses were excluded

- C – Comparison: Whether or not there was a comparison group in the study

- O – Outcome:

• Application of AI in data analysis to predict cancer screening

• Application of AI in data analysis to predict cancer prevention

• Application of AI in data analysis to predict cancer treatment

- T – Time: All articles published to date

Exclusion criteria:

- Analytical studies that do not address the application of AI data analysis in cancer detection

- Analytical studies that address methods other than detection

- Review studies, systematic reviews, meta-analyses, etc.

2.2.4 Data sources and techniques for searching

- Keywords in the title and abstract of relevant articles

- Two databases were searched: PubMed and ScienceDirect

- Keywords used for searching: ((((Cancer) AND (Application)) AND (Artificial Intelligence)) AND (Data Analysis)) AND (Detection)

- Zotero software was used to store data during searching. After searching, the data were cited using EndNote and managed using Microsoft Excel
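The search string above can be assembled programmatically so that the parentheses always balance. The following is a minimal sketch, not the authors' tooling: the helper name `build_query` is hypothetical, and the left-nested AND grouping is an assumption about how the five terms were meant to combine.

```python
from functools import reduce


def build_query(terms):
    """Join search terms into a left-nested boolean AND query string.

    Hypothetical helper for illustration; each term is wrapped in its own
    parentheses and then folded left: ((A AND B) AND C) ...
    """
    wrapped = [f"({term})" for term in terms]
    return reduce(lambda acc, term: f"({acc} AND {term})", wrapped)


# The five keywords listed in the search strategy.
terms = [
    "Cancer",
    "Application",
    "Artificial Intelligence",
    "Data Analysis",
    "Detection",
]
query = build_query(terms)
print(query)
```

Because every fold adds exactly one opening and one closing parenthesis, the resulting string is balanced by construction, which avoids the copy-and-paste errors that hand-typed nested queries tend to pick up.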

2.2.5 Research screening and selection process

- The process of filtering and selecting studies for inclusion in the review was as follows:

Step 1: Merge search data, remove duplicate articles, and exclude articles without abstracts

Step 2: Filter based on article titles and abstracts to exclude reports with research topics unrelated to the application of AI data analysis in cancer detection

Step 3: Retrieve full-text articles

Step 4: Filter based on full text: review full-text articles against the inclusion and exclusion criteria

Step 5: Contact authors as needed, to clarify study objectives and possibly request additional information

Step 6: Make final decisions on inclusion and proceed with data collection
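Step 1 of the process above (merge, deduplicate, drop records without abstracts) can be sketched in a few lines. This is an illustration under stated assumptions rather than the workflow actually used in the study: the record fields `title`, `doi`, and `abstract`, the function names, and the sample records with made-up DOIs are all hypothetical, and matching duplicates by DOI or normalized title is one common heuristic among several.

```python
def normalize_title(title):
    """Lowercase a title and keep only letters and digits for matching."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def merge_and_screen(*sources):
    """Merge record lists, drop duplicates, then drop records lacking an abstract."""
    seen = set()
    kept = []
    for records in sources:
        for rec in records:
            # A record is a duplicate if its DOI or its normalized title was seen.
            keys = {("title", normalize_title(rec["title"]))}
            if rec.get("doi"):
                keys.add(("doi", rec["doi"].lower()))
            if keys & seen:
                continue
            seen |= keys
            # Step 1 also excludes articles without abstracts.
            if not (rec.get("abstract") or "").strip():
                continue
            kept.append(rec)
    return kept


# Hypothetical example exports from the two databases (DOIs are invented).
pubmed = [
    {"title": "AI for Breast Cancer Detection", "doi": "10.1000/a", "abstract": "Mammography study."},
    {"title": "Editorial note", "doi": None, "abstract": ""},
]
sciencedirect = [
    {"title": "AI for breast cancer detection.", "doi": "10.1000/A", "abstract": "Mammography study."},
    {"title": "Deep learning for lung nodules", "doi": "10.1000/b", "abstract": "CT study."},
]
kept = merge_and_screen(pubmed, sciencedirect)
print([rec["title"] for rec in kept])
```

In practice a reference manager performs this step, but writing the rule down explicitly makes the deduplication criterion reproducible.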

The selected citations will be rigorously assessed by two independent reviewers, College Student Thai - Van Bui Hoang and MSc. Hue Vu Thi. Any disagreements during the selection process will be resolved through discussion or by involving a third reviewer, Ph.D. Dinh-Toi Chu. The final report will provide a detailed summary of the search outcomes, accompanied by a PRISMA diagram (Figure 3.1) [42].

Evaluation Metrics

In cancer diagnosis and classification, key assessment measures such as accuracy, precision, specificity, sensitivity (recall), Area Under the Curve (AUC), and F-score are frequently used. These metrics play a crucial role in evaluating the effectiveness of diagnostic methods and ensuring accurate cancer detection and classification.

Accuracy is a crucial standard for evaluating models generated through data mining techniques. This metric is calculated by dividing the number of correctly labeled samples by the total number of samples predicted by the model.

Sensitivity, also known as recall, measures a model's effectiveness in correctly identifying positive cases, such as cancer patients. In cancer diagnosis, high sensitivity is essential so that as many individuals with the disease as possible are identified.

• Specificity: Evaluates the model's ability to correctly identify negative cases (patients without cancer). High specificity helps to reduce the number of false positives

• Precision: Indicates the exactness of positive predictions by calculating the proportion of true positives among all positive predictions. High precision is important to minimize the number of false positives

The F-score is a vital metric that combines precision and recall into a single value, offering a balanced evaluation of a model's performance. This metric is especially useful for imbalanced datasets, where one class, such as cancer cases, is significantly less frequent than the other.

• AUC (Area Under the Curve): Represents the area under the Receiver Operating Characteristic (ROC) curve. The ROC curve illustrates the relationship between the true positive rate (sensitivity) and the false positive rate (1 - specificity) across different threshold settings. The AUC is a single scalar value that summarizes the model's overall capacity to differentiate between positive and negative cases.
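The metrics described in this section can be made concrete with a short sketch. The code below is illustrative only: the function names and the example counts and scores are hypothetical, the F-score shown is the F1 variant (harmonic mean of precision and recall), and the AUC is computed with the rank-based formulation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one), which equals the area under the ROC curve. It assumes every denominator is non-zero.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts.

    tp/fp/tn/fn: true positives, false positives, true negatives, false negatives.
    Assumes each denominator below is non-zero.
    """
    sensitivity = tp / (tp + fn)          # recall, true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts: 45 cancer cases (40 detected), 55 controls (45 cleared).
metrics = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
# Hypothetical model scores for two positive and two negative cases.
auc = auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(metrics, auc)
```

With these example counts, accuracy is 85/100 while sensitivity is 40/45 and specificity is 45/55, which shows why a single accuracy figure can hide how a model treats each class.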

RESULT

AI in Detection of Cancer: A Systematic Review

The document filtering and evaluation process began by searching PubMed and ScienceDirect with specific keywords, yielding 2,160 articles. After removing duplicates and articles lacking abstracts, 2,066 articles remained. A second screening excluded 1,479 articles that were unrelated to the research topic, non-English, or did not meet established criteria, leaving 587 articles for full-text review. Applying the PICOTS criteria during this review resulted in the exclusion of 529 articles that did not address the application of AI in cancer prediction data analysis. Ultimately, 58 articles that fulfilled all criteria were selected for further analysis.
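The screening counts reported above can be checked with simple arithmetic. The sketch below uses only the figures stated in the text; the variable names are illustrative, and the number of records removed in the first step (duplicates and records without abstracts) is derived here rather than reported in the source.

```python
# Counts stated in the screening description.
identified = 2160               # records returned by the database search
after_dedup = 2066              # after removing duplicates and no-abstract records
excluded_title_abstract = 1479  # excluded at title/abstract screening
excluded_full_text = 529        # excluded at full-text review

# Derived quantities for each stage of the flow.
removed_step1 = identified - after_dedup
full_text_reviewed = after_dedup - excluded_title_abstract
included = full_text_reviewed - excluded_full_text

print(removed_step1, full_text_reviewed, included)
```

The derived values agree with the 587 full-text articles and 58 included studies reported in the text, so the flow is internally consistent.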

Figure 3.1 Search results and data screening based on PRISMA diagram

Table 3.1 presents a comprehensive overview of the 58 selected studies, detailing key features such as the study title, publication year, country of origin, type of illness investigated, data types used, main objectives, experimental designs, and sample sizes.

Table 3.1 Characteristics of articles selected for the study

Name Year Nation Type of disease Type of data used Main Objective Main experimental design/ Sample size

[47] 2017 Netherlands Breast cancer Imaging data

Compares the ability of machine learning algorithms vs clinical pathologists

- A training data set of whole-slide images from 2 centers in the Netherlands with (n = 110) and without (n = 160) verified nodal metastases

- Algorithm performance was evaluated in an independent test set of 129 whole-slide images (49 with and 80 without metastases)

[48] 2023 Iran Cancer Genetic, clinical data

Provide a comprehensive overview of the various ML algorithms and techniques employed for cancer detection

[49] 2023 USA Pancreatic cancer (PC) Blood tests

Achieve high predictive accuracy for PC through a minimally invasive approach

Blood was obtained from seven treatment-naïve PC patients and 14 controls who had no diagnosis or suspicion of cancer

[50] 2020 Korea Cervical cancer Imaging data

Develop a cervical cancer prediction model (CCPM) that improves early detection accuracy using outlier detection and over-sampling methods

[51] 2022 Germany Colorectal cancer (CRC) Imaging data

AI in detecting microsatellite instability (MSI) and mismatch repair deficiency (dMMR) in colorectal cancer

8343 patients across different countries and ethnicities

[52] 2012 Georgia Prostate cancer Imaging data

Evaluate the use of hyperspectral imaging (HSI) combined with advanced image processing and classification methods for the detection of prostate cancer

[53] 2021 Australia Breast cancer Imaging data

Evaluate deep learning (DL)-based AI techniques for detecting the presence of breast cancer

28,694 digital mammographic images from 7498 women with screen-detected breast cancer

[54] 2022 Saudi Arabia Skin cancer Imaging data

A novel CIMDC-DI algorithm was developed for melanoma identification and classification using dermoscopic images

Carolina Breast cancer Imaging data

We aim to create, annotate, and publicly share a comprehensive dataset of digital breast tomosynthesis (DBT) images to support the advancement and assessment of AI algorithms in breast cancer screening. Additionally, we will establish a foundational deep learning model for effective breast cancer detection.

22,032 reconstructed DBT volumes from 5060 patients

[56] 2021 Brazil Prostate cancer Imaging data

This study evaluates the diagnostic performance of an AI system in whole slide images (WSIs) of transrectal ultrasound (TRUS) prostate biopsies. It aims to determine how this technology influences the accuracy of board-certified pathologists interpreting these WSIs and to assess its effect on the diagnostic accuracy and efficiency of seasoned diagnostic pathologists.

[57] 2018 Colombia Breast cancer Imaging data

Automated detection of invasive breast cancer on whole-slide images (WSI)

Nearly 500 cases for training, then independently tested on 195 studies from The Cancer Genome Atlas

Cancers of breast, endometrium, cervix, and ovary

Concurrent detection of early-stage women-specific cancers

Total number of samples across all groups was 1369

Thyroid cancer, skin cancer, stomach cancer, breast cancer, lung cancer

Imaging data

Cancer detection using RNA-Seq data and image features

472 skin, 415 stomach, 1,083 breast, and 511 lung cancer samples

Breast, lung, colon, and ovarian cancer

Cancer detection using high dimensional gene expression and proteomic spectra datasets

Laryngeal squamous cell carcinoma (LSCC)

Limitations of current methods for diagnosing laryngeal squamous cell carcinoma (LSCC) and the potential of using deep learning (DL) to improve detection rates

[62] 2022 India Lung cancer Imaging data

Smart lung tumor detector and stage classifier using deep learning techniques

Data from 99 patients

[63] 2019 China Lung cancer Imaging data

Detection and classification of pulmonary nodules in lung cancer diagnosis using clinical CT images

Identify reliable plasma lipid biomarkers for non-invasive diagnosis for malignant brain gliomas (MBGs) through metabolic detection

(n0) enrolled from multiple medical centers

CNN-based classifier which classifies the tumor and nontumor regions in retinoblastoma includes 23 patients (17 healthy and 27

Cancers of the brain and the central nervous system

Utilizing Attenuated Total Reflectance (ATR)-Fourier Transform Infrared (FTIR) spectroscopy in combination with machine learning techniques for early detection of brain cancer

Breast, lung, colorectal and prostate cancers

Identify advances, opportunities and challenges of ML in cancer diagnostics

2606 tumors from 24 common cancer types

[68] 2021 Sweden Breast cancer Imaging data

Detect missed breast cancer cases from mammography images (2D)

14768 women were screened with 2D and 3D mammography

[69] 2022 China Colorectal cancer Imaging data

Identification of tumor sprouting in colorectal treatment and diagnosis colorectal cancer patients (n00)

[51] 2022 Germany Colorectal cancer Imaging data

Detection of microsatellite instability in colorectal patients

8343 patients from 9 cohorts across different countries

[70] 2023 Australia Breast cancer Imaging data

Comparing AI and radiologist

100K+ images, patients that died 24 months after were excluded

[71] 2022 Netherlands Prostate cancer Imaging data

AI applications for improving prostate cancer MRI

[72] 2023 China Multi-cancer Clinical data

Performance of a protein assay in detecting different cancers, especially in LMIC

[73] 2023 Germany Colon cancer Imaging data

Classification of microsatellite instability

Based on 629 patients

Emirates Oral cancer Imaging data

AI use in diagnosing oral cancer

Total of 7245 patients and 69,425 images were included

[75] 2022 USA Breast cancer Imaging data

Investigate the utility of a novel AI program applied to initial staging chest CT in breast cancer patients in risk assessment of mortality and survival

[76] 2021 Netherlands Breast cancer Imaging data

AI methods for use in breast cancer diagnostics

[77] 2023 China Nasopharyngeal carcinoma Imaging data

Determine whether AI could be a competent pre-inspector for MRI radiologists

[78] 2020 China Gastric cancer Imaging data

Develop and validate a real-time deep convolutional neural network (DCNN) system for detecting early gastric cancer

[79] 2020 South Korea Breast cancer Imaging data

Develop an AI algorithm for diagnosis of breast cancer in mammography

[80] 2023 India Skin cancer Imaging data

Updated overview of the latest research in using Machine Learning and Advanced Technologies for Skin Cancer Detection

[81] 2023 India Breast cancer Imaging data

Create and verify a state-of-the-art approach to the early identification of breast cancer using the Internet of Medical Things and the blockchain

- Compare differences in the ADR when randomized with the use of an AI system (DEEP2)

- Specifically measure the false alert rate as a surrogate for both accuracy and user experience of the AI (DEEP2) algorithm


[83] 2021 USA Prostate cancer Image Data

Evaluate the effectiveness of the Paige Prostate AI-based digital diagnostic in supporting and improving accuracy in prostate cancer diagnosis

Total of 1876 prostate core biopsies from 118 consecutive patients

[84] 2023 Taiwan Breast cancer Image Data

Apply color normalization and nucleus extraction methods to overcome differences between staining technologies, supporting the application of AI in identifying specific tumors

2000 H&E-stained images of 1000 pixels × 1000 pixels in size from 20 cases

[85] 2023 China Colorectal cancer Image Data

Results of a whole-slide-level prediction model with dMMR/pMMR detection method

8500 images from patients with CRC, sourced from various institutions

[86] 2024 India Skin cancer Image Data

Present an effective skin cancer detection and classification (SCDC) system called MFEUsLNet

33,126 dermoscopic images of skin lesions from 2,056 patients across 3 continents


[87] 2024 Spain Breast cancer Image Data

Present a convolutional neural network model to detect and localize cancerous regions in histopathological images of breast tissue

Total of 275 whole slide images

[88] 2023 India Cervical cancer Image Data

- Classify cell images into six distinct classes

- Explore the use of three popular object detection models

- Investigate the impact of data augmentation on the classification performance of the object detection models

- Conduct a comparative analysis of the three object detection models, evaluating their strengths and weaknesses in classifying the cell images

Dataset consists of 400 images and a total of 11,534 labeled cells


[89] 2023 Colombia Brain tumor Image Data

Propose an effective brain tumor detection and classification method using deep learning

Total of 3000 brain MRI scans/images

Zealand Skin cancer Image Data

Test if deep learning models can accurately identify and localize regions of nonmelanoma skin cancer

Total of 1,836 FMV images were collected from 220 patients

[91] 2021 Bangladesh Skin cancer Image Data

Develop an effective automated system for detecting and classifying melanoma skin lesions into malignant or benign categories using image processing and machine learning techniques

[46] 2023 India Bone cancer Image Data

Apply convolutional neural networks and supervised deep learning methods for the detection of osteosarcoma bone cancer

Tumor samples across the selected 50 patients

[92] 2023 India Breast cancer Image Data

Investigate the performance of the averaged-perceptron machine learning classifier on the Wisconsin original breast cancer dataset

Dataset contains 699 records

[93] 2024 India Lung cancer Image Data

Propose a novel Deep Learning (DL) framework for the detection and diagnosis of lung cancer from Computed Tomography (CT) image analysis

Dataset contains 888 chest CT scans

[94] 2023 India Skin cancer Image Data

Evaluate a deep neural network (DNN) model with fine-tuned training and improved learning performance on dermoscopic images for skin cancer detection

Dataset comprises 58,032 dermoscopic images


[95] 2023 UK Breast cancer Image Data

Evaluate a novel computer-aided diagnosis system called BCDNet for breast cancer detection using ultrasound images

166 normal ultrasound images and 647 cancer samples

[96] 2021 Brazil Prostate Cancer Image Data

Evaluate the ability of deep learning algorithms to detect and grade prostate cancer (PCa) in radical prostatectomy specimens

[97] 2021 India Breast cancer Image Data

This article reviews and compares the effectiveness of Artificial Neural Networks (ANN), focusing on Multi-Layer Perceptron Neural Networks (MLP) and Convolutional Neural Networks (CNN), in the early detection of breast malignancies. The analysis is based on a dataset of 683 records, evaluating the accuracy of each neural network type in diagnosing breast cancer.


Present and compare several machine learning (ML) algorithms and biosensor technologies for early detection of breast cancer

[99] 2023 India Skin Cancer Image Data

Propose an artificial skin cancer screening process using image processing and machine learning techniques

Dataset includes 1800 images of malignant lesions and 1497 images of benign melanoma lesions

Overview of techniques, challenges, and solutions to assist healthcare professionals and technologists in understanding new screening and diagnostic tools

Final sample of 38 articles

[100] 2017 China Breast Cancer Image Data

Identifying nuclear versus non-nuclear regions in high-resolution histopathological images

Table 3.1.2 provides a comprehensive overview of 58 research articles examining the role of artificial intelligence (AI) in cancer detection and analysis, with studies conducted from 2012 to 2024 across various countries, including the Netherlands, Iran, the USA, Korea, Germany, and Saudi Arabia. These studies address multiple cancer types, such as breast, pancreatic, cervical, colorectal, prostate, skin, and lung cancer, using diverse data sources such as imaging, genetic, clinical, and blood-test data. The main focus of the research is to improve the accuracy and efficiency of cancer detection by comparing AI models with clinical pathologists or by creating innovative AI techniques for early detection and classification. The experimental designs and sample sizes vary significantly, ranging from large datasets with thousands of images or patient samples to targeted experimental setups involving whole-slide images, blood tests, or genomic data. Notably, 22 articles, representing 39% of the total, were published in 2023 [85] [88] [89] [90] [46] [92] [94] [95] [99] [45], highlighting a surge in research activity in this field.

These are followed by 13 articles (23%) in 2022 and 9 articles (16%) in 2021 [50] [51] [40] [53] [54] [55] [56] [58] [59] [61] [51] [64] [67] [68] [69] [51] [71] [74] [75] [76] [78] [79] [83] [91] [96] [97]. In the remaining years there were publications, but their share was very small, only 2% to 7%.
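The yearly shares quoted above are simple proportions of the article counts. The following is a minimal sketch, not the procedure used in this study, that recomputes them from the reported counts (22, 13, and 9 articles). The function name `yearly_shares` is hypothetical, and the denominator is an assumption: the text reports 58 articles, while the rounded percentages appear to correspond to a total of 57, so the total is left as a parameter.

```python
def yearly_shares(counts, total):
    """Return each year's share of `total`, rounded to a whole percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return {year: round(100 * n / total) for year, n in counts.items()}


# Article counts per publication year, as reported in the text.
reported_counts = {2023: 22, 2022: 13, 2021: 9}

# The total is an assumption, so both candidate denominators are shown.
for total in (57, 58):
    print(total, yearly_shares(reported_counts, total))
```

Keeping the total as an explicit argument makes it easy to check which denominator reproduces the published percentages.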

Figure 3.2 Year of publication of the articles

The findings show that 84% of the AI models use image data, while genetic data account for 7% of models. The remaining data types, such as blood tests, clinical data, and mixed data, each comprise only 2% to 5% of the total. Additionally, most published articles on this topic have emerged within the last 3 to 4 years.

Figure 3.3 Types of data used

Breast cancer dominates the reviewed papers, accounting for 32% of all studies. Skin cancer follows with 12% of the publications, and prostate cancer with 9%. Other types of cancer, such as brain and cervical cancer, represent a smaller fraction of the literature, each ranging from 2% to 5%.

Legend of Figure 3.3: Image data, Blood tests, Clinical data, Mixed data, Genetic data

Figure 3.4 Proportion of cancer types

The search results showed that the published studies came mainly from India with 13 studies (22.8%) and China with 9 research articles (15.8%), while Germany ranked third with 4 published articles (7%). The remaining studies were conducted sporadically in other countries such as Iran [48] [45], Netherlands [48] [45], USA [49] [75, 83], Korea

Figure 3.5 Articles published by country

Legend of Figure 3.4: Breast cancer, Skin cancer, More than 1 cancer, Prostate cancer, Lung cancer, Brain cancer, Multi-cancer, Cervical cancer

The studies exhibit diverse experimental designs and sample sizes across numerous topics and methodologies. For instance, one study used a training dataset of 129 whole-slide images (49 depicting metastases and 80 without) from two locations in the Netherlands, involving 110 patients with nodal metastases and 160 cases without.

Another study evaluated its training dataset on an independent test set that included blood samples from fourteen cancer-free controls and seven patients with prostate cancer who had not yet undergone treatment. Studies involving 8,343 patients from diverse national and ethnic backgrounds, along with research analyzing 858 cases with 36 characteristics, highlight the demographic variability among participants. A further dataset comprises 28,694 digital mammography images.

Other studies involved 7,498 women with screen-detected breast cancer and 80,791 endoscopic images from 1,482 patients, with data derived from randomized extension trials. Together, these collections cover various imaging data types, multiple trial designs, and large sample sizes.

3.2 Effectiveness of AI in Early Cancer Detection

3.2.1 AI Application in Cancer Detection Using Existing Models

In my research, I will utilize AI models that excel in analyzing diverse data sources, such as images, genetic information, and clinical data, to identify malignancies. My goal is to provide a unique perspective on the capabilities, applications, and effectiveness of these advanced models in cancer detection, as outlined in Table 3.2.

Table 3.2 AI Detection in Cancer based on Analysis of Existing Models

Ref Approach Targets for application

Accuracy, Prediction based on Digital Image

Breast Cancer, Colorectal cancer, Bone cancer

Enhancing the performance of the detection and classification process based on Image, CSV, JSON Data

Laryngeal squamous cell carcinoma (LSCC) Cervical cancer

Reducing detection time and improving accuracy in prostate cancer diagnosis


Enhancing the performance of the detection and classification process based on Image Data

Brain Tumor Bone Cancer Colon cancer Skin Cancer

Enhancing the performance of the detection, identification, and classification process, with improved training and learning performance, based on Image Data

Brain Tumor Breast cancer Skin Cancer

Designed for tasks such as image classification, detection, and object localization

Breast Cancer Bone Cancer Skin Cancer


Proposed for cancer detection using high-dimensional gene expression and proteomic spectra datasets

Breast, lung, colon, ovarian cancer

- Specifically measure the false alert rate as a surrogate for both accuracy and user experience

- Compare differences in the ADR when randomized

Improves early detection accuracy using outlier detection and over-sampling methods


Improve early detection, classification based on Digital Image

The models are used for object detection and classification of cervical cancer cells

The model is used for classifying breast ultrasound images as either breast cancer or normal

Used for automated nucleus detection in histopathological images of breast cancer for analyzing the shape and appearance of breast cancer nuclei

Automatically identifying, object detection and classification based on Image data

Explanation:  : no information, AUC: Area Under the Curve, ACC: Accuracy Rate, SEN: Sensitivity

The use of AI for early cancer detection and classification has been thoroughly investigated through various models and methodologies, as highlighted in the summarized findings in Table 3.7. Notably, the ResNet model has been cited in multiple studies, demonstrating its relevance in this field.

Digital-image approaches have proven effective in the early detection and predictive analysis of breast and colorectal cancers. The reported area under the curve (AUC) ranges from 0.946 to 0.976, and accuracy rates range from 90% to 99.2%. Sensitivity values vary from 80% to 98.8%, while recall rates consistently exceed 85%. Additionally, F-scores range from 89.59% to 95.39%, precision levels are reported between 85% and 97.67%, and specificity has been observed from 81.87% to 99.6%.
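The metrics listed above (accuracy, sensitivity, recall, specificity, precision, and F-score) can all be derived from the four cells of a binary confusion matrix. The sketch below uses the standard definitions, under which sensitivity and recall are the same quantity. The class `ConfusionCounts`, the function `summarize`, and the example counts are hypothetical and are not taken from any reviewed study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary (cancer vs. non-cancer) confusion matrix."""
    tp: int  # cancer cases correctly flagged
    fp: int  # non-cancer cases wrongly flagged
    tn: int  # non-cancer cases correctly cleared
    fn: int  # cancer cases missed


def _ratio(numerator, denominator):
    """Divide safely; return None when the denominator is zero."""
    return numerator / denominator if denominator else None


def summarize(c):
    """Compute the threshold-based metrics used in the tables."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)
    precision = _ratio(c.tp, c.tp + c.fp)
    if sensitivity is None or precision is None or sensitivity + precision == 0:
        f_score = None
    else:
        # Harmonic mean of precision and sensitivity (F1).
        f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "ACC": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "SEN": sensitivity,
        "RECALL": sensitivity,  # same quantity as sensitivity
        "SPECIFICITY": _ratio(c.tn, c.tn + c.fp),
        "PRECISION": precision,
        "F-SCORE": f_score,
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
example = ConfusionCounts(tp=90, fp=20, tn=80, fn=10)
for name, value in summarize(example).items():
    print(f"{name}: {value:.3f}")
```

Returning `None` for an undefined ratio, rather than zero, keeps an empty class from being mistaken for poor performance.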

YOLOv5 has been effectively employed to improve detection and classification in laryngeal squamous cell carcinoma (LSCC) and cervical cancer, achieving an accuracy of around 62% and a specificity ranging from 60% to 85.8% across image, CSV, and JSON data formats. In contrast, the Paige Prostate model, which targets prostate cancer, demonstrated remarkable performance with an accuracy exceeding 97%, a recall rate of 98%, and specificity above 93%.

The VGG16-19 model has shown impressive results in diagnosing various cancers, including skin, bone, colon, and brain tumors, achieving over 90% accuracy, with AUC values between 86.69% and 89.44%, sensitivity ranging from 87.21% to 88.94%, and F-scores between 85.5% and 88.33%. Similarly, EfficientNet has excelled in detecting and classifying brain tumors, breast cancer, and skin cancer, with an accuracy of 96.15%, AUC values from 80% to 97.62%, sensitivity at 80%, and an F-score of 100%. DenseNet CNN has also demonstrated strong performance in image classification for breast, bone, and skin cancers, with an AUC of around 90%, accuracy of 87.15%, precision of 85.5%, and specificity of 84.38%. Additionally, the Single-Hidden Layer Feedforward Neural Network (aSLFN) has reported AUC values between 72% and 98.5% and accuracy ranging from 54.62% to 92.84% when applied to breast, lung, colon, and ovarian cancers using genetic data.

DEEP2 has been developed to assess false alert rates, serving as indicators for both accuracy and user experience in adenoma detection, while also emphasizing the variety of cancer types. These results highlight the considerable promise of artificial intelligence in enhancing cancer detection, increasing accuracy, and refining predictive analytics.

The cervical cancer prediction model (CCPM) has significantly enhanced early detection accuracy, achieving over 98% in both accuracy and specificity through outlier detection and over-sampling methods. In contrast, the NasNet model demonstrated about 85% accuracy in detecting breast and skin cancers, with recall and F-scores around 75% and specificity at 74%. Detectron2 exhibited moderate performance in classifying cervical cancer cells, with recall rates ranging from 46% to 51.3%, highlighting challenges in certain cancer types. The BCDNet model excelled in classifying breast ultrasound images, attaining an accuracy of 93.97%, a 96.42% F-score, and 99.07% specificity. Lastly, the Stacked Sparse Autoencoder (SSAE) effectively detected breast cancer nuclei in histopathological images, achieving a precision of 82.85%, an F-score of 84.49%, and a specificity of 78.83%.

The Faster R-CNN model [69, 88] demonstrated strong performance in cervical and colorectal cancer detection, with 96% accuracy, 94% sensitivity, and 83% specificity. These

Recent advancements in AI models have greatly improved cancer detection and classification, showcasing the effectiveness of machine learning and deep learning in enhancing early detection accuracy and treatment outcomes The variety of models and their performance metrics highlight the necessity for customized strategies to tackle specific cancer types, reflecting the continuous progress in medical research.

All the results given above were selected from the relevant articles, and each value is reported as a range from the lowest to the highest result.
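The lowest-to-highest ranges described above can be produced by grouping the per-study results by model and metric and taking the minimum and maximum of each group. The following is a minimal sketch of that aggregation, not the procedure actually used to build the tables. The function name `metric_ranges`, the model names, and all values are hypothetical placeholders.

```python
from collections import defaultdict


def metric_ranges(records):
    """Collapse (model, metric, value) records into (min, max) per pair."""
    values = defaultdict(list)
    for model, metric, value in records:
        values[(model, metric)].append(value)
    return {key: (min(v), max(v)) for key, v in values.items()}


# Hypothetical per-study results, not taken from the reviewed articles.
records = [
    ("ModelA", "ACC", 90.0),
    ("ModelA", "ACC", 99.2),
    ("ModelA", "ACC", 95.1),
    ("ModelB", "ACC", 62.0),
]

for (model, metric), (low, high) in metric_ranges(records).items():
    # A single reported value is shown as one number, otherwise as a range.
    label = f"{low}%" if low == high else f"{low}%-{high}%"
    print(model, metric, label)
```

A range built this way mixes results from different datasets and study designs, so it describes the spread of reported values rather than the performance of any single experiment.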

3.2.2 AI Detection in Cancer: Analysis of DL and NN Approaches

Researchers are increasingly combining machine learning (ML) techniques with deep learning (DL) and neural network (NN) approaches to enhance cancer detection using diverse data types, particularly image data. By integrating these methodologies, they aim to expedite the cancer detection process through a comprehensive categorization of newly acquired images, leveraging insights from previously analyzed images.

Table 3.3 AI Detection in Cancer: DL and NN Approaches Analysis

Ref Approach Targets for application Type of Cancer

AUC ACC SEN RECALL F-SCORE PRECISION SPECIFICITY

Improve the detection, classification, and segmentation process; reduce the need for traditional molecular testing

Lung cancer, Breast cancer, prostate cancer, Skin cancer

Colorectal cancer (CRC), Retinoblastoma, Gastric cancer, Oral Cancer

ANN algorithms used for classifying and detecting cancer based on image data

Skin Cancer, Breast cancer Image  97.4%-

Focus on tumor detection, classification, segmentation, and prognostic tasks, with highly efficient prognosis prediction

Thyroid cancer, Skin cancer, Stomach cancer, Breast cancer, Lung cancer, colorectal and prostate cancers

- Differentiate between brain cancer and control in patient cohorts

- Can model sequential patterns for accurate classification

- Ability to capture temporal/sequential patterns

Breast Cancer, Brain Cancer, Skin Cancer

Explanation:  : no information, AUC: Area Under the Curve, ACC: Accuracy Rate, SEN: Sensitivity

The use of neural networks (NN) and deep learning (DL) techniques in cancer detection is prevalent across various cancer types, with Convolutional Neural Networks (CNNs) leading in performance for detection, classification, and segmentation, thus minimizing reliance on traditional molecular testing. CNNs exhibit high accuracy and robustness in cancers such as lung, breast, prostate, skin, and colorectal. Additionally, Artificial Neural Networks (ANNs) and Deep Neural Networks (DNNs) excel in classification and prognosis prediction, while Recurrent Neural Networks (RNNs) effectively model sequential patterns crucial for distinguishing cancer types among patients. However, a key limitation is the reliance on limited datasets, highlighting the need for further validation to improve model generalizability. Expanding clinical parameters could enhance accuracy, providing a reliable tool for early cancer diagnosis and better patient outcomes. Common evaluation metrics like AUC, sensitivity, and specificity are utilized to assess model performance, ensuring the applicability of these advanced techniques in clinical environments.

The results displayed above were chosen from among all pertinent articles, and the values vary from the lowest to the highest result threshold.
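Among the evaluation metrics named in this subsection, AUC is the only one that does not depend on a decision threshold. It can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that pairwise definition. The function name `auc_pairwise` and the example labels and scores are hypothetical.

```python
def auc_pairwise(labels, scores):
    """Compute AUC by comparing every positive score with every negative score.

    labels: iterable of 1 (cancer) or 0 (non-cancer)
    scores: model outputs, where higher means more suspicious
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four cases, for illustration only.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auc_pairwise(labels, scores))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a small illustration; rank-based formulations are preferable for large test sets.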

3.2.3 AI Detection in Cancer: Analysis of Hybrid Approaches

Selecting the best methods for machine learning can be challenging due to its diverse approaches, prompting researchers to seek strategies that ensure high-quality outcomes. Consequently, scientists often incorporate additional techniques related to deep learning (DL) and neural networks (NN) when assessing the effectiveness of machine learning methods. This article categorizes machine learning processes into two main groups, NN and DL, while also considering hybrid approaches currently employed for cancer detection. Notably, research predominantly utilizes image data sources to enhance the speed and accuracy of cancer detection.

Table 3.4 AI Detection in Cancer: Hybrid Approaches Analysis

Ref Approach Targets for application

Mainly used to predict the relationship between the factors considered and cancer, and to predict cancer identity

Cancer, Pancreatic cancer (PC), Breast Cancer, Skin Cancer

Hand-crafted features in histopathology image analysis improve cancer diagnosis by enabling efficient biomarker screening through accurate segmentation and classification

Brain Cancer, Breast cancer, Skin Cancer, Pancreatic cancer (PC)

This algorithm is used to improve the segmentation, classification, and detection of cancer

Skin Cancer, Breast cancer, Pancreatic cancer (PC), Cervical Cancer, Bone Cancer, Prostate Cancer (PCa)

Improve classification and segmentation tasks based on image, genetic, and clinical data, but mostly image data

Thyroid cancer, Skin cancer, Stomach cancer, Breast cancer, Lung cancer, colorectal and prostate cancers

The application of hybrid machine learning approaches in cancer detection reveals their widespread use across various cancer types, particularly with Support Vector Machines (SVMs) that enhance diagnosis through efficient biomarker screening and accurate data segmentation. Random Forest techniques significantly improve classification processes, showcasing high accuracy and sensitivity in multiple cancer types. Regression techniques effectively predict relationships between factors and cancer, especially with genetic data, while K-Nearest Neighbors (KNN), though less common, excel in classification tasks with high precision. The effectiveness of these hybrid methods varies by data type and cancer, emphasizing the need for integrating diverse data sources to boost detection accuracy. Addressing challenges like data quality and computational demands is essential for optimizing these advanced methods in clinical applications, with evaluation criteria such as AUC, sensitivity, and specificity ensuring their robustness in cancer detection.

This section explores the benefits and drawbacks of various hybrid machine learning techniques for cancer diagnosis. By combining multiple machine learning methods, such as Regression, Support Vector Machines (SVMs), Random Forest, and K-Nearest Neighbors (KNN), hybrid approaches aim to enhance the performance of cancer diagnostic models by leveraging their strengths while addressing their weaknesses. The effectiveness of each technique in detecting different types of cancer is evaluated, highlighting the specific advantages and disadvantages associated with each method.

The values range from the lowest to the highest result threshold, and the results shown above are all those that were selected from all relevant articles.
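One simple way to combine several classifiers, as the hybrid approaches in this subsection aim to do, is majority voting over their predicted labels. The sketch below illustrates only that idea. It is an assumption made for illustration and not the combination scheme of any reviewed study. The function name `majority_vote`, the tie-breaking rule, and the example predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample."""
    if not predictions_per_model:
        raise ValueError("at least one model is required")
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same samples")
    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top = max(counts.values())
        # Ties are broken toward the largest label (1 = malignant),
        # an assumed and deliberately cautious choice.
        combined.append(max(label for label, n in counts.items() if n == top))
    return combined


# Hypothetical outputs of three classifiers (1 = malignant, 0 = benign).
svm_pred = [1, 0, 1, 0]
forest_pred = [1, 1, 0, 0]
knn_pred = [0, 1, 1, 0]
print(majority_vote([svm_pred, forest_pred, knn_pred]))
```

Voting can only help when the individual models make different errors, which is one reason hybrid designs pair methods with different inductive biases.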

3.3 AI Model Advantages and Disadvantages

3.3.1 Advantages and disadvantages of Existing AI Models

This section analyzes the advantages and disadvantages of existing AI models for cancer detection, as detailed in Table 3.5. It focuses on various neural network architectures and deep learning frameworks tailored to specific cancer types. The evaluation of these AI models considers their effectiveness, accuracy, and computational requirements, providing a comprehensive understanding of their utility and limitations in medical settings.

Table 3.5 Advantages and disadvantages of Existing AI Models

Cancer Advantages of the model approach Disadvantages of the model approach

ResNet demonstrates exceptional performance in computational pathology by leveraging pre-training on ImageNet, which enhances its generalization capabilities. It effectively extracts intricate features and maintains robust performance across varied cohorts. Additionally, ResNet is scalable and practical for integration into clinical workflows, achieving high classification accuracy in its applications.

Misclassification, including false-negative and false-positive cases, remains possible, and results are sensitive to image quality. Performance may vary across different ethnic groups due to genetic diversity, and extensive preprocessing of histology slides is needed. Additionally, threshold adjustments are required for specific cohorts to ensure generalizability. The computational intensity and complex architecture of these systems demand significant hyper-parameter optimization for optimal results.


88] YOLOv5 laryngeal squamous cell carcinoma (LSCC), Cervical cancer

- Real-Time Detection: YOLO is capable of processing images and video frames very quickly, making it suitable for real-time applications

- High Detection Accuracy: The YOLO model, particularly the ensemble of YOLOv5s and YOLOv5m with TTA, achieved promising detection performance metrics, including precision, recall, and mean Average Precision (mAP)

YOLO is a single-stage object detector that streamlines the detection process by directly predicting bounding boxes and class probabilities from entire images in a single evaluation This approach enhances computational efficiency and simplifies the detection pipeline when compared to traditional two-stage detectors.

- Versatility: YOLO can handle various imaging modalities, as demonstrated in the study with both white light (WL) and narrow-band imaging (NBI) videolaryngoscopies. This versatility makes it applicable to different types of endoscopic imaging

The model is computationally intensive and requires powerful hardware for training and inference, making it less accessible for environments with limited computational resources

The performance can be impacted by the quality of the input data. Issues like image distortion or low-quality annotations can significantly affect the model's accuracy

- Comprehensive Feature Extraction: YOLO's convolutional neural network architecture effectively extracts hierarchical features from images, which improves its ability to recognize complex patterns associated with cancerous lesions

The model exhibited impressive performance metrics, achieving a sensitivity of 99% and a remarkable negative predictive value (NPV) of 100% at the part-specimen level Furthermore, at the patient level, it reached optimal sensitivity and NPV, both recorded at 100%.

- Reduction in Diagnostic Time: Paige Prostate potentially reduces diagnostic time by 65.5% for pathologists by pre-screening slides and reducing the number of slides requiring full histologic review

- Enhanced Diagnostic Accuracy: The model identified additional cases that were initially missed by pathologists, leading to more accurate diagnosis

- False Positives: The model produces some false positives, requiring pathologists to conduct additional reviews and histological analyses, which may increase workload in certain cases

- Limited Specificity: The specificity at the patient level is lower (0.78), meaning some benign cases are flagged as suspicious, leading to unnecessary further testing

- Dependency on Slide Quality: The accuracy of Paige Prostate can be affected by the quality of the histopathology slides, with out-of-focus scans potentially leading to false negatives

- Generalizability: Demonstrated effective performance across institutions, validating its utility beyond the development environment

- Quality Assurance: The system could be used for quality assurance, flagging only slides that need further review, thus maintaining high diagnostic standards

- Initial Learning Curve: Pathologists may need time to familiarize themselves with the model’s outputs and integrate it into their workflow, potentially causing initial inefficiencies

Brain Tumor, Bone Cancer, Colon cancer, Skin Cancer

- High Performance: VGG16 is known for its high performance in image classification tasks due to its deep architecture, which enables it to learn complex features

- Transfer Learning: The pre-trained VGG16 model can be easily fine-tuned for various specific tasks, making it versatile and widely applicable across different domains

- Simplicity: The architecture of VGG16 is relatively simple, consisting of only convolutional

- Computationally Intensive: Large amounts of processing power are needed for VGG16 training and inference due to its depth and the large number of parameters

- Memory Consumption: The model consumes a large amount of memory, which can be a limitation when working with high-resolution images or deploying the model on devices with limited memory

- Training Time: The training time for VGG16 is relatively long compared to more modern

The model approach for cancer research offers several advantages, including its simplicity and ease of implementation, especially when utilizing fully connected layers. This approach allows for better understanding compared to more complex models. Additionally, it has the potential to achieve comparable or superior performance while requiring fewer parameters and lower computational costs.

Brain Tumor, Breast cancer, Skin Cancer

- High Accuracy: EfficientNet models achieve high accuracy in various classification tasks. For example, EfficientNetV2-M achieved the highest training accuracy of 94.10% and the highest validation accuracy on several datasets

- Computational Efficiency: Designed to be computationally efficient while maintaining high accuracy. This allows for faster training and inference times compared to some other deep learning models

- Performance on Multiclass and Binary Classification: EfficientNet models performed well in both multiclass and binary classifications, achieving high accuracy and F1 scores across different datasets

- Scalability: EfficientNet’s architecture allows it to scale up effectively by balancing network depth, width, and resolution

- High Resource Demand: Requires significant computational resources

- Long Training Time: Larger models need more time to train

- Bias Potential: Can be biased if training data is imbalanced

- Dependency on Pre-trained Models: May need fine-tuning for specific tasks


- State-of-the-Art Results: Achieves state-of-the-art performance on various benchmark datasets, making it a reliable choice for image classification tasks

Breast, lung, colon, ovarian cancer

- Improved Classification Accuracy: The aSLFN model has been shown to perform better in terms of classification accuracy compared to traditional ELM and other models under similar conditions

- Stability: The aSLFN maintains a dominant position in classification performance across most cases due to its improved stability

- Enhanced Computation Speed: aSLFN demonstrates a much shorter computation time compared to its parent algorithm IC-SLFN due to the correlation-filtering module

- Feature Selection: The built-in filtering module effectively removes unnecessary features, enhancing the algorithm’s performance and making it suitable for high-dimensional datasets

- Efficient Initialization: By using the Goodman-Kruskal Gamma rank correlation for the initialization of hidden nodes, the model leverages embedded knowledge in data, providing a more structured approach than random initialization

- Dependence on Dataset: The classification performance of the aSLFN strongly depends on the specific dataset, indicating variability in effectiveness

- Computation Speed with Large Datasets: The computation speed may decrease significantly with very large datasets due to the requirement to calculate all rank correlations between attributes and class labels

- Single Layer Limitation: Feature learning using just a single layer network may not be as effective for natural signals such as images and videos, even with many hidden nodes

- Naive Assumption: The model assumes the effects of each attribute are independent, which in practical applications might not always hold true, potentially limiting its predictive power

- Increased Detection Rates: DEEP2 significantly improves adenoma detection rates (ADR) and adenomas per colonoscopy (APC), helping in the early identification of potential colorectal cancer

- Real-Time Detection: The model operates in real-time, providing immediate feedback to endoscopists during the procedure, which enhances the likelihood of detecting lesions that might be missed otherwise

- False Positives: Although DEEP2 reduces false positives compared to other AI systems, it still generates some false alerts, which can lead to unnecessary biopsies or extended procedure times

The accuracy of the DEEP2 model is significantly influenced by the quality of endoscopic imaging, as poor image quality stemming from inadequate bowel preparation or technical difficulties can lead to reduced model performance.

- Integration Challenges: Integrating the DEEP2 system into existing clinical workflows and endoscopy units can be difficult


- Reduced Miss Rate: DEEP2 helps reduce the miss rate of adenomas, especially flat and subtle lesions that are often overlooked

AI enhances the consistency of endoscopic procedures by standardizing the detection process, which minimizes performance variability among different endoscopists and ensures a higher quality of care. However, implementing this technology necessitates considerable investment in hardware and software, as well as comprehensive training for medical staff.

Cervical cancer prediction model (CCPM)

- Early Prediction: CCPM provides early prediction of cervical cancer using risk factors as inputs

- Outlier Detection: Utilizes outlier detection methods such as DBSCAN and iForest to remove outliers from the dataset

- Data Balancing: Employs SMOTE and SMOTETomek for data balancing, which helps in handling imbalanced datasets

- High Performance: Demonstrated better performance compared to other methods, achieving high accuracy in predicting cervical cancer

- Mobile Application: A mobile application developed for CCPM can collect data and provide instant results, aiding in early-stage diagnosis

- Random Forest Classifier: Utilizes Random Forest (RF) as a classifier, which is effective in reducing overfitting and producing low variance

- Generalization Limitation: The CRIC dataset may not fully represent the diversity and variability of real-world Pap smear images, which limits the generalization ability of the trained models

- Distortion and Noise: Pap smear images can have varying levels of distortion and noise, impacting the performance of the algorithms

- Multi-class Classification Challenge: Models struggle with accurate multi-class classification compared to binary classification

- Computationally Intensive: State-of-the-art object detection models used in CCPM are computationally intensive and may require powerful hardware for training and inference

- Memory and Speed: The combination of outlier detection and data balancing methods with RF can be slower and require more memory compared to other algorithms

- Flexibility and Scalability: The NasNet model is noted for its high flexibility and scalability in terms of computational resources and parameters

- Optimized Performance: It has been trained on the ImageNet database and is optimized for performance, making it effective for feature extraction

- Automatic Learning of Features: The NasNet model can learn and extract meaningful features automatically from raw data, eliminating the need for manual feature engineering

- High Computational Demand: The model requires significant computational resources for training and inference

- Dependency on Large Datasets: The effectiveness of NasNet relies on being trained on large datasets, which might not always be available in specific applications


- Handling High Variability: It is effective in handling high variability and general tasks through fine-grained object classes

- High Performance: Improved sensitivity, precision, and overall accuracy, especially with Gaussian chaotic maps

- Accurate Localization: Effective at identifying cancerous regions in ultrasound images

- Efficient Training: Uses Extreme Learning Machines (ELMs) for fast and effective training, especially on small datasets

- State-of-the-Art Results: Outperforms existing methods in sensitivity, precision, F1-score, and accuracy

- Resource-Intensive: Requires extensive computational resources and large labeled datasets for training deep CNNs

- Computationally Intensive: Fine-tuning CNNs and running BACM optimization need powerful hardware

- Variable Performance: ELM classifier performance can fluctuate due to random initialization of weights and biases, requiring careful parameter optimization

Stacked Sparse Autoencoder (SSAE)

- High-Level Feature Learning: SSAE transforms pixel intensities into structured representations, enabling high-level feature extraction from unlabeled image patches

- Efficiency: The sliding window approach allows rapid traversal across large images, reducing computational burden by filtering out non-candidate patches early

- Improved Detection Performance: Achieves higher F-measure (84.49%) and better average area under the Precision-Recall curve (AveP) (78.83%) compared to other methods

- Unsupervised Learning: Suitable for extensive unlabeled histological image data, learning features without labeled data

- Computational Efficiency: More efficient in run-time execution compared to other models like EM, BRT, and CD

- Training Time: The SSAE model requires longer training times compared to some other models, such as EM and BRT, which do not require a training phase. The deeper the architecture (more layers), the longer the training time required

Increasing the number of layers in a model can result in potential overfitting, which is evident from the inferior performance of the three-layer SSAE model when compared to the two-layer SSAE model.

- High Accuracy and Efficiency: The Faster R-CNN model achieves high precision in detecting features, making it suitable for detailed image analysis tasks

- The CRIC dataset may not fully represent the diversity and variability of real-world Pap smear images, limiting the generalization ability of the trained models


- End-to-End Detection: It realizes end-to-end detection capabilities, efficiently extracting high-precision feature information from image datasets

- Time Efficiency: The model significantly reduces diagnosis time. For instance, the average time for a single image diagnosis is approximately 0.03 seconds, compared to 13 seconds for manual diagnosis by pathologists

- Reduction of Subjective Errors: It helps in reducing subjective errors in diagnosis due to its objectivity and repeatability, ensuring consistent performance

Through continuous feedback, the model can improve its learning and optimization performance, which aids in more accurate preliminary screening of images

- Reduced Workload: It assists pathologists by reducing their workload, allowing them to focus on more complex samples

Pap smear images can have varying levels of distortion and noise, which can impact the performance of the algorithms

The models struggled with accurate multi-class classification compared to binary classification

- Computational Requirements: State-of-the-art object detection models are computationally intensive and may require powerful hardware for training and inference

- False Positives and Detection Accuracy: High sensitivity creates more false positives, and the detection sometimes overlaps with benign tissue

This article summarizes various AI models used for cancer detection, highlighting their advantages and limitations. ResNet excels in generalization and classification accuracy for colorectal and breast cancers but is computationally intensive. YOLOv5 offers real-time detection for laryngeal and cervical cancers, though it demands high-quality data and resources. Paige Prostate, while sensitive, struggles with specificity and false positives in prostate cancer diagnosis. VGG16-19 is effective for classifying skin, colon, brain, and bone cancers but requires extensive resources and training time. EfficientNet achieves high accuracy but faces challenges with bias and computation demands. DenseNet enhances classification for breast, bone, and skin cancers but is dataset-dependent. DEEP2 improves adenoma detection rates but relies on image quality. The CCPM model predicts cervical cancer early but has issues with multi-class classification and computational intensity. NasNet is versatile in feature learning but needs substantial resources. BCDNnet boosts breast cancer detection accuracy but is resource-heavy. SSAE enhances feature learning and detection but has longer training times and overfitting risks. Lastly, Faster R-CNN shows high precision in cervical and colorectal cancer detection, requiring powerful hardware and sometimes resulting in false positives. Each model offers significant advancements yet faces challenges related to computational requirements and data quality.
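The aSLFN entry in the table above relies on the Goodman-Kruskal Gamma rank correlation between attributes and class labels. As a rough illustration of that statistic only, the sketch below computes gamma = (C - D) / (C + D) from concordant (C) and discordant (D) pairs and ranks features by its absolute value. The function names, the toy data, and the choice to return 0.0 when every pair is tied are assumptions made for this example; it does not reproduce how aSLFN initializes hidden nodes or filters features.

```python
from itertools import combinations


def goodman_kruskal_gamma(x, y):
    """Return gamma = (C - D) / (C + D); pairs tied in x or y are ignored."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return 0.0  # assumption: treat the all-ties case as "no association"
    return (concordant - discordant) / (concordant + discordant)


def rank_features(columns, labels):
    """Sort (hypothetical) feature names by |gamma| against the class labels."""
    scores = {
        name: goodman_kruskal_gamma(values, labels)
        for name, values in columns.items()
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up toy data: two attributes and a binary class label.
toy_columns = {"attr_a": [1, 2, 3, 4], "attr_b": [1, 3, 2, 4]}
toy_labels = [0, 0, 1, 1]
ranking = rank_features(toy_columns, toy_labels)
```

In this sketch each correlation enumerates every pair of samples, so the cost grows quadratically with the number of samples and linearly with the number of attributes.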

3.3.2 Advantages and disadvantages of DL and NN model

The integration of deep learning (DL) and neural networks (NN) in medical applications, especially for cancer detection and diagnosis, has gained considerable attention recently. These technologies have transformed the field by significantly improving the accuracy and efficiency of detection processes, thereby offering essential support to healthcare professionals. A summary of the advantages and disadvantages of various DL and NN approaches is presented in Table 3.6, showcasing their applications across different cancer types. This section provides a thorough overview of these methodologies, evaluating their strengths and limitations to enhance the understanding of their impact on cancer detection.

Table 3.6 Advantages and Disadvantages of DL and NN approaches

Ref Approach Type of Cancer Advantages of the model approach Disadvantages of the model approach

Lung cancer, Breast cancer, Prostate cancer, Skin cancer, Colorectal cancer (CRC), Retinoblastoma, Gastric cancer, Oral cancer, Nasopharyngeal carcinoma

- High Accuracy in Image Analysis: CNNs are particularly effective in analyzing medical images such as mammograms and histopathology slides

- Feature Extraction: CNNs can automatically extract relevant features from raw data, which is beneficial for complex datasets where manual feature extraction is challenging

The article highlights the integration of Convolutional Neural Networks (CNNs) with meta-optics and plasmonic sensors, significantly improving the sensitivity and specificity of cancer detection. By merging cutting-edge optical techniques with deep learning, this innovative approach enhances diagnostic capabilities in oncology.

- Early Detection: CNNs can analyze large-scale genomic and clinical data to assist in early diagnosis, prognosis, and treatment selection, potentially leading to better patient outcomes

- Non-invasive Monitoring: CNNs can be used in conjunction with plasmonic sensors for the detection of circulating tumor cells (CTCs) and specific cancer biomarkers in body fluids, providing a non-invasive method for cancer monitoring

Convolutional Neural Networks (CNNs) require substantial volumes of accurately labeled data for effective training. However, acquiring such data poses significant challenges due to privacy concerns, restricted access to high-quality medical images, and the necessity for expert annotation.

- Interpretability: CNNs are often "black boxes," making it hard to understand their decision-making process, which can hinder trust among clinicians

- Class Imbalance: Cancer datasets usually have fewer positive cases, leading to biased models that perform better on negative cases

- Misclassification Issues: CNNs sometimes misclassify tumors due to technical artifacts, small tumor portions, or unusual morphological patterns

- Data Quality Dependence: CNN performance heavily depends on the quality of histopathological slides; poor preservation or artifacts can lead to errors

- Complexity of Training: Training CNNs requires significant annotated data and computational resources, making the process time-consuming

Label-free detection utilizing infrared imaging alongside convolutional neural networks (CNNs) enables the identification of cancer and microsatellite instability (MSI) or microsatellite stability (MSS) status without the need for staining or preprocessing. This innovative approach preserves tissue samples for additional molecular analyses, enhancing the efficiency and effectiveness of cancer diagnostics.

- Handling Complex Data: Effective for complex, non-linear data relationships in tasks like disease diagnosis

- Automated Feature Extraction: Simplifies preprocessing by automatically learning features from raw data

- High Accuracy: Achieves high accuracy and sensitivity, often outperforming traditional methods in medical diagnosis

- Adaptability: Can improve performance over time by learning from new data

- Large Data Requirement: Needs large amounts of labeled data for training

- Computational Intensity: Demands substantial processing power and memory

- Overfitting Risk: Prone to overfitting, requiring careful tuning to ensure generalization

- Interpretation Complexity: Difficult to interpret and understand decision-making, posing challenges in critical applications like healthcare

Thyroid cancer, Skin cancer, Stomach cancer, Breast cancer, Lung cancer, colorectal and prostate cancers

- High Performance in Classification and Detection Tasks:

+) DNNs have shown superior performance in various classification tasks, including cancer detection from histological images, segmentation tasks, and prognosis prediction using multi-omics data

+) Specific models such as EfficientNet-B4 and EfficientNetV2-M demonstrated high accuracy, precision, recall, F1 score, and AUC score in the classification of skin lesions

Deep Neural Networks (DNNs) excel at processing complex, high-dimensional data, including OMICs data like gene expression, protein, and metabolite information. Their advanced capabilities enable precise classification and accurate prognosis predictions in various applications.

- Potential for Real-Time Predictions: Some DNN architectures, like the U-Net, can provide real-time predictions for tasks such as breast mass segmentation in ultrasound images

- High Computational Cost and Resource Intensity:
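The class-imbalance limitation listed for CNNs in Table 3.6 can be made concrete with a small numerical sketch. It uses made-up labels (95 negative, 5 positive) and a degenerate model that never predicts cancer, and it also shows one common inverse-frequency weighting heuristic, n_samples / (n_classes * class_count). All names and numbers here are illustrative assumptions, not values taken from the reviewed studies.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = cancer, 0 = no cancer."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def imbalance_report(y_true, y_pred):
    """Compare plain accuracy with metrics that account for both classes."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def balanced_class_weights(y_true):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(y_true)
    n_samples, n_classes = len(y_true), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Made-up toy data: 95 negative cases and 5 positive cases.
y_true = [0] * 95 + [1] * 5
always_negative = [0] * 100  # a degenerate model that never predicts cancer

report = imbalance_report(y_true, always_negative)
weights = balanced_class_weights(y_true)
```

On this toy data the degenerate model reaches 95% accuracy while detecting no positive case at all, which is why sensitivity and balanced accuracy are worth reporting alongside accuracy on imbalanced cancer datasets.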

Additional research potential on the use of AI in oncology

Based on the evidence reviewed, I have summarized several research gaps in the application of AI to cancer diagnosis data analysis in Figure 3.6.

A comparative study of current cancer diagnostic models is essential for advancing research. Future investigations should focus on evaluating various models using identical datasets and disease types to provide a comprehensive assessment of their diagnostic performance. For instance, Helen ML Frazer and colleagues compared ResNet-V2, EfficientNetB6, and NasNet models solely on AUC and ACC metrics for breast cancer, while Sushopti Gawade et al. assessed VGG16-19, ResNet101, and DenseNet models for bone cancer using Accuracy, F1, Precision, Recall, and AUC. These studies highlight inconsistencies in evaluation criteria, varying datasets, and different cancer types, which hinder a holistic understanding of AI applications in cancer detection. A standardized approach is necessary to enable researchers to select the optimal models for cancer diagnosis effectively.
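One way to picture the standardized comparison proposed above is to score every model on the same labeled test set with the same metric set. The sketch below is a minimal illustration under assumptions: the model names, scores, and the 0.5 decision threshold are invented for the example, and AUC is computed with the pairwise-ranking definition (the probability that a random positive case scores higher than a random negative one, counting ties as one half).

```python
def roc_auc(y_true, scores):
    """AUC via pairwise ranking of positive scores against negative scores."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def evaluate(y_true, scores, threshold=0.5):
    """Accuracy, precision, recall, F1, and AUC for one model's scores."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": roc_auc(y_true, scores),
    }


def compare_models(y_true, model_scores, threshold=0.5):
    """Apply the identical metric set to every model on the same test labels."""
    return {name: evaluate(y_true, s, threshold) for name, s in model_scores.items()}


# Made-up labels and scores for two hypothetical models on one shared test set.
shared_labels = [1, 1, 0, 0]
comparison = compare_models(
    shared_labels,
    {"model_a": [0.9, 0.8, 0.2, 0.1], "model_b": [0.9, 0.4, 0.6, 0.1]},
)
```

Because every model sees the same labels, threshold, and metric definitions, differences in the resulting table reflect the models rather than the evaluation setup.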

Current cancer diagnostic models predominantly utilize imaging data, which accounts for 84% of cancer cases. However, exploring diverse data sources could enhance early cancer detection and broaden predictive capabilities for various cancer types. Conducting cross-sectional testing across different cancer types with various models may reveal innovative applications and identify superior models that outperform existing AI approaches in evaluation, prediction, and prognosis.

A notable research gap exists in the exploration of cross-testing various models, such as ResNet, for diseases like breast and colorectal cancer, instead of relying solely on conventional models like CNN for skin and brain cancers By applying the ResNet model to skin and brain cancers, we can diversify our approach and uncover new applications and diagnostic capabilities, potentially enhancing the performance of these models in areas where they have not previously been utilized.

In conclusion, we propose an innovative solution for the future by developing a new AI model for early cancer diagnosis, leveraging advanced algorithms and neural network architectures such as ANN, DNN, RF, and SVM This model will enhance the capabilities of existing systems and effectively address the limitations identified in current AI models discussed in this research.

CONCLUSION AND RECOMMENDATIONS

Date posted: 26/02/2025, 22:29

Nguồn tham khảo

Tài liệu tham khảo Loại Chi tiết
1. Amisha, et al., Overview of AI in medicine. J Family Med Prim Care, 2019. 8(7): p. 2328-2331 Sách, tạp chí
Tiêu đề: Overview of AI in medicine
Tác giả: Amisha, et al
Nhà XB: J Family Med Prim Care
Năm: 2019
2. Liu, P.R., et al., Application of AI in Medicine: An Overview. Curr Med Sci, 2021. 41(6): p. 1105-1115 Sách, tạp chí
Tiêu đề: Application of AI in Medicine: An Overview
3. Mintz, Y. and R. Brodie, Introduction to AI in medicine. Minim Invasive Ther Allied Technol, 2019. 28(2): p. 73-81 Sách, tạp chí
Tiêu đề: Introduction to AI in medicine
Tác giả: Y. Mintz, R. Brodie
Nhà XB: Minim Invasive Ther Allied Technol
Năm: 2019
4. Bonacchi, R., M. Filippi, and M.A. Rocca, Role of AI in MS clinical practice. Neuroimage Clin, 2022. 35: p. 103065 Sách, tạp chí
Tiêu đề: Role of AI in MS clinical practice
5. Lei, C., et al., Advances in materials-based therapeutic strategies against osteoporosis. Biomaterials, 2023. 296: p. 122066 Sách, tạp chí
Tiêu đề: Advances in materials-based therapeutic strategies against osteoporosis
6. Al Kuwaiti, A., et al., A Review of the Role of AI in Healthcare. J Pers Med, 2023. 13(6) Sách, tạp chí
Tiêu đề: A Review of the Role of AI in Healthcare
Tác giả: Al Kuwaiti, A., et al
Nhà XB: J Pers Med
Năm: 2023
7. Davenport, T. and R. Kalakota, The potential for AI in healthcare. Future Healthc J, 2019. 6(2): p. 94-98 Sách, tạp chí
Tiêu đề: The potential for AI in healthcare
8. Zhang, B., H. Shi, and H. Wang, Machine Learning and AI in Cancer Prognosis, Prediction, and Treatment Selection: A Critical Approach. J Multidiscip Healthc, 2023. 16: p. 1779-1791 Sách, tạp chí
Tiêu đề: Machine Learning and AI in Cancer Prognosis, Prediction, and Treatment Selection: A Critical Approach
Tác giả: B. Zhang, H. Shi, H. Wang
Nhà XB: J Multidiscip Healthc
Năm: 2023
9. Huang, S., et al., AI in cancer diagnosis and prognosis: Opportunities and challenges. Cancer Letters, 2020. 471: p. 61-71 Sách, tạp chí
Tiêu đề: AI in cancer diagnosis and prognosis: Opportunities and challenges
10. S. Alshuhri, M., et al., AI in cancer diagnosis: Opportunities and challenges. Pathology - Research and Practice, 2024. 253: p. 154996 Sách, tạp chí
Tiêu đề: AI in cancer diagnosis: Opportunities and challenges
Tác giả: S. Alshuhri, M
Nhà XB: Pathology - Research and Practice
Năm: 2024
11. Fass, L., Imaging and cancer: A review. Molecular Oncology, 2008. 2(2): p. 115-152 Sách, tạp chí
Tiêu đề: Imaging and cancer: A review
12. Hanby, A.M., The pathology of breast cancer and the role of the histopathology laboratory. Clinical Oncology, 2005. 17(4): p. 234-239 Sách, tạp chí
Tiêu đề: The pathology of breast cancer and the role of the histopathology laboratory
Tác giả: Hanby, A.M
Nhà XB: Clinical Oncology
Năm: 2005
13. Schrag, D., et al., Blood-based tests for multicancer early detection (PATHFINDER): a prospective cohort study. The Lancet, 2023. 402(10409): p. 1251-1260 Sách, tạp chí
Tiêu đề: Blood-based tests for multicancer early detection (PATHFINDER): a prospective cohort study
Tác giả: Schrag, D., et al
Nhà XB: The Lancet
Năm: 2023
14. Al-Abbadi, M.A., Basics of cytology. Avicenna J Med, 2011. 01(01): p. 18-28 Sách, tạp chí
Tiêu đề: Basics of cytology
15. Jiang, P., et al., Big data in basic and translational cancer research. Nature Reviews Cancer, 2022. 22(11): p. 625-639 Sách, tạp chí
Tiêu đề: Big data in basic and translational cancer research
Tác giả: Jiang, P., et al
Nhà XB: Nature Reviews Cancer
Năm: 2022
16. Luchini, C., A. Pea, and A. Scarpa, AI in oncology: current applications and future perspectives. British Journal of Cancer, 2022. 126(1): p. 4-9 Sách, tạp chí
Tiêu đề: AI in oncology: current applications and future perspectives
17. Liang, G., et al., The emerging roles of AI in cancer drug development and precision therapy. Biomedicine & Pharmacotherapy, 2020. 128: p. 110255 Sách, tạp chí
Tiêu đề: The emerging roles of AI in cancer drug development and precision therapy
Tác giả: Liang, G., et al
Nhà XB: Biomedicine & Pharmacotherapy
Năm: 2020
18. Bi, W.L., et al., AI in cancer imaging: Clinical challenges and applications. CA Cancer J Clin, 2019. 69(2): p. 127-157 Sách, tạp chí
Tiêu đề: AI in cancer imaging: Clinical challenges and applications
19. United States Food and Drug Administration. 510(k) Premarket Notification - Arterys Cardio DL. 2017; Available from Sách, tạp chí
Tiêu đề: 510(k) Premarket Notification - Arterys Cardio DL
Tác giả: United States Food and Drug Administration
Năm: 2017
21. Guendouzi, B.S., et al., A systematic review of federated learning: Challenges, aggregation methods, and development tools. Journal of Network and Computer Applications, 2023. 220: p. 103714 Sách, tạp chí
Tiêu đề: A systematic review of federated learning: Challenges, aggregation methods, and development tools
Tác giả: Guendouzi, B.S., et al
Nhà XB: Journal of Network and Computer Applications
Năm: 2023

TÀI LIỆU CÙNG NGƯỜI DÙNG

TÀI LIỆU LIÊN QUAN