Artificial intelligence algorithm for the histopathological diagnosis of skin cancer

Methods: A deep learning program was built using three neural network architectures: MobileNet, Inception and convolutional networks. A database was constructed using 2732 images of melanomas, basal and squamous cell carcinomas, and normal skin. The validation set consisted of 284 images from all 4 categories, allowing for the calculation of sensitivity and specificity. All images were provided by the Path Presenter website.


INTRODUCTION
Artificial intelligence (AI) systems consist of computer programs that replicate human reasoning and its characteristic features such as learning, creativity and responsiveness. AI systems differ from human reasoning in that they are fast, autonomous and have no physiological needs. They are therefore able to outperform humans on a variety of tasks, including those in the medical domain [1][2][3] .
Radiology and pathology are two areas that could benefit from the use of AI. Both specialties rely on the recognition of visual patterns to reach an accurate diagnosis. Computers are extremely efficient at this type of task, as evidenced by findings from non-medical studies of facial and object recognition 4,5 .
Currently, AI technologies such as IBM's Watson system are used to evaluate oncology patients and assist in the choice of chemotherapy drugs for specific cancer types and clinical conditions 6 .
Other technological systems have also been used to evaluate microscopic skin lesions and differentiate between benign and malignant disease; in these tasks, AI systems have outperformed human clinicians 7 .
The evaluation of skin lesions is a highly important procedure, as cutaneous neoplasms are the most common cancers in the world. The most prevalent histological subtypes are basal cell carcinoma (BCC), squamous cell carcinoma (SCC), and malignant melanoma (MM). However, the diagnosis of these conditions can only be confirmed by microscopic histological evaluation 8,9 . An AI algorithm based on deep learning was developed to diagnose the most important cutaneous pathologies, such as BCC, SCC, and MM, and distinguish these from histologically normal skin.

Software development
A deep-learning AI system was developed with three analysis options: one did not use a neural network library, while the other two used the Inception V3 and MobileNet models. The software structure consisted of sets of overlapping algorithms divided into several processing tiers, with each round involving several linear and nonlinear transformations. A computerized communication interface was established to perform the histopathological evaluation of cutaneous images at increasing levels of magnification (40-100x).
The image processing algorithm was implemented in Python 3, a dynamic, block-structured programming language, which allowed for the creation of a learning network to catalogue the pathologies. PHP 7 and JavaScript were then used to create a graphical user interface, where options and commands could be selected using tools such as buttons and scroll bars.
The software was named KAi (Kuiava Artificial Intelligence). Image resolution was 224x224 pixels for MobileNet, 299x299 for Inception and 128x128 for the convolutional network. Images were analyzed using R, G and B channels.
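The per-model input sizes above can be illustrated with a minimal preprocessing sketch. It is not the KAi implementation: the nearest-neighbour resize, the scaling of channel values to [0, 1], and the names `INPUT_SIZES`, `resize_nearest` and `prepare` are all assumptions made for illustration, since the text reports only the resolutions and the use of R, G and B channels.

```python
# Input resolutions reported in the text (height, width); every model reads 3 RGB channels.
INPUT_SIZES = {
    "mobilenet": (224, 224),
    "inception": (299, 299),
    "convnet": (128, 128),
}

def resize_nearest(image, size):
    """Nearest-neighbour resize of an image stored as rows of (R, G, B) tuples."""
    src_h, src_w = len(image), len(image[0])
    dst_h, dst_w = size
    return [
        [image[r * src_h // dst_h][c * src_w // dst_w] for c in range(dst_w)]
        for r in range(dst_h)
    ]

def prepare(image, model):
    """Resize to the model's input size and scale 8-bit channels to [0, 1] (assumed)."""
    resized = resize_nearest(image, INPUT_SIZES[model])
    return [[tuple(ch / 255.0 for ch in px) for px in row] for row in resized]

# Hypothetical 2x2 RGB tile standing in for a slide image.
tile = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)]]
batch = {name: prepare(tile, name) for name in INPUT_SIZES}
```

Each entry of `batch` holds the same tile at the resolution expected by one of the three models, so a single source image can be fed to all of them.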

Database construction and selection of histopathological slides
The algorithms were trained using the open-source library TensorFlow. The pre-trained models MobileNet 1.0 and Inception V3 were used, along with an exclusive database of histopathological images. The database contained pre-analyzed images obtained from Google Inc.
Histopathological images of cutaneous tissue slides at multiple magnifications, classified as normal, BCC, SCC and melanoma, were obtained from the Path Presenter website. A total of 2,732 images were used to create the training database. All images were reevaluated by two pathologists to confirm the original classification. The images were then analyzed by an AI system to create the database.
The preliminary results yielded a reliability of 97.5% for MobileNet, 97.8% for Inception, and 97% for the convolutional network.

Software validation
After the initial evaluation, an efficiency test was performed in three stages:
Stage one: Fifty-one histopathology slides with or without lesions were selected, from which 754 images were obtained and rotated at 90, 180 and 270 degrees to expand the database. This resulted in a final set of 3,016 images.
Stage two: Two image groups were created, one with 284 images to test and validate the system, and another with 2,732 images to develop the program.
Stage three: The JPEG images were processed and analyzed by the KAi software to provide a diagnosis.
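The rotation step in stage one can be sketched as follows. This is an illustration under assumptions, not the code used in the study: images are represented as plain nested lists, and the helper names `rotate90` and `augment_with_rotations` are hypothetical. The arithmetic at the end reproduces the image counts given in stages one and two.

```python
def rotate90(image):
    """Rotate a 2-D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def augment_with_rotations(image):
    """Return the original image plus its 90, 180 and 270 degree rotations."""
    views = [image]
    for _ in range(3):
        views.append(rotate90(views[-1]))
    return views

# Hypothetical 2x2 grid standing in for one slide image.
tile = [[1, 2],
        [3, 4]]
views = augment_with_rotations(tile)

# Counts reported in the text: 754 source images, each kept in 4 orientations.
source_images = 754
expanded_total = source_images * len(views)   # 754 * 4 = 3016
validation_images = 284
training_images = expanded_total - validation_images  # 3016 - 284 = 2732
```

Keeping the original alongside its three rotations gives four orientations per source image, which is how 754 images become 3,016.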

Statistical analysis
The data generated by the software were exported to a Microsoft Excel spreadsheet. Sensitivity and specificity were calculated using the following formulas: sensitivity = true positives/total positives; specificity = true negatives/total negatives. Hit-rates were compared using ANOVA, and results were considered significant at a 5% level. SPSS 10.0 was used for the analyses.
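The two formulas above can be written as a short function. This is a minimal sketch rather than the analysis actually run in Excel and the statistics package: it assumes a one-vs-rest reading in which one diagnosis is treated as positive and all other categories as negative, and the function name and the toy labels are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity and specificity for the class named `positive`."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1   # true positive
            else:
                fn += 1   # false negative
        else:
            if pred == positive:
                fp += 1   # false positive
            else:
                tn += 1   # true negative
    # total positives = tp + fn; total negatives = tn + fp
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Toy labels for illustration only; these are not data from the study.
y_true = ["melanoma", "melanoma", "BCC", "normal", "SCC"]
y_pred = ["melanoma", "BCC", "BCC", "normal", "SCC"]

melanoma_result = sensitivity_specificity(y_true, y_pred, "melanoma")  # (0.5, 1.0)
bcc_result = sensitivity_specificity(y_true, y_pred, "BCC")            # (1.0, 0.75)
```

In the toy example, one of the two melanomas is missed, so melanoma sensitivity is 0.5, while the misclassified case counts as a false positive for BCC and lowers its specificity to 0.75.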

Ethical considerations
This research project was submitted to the Ethics Committee of the University of Passo Fundo, and approved under number 25973719.9.0000.5342, on February 14th, 2020.

RESULTS
A computer program was developed to analyze histopathological images of skin lesions and detect the presence of cancer. The layout and functionality of the program are shown in Figure 1 A-B. The program database included 12 melanoma slides, 16 SCC, 13 BCC and 10 normal skin slides. The slides were used to generate 868 melanoma images, 652 SCC, 680 BCC and 532 normal skin images. The validation set consisted of 80 melanoma images, 72 each for BCC and SCC, and 60 images of normal skin (see Table 1 and Figure 2).
Sensitivity and specificity were calculated for each of the models. The sensitivity of the MobileNet model was 92% (95% CI, 83-100%), and its specificity, 97% (95% CI, 90-100%). The maximum sensitivity and specificity for the differentiation of malignant lesions were 91% and 95.4%, respectively. With regard to the detection of malignant conditions, maximum sensitivity was 98.3% and specificity was 99.6%. There were no statistical differences in sensitivity and specificity between the MobileNet, Inception and convolutional network models (p=0.769).
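The per-category counts above can be checked against the totals given in the methods with a few lines of arithmetic. The dictionary names are illustrative; the numbers are taken directly from the text.

```python
# Per-category counts as reported in the text.
slides = {"melanoma": 12, "SCC": 16, "BCC": 13, "normal": 10}
training_images = {"melanoma": 868, "SCC": 652, "BCC": 680, "normal": 532}
validation_images = {"melanoma": 80, "SCC": 72, "BCC": 72, "normal": 60}

total_slides = sum(slides.values())                  # 51 slides
total_training = sum(training_images.values())       # 2732 images
total_validation = sum(validation_images.values())   # 284 images
total_images = total_training + total_validation     # 3016 images

# Share of all images held out for validation in each category.
validation_share = {
    name: validation_images[name] / (training_images[name] + validation_images[name])
    for name in training_images
}
```

The sums match the 51 slides, 2,732 development images, 284 validation images and 3,016 total images described in the software validation stages.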

DISCUSSION
The global incidence of cancer is expected to increase in the coming years, mainly as a result of population aging 10 . Skin cancers are the most common malignancy in the world. Skin lesions are a major cause of morbidity and, in some cases, mortality. A definitive diagnosis of these cases requires a histological analysis of the lesions 7,9 . Fortunately, computer programs might be able to assist with histological diagnosis [11][12][13] .
Computer programs based on AI have the capacity to make significant contributions to the health sector. So far, these techniques have produced encouraging results, with software programs demonstrating high sensitivity and specificity, and emerging as a promising complement to the histopathological evaluation of cutaneous lesions 14 .
Medical assistance is a crucial part of patient care. The need for a careful evaluation of the clinical and individual characteristics of each patient makes it unlikely that the entire population of health care workers would be replaced by technology 5 .
Recent discussions on the future of medicine have identified areas such as radiology and pathology where technology may be an important tool to assist professionals in providing a fast and accurate diagnosis, reducing patient waiting lists and improving medical procedures 1,15 .
This is a promising alternative for Brazil, a country with an uneven distribution of medical specialists 16 , and where the adoption of technological tools could contribute significantly to the quality of health care 17 . Additionally, technological development may stimulate the economy and improve the qualification of health care professionals 17 .
Limitations of the program developed in the present study include difficulties in diagnosing lesions in the early stages of cancer or with intense inflammation. Structural difficulties may emerge due to variations in the way images are digitized in each institution, although this can be minimized by increasing the variability of images in the program database 14 .
Strengths of the program include its intuitive interface and the use of a database containing histologic images of cutaneous neoplasms to test and validate the program. The slides were obtained from the Path Presenter library, which receives data from several institutions and pathologists. The resulting variability in these images contributes to a reliable diagnosis of skin lesions.
Computer programs based on AI have the capacity to improve patient care. The software developed in the present study demonstrated high sensitivity and specificity, with values of 93% and 98.8%, respectively. The application of technology in health care might contribute to the speed and efficacy of medical evaluations.