Improving the Curvelet Saliency and Deep Convolutional Neural Networks for Diabetic Retinopathy Classification in Fundus Images



Dang Thanh Tin
Information Systems Engineering Laboratory
Faculty of Electrical and Electronics Engineering
Ho Chi Minh City University of Technology (HCMUT)
Vietnam National University-Ho Chi Minh City
Ho Chi Minh City, Vietnam
dttin@hcmut.edu.vn

Abstract-Retinal vessel images give a wide range of the abnormal pixels of patients. Therefore, classifying diseases from fundus images is a popular approach. This paper proposes a new method to classify diabetic retinopathy in retinal blood vessel images based on curvelet saliency for segmentation. Our approach includes three stages: pre-processing to improve the quality of the input images, calculating the saliency map based on curvelet coefficients, and classification with VGG16. To evaluate the results of the proposed method, the STARE and HRF datasets were used for testing with the Jaccard Index. The accuracy of the proposed method is about 98.42% and 97.96% on the STARE and HRF datasets respectively.
Keywords-saliency; VGG16; classification; diabetic retinopathy; retinal blood vessel

I. INTRODUCTION

Diabetes is a condition that occurs when the pancreas does not produce enough insulin or when the body loses its ability to metabolize insulin. The fundus complication of diabetes is called diabetic retinopathy, which damages the small blood vessels in the retina. Its manifestations include microaneurysms, intraretinal hemorrhage, hard exudates, macular edema, macular ischemia, neovascularization, vitreous hemorrhage, and traction retinal detachment. The retina is the light-sensitive area of the eyeball, where nerve cells receive images and send them to the brain for processing; the macula is the part most important for perceiving the finest detail. Diabetic retinopathy is the leading cause of vision loss or blindness in developed countries, so its identification through retinal images is very important. Recent advances in computer science, focused on machine learning to detect data patterns, can provide solutions for diabetic retinopathy detection through retinal imaging. The detection of bright lesions in the retinal blood vessels is a difficult task, but it can help doctors in diagnosis and treatment. Retinopathy has been widely researched [1][2][3][4][5] and many different methods have been used, such as image saliency analysis [6, 7], machine learning techniques [8][9][10], wavelets and support vector machines [11], segmentation [12, 13], and morphology [27].
Authors in [10] proposed a method to identify diabetes from retinal images using the DiaNet model with a multi-stage Convolutional Neural Network (CNN). Authors in [14] proposed a method for the detection of fundus image lesions using a CNN searched with the shuffled frog leaping algorithm. Authors in [15] used a discrete image clustering technique to separate the foreground and background of the input image in order to classify the lesion results. Authors in [16] proposed a method to segment premature infant retinal images and extract a retinal image with a map of the blood vessels. Authors in [17] suggested a contour detection method that computes the image gradient by applying Type-2 fuzzy rules to detect edges. Authors in [18] used a CNN model to design a learning method for grading fundus images; however, the system was validated on a small dataset. Authors in [19] proposed a method for diabetic retinopathy detection based on U-Net and ResNet-18.

This paper proposes a method for diabetic retinopathy classification based on an improvement of the second-generation wavelet transform (curvelet) combined with the VGG16 CNN. To evaluate the results of the proposed method and compare them with those of other known methods, two open datasets (STARE and HRF) were used for testing, with the Jaccard Index (JI) as the evaluation criterion. The accuracy of the proposed method is about 98.42% and 97.96% for the STARE and HRF datasets respectively.
The main contributions of this study are:
• The proposal of the deep VGG16 to classify diabetic retinopathy in fundus images.
• The increased classification accuracy obtained by combining curvelet saliency with VGG16.

II. CLASSIFYING DIABETIC RETINOPATHY WITH CURVELET SALIENCY AND VGG16
The quality of retinal blood vessel images is affected by a wide range of factors, such as noise, motion blur, and overlapping structures. Diabetes classification based on feature extraction is a popular state-of-the-art approach. This section presents the proposed method for diabetic retinopathy classification with curvelet saliency and VGG16, shown in Figure 1. The proposed method can be divided into three stages, according to the characteristics of each step: pre-processing, curvelet saliency, and classification. The input data of the system are retinal blood vessel images. Green is the color channel chosen for processing, CLAHE redistributes the lightness values of the objects, and the top-hat transform creates the morphological mask for the next stages; these three steps constitute the pre-processing, whose aim is to enhance the quality of the blood vessels in the fundus images. Secondly, the curvelet coefficients are calculated with the curvelet transform using DB4 in the decomposition steps. These values are applied at each level of the top-hat transform to form the threshold for the saliency levels, and the output is the curvelet saliency map for the final stage. Finally, the VGG16 model classifies the images as diabetic or not.

A. Improving the Features of Retinal Blood Vessel Images
The aim of this step is to improve the features of the fundus images. The thickness of the blood vessels varies from 1 to 5 pixels. Quality enhancement starts from the color similarity and the approximation of the surrounding pixels. Fundus images include three color channels, i.e. Red, Green, and Blue. Among them, Red is the saturated channel, Green carries the lighting, and Blue carries the contrast of the vessels against the background. In our method, the Green channel is chosen because of the lighting needed for blood vessel detection. Then, CLAHE is performed with the following steps:
• Dividing the image into non-overlapping contextual areas of 8×8 pixel blocks.
• Calculating the average number of pixels per gray level to prepare the division, as in (1):

$N_{avg} = (N_{rX} \times N_{rY}) / N_{gray}$  (1)

where $N_{avg}$ is the average number of pixels, $N_{gray}$ represents the number of gray levels of the divided areas, and $N_{rX}$ and $N_{rY}$ are the numbers of pixels in the X and Y dimensions respectively.
• From the average number of pixels, their product with the clipping factor gives the clip limit at the i-th gray level, and these values form the histogram of the clipped pixels. The histogram levels are then compared with the average to keep the better value.
• Applying the condition and probability density as in (2):

$y(i) = y_{min} + \sqrt{2\alpha^{2} \ln(1/(1 - P(i)))}$  (2)

where $\alpha = 0.04$ is the scaling parameter of the Rayleigh distribution, $y(i)$ is the Rayleigh forward transform, $P(i)$ is the cumulative probability at gray level $i$, and $y_{min}$ is the lower bound of the pixel value.
• Updating the artifacts with the average histogram and probability density.
• The final step of pre-processing is the top-hat transform, which creates the mask and performs the subtraction process. The proposed method divides the result into two levels (high and low), based on the distance between the gray levels of the pixels and the average histogram of the CLAHE step. The output of this step is the high and low levels that feed the curvelet coefficient computation. A minimal sketch of the whole pre-processing stage is given after this list.
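The sketch below illustrates the pre-processing stage with OpenCV (green channel selection, CLAHE on 8×8 tiles, top-hat transform, and the high/low split). The clip limit and the structuring element size are assumptions for illustration, not values reported in the paper, and OpenCV's CLAHE uses a uniform redistribution rather than the Rayleigh mapping of (2), so this is only an approximation of the described procedure.

```python
import cv2
import numpy as np

def preprocess(fundus_bgr, clip_limit=2.0, se_size=15):
    """Pre-processing sketch: green channel -> CLAHE -> top-hat -> high/low split.
    clip_limit and se_size are illustrative assumptions."""
    green = fundus_bgr[:, :, 1]                     # OpenCV stores BGR; index 1 is Green

    # CLAHE over non-overlapping 8x8 contextual regions, as in Section II.A.
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=(8, 8))
    enhanced = clahe.apply(green)

    # Top-hat transform: image minus its morphological opening,
    # which highlights thin bright structures such as vessels.
    se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (se_size, se_size))
    tophat = cv2.morphologyEx(enhanced, cv2.MORPH_TOPHAT, se)

    # Split into high / low levels by distance from the mean gray level.
    mean = tophat.mean()
    high = np.where(tophat >= mean, tophat, 0).astype(np.uint8)
    low = np.where(tophat < mean, tophat, 0).astype(np.uint8)
    return high, low

if __name__ == "__main__":
    img = np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8)  # synthetic stand-in
    high, low = preprocess(img)
```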

B. Curvelet Coefficients for Choosing the Saliency Map
The inputs of this step are the high and low levels from the previous step. For each level, this approach uses the stages shown in Figure 2. The decomposition is done with DB4 for division and subband creation. The subband $w_j$ with a block size $b_j$ in each level of a fundus image $f$ is denoted $\Delta_j f$. With an appropriate scale of sidelength $\sim 2^{-s}$, $Q_w$ is a collection of smooth windows localized around the dyadic squares of the fundus image, as in (3):

$Q = [k_1/2^s, (k_1+1)/2^s] \times [k_2/2^s, (k_2+1)/2^s]$  (3)

Then, the curvelet coefficients are calculated as follows:
• Renormalizing in each unit scale.
• Analyzing via the discrete Ridgelet transform in each unit scale.
• Updating the subbands of blocks by doubling their value when the block number is odd; when the block number is even, no update is performed.
• In the reversion, the curvelet domain with coefficients K depends on the number of scales in the list of subbands. The saliency level R(x) at pixel x is represented with level K.

A minimal sketch of the DB4 subband decomposition and the block partitioning is given below.
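The decomposition step can be sketched with PyWavelets, using DB4 as stated above. The number of scales and the block size are assumptions, and the renormalization and ridgelet analysis steps are omitted since their parameters are not specified here.

```python
import numpy as np
import pywt

def db4_subbands(level_image, n_scales=3):
    """DB4 subband decomposition of one saliency level (n_scales is an
    assumed value). Returns the approximation and per-scale detail subbands."""
    coeffs = pywt.wavedec2(np.asarray(level_image, dtype=float), 'db4', level=n_scales)
    return coeffs[0], coeffs[1:]          # cA_n, then (cH, cV, cD) per scale

def partition_blocks(subband, block_size):
    """Smooth-partitioning sketch: tile a subband into block_size x block_size
    squares, mirroring the dyadic squares Q of (3)."""
    h, w = subband.shape
    return [subband[i:i + block_size, j:j + block_size]
            for i in range(0, h - block_size + 1, block_size)
            for j in range(0, w - block_size + 1, block_size)]
```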
The curvelet saliency for the final saliency prediction is computed with the algorithm described below:

Input: R(x) and K
Output: the saliency prediction
Function cal_curveletSaliency(R(x), K)
    S ← superpixel centers, initialized as the average pixel i in a grid
    while iteration t from 1 to v do
        association between pixel p and superpixel i:
            Q_pi^t = exp(-||F_p - S_i^(t-1)||^2)
        superpixel centers ← sum of the associations in K
    end while
    return distance(pixel x, superpixel centers)
End function

The final output of this stage is the saliency map of the blood vessel areas in the fundus images. The curvelet coefficient K is the condition for the association of superpixel centers in the map.
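Below is a minimal NumPy rendering of this iteration. The feature vector (intensity plus position), the number of iterations v, the grid spacing, and the use of K as a per-pixel map of curvelet coefficient magnitudes are all illustrative assumptions; the paper only fixes the association formula. A practical implementation would restrict each pixel to nearby centers, as in SLIC.

```python
import numpy as np

def curvelet_saliency(R, K, v=10, grid=16):
    """Sketch of cal_curveletSaliency: iterative superpixel association on a
    saliency-level image R (H x W), weighted by curvelet coefficients K (H x W)."""
    H, W = R.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Per-pixel features F_p: intensity plus normalized position.
    F = np.stack([R.ravel(), ys.ravel() / H, xs.ravel() / W], axis=1)

    # Superpixel centers S initialized at the pixel at each grid cell center
    # (standing in for the cell average of the pseudocode).
    cy, cx = np.mgrid[grid // 2:H:grid, grid // 2:W:grid]
    S = F[cy.ravel() * W + cx.ravel()]

    for t in range(v):
        # Association Q_pi^t = exp(-||F_p - S_i^(t-1)||^2).
        d2 = ((F[:, None, :] - S[None, :, :]) ** 2).sum(axis=2)
        Q = np.exp(-d2) * K.ravel()[:, None]   # curvelet coefficients gate the association
        # Re-estimate each center as the association-weighted mean of the features.
        S = (Q.T @ F) / (Q.sum(axis=0)[:, None] + 1e-12)

    # Saliency: distance of each pixel to its nearest superpixel center.
    return np.sqrt(d2.min(axis=1)).reshape(H, W)

sal = curvelet_saliency(np.random.rand(64, 64), np.ones((64, 64)))  # toy usage
```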

C. VGG16 in the Curvelet Saliency Map for Diabetes Classification
The stage after the feature extraction and image segmentation from curvelet saliency is the classification by VGG16, with the architecture shown in Figure 3, which applies VGG16 for feature extraction on the curvelet saliency map.
In this architecture, each level of the curvelet saliency is submitted to the following procedure:
• Block 1: 2 convolution layers + 1 pooling layer. Each input image is first filtered with 3×3 convolution kernels, the smallest size that captures the notion of left/right, up/down, and center. The first two blocks have 64 channels, and this number is doubled in each following block, so blocks 3, 4, and 5 have 128, 256, and 512 channels respectively. The padding of the 3×3 convolution layers is 2 pixels, and the stride of the convolution layers is also 2 pixels. The max-pooling layers have a window size of 2×2 pixels and a stride of 2 pixels.
• Block 6: the first and second fully connected layers have 1×1×4096 channels, while the final fully connected layer has 1×1×1000 channels. The softmax (ranging from 0 to 1) gives the final classification as diabetes or non-diabetes for each level of the curvelet saliency.

The differences between the VGG16 used in this paper and the traditional VGG16 are the stride and the multi-model applied to each level of the curvelet saliency. The synthesis of the multi-model (the multiple softmax outputs) is the maximum value. A minimal sketch of this multi-model synthesis is given below.
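The following PyTorch sketch shows the multi-model synthesis only: one VGG16 per saliency level, with the final decision taken as the maximum of the per-level softmax outputs. The two-level setup, the 2-class head, and the label convention are assumptions consistent with the high/low levels above; the modified convolution stride is omitted for brevity.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

LEVELS = ["high", "low"]   # assumed: the two curvelet-saliency levels

# One VGG16 instance per saliency level, each with a 2-class head
# (diabetes / non-diabetes); in practice each would be trained on its level.
models = {lvl: vgg16(num_classes=2).eval() for lvl in LEVELS}

@torch.no_grad()
def classify(saliency):
    """saliency: dict mapping level name -> (N, 3, 224, 224) tensor."""
    probs = [F.softmax(models[lvl](saliency[lvl]), dim=1) for lvl in LEVELS]
    fused = torch.stack(probs).max(dim=0).values   # max-synthesis of the multi-model
    return fused.argmax(dim=1)                     # assumed: 1 = diabetes, 0 = non-diabetes

# Toy usage: random tensors standing in for two saliency levels of a 2-image batch.
x = {lvl: torch.rand(2, 3, 224, 224) for lvl in LEVELS}
print(classify(x))
```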

III. EXPERIMENTAL RESULTS

A. Datasets Used
The proposed method was evaluated on the STARE [21] and HRF (High-Resolution Fundus) [22] datasets. These datasets contain images of normal and diseased retinal blood vessels and are public and free to use for research and academic tasks. During the experiments, 70% of the STARE dataset was used for training and 30% for testing, while the HRF dataset was used entirely (100%) for testing. The STARE dataset [21] consists of 402 images (91 diabetes and 311 non-diabetes) of 605×700 pixels with 24 bits per pixel (standard RGB). In this dataset, a wide range of diseases is diagnosed based on the retinal blood vessels; in this paper, the disease of interest is diabetic retinopathy. The HRF dataset [22] includes 45 images (15 images of diabetic retinopathy patients and 30 non-diabetes images of healthy and glaucomatous patients).
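The 70/30 STARE split can be reproduced along the following lines. The stratification, the random seed, and the placeholder filenames are assumptions for illustration; the paper does not state how the split was drawn.

```python
from sklearn.model_selection import train_test_split

# Illustrative stand-ins: 402 STARE items, 91 labeled diabetes (1), 311 non-diabetes (0).
stare_paths = [f"stare_{i:03d}.ppm" for i in range(402)]   # hypothetical filenames
stare_labels = [1] * 91 + [0] * 311

train_x, test_x, train_y, test_y = train_test_split(
    stare_paths, stare_labels,
    train_size=0.70,        # 70% of STARE for training, 30% for testing
    stratify=stare_labels,  # assumption: preserve the 91/311 class ratio
    random_state=0)         # assumption: fixed seed for reproducibility
# All 45 HRF images are kept aside as an independent test set.
```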
The language used to develop the proposed method is Python 3.9 and the configuration of the system is: 2.7GHz Quad-Core Intel i7 processor and 16GB 2133MHz LPDDR3 RAM.

B. Evaluation Metric and Experimental Results
The main idea of the proposed method is the segmentation with curvelet saliency. Therefore, the evaluation uses the JI value. If A is the segmented image and B is the ground truth image, JI(A, B) is calculated by (4):

$JI(A, B) = |A \cap B| / |A \cup B|$  (4)

The higher the JI value, the better the segmentation. To determine the curvelet coefficients that adapt to the salient map better than others, Table I compares the saliency segmentation obtained with superpixels and with other coefficients. It can be seen that the proposed curvelet coefficients give segmentation results close to the ground truth of the retinal blood vessels in the HRF dataset. Table II compares the proposed method with matched filtering and fuzzy C-means clustering with an integrated level set [23] and with fully convolutional deep learning [24]. The average JI of the proposed method is better than that of the others. This experiment was conducted on the 45 HRF images, for which the ground truth of each fundus image is clearly provided, so the JI values are easy to compare. Figure 4 presents some segmentation results of curvelet saliency in the HRF dataset and compares the proposed method with [23] and [24].
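As a reference, (4) can be computed for binary masks with the minimal NumPy sketch below; the empty-union convention is an assumption.

```python
import numpy as np

def jaccard_index(a, b):
    """Jaccard Index of two binary masks, as in (4): |A intersect B| / |A union B|."""
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:                     # both masks empty: define JI as 1 (assumption)
        return 1.0
    return np.logical_and(a, b).sum() / union

# Tiny example: intersection of 1 pixel over a union of 3 pixels -> JI = 1/3.
A = np.array([[1, 1], [0, 0]])
B = np.array([[0, 1], [0, 1]])
print(jaccard_index(A, B))   # 0.333...
```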

TABLE II. COMPARISON OF THE AVERAGE JI OF THE PROPOSED METHOD WITH [23] AND [24]

Method      Average JI
[23]        92.03
[24]        96.13
Proposed    98.25

From Tables I-II and Figure 4, the curvelet saliency adapts well to the segmentation of diabetic disease. The next evaluation feeds the curvelet saliency as input to other deep learning models for diabetes classification. Figure 5 shows the comparison chart of the results: VGG16 performs better than the other known methods in terms of classification accuracy for diabetic retinopathy on curvelet coefficients. The comparison was carried out between the proposed VGG16 and CNN, Fully Convolutional Network (FCN), and U-Net models. On both datasets, the accuracy of VGG16 in classifying diabetic retinopathy is higher. Table III shows the classification results in detecting diabetes and non-diabetes images using a deep CNN with VGG16 and GoogLeNet [25], ResNet50 and VGG16 pretrained networks [26], and the proposed VGG16 on curvelet saliency.

The curvelet coefficients are the main reason for the better results, because they enhance the quality and divide the saliency levels. The curved structures of the vessels adapt well to the curvelet transform, and these values form the condition for choosing the saliency maps; therefore, the segmentation of the fundus images is better. On the other hand, the multi-level curvelet coefficients for saliency feed the input of VGG16, with the changed stride and the synthesis of the multi-model VGG16. As a result, the classification results are improved. The authors of the other works applied deep learning for classification based on the number of layers or by improving the parameters on the dataset. However, the input parameters are not easy to enhance, and focusing only on surface-level information is not enough. The proposed method offers multi-level processing with curvelet coefficients for the saliency map, so the quality levels of the fundus images are shown clearly, and the deep VGG16 at each saliency level computes a value for disease classification.
IV. CONCLUSION AND FUTURE WORK

Diabetes can reduce the red blood cell rate, increase the ability of platelets to agglomerate, and increase blood viscosity. As a result, the capillaries become clogged, causing retinal ischemia. Diagnosis is a vital task in medicine, and any disease prediction or classification system must adapt to the medical images. Retinal vessel images exhibit a wide range of abnormal pixels; therefore, classifying diseases from fundus images is a popular research topic. This paper proposed a new method for classifying diabetic retinopathy in retinal blood vessel images based on curvelet saliency for segmentation. Our approach includes three steps: pre-processing of the input images, calculating the saliency map based on curvelet coefficients, and utilizing VGG16 for classification. The choice of the Green color channel, combined with enhancement and the division of saliency levels by the curve condition of the curvelet transform, gave the best segmentation results. The stride and the multi-model of VGG16 were proposed for each saliency level to further enhance the results. In future work, configuring and updating the number of layers or blocks of the deep learning models will be considered.