A Machine Learning based Approach for Segmenting Retinal Nerve Images using Artificial Neural Networks

Artificial Intelligence (AI) based Machine Learning (ML) is gaining increasing attention from researchers. In ophthalmology, ML has been applied to fundus photographs, achieving robust classification performance in the detection of diseases such as diabetic retinopathy and retinopathy of prematurity. The detection and extraction of blood vessels in the retina is an essential part of diagnosing various eye conditions, such as diabetic retinopathy. This paper proposes a novel machine learning approach to segment the retinal blood vessels from eye fundus images using a combination of color features, texture features, and Back Propagation Neural Networks (BPNN). The proposed method comprises two steps, namely color texture feature extraction and training of the BPNN to obtain the segmented retinal nerves. Magenta color and correlation-texture features are given as input to the BPNN. The system was trained and tested on retinal fundus images taken from two distinct databases. The average sensitivity, specificity, and accuracy obtained for the segmentation of retinal blood vessels were 0.470, 0.914, and 0.903 respectively. The results reveal that the proposed methodology is excellent in the automated segmentation of retinal nerves. The proposed segmentation methodology obtained accuracy comparable to that of other methods.

I. INTRODUCTION
Computer-assisted diagnosis of retinal fundus images is becoming an alternative to the manual inspection known as direct ophthalmoscopy. Moreover, computer-assisted diagnosis of retinal fundus images has proven to be as reliable as direct ophthalmoscopy, and requires less time to process and analyze. Various eye-related pathologies that can result in blindness, such as macular degeneration and diabetic retinopathy, are routinely diagnosed by utilizing retinal fundus images [1]. One of the fundamental steps in diagnosing diabetic retinopathy is the extraction of retinal blood vessels from fundus images. Although several segmentation methods [2,3] have been proposed, this segmentation remains challenging due to variations in the retinal vasculature network and in image quality. Currently, the main challenges in retinal vessel segmentation are noise (often due to uneven illumination) and thin vessels. Furthermore, the majority of the proposed segmentation methods focus on optimizing the preprocessing and vessel segmentation parameters separately for each dataset. Hence, these approaches can often achieve high accuracy on the dataset they were optimized for, whereas their accuracy drops when they are applied to other datasets. Although vessel segmentation methods usually contain preprocessing steps aimed at enhancing the appearance of vessels, some approaches skip the preprocessing steps and start with the segmentation step. Nowadays, many segmentation methods rely on machine learning [4] concepts combined with traditional segmentation techniques to enhance the segmentation accuracy, by providing statistical analysis of the data to support the segmentation algorithms. These machine-learning concepts can be broadly categorized into unsupervised and supervised approaches, based on the use of labeled training data. In a supervised approach, each pixel in the image is labeled and assigned to a class by a human operator, i.e. vessel or non-vessel. 
A series of feature vectors is generated from the data being processed (pixel-wise features in image segmentation problems) and a classifier is trained by using the labels assigned to the data. Unsupervised approaches use predefined feature vectors without any class labels, and similar samples are gathered into distinct classes. This clustering is based on some assumptions about the structure of the input data, e.g. that there are two classes of input data (vessel and non-vessel) and that the feature vectors within each class are similar to each other. Depending on the problem, this similarity metric can be complex or can be as simple as a comparison of pixel intensities. This paper briefly discusses retinal vessel segmentation methods to provide some insight into the different approaches; it is by no means an exhaustive review. For a detailed discussion of different vessel segmentation methods, please refer to [5]. The authors of [6] proposed a supervised retinal vessel segmentation method in which a k-Nearest Neighbor (k-NN) classifier was used to identify vessel and non-vessel pixels from a feature vector based on a multi-scale Gaussian filter. The authors of [7] proposed a similar approach that utilized a feature vector constructed by using a ridge detector. Based on a feature vector constructed by using multi-scale Gabor wavelet filters, the authors of [8] proposed the use of a Bayesian classifier for segmenting vessel and non-vessel pixels. In [9], a neural network (NN) based classifier operating on moment-invariant features was proposed. Retinal vessel segmentation by using a classifier based on boosted decision trees was proposed in [10]. Meanwhile, the authors of [11] proposed a support vector machine classifier coupled with features derived from a rotation-invariant linear operator.
The main advantage of using an unsupervised segmentation approach instead of a supervised one is its independence from labeled training data. This is an important aspect in medical imaging and related applications, which often involve large volumes of data. Popular unsupervised retinal vessel segmentation methods can be categorized as vessel tracking, matched filtering, and morphology-based methods. Starting from a set of initial points, defined either manually or automatically, vessel-tracking methods try to segment the vessels by tracking their centerline. This tracking can be performed by utilizing different vessel estimation profiles, such as Gaussian [12], generic parametric [13], Bayesian probabilistic [14], and multi-scale profiles [15]. One of the earliest examples of vessel-tracking based segmentation methods was proposed in [16], based on the Maximum A Posteriori (MAP) technique. Initial seeding positions corresponding to the centerline and vessel edges were determined by using statistical analysis of intensity and vessel continuity properties. Afterwards, vessel boundaries were estimated by using a Gaussian curve fitting function applied to the vessel cross-section intensity profile. The authors of [17] proposed a similar approach by combining the MAP technique with a multi-scale line detection algorithm, and their method was able to handle vessel tree branching and crossover points with good performance. Based on the notion that the vessel profile can be modeled by a kernel (structuring element), filtering methods try to model and segment the vessels by convolving the retinal image with a 2D rotating template. The rotating template is used to approximate the vessel profile in as many orientations as possible, and the resulting filter response is highest in places where the vessels fit the kernel. 
Techniques based on filtering utilize different kernels for modeling and enhancing retinal vessels, such as matched filters [18], Gaussian filters [19], wavelet filters [20,21], Gabor filters [8] and COSFIRE filters [22,23]. Methods utilizing morphological operations can be used to enhance retinal images for use either with other segmentation methods, or for segmenting blood vessels from the background [24].
Machine-learning algorithms are often utilized as supportive tools to automate and/or enhance segmentation methods, by providing statistical analysis of a set of data generated by other segmentation methods. Therefore, any existing unsupervised segmentation algorithm could be enhanced by integrating machine-learning concepts. Complex segmentation tasks are usually solved by using a whole pipeline of several segmentation algorithms drawn from various image processing concepts. In this study, an automated vessel extraction method for retinal fundus images is proposed, based on a hybrid technique comprising a genetic-algorithm-enhanced spatial fuzzy c-means algorithm with an integrated level set method, evaluated on real-world clinical data. Furthermore, a combination of filters is utilized to enhance the segmentation, as each filter responds in a distinct way to different pixels in the image. Considering the different image characteristics between datasets, the segmentation approach can be made more robust by combining filters. As the aim of this study was to propose a segmentation method suitable for use on various datasets, the method was not optimized for any specific dataset.

II. PROPOSED METHODOLOGY
This section describes the overall process proposed for segmenting the nerves, or blood vessels, in retinal images. Retinal images were used as testing datasets. Figure 1 shows the overall architecture of the proposed methodology for segmenting the retinal nerves.
Figure 1. Overall architecture of the proposed methodology.

A. Dataset
Retinal images for nerve segmentation were taken from the publicly accessible DRIVE [4] and STARE [25] databases. These datasets are used for developing and testing the performance of various retinal segmentation methods. The manually drawn segmentations provided with the datasets are referred to as the ground truth.

B. Grey Scale Conversion and Pre-processing
Pre-processing removes the noise and artifacts present in the images. Contrast limited adaptive histogram equalization (CLAHE) is first performed in order to equalize the entire image. A mean filter is then used to reduce the noise and artifacts present in the input retinal images. Afterward, the retinal images are converted to greyscale for further processing. Since the greyscale image has irregular boundaries, the pixels outside the image or nerve boundaries were also taken into account so that nerves lying on the boundary were not missed. The original, equalized, greyscale, and pre-processed retinal images are depicted in Figure 2.
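The mean filtering and greyscale conversion steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 3×3 window, the edge-replicating border handling, and the BT.601 luma weights are all assumptions, since the paper does not state them. The CLAHE step is omitted here because it is normally taken from a library routine (for example OpenCV's `cv2.createCLAHE`).

```python
import numpy as np

def mean_filter(img, size=3):
    """Smooth an H x W or H x W x C image with a size x size mean filter.

    Borders are handled by replicating edge pixels (an assumption).
    """
    pad = size // 2
    img = np.asarray(img, dtype=float)
    # Pad only the two spatial axes, never the channel axis.
    pad_width = [(pad, pad), (pad, pad)] + [(0, 0)] * (img.ndim - 2)
    padded = np.pad(img, pad_width, mode="edge")
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    # Sum the shifted copies of the image, then divide by the window area.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (size * size)

def to_grey(rgb):
    """Convert an H x W x 3 RGB image to greyscale.

    The BT.601 luma weights are an assumption, not taken from the paper.
    """
    weights = np.array([0.299, 0.587, 0.114])
    return np.asarray(rgb, dtype=float) @ weights

# Synthetic stand-in for a fundus image (random values, illustration only).
rgb = np.random.default_rng(0).integers(0, 256, size=(8, 8, 3))
grey = to_grey(mean_filter(rgb, size=3))
```

Because both steps are linear, applying the mean filter before or after the greyscale conversion gives the same result in this sketch.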

C. Texture Feature Extraction
Texture provides important information about the structural arrangement of surfaces. Texture features are used to classify the candidate nerve regions identified in the previous processing steps [26,27]. Gray Level Co-occurrence Matrix (GLCM) features were calculated for the regions inside the retinal image. The GLCM is built by counting how often a pixel with grey-level (greyscale intensity) value i occurs horizontally adjacent to a pixel with value j, so each element (i,j) of the GLCM specifies the number of times this pairing occurs. The correlation features obtained are shown in Table I. In this method, color and texture features are used as the input to train the BPNN.
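As an illustration of the counting rule described above, the following sketch builds a GLCM for horizontally adjacent pixels. The function name and the `levels` parameter are hypothetical, and the 4×4 test image is made up for the example.

```python
import numpy as np

def glcm_horizontal(img, levels):
    """Count horizontally adjacent grey-level pairs.

    img    : 2-D integer array with values in [0, levels).
    levels : number of grey levels (hypothetical parameter name).
    Entry (i, j) is the number of pixels with value i whose right-hand
    neighbour has value j.
    """
    img = np.asarray(img)
    glcm = np.zeros((levels, levels), dtype=np.int64)
    left = img[:, :-1].ravel()   # every pixel that has a right neighbour
    right = img[:, 1:].ravel()   # the corresponding right neighbours
    # np.add.at accumulates correctly when an index pair repeats.
    np.add.at(glcm, (left, right), 1)
    return glcm

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
glcm = glcm_horizontal(img, levels=4)
```

Each row of the image contributes three pairs, so the counts in `glcm` sum to 12.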

D. Correlation
Correlation is a measure of how a pixel correlates with its neighboring pixels. The correlation Cr of an image is calculated from the GLCM.
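For reference, the standard GLCM correlation, which is assumed here to be the intended definition, is $Cr = \sum_{i,j} (i - \mu_i)(j - \mu_j)\, p(i,j) / (\sigma_i \sigma_j)$, where $p(i,j)$ is the GLCM normalized to sum to one and $\mu$, $\sigma$ are the means and standard deviations of its row and column marginals. A minimal sketch under that assumption:

```python
import numpy as np

def glcm_correlation(glcm):
    """Correlation of a grey-level co-occurrence matrix.

    Uses the standard definition (an assumption about the paper's formula):
    sum over (i, j) of (i - mu_i)(j - mu_j) p(i, j) / (sigma_i sigma_j).
    """
    p = np.asarray(glcm, dtype=float)
    p = p / p.sum()                      # normalise counts to probabilities
    i, j = np.indices(p.shape)
    mu_i = (i * p).sum()
    mu_j = (j * p).sum()
    sigma_i = np.sqrt((((i - mu_i) ** 2) * p).sum())
    sigma_j = np.sqrt((((j - mu_j) ** 2) * p).sum())
    if sigma_i == 0 or sigma_j == 0:
        return float("nan")              # undefined for a constant image
    return float(((i - mu_i) * (j - mu_j) * p).sum() / (sigma_i * sigma_j))

# Neighbours always identical: perfectly correlated.
perfect = glcm_correlation(np.eye(4))
# Neighbours always mirrored: perfectly anti-correlated.
inverse = glcm_correlation(np.fliplr(np.eye(4)))
```

Values close to 1 indicate that neighboring pixels tend to share similar grey levels, while values close to -1 indicate that they tend to differ systematically.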

E. Color Feature Extraction
The RGB color format is widely used for processing digital images. Its main drawback is that it is not perceptually uniform.
The Hue, Saturation, Value (HSV) representation of the RGB color space is compatible with the human perception of color.
To obtain the color features, histograms were calculated over a square window centered on each pixel of an equidistant grid, in each plane of the image, using both the LAB and HSV color spaces. A 5×5 window was used to extract the mean histograms in the two color spaces. The process of color formation is shown in Figure 3.
Figure 3. Color formation process.
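The windowed histogram step can be sketched as follows for a single HSV plane. This is an illustrative sketch only: the number of bins, the clipping of the window at image borders, and the function names are assumptions, and the LAB conversion is left out because it is not available in the Python standard library.

```python
import colorsys
import numpy as np

def rgb_to_hsv_image(rgb):
    """Convert an H x W x 3 RGB image with values in [0, 1] to HSV."""
    h, w, _ = rgb.shape
    hsv = np.zeros((h, w, 3), dtype=float)
    for y in range(h):
        for x in range(w):
            hsv[y, x] = colorsys.rgb_to_hsv(*rgb[y, x])
    return hsv

def window_histogram(plane, row, col, window=5, bins=8):
    """Normalised histogram of a window x window patch centred on (row, col).

    plane holds values in [0, 1]; the patch is clipped at the image
    borders (an assumption, the paper does not describe border handling).
    """
    half = window // 2
    patch = plane[max(row - half, 0):row + half + 1,
                  max(col - half, 0):col + half + 1]
    counts, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    return counts / counts.sum()

# Synthetic image (random values, illustration only).
rgb = np.random.default_rng(1).random((9, 9, 3))
hsv = rgb_to_hsv_image(rgb)
hue_hist = window_histogram(hsv[..., 0], row=4, col=4)
```

Repeating the call for each plane and each grid point gives one local histogram per plane and location, which can then be averaged or weighted.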

F. Constructing Feature Vectors: Color Texture using Neighborhood Statistics
Gabor filters are often used to extract texture features for image segmentation. However, they have a major drawback: they introduce considerable redundancy and generate an enormous number of feature channels. This method proposes a new color and texture feature extraction based on higher-order image statistics, which describe the texture regularity of the whole image within its neighborhood structures more effectively. An unsupervised learning process was used for recovering the image statistics, as in [9]. The whole image is considered as a random field $X$ on a set of lattice points $S$, where $\{s\}_{s \in S}$ is the set of all pixels in the image. An unsupervised adaptive filter is also used for extracting this feature. It increases the probability of the pixel intensities by decreasing the entropy $h(X \mid Y = y)$ of the conditional probability for every pixel-neighborhood pair $(X = x, Y = y)$. This is done by updating the value of each center pixel $x$. In each iteration $m$, over the entire image region $Z_m$, an image $i_{m+1}$ is constructed using the finite forward differences method on the gradient descent, with intensities
$$x_{m+1} = x_m - \lambda \, \frac{\partial h}{\partial x} \qquad (3)$$
where $\lambda$ is the time step. The pixel updates stop after a few iterations, when $\|i_{m+1} - i_m\|_2 < \delta$, with $\delta$ a small threshold. The magenta color formation process is executed on the entire image to extract the color texture features [9].
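The iterative update and its stopping rule can be sketched generically. The entropy gradient itself is not reproduced here; `grad_h` is a hypothetical callable, and the example uses a simple quadratic energy as a stand-in, so the sketch only illustrates the forward-difference update $x_{m+1} = x_m - \lambda\,\partial h/\partial x$ and the threshold test on $\|i_{m+1} - i_m\|_2$.

```python
import numpy as np

def descend(image, grad_h, step=0.2, delta=1e-3, max_iter=100):
    """Iterate x_{m+1} = x_m - step * dh/dx until the update is small.

    grad_h   : callable returning dh/dx for the current image
               (hypothetical; the paper's entropy gradient is not shown).
    step     : the time step (lambda in the text).
    delta    : stopping threshold on the L2 norm of the update.
    max_iter : safety cap on the number of iterations (an assumption).
    Returns the final image and the number of iterations performed.
    """
    current = np.asarray(image, dtype=float)
    for m in range(max_iter):
        updated = current - step * grad_h(current)
        # For a 2-D array np.linalg.norm gives the Frobenius norm, i.e. the
        # L2 norm of the flattened image difference.
        if np.linalg.norm(updated - current) < delta:
            return updated, m + 1
        current = updated
    return current, max_iter

# Stand-in energy h(x) = 0.5 * ||x - target||^2, whose gradient is x - target.
target = np.full((4, 4), 0.5)
start = np.zeros((4, 4))
result, iterations = descend(start, lambda x: x - target)
```

With the stand-in energy the iterates move steadily toward `target`, and the loop exits once successive images differ by less than `delta`.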

G. Constructing Feature Vectors
Feature vectors were constructed by computing a weighted mean histogram. Since color and texture play complementary roles in the segmentation of a color-textured image, combining them improved the accuracy of the final segmentation result. If there are $C$ channels in the $N$ feature histograms, the weighted mean histogram $\bar{H}$ is computed channel-wise from these histograms, where $w_i$ is the weight allocated to the $i$-th histogram.
As shown in Figure 4, the NN has an input layer, a hidden layer with input $H_{in}(j)$, and an output layer. The initial seed point is denoted $X_i$ and the final seed point $X_2$. The pixel values, in the form of color texture features, are given as input to the input layer.
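The channel-wise weighted mean histogram used for feature-vector construction can be sketched as below. The exact formula is not given in the text, so this assumes the common form $\bar{H} = \sum_i w_i H_i / \sum_i w_i$; the array layout (N histograms, C channels, B bins) and the function name are also assumptions.

```python
import numpy as np

def weighted_mean_histogram(histograms, weights):
    """Channel-wise weighted mean of N feature histograms.

    histograms : array of shape (N, C, B), N histograms with C channels
                 and B bins each (assumed layout).
    weights    : length-N sequence, one weight w_i per histogram.
    Returns an array of shape (C, B).
    """
    histograms = np.asarray(histograms, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Contract over the histogram axis, then normalise by the total weight
    # (the normalisation is an assumption, not stated in the paper).
    return np.tensordot(weights, histograms, axes=(0, 0)) / weights.sum()

# Two single-channel, two-bin histograms with weights 1 and 3.
hists = np.array([[[1.0, 0.0]],
                  [[0.0, 1.0]]])
h_bar = weighted_mean_histogram(hists, [1.0, 3.0])
```

Here the second histogram carries three times the weight of the first, so `h_bar` is pulled toward the second histogram's bin.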
The framework of the CTBPNN is as follows. The hidden layer input $H_{in}(j)$ is defined in terms of the input features $x_i$, the weights $\omega_{ij}$ between the neurons of the input and hidden layers, and the thresholds $a_j$. The CTBPNN estimates involve a non-negative regularization parameter $\lambda$, the predictors $x$ (the blood vessel width and blood vessel tortuosity of the retinal images), the average accuracy $Y$, and the regression coefficients $\beta$. The number of hidden-layer neurons $h$ is determined from $n$, $m$, and $\alpha$, where $n$, $h$, and $m$ are the total numbers of neurons in the input, hidden, and output layers respectively, and $\alpha$ is a threshold between 0 and 20. In this work, $n$ was set to 80, $m$ was set to 3, and $h$ ranged from 20 to 32. Correlation features were calculated separately for the training and testing phases. The results obtained from various other segmentation methods were compared with those of the proposed method.

IV. RESULTS AND DISCUSSION
In this research, retinal images taken from two distinct databases were used. Sensitivity, specificity, and accuracy were used to evaluate the segmentation. Sensitivity represents the probability that the segmentation method correctly identifies vessel pixels. Specificity is the probability that the segmentation method correctly identifies non-vessel pixels. Accuracy represents the overall performance of a segmentation method. They can be computed using the following equations:
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
$$\text{Specificity} = \frac{TN}{TN + FP}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + FN + TN + FP} \qquad (11)$$
where True Positive (TP) denotes vessel pixels correctly segmented as vessel pixels, and True Negative (TN) denotes non-vessel pixels correctly segmented as non-vessel pixels. False Positive (FP) denotes non-vessel pixels segmented as vessel pixels, while False Negative (FN) denotes vessel pixels segmented as non-vessel pixels.
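The three evaluation metrics can be computed from a predicted vessel mask and a ground truth mask as in the sketch below. It assumes the usual definitions of sensitivity, TP / (TP + FN), and specificity, TN / (TN + FP), together with the accuracy of equation (11); the function name and the tiny example masks are made up for illustration.

```python
import numpy as np

def segmentation_metrics(predicted, truth):
    """Sensitivity, specificity and accuracy of a binary vessel mask.

    predicted, truth : arrays of the same shape, non-zero meaning vessel.
    Assumes both classes occur in truth; otherwise a ratio is undefined
    and a ZeroDivisionError is raised.
    """
    predicted = np.asarray(predicted, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(predicted & truth))     # vessel labelled vessel
    tn = int(np.sum(~predicted & ~truth))   # background labelled background
    fp = int(np.sum(predicted & ~truth))    # background labelled vessel
    fn = int(np.sum(~predicted & truth))    # vessel labelled background
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

truth = np.array([[1, 1, 0, 0],
                  [0, 0, 0, 0]])
predicted = np.array([[1, 0, 0, 0],
                      [0, 1, 0, 0]])
metrics = segmentation_metrics(predicted, truth)
```

In this toy case TP = 1, FN = 1, FP = 1, and TN = 5. Because vessel pixels are a small minority of a fundus image, accuracy is dominated by the background class and can remain high even when sensitivity is low.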
The original images, together with the pre-processed, border-detected, color-formation, ground truth, and segmented images of the retinal nerves, are shown in Figure 5. The segmented images are very close to the ground truth images. In this approach, images annotated by ophthalmologists and doctors are considered as the ground truth for calculating segmentation accuracy. The proposed method gives lower sensitivity and specificity than the previous methods on both datasets, and its accuracy is also slightly lower, which is attributed to the magenta color formation step. The comparison between different retinal nerve segmentation methods is shown in Table III. The proposed method needed 24 s on average to segment the retinal nerves, making it comparable to most known methods from the literature.

V. CONCLUSION AND FUTURE WORK
This paper proposed a novel method for segmenting the nerves in retinal images using a combination of color and texture features with a BPNN. Various images taken from two databases were used for training and testing the NN. This approach improved the segmentation accuracy of nerves in retinal images by segmenting the exact nerve region present in them. Color and texture features were computed, and the obtained seed points were given as input for training and testing the proposed method. Compared with various retinal nerve segmentation methods in the literature, the proposed method performed well in vessel segmentation of fundus images, with sensitivity, specificity, and accuracy of 0.470, 0.914, and 0.903 respectively for the DRIVE dataset and 0.447, 0.919, and 0.911 respectively for the STARE dataset. This method could segment the nerve region better than other methods in less time. Future work may include a novel method for detecting glaucoma or other abnormalities present in retinal images.