Identification of beef and pork using gray level co-occurrence matrix and probabilistic neural network

Objective: Identify images of beef and pork using Gray Level Co-Occurrence Matrix (GLCM) texture feature extraction and the Probabilistic Neural Network (PNN) classification algorithm. Design/method/approach: GLCM texture features are extracted from the images and classified with a PNN classifier. Results: Tests with k-fold cross-validation and a confusion matrix show that GLCM feature extraction combined with a PNN classifier achieves an average accuracy of 87%, precision of 83%, and recall of 90%. Authenticity/state of the art: Several scenarios were tested, namely the effect of resize, brightness, and rotation values. Resizing to 256 x 256 pixels gave the best accuracy, 87%. Changing brightness by 20% gave an accuracy of 86% when brightness was increased and 90% when it was reduced. In contrast, rotating the image did not affect the accuracy results. The average accuracy obtained is 87%. The data in this study were collected as primary data by photographing beef and pork from several markets in Denpasar.


Introduction
Beef is one of the most widely consumed commodities in Indonesia, whereas pork consumption among Indonesians is comparatively low [1]. In some areas, however, pig production is relatively high. One of these is the Province of Bali, where high pig production affects pork consumption [2]. At first glance, beef and pork look the same in color and texture. Image processing can be used to recognize beef and pork from images of the two types of meat, based on color and texture. Research on meat identification has been carried out by [3], where meat images are identified with the Gray Level Co-Occurrence Matrix (GLCM) feature extraction method. The meat images are classified based on features and pixel spacing.

Computing and Information Processing Letters, ISSN 2722-4139, Vol. 1, No. 1, November 2021, pp. 17-24

The GLCM feature extraction process is computationally expensive, so the image is first converted into a grayscale image [4]. With a total of 1800 images and four classes of meat, that study obtained a best accuracy of 87.5% at an image-taking distance of 20 cm and a neighboring pixel distance of d = 2. GLCM has also been shown to analyze image textures in another study [5], which identifies beef and pork based on texture. Its test results gave an accuracy of 88.75% for images without a background and 73.75% otherwise. Meat types have been classified in several studies with various methods, such as SVM [3] with an accuracy of 87.5% and KNN [5] with an accuracy of 88.75%. Another classification method that can be applied is the Probabilistic Neural Network (PNN). According to [6], PNN is reliable in classifying images, especially color images with texture characteristics.
PNN was chosen because it does not require a large dataset in the learning stage and processes data quickly, since no repeated (iterative) training is needed to tune the smoothing parameter used to identify the class of the data [7]. However, PNN has the problem that the smoothing parameter is usually determined by trial and error or set by the user to obtain the best accuracy [8]. Meat classification with a combination of GLCM and PNN has been studied in [9] for beef freshness classification, with an accuracy of 75% on a dataset of 80 samples. Other research [10] classified fish species on a dataset of 141 samples, with an accuracy of 89.65%.
Based on the problems described above regarding the identification and classification of meat images, this study applies GLCM feature extraction to identify meat images, using a combination of pixel spacing and six texture features, namely dissimilarity, correlation, homogeneity, contrast, ASM, and energy, to obtain the texture characteristics of the meat image [3]. PNN is applied as the classification method for beef and pork.

Method
This research uses a quantitative method. Its stages are preprocessing, texture feature extraction with the Gray Level Co-Occurrence Matrix, and classification using a Probabilistic Neural Network. The stages of the research can be seen in Figure 1.

Data Collection
The beef and pork were bought at several markets in Denpasar, namely the Badung traditional market, Pepito Renon, and Tiara Dewata Supermarket. For primary data collection, the meat was photographed with a Canon EOS DSLR camera at 4x zoom and an image-taking distance of 20 cm, so that the image covers the overall surface texture of the meat without any background. The photographs were taken indoors under daytime lighting. After each purchase, the meat was cut into several pieces before the images were taken. The meat was not washed, in order to preserve the original color of the two meats. Each piece of meat was photographed at four camera rotations, namely 0°, 45°, 90°, and 135°, and then again on its opposite side in the same positions. Images of the same meat were taken over two days to capture its freshness and color, after which new meat was bought for the next set of images. The images were then labeled manually, with label 0 for beef and label 1 for pork. In total, 400 images were obtained, consisting of 200 images of beef and 200 images of pork. The dataset is divided into 80% training data and 20% test data. The labeled meat image dataset can be seen in Table 1.


Image Preprocessing
Preprocessing is the initial stage of object recognition, carried out before feature extraction with the gray level co-occurrence matrix. The raw images vary in size, so preprocessing is needed to make the image data uniform before feature extraction. Preprocessing is carried out in several stages, namely resizing the image to 256 x 256 pixels, converting the image to HSV for color segmentation of the meat, and converting the image to grayscale [11]. The results of preprocessing can be seen in Figure 2.
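The three preprocessing stages can be sketched in plain Python as follows. This is a minimal illustration and not the study's implementation: the nearest-neighbour resize, the BT.601 grayscale weights, and the `min_saturation` threshold used as a stand-in for HSV color segmentation are all assumptions, and in practice an imaging library such as OpenCV would normally be used.

```python
import colorsys

def resize_nearest(img, size=256):
    """Nearest-neighbour resize of an RGB image given as rows of (r, g, b) tuples."""
    h, w = len(img), len(img[0])
    return [[img[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]

def to_gray(r, g, b):
    """Grayscale value using the common ITU-R BT.601 luma weights."""
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))

def preprocess(img, size=256, min_saturation=0.2):
    """Resize, mask weakly coloured pixels in HSV space, convert to grayscale."""
    out = []
    for row in resize_nearest(img, size):
        gray_row = []
        for r, g, b in row:
            _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            # Hypothetical segmentation rule: pixels with low saturation
            # (grey or white areas) are treated as non-meat and set to 0.
            gray_row.append(to_gray(r, g, b) if s >= min_saturation else 0)
        out.append(gray_row)
    return out
```

Calling `preprocess(image)` on an 8-bit RGB image returns a 256 x 256 grid of gray levels that can be passed on to feature extraction.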

GLCM Feature Extraction
Texture features are extracted to obtain characteristic values from the grayscale image. In this study, the GLCM is built for four angles, namely 0°, 45°, 90°, and 135°, with a neighboring pixel distance of 1. Six Haralick texture features, namely dissimilarity, correlation, homogeneity, contrast, ASM, and energy, are then calculated for the four angles.
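As a rough illustration of this step, the sketch below builds a symmetric, normalised GLCM for each of the four angles at distance 1 and computes the six features from it. The function names, the symmetric counting, and the feature formulas (the commonly used definitions) are assumptions. The text does not say whether the four angles are averaged or kept as separate features, so the sketch simply returns one feature set per angle.

```python
import math

# Pixel offsets (dx, dy) for the four angles at distance 1; rows grow downward.
OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}

def glcm(img, dx, dy, levels):
    """Symmetric, normalised co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                i, j = img[y][x], img[ny][nx]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]

def haralick(p):
    """Six texture features of a normalised GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum(v * (i - mu_i) ** 2 for i, _, v in cells))
    sd_j = math.sqrt(sum(v * (j - mu_j) ** 2 for _, j, v in cells))
    asm = sum(v * v for _, _, v in cells)
    cov = sum(v * (i - mu_i) * (j - mu_j) for i, j, v in cells)
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "ASM": asm,
        "energy": math.sqrt(asm),
        # A constant image has zero variance; report a correlation of 1 then.
        "correlation": cov / (sd_i * sd_j) if sd_i * sd_j > 0 else 1.0,
    }

def features_by_angle(img, levels):
    """Feature dictionary for each of the four angles."""
    return {angle: haralick(glcm(img, dx, dy, levels))
            for angle, (dx, dy) in OFFSETS.items()}
```

For an 8-bit grayscale image, `features_by_angle(gray, 256)` yields 4 x 6 = 24 values, which could be averaged over the angles or concatenated into one feature vector.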
The Haralick equations can be seen in equation (1). The calculation results of the GLCM texture feature extraction can be seen in Table 2.
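For reference, the six features are commonly defined as below, where P(i, j) is the normalised GLCM entry for gray levels i and j. These are the standard textbook forms (the same ones used, for example, by scikit-image); they are given here as an assumption about the definitions used and are left unnumbered because they may not match the original numbering.

```latex
\begin{align*}
\text{Contrast} &= \sum_{i,j} P(i,j)\,(i-j)^2 \\
\text{Dissimilarity} &= \sum_{i,j} P(i,j)\,|i-j| \\
\text{Homogeneity} &= \sum_{i,j} \frac{P(i,j)}{1+(i-j)^2} \\
\text{ASM} &= \sum_{i,j} P(i,j)^2 \\
\text{Energy} &= \sqrt{\text{ASM}} \\
\text{Correlation} &= \sum_{i,j} \frac{(i-\mu_i)(j-\mu_j)\,P(i,j)}{\sigma_i\,\sigma_j}
\end{align*}
% with
% \mu_i = \sum_{i,j} i\,P(i,j), \qquad \sigma_i^2 = \sum_{i,j} (i-\mu_i)^2\,P(i,j),
% and \mu_j, \sigma_j defined analogously over j.
```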

PNN (Probabilistic Neural Network) Classifier
The PNN (Probabilistic Neural Network) classifier is used for image classification because it does not require a large dataset in the learning stage and processes data quickly: no repeated (iterative) training is needed to tune the smoothing parameter used to identify the class of the data [7]. It is a classic probabilistic method that applies Bayes' theorem [13]. According to Haykin (1994), a probabilistic neural network consists of three layers [14]:
1. Input layer: receives the input data produced by feature extraction.
2. Pattern layer: receives the data from the input layer and processes it by adding up the contributions of each input, which produces the PNN output vector. Equation (7) calculates the pattern layer value.
3. Summation layer: a binary output node that produces the classification decision by taking the maximum probability, giving 0 for the beef class and 1 for the pork class. Equation (8) calculates the summation layer value.
Table 3 shows the results of the probabilistic neural network calculations, where the class with the largest total Gaussian value is taken as the final output for identifying the image data.
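The layers described above can be sketched as follows, assuming the usual Gaussian kernel for the pattern layer. The function name `pnn_predict`, the default `sigma`, and the exact kernel form are illustrative assumptions and may differ from equations (7) and (8).

```python
import math

def pnn_predict(x, train_x, train_y, sigma=1.0):
    """Classify feature vector x; returns (predicted label, per-class totals)."""
    totals = {}
    for xi, yi in zip(train_x, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        # Pattern layer: one Gaussian activation per training sample,
        # with sigma as the smoothing parameter.
        activation = math.exp(-sq_dist / (2 * sigma ** 2))
        # Summation layer: add up the activations of each class.
        totals[yi] = totals.get(yi, 0.0) + activation
    # Decision: the class with the largest total Gaussian value wins.
    return max(totals, key=totals.get), totals
```

There is no training loop: the training vectors are stored as they are, and `sigma` is the only value to tune. With equally sized classes, as in this dataset, comparing totals gives the same decision as comparing class averages.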

K-Fold Cross Validation
K-fold cross-validation is a method for validating the accuracy of a system. The dataset is randomly divided into k partitions (folds), and the procedure then iterates over the folds so that each fold serves in turn as the test data while the remaining folds serve as training data.
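A minimal sketch of such a split is shown below. It is stratified, meaning that each fold keeps the same class proportions as the full dataset; stratification, the fixed `seed`, and the function name are assumptions made for illustration.

```python
import random

def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        # Deal the shuffled samples of this class round-robin into the folds.
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    for t in range(k):
        test = folds[t]
        train = [i for n, fold in enumerate(folds) if n != t for i in fold]
        yield train, test
```

With 400 labels (200 per class) and k = 5, every iteration yields 320 training indices and 80 test indices, and each sample is used for testing exactly once.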

Confusion Matrix
The confusion matrix is a method for measuring how well a system classifies data. It compares the classification produced by the model with the actual classification. The confusion matrix table can be seen in Figure 3.
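The comparison can be sketched as below, counting the four confusion matrix cells and deriving accuracy, precision, and recall from them. Treating label 1 as the positive class is an assumption, as are the function names.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, TN, FN) for a binary classification."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall from the four counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```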

Receiver Operating Characteristics (ROC) Curve
The ROC curve is used to visualize and test the performance of a classification model [15]. It is plotted with the true positive rate (sensitivity) on the Y axis and the false positive rate (1 - specificity) on the X axis [16]. The TPR (sensitivity) measures the proportion of actual positives that are correctly identified, while the FPR measures the proportion of actual negatives that are incorrectly identified as positive. The accuracy of the classification is assessed by calculating the area under the ROC curve, called the AUC (Area Under the Curve) [17].
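A small sketch of how the curve and its area can be computed is given below. It assumes labels of 0 and 1 with 1 as the positive class, and a continuous score per sample (for example, a normalised class total from the classifier); both are assumptions for illustration.

```python
def roc_points(y_true, scores):
    """ROC points (FPR, TPR), obtained by lowering the decision threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation of the two classes.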

Results and Discussion
The test is carried out on the dataset of 400 images using k-fold cross-validation with k = 5, which means that the randomly partitioned dataset is tested five times. In each iteration, 80% of the dataset is used as training data and 20% as test data. The next test uses a confusion matrix, calculating the accuracy, precision, and recall values in each fold and taking the average. The last test uses the ROC curve, calculating the area under the curve (AUC). Table 4 shows the results of the k-fold cross-validation, with five folds of 80 test images each and a 50:50 division between beef and pork. The k-fold test with k = 5 shows that the model can classify the meat data with an accuracy of 87%. Each fold compares the actual and predicted data. In fold 1, for example, 31 beef images were correctly predicted as beef and 9 beef images were predicted as pork, while 4 pork images were predicted as beef and 36 pork images were correctly predicted as pork. The results of the other folds can be seen in Table 4. The confusion matrix measures the accuracy of the GLCM and PNN model based on the values of True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN). The resulting accuracy, precision, and recall can be seen in Table 5.
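As a worked example, the fold 1 counts above can be turned into metrics. Taking pork (label 1) as the positive class, which is an assumption, gives TP = 36, FN = 4, FP = 9, and TN = 31:

```latex
\begin{align*}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} = \frac{36 + 31}{80} = 0.8375 \\
\text{Precision} &= \frac{TP}{TP + FP} = \frac{36}{36 + 9} = 0.80 \\
\text{Recall}    &= \frac{TP}{TP + FN} = \frac{36}{36 + 4} = 0.90
\end{align*}
```

These are the values for a single fold. The reported figures are averages over all five folds.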

Tests of resize, brightness, and rotation values
This section tests different resize values to find the one that gives the best accuracy. Resizing in this study means changing the image to a given size with equal width and height. The resize values tested are 16 x 16, 32 x 32, 64 x 64, 128 x 128, 256 x 256, and 512 x 512 pixels. The resize tests in Table 6 and Table 7 show that 256 x 256 pixels works best, with an accuracy of 87% for the training model and a test accuracy of 97% on data outside the training data. The effect of brightness on the training images was also tested, with brightness increased and reduced by 20%. Increasing brightness by 20% gave an accuracy of 86%, while reducing it by 20% gave 90%. Finally, the effect of rotation on the training images was tested, and the rotated images gave an accuracy of 87%.
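One simple way to produce the brightness variants is sketched below. The text does not state how the 20% change was applied, so the multiplicative scaling with clipping to the 8-bit range shown here is an assumption, as is the function name.

```python
def adjust_brightness(gray, percent):
    """Scale every gray level by (1 + percent / 100) and clip to 0..255."""
    factor = 1 + percent / 100
    return [[min(255, max(0, int(round(v * factor)))) for v in row]
            for row in gray]
```

`adjust_brightness(gray, 20)` brightens an image and `adjust_brightness(gray, -20)` darkens it. Bright pixels that reach the upper limit are clipped, which flattens gray-level differences and may change GLCM texture values.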

Conclusion
In this study, using the GLCM (Gray Level Co-Occurrence Matrix) algorithm for texture feature extraction and PNN (Probabilistic Neural Network) classification for beef and pork image identification yields an accuracy of 87%, a precision of 83%, and a recall of 90%. The ROC curve with a 5-fold cross-validation approach gives an average AUC value of 0.87, which lies in the range 0.80 to 0.90 and places the model in the good classification category [17]. The effect of resize, brightness, and rotation values was also tested. The test results show that the resize and brightness values affect the accuracy, while rotation does not. Hue Saturation Value (HSV) color segmentation can be used as a reference for improving model accuracy. The GLCM features dissimilarity, correlation, homogeneity, contrast, ASM, and energy show different values for the two meat images and can therefore be used as a reference in the classification. A smoothing parameter (standard deviation) of 10 in the PNN algorithm can be used as a reference for obtaining the best accuracy.
This study also has limitations in identifying meat images: in images with insufficient or excessive lighting, meat textures are still recognized incorrectly, which leads to identification errors. It is hoped that future research will add other color moments algorithms to handle different lighting.