Artificial Neural Networks on Eggs Production Data Management

Background: Eggs have acquired greater importance as an inexpensive and high-quality protein. The Brazilian egg industry has been characterized by a constant expansion of production in the last decade, which has increased the number of housed animals and facilitated the spread of many diseases. In order to reduce sanitary and financial risks, decisions regarding the production and the health status of the flock must be based on objective criteria. Artificial Neural Networks (ANNs) are a valuable tool for reducing the subjectivity of the analysis. In this context, the aim of this study was to validate ANNs as a viable tool for the prediction and management of commercial egg production flocks.
Materials, Methods & Results: Data from 42 flocks of commercial layer hens from a poultry company were selected. The data refer to the period between 2010 and 2018 and represent a total of 600,000 layers. Six parameters were selected as “output” data (number of dead birds per week, feed consumption, number of eggs, weekly weight, weekly egg production and flock uniformity) and a total of 13 parameters were selected as “input” data (flock age, flock identification, total hens in the flock, weekly weight, flock uniformity, lineage, weekly mortality, absolute number of dead birds, eggs/hen, weekly egg production, feed consumption, flock location, creation phase). The ANNs were built with the software programs NeuroShell Predictor and NeuroShell Classifier. The programs identified the input variables used to assemble the networks that predict the output variables, and the predictions were subsequently validated. Validation consists of comparing the predictions with the real data in the database on which the work was based. The validation of each ANN is expressed by two statistical parameters, the coefficient of multiple determination (R²) and the Mean Squared Error (MSE). For instance, an R² above 0.70 expresses a good validation. The ANN developed for the output variable “number of dead birds per week” presented R² = 0.9533 and MSE = 256.88. For “feed consumption”, the results were R² = 0.7382 and MSE = 274.56. For “number of eggs (eggs/hen)”, the results were R² = 0.9901 and MSE = 172.26. For “weekly weight”, R² = 0.9712 and MSE = 11154.41. For “weekly egg production”, R² = 0.8015 and MSE = 72.60. For “flock uniformity”, R² = -2.9955 and MSE = 431.82.
Discussion: For five of the six ANNs designed in this study, it was possible to validate the predictions by comparing them with the real data. For one output parameter (“flock uniformity”), adequate validation was not possible due to insufficient data in our database. For “number of dead birds per week”, “feed consumption”, “weekly weight” and “uniformity”, the most important variable was “flock age” (27.5%, 52.5%, 55.2% and 37.9%, respectively). For “number of eggs (eggs/hen)”, “uniformity” (52.1%) was the most relevant variable for prediction. For “weekly egg production”, “flock age” and “number of eggs (eggs/hen)” were the most important zootechnical parameters, both with a relative contribution of 38.2%. The results showed that even with a robust tool such as ANNs, it is necessary to have well-recorded and clear information that expresses the reality of the flocks. In any case, the results presented allow us to state that ANNs are capable of managing the data generated in a commercial egg production facility. The evaluation of these data would be improved if ANNs were routinely used by the professionals linked to this activity.


INTRODUCTION
In recent years, eggs have acquired greater importance in the national and international contexts. Brazilian egg production has shown an increasing trend since 2010, reaching more than 44 billion eggs in 2018, and the country is among the ten largest egg producers in the world. Over the same period, in which production increased by more than 54%, national egg consumption rose from 148 units/year in 2010 to 212 units/year in 2018 [3].
The significant increase in egg production reflects a greater number of housed birds and a greater volume of data to be managed. In order to reduce sanitary and financial risks, decisions regarding the production and the health status of the flock must be based on objective criteria. Otherwise, decisions become little more than hunches [18]. Artificial Neural Networks (ANNs), computing systems inspired by the biological neural networks that constitute animal brains, are a valuable tool for reducing the subjectivity of the analysis [26,27]. ANNs take into account nonlinearities in the relationship between the input and output information [22]. They learn the patterns of a data set during the training process, thereby providing consistent predictions and generalization capabilities over test sets [4,22]. Over the years, several studies have implemented ANNs as a viable tool for the management of productive and health data in the poultry production chain [5,13,14,16,17,19,21,23].
In this context, the aim of this study was to validate ANNs as viable tools for the prediction and management of commercial egg production flocks.

MATERIALS AND METHODS

Flock productive data
For this study, data from 42 flocks of commercial layer hens from a poultry company located in Rio Grande do Sul State, in the southern region of Brazil, were used. The data refer to flocks housed between 2010 and 2018 and represent a total of 600,000 layers. The birds were housed using the all-in/all-out replacement method, and data were analysed from 1 to 90 weeks of age.

Selected parameters and variables
The zootechnical parameters used for the model calculations were classified as "input" data, and the parameters to be predicted as "output" data. Six parameters were selected as "output" data: number of dead birds per week, feed consumption, number of eggs (eggs/hen), weekly weight, weekly egg production and flock uniformity.
A total of 13 parameters were selected as "input" data: flock age (weeks), flock identification (covers all the intrinsic characteristics of each flock, including management practices and ambience), total hens in the flock, weekly weight (grams), flock uniformity (determined based on the mean weight (± 10%) of the flock), lineage (four white lineages and four red lineages), weekly mortality (%), absolute number of dead birds, feed consumption (grams), number of eggs (eggs/hen), weekly egg production (%), flock location, creation phase (rearing or production).

Artificial Neural Network (ANN)
NeuroShell Predictor and NeuroShell Classifier software were used to build the ANNs. In total, 59,000 cells of data were obtained for the creation of the ANNs. In a first step, 50% of the data (1,980 lines) was used to train the networks, and the remaining data were used to validate the prediction models.
The accuracy of the models was assessed using the coefficient of multiple determination (R²) and the mean squared error (MSE). The R² is an indicator of how well the model fits the data. Model evaluation also included graphical analysis, in which the network prediction was plotted against the actual value [19].
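The half-and-half partition described above can be sketched in a few lines of Python. This is only an illustration: the source does not state how NeuroShell assigns lines to each subset, so the random shuffle, the `split_half` helper and the example records are all assumptions.

```python
import random

def split_half(rows, seed=0):
    """Shuffle the row order and split the rows into two halves.

    Assumption: a random 50/50 partition; the source does not say how
    the software chose the training lines.
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    cut = len(indices) // 2
    train = [rows[i] for i in indices[:cut]]
    valid = [rows[i] for i in indices[cut:]]
    return train, valid

# Hypothetical weekly records: (flock_age_weeks, eggs_per_hen)
records = [(week, 0.9 * week) for week in range(1, 11)]
train_rows, valid_rows = split_half(records)
```

Every record ends up in exactly one subset, so the validation rows are never seen during training.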
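For reference, the two statistics are commonly defined as shown below, where y_i is the observed value, ŷ_i the network prediction, ȳ the mean of the observed values and n the number of validation lines. These are the standard textbook forms and are assumed here; the source does not print the exact formulas used by the software.

```latex
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2,
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}
```

Under this definition R² equals 1 for a perfect prediction, 0 for a model that is no better than always predicting the mean, and becomes negative when the model does worse than the mean.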

RESULTS
Based on the database, six ANN models were designed in this study, with the following parameters as output variables: number of dead birds per week, feed consumption, number of eggs (eggs/hen), weekly weight, weekly egg production and uniformity. The coefficient of multiple determination (R²) and the mean squared error (MSE) for each model are described in Table 1. Of the six ANN models designed, only the one for the output parameter "flock uniformity" could not be validated by comparing its predictions with the real data. In this case, the inadequate statistical results are related to insufficient data for this zootechnical parameter when it was selected as the output variable. Figure 1 shows an example of the graphical analysis, in which the network prediction is plotted against the actual value of the output variable "number of eggs (eggs/hen)".
The parameters selected as input variables for the ANN models, and their relative contributions (%), are described in Tables 2, 3, 4, 5, 6 and 7. Input variables not shown in a table made no relative contribution to the corresponding output variable.

DISCUSSION
Adequate protein intake is critical for human health and organism development, and eggs are a very good source of inexpensive and high-quality protein [7]. With high demand, the egg production chain needs to reach superior productivity levels in an environmentally sustainable manner. In addition, well-established technical concepts, defined within the rules of food safety, are necessary to ensure good management of the zootechnical data [24]. Data analysis in poultry systems has mainly been performed using mathematical and statistical methods or visual graph analyses. However, the high complexity of biological analyses has led to the use of newer techniques, which allow the development of systems that are more robust to unexpected conditions [15]. In this context, the development of alternative and objective tools for the productive management of farms is essential to guarantee the productivity and health of the flocks. Among these alternative tools, ANNs have proved to be a viable and robust tool for laying hen management [20]. Such methods have been applied in animal sciences, and their popularity is increasing due to their superior predictive ability for complex systems [8].
Previous studies from the research team of the Centro de Diagnóstico e Pesquisa em Patologia Aviária (CDPA), belonging to the Universidade Federal do Rio Grande do Sul (UFRGS), have shown that ANNs can be used for data management in breeder [19] and broiler [16] flocks, in hatchery facilities [21] and in broiler slaughterhouses [14]. They have also proven to be effective tools in the broad management of a complete integration of broiler production [23]. However, this is the first study by the CDPA group focusing on egg production data management using artificial neural networks. Some previous studies in the literature used ANNs as a prediction tool in this area, but in general the only parameters estimated were total egg production or anomalies in egg production, and fewer parameters were selected as input variables [2,8,15,22]. Regarding poultry health, ANNs have already been used to predict the antimicrobial resistance of Escherichia coli strains [17], to evaluate lymphocyte depletion in the Bursa of Fabricius [13] and as an objective criterion for the histological diagnosis of lymphocytic losses in the thymus by image analysis [5].
Values of R² (coefficient of multiple determination) close to 1 indicate a higher quality in the validation of the network, while values further from 1 indicate a lower quality. Previous studies have already shown that R² values above 0.70 in the ANN training processes indicate a good quality of networks for prediction [14,16,19,21,23]. It is important to note that all parameters could also be listed as output variables. The choice depends on the needs of the veterinarian and the company. While the ANNs are being modelled, their quality must be checked. When data prediction is performed, it is very important to recognize the error. Since individual errors can be positive or negative and may cancel each other out when summed, the errors are squared to avoid this cancellation effect. MSE values indicate the error in the prediction of a specific variable. Lower MSE values indicate a better prediction [22].
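The cancellation effect can be seen in a minimal Python sketch. The counts below are hypothetical and chosen only so that positive and negative errors offset each other; the helper names are not taken from the source.

```python
def mean_error(actual, predicted):
    """Average signed error; positive and negative errors can cancel."""
    return sum(a - p for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average squared error; every miss contributes a positive amount."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical weekly dead-bird counts and network predictions
actual = [10, 20, 30, 40]
predicted = [15, 15, 35, 35]  # each prediction is off by 5 birds

bias = mean_error(actual, predicted)         # errors -5, +5, -5, +5 -> 0.0
mse = mean_squared_error(actual, predicted)  # (25 + 25 + 25 + 25) / 4 -> 25.0
```

The signed mean suggests a perfect model, whereas the MSE reports the 5-bird miss on every week.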
For "number of dead birds per week", the most important variable was "flock age" (27.5%). Other variables also showed a significant relative contribution, such as "flock location" (16.4%), "weekly weight" (15.6%), "weekly mortality" (14.6%), "egg production per hen" (12.3%) and "uniformity" (10.9%). The importance of these variables in predicting the number of dead birds was expected, as they are directly reflected in the results [9]. However, the importance of the variable "flock location" must be highlighted, since it shows that structural and management conditions are among the main factors that influence mortality on a farm. Information such as room temperature and serological monitoring values was not available for this experiment. Although these data are often not taken into account by poultry companies, they would possibly be relevant for the prediction of this particular parameter [19]. Training the ANNs with these and other relevant data could change the relative contributions of the input variables displayed in Table 2. A value of R² = 0.9533 in the validation process indicates a very good quality network for predicting the number of dead birds per week. However, this observation does not rule out the hypothesis that a more robust data set would provide an even higher ANN quality.
For "feed consumption", the two most important variables were "flock age" (52.5%) and "total hens in the flock" (26.4%). There is a close relationship between feed consumption and the age of the birds [6]. In the ANN validation process, a value of R² = 0.7382 was obtained. Despite presenting the lowest R² among the validated networks, this network has a good prediction capacity. The available data used for the training and validation of the ANNs do not distinguish white and red lineages, which clearly have different feed consumption [25]. This characteristic of the data set is the likely cause of the less accurate model.
For "number of eggs (eggs/hen)", "uniformity" (52.1%) was the most relevant variable for prediction. Uniformity is a measure of the body weight variation in the flock and is an important aspect of layer production, especially in the rearing phase [1]. In the same model, "flock age" is also an important input variable, with a relative contribution of 36.1%. This was expected, since the effect of age on egg production in hens has been previously described [11]. A value of R² = 0.9901 in the ANN validation process demonstrates that this network is very suitable for predicting "number of eggs (eggs/hen)". The use of mathematical models to estimate egg production curves is of great importance for evaluating egg production over the laying cycle. These models may serve, for example, to estimate the financial loss caused by a decline in egg production, as evinced by a deviation from the expected curve [10,22].
The zootechnical parameter "flock age" (55.2%) was the most important for predicting "weekly weight" in a layer farm. The live weight profile is affected by age in both meat-type and egg-type chickens, and different growth functions may be used to evaluate this relation [12]. The number of "total hens in the flock" (31.2%) also makes an important contribution to this output. A value of R² = 0.9712 demonstrates the high quality of the ANN for predicting "weekly weight". For this variable, it is important to highlight that white and red lineages were not distinguished in the database. Although different genetic groups of layer hens show variable growth rates under the same environmental and nutritional conditions [25], this aspect did not impair the accuracy of the ANN.
For "weekly egg production", "flock age" and "number of eggs (eggs/hen)" were the most important zootechnical parameters, both with a relative contribution of 38.2%. These results were expected, since these input variables bear directly on the weekly egg production. A value of R² = 0.8015 demonstrates the fitness of the ANN for prediction. Other factors can also influence the prediction of "weekly egg production", including variations in the period of natural and artificial light or the season in which the laying period begins, which are not regularly recorded by the companies.
The zootechnical parameters "flock age" (37.9%) and "weekly weight" (35.0%) appeared to be the most important input variables for predicting the output "uniformity". However, the negative coefficient of multiple determination (R² = -2.9952) does not allow the network to be validated. Neural networks necessarily pass through two phases: the training phase, in which each error in membership assignment is fed back and the connection weights are updated, and the testing or validation phase, in which the ANN prediction is compared with the real data [22]. Generally, 50% of the database is selected for training the networks and the remaining data are used to validate the prediction models. In this case, there were insufficient data to validate the prediction of the output variable "uniformity", resulting in a model without adequate accuracy. A small database is therefore a problem in ANN development, because it cannot be partitioned into fairly sized subsets for training and validation [4]. Besides quantity, ANNs depend on the quality of the database, as observed for any conventional statistical model [4,22]. In this study, 20% of the data field was discarded due to annotation errors. Another limitation of ANNs is the inability to explain in a comprehensible form the process through which the model reached a given decision or answer, which is why the model is considered a "black box" [4].
Neural networks, as observed with the models built in this study, can be fitted to any kind of data set and do not require model assumptions of the type required in nonlinear methodologies [26]. Even with insufficient or incomplete data in some cases in the study, the ANN models, which are characterized by a high tolerance to data containing measurement errors [27], presented a good fit and prediction of the results.
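A negative R² may look surprising for a quantity written as a square. A minimal Python sketch, assuming the common definition R² = 1 - SS_res / SS_tot (the source does not print the formula used by the software), shows how it arises; the uniformity values and both prediction series are hypothetical.

```python
def r_squared(actual, predicted):
    """R² = 1 - SS_res / SS_tot (assumed definition).

    Undefined when all actual values are identical (SS_tot = 0).
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical flock uniformity values (%) on four validation lines
actual = [80, 82, 84, 86]
good_pred = [81, 82, 84, 85]  # close to the observed values
poor_pred = [90, 74, 92, 78]  # worse than always predicting the mean (83)

r2_good = r_squared(actual, good_pred)  # close to 1
r2_poor = r_squared(actual, poor_pred)  # below 0
```

Under this definition, a negative value means that the squared prediction errors exceed the spread of the observed values around their own mean, so the network performs worse than a constant prediction of the mean.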

CONCLUSIONS
The artificial neural networks were able to predict zootechnical and management data in commercial laying hen farms. Despite insufficient data for some variables, the ANNs predicted the responses correctly. Better record keeping in the databases would likely improve the models that did not present satisfactory results. We hope that the analysis conducted in this article can serve as a reference for the choice of ANNs for data management analysis in the egg industry.

Funding. The authors received no specific funding for this work.

Declaration of interest.
The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.