A Taxonomy of Intelligent Software Reliability Models

Reliability is the probability of failure-free software operation for a specified period of time in a specified environment. It is one of the attributes of software quality, and its study dates back to 1384. The spread of new software systems and their profound effect on human life emphasized the importance of software reliability analysis, and a formal definition was given in 1975. The first generation of reliability analysis methods, which we call classic methods, takes a stochastic process approach and in this way attempts to predict the future behavior of the software. Because the usefulness of these solutions remains unclear, the challenge of reliability analysis has continued until now. The 1990s saw a strong tendency to apply intelligent systems in a variety of applications, and software reliability attracted some of this research. A variety of reliability analysis methods based on intelligent systems have been proposed since then. This survey presents a taxonomy of these methods with a brief description of each, and closes with a comparison between them.


Introduction
The growing use of software in sensitive and costly fields such as the navigation of military systems, space robots, and medicine, together with the growing complexity of production applications, makes it more necessary than ever to have approaches for evaluating the failure-free performance of applications, along with the time and expense spent in this area. Reliability is the most important parameter of software quality in software engineering [7]. Its generally accepted definition is "the probability of operating without failure during a specific period of time and in a specific environment" [5].
The history of evaluating the reliability of systems can be traced back to 1384 [1]. The subject was formally defined for software in 1975 [2]. This definition and those presented after it have not yet led to an accepted solution in this field [4]. The use of intelligent methods in this field started in the early 1990s [3] and peaked during that decade, although a limited number of new papers are still being published. Although it seems unreasonable to evaluate software reliability without considering the hardware infrastructure, most software reliability models assume the hardware to be flawless. Some models, however, drop this assumption and treat the two together [6]. The second part presents a general classification of intelligent methods for this field. The third part deals with the methods based on neural networks. The fourth part investigates genetic algorithms, and the fifth part studies support vector machine methods. The sixth part introduces some criteria for comparing the methods, and the seventh and eighth parts present the comparison of the methods and the conclusion, respectively.

Classification of intelligent methods for evaluating reliability
Generally, the intelligent models for evaluating software reliability are called nonparametric models. The name contrasts them with the classic models, which are called parametric models [8] because they work by estimating the unknown parameters in the distribution function of the model. In the parametric models, functions named mean value functions or hazard functions, which have one or more unknown parameters, are presented. The values of these parameters are obtained with estimation methods, and the reliability at future times is then estimated from the resulting function. The intelligent models presented in this field can be classified into three general categories according to the intelligent technique they use: neural networks, genetic algorithms, and support vector machines. This classification follows the approach that is used; more detailed classifications are possible within each group and are dealt with in the next parts.
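The parametric workflow described above (assume a function with unknown parameters, estimate them, then predict reliability at future times) can be sketched in a few lines. This is only an illustration under stated assumptions: the Goel-Okumoto mean value function m(t) = a(1 - e^(-bt)) is chosen here as one well-known classic model, although the text does not single out any model; least squares on a grid replaces maximum likelihood to keep the code dependency-free; and the data are synthetic.

```python
import math

def mean_value(t, a, b):
    """Goel-Okumoto mean value function (assumed model): expected failures by time t."""
    return a * (1.0 - math.exp(-b * t))

def fit_least_squares(times, cum_failures):
    """Estimate (a, b). For each b on a grid, the best a has a closed form."""
    best = None
    for i in range(1, 2001):
        b = i * 0.001  # grid over b in (0, 2]
        g = [1.0 - math.exp(-b * t) for t in times]
        a = sum(x * y for x, y in zip(g, cum_failures)) / sum(x * x for x in g)
        sse = sum((a * x - y) ** 2 for x, y in zip(g, cum_failures))
        if best is None or sse < best[0]:
            best = (sse, a, b)
    return best[1], best[2]

def reliability(x, t, a, b):
    """P(no failure in (t, t + x]) when failures follow an NHPP with this mean value function."""
    return math.exp(-(mean_value(t + x, a, b) - mean_value(t, a, b)))

# Synthetic, noise-free cumulative failure counts generated from a = 100, b = 0.05.
times = list(range(1, 31))
observed = [mean_value(t, 100.0, 0.05) for t in times]

a_hat, b_hat = fit_least_squares(times, observed)
next_step_reliability = reliability(1, times[-1], a_hat, b_hat)
```

With real failure data the fit would not be exact, and the choice of the assumed function is precisely the model-selection problem that the rest of this survey discusses.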

Methods based on neural networks
Many factors influence software reliability, including the methodology used to develop the software, the software type, the development environment, the complexity of the software, the organization, and the personnel producing it. These factors have a strong effect, and because many of them are qualitative (so some cannot be measured precisely and numerically), reliability follows a highly non-linear pattern. For problems with many factors that are hard to quantify, most experts select neural networks because these networks can approximate complicated functions well. Neural networks have therefore received more attention than the other two techniques in this field, and many papers have been presented on this approach [9][10][11][12][13][14][15]. Neural networks were first applied to this field in [14], [15]. According to the approach used, the papers that apply neural networks can be classified into four main groups.
The first category includes error-error models. The name reflects the fact that these models give the neural network the number of errors that occurred during previous tests and predict the number of failures expected in the next interval. In other words, both the inputs and the outputs of the neural network are numbers of errors [16], [17], [18]. Different papers report different results as the type or structure of the network changes and as the number of network inputs varies. In [14], a multilayer feed-forward network, a recurrent network, and a radial basis function network are compared according to the squared error of their future predictions. Table 1 indicates the results of this comparison.
As can be seen, the recurrent network is reported to perform better than the others. Other papers have also reported that recurrent networks predict reliability-related parameters better than other networks, a capability that results from the innate ability of such networks to predict real-world phenomena [19], [20], [21]. The poor performance of radial basis function networks is among the interesting reported results. Given the strength and flexibility of these networks in approximating functions, this weak result may have two causes: (1) the amount of training data was insufficient, or (2) the number of neurons in the hidden layer of the radial basis function network was too small. If further tests show that neither of these causes explains the poor performance, then the only remaining explanation is a lack of input parameters (inputs other than the number of failures) for estimating the output. This is a very important problem that has not been taken into account so far.
The second category includes time-time models. Like the first category, it is named after the type of input and output of the neural network. These models give the neural network the times from the history of the application, that is, the times between failures, and then predict the time to the next failure. More papers have been proposed in this category than in the others. No definite reason has been given for this, but it may be that this category yields fairly satisfactory results [22][23][24][25][26]. A comprehensive investigation of the ability of the multilayer feed-forward network under this approach is carried out in [22].
However, no definite result has been presented for the optimal structure of a neural network for evaluating reliability. In fact, the simulations conducted in [22] indicate that such a structure cannot be given. According to [22] and the final results of the other papers [23][24][25][26], predictive ability depends mainly on the training data, that is, on the input data. In other words, it is not possible to present a single neural network that performs consistently across different datasets [22][23][24][25][26].
The most interesting and useful way of using neural networks to predict reliability appears to be the hybrid models. By nature, however, these models offer nothing new of their own, and whatever they present is adapted from the classic models. In some of these models the numbers of failures that occurred during previous intervals are given to the neural network as input, and in others the lengths of previous intervals, and the network predicts the next interval or the number of subsequent failures. In this respect they are no different from the other two categories. What distinguishes them is that they combine several classic software reliability models in one neural network and present a hybrid output. By choosing the activation functions of the hidden-layer neurons deliberately, the network is made to behave as a hybrid of classic models. Like the classic models, the hybrid models attempt to find the unknown parameters of an assumed distribution function.
The difference is that the classic models find the unknown parameters with estimation methods (mostly maximum likelihood), whereas the hybrid neural network model finds appropriate values for the weights, which play the role of the unknown parameters of the distribution function, by back-propagation training, that is, by minimizing the mean squared error. The activation functions of different neurons can be selected according to the distribution functions proposed in the classic models, and the outputs of these neurons can be combined in the next layer of the network so that a weighted combination of the classic models is obtained. An instance of these models is presented in [27], which comprehensively compares the hybrid model with the usual neural network and with the classic models and observes higher performance for the hybrid model. Given this performance, hybrid neural network models may be able to resolve the main problem of software reliability, namely the selection of an appropriate model for the environment and the target application. In other words, by selecting the activation functions according to the distribution functions of the classic models, training the network, and then inspecting the weights computed in the output layer, we can select the classic model that corresponds to the largest output-layer weight.
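The model-selection idea in the preceding paragraph can be illustrated with a deliberately small sketch. It is not the network of [27]: the two hidden-unit shapes (an exponential and a delayed S-shaped growth curve), their fixed rate of 0.1, the normalized synthetic targets, and the fact that only the output weights are trained by gradient descent on the mean squared error are all assumptions made for illustration.

```python
import math

# Hidden-unit activations borrowed from classic growth curves (shapes and rates assumed).
def exponential_unit(t):
    return 1.0 - math.exp(-0.1 * t)

def s_shaped_unit(t):
    return 1.0 - (1.0 + 0.1 * t) * math.exp(-0.1 * t)

UNITS = [exponential_unit, s_shaped_unit]

def train_output_weights(times, targets, epochs=5000, lr=0.3):
    """Batch gradient descent on the mean squared error over the output-layer weights."""
    hidden = [[unit(t) for unit in UNITS] for t in times]
    weights = [0.0] * len(UNITS)
    n = len(times)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for h, y in zip(hidden, targets):
            err = sum(w * x for w, x in zip(weights, h)) - y
            for j, x in enumerate(h):
                grad[j] += 2.0 * err * x / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

# Synthetic targets: normalized cumulative failures that follow the exponential shape.
times = list(range(1, 41))
targets = [exponential_unit(t) for t in times]

weights = train_output_weights(times, targets)
# The unit with the largest output weight indicates the best-matching classic shape.
dominant = max(range(len(weights)), key=lambda j: weights[j])
```

Because the synthetic data were generated from the exponential shape, training drives the first weight toward 1 and the second toward 0, so the exponential unit is selected. A full hybrid network would also train the parameters inside each unit.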
A few neural network models do not fit into any of the groups defined above. They are not studied here because they are few and not widely used. In such models, the inputs of the neural network are the intervals between failures and the output is the number of failures predicted for the next interval, or vice versa [18][19][20][21][22][23][24][25][26][27][28][29].
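The error-error and time-time formulations in this section differ only in which history is fed to the network. In both cases the training set can be built by sliding a window over the history. The sketch below shows one way to do this; the function name, the window sizes, and the sample data are hypothetical.

```python
def make_windows(series, window):
    """Turn a 1-D history into (inputs, target) pairs for one-step-ahead prediction."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be between 1 and len(series) - 1")
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]

# Hypothetical failure counts per test interval (error-error formulation).
failures_per_interval = [9, 7, 8, 5, 4, 4, 2, 3, 1, 1]
# Hypothetical times between successive failures (time-time formulation).
times_between_failures = [3.0, 5.5, 4.0, 9.0, 12.5, 11.0, 20.0]

error_error_pairs = make_windows(failures_per_interval, window=3)
time_time_pairs = make_windows(times_between_failures, window=2)
```

Each pair holds the network inputs (the most recent `window` observations) and the value to be predicted. The window size corresponds to the number of network inputs, which the surveyed papers report as one of the factors that change the results.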

Methods based on genetic algorithms
The genetic algorithm is a method for searching a problem space to find an optimal value. Given this definition, and given that classic models reduce the prediction of software reliability to estimating the unknown parameters of distribution functions, the way genetic algorithms can be used in this area becomes clear. As stated earlier, the problem of estimating software reliability has turned into the problem of estimating the unknown parameters of the distribution functions; it can therefore be recast as finding the optimal values of these parameters by defining them as the parameters of a genetic algorithm. Although such a method is possible, no paper has pursued it so far. Perhaps the reason is the lack of a classic model that is widely accepted (and for which it would therefore be worth spending time to find the optimal parameter values). Nevertheless, for the sake of a complete classification, the genetic-algorithm-based models for evaluating software reliability fall into two general categories.
The first group, named parameter-exploring models, comprises the models just described. The second group, named model-parameter exploratory models, includes the models that attempt to find the distribution function and its parameters simultaneously. In other words, these approaches search the space of functions, guided by the sample data, in order to select the best function for predicting software reliability. These models make no presumptions about the distribution function and attempt to find the function itself together with its parameters.
Using the training data, we attempt to find the best distribution function (or the best function that can give an estimate for the future according to the training data). In other words, the problem of evaluating software reliability has turned into an optimization problem of finding the best function. The natural solution is genetic programming [30], a branch of genetic algorithms in which the individuals are functions and the operators applied to the individuals also produce functions. An instance of these models is presented in [31], whose results make the inefficiency of classic models on some datasets obvious. The intelligent models, in contrast, show more flexibility and did not give disappointing results on any of the datasets. The tests conducted in [31] also show that trigonometric and exponential functions are not efficient in this field. A very important result in [31] is that no single model is efficient across different datasets. This result is consistent with those of the other models. In other words, the characteristics of the input data have a great impact on the output of the proposed models, so it is not possible to select one model as efficient and comprehensive for every type of dataset. Statistically, however, the outputs of the genetic algorithm are better than those of the other models, which indicates a relative superiority of the model-parameter exploratory genetic algorithm over the neural network model and the classic model. Perhaps the reason lies in the presumption-free approach of the genetic algorithm models.
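Although this section notes that no published model takes the parameter-exploring route, the idea is easy to sketch. The example below is an assumption-laden illustration, not a method from the literature surveyed here: it assumes the Goel-Okumoto mean value function as the classic model, the sum of squared errors as the fitness, and a simple real-coded genetic algorithm (truncation selection, arithmetic crossover, decaying Gaussian mutation). All names and settings are hypothetical.

```python
import math
import random

def mean_value(t, a, b):
    """Assumed classic model: Goel-Okumoto mean value function."""
    return a * (1.0 - math.exp(-b * t))

def sse(params, times, observed):
    """Fitness (lower is better): sum of squared errors of the fitted curve."""
    a, b = params
    return sum((mean_value(t, a, b) - y) ** 2 for t, y in zip(times, observed))

def genetic_search(times, observed, bounds, pop_size=60, generations=200, seed=0):
    rng = random.Random(seed)
    pop = [tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(pop_size)]
    for gen in range(generations):
        pop.sort(key=lambda p: sse(p, times, observed))
        parents = pop[:pop_size // 2]           # truncation selection keeps the best half
        scale = 0.05 * (0.98 ** gen)            # mutation strength decays over time
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.choice(parents), rng.choice(parents)
            child = []
            for (lo, hi), g1, g2 in zip(bounds, p1, p2):
                mix = rng.random()
                gene = mix * g1 + (1.0 - mix) * g2            # arithmetic crossover
                gene += rng.gauss(0.0, scale * (hi - lo))     # Gaussian mutation
                child.append(max(lo, min(hi, gene)))          # keep inside the bounds
            children.append(tuple(child))
        pop = parents + children
    return min(pop, key=lambda p: sse(p, times, observed))

# Synthetic cumulative failure counts generated from a = 100, b = 0.05.
times = list(range(1, 31))
observed = [mean_value(t, 100.0, 0.05) for t in times]

best = genetic_search(times, observed, bounds=[(1.0, 200.0), (0.001, 1.0)])
best_error = sse(best, times, observed)
```

A model-parameter exploratory approach would replace the fixed `mean_value` function with individuals that are themselves expression trees, as in genetic programming [30].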

Methods based on support vector machines
The support vector machine [32] is widely used for non-linear prediction problems. It was originally presented for pattern recognition, but a modified version named support vector regression (SVR) [33] has been presented for estimating functions, in other words for regression. The success of the support vector machine in different fields has drawn experts' attention to its use in software reliability, although it has been adopted there less widely than the other techniques. All the methods proposed in the field of software reliability use support vector regression. The main idea is to find a function that estimates the number of failures or the interval between the next two failures. A classification according to the type of input and output, like the one for neural network models, could therefore be presented; however, since few models have been presented with this approach, no such classification has been made. According to the simulations in the published papers, the predictions of SVR are claimed to be better than those of genetic algorithms or neural networks [34][35][36][37][38][39]. However, the limited adoption of SVR in reliability compared with other methods casts some doubt on this claim. For instance, [34] proposes a model that uses the data presented in [35]. Its input is the cumulative time between failures from previous periods, and it predicts the cumulative time between failures for the next step. The interesting point in the results of [34] is that, as the number of previous input values increases, the error of the support-vector-machine-based model grows in comparison with neural network models.
Given these results, and the fairly satisfactory results of classic models based on the Markov model, it may be that predicting the cumulative failure time of the next step does not depend on all of the previous observations. This is still only a hypothesis, and it has not yet been investigated or proven precisely.
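What distinguishes support vector regression from an ordinary least-squares fit is its epsilon-insensitive loss: predictions that fall within a tube of width epsilon around the observed value cost nothing. The sketch below evaluates the standard primal objective of a linear SVR. It is an illustration only; the function names, the toy data, and the values of epsilon and C are assumptions, and the models cited in this section use kernels and a proper optimizer.

```python
def epsilon_insensitive(residual, epsilon):
    """Errors inside the epsilon tube cost nothing; outside it the cost grows linearly."""
    return max(0.0, abs(residual) - epsilon)

def svr_objective(w, b, xs, ys, epsilon=0.5, c=1.0):
    """Primal objective of a linear SVR: 0.5 * ||w||^2 + C * sum of tube violations."""
    penalty = sum(
        epsilon_insensitive(sum(wi * xi for wi, xi in zip(w, x)) + b - y, epsilon)
        for x, y in zip(xs, ys)
    )
    return 0.5 * sum(wi * wi for wi in w) + c * penalty

# Toy data: one input feature (for example, a previous cumulative failure time).
xs = [[1.0], [2.0]]
ys = [1.0, 4.0]
value = svr_objective([1.0], 0.0, xs, ys)
```

Training an SVR means choosing `w` and `b` to minimize this objective. The flat region of the loss is what makes the solution depend only on the points outside the tube, the support vectors.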

Comparison of methods
The classic models begin by stating some presumptions about the environment and the target application; they thereby narrow the area in which the model applies in order to simplify the rules governing the model. Intelligent methods, by contrast, make no presumption about the environment or the software application, and this is their main advantage. However, almost all the papers, across the majority of approaches (except for the hybrid neural network approaches, whose models are few), have reached the conclusion that the efficiency of the proposed model is highly sensitive to the input data. This shows that intelligent models are not efficient in all circumstances and, in fact, confirms the necessity of some of the presumptions stated in classic models. All intelligent models therefore share two main flaws:
1) Their lack of generality, in other words their inability to give satisfactory results in all environments and circumstances.
2) Their failure to state the presumptions or the circumstances necessary for the proposed model to perform satisfactorily.
In Table 1, a general comparison of the explained methods is presented. As seen above, the three intelligent approaches that have been considered in the software reliability field are neural networks, genetic algorithms, and support vector machines. These three approaches are compared with each other in Table 2.

Conclusion
It is obvious that producing a flawless system is not possible; therefore, the objective of evaluating software reliability cannot be the production of flawless applications. The objective is rather to decrease the errors. The answer to the question "What is the acceptable threshold of error in systems?" can thus resolve the challenge in evaluating software reliability. The classic models of reliability evaluation mostly attempt to find a probability distribution function for the subject so that they can predict the future according to that function, and finding this distribution function and predicting the future have proven to be a challenge. Intelligent approaches have made no particular innovation in the main problem; they attempt to find a solution while accepting the problem as it is (inputs, outputs, and assumptions). The degree of this acceptance varies from a maximum in the hybrid neural network approaches and the parameter-exploring genetic algorithm models to a minimum in the model-parameter exploratory genetic algorithms. The strong dependence of all existing approaches (intelligent and classic) on the input data suggests that there are other influential parameters that have not yet been taken into account in the definition of the main problem. For instance, the hypothesis that no new errors are introduced while a discovered error is being fixed is far from the real world of software, yet the majority of models simply overlook this issue. Given the extensive research that has been conducted on classic reliability evaluation and the challenge that still remains, it appears that accepting the existing problem as it is and presenting intelligent approaches for it does not do much to solve it. Moreover, the results of the proposed intelligent approaches confirm this assertion.
Considering the power of intelligent methods in handling problems in multi-dimensional spaces, a step toward this goal can be expected from redefining the problem of reliability evaluation from a different perspective. Finally, although accepting the input parameters of a problem that was introduced many years ago and applying modern techniques to it may sometimes be helpful, it is neither new nor remarkable. Intelligent techniques are also helpful to some extent in the field of software reliability. However, the lack of precise mathematical analysis of these techniques is a weak point that discourages experts from using them.

Table 2. Comparison of the intelligent approaches

Genetic algorithm, parameter-exploring
Description: Uses the main ability of the genetic algorithm to find the optimal values of the unknown parameters in the functions proposed for classic models.
Advantages: Simple implementation. It uses the main characteristic of the genetic algorithm, the search for an optimal value.
Disadvantages: It cannot be considered a model in its own right; it is a method for finding the unknown parameters of another model. Therefore, when the underlying model changes, the values of the unknown parameters and the output results differ.

Genetic algorithm, model-parameter exploratory
Description: Uses genetic programming to find the optimal function for evaluating and predicting reliability.
Advantages: It solves the problem without any presumption. Unlike other models, it searches for both the function and its parameters; therefore, it appears to be more flexible when dealing with different datasets.
Disadvantages: Few models have been presented, so it is hard to make a definite statement about this approach. Achieving the desired result requires considerably more training data than other methods.

Support vector machine
Description: Uses the SVR version of the support vector machine to estimate the reliability function for future times.
Advantages: SVM is one of the most efficient methods for estimating non-linear functions. Compared with neural network models, it generalizes well; therefore, it gives fairly satisfactory results for different datasets.
Disadvantages: Fewer models have been presented with this method than with the others, which makes its efficiency difficult to evaluate.

Neural networks
Advantages: Simple implementation. The efficiency of neural networks as a method for estimating functions has been proven [40]. No particular parameter adjustment is required. No particular presumption is stated to solve the problem. The variety of available models based on this method simplifies evaluation and comparison. Fairly good ability to deal with noisy data.
Disadvantages: The network is treated as a black box, and there is no precise mathematical analysis of the efficiency of the proposed model. The networks attempt to learn the model present in the training data and commonly suffer from overfitting or a lack of generalizability.