Outline • Basic Concepts • Software reliability models • Software reliability measure simulation • A project application • Conclusion and future work • Q&A
What is Software Reliability • The probability of failure-free software operation for a specified period of time in a specified environment. Some related concepts: failure, fault, time, failure functions... Section 1-1
Failure: the departure of the external results of program operation from requirements. Fault: the defect in the program that, when executed under particular conditions, causes a failure; it is the cause of the failure. A “failure” is dynamic: the program has to be executing for a failure to occur. A “fault” is static: it is a property of the program, not of its execution behavior, and it is created when a programmer makes an error. Section 1-2
Time: Reliability quantities are defined with respect to time. We are concerned with three kinds of time: execution time, the CPU time actually spent by the computer in executing the software; calendar time, the time people normally experience in terms of years, months, weeks, etc.; and clock time, the elapsed time from start to end of computer execution in running the software. Section 1-3
There are four general ways of characterizing failure occurrences in time: 1). Time of failure, 2). Time interval between failures, 3). Cumulative failures experienced up to a given time, 4). Failures experienced in a time interval. Section 1-4
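The four characterizations carry the same information and can be derived from one another. The sketch below is a minimal illustration that starts from a list of failure times and derives the other three views. The function name `characterize`, the parameter `bin_width`, and the sample failure times are assumptions made for the example and are not data from the project.

```python
def characterize(failure_times, bin_width):
    """Derive the other three views of a failure history from failure times.

    failure_times: times at which failures occurred (hypothetical sample data).
    bin_width: length of each observation interval, in the same time unit.
    """
    times = sorted(failure_times)
    # 2) time interval between consecutive failures (first one measured from 0)
    intervals = [t - prev for prev, t in zip([0] + times[:-1], times)]
    # 3) cumulative failures experienced up to each failure time
    cumulative = list(range(1, len(times) + 1))
    # 4) failures experienced in each interval of length bin_width
    n_bins = int(times[-1] // bin_width) + 1
    per_interval = [0] * n_bins
    for t in times:
        per_interval[int(t // bin_width)] += 1
    return intervals, cumulative, per_interval


# Failure times 10, 19, 32, 43, 58 observed in intervals of length 20
intervals, cumulative, per_interval = characterize([10, 19, 32, 43, 58], 20)
print(intervals)     # [10, 9, 13, 11, 15]
print(cumulative)    # [1, 2, 3, 4, 5]
print(per_interval)  # [2, 1, 2]
```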
Failure functions: once a time basis is chosen, failures can be expressed in several ways • The cumulative failure function (mean value function): the average cumulative number of failures associated with each point in time, • The failure intensity function: the rate of change of the cumulative failure function, • The failure rate function (rate of occurrence of failures): the probability that a failure per unit time occurs in the interval $[t, t+\Delta t]$, given that a failure has not occurred before $t$. We will use failure rate functions for our simulation. Section 1-5
Software reliability modeling • The aspect of SRE (software reliability engineering) that has received the most attention. • Many models have been proposed since the 1970s. • The basic idea: a software reliability model describes failures as a random process, characterized either by the times of failures or by the number of failures at fixed times. Section 2-1
Let $N(t)$ be a random process representing the number of failures experienced by time $t$. Then $\mu(t)$, the mean value function, is defined as $\mu(t) = E[N(t)]$, which represents the expected number of failures at time $t$. The failure intensity function $\lambda(t)$ of the $N(t)$ process is the instantaneous rate of change of the expected number of failures with respect to time; it is defined by $\lambda(t) = \frac{d\mu(t)}{dt}$. Section 2-2
Nonhomogeneous Poisson Process (NHPP) Models • The counting process $\{N(t), t \ge 0\}$ is modeled by an NHPP: $N(t)$ follows a Poisson distribution. • The probability that $N(t)$ is a given integer $n$ is $P\{N(t) = n\} = \frac{[m(t)]^n}{n!} e^{-m(t)}$, $n = 0, 1, 2, \ldots$ • $m(t)$ is called the mean value function; it describes the expected cumulative number of failures in $[0, t)$. Section 2-3
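As a small numerical illustration of the Poisson form, the sketch below evaluates the probability of observing exactly n failures by time t once the value of the mean value function at t is known. The function name `nhpp_pmf` and the value 5.0 used for the mean value function are assumptions chosen for the example.

```python
import math


def nhpp_pmf(n, m_t):
    """P{N(t) = n} for an NHPP whose mean value function equals m_t at time t."""
    return math.exp(-m_t) * m_t ** n / math.factorial(n)


# Assume the mean value function has reached 5.0 expected failures at time t.
m_t = 5.0
for n in (0, 5, 10):
    print(n, nhpp_pmf(n, m_t))

# The probabilities over all n sum to 1 (the tail beyond n = 59 is negligible here).
print(sum(nhpp_pmf(n, m_t) for n in range(60)))
```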
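The general assumptions of NHPP models • $N(0) = 0$, • $\{N(t), t \ge 0\}$ has independent increments, • $P\{N(t+h) - N(t) = 1\} = \lambda(t)h + o(h)$, • $P\{N(t+h) - N(t) \ge 2\} = o(h)$. Here $o(h)$ denotes a quantity that tends to zero faster than $h$, i.e. $o(h)/h \to 0$ as $h \to 0$. $\lambda(t)$, the instantaneous failure intensity, is defined as $\lambda(t) = \lim_{h \to 0} \frac{P\{N(t+h) - N(t) \ge 1\}}{h}$. Section 2-4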
The Goel-Okumoto (GO) model Assumptions • The cumulative number of faults detected at time $t$ follows a Poisson distribution, • All faults are independent and have the same chance of being detected, • All detected faults are removed immediately and no new faults are introduced. The failure process is modeled by an NHPP with mean value function $\mu(t)$ given by $\mu(t) = a(1 - e^{-bt})$, $a > 0$, $b > 0$. Section 2-5
$a$ and $b$ are parameters to be determined from collected failure data. Note that for this model $\mu(\infty) = a$ and $\mu(0) = 0$. • $a$ is the final number of faults that can be detected by the test process. • $b$ is a constant of proportionality and can be interpreted as the failure occurrence rate per fault. The intensity function $\lambda(t)$, the derivative of $\mu(t)$, is then $\lambda(t) = ab\,e^{-bt}$. The expected number of remaining faults at time $t$ is $\mu(\infty) - \mu(t) = a\,e^{-bt}$. Section 2-6
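The three GO quantities can be written directly as functions of a and b. The sketch below is illustrative only: the function names are made up for the example, and the values a = 130.6 and b = 0.0048 are borrowed from the "go 130.6 0.0048" input line shown later in these slides, on the assumption that its two numbers are a and b.

```python
import math


def go_mean(t, a, b):
    """GO mean value function: expected cumulative failures by time t."""
    return a * (1.0 - math.exp(-b * t))


def go_intensity(t, a, b):
    """GO failure intensity: derivative of the mean value function."""
    return a * b * math.exp(-b * t)


def go_remaining(t, a, b):
    """Expected number of faults still undetected at time t."""
    return a * math.exp(-b * t)


a, b = 130.6, 0.0048  # assumed reading of the sample input "go 130.6 0.0048"
for t in (0.0, 100.0, 500.0):
    print(t, go_mean(t, a, b), go_intensity(t, a, b), go_remaining(t, a, b))
```

At every t the detected and remaining faults add up to a, and the intensity decreases monotonically from its initial value ab.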
Figure: the shape of the intensity function $\lambda(t)$ and the mean value function $\mu(t)$ of the GO model, plotted against $t$. Section 2-7
S-shaped NHPP model • At the beginning of testing, some faults are “covered” by other faults, so removing a detected fault early on does not decrease the failure intensity very much. • Software reliability testing usually involves a learning process: skills and effectiveness improve gradually. • Mean value function: $\mu(t) = a[1 - (1 + bt)e^{-bt}]$, $b > 0$. Figure: $\mu(t)$ versus $t$. Section 2-8
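The learning effect shows up in the intensity, which is the derivative of the S-shaped mean value function: it starts at zero, rises to a peak at t = 1/b, and then decays. The sketch below checks this numerically. The function names are hypothetical, and the values a = 88.50 and b = 0.0098 are taken from the "ys 88.50 0.0098" line of the sample failure file in these slides, assuming that line lists a and then b.

```python
import math


def s_mean(t, a, b):
    """S-shaped mean value function: a[1 - (1 + bt)e^{-bt}]."""
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))


def s_intensity(t, a, b):
    """Derivative of s_mean with respect to t: a * b^2 * t * e^{-bt}."""
    return a * b * b * t * math.exp(-b * t)


a, b = 88.50, 0.0098  # assumed reading of the sample line "ys 88.50 0.0098"
peak_time = 1.0 / b   # the intensity is largest here
for t in (0.0, 0.5 * peak_time, peak_time, 2.0 * peak_time):
    print(t, s_mean(t, a, b), s_intensity(t, a, b))
```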
Markov Models • A stochastic process $\{X(t), t \ge 0\}$ is said to be a Markov process if its future development depends only on the present state of the process, that is, $P[X(t) \le x \mid X(t_1) = x_1, \ldots, X(t_n) = x_n] = P[X(t) \le x \mid X(t_n) = x_n]$ for all $t_1 < t_2 < \cdots < t_n < t$. • This Markov property, the most important feature of the process, has a simple explanation: given the present state of the process, its future behavior is independent of its past history. Section 2-9
The process $\{N(t), t \ge 0\}$, where $N(t)$ is the number of events in a Markov process (such as the number of detected faults in a software context), is called a Markov counting process. A well-known Markov process is the birth-death process, in which a so-called birth increases the size of the process by one and a death decreases it by one. Figure: a realization of a Markov counting process, $N(t)$ (from 0 to 7) versus $t$, with jumps at $t_1, \ldots, t_7$.
The Jelinski-Moranda (JM) model • Assumptions 1. The number of initial faults is an unknown but fixed constant; 2. A detected fault is removed immediately and no new fault is introduced; 3. Times between failures are independent, exponentially distributed random quantities; 4. All remaining faults contribute the same amount to the software failure intensity. Section 2-11
The main property of the JM model is a constant failure intensity (FI) between the detection of two consecutive failures. This is quite reasonable if the software is unchanged and the testing is random and homogeneous. Figure: the failure intensity $\lambda(i)$ versus the number of removed faults $i$. Section 2-12
General model characteristics and limitations • Random process: both the error introduction and the run selection processes are random. • Failures are independent of each other (failure times are mutually independent). • Two situations arise in the testing phase: with and without repair. • Limitations: 1) the models' assumptions; 2) future prediction must take the environment into account and use recent data. Section 2-15
Software reliability simulation • It presents a particularly attractive computational alternative for investigating software reliability; • It can model a wider range of reliability phenomena than mathematical analyses; • It can provide a “virtual” environment to predict or study software reliability for some software projects; • It averts the need for overly restrictive assumptions. Section 3-1
Failure rate functions (the model parameters are the inputs): • GO model: $\lambda(t) = n_0 e^{-\theta t}$, where $n_0$ is the initial failure rate and $\theta$ is the failure rate decay factor; • JM model: $\lambda_n(t) = \lambda_0 (1 - n/n_0)$, where $n_0$ is the estimated number of initial faults and $\lambda_0$ is the initial failure rate; • S-shaped model: $\lambda(t) = ab^2 t e^{-bt}$, where $a$ is the expected number of failures and $b$ is the failure detection rate. Our simulation includes 7 failure rate functions. Section 3-3
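Unlike the GO and S-shaped rates, the JM rate depends on the number of failures found so far rather than directly on time. The sketch below illustrates the JM form, initial rate times (1 - n/n0), and the mean waiting time it implies under the JM assumption of exponentially distributed times between failures. The function names and the values n0 = 50 and initial rate 2.0 are hypothetical and are not parameters from the project.

```python
def jm_rate(n, lam0, n0):
    """JM failure rate after n faults have been removed: lam0 * (1 - n / n0)."""
    if not 0 <= n <= n0:
        raise ValueError("n must lie between 0 and n0")
    return lam0 * (1.0 - n / n0)


def jm_mean_times_between_failures(lam0, n0):
    """Mean waiting time to the next failure after n = 0, 1, ..., n0 - 1 removals.

    With exponentially distributed times between failures, the mean waiting
    time is the reciprocal of the current (constant) rate.
    """
    return [1.0 / jm_rate(n, lam0, n0) for n in range(n0)]


lam0, n0 = 2.0, 50  # hypothetical values for illustration
print(jm_rate(0, lam0, n0))    # 2.0
print(jm_rate(25, lam0, n0))   # 1.0
print(jm_rate(50, lam0, n0))   # 0.0
print(jm_mean_times_between_failures(lam0, n0)[:3])
```

The rate drops by the same step with every removed fault and reaches zero once all n0 faults are gone, so the expected waiting times grow as testing proceeds.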
Simulation approaches • Black-box simulation: treats the software as a whole; only its interactions with the outside world are modeled, without regard to its internal structure or component composition. • White-box simulation: assumes the software comprises $m$ components; its architecture is specified by the inter-component transition probabilities, denoted $w_{ij}$ (component $i$ → component $j$). Section 3-4
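The basic algorithm of black-box simulation
1) Initialization: set max_time, dt, ...; set time = 0 and Num_failure = 0 (Num_failure is the cumulative number of failures).
2) While time < max_time:
   a) compute λ(t), the value of the rate function at the current time;
   b) produce a random number a, 0 < a < 1;
   c) if λ(t)*dt < a, no failure occurs in this step; otherwise Num_failure = Num_failure + 1;
   d) time = time + dt.
Section 3-5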
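The loop can be sketched in a few lines of Python. This is a minimal illustration rather than the simulator used in the project: the function names, the `seed` argument, and the choice of the GO intensity a*b*exp(-b*t) with a = 130.6 and b = 0.0048 as the rate function are assumptions made for the example. The time step dt should be small enough that the rate times dt stays well below 1, so that it can be read as the probability of one failure in a step.

```python
import math
import random


def go_rate(t, a=130.6, b=0.0048):
    """Assumed rate function for the example: GO intensity a * b * e^{-bt}."""
    return a * b * math.exp(-b * t)


def simulate_black_box(rate, max_time, dt, seed=None):
    """Rate-based black-box simulation.

    Returns a list of (time, cumulative_failures) pairs, one per time step.
    """
    rng = random.Random(seed)
    num_failures = 0
    history = []
    steps = int(round(max_time / dt))  # step index avoids floating-point drift
    for k in range(steps):
        time = k * dt
        a = rng.random()          # uniform random number in [0, 1)
        if a < rate(time) * dt:   # a failure occurs in this step
            num_failures += 1
        history.append((time + dt, num_failures))
    return history


history = simulate_black_box(go_rate, max_time=1000.0, dt=0.1, seed=1)
print(history[-1])  # final time and the simulated cumulative failure count
```

Averaging the final count over many seeds should approach the GO mean value function at max_time, which gives a simple check on the choice of dt.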
• The input of black-box simulation is a failure indicator for the software, such as: go 130.6 0.0048. Here “go” indicates the GO model, and the two real numbers that follow are its first and second parameters. We use CASRE to obtain the parameters of the failure rate functions. • The outputs of the simulation are the cumulative number of failures versus time and the failure intensity. Section 3-6
In white-box simulation, the event (failure) producing algorithm is similar to that of black-box simulation. In addition, at each time_step we must calculate which component will execute. This calculation is based on an input file (the control-flow matrix). The other input of white-box simulation is the failure behavior file.
3-component control-flow matrix:
3
0.00 0.80 0.20
0.70 0.00 0.30
0.60 0.40 0.00
3-component failure file:
go 130.6 0.0048
jm 63.78 0.3288
ys 88.50 0.0098
Section 3-7
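Figure: 3-component internal architecture. Transition probabilities: component1 → component2 0.8, component1 → component3 0.2, component2 → component1 0.7, component2 → component3 0.3, component3 → component1 0.6, component3 → component2 0.4. Section 3-8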
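The component-selection step can be sketched as a random walk over the control-flow matrix. The code below is a minimal illustration, not the project's simulator: the function names are hypothetical, each component's rate is passed in as a function of a single global clock (the slides do not say whether the global time or the component's own execution time is used), and the constant rates in the usage lines are made-up values.

```python
import random

# Control-flow matrix of the 3-component example: row i holds the
# probabilities of moving from component i to each component j.
TRANSITIONS = [
    [0.00, 0.80, 0.20],
    [0.70, 0.00, 0.30],
    [0.60, 0.40, 0.00],
]


def next_component(current, transitions, rng):
    """Pick the next component to execute according to row `current`."""
    weights = transitions[current]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def simulate_white_box(transitions, rates, max_time, dt, start=0, seed=None):
    """Return the number of simulated failures per component.

    rates: one rate function of time per component (assumption: global clock).
    """
    rng = random.Random(seed)
    failures = [0] * len(rates)
    current = start
    steps = int(round(max_time / dt))
    for k in range(steps):
        t = k * dt
        # Failure test for the executing component, as in the black-box case
        if rng.random() < rates[current](t) * dt:
            failures[current] += 1
        current = next_component(current, transitions, rng)
    return failures


# Hypothetical constant rates, one per component
rates = [lambda t: 0.05, lambda t: 0.02, lambda t: 0.01]
print(simulate_white_box(TRANSITIONS, rates, max_time=1000.0, dt=0.1, seed=7))
```

Replacing the constant rates with the GO, JM, and S-shaped rate functions named in the failure file would give one model per component.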
Application to a project • We applied the simulation techniques to a project and obtained some interesting results. • The project comprises three successive generations of switching system (TROPICO-R) software. The three products are PRA, PRB, and PRC. • Each product has four functions (components): 1) Telephone (TEL): local call processing, charge-metering, etc. 2) Defense (DEF): on-line testing, traffic measurement, error… 3) Interface (INT): communication with local devices… 4) Management (MAN): communication with external devices… • We simulated every product and each function using the GO, JM, and S-shaped models. Section 4-1
PRA_TEL simulation results Section 4-2
PRB_TEL simulation results Section 4-3
PRC_TEL simulation results. The total number of failures is small; JM gives the better prediction. Section 4-4
PRA and PRB whole system simulation results Section 4-5
PRC system simulation results Section 4-6
Some analyses of the simulation • When the total number of failures is large (PRA, PRB), the three models generally fit and predict better; • For the two successive product generations (PRB, PRC), the S-shaped model fits better in the early phase of testing; • When the total number of failures is small (PRC), the JM model predicts better than the S-shaped and GO models. Section 4-7
PRA system simulation deviation: (simulation value)-(real data value) Section 4-8
PRB system simulation deviation Section 4-9
PRC system simulation deviation Section 4-10
From the deviation figures we can see that • There are some large deviations at certain time points. This indicates that modeling and simulation can capture the general trend or prediction of software reliability measures, but it is difficult to obtain accurate measures at exact time points. • Around the 10th month of software testing, there is a sensitive transition in fault exposure. Section 4-11
Conclusion • Combined analytical models with simulation techniques to give an effective and practical method for software reliability measures. • Implemented a rate-based simulator for software reliability measurements (it is not computationally complex; it enables a model-combination approach; it takes into account the internal structure or dependency of software components). • The project application demonstrates that it can be used for software reliability analysis, prediction, and evaluation. Section 5-1
Future work • To introduce more software environment factors into the simulation process for software reliability measures, • To develop more effective methods or algorithms to simulate or analyze the dependency between components in complex software systems, • To extend the methods and approaches investigated in our work for network (e.g., the Internet) software reliability analyses. Section 5-2
Question and Answer The End Thank you very much!