Cherkassy State Technological University, Specialized Computer Systems Department, Cherkassy, Ukraine. Method and Means of Monitoring the Reliable Operation of Computer Memory. Scientific Adviser: D.E., Prof. Ryabtsev V.G. PhD Student: Utkina T.Yu.
Introduction • Even the most careful testing cannot eliminate the main source of loss of important information, of changes to critical system files, and of halts or outages of the whole system or a part of it: soft (programmatic) failures. Depending on the industry, the cost of such failures can grow many times over. They arise when an alpha particle hits a component, or the component is exposed to cosmic radiation, and a transistor changes its state unexpectedly. Hardware failures, such as a faulty memory bit, line, or device, are much rarer than soft failures, but their consequences can be no less catastrophic. To keep the system working reliably, both types of errors must be detected and handled adequately. • The question is especially acute for central control systems, which collect all information about the operation of the main units or individual devices of a facility and administer its main components; there a failure of the central computer hardware can lead to dangerous situations. Therefore, to avoid catastrophic conditions, the weak points and possible vulnerabilities of a specific computer information system must be assessed in advance.
Data storage devices, built from semiconductor memory microcircuits, are among the most important components of a computer. As memory microcircuits become more complex and faster, testing them becomes a difficult and expensive procedure. • Manufacturers of memory microcircuits now pay increasing attention to this problem: they invest considerable resources in test systems and continuously improve the procedure to maintain high product quality. • This confirms the relevance of developing methods and means for the preventive diagnosis of data storage devices that take the half-periods of degradation into account.
Purposes of the work • Ensure reliable operation of the memory modules through the proposed method of monitoring the reliable operation of computer memory. • Build several sets of tests for the preventive multi-version diagnosis of the computer's main memory modules. • Give recommendations on using the different groups of tests.
Tasks • To achieve these purposes it is necessary to: • Investigate how reliable operation of computer memory modules can be ensured, and introduce the necessary concepts of the half-periods and the period of degradation of the probability of non-failure operation of memory. • Devise a method and means of scheduling the preventive multi-version diagnosis of data storage devices that take the half-periods of degradation into account, and describe the modes of operation and features of the program for monitoring the reliable operation of computer memory. • Analyze the existing memory test programs and, from the best of them in terms of duration and efficiency, build several sets of tests for the preventive multi-version diagnosis of the memory modules.
The preventive diagnosis of data storage devices taking into account the half-periods of degradation • Quality and reliability are essential for reducing risk: when the manufacturer detects low-quality memory microcircuits, the reliability, survivability, fault tolerance, and failure-free operation of the produced main memory modules increase. • However, the first three months of operation of the memory modules are critical: memory errors and malfunctions may develop from faults that the manufacturer's testing did not find, and damage to the modules during transport to the user cannot be ruled out. • Over time the probability of failure of a memory module decreases, until the data storage devices age, which eventually and inevitably leads to their malfunction (fig. 1).
Figure 1. Intensity of failures of memory microcircuits at the different stages of operation
To avoid negative scenarios and ensure the reliable functioning of such systems, computer information systems must be protected from memory errors. • When estimating the reliability of data storage devices, it is necessary to define the volume of preventive test diagnosis and to consider methods that prevent the memory modules from reaching a state with low quality indexes. • Unfortunately, exhaustive testing of the memory modules is impractical, because it requires too much time and too many resources. • Thus a compromise must be found between a favorable reliability estimate and the cost of test diagnosis.
Probability of non-failure operation of the data storage devices • The probability of non-failure operation of the data storage devices during operation is calculated by formula (1), where k is the coefficient of the repetition factor of failures, proportional to x; x is the repetition factor of failures, i.e. the number of not necessarily adjacent positions of a memory byte or word in which incorrect values can appear simultaneously as a result of a failure; m = n·c is the number of memory microcircuits in the device; n is the number of memory modules; c is the number of microcircuits in a memory module; t is the duration of operation of the memory modules. The formula also involves the intensity of failures of the memory microcircuits. • If the user sets the minimum acceptable probability of the operable state of the data storage devices, Rmin, then it is necessary to define the intervals of time after which the preventive diagnosis of the memory modules must be performed.
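Formula (1) itself is not reproduced in this text. As a rough illustration only, the sketch below assumes the common exponential reliability law R(t) = exp(-k·λ·m·t), which uses the variables listed above; the name `failure_intensity` (λ), the hour-based units, and all numeric values are hypothetical and are not taken from the presentation.

```python
import math


def non_failure_probability(k, failure_intensity, n_modules, chips_per_module, t):
    """Probability of non-failure operation after time t.

    ASSUMPTION: exponential law R(t) = exp(-k * lambda * m * t).
    The presentation's formula (1) may differ from this form.
    """
    m = n_modules * chips_per_module  # m = n*c, total number of microcircuits
    return math.exp(-k * failure_intensity * m * t)


# Hypothetical values: k = 3, lambda = 1e-7 failures/hour per microcircuit,
# two modules with eight microcircuits each, 10,000 hours of operation.
r = non_failure_probability(3.0, 1e-7, 2, 8, 10_000)
print(f"R(t) = {r:.4f}")
```

Under this assumption the probability starts at 1 for t = 0 and decreases monotonically, which is what makes a user-defined threshold Rmin translate into a time limit.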
A multi-version diagnosis of the storage devices • Under multi-version diagnosis the whole service life of the data storage devices is divided into several periods, and a different group ("version") of tests is run in each of them; as the operating time grows, the number of tests used increases, so that the total probability of detecting non-overlapping failures rises. • Consider the graphs of changes in the probability of failure of the memory module under multi-version diagnosis, shown in fig. 2. • Multi-version diagnosis is diagnosis in which different groups ("versions") of tests are used at different stages of operation of the data storage devices.
Figure 2. Changes in the probability of failure of the memory module under multi-version diagnosis (figure labels: First version, Fifth version, Five versions)
The calculation of the period and half-periods of degradation • We introduce the necessary concepts. • Definition 1. The period of degradation of the probability of non-failure operation of memory is the interval of time over which the probability of non-failure operation of the module falls to the minimum value set by the user. The period of degradation is calculated by formula (2). • When the reliability of the data storage devices reaches the value Ri min, the time of degradation has arrived. • Definition 2. The half-period of degradation is an interval of time equal to half of the period of degradation. The first half-period of degradation is calculated by formula (3).
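Formulas (2) and (3) are not reproduced in this text. The sketch below shows how the two definitions could be evaluated if the probability of non-failure operation followed an exponential law R(t) = exp(-k·λ·m·t); that law, the name `failure_intensity` (λ), and the numeric values are assumptions for illustration, not the presentation's own formulas. Solving R(T) = Rmin for T gives T = -ln(Rmin)/(k·λ·m), and Definition 2 gives the first half-period as T/2.

```python
import math


def degradation_period(r_min, k, failure_intensity, m):
    """Time at which R(t) falls to r_min.

    ASSUMPTION: R(t) = exp(-k * lambda * m * t), so T = -ln(r_min) / (k * lambda * m).
    """
    if not 0.0 < r_min < 1.0:
        raise ValueError("r_min must lie strictly between 0 and 1")
    return -math.log(r_min) / (k * failure_intensity * m)


def first_half_period(period):
    """Definition 2: the half-period equals half of the period of degradation."""
    return period / 2.0


# Hypothetical values: Rmin = 0.75, k = 3, lambda = 1e-7 per hour, m = 16.
period = degradation_period(0.75, 3.0, 1e-7, 16)
print(f"period of degradation: {period:.0f} h")
print(f"first half-period:     {first_half_period(period):.0f} h")
```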
• We introduce a variable r to denote the number of the half-period of degradation of the memory module; the half-period of degradation with number r is then calculated by formula (4). • If the main memory consists of more than one module, the period of degradation is determined for each installed module. The period of degradation of the main memory is the minimum of the periods of degradation of the modules installed in the system. The half-periods of degradation are calculated for the module with the minimum period of degradation. • The intervals of time after which the preventive diagnosis of the main memory must be performed are then equal to the half-periods of degradation of the memory module with the shortest period of degradation.
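The rule for a main memory with several modules can be sketched as follows: take the per-module periods of degradation (however they were computed) and let the module with the smallest one define the schedule. The module names and period values are hypothetical.

```python
def limiting_module(module_periods):
    """Return (module name, period) for the module with the shortest period.

    module_periods maps a module name to its period of degradation;
    the period of the main memory is the minimum over all modules.
    """
    if not module_periods:
        raise ValueError("at least one memory module is required")
    name = min(module_periods, key=module_periods.get)
    return name, module_periods[name]


# Hypothetical periods of degradation in hours for two installed modules.
periods = {"module 1": 59_900.0, "module 2": 47_300.0}
name, memory_period = limiting_module(periods)
print(name, memory_period, memory_period / 2.0)  # last value: first half-period
```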
Features of using the proposed method • To avoid memory malfunctions at all stages of operation, preventive maintenance by multi-version diagnosis is proposed. • During the early period of operation of the memory modules, the first three months of work, testing is performed at every power-on using different sets of tests with high fault-detection efficiency. • The time of useful operation without preventive diagnosis then lasts until the onset of the half-period of degradation Tpd1. The next stage of testing begins as soon as the operating time reaches the half-period of degradation Tpd2, then Tpd3, and so on until the onset of the time of degradation. • As the number of half-periods of degradation grows, longer and more effective tests are recommended. • The onset of the time of degradation means that the probability of non-failure operation of the i-th memory module has fallen to the value Ri min specified by the user. The module should then be replaced; otherwise aging of the module can cause critical malfunctions of the whole system.
• Consider the histogram of the moments of preventive diagnosis for a computer memory consisting of two memory modules, with the minimum acceptable probability of the operable state of the main memory Rmin = 0.75 and the repetition factor of failures x = 3, shown in fig. 3. • The developed method for determining the time of preventive diagnosis and of replacement of the memory modules is implemented in the program for monitoring the reliable operation of computer memory, Diagnostics_Memory 2.3.
Figure 3. Operation of the modules of the computer's main memory taking into account the half-periods of degradation
Features of the program for monitoring the reliable operation of computer memory • The main task of the program Diagnostics_Memory 2.3 is to schedule the preventive multi-version diagnosis, to maintain the given level of probability of non-failure operation of the main memory, and to increase the reliability of the computer as a whole. • The program calculates the reliability of the memory during operation and warns of failures and defects of the data storage devices, which can arise from faults not found by the manufacturer's testing, from damage to the memory modules during transport to the user, or from aging of the modules. It does so by scheduling preventive maintenance by multi-version diagnosis of the computer's main memory modules, taking into account the half-periods and the period of degradation of the data storage devices.
• The main window of the program in the “Input data” mode is shown in fig. 4.
Figure 4. “Input data” mode of the program Diagnostics_Memory 2.3
The monitoring program is developed as an open system; its functions are implemented as separate modes, which can be extended. • Four modes are available to the user: input data, time control, test selection, and check SPD. Help information is also provided: the purpose of the program, its input and output data, instructions for starting each mode of operation and their capabilities, and additional information about the developers, the name of the project, its version, and the project supervisor.
Using the “Input data” mode • The “Input data” mode collects from the user the necessary information about the installed modules of the computer's memory, namely: information about the main memory (number of modules, repetition factor of failures, allowable probability of non-failure operation, average operation period in days) and information about every module (number of microcircuits in the module, intensity of failures of the memory microcircuits, date of putting into operation). • The system provides interactive prompts. For example, the repetition factor of failures can be chosen from a list of values from 1 to 9. The value of this parameter, together with the error detection/correction technology in use, determines whether simultaneously occurring errors of the main memory can be detected and corrected in time. The user therefore has to know against which repetition factor of failures a given error detection/correction technology protects the main memory (provided the computer hardware supports that technology). To find out, it is enough to move the mouse pointer to the list of values of this parameter and hold it there for a few seconds; the interactive prompt with the necessary information then appears (fig. 4).
The user sets the allowable probability of non-failure operation of memory according to the requirements placed on the specific computer information system, in order to avoid operational failures, data corruption, or irreversible data loss caused by memory errors, which ultimately result in losses from system outages. • After all source data have been entered, the program calculates the time of the preventive multi-version diagnosis of the memory, the current probability of non-failure operation, and the half-periods and the period of degradation of the computer's main memory. If the current probability is less than the allowable probability of non-failure operation of the main memory, the user is notified and advised to replace the module whose probability of non-failure operation is the lowest. • The algorithm of the program Diagnostics_Memory 2.3 in the “Input data” mode, which illustrates the method of scheduling the preventive multi-version diagnosis, is shown in fig. 5.
Figure 5. The algorithm of the program Diagnostics_Memory 2.3 in the “Input data” mode
Using the “Check SPD” mode • SPD (Serial Presence Detect) is a serial-access presence detection device implemented on a special microcircuit (as a rule, electrically reprogrammable memory) that contains information about the type of the device and its basic characteristics. The capacity of this memory is 512 bytes. • No modern memory module is made without an SPD microcircuit. It contains the main timing characteristics and data about the chips used in the module and their manufacturer, as well as the exact settings of these chips that the BIOS needs for correct system configuration. • To work correctly with a specific type of memory, at system start the bytes of the SPD microcircuit are read sequentially in order to identify the memory module and determine the operating settings of the main memory. • In the “Check SPD” mode the program verifies the SPD of the computer's memory modules and, on that basis, analyzes the current timings of the memory modules.
• The results of the program in the “Check SPD” mode are shown in fig. 6.
Figure 6. Results of the program Diagnostics_Memory 2.3 in the “Check SPD” mode
If the current timings exceed the values recommended by the manufacturer and stored in the SPD of each module, the user is informed that the computer's memory is running at an increased frequency and that loss of information is possible. In this case, before the onset of the half-periods and the period of degradation can be tracked, the timings of the main memory modules must first be set in accordance with SPD. • If the current timings match the maximum timings allowed by SPD, the user is informed that the computer's memory works in accordance with SPD. Otherwise the user is notified that the operation of the computer's memory can be accelerated, because the current timings are not the maximum allowed by SPD.
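The three outcomes of the SPD check can be sketched as a simple comparison. For illustration the sketch reduces "timings" to a single memory clock frequency in MHz, where a higher value means faster operation; this simplification, the function name, and the message texts are assumptions and do not describe how Diagnostics_Memory 2.3 reads or compares SPD data.

```python
from enum import Enum


class SpdStatus(Enum):
    ABOVE_SPD = "memory runs above the SPD values; loss of information is possible"
    MATCHES_SPD = "memory works in accordance with SPD"
    BELOW_SPD = "memory can be accelerated up to the SPD maximum"


def check_against_spd(current_mhz, spd_max_mhz):
    """Classify the current setting relative to the maximum stored in SPD.

    ASSUMPTION: a single frequency value stands in for the full set of timings.
    """
    if current_mhz > spd_max_mhz:
        return SpdStatus.ABOVE_SPD
    if current_mhz == spd_max_mhz:
        return SpdStatus.MATCHES_SPD
    return SpdStatus.BELOW_SPD


# Hypothetical readings for a module whose SPD allows at most 200 MHz.
for current in (216, 200, 166):
    print(current, "->", check_against_spd(current, 200).value)
```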
Using the “Time control” mode • The “Time control” mode is intended for tracking the operating time and the moments at which preventive work on the memory is due. • In this mode the program records how long the computer has been running and compares the total operating time of the computer with the time of completion of the early-period preventive testing of the computer's memory, and with the half-periods and the period of degradation. • The mode can be started only after the user has entered information about the installed memory modules in the “Input data” mode, which is active by default at the first start of the program. In the “Time control” mode, monitoring of the main memory devices begins when the Windows operating system starts. • With its help the user can schedule in advance the times calculated in the “Input data” mode: completion of the early-period preventive testing of the computer's memory, the half-periods of degradation, and the period of degradation. When each of these times is reached, the user receives a message that testing is needed in order to avoid faults and failures in the operation of the main memory.
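The comparison performed in the “Time control” mode can be sketched as a check of the accumulated operating time against a list of scheduled moments. The milestone names follow the text; the hour values and the function name are hypothetical, and the sketch does not reproduce how the program stores or measures time.

```python
def due_milestones(total_hours, schedule):
    """Return the names of all milestones already reached, earliest first.

    schedule maps a milestone name to the operating time (in hours)
    at which the corresponding preventive action becomes due.
    """
    reached = [(hours, name) for name, hours in schedule.items() if total_hours >= hours]
    return [name for hours, name in sorted(reached)]


# Hypothetical schedule; real values would come from the "Input data" mode.
schedule = {
    "end of early-period testing": 2_160,
    "1st half-period of degradation": 30_000,
    "2nd half-period of degradation": 45_000,
    "3rd half-period of degradation": 52_500,
    "period of degradation": 60_000,
}
print(due_milestones(31_000, schedule))
```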
• The program window in the “Time control” mode is shown in fig. 7. • The input data in the “Time control” mode are: • information about the current duration of work of the program: date of the current start (1), time of the beginning of work in the current session (2), operation period of the main memory devices since the current start (3); • information about the total duration of work of the program: date of the first start (4), time of the beginning of work at the first start (5), operation period of the main memory devices since the first start (6); • information about the times for testing: time of/until completion of the early-period preventive testing (7), time of/until the onset of the 1st half-period of degradation of the main memory (8), time of/until the onset of the 2nd half-period of degradation of the main memory (9), time of/until the onset of the 3rd half-period of degradation of the main memory (10), time of/until the onset of the period of degradation of the main memory (11). • To avoid memory failures at all stages of operation, preventive maintenance by multi-version diagnosis is proposed.
Figure 7. “Time control” mode of the program Diagnostics_Memory 2.3
Using the “Test selection” mode • During the early period of operation of the memory modules, the first three months of work, testing is performed at every power-on using different sets of tests with high failure-detection efficiency. • The time of useful operation lasts until the onset of a half-period of degradation. The next stage of testing begins as soon as the operating time reaches the 1st half-period of degradation, then the 2nd, then the 3rd, and so on until the onset of the time of degradation. As the number of half-periods of degradation grows, longer and more effective tests are recommended. • Consequently, the duration of testing during preventive diagnosis at all stages of operation has the form marked in fig. 8 by the shaded rectangles.
Figure 8. Efficiency of different versions of the sets of tests
• The “Test selection” mode makes it possible to perform preventive maintenance on time and to keep the reliability indexes of the memory modules high throughout the service life of the computer's memory by using several versions of diagnostic tests at different stages of the life cycle of the main memory. • The program window in the “Test selection” mode is shown in fig. 9.
Figure 9. “Test selection” mode of the program Diagnostics_Memory 2.3
The onset of the time of degradation means that the probability of non-failure operation of memory has fallen to the value specified by the user. It is then recommended to replace the module with the lowest probability of non-failure operation; otherwise aging of the module can cause critical malfunctions of the whole system. Further operation is not permitted until the memory module has been replaced. Such replacement of modules can be carried out repeatedly, each time restoring the required probability of non-failure operation of the main memory. • For the growing number of half-periods of degradation the program suggests 3 collections of test sets, which allow individual tests or groups of tests to be started at different stages of operation of the memory modules and which are formed by combining the criteria of test duration and fault-detection efficiency: • LowEfficacyTests.pro, the file of test programs with low fault-detection efficiency and the shortest duration of testing: • RightMark Memory Analyzer version 3.58; • RightMark RAMTester Utility version 1.0; • NormalEfficacyTests.pro, the file of test programs with medium fault-detection efficiency and medium duration of testing: • AleGr MEMTEST 2.0; • HighEfficacyTests.pro, the file of test programs with high fault-detection efficiency and a fairly long duration of testing: • Windows Memory Diagnostic. • The list of test programs, which varies with the selected file of test programs, can also be extended by the user at his discretion.
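One way to picture the choice among the three files is a lookup by the number of the half-period of degradation. The file names come from the text; the specific mapping (1st half-period to the low-efficiency file, 2nd to the medium one, 3rd and later to the high one) is an assumption made for illustration, as is the function name.

```python
# ASSUMED mapping from half-period number to test file; the presentation only
# states that longer and more effective tests are used as the number grows.
TEST_FILES = {
    1: "LowEfficacyTests.pro",
    2: "NormalEfficacyTests.pro",
    3: "HighEfficacyTests.pro",
}


def select_test_file(half_period_number):
    """Pick the test file for a given half-period of degradation (1, 2, 3, ...)."""
    if half_period_number < 1:
        raise ValueError("half-period numbers start at 1")
    # Half-periods beyond the 3rd reuse the most effective set.
    return TEST_FILES[min(half_period_number, 3)]


for number in (1, 2, 3, 4):
    print(number, "->", select_test_file(number))
```

The early period of operation is not covered by this lookup, since the text prescribes high-efficiency test sets at every power-on during the first three months.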
Recommendations regarding the selection of test programs • Hardware diagnosis of the main memory devices relies on specialized software that tests the memory devices. • The software products that test memory modules differ not only in the limit on the capacity of memory they can test but also in the variety of the sets of tests they use. The speed and quality of the diagnosis of the main memory depend on this. • The best known and most accessible of these software products are: • OCZ Memtest86 www.memtest.org; • GoldMemory 6.68 www.goldmemory.cz; • AleGr MEMTEST 2.0 http://www.home.earthlink.net/~alegr/download/memtest.htm; • RightMark Memory Analyzer version 3.58 http://cpu.rightmark.org/products/rmma.shtml, http://www.ixbt.com/cpu/rmma-general-3.shtml (Russian), http://www.digit-life.com/articles2/rmma-general/rmma-general-3.html (English); • RightMark RAMTester Utility version 1.0 http://cpu.rightmark.org/products/ramtester.shtml; • Microsoft Windows Memory Diagnostic (Beta) oca.microsoft.com; • DocMemory v.3.1beta www.docmemory.com.
• To decide which of the software products can be used for testing the main memory, the selection criteria for the test programs must be evaluated; the most important of them are the efficiency and the duration of testing. • The efficiency criterion is the probability of detecting a memory error. • The duration parameter is the time that must be spent on testing main memory of the required capacity at the set frequency. • To determine the efficiency and duration criteria of the test programs, the AM1 512 MB DDR SDRAM (PC3200) memory module was tested; the module and its chip are shown in fig. 10.
Figure 10. Test memory module AM1 512 MB DDR (markings visible in the figure: AM1 2zz 22560818C-5A 0547D)
The following test equipment was used as the platform: • Processor – AMD Sempron 2600+ (1600 MHz, nVIDIA nForce3 250, AMD Hammer); • Cooler – Titan TFD-8025 12Z DC 12V 0.11A; • Graphics Card – GeForce FX 5200 (64 MB); • Hard Disk – Maxtor 91024D4 (10 GB, 7200 RPM, Ultra-ATA/33); • Motherboard – Asus K8N, BIOS American Megatrends Inc. v.1008; • Power Supply – CODEGEN 300XA ATX 2.03 (P4) (300W); • Operating System – Windows XP SP2. • The probabilities of detection of memory errors, obtained experimentally for the diagnostic programs described below and for the tested memory module with a capacity of 512 MB and an operating frequency of 400 MHz, are given in tab. 1.
GoldMemory 6.68 is a program that does not depend on the operating system and is intended for thorough testing of the main memory. It can test up to 4 GB of memory. To run, it creates a bootable diskette or CD-ROM. • AleGr Memtest 2.0 is a program for testing the DRAM of computers built on the Intel 386 and higher. AleGr Memtest is designed to run on processors with a cache and takes into account the operation of the system bus, cache, and main memory of the Pentium and Pentium Pro (Pentium II, III, 4). The maximum size of memory that AleGr Memtest can check is 3 GB under DOS and a little less than 2 GB under Windows. If 4 GB of memory or more is installed in the system, several tester sessions can be started under Windows to cover all the memory. • RightMark Memory Analyzer 3.58 is a program that gives exhaustive information about the central processing unit, the chipset, and the main memory. It reports the following parameters of the system: the average and peak real bandwidth of the main memory; the capacity and hierarchy of the L1/L2/L3 caches; the average and peak latency of the cache and the main memory; other parameters that characterize the performance of the L1/L2/L3 caches; the parameters of the I-ROB (Instructions ReOrder Buffer); various parameters of central processing units; the parameters of the D-TLB and I-TLB. It includes the additional module RightMark Memory Stability Test, which can test a set volume of the main memory with several sets of tests selected by the user. The utility runs under the operating systems Windows 95/98/98SE/ME/2000/XP/2003 Server.
RightMark RAMTester Utility version 1.0 is a utility for finding errors in the operation of memory modules under operating systems of the Windows family. RAMTester works by writing certain data to memory, then reading them back and comparing them with the original values. One feature of the utility is that it can check only the free part of the main memory. The size of the main memory to be tested can be set by the user. The utility runs under Windows 9x/ME/NT/2000/XP and Windows XP/2003 x64. • Microsoft Windows Memory Diagnostic (Beta) is Microsoft's own utility for testing the computer's memory for errors. It supports most configurations. At start, Windows Memory Diagnostic offers to create a bootable diskette or CD-ROM image for subsequent standalone work. • DocMemory v.3.1beta is a program for testing the computer's memory. The installer creates a bootable diskette; when the computer boots from it, testing begins automatically. The program works on Intel or compatible processors from the 486 to Pentium PCs and AMD Athlon PCs. Different types of memory are supported: Fast Page Mode, EDO, SDRAM, DDR, DDR2, FBDIMM, and RAMBUS.
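The write, read back, and compare principle mentioned for RAMTester can be illustrated on an ordinary buffer. The sketch below works on a Python bytearray rather than on physical memory and is not the algorithm of any of the listed utilities; the function names, the 0x55 pattern, and the injected bit flip are hypothetical.

```python
def write_pattern(buffer, pattern):
    """Write pass: fill every byte of the buffer with the test pattern."""
    for i in range(len(buffer)):
        buffer[i] = pattern


def find_mismatches(buffer, pattern):
    """Verify pass: return the offsets whose content differs from the pattern."""
    return [i for i, value in enumerate(buffer) if value != pattern]


region = bytearray(16)           # stand-in for a tested memory region
write_pattern(region, 0x55)      # pattern 01010101
region[3] ^= 0x01                # simulate a single-bit fault for the demo
print(find_mismatches(region, 0x55))
```

Real memory testers repeat such passes with several patterns and address orders, which is one reason their efficiency and duration differ.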
Conclusions • The proposed method and means of ensuring the reliable operation of computer memory make it possible to attain the given level of probability of non-failure operation of the memory by running several versions of diagnostic tests at different stages of operation of the device, taking into account the half-periods of degradation, and by proactively replacing the memory modules.
Thanks for your attention! You can write to the authors by e-mail: • Vladimir G. Ryabtsev: volodja18@ukr.net • Tat'yana Yu. Utkina: utia_chdtu@yahoo.com