Copyright IBM Research - Technical Report RC 21457 Log 96856 4/26/99

Software Testing Best Practices

Ram Chillarege
Center for Software Engineering
IBM Research

Abstract:

This report lists 28 best practices that contribute to improved software testing. They are not
necessarily related to software test tools. Some may have associated tools, but they are
fundamentally practices. The collection represents practices that several experienced software
organizations have gained from and recognize as key.

1. Introduction

Every time we conclude a study or task force on the subject of the software development
process, one recommendation comes out loud and clear: "We need to adopt the best
practices in the industry." While it appears an obvious conclusion, the glaring lack of
their presence continues to astound the study team. So clear is their presence that it
distinguishes the winners from the also-rans like no other factor.

The search for best practices is constant. Some are known and well recognized, others
debated, and several hidden. Sometimes a practice that is obvious to the observer may be
transparent to the practitioner, who chants "that's just the way we do things." At other times
what's known in one community is never heard of in another.

The list in this article is focused on Software Testing. While every attempt is made to
focus it on testing, we know that testing does not stand alone. It is intimately dependent on the
development activity and therefore draws heavily on the development practices. But finally,
testing is a separate process activity - the final arbiter of validity before the user assesses its
merit.

The collection of practices has come from many sources - at this point indelibly
blended with its long history. Some of them were identified merely through a recognition of what
is in the literature; others through focus groups where practitioners identified what they valued.
The list has been sifted and shared with an increasing number of practitioners to gain their insight.
And finally the practices were culled down to a reasonable number.

A long list is hard to conceptualize, much less translate to implementation. To be actionable, we
need to think in terms of steps - a few at a time, and avenues to tailor the choice to our own
independent needs. I like to think of them as Basic, Foundational, and Incremental.

The Basics are exactly that. They are the training wheels you need to get started, and
when you take them off, it is evident that you know how to ride. But remember, that you take
them off does not mean you forget how to ride. This is an important difference which all too often
is forgotten in software. "Yeah, we used to write functional specifications but we don't do that
anymore" means you forgot how to ride, not that you didn't need that step anymore. The
Basic practices have been around for a long time. Their value contribution is widely recognized
and documented in our software engineering literature. Their applicability is broad, regardless of
product or process.

The Foundational practices are the rock in the soil that protects your efforts against the
harshness of nature, be it a redesign of your architecture or enhancements to sustain unforeseen
growth. They need to be put down thoughtfully and will make the difference in the long haul,
whether you build a ranch or a skyscraper. Their value add is significant and established by a few
leaders in the industry. Unlike the Basics, they are probably not as well known and therefore need
implementation help. While there may be no textbooks on them yet, there is plenty of
documentation to dig up.

The Incremental practices provide specific advantages in special conditions. While they
may not provide broad gains across the board of testing, they are more specialized. These are the
right-angle drills - when you need one, there's nothing else that can get between narrow studs and
drill a hole perfectly square. At the same time, if there was just one drill you were going to buy, it
may not be your first choice. Not all these practices are widely known or well documented. But they
all possess strengths that are powerful when judiciously applied.

The next sections describe each of the practices, grouped under Basics,
Foundational, and Incremental.

2. The Basic Practices

• Functional Specifications
• Reviews and Inspection
• Formal entry and exit criteria
• Functional test - variations
• Multi-platform testing
• Internal Betas
• Automated test execution
• Beta programs
• 'Nightly' Builds

Functional Specifications

Functional specifications are a key part of many development processes and came
into vogue with the development of the waterfall process. While it is a development
process aspect, it is critically necessary for software functional test. A functional
specification often describes the external view of an object or a procedure, indicating the
options by which a service could be invoked. The testers use this to write test cases
from a black-box testing perspective.

The advantage of having a functional specification is that the test generation
activity can happen in parallel with the development of the code. This is ideal from
several dimensions. Firstly, it gains parallelism in execution, removing a serious
serialization bottleneck in the development process. By the time the software code is
ready, the test cases are also ready to be run against the code. Secondly, it forces a degree
of clarity from the perspective of a designer and an architect that is so essential for the
overall efficiencies of development. Thirdly, the functional specifications become
documentation that can be shared with the customers to gain an additional perspective
on what is being developed.

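As a minimal illustration, the sketch below derives black-box test cases directly from a
hypothetical one-line specification clause. The withdraw service, its signature, and its error
behavior are all invented for the example; the stand-in implementation merely lets the tests run.

```python
# Sketch: deriving black-box test cases from a functional specification.
# Hypothetical spec clause: "withdraw(balance, amount) returns the new
# balance; if amount exceeds balance or is negative, it raises ValueError."
# The tests are written purely from this external view, in parallel with
# the implementation.
import unittest

def withdraw(balance, amount):
    # Stand-in implementation; the real code would be developed in parallel.
    if amount < 0 or amount > balance:
        raise ValueError("invalid amount")
    return balance - amount

class WithdrawSpecTests(unittest.TestCase):
    def test_normal_withdrawal(self):
        self.assertEqual(withdraw(100, 40), 60)

    def test_overdraft_rejected(self):
        with self.assertRaises(ValueError):
            withdraw(100, 150)

    def test_negative_amount_rejected(self):
        with self.assertRaises(ValueError):
            withdraw(100, -1)

if __name__ == "__main__":
    unittest.main()
```
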
Reviews and Inspection

Software inspection, which was invented by Mike Fagan in the mid 70's at IBM,
has grown to be recognized as one of the most efficient methods of debugging code.
Today, 20 years later, there are several books written on software inspection, tools have
been made available, and consulting organizations teach the practice of software
inspection. It is argued that software inspection can easily provide a ten-times gain in the
process of debugging software. Not much needs to be said about this, since it is a fairly
well-known and understood practice.

Formal Entry and Exit Criteria

The notion of formal entry and exit criteria goes back to the evolution of the
waterfall development processes and a model called ETVX, again an IBM invention. The
idea is that every process step, be it inspection, functional test, or software design, has a
precise entry and a precise exit criterion. These are defined by the development process and
are watched by management to gate the movement from one stage to another. It is
arguable how precise any one of the criteria can be, and with the decreased emphasis on
development process, entry and exit criteria went out of currency. However, this practice
allows much more careful management of the software development process.

Functional Test - Variations

Most functional tests are written as black-box tests working off a functional
specification. The test cases that are generated are usually variations on the
input space coupled with visiting the output conditions. A variation refers to a specific
combination of input conditions that yields a specific output condition. Writing
functional tests involves writing different variations to cover as much of the state space as
one deems necessary for a program. The best practice involves understanding how to
write variations and gain coverage adequate to thoroughly test the function. Given that
there is no measure of coverage for functional tests, the practice of writing variations does
involve an element of art. The practice has been in use in many locations within IBM, and
we need to consolidate our knowledge to teach new function testers the art and practice.

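The sketch below illustrates the idea of variations on an invented example: each variation
pairs one specific combination of input conditions with the output condition it should yield.
The triangle-classification function and the chosen conditions are hypothetical.

```python
# Sketch: functional-test variations over a hypothetical function.
def classify_triangle(a, b, c):
    if a <= 0 or b <= 0 or c <= 0 or a + b <= c or a + c <= b or b + c <= a:
        return "invalid"
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

# Each variation is one specific combination of input conditions
# together with the expected output condition.
variations = [
    ((3, 3, 3), "equilateral"),
    ((3, 3, 5), "isosceles"),
    ((3, 4, 5), "scalene"),
    ((1, 2, 3), "invalid"),    # degenerate case: a + b == c
    ((0, 4, 5), "invalid"),    # non-positive side
]

for inputs, expected in variations:
    actual = classify_triangle(*inputs)
    assert actual == expected, f"{inputs}: expected {expected}, got {actual}"
print(f"{len(variations)} variations passed")
```
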
Multi-platform Testing

Many products today are designed to run on different platforms, which creates an
additional burden to both design and test the product. When code is ported from one
platform to another, modifications are sometimes made for performance purposes. The net
result is that testing on multiple platforms has become a necessity for most products.
Therefore, techniques to do this better, both in development and testing, are essential. This
best practice should address all aspects of multi-platform development and testing.

Internal Betas

The idea of a Beta is to release a product to a limited number of customers and get
feedback to fix problems before a larger shipment. For larger companies, such as IBM,
Microsoft and Oracle, many of their products are used internally, thus forming a good beta
audience. Techniques to best conduct such an internal Beta test are essential for us to
obtain good coverage and efficiently use internal resources. This best practice has
everything to do with Beta programs, though on a smaller scale, so as to best leverage
them and reduce the cost and expense of an external Beta.

Automated Test Execution

The goal of automated test execution is to minimize the amount of manual
work involved in test execution and gain higher coverage with a larger number of test
cases. Automated test execution has a significant impact on both the tool sets for test
execution and the way tests are designed. Integral to automated test environments is
the test oracle that verifies correct operation and logs failures with diagnosis information.
This is a best practice fairly well understood in some segments of software testing and not
in others. The best practice, therefore, needs to leverage what is known and then develop
methods for areas where automation is not yet fully exploited.

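A minimal sketch of such an execution loop follows, assuming the oracle is a callable that
checks each result. The logging format, the case identifiers, and the deliberately buggy toy
system under test are all invented for illustration.

```python
# Sketch: an automated test-execution loop with a test oracle that
# verifies each result and logs failures with diagnostic information.
import logging
import traceback

logging.basicConfig(filename="test_run.log", level=logging.INFO)

def run_suite(test_cases, system_under_test, oracle):
    passed = failed = 0
    for case_id, inputs in test_cases:
        try:
            result = system_under_test(*inputs)
            if oracle(inputs, result):
                passed += 1
            else:
                failed += 1
                logging.error("FAIL %s inputs=%r result=%r", case_id, inputs, result)
        except Exception:
            failed += 1
            logging.error("CRASH %s inputs=%r\n%s", case_id, inputs,
                          traceback.format_exc())
    return passed, failed

# Usage with a toy system (buggy for negative inputs) and a simple oracle:
cases = [("t1", (2, 3)), ("t2", (-1, 1)), ("t3", (0, 0))]
buggy_add = lambda a, b: a + b if a >= 0 else a - b
oracle = lambda inputs, result: result == sum(inputs)
print(run_suite(cases, buggy_add, oracle))   # t2 is caught and logged
```
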
Beta Programs

(see internal betas)

'Nightly' Builds

The concept of a nightly build has been in vogue for a long time. While every build
is not necessarily done every day, the concept captures frequent builds from changes that
are being promoted into the change control system. The advantage is, firstly, that if a
major regression occurs because of errors recently introduced, it is caught quickly.
Secondly, regression tests can be run in the background. Thirdly, the newer releases of
software are available to developers and testers sooner.

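A sketch of a driver for such a build is shown below, assuming a make-based build and a
regression target runnable from the command line; the repository layout and every command
are placeholders for whatever the change control system actually provides.

```python
# Sketch: a 'nightly' build driver. Pull promoted changes, rebuild,
# run regression; a failing step flags a regression introduced today.
import datetime
import subprocess
import sys

def nightly_build(source_dir):
    stamp = datetime.datetime.now().strftime("%Y%m%d")
    steps = [
        ["git", "-C", source_dir, "pull"],          # pick up promoted changes
        ["make", "-C", source_dir, "clean", "all"], # full rebuild
        ["make", "-C", source_dir, "regression"],   # regression in background
    ]
    for cmd in steps:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"[{stamp}] FAILED: {' '.join(cmd)}\n{result.stderr}")
            return False
    print(f"[{stamp}] build and regression passed")
    return True

if __name__ == "__main__":
    sys.exit(0 if nightly_build(sys.argv[1]) else 1)
```
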
3. Foundational

• User Scenarios
• Usability Testing
• In-process ODC feedback loops
• Multi-release ODC/Butterfly profiles
• Requirements for test planning
• Automated test generation

User Scenarios

As we integrate multiple software products and create end-user applications that
invoke one or a multiplicity of products, the task of testing the end-user features gets
complicated. One of the viable methods of testing is to develop user scenarios that
exercise the functionality of the applications. We broadly call these User Scenarios. The
advantage of the user scenario is that it tests the product in the ways that most likely
reflect customer usage, imitating what Software Reliability Engineering has long
advocated under the concept of the Operational Profile. A further advantage of using user
scenarios is that one reduces the complexity of writing test cases by moving to testing
scenarios rather than features of an application. However, the methodology of developing
user scenarios and using enough of them to get adequate coverage at a functional level
continues to be a difficult task. This best practice should capture methods of recording
user scenarios and developing test cases based on them. In addition, it could discuss
potential diagnosis methods for when specific failure scenarios occur.

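One simple way to tie scenario selection to an operational profile is to draw scenarios in
proportion to their frequency in field usage, in the spirit of Software Reliability Engineering.
The sketch below does exactly that; the scenario names and weights are invented.

```python
# Sketch: selecting user scenarios in proportion to an operational profile.
import random

operational_profile = {
    "open_save_document": 0.50,   # relative frequency in field usage
    "print_document":     0.25,
    "mail_merge":         0.15,
    "import_legacy_file": 0.10,
}

def draw_scenarios(profile, n, seed=1):
    rng = random.Random(seed)
    names = list(profile)
    weights = [profile[k] for k in names]
    return rng.choices(names, weights=weights, k=n)

# A test session executes the drawn scenarios end to end, so test
# effort mirrors how customers actually use the product.
print(draw_scenarios(operational_profile, 10))
```
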
Usability Testing

For a large number of products, it is believed that usability becomes the final
arbiter of quality. This is true for a large number of desktop applications that gained
market share by providing a good user experience. Usability testing needs to not only
assess how usable a product is but also provide feedback on methods to improve the user
experience and thereby gain a positive quality image. The best practice for usability
testing should also draw on advances in the area of Human Computer Interface.

In-Process ODC Feedback Loops

Orthogonal Defect Classification (ODC) is a measurement method that uses the defect
stream to provide precise measurability into the product and the process. Given the
measurement, a variety of analysis techniques have been developed to assist management
and decision making on a range of software engineering activities. One of the uses of ODC
has been the ability to close feedback loops in a software development process, which has
traditionally been a difficult task. While ODC can be used for a variety of other software
management methods, the closing of feedback loops has been found over the past few years
to be a much needed process improvement and cost control mechanism.

Multi-Release ODC/Butterfly

A key feature of the ODC measurement is the ability to look at multiple releases of
a product and develop a profile of customer usage and its impact on warranty costs and
overall development efficiencies. The technology of multi-release ODC/Butterfly analysis
allows a product manager to make strategic development decisions so as to optimize
development costs, time to market, and quality issues by recognizing customer trends,
usage patterns, and product performance.

`"Requirements" for Tes t Planning
`
One of the roles of software testing is to ensure that the product meets the
requirements of the clientele. Capturing the requirements therefore becomes essential,
not only to help development but to create test plans that can be used to gauge whether the
developed product is likely to meet customer needs. Oftentimes in smaller development
organizations, the task of requirements management falls prey to conjectures of what
ought to be developed as opposed to what is needed in the market. Therefore,
requirements management and its translation into test plans is an important step.
This practice needs to be understood and executed with a holistic view to be successful.

Automated Test Generation

Almost 30% of the testing task can be the writing of test cases. To a first order of
approximation, this is a completely manual exercise and a prime candidate for savings
through automation. However, the technology for automation has not been advancing as
rapidly as one would have hoped. While there are automated test generation tools, they
often produce too large a test set, defeating the gains from automation. On the other hand,
there do exist a few techniques and tools that have been recognized as good methods for
automatically generating test cases. The practice needs to identify which of these
methods are successful and in what environments they are viable. There is a reasonable
amount of learning in the use of these tools or methodologies, but they do pay off past the
initial ramp-up.

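The over-generation problem is easy to see in a sketch: exhaustive combination of even a few
parameters produces a suite far larger than its test budget. The parameters below are invented,
and the shuffle-and-cap step merely stands in for the smarter selection (for example, pairwise
coverage of parameter values) that a real generation tool would apply.

```python
# Sketch: naive combinatorial test generation over parameter choices.
from itertools import product
import random

parameters = {
    "platform": ["aix", "nt", "os2"],
    "locale":   ["en", "de", "ja"],
    "mode":     ["batch", "interactive"],
    "auth":     ["on", "off"],
}

names = list(parameters)
full = [dict(zip(names, combo)) for combo in product(*parameters.values())]
print(f"full cross-product: {len(full)} cases")   # 3*3*2*2 = 36

# Cap the generated set; real tools would instead select combinations
# that still cover, e.g., every pair of parameter values.
random.Random(0).shuffle(full)
budgeted = full[:12]
print(f"budgeted suite: {len(budgeted)} cases")
```
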
4. Incremental

• Teaming testers with developers
• Code coverage (SWS)
• Automated environment generator (Drake)
• Testing to help ship on demand
• State task diagram (Tucson)
• Memory resource failure simulation
• Statistical testing (Tucson)
• Semi-formal methods (e.g. SDL)
• Check-in tests for code
• Minimizing regression test cases
• Instrumented versions for MTTF
• Benchmark trends
• Bug bounties

Teaming Testers with Developers

It has been recognized for a long time that the close coupling of testers with
developers improves both the test cases and the code that is developed. An extreme case
of this practice is Microsoft, where every developer is shadowed by a tester. Needless to
say, one does not have to resort to such an extreme to gain the benefits of this teaming.
This practice should, therefore, identify the kinds of teaming that are beneficial, and the
environments in which they may be employed. The value of a best practice such as
teaming should therefore be more than just concept. Instead it should include guidance on
forming the right team while reporting the pitfalls and successes experienced.

Code Coverage

The concept of code coverage is based on a structural notion of the code. Code
coverage implies a numerical metric that measures the elements of the code that have been
exercised as a consequence of testing. There are a host of metrics - statements, branches,
and data - that are implied by the term code coverage. Today, there exist several tools that
assist in this measurement and additionally provide guidance on covering elements not yet
exercised. This is also an area that has had considerable academic play and has been an
issue of debate for a couple of decades. The practice of code coverage should therefore
carry information about the tools and the methods of how to employ code coverage and
track results from the positive benefits experienced.

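As a small illustration of the metric, the sketch below measures statement coverage with
Python's standard trace module. The function under test is a toy; a real harness would run
the whole suite under the tracer and report the lines never exercised.

```python
# Sketch: statement coverage via the standard-library 'trace' module.
import trace

def grade(score):
    if score >= 90:
        return "A"
    if score >= 60:
        return "B"
    return "F"       # not exercised by the suite below

tracer = trace.Trace(count=1, trace=0)
for s in (95, 75):                 # the suite exercises only two branches
    tracer.runfunc(grade, s)

executed = {line for (fname, line), n in tracer.results().counts.items()
            if fname == __file__ and n > 0}
print("executed lines:", sorted(executed))
# Lines of 'grade' missing from this set point at untested branches.
```
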
Automated Environment Generator

A fairly time-consuming task is the setting up of test environments to execute test
cases. These tasks can take greater amounts of time as we have more operating systems,
more versions, and code that runs on multiple platforms. The sheer task of bringing up an
environment and taking it down for a different set of test cases can dominate the calendar
in system test. Tools that can automatically set up environments, run the test cases, record
the results, and then automatically reconfigure to a new environment have high value. The
IBM Hursley Lab has developed a tool called DRAKE that does precisely this. This best
practice should therefore capture the issues, tools, and techniques that are associated with
environment setup, breakdown, and automatic running of test cases.

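A sketch of the set-up / run / tear-down loop that such a tool automates is given below. The
environment matrix, the provisioning scripts, and the test commands are hypothetical
placeholders, not a description of DRAKE's actual interface.

```python
# Sketch: iterate over test environments, provisioning and reclaiming
# each one around a batch of test commands.
import subprocess

environments = [
    {"os": "aix4.3", "db": "db2v5"},
    {"os": "nt4",    "db": "db2v5"},
    {"os": "os390",  "db": "db2v6"},
]

def run_in_environment(env, test_cmds):
    setup = ["./provision.sh", env["os"], env["db"]]   # hypothetical scripts
    teardown = ["./teardown.sh", env["os"]]
    subprocess.run(setup, check=True)
    try:
        results = {}
        for name, cmd in test_cmds.items():
            results[name] = subprocess.run(cmd).returncode == 0
        return results
    finally:
        subprocess.run(teardown)       # always reclaim the environment

# for env in environments:
#     print(env, run_in_environment(env, {"smoke": ["./smoke_test.sh"]}))
```
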
Testing to Help Ship on Demand

This is an idea from Microsoft, where they look at the testing process as one that
enables late changes and accommodates market pressures. It changes the role of testing to
one of providing excellent regression ability and working in late changes that still do not
break the product or the ship schedule. This really amounts to a philosophical view of
testing, placing it in a different role and yielding new ramifications for the entire development
process. We cite this as a best practice to recognize that there may be areas where such a
conceptual framework necessitates a very reactive testing practice. The practice ought to
identify how to work this concept into organizations and products in specific markets. It
may have applicability in the e-Commerce world, where there is far greater customer
interaction and competitive pressure.

State Task Diagram

This practice captures the functional operations of an application or a module in
the form of a state transition diagram. The advantage of doing so is that it allows one to
create test cases automatically or create coverage metrics that are closer to the functional
decomposition of the application. There are a fair number of tools that allow for capturing
Markov models, which may be useful for this practice. The difficulties have usually been in
extracting the functional view of a product, which may not exist in any computable or
documented form, and producing the state transition diagram. One of the automated test
generation tools, called Test Master from Teradyne, actually uses state task diagrams for
the generation of functional tests. This practice has possibly more than one application,
and the keepers of the practice need to capture the tools, the methods, and its uses.

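To make the idea concrete, the sketch below generates test sequences by randomly walking a
small state transition diagram. The editor-like states and events are invented, and a tool such
as Test Master would derive far more systematic walks from a captured model.

```python
# Sketch: test-sequence generation by walking a state transition diagram.
import random

# transitions[state][event] -> next state
transitions = {
    "closed": {"open": "clean"},
    "clean":  {"edit": "dirty", "close": "closed"},
    "dirty":  {"save": "clean", "edit": "dirty", "discard": "closed"},
}

def generate_walk(start, length, seed):
    rng = random.Random(seed)
    state, path = start, []
    for _ in range(length):
        event, nxt = rng.choice(sorted(transitions[state].items()))
        path.append((state, event, nxt))
        state = nxt
    return path

# Each walk is a test case: the sequence of events to drive, plus the
# expected state after each step, which serves as the oracle.
for step in generate_walk("closed", 6, seed=2):
    print("%-6s --%-7s--> %s" % step)
```
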
Memory Resource Failure Simulation

This practice addresses a particular class of software bug, namely the loss of memory
because of poor heap management or the lack of garbage collection. It is a fairly serious
problem for many C programs in Unix applications. It also exists on other platforms and
languages. There are commercial tools available to help simulate memory failure and
check for memory leaks. The practice should be generic and develop methods and
techniques for use on different platforms and language environments.

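A common technique such tools embody is fault injection at the allocation point. The sketch
below simulates it at the application level with a wrapper that fails allocations
probabilistically; the allocator, the failure rate, and the record-loading code are all invented.

```python
# Sketch: simulated memory-allocation failure to exercise error paths.
import random

class FlakyAllocator:
    def __init__(self, fail_rate, seed=0):
        self.fail_rate = fail_rate
        self.rng = random.Random(seed)

    def alloc(self, nbytes):
        if self.rng.random() < self.fail_rate:
            raise MemoryError(f"injected failure allocating {nbytes} bytes")
        return bytearray(nbytes)

def load_records(alloc, n):
    buffers = []
    for _ in range(n):
        try:
            buffers.append(alloc.alloc(4096))
        except MemoryError:
            # The code path under test: clean up and degrade gracefully.
            buffers.clear()
            return None
    return buffers

result = load_records(FlakyAllocator(fail_rate=0.3), 20)
print("survived injection" if result is None else f"{len(result)} buffers")
```
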
Statistical Testing

The concept of statistical testing was invented by the late Harlan Mills (the IBM
Fellow who invented Clean Room software engineering). The central idea is to use
software testing as a means to assess the reliability of software as opposed to a debugging
mechanism. This is quite contrary to the popular use of software testing as a debugging
method. Therefore one needs to recognize that the goals and motivations of statistical
testing are fundamentally different. There are many arguments as to why this might indeed
be a very valid approach. The theory of this is buried in the concepts of Clean Room
software engineering and is worthy of a separate discussion. Statistical testing needs to
exercise the software along an operational profile and then measure interfailure times that
are then used to estimate its reliability. A good development process should yield an
increasing mean time between failures every time a bug is fixed. This then becomes the
release criterion and the condition to stop software testing.

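The arithmetic at the heart of the measurement is simple, as the sketch below shows with
invented failure timestamps: interfailure times are computed from the failure history, and a
rising mean time between failures supports the release criterion.

```python
# Sketch: interfailure times and a crude reliability-growth check.
failure_times = [2.0, 5.5, 10.0, 19.0, 37.0, 70.0]   # hours into the run

interfailure = [b - a for a, b in zip([0.0] + failure_times, failure_times)]
print("interfailure times:", interfailure)

# Compare mean time between failures over the first and second half
# of the failure history.
half = len(interfailure) // 2
early = sum(interfailure[:half]) / half
late = sum(interfailure[half:]) / (len(interfailure) - half)
print(f"early MTBF {early:.1f}h, late MTBF {late:.1f}h")
if late > early:
    print("MTBF is growing: evidence the fixes are improving reliability")
```
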
Semi-Formal Methods

The origin of formal methods in software engineering dates back a couple of decades.
Over the years they have made considerable progress in some specific areas such as protocol
implementation. The key concept of a formal method is that it allows for a
verification of the program as opposed to testing and debugging. The verification methods
are varied: some of them are theorem provers, while others are simulations against
which assertions can be validated. The vision of formal methods has always been that if
the specification of software is succinctly captured, it could lead to automatic generation of
code, requiring minimal testing.

In practice there has been much debate on the viability of semi-formal methods,
and to a large extent the industry ignores them. However, one must recognize a very key
contribution from IBM's Hursley Lab. This is where the kernel of CICS was implemented
following a formal specification written in Z. A semi-formal method is one where the
specifications captured may be in state transition diagrams or tables that can then be used
even for test generation. IBM's Zurich Research Lab has done some work in this area and
very successfully used it for protocol implementations. The best practice in semi-formal
methods ought to capture our experience and also guide places where such applications
may be viable.

Check-in Tests for Code

The idea of a check-in test is to couple an automatic test program (usually a
regression test) with the change control system. Microsoft has been known to employ
such a system very well. This allows for an automatic test run on recently changed code so
that the chances of the code breaking the build are minimized. In fact, Microsoft's change
control system and build are supposedly set up such that unless the code passes the test, it
does not get promoted into the next build.

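A minimal sketch of such a gate follows, written as a pre-commit style hook. The regression
suite location and the use of pytest are assumptions for the example, not a description of any
particular change control system.

```python
# Sketch: a check-in gate that runs the regression bucket and refuses
# the promotion on failure (nonzero exit blocks the check-in).
import subprocess
import sys

def checkin_gate():
    result = subprocess.run(
        ["python", "-m", "pytest", "tests/regression", "-q"],  # hypothetical suite
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        print("check-in rejected: regression tests failed", file=sys.stderr)
        print(result.stdout, file=sys.stderr)
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(checkin_gate())
```
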
Minimizing Regression Test Cases

In organizations that have a legacy of development and of products that have
matured over many releases, it is not uncommon to find regression test buckets that are
huge. The negative consequence of such large test buckets is that they take long to
execute. At the same time, it is often unclear which of these test cases are duplicative,
providing little additional value. There are several methods to minimize the regression
tests. One of the methods looks at the code coverage produced and distills the test cases
to a minimal set. One must note that this method, though attractive, does confuse a
structural metric with a functional test. Nevertheless, it is a way to implement the
minimization.

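The coverage-based method can be sketched as a greedy set cover, as below. The mapping from
test cases to covered code elements is invented; real coverage data would come from a
measurement tool.

```python
# Sketch: coverage-based regression minimization as greedy set cover.
coverage = {
    "t1": {1, 2, 3, 4},
    "t2": {3, 4},          # subsumed by t1
    "t3": {5, 6},
    "t4": {4, 5},          # adds nothing once t1 and t3 are chosen
    "t5": {7},
}

def minimize(coverage):
    remaining = set().union(*coverage.values())
    chosen = []
    while remaining:
        # Pick the test covering the most still-uncovered elements.
        best = max(coverage, key=lambda t: len(coverage[t] & remaining))
        if not coverage[best] & remaining:
            break
        chosen.append(best)
        remaining -= coverage[best]
    return chosen

print(minimize(coverage))   # ['t1', 't3', 't5']
```
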
Instrumented Versions for MTTF

An opportunity that a beta program provides is that one gets a large sample of users
to test the product. If the product is instrumented so that failures are recorded and
returned to the vendor, they would yield an excellent source from which to measure the
mean time between failures of the software. There are several uses for this metric. Firstly,
it can be used as a gauge to enhance the product's quality in a manner that would be
meaningful to a user. Secondly, it allows us to measure the mean time between failures of
the same product under different customer profiles or user sets. Thirdly, it can be enhanced
to additionally capture first-failure data that could benefit diagnosis and problem
determination. Microsoft has claimed that they are able to do at least the first two through
instrumented versions that they ship in their betas.

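The sketch below shows the kind of lightweight instrumentation such a beta could carry: a
wrapper that appends a timestamp and first-failure diagnostic data to a log whenever the
product traps an error. The log file name, record format, and wrapped function are
illustrative only.

```python
# Sketch: field-failure instrumentation for MTTF measurement.
import json
import time
import traceback

LOG = "field_failures.jsonl"

def instrumented(fn):
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception:
            record = {
                "ts": time.time(),                # for MTBF estimation
                "where": fn.__name__,
                "trace": traceback.format_exc(),  # first-failure data
            }
            with open(LOG, "a") as f:
                f.write(json.dumps(record) + "\n")
            raise
    return wrapper

@instrumented
def parse_config(text):
    return dict(line.split("=", 1) for line in text.splitlines())

# In the field, every trapped error appends one record; mean time between
# failures is later estimated from successive "ts" values in returned logs.
```
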
Benchmark Trends

Benchmarking is a broad concept that applies to many disciplines in different areas.
In the world of software testing, we could interpret this to mean the techniques and the
performance of testing methods as experienced by other software developers. Today there
is not an avenue to regularly exchange such information and compare benchmarks. This
best practice could be initiated by benchmarking across IBM Labs and then advance the
practice to include a larger pool with competitors and customers.

Bug Bounties

We have heard that bug bounties were used in Microsoft, and we know that they
have been used in IBM during the 10X days. Bug bounties refer to initiatives that
charge the organization with a focus on detecting software bugs, at times providing
rewards too. Experience shows that such efforts tend to identify a larger than usual
number of bugs. Clearly, additional resource is necessary to fix the bugs. But the net
result is a higher quality product.
`
`11
`
`
`
`11
`
`
