CS 372 Spring 2016
Notes for February 8, 2016

The Software Development Process (cont’d)

Testing

Definition

Testing is executing software to see whether it does what it is intended to do, and how well it performs. There is a wealth of testing-related practices and terminology. Here is an overview. Later, we will look at some of these in more detail.

How Much to Test: Unit, Integration, & System Testing

A unit is a small testable piece of code. A typical unit is a function or a class. Usually we cannot test a portion of a function in isolation, but we can test a whole function by calling it and checking what it does.

In unit testing, we write code to test a specific unit of the software. This code is generally separate from the software itself. A specific test case (or simply test) checks a particular behavior of a particular unit. A group of test cases forms a test suite. A number of unit testing frameworks exist to support the writing of test suites.
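
For example, here is a minimal unit-test sketch in C++, using plain assert rather than a framework; the function isLeapYear and the two test cases are hypothetical examples, not code from any particular project or framework.

#include <cassert>

bool isLeapYear(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

// One test case checks one behavior of one unit.
void test_isLeapYear_centuries()
{
    assert(!isLeapYear(1900));  // divisible by 100 but not 400
    assert(isLeapYear(2000));   // divisible by 400
}

void test_isLeapYear_ordinary()
{
    assert(isLeapYear(2016));
    assert(!isLeapYear(2015));
}

// The test suite is simply the collection of test cases.
int main()
{
    test_isLeapYear_centuries();
    test_isLeapYear_ordinary();
    return 0;
}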

In integration testing, we check how software units work together. For example, if one function writes a file, and another function reads that same file, then we might try them both.
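
A small sketch of that scenario, assuming two hypothetical functions writeGreeting and readGreeting; the integration test exercises both units together through the file they share.

#include <cassert>
#include <fstream>
#include <string>

void writeGreeting(const std::string & filename, const std::string & text)
{
    std::ofstream out(filename);
    out << text << "\n";
}

std::string readGreeting(const std::string & filename)
{
    std::ifstream in(filename);
    std::string line;
    std::getline(in, line);
    return line;
}

int main()
{
    // Integration test: what one unit writes, the other must read back.
    writeGreeting("greeting.txt", "hello");
    assert(readGreeting("greeting.txt") == "hello");
    return 0;
}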

System testing runs a complete software system. Typically there is no separate testing code. Facets of system testing include the following.

Smoke testing
Does the software successfully execute without anything disastrous occurring?
Sanity testing
Does the software do something sensible? Does it do essentially what it is intended to do?
Usability testing
How amenable to actual use is the software? This kind of testing can involve trying out every option the software has.
GUI testing
Software that has a graphical user interface (GUI, pronounced like “gooey”) involves special usability issues. Does the GUI work?
Performance testing
Even if the software does what it is intended to do, it may not do it fast enough. Find out (a simple timing sketch appears after this list).
Scalability testing
Software that does fine with small datasets might not work well with large ones. Software that runs fine on a single machine may have problems when a large number of machines are involved. Again, find out.
Installation testing
Here we try the software exactly as the user will actually encounter it.
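
As an illustration of performance testing, here is a rough sketch that times one operation with std::chrono; the operation (a sort) and the dataset size are placeholders, not a recommendation of what to measure.

#include <algorithm>
#include <chrono>
#include <iostream>
#include <vector>

int main()
{
    // Build a reverse-sorted dataset; the size is a placeholder.
    std::vector<int> data;
    for (int i = 1000000; i > 0; --i)
        data.push_back(i);

    auto start = std::chrono::steady_clock::now();
    std::sort(data.begin(), data.end());   // the operation under test
    auto stop = std::chrono::steady_clock::now();

    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
        stop - start).count();
    std::cout << "sort took " << ms << " ms\n";

    // Is that fast enough? A scalability test would repeat the measurement
    // with much larger datasets (or with more machines, for distributed software).
    return 0;
}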

Testing Philosophy: White-Box & Black-Box Testing

When we design tests for code, what knowledge do we base these tests on?

In white-box testing we design tests based on knowledge of the code itself. We write tests to exercise all execution paths and corner cases: those situations that are handled by special-case code.

In black-box testing we design tests based on the specification of the code, but no knowledge of the code itself. We know what the code is supposed to do; does it do this?

Note that white-box tests can only be designed after code is written. Black-box tests, in contrast, can be designed before code is written.
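
To illustrate the difference, here is a sketch built around a hypothetical function contains, whose specification is simply “return true exactly when the value appears in the sorted vector.” The black-box tests follow from that specification alone; the white-box tests are written after reading the code, to exercise its special-case path for small inputs.

#include <algorithm>
#include <cassert>
#include <vector>

bool contains(const std::vector<int> & sortedData, int value)
{
    // Special-case code: fall back to a simple scan for very small inputs.
    if (sortedData.size() < 8)
        return std::find(sortedData.begin(), sortedData.end(), value)
               != sortedData.end();
    return std::binary_search(sortedData.begin(), sortedData.end(), value);
}

int main()
{
    // Black-box tests: based only on the specification; could be written
    // before the code exists.
    assert(contains({1, 3, 5, 7, 9, 11, 13, 15, 17}, 9));
    assert(!contains({1, 3, 5, 7, 9, 11, 13, 15, 17}, 8));

    // White-box tests: designed after reading the code, to exercise the
    // small-input scan path as well as the binary-search path.
    assert(contains({2, 4, 6}, 4));
    assert(!contains({2, 4, 6}, 5));
    return 0;
}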

Other Testing Concepts

When making a change to software, it is natural to test whether the change has the desired effect. However, changes can have other effects: they can impact parts of the software other than the part that was changed. A regression is a new bug in existing code. Regression testing is testing for regressions.

Note that regressions can pop up even in code that has not been changed. For example, code that calls a C++ virtual function may perform differently upon the introduction of a new derived class with a different implementation of the virtual function. It is possible that such code has always performed correctly in the past, but performs incorrectly when the new virtual function is called.
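
Here is a sketch of how such a regression could arise. The classes and the assumption made by printLabel are hypothetical; the point is that printLabel itself never changes.

#include <iostream>
#include <string>

class Shape {
public:
    virtual ~Shape() {}
    virtual std::string name() const { return "shape"; }
};

class Circle : public Shape {
public:
    std::string name() const override { return "circle"; }
};

// Existing, unchanged code: it quietly assumes name() is never empty.
void printLabel(const Shape & s)
{
    std::cout << "[" << s.name() << "]\n";
}

// Newly added derived class: its name() violates the old assumption,
// so printLabel regresses even though printLabel was never edited.
class UnnamedShape : public Shape {
public:
    std::string name() const override { return ""; }
};

int main()
{
    Circle c;
    printLabel(c);     // prints "[circle]", as it always has

    UnnamedShape u;
    printLabel(u);     // prints "[]" -- a regression in unchanged code
    return 0;
}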

A common practice in software development is to include unit-testing code in the project repository. Many projects enforce a rule that all unit tests must be run when checking code in to the repository, and that code which does not pass all the tests cannot be checked in. Rules like this essentially require regression testing.

When do we write unit tests? In the test-driven development (TDD) methodology, we write tests first, then we write the code to make them pass. Under TDD, the process of introducing a new software feature proceeds as follows.

Write a test for the new feature, and run the test suite; the new test should fail, since the feature does not exist yet.
Write just enough code to make the new test pass.
Run the entire test suite again, to make sure nothing else has broken.
Refactor as needed, rerunning the tests to verify that they still pass.

Note that when we use TDD, the tests we write are necessarily black-box tests.
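
Here is a sketch of the TDD style for a hypothetical function slugify. The test is written first, from the specification alone; the function is then written only to make that test pass.

#include <cassert>
#include <cctype>
#include <string>

std::string slugify(const std::string & title);  // interface implied by the test

// Step 1 (written first): a black-box test based only on the specification:
// lowercase the title and replace spaces with hyphens.
void test_slugify()
{
    assert(slugify("Hello World") == "hello-world");
    assert(slugify("CS 372") == "cs-372");
}

// Step 2 (written afterward): just enough code to make the test pass.
std::string slugify(const std::string & title)
{
    std::string result;
    for (char c : title)
    {
        if (c == ' ')
            result += '-';
        else
            result += static_cast<char>(
                std::tolower(static_cast<unsigned char>(c)));
    }
    return result;
}

int main()
{
    test_slugify();
    return 0;
}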

Debugging, Deployment, & Maintenance

Debugging is finding and fixing problems in software. The word is also used to mean running software under a debugger, a tool that allows us to examine software while it is running. An Integrated Development Environment (IDE) will typically include an integrated debugger; there are also stand-alone debuggers. We will discuss debuggers later in the semester.

To deploy software is to make it available to users. This can take many forms. Web-based software might be deployed by uploading it to a web server. A mobile app might be deployed by making it available for sale on an app store.

Maintenance is a catch-all term for development activity done after deployment. Maintenance can include the following.

Fixing bugs that are discovered after deployment.
Improving performance.
Adding new features requested by users.
Adapting the software to new platforms, environments, or dependencies.

Stories & Points

When compiling user stories, it is helpful to organize them according to the amount of effort likely to be required.

Traditionally, effort is measured in man-hours. A man-hour is the amount of work one person can do in one hour. However, measuring in man-hours has fallen out of favor in many circles. There are two reasons for this.

A solution is to rate user stories in terms of points. This is a non-specific unit; the idea is only that more points means more work. One way to do this is to choose a user story of medium difficulty, and rate it at 5 points. Then rate the other stories accordingly.