The group testing problem was first proposed by Dorfman during World War II when blood samples of millions of draftees were subject to identical analyses in order to detect a few thousand cases of syphilis. The idea of group testing is to pool the blood samples, say, ten in a group, for one test instead of testing them one by one. When the outcome of the group test is negative, then all samples in the group are good (disease-free). Otherwise, then some samples are defective and further testing on them are necessary. In the 1960s Sobel and Groll revived the interest in group testing by giving many industrial applications in detecting chemical leakage and electrical blocking. Two general models have been used to study group testing. In the probabilistic model, the items are usually assumed to be independent and each has a certain probability of being defective. In the combinatorial model, the total number of defectives, or an upper bound, is assumed to be known. It is the combinatorial group testing model on which discrete mathematics has the greatest impact.
Let n denote the number of items to be classified which contain exactly d
defectives (most results extend to the case when d is merely an upper bound).
Let
denote the maximum (worst-case) number of tests required by
algorithm A for given n and d.
Define
.
The problem is to determine
.
While
can be easily proved, the
determination of
for general n with
fixed is open.
Furthermore, it is not known whether this problem is NP-complete.
Currently, there exist
-time heuristics which can complete
the classification in
tests, with at most
tests
more than
.
There are two special classes of group testing algorithms.
A ``nonadaptive'' algorithm specifies all tests simultaneously, without the
benefit of using the outcomes of previous tests to determine the present test.
The currently best nonadaptive algorithm requires
tests.
A ``competitive'' algorithm assumes no knowledge of d and its
performance is evaluated at the most
unfavorable d-value.
The currently best competitive algorithm has a competitive coefficient of 1.65.
A polymer chain reaction (PCR) can identify a specified DNA sequence in a pool of clones. Thus the problem of screening a clone library for some specific DNA sequences can be cast as a group testing problem with clones as the items and a PCR as a test. However, the goal of clone library screening goes beyond minimizing the number of PCR's; the easiness of preparing the pools and whether pooling can be done parallelly are important issues.
A prevalent type of fault in the manufacture of electrical circuits is the presence of a short between two nets. A short detector in common use is an apparatus with two connecting leads, which, when connected respectively to two groups of nets, can detect the presence of a short between the two groups. An algorithm based on group testing has been proposed to detect and locate all shorts with much fewer tests than existing algorithms.
A nonadaptive group test algorithm with n items, d defectives and t
tests can be represented by a
binary matrix where cell
has a 1-entry if and only if item j is contained in test i.
The Boolean sum of any up to d columns must be distinct
(or the algorithm would not differentiate two sets of d defectives).
Such a matrix can also be interpreted as a superimposed code (each
column as a code word), where any d code word can be superimposed into
one transmission and be uniquely decoded.
Consider a communication network where many users share a multiaccess channel. Typically, the number of ``active'' users, those with a message to transmit, is small at any given instance. A ``bit reservation'' scheme first identifies all active users; then schedules them to transmit in order to avoid more than one transmission at the same time. At each step of a bit reservation scheme, a set of users is queried and those active users must each transmit a bit. If a transmission occurs after a query, then we know there are active users among the set, but not knowing whom or how many. This is simply group testing with active users being the defective items and queries being the tests.
For real-world applications, various kinds of restrictions need to be considered. The test kit may have a size constraint, or the number of tests an item can go through is limited. On an assembly line, only consecutive items can be tested as a group. But the most important issue to be further studied is that tests are rarely error-free; typically, the accuracy of a test dilutes as the group size increases.
An introduction to the combinatorial group testing problem is given in the D. Z. Du and F. K. Hwang, Combinatorial Group Testing and Its Applications, World Scientific, 1993.
Thu Sep 28 09:58:04 CDT 1995