Something little on group testing


The Problem

It was a year ago when I was testing multiple optional parts of a prompt to improve the overall prompt performance. My hypothesis was that some parts were lowering the performance. After a while, I just decided to remove those parts one by one.

Recently, I encountered a new problem. You have $n$ attachments. You know that some of them are malicious and will install malware on your VM. To test whether an attachment is malicious, however, you have to run a scan in a VM, which takes a long time. A naive solution is to test every attachment in sequence, but that will take forever. We can test multiple attachments at once. Is there a better solution that would let us find the malicious documents more quickly?

Turns out there is! It is called group testing. From Wikipedia:

In statistics and combinatorial mathematics, group testing is any procedure that breaks up the task of identifying objects into tests on groups of items, rather than testing each item individually. First studied by Robert Dorfman in 1943, group testing is a relatively new field of mathematics that can be applied to a wide range of practical applications and is an active area of research today.

Iterative testing

There is one property, however, that you have to roughly estimate: the prevalence rate. In our case, that is the probability of a document being malicious. You can estimate it by picking a small number of documents (e.g. 200) and testing them one by one. It has also been shown that if the prevalence rate $p > p_u = (3 - \sqrt{5})/2 \approx 0.38$, then individual testing is the optimal strategy. For $p < p_u$ individual testing is not optimal, and the optimal strategy is not known for general population sizes $n > 2$.
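As a quick sanity check, the threshold and the decision rule fit in a few lines of Python. The pilot-sample numbers below are made up for illustration, and `strategy` is just an illustrative variable name:

```python
import math

# Threshold from the text: above it, individual testing is optimal.
p_u = (3 - math.sqrt(5)) / 2              # ~0.382

# Pilot estimate: scan a small sample one by one and count positives.
# These counts are hypothetical.
pilot_size, pilot_positives = 200, 9
p = pilot_positives / pilot_size          # estimated prevalence, 0.045

strategy = "individual" if p > p_u else "group"
```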

Non-Adaptive vs Adaptive Group Testing

Non-adaptive group testing is the case when you need to design the whole test before the testing begins. The problem can be expressed using a testing matrix $M$ of size $t \times n$ and the result vector $y = (y_1, y_2, \ldots, y_t)$ with binary values $y_i \in \{0, 1\}$. We want to minimise the number of rows $t$ and hence the size of the result vector $y$. An application is, for example, in virus testing, where the test is prepared and mass produced, and we can’t really change the design of the test during the testing itself.
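A tiny worked example of the non-adaptive setting, assuming NumPy is available. The matrix $M$ and the defective vector are made up; the decoding step uses the standard COMP rule (any item that appears in a negative test must be clean):

```python
import numpy as np

# Toy non-adaptive design: t = 3 pooled tests over n = 6 items.
# Row i of M marks which items go into test i; x marks defectives.
M = np.array([[1, 1, 0, 0, 1, 0],
              [0, 1, 1, 0, 0, 1],
              [1, 0, 0, 1, 0, 1]])
x = np.array([0, 0, 1, 0, 0, 0])      # item 2 is defective

# A pooled test is positive iff it contains any defective (boolean OR).
y = (M @ x > 0).astype(int)

# COMP decoding: anything appearing in a negative test is clean;
# everything else stays a candidate defective.
clean = M[y == 0].sum(axis=0) > 0
candidates = np.where(~clean)[0]
```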

Adaptive group testing is the interesting part. In most cases, when the number of things we want to test is high, we can afford to do a first round of testing, re-evaluate, and update our strategy, maybe even after every single test. Here is the most general best-performing algorithm; I think it is worth remembering.

The Generalised Binary-Splitting Algorithm

The most elegant adaptive approach is the generalised binary-splitting algorithm. Let’s say you have $n$ items and suspect that $d$ of them are defective. Let $n = 200$ and $d = 20$. Here’s how it works:

Step 1: Decide whether to test individually or in groups

  • If $n \leq 2d - 2$, you have relatively few items left. Test them one by one.
  • Otherwise, create a test group and continue to Step 2.

Step 2: Test a group of size $2^{\alpha}$

  • Calculate $l = n - d + 1$ (the maximum number of non-defective items); here $l = 200 - 20 + 1 = 181$
  • Set $\alpha = \lfloor \log_2(l/d) \rfloor$; here $\alpha = \lfloor \log_2(181/20) \rfloor = 3$
  • Test a group of $g = 2^{\alpha}$ items together; here $g = 2^3 = 8$

Step 3: Interpret the result

  • If the test is negative: Splendid. All items in the group are non-defective. Remove them from consideration and update $n := n - 2^{\alpha}$. Go back to Step 1.
  • If the test is positive: The group contains at least one defective item. Now perform binary search within this group:
    • Repeatedly split the current subgroup in half and test each half
    • Always follow the positive half (if both halves are positive, pick one)
    • Continue until you isolate exactly one defective item
    • This binary search takes exactly $\alpha$ additional tests (splitting $2^{\alpha}$ items down to 1)
    • Count all the non-defective items you identified during the binary search (those in the negative subgroups you encountered), and call this number $x$
    • Total cost for a positive group: $1 + \alpha$ tests
    • Update your totals: $n := n - 1 - x$ and $d := d - 1$. Go back to Step 1.

Important note about multiple defectives: If the original group of size $2^{\alpha}$ contains multiple defective items, the binary search will still find exactly one of them. Any other defectives that were in that group remain in the pool with the updated values of $n$ and $d$, and will be discovered in future iterations of the main algorithm. Each positive group yields exactly one identified defective, no matter how many defectives it actually contains.
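The steps above can be sketched in Python. The `test(group)` callback is hypothetical and stands in for whatever expensive check you run (the VM scan from the opening example); it returns True iff the group contains at least one defective. This is a sketch of the procedure as described, not a tuned implementation:

```python
import math

def gbs(items, d, test):
    """Generalised binary splitting: find defectives among `items`,
    given an upper bound `d` on their count and a pooled `test`."""
    pool = list(items)
    found = []
    while pool and d > 0:
        n = len(pool)
        if n <= 2 * d - 2:
            # Step 1: few items left, test them one by one.
            for item in pool:
                if test([item]):
                    found.append(item)
            return found
        # Step 2: test a group of size 2^alpha.
        l = n - d + 1
        alpha = math.floor(math.log2(l / d))
        group, pool = pool[:2 ** alpha], pool[2 ** alpha:]
        if not test(group):
            continue  # Step 3, negative: whole group is clean, drop it.
        # Step 3, positive: binary search for one defective in the group.
        sub, undecided = group, []
        while len(sub) > 1:
            half, rest = sub[:len(sub) // 2], sub[len(sub) // 2:]
            if test(half):
                sub = half
                undecided.extend(rest)  # not yet resolved, back to pool
            else:
                sub = rest              # `half` proven clean, discard it
        found.append(sub[0])
        d -= 1
        pool = undecided + pool         # undecided items return to the pool
    return found
```

Note that the undecided halves from the binary search go back into the pool, which is exactly the $n := n - 1 - x$ update from Step 3.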

The algorithm is remarkably efficient. For large $n/d$ ratios, it requires approximately $T \approx d \log_2(n/d)$ tests, which is close to the theoretical optimal bound. This means if you have 1000 items with 10 defectives, you’d need around 66 tests instead of 1000!

Handling Capacity Constraints

What if there is a maximum number of elements $u$ that we can test at one moment? We just cap the group size at $g = \min(2^{\alpha}, u)$ and continue as before. It will be less efficient, but for sure still much better than running sequential testing.
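Under that constraint, the group-size computation becomes a one-line change. A sketch, where `u` is the per-test capacity:

```python
import math

def group_size(n, d, u):
    """Capacity-capped group size for one round of the algorithm."""
    l = n - d + 1                         # max number of non-defectives
    alpha = math.floor(math.log2(l / d))
    return min(2 ** alpha, u)             # cap the usual 2^alpha at u
```

With the running example ($n = 200$, $d = 20$), a capacity of $u = 5$ shrinks the group from the unconstrained 8 down to 5.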

Parallel Testing

For parallel execution, use the same group size $g$ across multiple simultaneous tests, then synchronize after a fixed number of steps to update $n$ and $d$ based on all results. After synchronization, recalculate $\alpha$ and the new group size $g$ for the next batch of parallel tests.
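One way such a synchronized round could look, assuming disjoint groups and a pooled `test` callback as before. This is a sketch; in practice the test calls below would be dispatched concurrently rather than in a loop:

```python
def parallel_round(pool, g, k, test):
    """Run up to k pooled tests of size g on disjoint slices of `pool`.
    Returns (positive_groups, cleared_items) for the sync step."""
    groups = [pool[i * g:(i + 1) * g] for i in range(k)]
    groups = [grp for grp in groups if grp]   # drop empty tail groups
    results = [test(grp) for grp in groups]   # would run concurrently
    positive = [grp for grp, r in zip(groups, results) if r]
    cleared = [i for grp, r in zip(groups, results) if not r for i in grp]
    return positive, cleared
```

After the round, the cleared items leave the pool, each positive group feeds a binary search, and $n$, $d$, and $\alpha$ are recalculated for the next batch.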


I highly recommend also looking into this presentation https://people.maths.bris.ac.uk/~maotj/sheffield.pdf.