Date: 2023-03-09

There are several statistical tests which can be used to measure assay quality. Zhang (1999)4 proposed Z-factors as a measure of assay quality:
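Equation 1 is not reproduced in this text. The commonly cited form of the Z-factor is given here as a reminder, with symbol names chosen for illustration: μ_p and σ_p are the mean and standard deviation of the positive (uninhibited) signal, and μ_n and σ_n are those of the negative (inhibited) signal.

```latex
Z = 1 - \frac{3\,(\sigma_{p} + \sigma_{n})}{\lvert \mu_{p} - \mu_{n} \rvert} \tag{1}
```

On this definition Z approaches 1 as the standard deviations shrink relative to the signal window, and Z = 0.5 corresponds to the summed 3σ bands occupying half of the separation between the two means.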

The Z-factor defined in equation 1 utilises the population means for the positive and negative signals (uninhibited and inhibited reactions) and their corresponding population standard deviations. Z-factor is superior to other statistical parameters such as the signal-to-noise (S/N) ratio and signal‑to‑background (S/B) ratio, which do not consider the variation in both the positive and negative signals. The Z-factor is a dimensionless ratio and hence the obtained values are independent of the detection method used.4

The Z-factor is a parametric statistic, requiring that the measured data approximates a Gaussian distribution,4,5 that the data is continuous, and that the standard deviations are of similar size irrespective of signal magnitude (“homogeneity of variance”).6 Zhang (1999) also introduced the parameter Z’, which is derived in the same way using data from positive and negative controls,4 and is independent of the library and compound concentration used. A Z’-factor of >0.5 is generally considered sufficient to correctly identify hits within a screen.4 Both Z- and Z’-factors assume that there are no column and row effects arising from differences in reagent dispensation and evaporation rates. Alternative metrics (such as the B-score and BZ‑score) are available for analysing data with significant row or column effects.5
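As a minimal sketch of how Z’ might be calculated from control wells, the following Python function applies the commonly cited Z-factor formula to positive and negative controls. The function name `z_prime` and the control readings are hypothetical, and population standard deviations (NumPy's default) are assumed.

```python
import numpy as np

def z_prime(pos, neg):
    """Z'-factor from positive- and negative-control readings."""
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    window = abs(pos.mean() - neg.mean())  # separation of the control means
    if window == 0:
        raise ValueError("control means are identical; Z' is undefined")
    # np.std defaults to ddof=0, i.e. the population standard deviation
    return 1.0 - 3.0 * (pos.std() + neg.std()) / window

# Hypothetical readings, for illustration only
pos_controls = [100.0, 102.0, 98.0, 101.0, 99.0]
neg_controls = [10.0, 12.0, 8.0, 11.0, 9.0]
zp = z_prime(pos_controls, neg_controls)
print(f"Z' = {zp:.3f}")  # above the commonly cited 0.5 threshold
```

With a wide signal window and tight controls, as in these made-up readings, the result sits close to 1.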

The Z- and Z’-factors are used in pilot and high-throughput screens in several different ways. Firstly, Z’ serves as a quality control measure for each plate in the screen and is used to identify individual plates where the assay is not working correctly. Secondly, large HTS campaigns take several days to run and Z’ is used to ensure that large variations in performance do not occur during this time. However, it is also recognised that Z’ can be sensitive to changes when the signal window is small4,5 or when an insufficient number of positive and negative controls is used. An implicit assumption is that the Z’-factor, which is calculated from controls, is representative of performance in the actual screen, as measured by the Z-factor.5,7 The Z-factor does not require the use of control data and is representative of the actual screen.5 However, Z-factors are susceptible to anomalies introduced by library compounds and do not allow comparison of different plates within a screen or between different screens.
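The per-plate quality control described above could be sketched as follows. This is an illustrative example only: the array layout (one row per plate in run order, one column per control well), the function name `z_prime_per_plate` and all readings are assumptions, and the 0.5 cut-off is the commonly cited threshold rather than a fixed rule.

```python
import numpy as np

def z_prime_per_plate(pos, neg):
    """Z' for each plate; rows are plates, columns are control wells."""
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    window = np.abs(pos.mean(axis=1) - neg.mean(axis=1))
    spread = pos.std(axis=1) + neg.std(axis=1)  # population SDs (ddof=0)
    # Note: a plate with identical control means would give a division by zero
    return 1.0 - 3.0 * spread / window

# Hypothetical control readings for three plates, listed in run order
pos_controls = np.array([
    [100, 102, 98, 101, 99],
    [100, 104, 96, 102, 98],
    [100, 130, 70, 115, 85],   # noisy plate
])
neg_controls = np.array([
    [10, 12, 8, 11, 9],
    [10, 14, 6, 12, 8],
    [40, 70, 10, 55, 25],      # noisy plate with a reduced signal window
])

scores = z_prime_per_plate(pos_controls, neg_controls)
failed_plates = np.flatnonzero(scores < 0.5)  # indices of plates to review
print(scores.round(3), failed_plates)
```

Listing the scores in run order also makes it easy to see whether performance drifts over a multi-day campaign.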

So, do higher Z- or Z’-factors result in higher hit rates? Gribbon et al. (2005)7 reported a positive correlation6 between the confirmation hit rate (i.e. the percentage of hits giving the same behaviour on rescreening) and the median Z-factor. The reported correlation was r = 0.55, indicating that Z’-factor scores have an influence, but other factors are also important.7 Similarly, Lloyd (2020)1 investigated the influence of Z’ on the reported confirmed hit rate for a number of HTS campaigns against enzyme targets. The plotted data (Figure 1) shows an apparent positive correlation between hit rate and Z’, although there is considerable variation in the observed levels. However, statistical analysis failed to show a significant effect. These wide variations are likely to depend on several variables in addition to Z’-factor, including the type of compound library used in the screen, the density and diversity of the compound library,8 the compound concentration used in the screen, the assay read-out method and the ‘druggability’ of the target (ease of finding an inhibitor). In addition, increases in confirmed hit rate due to increasing Z’ are likely to become less pronounced as assay quality improves, given that most hits will be correctly identified at lower scores. It is therefore unclear whether increasing Z’ or Z-factor above a certain threshold conveys much advantage. However, measures that tend to improve Z’-factors, such as including detergent in the assay and using longer wavelengths in absorbance, luminescent and fluorescent assays, convey other advantages, as they lessen compound aggregation,9 autofluorescence and photochemical degradation,1 which otherwise result in elevated false positive hit rates.

Therefore, does Z- or Z’-factor provide a meaningful measure of assay quality? Shun (2011)5 compared data from six high-throughput screens, for which hit rates were predicted using several different methods, including percent inhibition, the parametric statistical tests Z- and Z’-factors and percent coefficient of variation (standard deviation divided by the mean), and the non-parametric median absolute deviation (MAD) score, B-score and BZ-score. They concluded that no single statistical method was suitable for analysing all the different screens. The Z-score performed well with parametric data, although the authors noted a screen which appeared to give acceptable data (as judged by the low coefficient of variation) with ready identification of hits, yet which had a Z’-factor well below the commonly cited 0.5 cut-off. On the other hand, the B-score and BZ-score were useful for data where significant column and row effects existed. The authors proposed a series of different statistical and graphical tests for analysing screening data.
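Two of the simpler metrics mentioned above can be illustrated in a few lines of Python. This is a sketch under stated assumptions, not the procedure used in the cited study: the function names are hypothetical, the readings are made up, and the 1.4826 factor is the usual constant that makes the MAD comparable to a standard deviation for Gaussian data.

```python
import numpy as np

def percent_cv(values):
    """Percent coefficient of variation: 100 * sample SD / mean."""
    values = np.asarray(values, dtype=float)
    return 100.0 * values.std(ddof=1) / values.mean()

def mad_scores(values):
    """Robust score per well: distance from the median in scaled-MAD units."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        raise ValueError("MAD is zero; robust scores are undefined")
    # 1.4826 scales the MAD to match the SD when data is Gaussian
    return (values - median) / (1.4826 * mad)

# Hypothetical plate readings with one strongly inhibited well (last value)
signals = np.array([100.0, 98.0, 102.0, 101.0, 99.0, 100.0, 40.0])
scores = mad_scores(signals)
cv = percent_cv([98.0, 100.0, 102.0])
print(scores.round(2), cv)
```

Because the median and MAD are barely affected by the single outlying well, that well stands out clearly, whereas a mean and standard deviation would both be pulled towards it.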

Use of the Z- and Z’-factor was recently criticised on several statistical grounds,10 including use of population means and standard deviations instead of the corresponding sample parameters, and the use of additive standard deviations rather than pooled standard deviations (the square root of the pooled variance).6 These issues are likely to have an impact when the number of samples in the analysis is limited, for example when using positive and negative controls to assess the data in individual plates. Plates with 384 wells are commonly used in screens, which typically incorporate 24 to 32 wells each for positive and negative controls.5 Approximately 30 samples of each control will be required for data to show a Gaussian distribution, according to the central limit theorem.6 The numbers of wells used for these controls may not be sufficient to ensure a Gaussian distribution, which means that use of population means and standard deviations may not be justified.10 One option may be to use 1536‑well plates instead, thereby enabling a greater number of control or test wells in each plate, but this also increases the occurrence of other anomalous behaviour such as non‑pharmacological activation of enzyme activity.11 The mechanism of activation is unclear, but it appears to be related to compound aggregation in low volume (1536-well) plates, as similar phenomena are not observed in parallel experiments in 384-well plates. Moreover, it is claimed that the thresholds for Z’ have been chosen arbitrarily without rigorous testing.10 Further statistical research will be required to assess the potential impact of these statistical issues10 on assessing assay performance, establishing appropriate threshold values, and identifying hit compounds.
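The distinction between population and sample standard deviations, and between additive and pooled standard deviations, can be made concrete with a small numerical sketch. The function names and readings below are hypothetical, and the code only computes the quantities named above; it does not reproduce any alternative metric proposed in the cited critique.

```python
import numpy as np

def z_prime_variants(pos, neg):
    """Z' computed with population SDs (ddof=0) and with sample SDs (ddof=1)."""
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    window = abs(pos.mean() - neg.mean())
    population = 1.0 - 3.0 * (pos.std(ddof=0) + neg.std(ddof=0)) / window
    sample = 1.0 - 3.0 * (pos.std(ddof=1) + neg.std(ddof=1)) / window
    return population, sample

def pooled_sd(a, b):
    """Pooled SD: square root of the variance pooled over both groups."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled_var = ((a.size - 1) * a.var(ddof=1) + (b.size - 1) * b.var(ddof=1)) / (
        a.size + b.size - 2
    )
    return np.sqrt(pooled_var)

# Hypothetical control readings; only five wells each to exaggerate the effect
pos_controls = [100.0, 102.0, 98.0, 101.0, 99.0]
neg_controls = [10.0, 12.0, 8.0, 11.0, 9.0]

pop_z, samp_z = z_prime_variants(pos_controls, neg_controls)
sd_pooled = pooled_sd(pos_controls, neg_controls)
print(f"population Z' = {pop_z:.4f}, sample Z' = {samp_z:.4f}, pooled SD = {sd_pooled:.4f}")
```

The sample-based value is always the lower of the two, and the gap narrows as the number of control wells grows, which is why the choice matters most for plates with few controls.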

Key takeaways

Z- and Z’-factors are commonly used to determine assay performance, perform quality control on the screen, and identify active compounds. Although the use of these metrics has been criticised,10 they are likely to be used for the foreseeable future.

Matthew D Lloyd read Biological Chemistry at the University of Leicester and achieved a doctor of philosophy (DPhil) at the University of Oxford. Following post-doctoral fellowships, he joined the Department of Life Sciences at the University of Bath in 2002. He is Senior Lecturer and Director of Studies for the MSc in Drug Discovery, researching enzymes as drug targets using enzyme kinetics and chemical biology techniques.

