# Data Analysis

Numerous approaches to data analysis have been suggested for quantitative real-time PCR. Unfortunately, data analysis does require use of a little mathematics, although this should not put off the nonmathematically inclined. Once the basis is understood, these calculations may all be simply automated.

The amplification of DNA during a PCR may be described by the following simple equation (Eq. 1):

$$X_n = X_0 \, (1 + E)^n \tag{1}$$

where Xn is the DNA concentration at cycle n, X0 is the initial DNA concentration, E is the reaction efficiency, and n is the number of cycles of PCR (16). In real-time PCR, instead of measuring DNA concentration, the fluorescence of each sample is measured, which is proportional to the DNA content. If the amplification efficiency (E), threshold (RCt), and number of cycles taken to reach this threshold (Ct) are known, one may calculate the theoretical initial fluorescence (R0) as follows (Eq. 2):

$$R_0 = R_{Ct} \, (1 + E)^{-Ct} \tag{2}$$

The simplest way of analyzing qPCR data is to calculate the R0 value for each sample at a known threshold (RCt in Eq. 2) from its Ct value. This circumvents many problems, as R0 is a linear unit (as opposed to Ct, which is logarithmic). Therefore, a sample with an R0 twice that of the control contains double the transcript level. Moreover, statistical analysis and measures of variance can be calculated directly from R0 values, which is less straightforward with Ct values.
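As a minimal sketch of this calculation, the theoretical initial fluorescence is the threshold fluorescence divided by (1 + E) raised to the power Ct. The function name, the Ct values, and the threshold of 1.0 are assumptions chosen only for illustration.

```python
def initial_fluorescence(ct, efficiency, threshold=1.0):
    """Theoretical starting fluorescence: threshold * (1 + E) ** -Ct."""
    return threshold * (1.0 + efficiency) ** (-ct)


# Hypothetical Ct values read at the same threshold, with E assumed to be 1.0.
control = initial_fluorescence(ct=25.0, efficiency=1.0)
treated = initial_fluorescence(ct=24.0, efficiency=1.0)

# The ratio is on a linear scale, so it can be read directly as a fold change.
fold_change = treated / control
print(fold_change)
```

Because both samples share the same threshold, the threshold cancels in the ratio; only the Ct difference and the efficiency affect the fold change.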

If threshold (RCt) and Ct are known, then the only unknown is the amplification efficiency (E). The different approaches to data analysis just represent different approaches to calculating this value (13). These can be broken down into three major approaches:

1. Assumed efficiency.

2. Standard curves.

3. Kinetic analysis.

The first method, assumed efficiency, is perhaps more commonly referred to throughout the qPCR literature as the $2^{-\Delta\Delta Ct}$ method (16,17). This simply assumes the reaction efficiency to be 1.00 for both target gene and internal control, i.e., a perfect doubling of reaction product every cycle of PCR. The advantage of this approach is its simplicity: there is no need for any additional calculations, and in most cases it provides a good approximation (see Note 7). The disadvantage is that in most cases the amplification efficiency will be lower than 1, and as such this approach will introduce errors into the exact quantification, as well as exaggerating the magnitude of any differences between groups.
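The effect of the assumed efficiency can be illustrated with a short sketch. The Ct values below are invented, and for simplicity the same efficiency is applied to both the target gene and the internal control, which is itself an assumption.

```python
def relative_expression(ct_target_exp, ct_ref_exp,
                        ct_target_ctl, ct_ref_ctl, efficiency=1.0):
    """Fold change as (1 + E) ** -ddCt; E = 1.0 gives the assumed-efficiency case."""
    delta_ct_exp = ct_target_exp - ct_ref_exp   # normalise to the internal control
    delta_ct_ctl = ct_target_ctl - ct_ref_ctl
    delta_delta_ct = delta_ct_exp - delta_ct_ctl
    return (1.0 + efficiency) ** (-delta_delta_ct)


# Hypothetical Ct values: the target appears three cycles earlier after treatment.
assumed = relative_expression(22.0, 18.0, 25.0, 18.0)                   # E = 1.00
realistic = relative_expression(22.0, 18.0, 25.0, 18.0, efficiency=0.9)
print(assumed, realistic)
```

With these numbers, assuming perfect doubling gives a larger fold change than an efficiency of 0.9 would, which is the exaggeration described above.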

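The second approach makes use of standard curves (see Fig. 5), typically constructed from either known copy numbers (if absolute quantification is required) or a serially diluted cDNA sample (18). The amplification efficiency can be derived from the slope of a standard curve as follows (Eq. 3):

$$E = 10^{-1/\text{slope}} - 1 \tag{3}$$

where the slope is that of Ct plotted against the base-10 logarithm of the initial concentration.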

Fig. 5. The use of standard curves involves preparing a serial dilution of template, and plotting Ct vs the initial concentration (on a logarithmic scale). The concentration of unknown samples may then be extrapolated using linear regression. For example, samples containing a range of concentrations between 100 and 100,000,000 copies are amplified (A). The Ct of each known concentration is then plotted against the copy number, and unknowns may then be extrapolated (B). If an unknown sample has a Ct of 21, this would correspond to a concentration of 4.6 (log scale), or 39,811 copies.
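The standard-curve procedure can be sketched as follows. The dilution series is invented for illustration (it is not the data behind Fig. 5), and the helper `fit_line` is a hypothetical name for a plain least-squares fit.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented standards: log10 of the copy number and the Ct measured for each.
log10_copies = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ct_values = [31.2, 27.8, 24.4, 21.0, 17.6, 14.2, 10.8]

slope, intercept = fit_line(log10_copies, ct_values)

# Efficiency from the slope of Ct against log10 concentration.
efficiency = 10 ** (-1.0 / slope) - 1.0

# Read an unknown sample off the fitted line by inverting it.
unknown_ct = 21.0
unknown_copies = 10 ** ((unknown_ct - intercept) / slope)
print(round(slope, 2), round(efficiency, 3), round(unknown_copies))
```

Unknowns are most reliable when their Ct falls within the range covered by the standards.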

When calculating relative expression, if using a standard curve composed of copy numbers, the end result (fold change between control and experimental samples) is mathematically identical whether using Eq. 2 or deriving the copy number for every sample.
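A short derivation makes this equivalence explicit. The symbols m (slope), b (intercept), and N (copy number) are notation introduced here for illustration, and the sketch assumes that both samples are read at the same threshold and that E is taken from the slope of the same standard curve, so that 1 + E = 10^(-1/m).

$$
\begin{aligned}
Ct &= m \log_{10} N + b
\quad\Longrightarrow\quad
N = 10^{(Ct - b)/m} \\[4pt]
\frac{N_{\text{exp}}}{N_{\text{ctl}}}
&= 10^{(Ct_{\text{exp}} - Ct_{\text{ctl}})/m} \\[4pt]
\frac{R_{0,\text{exp}}}{R_{0,\text{ctl}}}
&= (1 + E)^{-(Ct_{\text{exp}} - Ct_{\text{ctl}})}
= \left(10^{-1/m}\right)^{-(Ct_{\text{exp}} - Ct_{\text{ctl}})}
= 10^{(Ct_{\text{exp}} - Ct_{\text{ctl}})/m}
\end{aligned}
$$

The intercept b and the threshold both cancel in the ratio, so the two routes give the same fold change.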

The final approach, kinetic analysis, uses the information that is present in every amplification plot to calculate the amplification efficiency for every sample (19). As each estimate carries an associated measuring error, individual corrections are possible only when this error is very small; otherwise, slight differences in the estimated reaction efficiency are compounded exponentially over the cycles (see Subheading 1.2.3.). In its simplest form, a linear regression may be fitted to the log-transformed fluorescence in the exponential portion of each amplification plot, the slope of which enables the amplification efficiency to be derived. The advantage of this approach is that no additional standards are required, and furthermore one has multiple measurements of the amplification efficiency for every transcript under study. One may therefore calculate the mean efficiency and test for any deviations in amplification efficiency between groups (13). More advanced models are available, but these are computationally intensive and present additional technical challenges (ref. 20; see Note 8).
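The simplest form of kinetic analysis can be sketched as below. The amplification curve is synthetic (a pure exponential capped at an arbitrary plateau), and the fitting window is chosen by hand. Both are assumptions: real data would need baseline correction and an objective way of selecting the exponential phase.

```python
import math


def efficiency_from_plot(fluorescence, first_cycle, last_cycle):
    """Regress log10(fluorescence) on cycle number; E = 10 ** slope - 1."""
    cycles = list(range(first_cycle, last_cycle + 1))
    logs = [math.log10(fluorescence[c]) for c in cycles]
    n = len(cycles)
    mean_c = sum(cycles) / n
    mean_l = sum(logs) / n
    sxy = sum((c - mean_c) * (l - mean_l) for c, l in zip(cycles, logs))
    sxx = sum((c - mean_c) ** 2 for c in cycles)
    return 10 ** (sxy / sxx) - 1.0


# Synthetic curve: list index = cycle number, true efficiency 0.9, plateau at 1.0.
true_efficiency = 0.9
curve = [min(1e-6 * (1.0 + true_efficiency) ** n, 1.0) for n in range(40)]

# Hypothetical window (cycles 12 to 20) lying inside the exponential phase.
estimated = efficiency_from_plot(curve, 12, 20)
print(round(estimated, 3))
```

Running the same fit on every well gives one efficiency estimate per sample, which can then be averaged per transcript or compared between groups.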