# Use of Cross Correlation Analysis to Estimate Phase and Periodicity in Common

Phase was assessed in data SET IV by running the program CROSSCO. This program uses a reference data set of known phase and amplitude as a base against which the phase of an experimental vector is measured. The algorithm is identical to standard autocorrelation, except that two data sets are compared rather than a single data set being compared with itself (7). If the data are in phase, there will be a peak at lag zero, although, because the two sets differ, it will likely not be unity as it always is in autocorrelograms. Displacement of the first peak from lag zero in either direction is a direct measure of the phase difference between the reference and experimental periodicities (see Note 12).

Figure 9 depicts the output of CROSSCO, comparing a reference data set with phase defined as zero and a data set that is phase-delayed by 6 h. All data had 80% noise added. The peak in correlation occurs at the lag corresponding to a 6-h difference. This analysis also illustrates what two rhythms with a periodicity in common look like when analyzed in this manner. The robust crosscorrelation, even in the presence of considerable noise, indicates that the periods of the two data sets are close (see Note 12).
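The lag-of-peak idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the CROSSCO source: the function names, the Gaussian noise level, the 20-day series length, and the sampling rate of two per hour are all assumptions chosen for the example.

```python
import math
import random


def cross_correlation(x, y, max_lag):
    """Return {lag: r} for lags -max_lag..max_lag (in samples).

    Both series are mean-centered and the sums are scaled so that two
    identical series give r = 1 at lag 0. A positive lag means that y is
    delayed relative to x by that many samples.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    scale = math.sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    ccf = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += dx[i] * dy[j]
        ccf[lag] = total / scale
    return ccf


def peak_lag(ccf):
    """Lag (in samples) at which the cross-correlation is largest."""
    return max(ccf, key=ccf.get)


def noisy_cosine(n, samples_per_hour, period_h, delay_h, noise_sd, rng):
    """Hypothetical test signal: a cosine delayed by delay_h plus Gaussian noise."""
    return [
        math.cos(2 * math.pi * (i / samples_per_hour - delay_h) / period_h)
        + rng.gauss(0.0, noise_sd)
        for i in range(n)
    ]


rng = random.Random(1)
SAMPLES_PER_HOUR = 2
reference = noisy_cosine(960, SAMPLES_PER_HOUR, 24.0, 0.0, 0.5, rng)
experimental = noisy_cosine(960, SAMPLES_PER_HOUR, 24.0, 6.0, 0.5, rng)

# Search lags within half a period either side of zero (12 h = 24 samples).
ccf = cross_correlation(reference, experimental, max_lag=24)
estimated_delay_h = peak_lag(ccf) / SAMPLES_PER_HOUR
print("estimated phase delay (h):", estimated_delay_h)
```

Restricting the search to half a period on either side of lag zero keeps the first peak unambiguous. Because both series carry noise, the peak value stays below unity and the estimated delay may fall a sample or two from the true 6 h.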

Fig. 9. (A) A short segment of the crosscorrelation of two data vectors differing in phase by 6 h. The periods in the two data sets are the same and both have 80% noise added. (B) The entire crosscorrelation shown to emphasize the periods held in common between the two vectors. If the periods were different, there would be a concomitant decay in the envelope of the function depending on the difference in the periods.

# 4. Notes

1. MATLAB is a numeric computation software package offered by The MathWorks in Natick, MA (www.mathworks.com). It can be configured to compile and run software written in other languages, in this case FORTRAN. Other systems have this capability, and the author and others have converted the programs here to run in these systems (8).

2. The programs employed here were run in their DOS window executable file format, as it was desired to maintain the maximum flexibility in producing the examples. In general, this would not be necessary for normal work. Except as noted, all programs were written in FORTRAN.

3. MESA has proven its worth in analysis of biological time series over a period of 20 yr. It was instrumental in uncovering ultradian periodicities in the behavioral rhythms in strains of D. melanogaster having no circadian rhythmicity, such as period01 (17,18). MESA was set to a very high resolution here, which might not be necessary for normal usage.

4. FILCON: This program takes the discrete Fourier transform of the data and may then be directed to zero out coefficients in a frequency range to be eliminated. Owing to the great sensitivity of MESA, FILCON was not always necessary here to demonstrate the periodicities, but in practice, the author has found it essential as a first step when trends or strong confounding rhythms are present.

5. The programs used to extract the spectral peaks and the RIs were written in Turbo Basic, but could be implemented in several languages. At their core is a simple bubble sort, an algorithm that orders the values of a set by magnitude. The RI program incorporates criteria to ensure that the proper peak is reported. If the autocorrelation function is sufficiently weak, the program reports arrhythmicity.

6. Heartbeat of Drosophila is in the range of 1 to 4 Hz and is monitored in several ways, including optically. The data are best presented as frequency rather than period (9-11).

7. Noise was added by incorporating the output of a white noise generator. The noise file is simply added to the output of the signal-generating function in a proportion that reflects the stated percentage and thus sets the signal-to-noise ratio.

8. The determination of significance in rhythmicity may be based on a number of methods. MESA lacks any inherent way of testing the significance of the peaks, as would be possible with the Fourier transform, but the strength of the system described here is that it uses an entirely different algorithm, the autocorrelation function, to assess significance. One may calculate a 95% confidence limit to apply to peaks in the autocorrelation function, namely $2/\sqrt{N}$, where $N$ is the number of data points; however, it is common simply to look for regularly recurring peaks in the correlogram to determine whether a genuine periodicity is present (7).

9. The author uses a batch plotting routine written specifically in MATLAB that accommodates this format for heartbeat (frequency output) data, but currently has none for circadian rhythms (period output).

10. The Butterworth is a recursive filter that can be configured in a high- or low-pass form. The one used here was a two-pole low-pass filter with a 3-dB cutoff period of 4 h at a sampling rate of two per hour (16). Recursive filters use a combination of raw and previously filtered data in computing the output. Two-pole filters have three coefficients in the formula. A 3-dB cutoff means a power reduction in the signal of 50% at the transition period, here 4 h. None of the output shown in the figures had been filtered first, owing to MESA's power, but normally the author looks at data both with and without filters, and actual rhythm data commonly are improved greatly by the process. This filter will induce a phase shift of approximately 4 h. If this is a problem, one simply runs the filter twice, sending the filtered data set through in reverse order on the second pass to cancel out the shift (13).

11. The sensitivity and resolution of MESA, coupled with the ability of FILCON to remove strong circadian rhythmicity, were essential to uncovering a circhoral rhythm in human core body temperature. Other methods, including the fast Fourier transform, failed to detect it despite the periodicity being clearly visible in the raw data plots (14).

12. When assessing phase in this manner with actual biological data, the test data set must have the same period as the experimental set. The periods of each experimental series are first estimated with MESA, and test sets of the same period are created using SIGGEN. The program automatically adjusts and normalizes amplitude before crosscorrelation is estimated. This method is particularly useful for very noisy and irregular data sets because it estimates phase based on the entire signal rather than just one identifiable phase marker, which by itself may not be reliable from cycle to cycle.

13. All programs written by the author are available free of charge by e-mail or FTP either as FORTRAN source code, executable files, or as files executable from MATLAB.
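The frequency-domain filtering described in Note 4 can be illustrated as follows. This is a sketch of the general technique (transform, zero a band of coefficients, transform back), not the FILCON code; the function names, the slow direct-summation transform, and the two-component test series are assumptions made for the example.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform by direct summation (slow but dependency-free)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Inverse transform; returns the real part, since the input data are real."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def remove_period_band(x, dt_hours, min_period_h, max_period_h):
    """Zero every coefficient whose period lies in [min_period_h, max_period_h]."""
    n = len(x)
    coeffs = dft(x)
    for k in range(1, n):  # k = 0 is the mean and is left untouched
        # Bins above n/2 mirror the bins below it, so fold them before
        # converting the bin index to a period.
        k_folded = min(k, n - k)
        period_h = n * dt_hours / k_folded
        if min_period_h <= period_h <= max_period_h:
            coeffs[k] = 0
    return inverse_dft(coeffs)


# Hypothetical series: a strong 24-h rhythm plus a weaker 4-h rhythm,
# sampled every 0.5 h for 96 h.
DT_HOURS = 0.5
N_POINTS = 192
series = [
    math.cos(2 * math.pi * i * DT_HOURS / 24.0)
    + 0.3 * math.cos(2 * math.pi * i * DT_HOURS / 4.0)
    for i in range(N_POINTS)
]

# Remove the circadian band (20 to 28 h), leaving the 4-h component.
filtered = remove_period_band(series, DT_HOURS, 20.0, 28.0)
```

In this example both periods divide the record length exactly, so each rhythm occupies a single frequency bin. With real data the power leaks into neighboring bins, and a wider band may need to be zeroed.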
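The 95% confidence limit mentioned in Note 8 can be applied to a correlogram as sketched below. The autocorrelation estimator, the function names, and the noisy 24-h test series are assumptions for illustration; the author's programs may normalize differently.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Autocorrelation for lags 0..max_lag, scaled so that lag 0 equals 1."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    denom = sum(v * v for v in d)
    return [
        sum(d[i] * d[i + lag] for i in range(n - lag)) / denom
        for lag in range(max_lag + 1)
    ]


def confidence_limit(n):
    """Approximate 95% confidence limit, 2 / sqrt(N), for N data points."""
    return 2.0 / math.sqrt(n)


# Hypothetical data: a 24-h cosine sampled twice per hour for 10 days,
# with Gaussian noise added.
rng = random.Random(2)
data = [
    math.cos(2 * math.pi * (i / 2) / 24.0) + rng.gauss(0.0, 0.5)
    for i in range(480)
]

acf = autocorrelation(data, max_lag=96)
limit = confidence_limit(len(data))

# Lags (in samples) whose correlation exceeds the limit in either direction.
significant_lags = [lag for lag in range(1, len(acf)) if abs(acf[lag]) > limit]
```

A rhythmic series shows peaks above the limit recurring at multiples of the period (here 48 samples, or 24 h), which is the pattern the note recommends looking for in the correlogram.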
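The two-pass filtering trick in Note 10 can be sketched with a textbook two-pole Butterworth low-pass filter designed by the bilinear transform. The coefficient formulas and function names here are assumptions and may differ from the filter in ref. 16. In particular, this form uses three feedforward and two feedback coefficients, and the size of its single-pass delay depends on the design and on the period of the signal.

```python
import math


def butterworth_lowpass_coefficients(cutoff_period_h, samples_per_hour):
    """Two-pole Butterworth low-pass (bilinear transform, Q = 1/sqrt(2)).

    Returns (b, a): three feedforward and two feedback coefficients,
    already divided by the leading feedback term.
    """
    w0 = 2 * math.pi / (cutoff_period_h * samples_per_hour)  # radians per sample
    alpha = math.sin(w0) / math.sqrt(2)  # sin(w0) / (2 Q) with Q = 1/sqrt(2)
    a0 = 1 + alpha
    half = (1 - math.cos(w0)) / 2
    b = [half / a0, 2 * half / a0, half / a0]
    a = [-2 * math.cos(w0) / a0, (1 - alpha) / a0]
    return b, a


def recursive_filter(x, b, a):
    """One pass: each output mixes raw inputs with previously filtered outputs."""
    y = []
    for n, xn in enumerate(x):
        x1 = x[n - 1] if n >= 1 else 0.0
        x2 = x[n - 2] if n >= 2 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(b[0] * xn + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2)
    return y


def zero_phase_filter(x, b, a):
    """Filter forward, then filter the reversed result, then restore the order."""
    forward = recursive_filter(x, b, a)
    backward = recursive_filter(forward[::-1], b, a)
    return backward[::-1]


# Cutoff period of 4 h at two samples per hour, as in the note.
b, a = butterworth_lowpass_coefficients(4.0, 2)

# Hypothetical 24-h cosine, 10 days at two samples per hour; peaks every 48 samples.
signal = [math.cos(2 * math.pi * (i / 2) / 24.0) for i in range(480)]
once = recursive_filter(signal, b, a)
twice = zero_phase_filter(signal, b, a)
```

After a single pass the peaks of `once` lag those of `signal`, whereas in `twice` the delay from the reverse pass cancels that from the forward pass. The cost is that the signal is attenuated twice, so the effective cutoff is somewhat sharper than that of a single pass.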