Analytical techniques

This volume demonstrates the use of a range of analytical tools, statistical and otherwise, as applied to the comparative method. Some analyses are largely conducted by argument and non-statistical means; others use simple measures, such as whether a particular feature is commonly associated with another, in order to establish possible cause-and-effect relationships. However, most use statistical techniques including bivariate regression and correlation, and a range of multivariate methods such as multiple regression, stepwise multiple regression, partial correlation, principal components analysis and clustering techniques. The choice of method is important because it can affect the results. This is partly because of the requirements of the statistical techniques themselves, and the need to understand fully how they behave and why they produce the patterns of significant and insignificant results that they do. For example, how sensitive is a particular technique to error variance in the data (see, e.g., Purvis and Webster, Chapter 3)? If the sample were slightly larger, or some of the data slightly different, how much difference could it make to the results obtained? Are some data points particularly influential on the results, and hence is it particularly important that these data are confirmed? Choice of method is also important because of the different models on which the various techniques are based. For example, if two variables together influence a feature, perhaps in some species one variable is more influential in producing species-specific variation whereas in other species the other variable is more important. In such a case, overall variation in the 'feature' may not show up as significantly related to either variable independently across the range of a particular sample.
Models used to test hypotheses, such as the one underlying multiple regression, commonly test only whether the variables are independently correlated with the feature concerned. In a case like this, however, the hypothesis would be supported even if the two variables are significantly correlated with the feature only when their joint effects are considered, which requires a different model. The maternal energy hypothesis for brain size, which is referred to by both Barton (Chapter 7) and Lee (Chapter 5), is a case in point. It proposes that brain size depends on maternal investment during gestation, as measured by a combination of the mother's basal metabolic rate and gestation length. An increase in either variable would increase maternal investment, and so their combined effects must be investigated to test the hypothesis fully.
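
The point about joint effects can be illustrated with a minimal Python sketch. All values below are invented for illustration and are not real species data. Treating maternal investment as the simple sum of the two variables is an assumption made here for simplicity, not the formulation used by the maternal energy hypothesis itself.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical values for six species (arbitrary units, not real data).
metabolic_rate = [1, 2, 3, 4, 5, 6]
gestation_length = [5, 6, 3, 4, 1, 2]
brain_size = [6.1, 7.9, 5.9, 8.1, 6.1, 7.9]

# Assumed combination: investment as the sum of the two variables.
investment = [m + g for m, g in zip(metabolic_rate, gestation_length)]

r_metabolic = pearson_r(metabolic_rate, brain_size)
r_gestation = pearson_r(gestation_length, brain_size)
r_joint = pearson_r(investment, brain_size)

print(f"brain size vs metabolic rate alone:   r = {r_metabolic:.2f}")
print(f"brain size vs gestation length alone: r = {r_gestation:.2f}")
print(f"brain size vs combined investment:    r = {r_joint:.2f}")
```

With these invented numbers, each variable taken alone is only weakly correlated with brain size (r of roughly 0.3), whereas the combined measure is almost perfectly correlated with it. A test that looked at each variable separately would miss a relationship that the hypothesis actually predicts.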

Several chapters, explicitly or implicitly, involve the use of cladistic methodology to determine the phylogenetic patterns of character variation. Robson-Brown (Chapter 2) discusses this in some detail and in particular provides an overview of what is conceptually required for such analyses. Using the example of sleep patterns, she demonstrates how cladistics can help to identify the relative importance of phylogenetic constraints and present ecological conditions on various sleep parameters. For example, she shows that total sleep times and the amount of paradoxical sleep do not easily change evolutionarily with changes in the security of species' sleeping conditions.

This is related to a longstanding concern in comparative biology, which is the potential problem of phylogenetic inertia; in other words, the possibility that species retain some features because of their ancestry, rather than these features being adaptations to present conditions. If this has occurred, species should not be treated as independent points in analyses seeking causal correlations or associations between factors.

A number of chapters use the method of phylogenetic contrasts, and Purvis and Webster (Chapter 3) provide a clear explanation of why such a method may be needed, and of its basic functioning as developed in its most commonly used form, Comparative Analysis by Independent Contrasts (CAIC). Mace and Holden (Chapter 15) discuss an alternative based on a maximum likelihood approach. The use of phylogenetic contrasts by CAIC has become de rigueur in many areas of comparative biology, but, like any other analytical technique, it needs to be used with care, with an appreciation of how it behaves and why it produces the results it does in any particular case. Plotting out results and examining which points are the important contributors to significant (or non-significant) findings can be very instructive. It can, for example, enable the identification of outliers that inspection of the raw data may show to result from poor or mistaken data, a weakly supported point on the phylogenetic tree used, an obvious grade effect, etc.
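
One simple way to examine which points contribute most to a result, alongside plotting, is to refit the line with each point left out in turn. The sketch below is a generic leave-one-out check written in Python, not a feature of CAIC. The data are invented, and the use of an ordinary least-squares slope with an intercept is an assumption made for illustration.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of y on x (with an intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def leave_one_out_slopes(xs, ys):
    """Slope obtained when each point in turn is dropped from the data."""
    return [
        ols_slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        for i in range(len(xs))
    ]

# Hypothetical data points; the last one is a deliberate outlier.
x = [1, 2, 3, 4, 10]
y = [1, 2, 3, 4, 1]

full_slope = ols_slope(x, y)
loo = leave_one_out_slopes(x, y)

# How far the slope moves when each point is removed.
shifts = [abs(s - full_slope) for s in loo]
most_influential = shifts.index(max(shifts))

print(f"slope with all points: {full_slope:.3f}")
for i, (s, d) in enumerate(zip(loo, shifts)):
    print(f"without point {i}: slope = {s:.3f} (shift {d:.3f})")
print(f"most influential point: index {most_influential}")
```

In this invented example a single point reverses the sign of the slope. A point like that is the kind whose underlying data, or position on the phylogenetic tree, most needs confirming.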

Purvis and Webster (Chapter 3) provide a welcome discussion of the criticisms made of CAIC. They accept some of the problems raised, and they reject others, including the potential problem of grade shifts. In interspecies analyses, which largely do not take the potential problems of phylogenetic inertia into account in a formalised manner, but which rely on biological knowledge and sensitive investigation of the data sets involved to ensure that results are properly founded, grade shifts certainly can be, and are, recognised (contra Purvis and Webster). On the other hand, grade shifts are not necessarily easy to identify using CAIC. For example, a grade shift may not be marked and hence not produce a clearly outlying contrast point. In any case, the size of the contrast measurements between species points on either side of a grade shift is a meaningless combination of the effects of the grade shift and the relationship within any one grade between the two variables concerned. This combination may produce pairs of contrast values that stand out as outlying points on a contrast plot, or it may not, and hence a grade effect would be missed. Also, a grade shift may involve multiple points in a phylogeny. Whilst there is a single phylogenetic link between strepsirhines and haplorhines, there are multiple links on the primate phylogeny between insectivores and frugivores, for example. Therefore, if species in the different dietary categories form grades for the variables under consideration, multiple points on the contrast plot should be removed from analyses, but they may well be difficult, or impossible, to distinguish.
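
The way a contrast across a grade shift mixes two effects can be shown with simple arithmetic. In the Python sketch below, the within-grade slope, the size of the grade offset and the species values are all hypothetical, and the two grades are assumed to share the same slope.

```python
# Hypothetical parameters: both grades share one slope, but the second
# grade is displaced upwards by a fixed offset.
WITHIN_GRADE_SLOPE = 0.5
GRADE_OFFSET = 2.0

def trait_y(x, grade):
    """Value of y for a species with value x in grade 0 or grade 1."""
    return WITHIN_GRADE_SLOPE * x + GRADE_OFFSET * grade

def contrast_slope(x1, grade1, x2, grade2):
    """Slope implied by the contrast between two sister taxa."""
    dy = trait_y(x2, grade2) - trait_y(x1, grade1)
    dx = x2 - x1
    return dy / dx

# Contrast between two taxa in the same grade: recovers the true slope.
within = contrast_slope(2.0, 0, 3.0, 0)

# Contrasts across the grade shift: slope effect plus offset, divided by
# whatever difference in x the two taxa happen to have.
across_wide = contrast_slope(2.0, 0, 3.0, 1)
across_narrow = contrast_slope(2.0, 0, 2.1, 1)

print(f"within-grade contrast slope:        {within:.2f}")
print(f"across-grade contrast (dx = 1.0):   {across_wide:.2f}")
print(f"across-grade contrast (dx = 0.1):   {across_narrow:.2f}")
```

The across-grade values depend on how different the two taxa happen to be in x, so the same grade shift may produce a conspicuous outlier on a contrast plot or one that blends in with the rest.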

The phylogenetic tree used in analyses by phylogenetic contrasts is fundamental to the resultant findings. Although CAIC provides techniques for dealing with incomplete phylogenies (see Chapter 3), complete phylogenies or completed parts of the phylogenies used may not be correct. The effects of at least known 'weak links' should be tested, especially if the contrast points for these links are particularly influential on the overall results. The phylogenetic tree chosen for analyses can easily be a significant factor in the results produced. Mace and Holden (Chapter 15), whilst not specifically referring to CAIC, describe the need to test whether results of phylogenetic methods are dependent on a particular phylogeny or 'model of history'.

The use of a regression model in phylogenetic contrasts by CAIC, as Purvis and Webster (Chapter 3) mention, is also potentially problematic. When other line-fitting techniques, which recognise the potential for error in both variables, were first used for comparative species analyses, some significantly different results were produced, affecting the acceptance or dismissal of hypotheses when correlation levels were not especially high. This problem at least needs to be borne in mind by the users of CAIC until the problems associated with incorporating more suitable line-fitting techniques are solved.
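
The sensitivity of the fitted slope to the choice of line-fitting technique can be sketched as follows. The text does not name a specific alternative; reduced major axis is used here only as one familiar example of a technique that allows for error in both variables, and the data are invented.

```python
from math import sqrt, copysign

def line_fits(xs, ys):
    """Return (OLS slope, reduced major axis slope, correlation r)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / sqrt(sxx * syy)
    ols = sxy / sxx                       # assumes all error is in y
    rma = copysign(sqrt(syy / sxx), sxy)  # ratio of standard deviations
    return ols, rma, r

# Hypothetical data with a moderate correlation.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]

ols, rma, r = line_fits(x, y)
print(f"r = {r:.3f}, OLS slope = {ols:.3f}, RMA slope = {rma:.3f}")
```

The reduced major axis slope equals the least-squares slope divided by the absolute value of r, so the two estimates agree when the correlation is perfect and diverge as it weakens. That is the situation in which the choice of technique can decide whether a predicted slope is accepted or rejected.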

Purvis and Webster (Chapter 3) discuss at some length one particular problem of phylogenetic contrasts by CAIC, that it is especially sensitive to error variance in the data. This, they explain, is expected to be greater at lower phylogenetic levels, or younger nodes on a phylogenetic tree. One means of trying to avoid this potential problem is to limit analyses to older nodes or contrasts at higher phylogenetic levels, as both Ross and Jones, and Barton (Chapters 4 and 7) do. However, this inevitably reduces sample sizes and wastes potentially useful data. Also, in most cases, the raw data for analyses only come from extant species, and all other data points on the phylogenetic tree are estimated from these. The older a node, the more estimation is involved in establishing values for the variables concerned, and the more likely the node itself is affected by inaccurate phylogenetic reconstruction. In trying to overcome a problem inherent in a particular analytical method, it does not seem wise to place too great reliance on what may be unreliable data estimations, or at least on estimations whose accuracy is unknown. CAIC calculates variable values at ancestral nodes by simple averaging of the values of descendants. However, this may be far from accurate. For example, fossil evidence shows us that the ancestor of chimps and humans probably had a relative brain size close to that of chimps, and much smaller than that of humans, but, as Foley (Chapter 14) points out, CAIC would reconstruct its value very inaccurately as the average of the two extant species. Evolutionary rates of change can be very different in different lineages, even closely related ones, and models that assume otherwise introduce an error factor of often unknown magnitude.
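
The effect of estimating ancestral values by averaging can be sketched in a few lines of Python. The sketch takes the description above (simple averaging of descendants) at face value and is not CAIC's own code. The tree shape, the relative brain size values and the chimp-like 'true' ancestral value are all hypothetical numbers chosen to mirror the example in the text.

```python
def reconstruct(tree, tip_values, estimates):
    """Estimate node values by unweighted averaging of the two descendants.

    A tree is either a species name (a tip) or a pair of subtrees.
    Estimates for internal nodes are stored in `estimates`, keyed by subtree.
    """
    if isinstance(tree, str):
        return tip_values[tree]
    left, right = tree
    value = (reconstruct(left, tip_values, estimates)
             + reconstruct(right, tip_values, estimates)) / 2
    estimates[tree] = value
    return value

# Hypothetical relative brain sizes (arbitrary units, not real data).
tip_values = {"human": 3.0, "chimp": 1.0, "gorilla": 1.0}
tree = (("human", "chimp"), "gorilla")

estimates = {}
root_value = reconstruct(tree, tip_values, estimates)
ancestor_estimate = estimates[("human", "chimp")]

# Assumed 'true' ancestral value, close to the chimp value as the
# fossil evidence mentioned in the text suggests.
assumed_true_ancestor = 1.0
error = ancestor_estimate - assumed_true_ancestor

print(f"estimated human-chimp ancestor: {ancestor_estimate}")
print(f"assumed true value: {assumed_true_ancestor} (error {error})")
print(f"estimated root value: {root_value}")
```

Because the overestimate at the human-chimp node is itself averaged into the next node down, the error propagates to older nodes, which is one reason why restricting analyses to older nodes does not remove the problem.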

As a means of trying to overcome the potential problem of phylogenetic inertia in comparative analyses in biology, the method of phylogenetic contrasts has a clear logical basis: it is a form of the Method of Residues. However, important problems remain with its implementation. At the very least, great care needs to be taken to examine the possible effects of such problems on any analysis undertaken, although in some cases, as outlined above, these are unknown and unknowable. As Purvis and Webster state, 'phylogenetic comparative methods [are] not ... black boxes; an understanding of the methods permits informed choice of which is most suitable for the available data, and an assessment of whether the data and phylogenetic estimates are indeed adequate for good tests of hypotheses' (p. 65). Similar provisos, with the specific suggestion that plots of results should always be inspected carefully, stand as good advice whatever the analytical technique used.
