BIAS IN MENTAL TESTING is an extraordinary work. It is extraordinary for the Herculean labor it represents, for the thoroughness and ingenuity of its statistical analyses, for the importance of the new knowledge it presents -- and, regrettably, for the narrowness of the scientific and social perspective from which the analyses are conducted and the new knowledge is interpreted.
The author of this monumental study, Professor Arthur Jensen, is known as the chief proponent of the theory that there are inherited differences in intelligence among races. But in this work he carefully defines a more limited area of investigation. The main question, indeed the only question, that Jensen undertakes to answer is this: Are mental tests in general, and IQ tests in particular, "culturally biased so as to discriminate unfairly against racial and ethnic minorities or persons of low socioeconomic status"?
Jensen's narrowly defined task nevertheless leads him to a broad scientific conclusion. Here we encounter the most extraordinary feature of the book. Jensen never states his principal conclusion, but leaves it to be inferred by the reader. If one accepts Jensen's restricted scientific perspective, the inference is inescapable and, in the absence of explicit warnings to the contrary, follows logically and smoothly from the massive material provided. But since that material is cast within the same narrow scientific frame, it fails to include the issues and evidence that present the most critical challenge to his position. I shall consider that evidence later in this review.
Although one cannot quote Jensen's conclusion in his own words, it was already being widely disseminated by the mass media well before the book reached the public:
"The implications of Jensen's book are grim. His work suggests that attempts to raise the educational success rate of all black youngsters to parity with whites are ultimately doomed to fall short." (Newsweek, Jan. 14, 1980)
"Jensen's findings clearly have horrendous implications. Indeed they come close to saying that blacks are a natural and permanent underclass." (Time, Sept. 14, 1979)
What line of argument and evidence provokes such statements? The line begins, and ends, with Jensen's stated purpose: to investigate cultural bias in mental tests.
Cultural bias is the criticism most frequently leveled against intelligence tests both by scientists and by the general public. It was also the principal criticism directed at Jensen's much-publicized work of a decade ago, in which he claimed that the commonly found 15-point discrepancy in IQ between whites and blacks was primarily a function of differences in genetic endowment. In the present volume, Jensen conspicuously avoids taking any explicit position on the controversial issue he raised. Repeatedly, he reminds the reader that his sole aim is to investigate test bias as a possible factor in producing racial difference in IQ. In Jensen's view, the charge that intelligence tests are unfair to blacks is unequivocally refuted by the evidence he presents; therefore, he argues, the substantial discrepancy in IQ score between the races reflects real differences in the mental ability of blacks and whites.
The seeming straightforwardness of this argument masks a series of unstated assumptions and unexamined data that are critical to the soundness of Jensen's conclusions. Throughout the book Jensen is concerned only with the potential for bias in the methods through which test results are obtained. He does not consider at all the possibility of bias in the interpretation of test findings; that is, bias in determining what is in fact measured by the observed difference in IQ score between blacks and whites.
Jensen's entire argument is based on this hidden issue of interpretation. If one digs beneath the surface one discovers that the structure is built on shifting ground. As a result, Jensen's principal conclusions are unsupportable.
Before pursuing this fundamental issue, it is important to present Jensen's evidence and argument in his own terms. He begins by setting forth three objective criteria that he views as necessary and sufficient for detecting test bias. Specifically, mental tests are culturally biased if it can be shown that:
1. The observed racial difference is a result of factors existing in the test situation, such as the race, attitudes or dialect of the examiner, individual versus group administration, or differing experience in the taking of tests.
2. A given IQ score predicts different levels of school or job performance for whites versus blacks.
3. The items producing variation in IQ are different for the two races, with some being more familiar to one race than the other because the content is culture-specific; in other words, the questions are drawn from one culture but not the other.
Turning to the research literature, Jensen then presents extensive evidence to demonstrate that none of these invalidating conditions exists in fact. Studies of situational factors, such as race of examiner, have failed to show any consistent effects. To the extent that IQ tests reveal any bias in predicting school or job performance, the difference is opposite in direction to that claimed by critics of mental testing; instead of underestimating the future achievement of blacks, the intelligence tests predicted a higher level of performance than was actually attained. In other words, such bias as existed in the test tended to favor blacks rather than penalize them. Finally, the largest white-black differences were found not on items criticized as culture-specific but on those judged to be least dependent on cultural content.
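The second criterion, and the direction of the bias reported here, can be made concrete with a small sketch. It assumes one common way of checking predictive bias: fit a single regression line of a performance measure on test score for everyone, then look at the average prediction error in each group. A negative average for a group means the common line predicts more than that group actually attained. The group labels, scores, and function names below are hypothetical illustrations, not data or procedures taken from the book.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def mean_residual_by_group(records):
    """records: list of (group, test_score, performance) tuples.

    Fits one common line to all records, then averages
    (actual - predicted) performance within each group.
    """
    intercept, slope = fit_line([r[1] for r in records],
                                [r[2] for r in records])
    residuals = {}
    for group, score, outcome in records:
        predicted = intercept + slope * score
        residuals.setdefault(group, []).append(outcome - predicted)
    return {group: mean(values) for group, values in residuals.items()}

# Hypothetical illustration data, not drawn from any study.
records = [
    ("A", 90, 2.0), ("A", 100, 2.5), ("A", 110, 3.0),
    ("B", 90, 1.6), ("B", 100, 2.1), ("B", 110, 2.6),
]
result = mean_residual_by_group(records)
# Group B's mean residual is negative: the common line over-predicts
# its performance; group A's is positive (under-prediction).
```

Such a check says only that the same score predicts different outcomes in the two groups; it is silent on why the outcomes differ.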
Jensen accounts for this last surprising finding by explaining that the more culture-bound items are less highly loaded on what is called the g or "general factor" in intelligence. This factor is identified through a statistical technique, known as factor analysis, that can be applied to test results in order to determine the extent to which various items cluster together. Such clusters are presumed to reflect the structure of intelligence. Most factor analyses of mental test data reveal the presence of one large cluster, referred to as g. Jensen cites evidence to show that it is this general factor, which he describes as "mainly reasoning ability," that accounts for most of the variation in IQ both within races and between them. He then proceeds to demonstrate that the results of mental tests show the same factorial structure in both racial groups. Furthermore, the closest correspondence in patterns is seen when black children are compared not with whites of their own age, but with what Jensen calls a "pseudorace" -- white children who are two years younger.
On the basis of all the foregoing evidence, Jensen concludes that the 15-point difference in IQ is not a function of mental test bias but reflects a "general cognitive developmental lag" -- a slower mental development among blacks.
In Jensen's view, there is a redeeming feature to this disturbing state of affairs. Despite the developmental lag, there is considerable overlap in the IQ scores of the two races, with a substantial number of blacks scoring above the white average. This fact, Jensen believes, accords special importance to the use of mental tests, now proved unbiased in his view. While cautioning against the indiscriminate application of IQ tests in the schools, Jensen strongly urges their careful use for educational and job selection. The final sentence in his book reads: "Whatever may be the cause of group differences that remain after test bias has been eliminated, the practical application of sound psychometrics [the science of mental measurement] can help reinforce the democratic ideal of treating every person according to the person's individual characteristics, rather than according to his or her sex, race, social class, religion, or national origin."
EXPLAINING THE 'DEVELOPMENTAL LAG'
It is clear, however, that even in the ideal democratic world Jensen envisions, he still expects whites and blacks to differ in intelligence. And if no other measures are to be taken beyond "the practical application of sound psychometrics," what Jensen calls a "cognitive developmental lag" will surely continue to exist.
It is therefore important to consider the precise nature of this lag. On what is it based, how stable is it, and to what extent is it susceptible to environmental influence or intervention? In short, what does the IQ difference measure? Curiously, Jensen does not address these critical questions in the discussion of his findings, but answers are scattered throughout his first nine chapters, which, in effect, constitute a comprehensive textbook of psychometrics.
The answers that Jensen provides are not encouraging. For example, on the nature and basis of g, or the general factor in intelligence, Jensen cites what he views as a classic formulation by psychologist Cyril Burt: "It is the general character of the individual's brain tissue -- viz., the general degree of systematic complexity in the neural architecture -- that seems to represent the general factor." In effect, Burt and Jensen see the g factor as physiological, hence inherited rather than developed.
As for IQ itself, Jensen reminds the reader on several occasions that most of the variation in human intelligence is due to genetic endowment. (According to him, 70 to 80 percent.) What does this mean for the 15-point difference in IQ between blacks and whites? Though left unstated, the implication is clear enough, especially in the absence of any statements to the contrary. Namely, the race difference in IQ has some genetic basis.
NATURE vs. NURTURE
Given the thoroughness with which Jensen usually treats data on race differences in mental test performance, it is noteworthy that he never examines the question of the stability of these differences over time or place. Has their magnitude remained constant over the years? Does it change over the course of the lifespan? Are race differences the same in different regions, in segregated versus unsegregated schools? Ultimately, can these differences be eliminated?
The only information Jensen presents on the issue of changes over time is indirect and deals with the constancy of IQ in general: "The stability of mental test scores during the period from infancy to maturity is roughly comparable to the stability of height and weight measures. . . . Mental test scores become increasingly stable over any given time interval as the chronological age of the subjects increases, from infancy to adulthood."
Even more striking is the contrasting treatment Jensen gives to genetic versus environmental influences. Whereas the former are mentioned repeatedly and receive extended discussion, explicit reference to environmentally-induced differences in IQ is limited to a single paragraph of four lines on page 285.
The only concrete data Jensen presents on environmental influences appear in his own analysis of race and social class differences in IQ scores obtained from a sample of 1,200 white and black children in California schools. The relevant findings are as follows: "The overall IQ difference between whites and blacks is 15 points. Whites and blacks of the same SES [socioeconomic status] differ by 12 points." In other words, controlling for social class reduces the racial difference in IQ by only 3 points.
Although Jensen never draws the connection explicitly, he presents in the first, introductory half of his book the context for a rather unequivocal interpretation of the scientific and social significance of his subsequent findings; namely, the "general cognitive developmental lag" revealed by what he regards as unbiased mental tests is enduring, probably hereditary in origin, and, in any event, not very susceptible to environmental influence or intervention. In other words, probably for genetic reasons, blacks are less intelligent than whites, and the difference is here to stay.
EXAMINING JENSEN'S ASSUMPTIONS
Under these delicate circumstances, it is especially important to examine the assumptions underlying Jensen's explicit argument. To begin with, what are the external criteria on which he relies to check on cultural bias? A substantial number of the studies he cites (but by no means all of them) employ grades in school and job performance ratings by employers and supervisors to measure results. Thus any conclusions about the absence of test bias in these studies rest on the assumption that, in judging performance, teachers and supervisors do not discriminate by race. If such prejudice were in fact present, then the lower level of performance found in blacks could reflect, in significant degree, the continuing operation of prejudice across a series of life settings -- in the street, in the school, and on the job. Moreover, Jensen's finding that blacks did not do as well in school and on the job as would have been expected from their IQ scores is consistent with this interpretation, since it would suggest that, under existing environmental and social conditions, blacks are not as able to realize their demonstrated intellectual potential as are whites.
It is true that other studies cited by Jensen used objective measures of performance based on standardized achievement tests or simulated work problems. But to accept these as unbiased criteria, one would have to assume that how well a child or adult does on a test is unaffected by any racial discrimination on the part of teachers or supervisors. It is noteworthy in this regard that the tendency for blacks to do more poorly on such tests than predicted by their IQ level was apparent even in those articles in which objective methods of measuring results had been used.
ENVIRONMENTAL INFLUENCES ON IQ
On the issue of the comparative stability of the IQ and its resistance to environmental influence, it is instructive to examine some of the instances of marked change in IQ (to which Jensen does not assign much importance, presumably because of their relative infrequency). For example, in an especially well-designed study ["IQ Test Performance of Black Children Adopted by White Families," in the October, 1976 American Psychologist], Sandra Scarr found that the average IQ of black children who had been adopted by advantaged white families was 16 points higher than the average score achieved by disadvantaged black children reared in their own homes in the same geographical area. Scarr took special pains to avoid possible bias in the selection of children for adoption on the basis of their intellectual ability or background of the biological parents. Note that the gain attributable to a more advantaged environment corresponds in magnitude to the typical discrepancy in IQ between races. It is also the gain in IQ achieved in the first years of a number of early intervention projects conducted with children from low-income families.
An example of a more modest rise in IQ on a far grander scale is referred to by Jensen himself. When the Stanford-Binet IQ test was restandardized with a new national sample in 1972, the average IQ, computed on the basis of the previous 1939 norms, showed an appreciable rise for every age group, with the largest increase (10 IQ points) occurring among the younger children. According to Jensen, the authors of both versions of the test explain the upward shift in terms of "differences in cultural background between the 1930s and the 1970s." The 1972 sample was also the first to include blacks and Spanish-surname children in proportion to their numbers in the larger population. The question naturally arises of the relative contribution of blacks and whites to the observed rise in overall score, but Jensen does not consider the issue.
Unmentioned by Jensen is a more recent national trend in the opposite direction. I refer to the much publicized steady decline over the past 15 years in national mean scores on the SAT, the Scholastic Aptitude Test, taken by the overwhelming majority of high school students desiring to go to college. (The trend is paralleled by a progressive drop in achievement test scores for all elementary and high school pupils.) The decline is strongly manifested even after controls are introduced for the social and ethnic background of pupils taking the test.
Yet Jensen describes the SAT as a measure of mental ability that "undoubtedly" has a very high loading on the g factor. Since such marked shifts over time in average levels of mental ability can hardly be interpreted as indicating rise or decline in the quality of the national gene pool, the variations must be attributed to changing environmental conditions.
CYRIL BURT AND TWINS
But how can the environment be so powerful given Jensen's claim that at least 70 percent of mental ability is genetically determined? First, more recent large-scale studies (not cited by Jensen) indicate that his estimates of heritability are too high. (See, for example, Christopher Jencks' Inequality: A Reassessment of the Effect of Family and Schooling in America.) Second, there is no single heritability estimate applicable to human beings under all circumstances; its value depends on the environment in which it is measured. For example, the estimates of genetic inheritance are derived primarily from studies of twins. The figure of 70 percent is not unreasonable for the situation in which children of the same age are brought up in the same home, live in the same neighborhood, typically have the same teachers, and are members of the same peer groups. But what about children brought up in different homes, such as identical twins reared apart? Here estimates as high as 70 percent are obtainable only if one assumes that the environments into which the twins are separated are completely unrelated, for example, that twins do not end up in homes that are similar in social class level.
The only evidence clearly in support of this assumption was reported by Sir Cyril Burt, whose work has since been shown to have been based on data that were highly questionable, if not actually fabricated. Finally, and this is a basic point in genetics that Jensen surely knows: The level of heritability within groups has no necessary bearing whatever on differences between them. For example, height has a heritability coefficient of 90 percent; yet children of immigrants are frequently taller than their parents because of improved nutrition.
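The point that heritability within groups has no necessary bearing on differences between them can be illustrated with a minimal simulation. It assumes a deliberately simple additive model (trait = genetic value + environmental value), with variances chosen so that within-group heritability comes out near the 90 percent quoted for height. All names, units, and parameter values are hypothetical and serve only to illustrate the logic.

```python
import random
from statistics import mean, pvariance

def simulate_group(n, env_shift, rng, genetic_sd=3.0, env_sd=1.0):
    """Return (genetic, phenotype) lists for one group.

    phenotype = genetic value + group-wide environmental shift
                + individual environmental noise.
    """
    genetic = [rng.gauss(0.0, genetic_sd) for _ in range(n)]
    phenotype = [g + env_shift + rng.gauss(0.0, env_sd) for g in genetic]
    return genetic, phenotype

def within_group_heritability(genetic, phenotype):
    """Share of the group's trait variance due to the genetic component."""
    return pvariance(genetic) / pvariance(phenotype)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Both groups draw genetic values from the SAME distribution; only the
# shared environmental shift differs (hypothetical units).
g1, p1 = simulate_group(20000, env_shift=0.0, rng=rng)
g2, p2 = simulate_group(20000, env_shift=-3.0, rng=rng)

h2_group1 = within_group_heritability(g1, p1)  # close to 0.9
h2_group2 = within_group_heritability(g2, p2)  # close to 0.9
phenotype_gap = mean(p1) - mean(p2)            # close to 3.0
genetic_gap = mean(g1) - mean(g2)              # close to 0.0
```

In this sketch the trait is about 90 percent heritable inside each group, yet the whole of the gap between the group averages comes from the environmental shift, just as in the case of immigrants' children and nutrition.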
In light of the foregoing considerations, it would appear that Jensen's interpretation of the 15-point IQ difference between whites and blacks as evidence of a two-year lag in the development of black intelligence rests on a simplistic conception of the environment and its possible role in influencing psychological growth.
Specifically, he considers environmental measures only of the crudest sort, such as indices of socioeconomic status based only on fathers' occupation without regard for family income or parents' education. Moreover, his analysis assumes that black and white lower-class families are equally disadvantaged. Jensen also fails to take into consideration changes in environmental conditions over time and place. For example, he does not examine the possibility that the IQ difference between whites and blacks may have decreased over recent decades, during which many, though certainly not all, legal, educational and social barriers to racial integration have been removed. Nor has he explored whether the prediction of educational outcomes from IQ scores might vary from school to school or from neighborhood to neighborhood.
In an interview reported on Jan. 7, 1980 in The New York Times, Jensen is quoted as saying that he finds it difficult to conceive of an environmental explanation for differences in the mental test scores of whites and blacks. This is precisely the problem with his analysis and his conclusions, for in his work he never asks what might be the concrete environmental factors that influence the course of human development.
I have recently reviewed the existing research evidence on this topic in my book The Ecology of Human Development. The available scientific material points to two kinds of conditions that appear to be especially important for fostering psychological growth in all its aspects -- intellectual, emotional and social. The first pertains to the child's immediate environment: A major factor affecting development is the child's involvement in progressively more complex activities with adults toward whom the child has developed feelings of trust and affection. Examples of such activities include answering the child's questions, reading to the child, playing games together, or -- at later ages -- working together on practical problems that come up in everyday life, or discussing past and future experiences.
The second set of conditions refers to the broader social world in which the family lives: The child's development depends on the extent to which the family has access to health and social services, material resources, information, informal social supports, sources of power and influence in the community and just plain time, all necessary for the family to be able to function effectively in its child-rearing role.
Jensen's conclusion that blacks exhibit a developmental lag that is here to stay is tenable only if one disregards or denies the importance of the two types of environmental conditions described above.
The narrowness of perspective is also evident in Jensen's conception of human intelligence. His view is limited to the kinds of processes that can be measured by standardized mental tests, and even there he focuses almost exclusively on differences of degree in a single general factor rather than in multi-factor profiles. And he gives no recognition at all to the possibility that the intellectual ability of individuals or groups might be manifested in other domains, such as the richness and variety of linguistic behavior, or the complexity of the social relationships in which people engage.
RESEARCH AND RESPONSIBILITY
Nevertheless, we owe Jensen a debt for a contribution to scientific knowledge, for he has marshaled convincing evidence that mental structure and content, at least as revealed by intelligence tests, are indistinguishable in blacks and whites. There is no need to seek genetic explanations for differing patterns since there are no differences in factor structure to be explained. Blacks and whites appear to use their brains in the same way, but under the contrasting circumstances in which they live one group is able to function more effectively than the other. Moreover, since it is generally agreed that intelligence has a polygenic basis, it is difficult to imagine how a multitude of genetic factors could operate to produce a mean difference in overall intellectual level without some accompanying variation in form and content. A more parsimonious explanation for the observed racial difference in IQ is in terms of environmental opportunity, particularly with reference to the second set of social conditions mentioned above, which are known to be unequally distributed by race in contemporary American society. Infant mortality among blacks is 22 percent as compared to 12 percent for whites; 28 percent of all blacks live below the poverty level, but only 7 percent of whites do. And black income when compared to that of whites is actually declining: figures show that in 1970 the median income for black families was $10,500 and $17,200 for white families, while in 1978 those incomes, after adjustment to the lower cost of living in 1970, were $10,900 and $18,400 respectively. Thus the discrepancy in average IQ of blacks and whites may reflect their contrasting conditions of life rather than differences in the mental ability of the two racial groups.
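The claim that black income is declining relative to white income, even though both medians rose, can be checked directly from the figures quoted above. The short sketch below uses only those four numbers; the variable and function names are illustrative.

```python
# Median family incomes quoted in the text; the 1978 figures are
# already adjusted to the 1970 cost of living.
median_income = {
    1970: {"black": 10_500, "white": 17_200},
    1978: {"black": 10_900, "white": 18_400},
}

def black_white_ratio(year):
    """Black median family income as a fraction of the white median."""
    row = median_income[year]
    return row["black"] / row["white"]

def dollar_gap(year):
    """White median minus black median, in 1970 dollars."""
    row = median_income[year]
    return row["white"] - row["black"]

ratios = {year: round(black_white_ratio(year), 3) for year in median_income}
gaps = {year: dollar_gap(year) for year in median_income}
# ratios: about 0.610 in 1970 and 0.592 in 1978
# gaps:   6,700 dollars in 1970 and 7,500 dollars in 1978
```

On these figures the black median fell from about 61 percent to about 59 percent of the white median, while the dollar gap widened by $800.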
A concrete example may be useful here, one that turns Jensen's somewhat invidious concept of "pseudorace" back on his own argument. Let us consider two white populations, similar in all other respects, except that one has, for several generations, had access to adequate health services, whereas the other has not. Thus only the first group will have received needed prenatal and birth care, immunizations, antibiotics, surgical treatment, etc. What might be the effect on the development of mental ability in the two groups? Clearly, there will be individuals within each population who will be more intelligent than others, and thereby score higher on IQ tests and do better in school and on the job. But in the second group the enervating effects, both on children and adults, of frequent illness -- such as fatigue, inefficiency, and prolonged absences from school -- as well as the impact of uncorrected handicaps, would interfere with learning, impair psychological development, and thereby result in lower IQ scores. A statistical analysis of the scores for the two groups might be expected to produce a pattern closely resembling that obtained by Jensen for blacks and whites, namely, similarity in the structure and content of mental abilities, but a marked difference in the functional level of intellectual performance.
The example is not completely hypothetical. Studies of groups receiving differential levels of health care have indeed revealed differences in developmental status and course. Under such circumstances, differences in IQ between groups of human beings, including different races, reflect variation not in mental ability but in the quality of the environment in which they live their lives.
It is instructive to be reminded that on intelligence tests administered to recruits in World War I, as well as to school children during the same period, the average IQ for the Italian groups was consistently in the 80s compared to 100 for so-called "American" control groups. For two decades afterwards widely used educational texts continued to interpret these test scores as reflecting "the difference in the mentality of the two races." Had these same results passed all three of Jensen's scientific criteria, would we in the light of our present knowledge be prepared to regard those tests as free from cultural bias?
In sum, Jensen's analysis in the present volume is based on data, theory and policy of the status quo without any serious allowance for, or consideration of, possibilities of change. As such, the analysis is incomplete, one-sided and misleading both in its stated and, especially, its unspoken conclusions. Jensen cannot escape responsibility for the latter, since they follow logically from his argument and have been stated quite explicitly in his earlier writings. The responsibility is a heavy one. It would be very unfortunate, unjustified, but not at all unlikely if Jensen's conclusions from his present work were to help perpetuate or even aggravate present inequities by discouraging further scientific and programmatic efforts exploring the potential of environmental change to enhance the development of human beings in all segments of American society. Such efforts include food stamps, health care, youth employment, Head Start and the basic research on which such programs are based. Alas, many other forces are already operating to undermine such endeavors so desperately needed in a time of mounting socioeconomic stress for families.
But there is one area in which a scientist carries complete responsibility: namely, his scholarly work. If it is necessary to insist, as Jensen does, that strict scientific criteria be applied before a test is adjudged free from bias, how demanding must the scientific criteria be for concluding that one race is intellectually inferior to another?