Assessing/Testing for Giftedness:
Dr. Linda Silverman also highly recommends the use of tests with higher ceilings to reduce ceiling effects. A ceiling effect occurs when a student obtains one or more subtest scores at the 99th percentile; in that case, the composite score is actually well below the student's capabilities. She discusses her findings in this article, found on the Gifted Development Center's website:
Assessment of Gifted Children
Linda Kreger Silverman, Ph.D.
Gifted Development Center
There is more perplexity in the assessment of gifted children than in the assessment of any other population, due to surprising discrepancies in the IQ scores they attain on various tests. Average children and developmentally delayed children usually obtain consistent IQ scores on different instruments. However, a profoundly gifted child can score 120 on one IQ test and 220 on another, a difference of 100 points! While the higher score is the better estimate of the child's abilities, too often it makes people uncomfortable and so it is dismissed as "inflated."
The major problem encountered in assessing the gifted is ceiling effects. Most people are unaware of the extent to which low test ceilings can depress IQ scores in the gifted range. Ceiling effects occur when the child's knowledge goes beyond the limits of the test. In order to assess the full strength of a gifted child's abilities, test items must be of sufficient difficulty. Imagine trying to measure a six-foot person with a five-foot ruler (Stanley, 1990). The magnitude of the problem increases with age: the older the child, the more likely he or she is to outstrip the capacity of the measurement tool.
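The ruler analogy can be made concrete with a few lines of Python. This is only an illustrative sketch: the function name `observed_score` and the sample heights are assumptions introduced here, not part of Stanley's or Silverman's work. It shows that any measurement tool with a fixed maximum reports the same value for everyone at or above that maximum.

```python
def observed_score(true_level: float, test_ceiling: float) -> float:
    """Return the value a capped instrument can report for a given true level."""
    return min(true_level, test_ceiling)


# A five-foot ruler is 60 inches long (hypothetical sample heights, in inches).
RULER_INCHES = 60
heights = [48, 58, 72, 78]

readings = [observed_score(h, RULER_INCHES) for h in heights]
print(readings)  # [48, 58, 60, 60]
```

The two tallest people receive identical readings even though they differ by six inches, which is the sense in which a low ceiling hides differences among the highest scorers.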
Ceilings vary on different types of tests. Grade-level group achievement tests have low ceilings, and individual IQ tests, particularly those that go up to adult levels, like the Stanford-Binet scales, have higher ceilings. Educators often think the depressed test scores of gifted children are accurate measures of their abilities because they have no opportunity to observe what the students are actually capable of doing. Classrooms also have ceiling effects. Gifted children often know more than the teacher is teaching or classroom tests are testing, and they have no chance to display their advanced knowledge. The antidote to ceiling effects is the opportunity to demonstrate advanced problem-solving abilities.
The Talent Searches provide an excellent view of what happens when we remove ceiling effects. In Talent Search programs, middle school students who achieve at the 95th (or 97th) percentile in grade-level reading or mathematics achievement tests are allowed to take college board examinations (SAT-1 or ACT). College board exams were designed to differentiate among capable high school seniors for college placement purposes. When such a difficult examination is given to 12- and 13-year-olds, those who appear on achievement tests to be similar in abilities are discovered to have vastly different levels of ability. For example, two students with 97th percentile scores in math achievement may obtain scores anywhere between 200 (lowest score possible) and 800 (highest score possible) on the SAT-Math. Talent Searches give highly gifted youth the opportunity to demonstrate their full capabilities, perhaps for the first time, and it becomes apparent that they are ready for considerably advanced work.
Giftedness, like retardation, involves inherent differences in development from birth through maturity. Everyone agrees that we must identify developmentally delayed children as early as possible because it has been shown that early intervention is essential to optimal development. It is not as clearly understood that early intervention is also essential for the optimal functioning of developmentally advanced children.
In a study conducted by Gogel, McCumsey and Hewett (1985), nearly half of the 1,039 parents of identified gifted children suspected that their children were gifted before their toddlers were two years old. White and Watts (1973) noted that children who are either unusually rapid or unusually slow in their development show signs of their exceptionality as early as 18 months. Findings from the Fullerton longitudinal study indicated that "gifted and nongifted children develop at different levels from infancy through adolescence" (Gottfried, Gottfried, Bathurst & Guerin, 1994, p. 61).
Differences in level of intellectual performance between the gifted and nongifted children emerged on the psychometric testing at 1.5 years and maintained continuity thereafter. However, the earliest difference was found on receptive language skills at age 1 year. Differences in receptive and expressive language skills were consistently found from infancy onward. (pp. 84-85)
Some educators believe that giftedness cannot be assessed before third grade, but this is due to budgetary constraints rather than to the limitations of testing. Gifted children can be assessed in a valid and reliable manner at the age of four. Gifted four- or five-year-olds are mentally like six- or seven-year-olds, and usually have excellent attention spans, so this is an ideal time for testing. Based on a half-century of her research in testing, Elizabeth Hagen, co-author of the Cognitive Abilities Test and the Stanford-Binet Intelligence Scale, Revision IV, revealed the following information in an interview:
At age two or three,...the behaviors of children are too erratic to get a valid test score, and I would be very skeptical about identifying giftedness at those ages, but I don't think four to six is too early to obtain a valid assessment. The correlations between scores obtained at ages four or five and later IQ scores are slightly lower than those obtained at age nine, but not that much lower. The only reservation I would have about testing at that age is being able to locate children who come from somewhat limited backgrounds. (quoted in Silverman, 1986, p. 170)
There is a widespread myth that IQ test scores of preschool and primary-aged children are inflated due to environmental advantage (e.g., parents reading to their children or the children attending excellent preschools). However, the impact of the environment increases with age; therefore, the IQ scores of third graders are unquestionably more influenced by the environment than the scores of kindergartners. For girls, in particular, early IQ scores are more reliable than those obtained after they have been socialized into hiding their abilities.
At the Gifted Development Center, we have found that the optimal time to test gifted children is between the ages of four and nine. We find that at the age of nine, test scores for gifted children usually decline, sometimes as much as 20 points, due to (1) ceiling effects (test items not being sufficiently difficult to measure the full range of abilities); (2) perfectionism (particularly in girls), leading to unwillingness to guess when uncertain; and (3) the increased emphasis on crystallized (learned) knowledge and skills rather than fluid abilities (purer forms of abstract reasoning, considered innate). This does not mean that testing is useless after age nine. While the score generated may be an underestimate, we find that children and adults profit from even minimal estimates of their abilities.
Group IQ versus Individual IQ Tests
Developmentally advanced children, like developmentally delayed children, should be assessed on individual intelligence tests by trained examiners. Group tests are rough screening tools only, like vision and hearing screening tests. They indicate the need for further testing by a specialist. Most school districts rely on group IQ tests for selecting students for gifted programs because individual IQ tests are substantially more expensive. However, even the best group IQ tests, such as the Cognitive Abilities Test (CogAT), were designed for screening purposes, as co-author Elizabeth Hagen points out:
Although I still believe you should use an individual intelligence test for assessing young children, I would use the Cognitive Abilities Test as a screening test to find out what the potential pool is and then use an individual test for final selection ... (Quoted in an interview with Linda Silverman in Roeper Review, 1986, p. 170)
No parent of a disabled child would agree to have his or her child labeled on the basis of a group screening device, or grades, or a teacher's opinion. Yet, this is exactly what we do with gifted children. These practices will continue until parents recognize the inequity and advocate for a federal mandate to identify and serve gifted children appropriately.
Individual IQ Tests Are Not All Alike
Individual IQ tests also present problems, since the scores they generate for gifted children are not comparable. The newer IQ scales are probably excellent for 90% of the population, but they are inadequate for assessing both the highly gifted and the profoundly retarded. Children in the highly (145-159 IQ), exceptionally (160-174 IQ) and profoundly gifted (175+ IQ) ranges have seriously depressed scores on the newer instruments. We continue to believe that the best measure of giftedness is the Stanford-Binet Intelligence Scale, Form L-M (SBL-M) (Silverman & Kearney, 1989; 1992a; 1992b). Since it goes up to Superior Adult III, the SBL-M acts as an above-level measure, similar to the SAT for Talent Search participants. In the words of Julian Stanley (1990), founder of the Talent Searches, "The Binet-type age scale might be considered the original examination suitable for extensive out-of-level testing." (p. 167)
The SBL-M assesses high-level verbal abstract reasoning, as well as mathematical and spatial reasoning; it has very few timed items; and few items require visual-motor abilities. It attracts the attention of younger children better than the Wechsler tests because it moves rapidly from one type of item to another. As the SBL-M is dated, we now assess children first on one of the newer instruments, such as the Wechsler Intelligence Scale for Children, Fourth Edition (WISC-IV) or the Stanford-Binet Intelligence Scale, Fifth Edition (SB5), and, similar to the use of the SAT in the Talent Searches, we use the SBL-M as a supplemental test if they attain at least two subtest scores at the 99th percentile.
The strongest objection to the use of the SBL-M is its outdated norms. Norms are periodically updated to reflect the increase in intelligence in the general population. This increase is called the "Flynn Effect," and it is estimated to be one third of an IQ point per year. In a study of 121 cases in four IQ ranges, selected randomly from the Gifted Development Center files, Falk, Moran and Silverman (2003) found that the discrepancy between WISC-III and SBL-M scores was significant well beyond what could be accounted for by the Flynn Effect. In addition, the Flynn Effect appears to apply to the midrange of intelligence, with "only very minimal changes at the extremes of ability," (J. D. Wasserman, personal communication, December 23, 1997).
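To see why a gap of many tens of points cannot be attributed to outdated norms alone, the Flynn Effect rate quoted above (one third of an IQ point per year) can be applied as simple arithmetic. The sketch below is illustrative only: the function names, the 30-year interval between norming dates, and the two scores are hypothetical values chosen for the example, not figures from Falk, Moran and Silverman (2003).

```python
# Rate quoted in the text: one IQ point of norm drift every three years.
YEARS_PER_FLYNN_POINT = 3


def flynn_drift(years_between_norms: float) -> float:
    """IQ points by which older norms would be expected to inflate scores."""
    return years_between_norms / YEARS_PER_FLYNN_POINT


def unexplained_gap(older_test_score: float,
                    newer_test_score: float,
                    years_between_norms: float) -> float:
    """Score discrepancy left over after subtracting the expected drift."""
    raw_gap = older_test_score - newer_test_score
    return raw_gap - flynn_drift(years_between_norms)


# Hypothetical example: norms set 30 years apart, scores of 180 and 140.
print(flynn_drift(30))                # 10.0
print(unexplained_gap(180, 140, 30))  # 30.0
```

Under these assumed inputs, norm drift accounts for 10 of the 40 points, leaving 30 points that would have to come from some other source, such as the lower ceiling of the newer test.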
As of 2004, we have found 820 children with IQ scores on the SBL-M in the exceptionally gifted range (160+), where the other tests abruptly end. Fifty of these children scored 200 IQ and beyond. The gender distribution of our sample is particularly noteworthy. Approximately 60% of the children brought for assessment are male, and 40% female. In the exceptionally gifted range, we find the same gender ratio as the population referred for testing, and the two highest scoring children were female. Our sample demonstrates that there are as many brilliant females as males, at least in terms of native ability. If this information were widely known, it would help dispel the 5,000-year-old myth of the "natural inferiority" of women. Using only the newer tests, we would be unable to locate these girls and develop their extraordinary talents. They would quickly go underground and remain there.
Researchers who have studied exceptionally and profoundly gifted children have documented their remarkably different thought processes (e.g., Gross, 1993; Morelock, 1995; Lovecky, 1994). In Nature's Gambit, David Feldman and Lyn Goldsmith (1986) describe an "omnibus prodigy" with the highest IQ score on record, whose inexhaustible energy wore out everyone around him, including his parents. Anyone who has studied or lived with a child in these highest IQ ranges can attest to their immense curiosity, their intense absorption in learning, their great intuitive leaps, their profound ethical concerns, their endless energy and their deep sense of social isolation. They are, indeed, the proverbial "horse of a different color."
One child we tested attained a WISC-III Full Scale IQ score of 138 and an SBL-M score of 223+, a difference of more than 85 points. He graduated high school at 14. A six-year-old tested 155 Full Scale IQ on the WISC-III and 244 on the SBL-M, a difference of 89 points. She completed grades 2 through 7 in two years. Another child achieved a Wechsler Pre-school and Primary Scale of Intelligence (WPPSI) Full Scale IQ score of 145 and an SBL-M score of 236+, more than a 91-point discrepancy. At the age of eight, he achieved 760 on the SAT-Math test. He could perform several mental operations simultaneously. A seven-year-old exceeded the raw points necessary to attain ceiling scores on six subtests of the WISC-III, generating a Full Scale IQ score of 155. She passed one third of the items at the highest level of the SBL-M. Her IQ score was 262+ and the difference between her SBL-M and WISC-III scores was more than 107 points! Her math mentor considered her the most brilliant mathematical mind of our time.
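The discrepancies reported in these four cases are straightforward subtractions, and a short Python sketch can tabulate them. The variable names are introduced here for illustration. Scores reported with a "+" are entered as their stated minimum, so the corresponding gaps are lower bounds rather than exact values.

```python
# (other test, score on that test, SBL-M score) for the four cases in the text.
# SBL-M scores given as "223+", "236+" and "262+" are entered as minimums.
cases = [
    ("WISC-III", 138, 223),
    ("WISC-III", 155, 244),
    ("WPPSI", 145, 236),
    ("WISC-III", 155, 262),
]

# Gap between the SBL-M score and the score on the newer instrument.
gaps = [sbl_m - other for _, other, sbl_m in cases]
print(gaps)  # [85, 89, 91, 107]
```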
A girl tested by an Eastern examiner scored 124 on the Kaufman Assessment Battery for Children (K-ABC), 137 on the WISC-R, and 229+ on the SBL-M. The difference between her K-ABC score and her SBL-M score was over 105 points! A child prodigy in writing, her literary gifts confirmed the accuracy of the SBL-M score. Camilla Benbow tested a boy who scored 199 on the SBL-M at the age of seven and 203 on a second administration. Julian Stanley (1990) reported that as a 14-year-old eleventh grader, the same young man earned perfect scores on the Verbal and Mathematical portions of the Preliminary Scholastic Aptitude Test (PSAT) and was 320 points above the 99th percentile of college-bound seniors on his National Merit Scholarship type index. "Truly, an IQ of 200 can be far more powerful than any of 150!" (p. 167).
The SBL-M remains the only tool that can measure extreme verbal abilities. Unfortunately, due to its age, this valuable instrument may be lost as a means of discovering society's most brilliant minds. What would happen to these children if we relied only on the lower estimates supplied by current tests? Most would be misunderstood, due to their inability to relate to age-peers and age-normed curriculum. Some would be misdiagnosed and placed on medication. Others would languish in grade-level placements, when they desperately need radical acceleration. And a few would sink into lifelong depression. There would be no way of documenting the extent of their differences and supporting their need for tremendously advanced work. If we had no way of knowing the actual level of their abilities, we would be unable to find them true peers, that is, intellectual equals. If their true abilities were unrecognized and undeveloped, they would be likely to develop patterns of underachievement. Many high school dropouts were highly gifted. Motivation and scholarship depend on recognition. It would be debilitating to these individuals, to their families and to our scientific understanding of intelligence, to lose the only tool we have for measuring the highest levels of potential.
Any test can only measure a small portion of a person's competence. Therefore, all tests underestimate children's abilities rather than overestimate them. It is nearly impossible to fake abstract reasoning at an advanced level. When a disabled child achieves two different IQ scores, the higher score is believed to be the best estimate of the child's potential. Gifted children deserve the same attitude.
Several agencies have found an astonishing number of exceptionally gifted children. From the very beginning of IQ testing, Terman (1925) noted that there were more children above 160 in the population than the normal curve would predict. If we are to serve them properly, it behooves us to find them. The adjustment problems of a misdiagnosed child whose actual IQ is 180 are staggering. The further a child is from the norm, the greater the potential for suffering alienation and the greater the need for detection and early intervention. As parents and teachers, we need to seek the most accurate information we can on the gifted children in our charge.
References
Falk, R. F., Moran, D., & Silverman, L. K. (2003, November). WISC-III and Stanford-Binet L-M Scores for gifted children. Paper presented at the 50th annual convention of the National Association for Gifted Children, Indianapolis, IN.
Feldman, D. H., with L. T. Goldsmith. (1986). Nature's gambit: Child prodigies and the development of human potential. New York: Basic Books.
Gogel, E. M., McCumsey, J., & Hewett, G. (1985). What parents are saying. G/C/T, Issue Number 41, 7-9.
Gottfried, A. W., Gottfried, A. E., Bathurst, K., & Guerin, D. W. (1994). Gifted IQ: Early developmental aspects. The Fullerton longitudinal study. New York: Plenum.
Gross, M. U. M. (1993). Exceptionally gifted children. London: Routledge.
Lovecky, D. V. (1994). Exceptionally gifted children: Different minds. Roeper Review, 17, 116-120.
Morelock, M. J. (1995). The profoundly gifted child in family context. Unpublished doctoral dissertation, Tufts University, Medford, MA.
Silverman, L. K. (1986). An interview with Elizabeth Hagen: Giftedness, intelligence and the new Stanford-Binet. Roeper Review, 8, 168-171.
Silverman, L. K., & Kearney, K. (1989). Parents of the extraordinarily gifted. Advanced Development, 1, 41-56.
Silverman, L. K., & Kearney, K. (1992a). The case for the Stanford-Binet L-M as a supplemental test. Roeper Review, 15, 34-37.
Silverman, L. K., & Kearney, K. (1992b, November). Don't throw away the old Binet. Presented at the 39th annual convention of the National Association for Gifted Children, Los Angeles, CA. [Appeared in part in Understanding Our Gifted, 4(4), 1, 8-10.]
Stanley, J. C. (1990). Leta Hollingworth's contributions to above-level testing of the gifted. Roeper Review, 12(3), 166-171.
Terman, L. M. (1925). Genetic studies of genius: Vol. 1. Mental and physical traits of a thousand gifted children. Stanford, CA: Stanford University Press.
White, B. L., & Watts, J. C. (1973). Experience and environment (Vol. 1). Englewood Cliffs, NJ: Prentice-Hall.