Spatial IQ Quiz Test Manual

Introduction

This test was developed as an Internet self-administered spatial intelligence test for children and adults, ages 5 and up. A previous test to measure verbal intelligence in children has been developed by the author, with Internet norming and very satisfactory reliability and validity properties.

It was felt that a spatial test would complement this "Kids IQ Test and Free IQ Test" and another Internet verbal test for adults, already marketed by FunEducation.com in conjunction with Dr. McConochie.

Development

Test content was written by the author in five categories: everyday physics, worldly knowledge, patterns and shapes, directions, and common hand tools. Some items were taken from another spatial intelligence measure previously developed by the author. An effort was made to keep item content fair to both genders and, in order to measure aptitude rather than achievement, independent of formal learning experiences. Specific content can be examined by reviewing the test online at www.funeducation.com/SpatialIQtest/. Approximately 42 items were written for each section, ranging in estimated difficulty from very easy for a six-year-old to difficult for a bright young adult. The items are multiple-choice, with four or five options per item plus an "I Don't Know" option.

Sample #1:

The test was put into online format by FunEducation staff. 75,000 prior Internet customers were invited by e-mail to take the pilot version of the test. 862 completed the test within a couple of weeks. This data was used to study test properties and create initial norms and reliability data for persons aged 16 to 61. Sample sizes for persons below and above this range were insufficient for reliable norming at this stage.

Statistical Properties

Males scored slightly but significantly higher than females on all test sections:

Mean Raw Scores by Gender (269 males, 478 females), Aged 16 and Up:

Gender	Physics	Worldly K.	Patterns	Directions	Common tools	Total
Males	24.8	29.6	26.3	32.3	29.3	142.3
Females	20.7	27.3	24.7	29.5	27.6	129.8

This finding is compatible with similar slightly higher scores for males on the Wechsler-Bellevue and WAIS intelligence tests (Wechsler, 1958, p. 144), though for the WAIS-III these consistent but slight gender differences have been explained, in some instances, as reflecting achievement rather than aptitude factors (van der Sluis, 2006). Separate norms for men and women are used for scoring the present test.

Means and standard deviations

Means and standard deviations for the five sub-tests were quite similar across all norm groups, ages 16 through 60, with means mostly in the 20's and standard deviations between approximately 4.5 and 7. The range of scores was also very satisfactory for both the sub-tests and the total score. Sub-test scores typically ranged from about 10 to 35 across 40 to 42 items. The highest possible total score is 208; the highest obtained score was 180 across the 862 persons.

Reliabilities were high and comparable to those for the Wechsler Adult Intelligence Scale (WAIS-III). For example, for both men and women ages 21-30 the mean section reliability is .81 and the total score reliability is .95. Reliabilities were computed by the Kuder-Richardson-21 formula, which in the author's experience yields values about .02 lower than Cronbach Alpha reliabilities.
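As a rough illustration of the Kuder-Richardson-21 computation mentioned above, the following Python sketch estimates reliability from only the number of items, the mean score and the standard deviation. The function name and the input values (a 42-item section with a mean of 25 and a standard deviation of 6.5) are assumptions chosen to fall within the ranges reported in this manual; they are not actual norm-group figures.

```python
def kr21(num_items: int, mean_score: float, sd_score: float) -> float:
    """Kuder-Richardson formula 21 reliability estimate.

    KR-21 = (k / (k - 1)) * (1 - M * (k - M) / (k * s^2)),
    where k is the number of items, M the mean total score and
    s the standard deviation of total scores.
    """
    if num_items < 2:
        raise ValueError("KR-21 needs at least two items")
    if sd_score <= 0:
        raise ValueError("standard deviation must be positive")
    k = num_items
    variance = sd_score ** 2
    return (k / (k - 1)) * (1 - mean_score * (k - mean_score) / (k * variance))


# Illustrative (hypothetical) section: 42 items, mean 25, SD 6.5.
example_reliability = kr21(42, 25.0, 6.5)
print(round(example_reliability, 2))
```

KR-21 assumes that all items are about equally difficult, which is one reason it tends to give slightly lower values than item-level formulas such as Cronbach's alpha.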

The comparable WAIS-III Performance sub-tests have a median split-half reliability of .83 and total score reliabilities ranging from .88 to .92 across various age groups (media.wiley). The WAIS-III mean of test-retest reliabilities for the five performance sub-tests is .75 (range .67 to .81) in a sample of 100 persons aged 16 to 29. For the total score, the reliability is .88 for this sample (Tulsky and Zhu, p. 58).

Thus, the present test reliability appears to be as good as or better than that of the widely used WAIS-III test.

Internal consistency

A few test section and total scores have minor correlations with age and education. When these factors are controlled for, the five test sections for adults have a mean between-test correlation of .45 (range .40 to .50) and all sections correlate substantially with the total score, with a mean of .75 (range .72 to .78). This is interpreted to indicate that each section contributes equally a unique and valuable element to the total score. It is desirable to have low correlations between test sections and high correlations between test sections and the total score to which they are contributing, as this maximizes the value of each section as a unique contributor to the total score and maximizes its reliability.

These values are somewhat better than those for the early WAIS test. For example, for a sample of 300 males and females ages 20-34, the mean of the correlations between the five WAIS Performance tests is .53 (range of .44 to .62). The median correlation between these five tests and the total Performance I.Q. score is only .53 (range .44 - .62) (Wechsler, p. 100).

However, for the more recent WAIS-III Wechsler test, the mean of the correlations between the basic five performance tests is .47 (range .37-.60). The mean of the correlations between these five and the total Performance I.Q. of which they are a part is .76 (range .68 to .79) (Tulsky and Zhu, 1997).

Thus, the present author's internal consistency test data is close to that of this widely used current WAIS-III test, with a mean between-test correlation of .45 compared to .47 for the WAIS-III, and a mean of correlations between sub-tests and total score of .75 compared to .76 for the WAIS-III.
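The correlations in this section are reported with age and education "controlled for". The manual does not state how this was done; one common approach is a partial correlation. The Python sketch below shows the first-order case (a single control variable) and is offered only as an assumption about the general technique, not as the author's procedure. The function name and the three input correlations are hypothetical.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y with one control variable z removed.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("control variable is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical inputs: two sections correlate .50 with each other,
# and .16 and .28 respectively with age.
controlled = partial_correlation(0.50, 0.16, 0.28)
print(round(controlled, 2))
```

When the control variable is only weakly related to the scores, as in this example, the controlled correlation differs little from the raw one.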

Regarding age, the only significant correlation for women was with Common Hand Tools (.27**). For men the significant correlations were for Physics (.16**), Tools (.28**) and Total Score (.16*).

Regarding education, no tests correlated significantly with education for men. All did for women, but not strongly: .10*, .10*, .11*, .23*, .12* and .19** respectively for the five sub-tests and the total score.

This general lack of relationship between test scores and age and education is interpreted to indicate that the tests are measuring innate aptitude for learning ("intelligence") more than amount learned ("achievement"), and may be relatively "culture free". Ethnic background was not assessed in the initial test takers due to an oversight but will be solicited in subsequent use to obtain data for checking possible minority group bias.

Validity

Apart from content validity, the test currently has no objectively established validity. However, given its high reliability, it is expected to be as valid as the Wechsler Performance tests and similar spatial aptitude tests for predicting relevant behavior, such as success and enjoyment in vocations and hobbies requiring spatial aptitude.

Initial Norms and Report Format

Norms as of February 21, 2007 were for about 1250 persons tested over the Internet. Norms as of this date were not yet large enough for children under 16 or adults over 60.

The educational backgrounds of the 1250 persons who had taken the test as of February 21, 2007 are as follows:

Highest education	Frequency	Percent
Some high school	301	24.1
Completed high school	204	16.3
Some college or associate's degree	463	37.1
Bachelor's degree	175	14.0
Master's degree	79	6.3
Ph.D. degree	28	2.2

Norms as of this sample were by gender and age group, in 10-year segments from 21 to 30, etc., except 16 to 20 for teens and young adults. The top group was 51 to 60. I.Q.s are based on a mean of 100 and standard deviation of 15. The printed report provides all scores, percentile equivalents and a brief explanation of the general meaning of the scores as measures of aptitude rather than achievement.

Norms were to be increased as data is obtained from administration of the test to Internet customers, as has been done for the Kids I.Q. test, which is currently normed on several thousand children.

Sample #2, Updating Research Data

In September of 2007 all data available to date was analyzed. The sample totaled 2,854: 1,011 males and 1,843 females. The subjects ranged in age from 5 to 90. Nationality data was available for about 1600 subjects and included persons from Australia, Canada, Hong Kong, India, Ireland, New Zealand, Pakistan, the Philippines, South Africa, the United Kingdom and the United States. 39 subjects were from "other countries".

Alpha reliability coefficients were computed for each ten-year age group (e.g. ages 20 to 29, 30 to 39, etc.), separately for males and females. These alpha coefficients were generally in the .80's and .90's. The total score alphas were all in the .90's except for males in their 60's (.86).
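For readers unfamiliar with the alpha coefficient, the following Python sketch computes Cronbach's alpha from a table of item scores (one row per test taker, one column per item, scored 1 for correct and 0 for incorrect). The function name and the tiny five-person data set are hypothetical and serve only to show the arithmetic; they are not data from this test.

```python
from statistics import pvariance


def cronbach_alpha(item_scores: list[list[float]]) -> float:
    """Cronbach's alpha: (k / (k - 1)) * (1 - sum(item variances) / total variance)."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item (column) across respondents.
    item_variances = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores have no variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 test takers, 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

The same variance convention (population variance here) is used for the items and the totals, so the ratio is unaffected by that choice.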

Mean raw scores for the five test sections and total score were plotted on a graph for each 10-year age group and for children aged 5 to 10. The mean for each group, beginning with the 30 to 39 group, was computed and plotted at the midpoint age for that group (e.g. age 35 for the 30 to 39 year old group). These graphs were prepared for males and females separately.

For the section scores, the resulting curves rose steadily to age 25 and then remained essentially level to age 45. After that they gradually declined so that the raw scores for the 60 to 69 age group were about equal to those for the 20 to 29 age group. Of note was the more gradual slope of the Common Hand Tools score and its climb to a later peak age, the 50's for women and the 60's for men.

These curves were then used to determine estimated means for each specific age from 5 to 25. Standard deviations for these ages were also estimated based on data averaged across age groups, e.g. 10 to 19, 20 to 25. These estimated means were close to the actual means for each teen year, e.g. 14, 15 and 16 but provided a smoothly consistent rise that seems a more reasonable basis for future norm data. The sample sizes were small for ages below 10 and the actual scores appear to be for very bright children. Therefore, for this age range, 5 to 10, the smoothed curve means are much different than the actual means.

Similarly, standard deviations tended to vary from age to age, so mean standard deviations were computed. The smoothed score means and mean standard deviations are used for I.Q. report computations as of late September, 2007, as these are judged to closely approximate the true values for the English speaking population likely to take the test over the Internet.

For example, consider these actual and smoothed scores for the physics sub-test, ages 5 through 25:

Means and Standard Deviations for Physics Subtest, Males, Actual and Smoothed, Ages 5 to 25

Age	Actual Mean	Smoothed Mean	Actual Standard Deviation	Mean Standard Deviation
5	20.7	11.5	6.7	6.2
6	16.8	12.8	6.2	6.2
7	14.6	14.2	7.0	6.2
8	15.6	15.0	6.9	6.2
9	18.4	16.5	5.9	6.2
10	17.1	18.7	7.1	6.7
11	17.4	19.3	7.6	6.7
12	20.5	20.0	6.3	6.7
13	20.9	20.4	6.8	6.7
14	20.9	20.6	7.0	6.7
15	22.5	21.0	6.3	6.7
16	22.5	21.4	5.6	6.7
17	23.0	21.7	6.0	6.7
18	21.1	22.1	7.5	6.7
19	22.8	22.4	7.0	6.7
20	23.8	22.7	6.2	6.6
21	23.9	22.9	5.8	6.6
22	23.7	23.2	7.7	6.6
23	21.6	23.5	6.2	6.6
24	23.3	23.7	6.5	6.6
25	21.9	24.0	7.8	6.6
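The smoothed means in the table above were read from curves fitted to group means plotted at each group's midpoint age. The manual does not say how the curves were drawn, so the Python sketch below substitutes simple straight-line interpolation between midpoint anchors to show the general idea. The function name and the anchor values are hypothetical, not the norm data.

```python
def interpolate_mean(age: float, anchors: list[tuple[float, float]]) -> float:
    """Estimate the mean at a given age from (midpoint_age, group_mean) anchors.

    Uses straight-line interpolation between neighboring anchors and
    holds the nearest anchor value outside the anchored age range.
    """
    points = sorted(anchors)
    if age <= points[0][0]:
        return points[0][1]
    if age >= points[-1][0]:
        return points[-1][1]
    for (age_lo, mean_lo), (age_hi, mean_hi) in zip(points, points[1:]):
        if age_lo <= age <= age_hi:
            fraction = (age - age_lo) / (age_hi - age_lo)
            return mean_lo + fraction * (mean_hi - mean_lo)
    raise ValueError("age not covered by anchors")  # not reachable for sorted anchors


# Hypothetical anchors: group means plotted at group midpoint ages.
anchors = [(7.5, 14.0), (15.0, 21.0), (25.0, 24.0)]
estimated = {age: round(interpolate_mean(age, anchors), 1) for age in (10, 15, 20, 25)}
print(estimated)
```

A hand-drawn or fitted curve would give a smoother result than straight-line segments, but the principle of replacing noisy age-by-age means with values read from a trend line is the same.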

The total raw score (across all five sections) tended to rise steadily to the 50 to 59 age group and then decline slightly in a smooth arc.

Norms as of late September, 2007

Norms as of late September, 2007 will be by gender and age, for each specific age from 5 to 25 and in clusters of age (e.g. 30 to 39) above 25. The total sample as of this norming is 2854.
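To show how age-specific norms of this kind can be turned into a reported score, the Python sketch below converts a raw score to a deviation I.Q. (mean 100, standard deviation 15) and a percentile. It assumes a simple linear standard-score conversion and a normal distribution, which the manual does not explicitly confirm. The norm values are the smoothed mean (22.7) and mean standard deviation (6.6) for 20-year-old males on the physics sub-test from the table above; the raw score of 29.3 and the function names are hypothetical.

```python
from statistics import NormalDist


def deviation_iq(raw_score: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a raw score to an I.Q.-style score with mean 100 and SD 15."""
    if norm_sd <= 0:
        raise ValueError("norm standard deviation must be positive")
    z_score = (raw_score - norm_mean) / norm_sd
    return 100 + 15 * z_score


def iq_percentile(iq: float) -> float:
    """Percentile rank of an I.Q., assuming a normal distribution (assumption)."""
    return 100 * NormalDist(mu=100, sigma=15).cdf(iq)


# Norm values from the physics table (males, age 20); raw score is hypothetical.
iq = deviation_iq(29.3, norm_mean=22.7, norm_sd=6.6)
percentile = iq_percentile(iq)
print(round(iq), round(percentile))
```

A raw score one standard deviation above the norm-group mean, as in this example, corresponds to an I.Q. of about 115.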

References

http://media.wiley.com/product_data/excerpt/52/04712829/0471282952.pdf.

Tulsky, D. and Zhu, J., WAIS-III, WMS-III Technical Manual, The Psychological Corporation, San Antonio, Chicago, New York, 1997, (p. 98).

van der Sluis, S., Posthuma, D., Dolan, C., Geus, E., Colom, R. and Boomsma, D., Sex differences on the Dutch WAIS-III, Intelligence, Vol. 34, p. 283, 2006.

Wechsler, David, The Measurement and Appraisal of Adult Intelligence, The Williams and Wilkins Company, Baltimore, 1958.
