Classical Test Theory – The Basis of all Assessment
At any given point in time, a person’s ability, or the magnitude of any other trait, is more or less stable. Taking ability as the example for this article, every individual has a potential to learn and acquire new information. This internal potential cannot be determined with 100% confidence, because the tools we use to measure it are somewhat flawed.
Whilst these tools are developed precisely to assess an individual’s potential accurately and reliably, they are nevertheless man-made and imperfect, and this imperfection confounds the accurate determination of one’s ability. An individual’s ‘true’ magnitude of a trait is inaccessible, and so we rely on an ‘observed’ value, which is determined using psychometric tools and assessments.
The theory suggests that our ‘observed’ score is a combination of our ‘true’ score (or ability in this case) and some level of ‘error’ associated with the assessment and environmental conditions.
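In the usual notation of Classical Test Theory this relationship is written as observed = true + error. The short Python sketch below illustrates the idea under some loudly stated assumptions: the candidate's true score (70), the size of the error (a standard deviation of 5 points) and the helper name simulate_observed_scores are all hypothetical, and the error is assumed to be random and centred on zero. It is an illustration of the model, not a description of how any real test is scored.

```python
import random
import statistics


def simulate_observed_scores(true_score, error_sd, n_sittings, seed=0):
    """Return observed scores, each equal to the true score plus random error.

    Assumption: error is normally distributed with mean 0 and standard
    deviation error_sd (a simplification made for illustration only).
    """
    rng = random.Random(seed)
    return [true_score + rng.gauss(0, error_sd) for _ in range(n_sittings)]


# Hypothetical candidate: true score of 70, error SD of 5 points.
observed = simulate_observed_scores(true_score=70, error_sd=5, n_sittings=10_000)
mean_observed = statistics.mean(observed)

print(f"Score from a single sitting: {observed[0]:.1f}")
print(f"Average over many sittings:  {mean_observed:.1f}")
```

Any single sitting can land some way from the true score, whereas the average over many imagined sittings settles close to it. In practice we only ever see one or a few sittings, which is why the true score remains inaccessible.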
In practical terms, we can explain this using the example of a numerical test. The score an individual obtains on the test would be considered their ‘observed’ score. The ‘error’ would be those aspects of the test or environmental conditions that hindered the individual’s performance on the test, that is, their demonstration of ability. Examples include an uncomfortable testing room, poor interpretation of questions, or ambiguous questions which do not make clear what they are asking, so that the test taker is unlikely to know what is required.
Best practice dictates consistency, otherwise known as ‘standardisation’, as a way of minimising ‘error’, so that the ‘observed’ score is closer to the ‘true’ (actual) score than it would otherwise be. Standardised instructions, which are often read from a card, ensure that all candidates receive the same instructions prior to a testing session. This means that no one is unfairly disadvantaged. It also means that any variation in the ‘observed’ scores of two individuals can be more confidently attributed to a difference in their ability (or other trait), and not to a difference in the instructions they received.
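The effect of standardisation can be sketched with a small simulation. Here, pairs of candidates are given exactly the same true score, so any gap between their observed scores is pure error. The error standard deviations used (2 points for a standardised session, 8 points for an unstandardised one), the true score of 60 and the function name observed_gaps are hypothetical values chosen only to make the contrast visible.

```python
import random
import statistics


def observed_gaps(error_sd, n_pairs, true_score=60, seed=1):
    """Absolute observed-score gaps between pairs of candidates.

    Both candidates in each pair share the same true score, so every gap
    returned here is produced by measurement error alone.
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_pairs):
        first = true_score + rng.gauss(0, error_sd)
        second = true_score + rng.gauss(0, error_sd)
        gaps.append(abs(first - second))
    return gaps


# Assumed error sizes: small when conditions are standardised, large when not.
standardised = statistics.mean(observed_gaps(error_sd=2, n_pairs=5_000))
unstandardised = statistics.mean(observed_gaps(error_sd=8, n_pairs=5_000))

print(f"Average gap, standardised session:   {standardised:.1f} points")
print(f"Average gap, unstandardised session: {unstandardised:.1f} points")
```

With less error, two equally able candidates tend to obtain similar observed scores, so a large gap between two people is more plausibly a difference in ability.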
We have used the example of ability thus far; however, this theory is also applicable to other areas of assessment. In psychometrics, this can include personality factors too. For each test, a band of ‘error’ around an ‘observed’ score is provided to help determine whether a ‘real’ (or significant) difference exists between two individuals’ observed scores.
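One common textbook way of building such a band is the standard error of measurement (SEM), calculated as the standard deviation of scores multiplied by the square root of one minus the test's reliability. The sketch below assumes that approach; individual test publishers may calculate their bands differently. The score standard deviation (10), the reliability (0.84), the 95% multiplier (1.96) and the function names are all illustrative assumptions, not figures from any particular test.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return score_sd * math.sqrt(1 - reliability)


def error_band(observed, sem, z=1.96):
    """Band of error around an observed score (roughly 95% when z = 1.96)."""
    return (observed - z * sem, observed + z * sem)


def is_real_difference(score_a, score_b, sem, z=1.96):
    """Check whether two observed scores differ by more than error alone suggests.

    Assumes both scores come from the same test, so the standard error of
    the difference is SEM * sqrt(2).
    """
    se_difference = sem * math.sqrt(2)
    return abs(score_a - score_b) > z * se_difference


# Hypothetical test: score SD of 10 and reliability of 0.84.
sem = standard_error_of_measurement(score_sd=10, reliability=0.84)
low, high = error_band(observed=50, sem=sem)

print(f"SEM: {sem:.2f}")
print(f"Band around a score of 50: {low:.2f} to {high:.2f}")
print("50 vs 58 is a real difference:", is_real_difference(50, 58, sem))
print("50 vs 65 is a real difference:", is_real_difference(50, 65, sem))
```

Under these assumed figures the SEM is 4 points, so an 8-point gap between two candidates falls within what error alone could produce, whereas a 15-point gap does not.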