Randomization: increasing data quality in research
Randomization is a well-known technique that can improve data quality in research and limit falsification. Let’s see how it works.
High-quality data depends on how and when you ask questions. A respondent’s attention varies over the course of a questionnaire, which can invalidate answers and cause quality problems.
What is randomization?
By analyzing respondents’ behavior, researchers have found that some distorting effects are quite frequent:
- Primacy effect: people tend to choose the first option in a long list of answers
- Recency effect: people tend to choose the last option
- Attention span and question blocks: if the questionnaire is split into blocks, respondents are inclined to answer the second block of questions exactly as they answered the first.
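Randomization limits these effects and so improves data quality. Because each respondent sees the items in a different order, position-related bias is spread across all items instead of always favoring the same one. With this in mind, let’s look at some applications.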
Types and applications to increase data quality in research
There are two types of randomization:
- Basic randomization: all items rotate with one another. Options such as “Don’t answer”, “Other/specify” or “Don’t know” are usually excluded from the randomization and placed last
- Block randomization: items are grouped into blocks. Randomization applies both to the order of the blocks and to the items within each block.
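As a rough illustration, the two types can be sketched in a few lines of Python. This is only a minimal sketch, not how any particular survey tool does it: the function names, the set of anchored labels and the respondent ID used as a seed are all assumptions made for the example.

```python
import random

# Hypothetical labels for options that should stay at the end of the list.
ANCHORED = {"Other/specify", "Don't know", "Don't answer"}


def randomize_options(options, rng, anchored=ANCHORED):
    """Basic randomization: shuffle options, keeping anchored ones last."""
    movable = [o for o in options if o not in anchored]
    fixed = [o for o in options if o in anchored]  # original order preserved
    rng.shuffle(movable)
    return movable + fixed


def randomize_blocks(blocks, rng):
    """Block randomization: shuffle the blocks and the items inside each block."""
    shuffled = [list(items) for items in blocks]  # copy, leave the input untouched
    for items in shuffled:
        rng.shuffle(items)
    rng.shuffle(shuffled)
    return shuffled


# One generator per respondent, seeded with a hypothetical respondent ID,
# so the same respondent would see the same order if the page reloads.
rng = random.Random(1042)

options = ["Party A", "Party B", "Party C", "Other/specify", "Don't know"]
print(randomize_options(options, rng))

blocks = [["Q1", "Q2", "Q3"], ["Q4", "Q5"]]
print(randomize_blocks(blocks, rng))
```

Seeding per respondent is a design choice rather than a requirement. It makes each respondent’s order reproducible, which can help when you later want to check whether position influenced the answers.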
When is randomization most effective? And when is it irrelevant?
First, randomization is common in opinion surveys, such as election polls. In questions like “Who will you vote for?”, the answers are presented in random order. Another common use is in product evaluation surveys, to protect data integrity.
You can also use it for questions on personal information. Respondents often abandon questionnaires because personal questions are not well distributed. That’s why simple randomization can benefit not just data quality but also the response rate.
But when is it irrelevant? Let’s see three scenarios:
- Demographic questions: for age, gender or date of birth, randomization is irrelevant and may even be counterproductive, as it demands more effort from the respondent
- Rating scales: randomizing a scale (such as 1 to 5) is useless and may confuse the respondent
- Long lists of options: a long list is easier to scan in a logical order (such as alphabetical order) than in a random one.
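The three scenarios above can be turned into a simple rule of thumb. The sketch below is only illustrative: the question type names are hypothetical, and the cut-off for a “long” list is an assumption, since no fixed number applies to every survey.

```python
# Hypothetical question types; real survey tools use their own names.
NO_RANDOMIZATION = {"demographic", "rating_scale"}

# Assumed cut-off for a "long" list of options; adjust it to your survey.
LONG_LIST_THRESHOLD = 15


def should_randomize(question_type, n_options, long_list_threshold=LONG_LIST_THRESHOLD):
    """Return True when randomizing the answer order is likely to help."""
    if question_type in NO_RANDOMIZATION:
        return False  # demographic questions and rating scales keep a fixed order
    if n_options >= long_list_threshold:
        return False  # long lists are easier to scan in a logical order
    return True


for qtype, n in [("opinion", 5), ("demographic", 4), ("rating_scale", 5), ("opinion", 40)]:
    print(qtype, n, should_randomize(qtype, n))
```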
To sum up
Randomization helps minimize data corruption and improve data collection in some surveys. It cannot eliminate every risk to your results, but it can be really helpful.
Moreover, most current survey software can apply randomization automatically in any survey methodology, saving researchers time and ensuring the technique is used correctly.