Biostatistics and Epidemiology


Two techniques, asking confounding (indirect) questions and switching responses between respondents, protect respondents by keeping their personal data undisclosed. Random noise is introduced when, instead of asking the respondent directly about his or her socio-economic status, the researcher infers it from questions about the appliances found at home, the amount paid for utilities each month, and other home services the respondent uses. Combining the responses to these confounding questions yields a quantitative estimate of the individual's socio-economic status. Because the confounding questions add random noise, the recorded responses are modified versions of the true ones and can only be used to estimate the respondents' actual responses.

Another technique for keeping respondents' personal data undisclosed is to switch an individual's responses with those of another respondent who has the same socio-demographic characteristics. Switching masks each subject's true characteristic or response by replacing it with that of another subject, which protects both from being identified. The actual responses nevertheless remain in the data set and are analyzed accordingly.
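The switching of responses can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a standard procedure: the function name swap_within_strata, the field names, and the toy records are all hypothetical, and a random permutation within each socio-demographic group is just one possible way of switching.

```python
import random
from collections import defaultdict


def swap_within_strata(records, stratum_key, value_key, seed=0):
    """Return a copy of `records` in which `value_key` is randomly permuted
    among respondents that share the same `stratum_key` (hypothetical helper)."""
    rng = random.Random(seed)

    # Group respondent positions by their socio-demographic stratum.
    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[rec[stratum_key]].append(idx)

    # Work on copies so the original records are left untouched.
    swapped = [dict(rec) for rec in records]
    for indices in groups.values():
        values = [records[i][value_key] for i in indices]
        rng.shuffle(values)  # a respondent may, by chance, keep their own value
        for i, value in zip(indices, values):
            swapped[i][value_key] = value
    return swapped


# Hypothetical toy data: two strata, one sensitive response ("income").
records = [
    {"id": 1, "stratum": "urban, 30-39", "income": 52},
    {"id": 2, "stratum": "urban, 30-39", "income": 48},
    {"id": 3, "stratum": "urban, 30-39", "income": 61},
    {"id": 4, "stratum": "rural, 30-39", "income": 35},
    {"id": 5, "stratum": "rural, 30-39", "income": 40},
]

swapped = swap_within_strata(records, "stratum", "income", seed=1)
for before, after in zip(records, swapped):
    print(before["id"], before["stratum"], before["income"], "->", after["income"])
```

Within each stratum the set of responses is unchanged, so every actual response is still in the data; only the link between a response and the respondent who gave it is altered.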

Whatever the benefit to respondents, for the survey being conducted the modification introduced by random noise or by switching responses may affect the results, leading to misleading interpretations and wrong conclusions. With random noise, the modification may differ for every response and may not match the information the researcher actually needs. A calculation based on respondents' collective responses would therefore differ from one based on the responses of each case or respondent. In effect, the sum of the parts does not make up the "calculated" whole. This gives rise to a problem of representativeness, in which respondents' data do not fully describe the characteristics of the group they belong to. Confounding questions are especially susceptible to this statistical problem because they do not wholly "represent" the variable that the researcher needs to measure but cannot ask the respondent about directly.

The statistical difficulty with switching responses is that any true characteristic or true effect present in the responses weakens as researchers switch responses among respondents, which amounts to manipulating the data. Sharing the same characteristics or demographics is still too broad a criterion to serve as an effective reference point for matching. The effect of switching is especially evident when a group with specific, homogeneous socio-demographic characteristics is further divided into two sub-groups. The resulting sub-groups are no longer representative of the population, because characteristics attributed to a sub-group may no longer hold when the data of each of its members are analyzed against the researcher's criteria.
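The weakening of a true effect can be illustrated with a small simulation. The sketch below rests on assumptions that are not part of any real survey: the data are simulated, everyone is treated as belonging to one broad socio-demographic group, half of the respondents (swap_rate = 0.5) have their outcome switched, and the variable names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient, written out to stay dependency-free."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)
n = 2000

# Simulated respondents who all share one broad socio-demographic profile.
exposure = [rng.gauss(0, 1) for _ in range(n)]
outcome = [x + rng.gauss(0, 1) for x in exposure]  # built-in true association

# Assumed swap rate: half of the respondents have their outcome switched.
swap_rate = 0.5
chosen = rng.sample(range(n), int(swap_rate * n))
permuted = chosen[:]
rng.shuffle(permuted)

swapped_outcome = outcome[:]
for src, dst in zip(chosen, permuted):
    swapped_outcome[dst] = outcome[src]  # dst now carries src's response

r_true = pearson(exposure, outcome)
r_swapped = pearson(exposure, swapped_outcome)
print(f"correlation before switching: {r_true:.2f}")
print(f"correlation after switching:  {r_swapped:.2f}")
```

The list of outcomes is exactly the same before and after switching, so summaries of the outcome alone do not change. The association with the exposure, however, is diluted roughly in proportion to the share of respondents whose responses were switched, because the matching criterion here is too broad to keep related values together.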

In effect, these techniques protect respondents' identities but do not protect the reliability and validity of the generated data, or of the findings and conclusions drawn from them.

Adding random noise may skew the data, so the generated data may not give accurate results. Noisy values are estimates or modifications of the actual responses; making them equal to the actual responses would defeat the purpose of concealing personal information. Some distortion of the results is therefore expected, and this potentially lessens the power of the study, that is, the degree to which the data can be considered reliable and valid by both researchers and end-users.
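A short simulation makes the cost of added noise concrete. It is a sketch under assumptions, not a description of any particular survey: the true responses are simulated, the noise is assumed to be independent, zero-mean and Gaussian, and the noise level (noise_sd) is chosen arbitrarily.

```python
import random
import statistics

rng = random.Random(7)
n = 5000

# Simulated "true" responses, e.g. a hypothetical monthly utility bill.
true_values = [rng.gauss(100, 15) for _ in range(n)]

# Assumption: independent zero-mean Gaussian noise is added to each response.
noise_sd = 15
noisy_values = [value + rng.gauss(0, noise_sd) for value in true_values]

mean_true = statistics.fmean(true_values)
mean_noisy = statistics.fmean(noisy_values)
var_true = statistics.variance(true_values)
var_noisy = statistics.variance(noisy_values)

# Average size of the change made to an individual response.
case_errors = [abs(t - m) for t, m in zip(true_values, noisy_values)]
mean_abs_case_error = statistics.fmean(case_errors)

print(f"mean:     true {mean_true:.1f}, noisy {mean_noisy:.1f}")
print(f"variance: true {var_true:.1f}, noisy {var_noisy:.1f}")
print(f"average per-respondent change: {mean_abs_case_error:.1f}")
```

Under these assumptions the group mean is nearly unchanged, but each individual response is altered and the spread of the data grows, since for independent noise the variance of the noisy values is the variance of the true values plus the variance of the noise. Estimates computed from the noisy data are therefore less precise than those from the actual responses.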

Masking the true responses makes room for more error. Since in a survey…