An Alternative Modified Item Count Technique in Sampling Survey

Farid Ibrahim

International Journal of Statistics and Applications

p-ISSN: 2168-5193 e-ISSN: 2168-5215

2016; 6(3): 177-187

doi:10.5923/j.statistics.20160603.11

Department of Statistics, Benha University, Egypt

Correspondence to: Farid Ibrahim, Department of Statistics, Benha University, Egypt.

This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/

Abstract

In this study we propose an alternative modification of the usual item count technique (ICT) for estimating the proportion of a sensitive characteristic in fields such as health care. The technique yields two estimators of the proportion, and the first proposed estimator is proven to be more efficient than the second. The efficiency of the first proposed estimator is compared with that of the estimator of Droitcour et al.’s (1991) ICT and with that of the estimator of Hussain et al.’s (2012) ICT. It is found that the first proposed estimator uniformly performs better than the other estimators. The optimal sample size n that minimizes the variance of the estimator, assuming that the cost of conducting the survey is fixed, is also determined.

Keywords: Health surveys, Randomized response, Item count technique, Lagrange multipliers, Relative efficiency, Minimal variance

Cite this paper: Farid Ibrahim, An Alternative Modified Item Count Technique in Sampling Survey, International Journal of Statistics and Applications, Vol. 6 No. 3, 2016, pp. 177-187. doi: 10.5923/j.statistics.20160603.11.

Article Outline

1. Introduction

2. The Usual ICT and Its Modification

2.1. The Usual ICT

2.2. A Modification of ICT

3. The Proposed Alternative Modification of the Usual ICT

3.1. The First Proposed Estimator

3.2. The Second Proposed Estimator

4. The Relative Efficiency of the Proposed Estimators of

5. Efficiency Comparisons

5.1. Proposed Estimator Versus the Estimator

5.2. Proposed Estimator Versus the Estimator

6. The Optimal Sample Size

7. Conclusions and Discussions

ACKNOWLEDGMENTS

Appendix

1. Introduction

In sampling surveys, it is easily conceivable that many respondents may not give truthful answers to direct questions on sensitive topics such as habitual gambling, shoplifting, induced abortion, tax evasion, addiction to drugs, rash driving, past involvement in crimes, and embarrassing or socially undesirable opinions and prejudices. Measurement error in answers to such sensitive questions may be reduced by some choices in survey design, such as open-ended questions, asking about behavior over long reference periods, tolerantly loaded introductions, and self-administration of the sensitive questions (see Tourangeau and Yan (2007)). An ingenious alternative to direct questioning is the randomized response technique (RRT), introduced by Warner (1965). In RRT the respondents employ a randomizing device to add probabilistic misclassification to their responses and so conceal their true answers from the interviewer. Since then, several RR models have been proposed as a valuable way of performing surveys on these sensitive topics. For details one can refer to Greenberg et al. (1969), Moors (1971), Mangat (1994), Mangat et al. (1997), Saha (2007), Singh and Tarray (2013), Barabesi et al. (2014), Adebola and Johnson (2015), and Blair et al. (2015).

Chaudhuri and Christofides (2007) criticized the randomized response technique on the grounds that it demands skill from the respondent in handling the device and also asks respondents to report information which may seem useless or tricky. A clever respondent may also think that her/his reported response can be traced back to her/his actual status if she/he does not understand the mathematical logic behind the randomization device. Some of the alternatives to the randomized response technique are the Item Count Technique (ICT), the Three Card Method and the Nominative Technique. Details can be found in Droitcour et al. (1991), Droitcour and Larson (2002) and Miller (1985), respectively. These alternatives were designed because, in general, respondents evade sensitive questions, especially those regarding personal issues, socially deviant behaviors or illegal acts. Chaudhuri and Christofides (2007) also added that in these three alternatives to RRT respondents know what they are revealing about themselves and do not need to know about any special estimation technique. In addition, respondents provide answers which make sense to them.

ICT is used mainly in sensitive fields such as health care. This technique consists of selecting two independent samples (control and treatment groups). Each respondent in the control sample is presented with a list of innocuous items (control items), say

with possible answers of “Yes” or “No” and is asked to report the total number of items that are applicable to her/him. Each respondent in the treatment sample, from the same population, is provided with the same list to which one sensitive item is added and is requested to report the total number of items that are applicable to her/him. The respondents are randomly assigned to either the control group or treatment group.

Compared with the classical RRT, the ICT has the advantage of avoiding the potentially distracting act of randomization by the respondents themselves during the interview. Potential disadvantages of the ICT are that only the treatment group provides any information about the item of interest, and that the inclusion of the control items complicates the survey design and adds uncertainty to the estimation (see Kuha and Jackson (2014)).

Dalton et al. (1994) named ICT the unmatched count technique and applied it to study the illicit behaviors of auctioneers; compared with direct questioning, they obtained higher estimates for six stigmatized items. Wimbush and Dalton (1997) applied this technique to estimate the employee theft rate in high-theft-exposure businesses and found higher theft rates. Tsuchiya (2005) proposed two new methods, referred to as the cross-based method and the double cross-based method, by which proportions in subgroups or domains are estimated from data obtained via the item count technique. In order to assess the precision of the proposed methods, Tsuchiya conducted simulation experiments using data obtained from a survey of the Japanese national character. The results illustrated that the double cross-based method is much more accurate than the traditional stratified method, and is less likely to produce illogical estimates. Tsuchiya et al. (2007) conducted an experimental web survey to compare the direct questioning technique and the ICT. Compared with the direct questioning technique, the ICT yielded higher estimates of the proportion of shoplifters by nearly 10 percentage points, whereas the difference between the estimates from the two techniques was mostly insignificant for the innocuous item of blood donation. Imai (2011) proposed new nonlinear least squares and maximum likelihood estimators for efficient multivariate regression analysis with the ICT.

Kuha and Jackson (2014) analyzed item count survey data on the illegal behavior of buying stolen goods. They found that the analysis of an item count question is best formulated as an instance of modeling incomplete categorical data, and they proposed an efficient implementation of the estimation which also provides explicit variance estimates for the parameters. Walter and Laier (2014) compared the methodological pros and cons of ICT with those of direct questioning (DQ). They presented findings from a face-to-face survey of 552 respondents who had all been convicted under criminal law prior to the survey. The results showed, first, that subjective measures of survey quality such as trust in anonymity or willingness to respond were not affected positively by ICT, with the exception that interviewers felt less uncomfortable asking sensitive questions in ICT format than in DQ format. Second, all prevalence estimates of self-reported delinquent behaviors were significantly higher in ICT than in DQ format. Third, a regression model on determinants of response behavior indicated that the effect of ICT on response validity varies by gender. Overall, their results were in support of ICT.

Hussain and Shabbir (2010) and Hussain et al. (2012) proposed two modifications to the usual ICT that was proposed by Droitcour et al. (1991), and they showed that their estimators are always more efficient than the estimator of the usual ICT.

Since the two main concerns with randomized response models and their alternatives are the respondent’s privacy and the efficiency of the estimators, in this paper we propose an alternative modification of the usual ICT that produces two estimators of the parameter of the item count model, and we study their properties. The first proposed estimator, which is proven to be more efficient than the second, is also more efficient than the estimators of the other ICT models. In addition, this alternative modification provides full protection of the respondent’s privacy.

The remainder of this paper is organized as follows. Section 2 presents the usual ICT proposed by Droitcour et al. (1991) and Hussain et al.’s (2012) modification of it. The alternative modification of the usual ICT, together with the two estimators of the parameter of the ICT model and their properties, is presented in Section 3. The relative efficiency of the two proposed estimators is examined in Section 4. The first proposed estimator, which is the more efficient of the two, is compared with the estimator of Droitcour et al.’s (1991) ICT and with the estimator of Hussain et al.’s (2012) ICT in Section 5. The optimal sample size n that minimizes the variance of the estimator, assuming that the cost of conducting the survey is fixed, is determined in Section 6. Section 7 is devoted to conclusions and discussion.

2. The Usual ICT and Its Modification

This section presents the usual ICT proposed by Droitcour et al. (1991) and its modification proposed by Hussain et al. (2012).

2.1. The Usual ICT

The usual ICT was introduced by Droitcour et al. (1991). It consists of selecting two independent samples of sizes

and

The

th respondent in the first sample is given a list of

innocuous items and asked to report the total number, say

of items that are applicable to her/him. Similarly, the

th respondent in the second sample is provided another list of

items including the sensitive item and asked to report a total number, say

of the items that are applicable to her/him. The

innocuous items may or may not be the same in both samples. An unbiased estimator of the proportion of the population possessing the sensitive item is given by

(2.1)

and its variance is given by

(2.2)
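As a rough illustration of the two-sample design described in this subsection, the following Python sketch simulates the usual ICT and estimates the sensitive proportion by the difference between the treatment-sample and control-sample mean counts, which is the standard form of the ICT estimator. The names `pi_true`, `thetas`, `simulate_counts` and `ict_estimate`, the use of the same innocuous items in both samples, and the assumption that items are answered independently are illustrative choices rather than notation taken from the paper.

```python
import random


def simulate_counts(n, thetas, pi_true=None, rng=random):
    """Simulate reported totals for n respondents.

    thetas  : assumed prevalences of the innocuous items
    pi_true : assumed prevalence of the sensitive item; None for the
              control sample, whose list has no sensitive item
    """
    counts = []
    for _ in range(n):
        # Number of innocuous items that apply to this respondent.
        total = sum(rng.random() < t for t in thetas)
        if pi_true is not None:
            # Treatment list: one extra (sensitive) item.
            total += rng.random() < pi_true
        counts.append(total)
    return counts


def ict_estimate(control_counts, treatment_counts):
    """Difference-in-means estimate of the sensitive proportion."""
    mean_control = sum(control_counts) / len(control_counts)
    mean_treatment = sum(treatment_counts) / len(treatment_counts)
    return mean_treatment - mean_control


if __name__ == "__main__":
    rng = random.Random(2016)
    thetas = [0.5, 0.3, 0.6, 0.4]  # hypothetical innocuous prevalences
    control = simulate_counts(5000, thetas, rng=rng)
    treatment = simulate_counts(5000, thetas, pi_true=0.2, rng=rng)
    print(round(ict_estimate(control, treatment), 3))
```

Only the reported totals enter the estimate, so no individual answer to the sensitive item is ever observed directly.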

2.2. A Modification of ICT

This modification was suggested by Hussain et al. (2012). Each respondent in a sample of size

is provided a questionnaire (list of questions) consisting of

questions. The

th question consists of queries about an unrelated item

and a sensitive characteristic

The respondent is requested to count 1 if she/he possesses at least one of the characteristics

otherwise, count zero, as a response to the

th question, and to report the total count based on the entire questionnaire. They let

denote the total count of the

th respondent, and wrote it mathematically as

(2.3)

where

takes the values 1 and zero with probabilities

and

respectively.

(2.4)

This suggests an unbiased estimator of

(2.5)

with variance

(2.6)
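The counting rule described in this subsection can be illustrated with a small simulation. The sketch below is a reading of the verbal description only: it assumes that the sensitive characteristic and the unrelated items are independent and that the prevalences of the unrelated items (here called `thetas`) are known. Under those assumptions a respondent with the sensitive characteristic counts 1 on every question, so the expected total is g·π + (1 − π)·Σθ for a questionnaire of g questions, and a moment estimator follows by solving for π. The symbol names and the estimator form are assumptions of this sketch, not a transcription of equations (2.3) to (2.6).

```python
import random


def simulate_modified_counts(n, thetas, pi_true, rng=random):
    """Total counts when each question scores 1 if the respondent has
    the unrelated item of that question OR the sensitive characteristic."""
    counts = []
    for _ in range(n):
        sensitive = rng.random() < pi_true
        total = sum(1 for t in thetas if sensitive or rng.random() < t)
        counts.append(total)
    return counts


def modified_ict_estimate(counts, thetas):
    """Moment estimator from E[Y] = g*pi + (1 - pi)*sum(thetas)."""
    g = len(thetas)
    s = sum(thetas)
    mean_count = sum(counts) / len(counts)
    return (mean_count - s) / (g - s)


if __name__ == "__main__":
    rng = random.Random(2012)
    thetas = [0.5, 0.3, 0.6, 0.4]  # hypothetical known prevalences
    counts = simulate_modified_counts(5000, thetas, pi_true=0.2, rng=rng)
    print(round(modified_ict_estimate(counts, thetas), 3))
```

As long as every assumed prevalence in `thetas` is below 1, the denominator `g - s` is positive and the estimate is well defined.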

3. The Proposed Alternative Modification of the Usual ICT

Let a random sample of size

be selected using simple random sampling with replacement (SRSWR). The

th respondent is provided with a list of

items, comprising

innocuous items and one sensitive item, and is asked to:

- First, count the number, say

of the items that are applicable to her/him based on the entire list.

- Second, report how far away the produced number

is from

if she/he has the sensitive item, or report the produced number

if she/he does not have it. This idea is due to Christofides (2003).

The survey procedures are performed under the assumptions that the sensitive and innocuous items are unrelated and independent. The

items are arranged randomly in the list. This technique improves the privacy protection of the respondents.

3.1. The First Proposed Estimator

Let

the true proportion of the population with the sensitive item.

the true proportion of the population without the sensitive item.

Since the

th item may be an innocuous or a sensitive item, and letting

be the total number produced by the

th respondent using the ICT,

can be written as follows

(3.1)

where

Then, the expected value of

can be written as

(3.2)

Then, from equation (3.2) we find

(3.3)

The expected value of

is an unbiased estimator of

Also,

(3.4)

(3.5)

3.2. The Second Proposed Estimator

Let

the total number reported by the

th respondent in the sample

and

(3.6)

where

the probability that the produced total number by the

th respondent is

the probability that the produced total number by the

th respondent is

the proportion of the respondents that report

For the

th respondent

and similarly

(3.7)

(3.8)

The expected value of

is an unbiased estimator of

Also

(3.9)

(3.10)

where

(3.11)

and

(3.12)

where

is the proportion of the respondents that report

in the sample.

From equations, of

(3.2) and (3.7) we find

(3.13)

4. The Relative Efficiency of the Proposed Estimators of

In this section we present efficiency comparison of the proposed estimator

with the proposed estimator

From equations (3.5) and (3.10), the relative efficiency of the two unbiased estimators

and

(4.1)

We use the following property of the weighted mean.

Property: Consider a set of real positive numbers

all are less than a positive real number

and there are

positive weights

such that

then

Proof:

which is always true for all values of

and

(4.2)

which is also always true for all values of

and

Also, from equation (3.9) we find

(4.3)

and from equation (3.4) we find

(4.4)

By subtracting equation (4.4) from equation (4.3)

According to equation (4.2)

has smaller variance than

is less efficient than

We have calculated

and the relative efficiency of the proposed estimator

relative to the proposed estimator

for

and for

and

at each value of

The results are provided in Table (1) in the Appendix. Table (1) indicates that:

- The proposed estimator

is more efficient than the proposed estimator

- The sample size

does not have a significant effect on the relative efficiency of the two proposed estimators.

5. Efficiency Comparisons

In this section we present efficiency comparisons of the estimator

of the proposed ICT with the estimator

of the usual ICT and with the estimator

of Hussain et al.’s ICT.

5.1. Proposed Estimator Versus the Estimator

We compare the relative efficiency of the proposed estimator

with the estimator

in both cases of having and not having unequal

In the case of having unequal

the proposed estimator

would be more efficient than the estimator

From equations (2.2) and (3.5), we have

(5.1)

since

and

which are always true for all values of

Moreover, in the case of having

(which is a difficult, if not impossible, case), we find

(5.2)

(5.3)

and

(5.4)

since

and

which is always true for all values of

Then

is more efficient than

5.2. Proposed Estimator Versus the Estimator

We compare the relative efficiency of the proposed estimator

with the estimator

in both cases of having and not having unequal

In the case of having unequal

the proposed estimator

would be more efficient than the estimator

From equations (2.6) and (3.5), we find

(5.5)

We have calculated the relative efficiency of the proposed estimator

relative to the estimator

for

and for

and

at each value of

The results are arranged in Table (2).

Moreover, in the case of having

(which is a difficult, if not impossible, case), we find

(5.6)

From equations (5.3) and (5.6) we find

(5.7)

Also, we have calculated the relative efficiency of the proposed estimator

relative to the estimator

in the case of having

for

and for

and

at each value of

The results are arranged in Table (3). Tables (2) and (3) indicate that:

- For

the proposed estimator

is more efficient than the estimator

for all values of

- For

the proposed estimator

is less efficient than the estimator

for all values of

- For

the proposed estimator

is more efficient than the estimator

for all values of

- The sample size

does not have a significant effect on the relative efficiency of the proposed estimator

relative to

- In the case of having

the two estimators

and

are approximately equally efficient for all values of

and

6. The Optimal Sample Size

In sampling surveys it is necessary to find the value of n such that the variance of the estimator is minimized under a fixed cost, or the cost is minimized for a predetermined variance of the estimator (see Christofides (2005)). Consider the case of minimizing the variance of the estimator assuming that the cost of conducting the survey is fixed. Suppose that we have the following linear cost function

(6.1)

where

the general (fixed) cost of the survey,

the cost of interviewing an individual in the sample,

the total cost of the survey.

Then we want to find the optimal values of

which minimize the variance of

subject to a fixed cost. Thus, we have the following optimization problem

subject to

(6.2)

Using the method of Lagrange multipliers, we can equivalently have the unconstrained minimization problem

where

Taking partial derivatives, we find

(6.4)

(6.5)

From equations (6.4) and (6.5), we find

(6.6)

(6.7)

Solving the system of equations (6.6) and (6.7), it follows that

(6.8)
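For a design with a single sample size and a linear cost function, the fixed-cost constraint pins down the sample size directly, so the Lagrange conditions reduce to simple arithmetic. The sketch below assumes a cost function of the form total cost = fixed cost + (cost per interview) × n and an estimator variance of the form V/n, where V is a per-respondent variance. The names `total_cost`, `fixed_cost`, `unit_cost` and `per_respondent_variance`, and the numerical values, are illustrative and are not the paper's notation or data.

```python
import math


def optimal_sample_size(total_cost, fixed_cost, unit_cost):
    """Largest n affordable under total_cost = fixed_cost + unit_cost * n."""
    if unit_cost <= 0 or total_cost <= fixed_cost:
        raise ValueError("budget must exceed the fixed cost; unit cost must be positive")
    return math.floor((total_cost - fixed_cost) / unit_cost)


def minimal_variance(per_respondent_variance, n):
    """Variance of a sample-mean-type estimator, assumed to be V / n."""
    return per_respondent_variance / n


# Hypothetical budget figures, for illustration only.
n_opt = optimal_sample_size(total_cost=10000, fixed_cost=1000, unit_cost=15)
print(n_opt, minimal_variance(1.44, n_opt))
```

Because V/n decreases as n grows, the minimizing n under these assumptions is simply the largest sample the budget allows, which is why the constraint alone determines it.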

Substituting this value of n into equation (3.4) (Section 3), we obtain the minimal value of the variance

as follows:

7. Conclusions and Discussions

The aim of this article was to provide an alternative modification of the usual ICT for estimating the proportion of a sensitive characteristic in fields such as health care. The main features of this alternative modification are as follows. First, it does not require selecting two subsamples of sizes

and

Therefore, we need not worry about the optimal values

and

as is the case for the usual ICT estimator

Second, the first estimator of this alternative modification has been proven to be more efficient than the estimators of the other ICT techniques. Third, this technique provides full protection of the respondent’s privacy, since the number reported by the respondent may differ from the produced number and each reported number falls in the range

and so may come from a respondent who has or does not have the sensitive item. This technique produces two estimators of the proportion. We proved that the first proposed estimator is more efficient than the second one. We also proved that the first proposed estimator of the proposed item count technique is more efficient than the estimator of the usual ICT

Also, it has been observed that the first proposed estimator performs better than the estimator

of Hussain et al.’s (2012) ICT when

(which is the logical event). We determined the optimal sample size n, in the case of minimizing the variance of the estimator, assuming that the cost of conducting the survey is fixed. In summary, based on the findings of Sections 4 and 5, and the concluding discussion above we recommend the use of the proposed ICT with the first proposed estimator in surveys about sensitive items instead of the usual ICT and Hussain et al.’s item count technique.

ACKNOWLEDGMENTS

The author is especially grateful to Dr. Tasos C. Christofides of Cyprus University, Nicosia, Cyprus; Dr. Zawar Hussain of the Faculty of Sciences, King Abdul-Aziz University, Jeddah, Kingdom of Saudi Arabia; and Dr. Ejaz Ali Shah and Dr. Javid Shabbir of Quaid-I-Azam University, Islamabad, Pakistan.

Also, the author is grateful to the Editor-in-Chief and to the learned referee for their valuable suggestions regarding improvement of the paper.

Appendix

Table (1). The percent relative efficiency of the two proposed estimators

Table (2). The percent relative efficiency of

in the case of not having

Table (3). The percent relative efficiency of

in the case of having

References

[1]	Adebola, F., and Johnson, O., (2015): An Improved Warner’s Randomized Response Model. International Journal of Statistics and Applications, Vol. 5, pp. 263-267.
[2]	Barabesi, L., Dianna, G., and Perri, P., (2014): Horvitz-Thompson Estimation with Randomized Response and Non-Response Model. Assist. Statist. Appl., Vol. 9, pp. 3-10.
[3]	Blair, G., Imai, K., and Zhou, Y.-Y., (2015): Design and Analysis of the Randomized Response Technique. JASA, Vol. 110, pp. 1304-1319.
[4]	Chaudhuri, A., and Christofides, T., (2007): Item Count Technique in Estimating the Proportion of People with a Sensitive Feature. Journal of Statistical Planning and Inference, Vol. 137, pp. 589–593.
[5]	Christofides, T., (2003): A Generalized Randomized Response Technique. Metrika, Vol. 57, pp. 195-200.
[6]	Christofides, T., (2005): Randomized Response in Stratified Sampling. Journal of Statistical Planning and Inference, Vol. 128, pp. 303-310.
[7]	Dalton, D., Wimbush, J., and Daily, C. M., (1994): Using the Unmatched Count Technique (UCT) to Estimate the Base Rates for Sensitive Behavior. Personnel Psychology, Vol. 47, pp. 817–828.
[8]	Droitcour, J., Caspar, R., Hubbard, M., Parsley, T., Visscher, W., and Ezzati, T., (1991): The Item Count Technique as a Method of Indirect Questioning: a Review of its Development and a Case Study Application, in Measurement Errors in Surveys, Biemer, P., Groves, R., Lyberg, L., Mathiowetz, N., and Sudman, S., (editors). Wiley, New York.
[9]	Droitcour, J., and Larson, E., (2002): An Innovative Technique for Asking Sensitive Questions: The Three Card Method. Sociological Methodological Bulletin, Vol. 75, pp. 5–23.
[10]	Greenberg, B., Abul-Ela, A., Simmons, W., and Horvitz, D., (1969): The Unrelated Question Randomized Response Model: Theoretical Framework. JASA, Vol. 64, pp. 520-539.
[11]	Hussain, Z., Ali Shah, E., and Shabbir, J., (2012): An Alternative Item Count Technique in Sensitive Surveys. Revista Colombiana de Estadística, Vol. 35, pp. 39-54.
[12]	Hussain, Z., and Shabbir, J., (2010): On Item Count Technique in Survey Sampling. Journal of Informatics and Mathematical Sciences, Vol. 2, pp. 161-169.
[13]	Imai, K., (2011): Multivariate Regression Analysis for the Item Count Technique. JASA, Vol. 106, pp. 407-416.
[14]	Kuha, J., and Jackson, J., (2014): The Item Count Method for Sensitive Survey Questions: Modeling Criminal Behavior. Journal of the Royal Statistical Society: Series C (Applied Statistics), Vol. 63, pp. 321-341.
[15]	Mangat, N., (1994): An Improved Randomized Response Strategy. Jour. Royal Statist. Soc. Ser. B, Vol. 56, pp. 93-95.
[16]	Mangat, N., Singh, R., and Singh, S., (1997): Violation of Respondent’s Privacy in Moors’ Model- its Rectification Through a Random Group Strategy Response Model. Commun. Statist. Theory Methods, Vol. 26, pp. 243-255.
[17]	Miller, J., (1985): The Nominative Technique: A New Method of Estimating Heroin Prevalence. NIDA Research Monograph, Vol. 57, pp. 104–124.
[18]	Moors, J., (1971): Optimization of the Unrelated Question Randomized Response Model. JASA, Vol. 66, pp. 627-629.
[19]	Tsuchiya, T., (2005): Domain Estimators for the Item Count Technique. Statistics Canada, Vol. 31, pp. 41-51.
[20]	Tsuchiya, T., Hirai, Y., and Ono, S., (2007): A Study of the Properties of the Item Count Technique. Public Opinion Quarterly, Vol. 71, pp. 253–272.
[21]	Tourangeau, R., and Yan, T., (2007): Sensitive Questions in Surveys. Psychological Bulletin, Vol. 133, pp. 859–883.
[22]	Saha, A., (2007): A Simple Randomized Response Technique in Complex Surveys. METRON-International Journal of Statistics, Vol. LXV, pp. 55-66.
[23]	Singh, H., and Tarray, T., (2013): An Alternative to Kim and Warde’s Mixed Randomized Response Technique. Statistica, Vol. LXXIII, pp. 379-402.
[24]	Warner, S., (1965): Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias. JASA, Vol. 60, pp. 63–69.
[25]	Walter, F., and Laier, B., (2014): The Effectiveness of the Item Count Technique in Eliciting Valid Answers to Sensitive Questions. An Evaluation in the Context of Self-Reported Delinquency. Survey Research Methods, Vol. 8, pp. 153-168.
[26]	Wimbush, J., and Dalton, D., (1997): Base Rate for Employee Theft: Convergence of Multiple Methods. Journal of Applied Psychology, Vol. 82, pp. 756–763.
