Practical considerations, however, often limit the application of the guidelines just described. The intended effect of these guidelines is to spread the sample as widely as possible over the area representing the population. While a wider spread increases the chances of obtaining a representative sample, it also increases the time, effort, and cost required to conduct the interviews. Selecting a smaller number of primary clusters concentrates interviewing in fewer areas, making it easier to complete the required fieldwork, but it also increases the possibility that the final sample will be less representative of the population being studied. In practice, researchers balance these two competing conditions: to the extent practical, they use larger numbers of primary and secondary clusters and fewer ultimate sampling units.

A final word: any time you cannot find a suitable sampling frame, consider using a cluster sample. Some form of clustering, whether in terms of geographical areas or institutions, is almost always possible.

Creativity in sampling

Different sampling techniques can be creatively combined to meet the requirements of an investigation. For example, ElTigani (1989) used a multistage sampling design, but with an adaptation. He first selected two primary sampling units within the Geriza area of the Sudan. In addition, he purposely selected another primary unit because it contained a health clinic. Since his study focused on child health, he wanted to see whether the clinic made a difference in the health of children in the area it served. In another variation, Khamis and Alsumi (1988) first stratified their sample by identifying rural and urban areas for sampling and then used systematic sampling to select households within each of six selected areas. These variations illustrate how sampling methods can be combined and adapted to meet the requirements of research.
Weighted samples

Probability sampling, in its simplest form, assumes that each sampling element has an equal chance of being selected. In our example of stratified random sampling, however, females had a much greater chance of being selected than males. Disproportionate sampling of females versus males raises no problems as long as the data from each sample are used only to generate estimates of the parameters for the corresponding sub-population. Results from the two samples could also be used in comparisons between the two groups. We could do either of these things because each sample was a valid probability sample of its own sub-population, males or females.

The separate data for each sample, however, cannot be combined as they stand to generalize to the entire student population. While we had probability samples for each sub-population based on gender, we did not have a valid probability sample for the entire student body: females had a much greater chance of being selected than males.

There is a way, however, to correct for the disproportionate selection of females. The combined sample has to be adjusted to offset the greater representation of females relative to males. Here is one way this can be done. In our illustration, the chance of selecting a man was one-fifth that of selecting a woman. If we wish to combine the data from both samples into one sample that represents the entire student population, we have to correct this imbalance. Since we know males are underrepresented by a factor of 5, we could give any data for males a weight of 5 compared to data for females. Accordingly, any measurement for males would be multiplied by 5, while those for females would be left as obtained. As an alternative, we could weight a statistic, such as the mean, for the male sample before combining it with the mean for the female sample.
A mean for the male sample would be multiplied by 5 and then added to the mean for the female sample, after which the sum would be divided by 6, the total of the weights (5 for the male mean plus 1 for the female mean). For example, suppose the mean for some variable was 20 for the male sample and 17 for the female sample. The mean for the total sample would be 20 times 5 for the male sample plus 17 for the female sample, or 100 plus 17, which equals 117. Dividing 117 by 6 gives a weighted mean of 19.5 for the entire sample.

Weighting can become complicated and involve some difficult calculations. This is particularly true when area sampling is used. When the clusters at any stage contain different numbers of units (households, etc.), weighting is necessary to correct for the disproportionate selection of various clusters. This problem can be avoided by creating clusters with approximately the same number of units in each before selecting the clusters at any stage in the process. When this is done, the sample becomes self-weighting. Data from the final clusters can then be safely combined to make estimates of characteristics of the population.
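
The weighted-mean arithmetic in the male and female example can be sketched in a few lines of Python. This is only an illustration: the function name `weighted_mean` is hypothetical, and the sketch assumes that the male and female samples are of equal size, which is what allows the two group means to be combined with weights of 5 and 1.

```python
def weighted_mean(means, weights):
    """Combine group means into one estimate, using one weight per group.

    Hypothetical helper for illustration. Assumes the groups were sampled
    with equal sample sizes, so the weights alone correct the imbalance.
    """
    if len(means) != len(weights):
        raise ValueError("means and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    # Multiply each mean by its weight, add the products,
    # then divide by the total of the weights.
    return sum(m * w for m, w in zip(means, weights)) / total_weight


# Figures from the text: male mean 20 with weight 5, female mean 17 with weight 1.
combined = weighted_mean([20, 17], [5, 1])
print(combined)  # 19.5
```

Dividing by the total of the weights rather than by the number of groups is what keeps the result on the same scale as the original means. With equal weights, the function reduces to an ordinary average of the group means.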