Analyzing available records

Getting an idea

An important first step in research, as we have said so often, is to define the question to be investigated. In studies based on the analysis of available data, this decision is obviously influenced by knowledge of what kinds of data are available. For example, you might begin with a vague idea about some question you want to investigate, discover some relevant data and, in light of these data, modify your question so that it can be answered with the data you discovered. Research questions can also emerge the other way around. You might discover a body of data and, in thinking about it, formulate a question that you choose to answer.

Conceptualization and operationalization

These processes are handled differently when one uses available data. When deciding whether to use available data, the investigator has to judge whether the data can serve as valid indicators of the concepts being investigated. Badri and Burchinal (1985), for example, used the ratio of girls to boys in school as an indicator of social change. Given the previous norm against equal education for girls relative to boys in rural areas of the Sudan, this ratio would appear to be a valid indicator of changes in social values regarding the education of girls.

When using available data, one also needs to check the level at which the raw data are aggregated. Raw data, which generally consist of the responses from informants, are seldom reported because, without analysis, they have little meaning. Instead, the raw data are combined or aggregated in some way. Sometimes the data are aggregated for comparisons between rural and urban areas of a country, sometimes by provinces within the country, and sometimes for the country as a whole. Before deciding to use certain data, you will have to be sure that the data are reported at the level of aggregation you want to use.
If you wanted to compare school attendance rates for certain towns, but the lowest level of aggregation was the province, the study could not be done. If data are reported at a lower level of aggregation, say for the administrative units that make up towns, you could combine the data for each town and continue with your analysis.

Methods used to analyze data from available records range from simple calculations, such as rates of change in variables, to more complex measures of association. After you have studied Chapters 16 through 19, you will have a better idea of the kinds of analyses that can be done.

Caution in interpreting results

The level at which data are aggregated also has to be taken into account when results are interpreted. An example will show why this is important. Suppose we obtain the number of votes cast in each of 200 voting districts for the candidates representing the socialist and conservative parties in some country. We could then calculate the percentage of voters who voted for each candidate in each of the 200 districts. Further, let's say we obtained data on the per capita income for these same districts. Suppose also that we were testing the hypothesis that voting for conservative candidates was positively related to per capita income; that is, the percentage voting for conservative candidates increased as the per capita income of the districts increased. Imagine also that the hypothesis was supported: as the per capita income of districts increased, so did the percentage of votes cast for the conservative candidates.

So far so good. We could conclude that the hypothesis was supported at the level of the voting district. But suppose we wanted to make a generalization at the level of the individuals, who, after all, were the ones who cast the votes. Now we have a problem: we have no data at the individual level. The data were aggregated at the level of the voting district. Each district might contain thousands of individuals.
And we know nothing about these individuals. We do not know who voted and, if they voted, whether they voted for the conservative or the socialist candidate, or what their income level is. Yet we want to generalize about individual behavior.

In situations like this, you might be tempted to infer that what was found at the aggregated level is also true at the individual level. But in doing so you could commit an ecological fallacy: that is, assuming without sufficient evidence that what is true at an aggregated level is also true at the individual level. Though probably unlikely, there remains a possibility that a large proportion of low-income persons voted for the conservative candidate or that a large proportion of high-income persons voted for the socialist candidate. With the data in hand, we could not exclude these possibilities. When aggregated data are used, one has to be careful in making generalizations at levels below that used in the analysis.
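
The voting example can be made concrete with a small sketch. Everything in it is invented for illustration and is not taken from any real election: the four districts, the two income levels, and the names `build_district`, `aggregate`, and `conservative_share_among_high_income` are all hypothetical. The sketch builds two sets of individual-level records that produce exactly the same district-level figures, yet describe opposite individual behavior.

```python
# Hypothetical data throughout: income levels, district sizes, and vote
# counts are made up purely to illustrate the ecological fallacy.
LOW, HIGH = 10, 30                      # two income levels, arbitrary units
DISTRICT_SIZE = 100                     # voters per district
HIGH_INCOME_COUNTS = [10, 20, 30, 40]   # high-income voters in each district


def build_district(n_high, rich_vote_conservative):
    """Return a list of (income, voted_conservative) records for one district.

    The district always casts n_high conservative votes; the flag only
    decides WHICH individuals cast them.
    """
    n_low = DISTRICT_SIZE - n_high
    if rich_vote_conservative:
        high = [(HIGH, True)] * n_high
        low = [(LOW, False)] * n_low
    else:
        high = [(HIGH, False)] * n_high
        low = [(LOW, True)] * n_high + [(LOW, False)] * (n_low - n_high)
    return high + low


def aggregate(district):
    """District-level summary: (per capita income, percent conservative)."""
    income = sum(person[0] for person in district) / len(district)
    pct = 100 * sum(person[1] for person in district) / len(district)
    return income, pct


def conservative_share_among_high_income(districts):
    """Individual-level figure: percent of high-income voters voting conservative."""
    votes = [voted for d in districts for income, voted in d if income == HIGH]
    return 100 * sum(votes) / len(votes)


# Scenario A: the high-income voters cast the conservative votes.
scenario_a = [build_district(h, True) for h in HIGH_INCOME_COUNTS]
# Scenario B: the same number of conservative votes, all cast by low-income voters.
scenario_b = [build_district(h, False) for h in HIGH_INCOME_COUNTS]

aggregates_a = [aggregate(d) for d in scenario_a]
aggregates_b = [aggregate(d) for d in scenario_b]

print("District-level data identical:", aggregates_a == aggregates_b)
print("Scenario A, high-income voting conservative (%):",
      conservative_share_among_high_income(scenario_a))
print("Scenario B, high-income voting conservative (%):",
      conservative_share_among_high_income(scenario_b))
```

In both scenarios the percentage of conservative votes rises with per capita income across districts, so the district-level hypothesis is supported either way. Only the individual records, which aggregated data do not provide, reveal whether high-income persons actually voted conservative.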