The objective of the processing operation was to produce a set of data that describes the population as accurately and clearly as possible. In a major change from past practice, the information on Census 2000 questionnaires generally was not edited for consistency, completeness, or acceptability during field data collection or data capture operations. Census crew leaders and local office clerks reviewed enumerator-filled questionnaires for adherence to specified procedures. Unlike in previous censuses, mail return questionnaires were not subjected to clerical review, and households were not contacted to collect data that were missing from census returns.
Most census questionnaires received by mail from respondents as well as those filled by enumerators were processed through a new contractor-built image scanning system that used optical mark and character recognition to convert the responses into computer files. The optical character recognition, or OCR, process used several pattern and context checks to estimate accuracy thresholds for each write-in field. The system also used "soft edits" on most interpreted numeric write-in responses to decide whether the field values read by the machine interpretation were acceptable. If the value read had a lower than acceptable accuracy threshold or was outside the soft edit range, the image of the item was displayed to a keyer who then entered the response.
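The routing decision described above can be sketched in a few lines. This is only an illustration: the `FieldRead` type, the confidence score, and the threshold and range values are hypothetical and are not taken from the actual data capture system.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FieldRead:
    """One machine-interpreted numeric write-in field (hypothetical structure)."""
    value: int         # value produced by character recognition
    confidence: float  # assumed accuracy score in [0, 1]


def needs_keying(read: FieldRead,
                 min_confidence: float,
                 soft_range: Tuple[int, int]) -> bool:
    """Return True if the field image should be shown to a keyer.

    A field is routed to a keyer when its accuracy score is below the
    acceptable threshold or its value falls outside the soft edit range.
    """
    low, high = soft_range
    below_threshold = read.confidence < min_confidence
    outside_soft_edit = not (low <= read.value <= high)
    return below_threshold or outside_soft_edit


# Example with made-up numbers: an age field with an assumed range of 0 to 115.
print(needs_keying(FieldRead(value=34, confidence=0.99), 0.90, (0, 115)))
print(needs_keying(FieldRead(value=340, confidence=0.99), 0.90, (0, 115)))
```

The first call accepts the machine reading; the second sends the image to a keyer because the value lies outside the assumed soft edit range.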
To control the possible creation of erroneous people from questionnaires that contained stray marks or were completed incorrectly, the data capture system included an edit for the number of people indicated on each mail return and enumerator-filled questionnaire. If the edit failed, the questionnaire image was reviewed at a workstation by an operator. The operator identified erroneous person records and corrected OCR interpretation errors in the population count field.
At Census Bureau headquarters, the mail response data records were subjected to a computer edit that identified households exhibiting a possible coverage problem and those with more than six household members, the maximum number of people who could be enumerated on a mail questionnaire. Attempts were made to contact these households on the telephone to correct the count inconsistency and to collect census data for those people for whom there was no room on the questionnaire.
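A minimal sketch of such a headquarters edit follows. The text does not define what counted as a "possible coverage problem"; the sketch assumes, purely for illustration, that it means a mismatch between the reported household count and the number of person records on the form. The function and constant names are hypothetical.

```python
# Maximum number of people who could be enumerated on a mail questionnaire.
MAX_PEOPLE_ON_MAIL_FORM = 6


def needs_telephone_followup(reported_count: int, person_records: int) -> bool:
    """Flag a mail return for telephone follow-up.

    reported_count: household size reported on the questionnaire
    person_records: number of people actually listed with data
    """
    # More people than the form has room for.
    large_household = reported_count > MAX_PEOPLE_ON_MAIL_FORM
    # ASSUMPTION: a count mismatch stands in for a "possible coverage problem".
    count_mismatch = reported_count != person_records
    return large_household or count_mismatch


print(needs_telephone_followup(reported_count=8, person_records=6))
print(needs_telephone_followup(reported_count=3, person_records=3))
```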
Incomplete or inconsistent information on the questionnaire data records was assigned acceptable values using imputation procedures during the final automated edit of the collected data. As in previous censuses, the general procedure for changing unacceptable entries was to assign an entry for a person that was consistent with entries for people with similar characteristics. Assigning acceptable codes in place of blanks or unacceptable entries enhances the usefulness of the data.
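The general idea of assigning an entry consistent with people who have similar characteristics can be illustrated with a simple sequential donor scheme. This is a sketch under stated assumptions, not the Census Bureau's actual procedure: the matching rule, the field names, and the `imputed` flag are all hypothetical.

```python
from typing import Dict, List, Tuple


def impute_from_similar(records: List[dict],
                        target: str,
                        match_on: Tuple[str, ...]) -> List[dict]:
    """Fill missing `target` entries from the most recent similar record.

    Records are processed in order. For each combination of the `match_on`
    fields, the last acceptable `target` value is remembered and assigned
    to later records in the same group whose `target` is missing (None).
    The input list is not modified.
    """
    last_seen: Dict[tuple, object] = {}
    result = []
    for rec in records:
        key = tuple(rec[field] for field in match_on)
        filled = dict(rec)  # copy so the original record is untouched
        if rec.get(target) is None:
            if key in last_seen:
                filled[target] = last_seen[key]
                filled["imputed"] = True  # keep track of assigned values
        else:
            last_seen[key] = rec[target]
        result.append(filled)
    return result


people = [
    {"age_group": "25-34", "sex": "F", "marital_status": "married"},
    {"age_group": "25-34", "sex": "F", "marital_status": None},
]
for row in impute_from_similar(people, "marital_status", ("age_group", "sex")):
    print(row)
```

A record with no earlier donor in its group is left blank in this sketch; a production edit would need a fallback rule for that case.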
Another way in which corrections were made during the computer editing process was substitution. Substitution assigned a full set of characteristics for people in a household. If there was an indication that a household was occupied by a specified number of people but the questionnaire contained no information for people within the household, or the occupants were not listed on the questionnaire, the Census Bureau selected a previously accepted household of the same size and substituted its full set of characteristics for this household.
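Substitution can be sketched as a lookup for a donor household of the same size. The text does not say how the donor was chosen among eligible households; the sketch assumes the most recently accepted one, and the data layout (a household as a list of person records) is hypothetical.

```python
import copy
from typing import List, Optional


def substitute_household(expected_size: int,
                         accepted_households: List[List[dict]]) -> Optional[List[dict]]:
    """Return a copy of a previously accepted household of the same size.

    accepted_households is ordered from oldest to newest. The most recently
    accepted household with `expected_size` people is used as the donor
    (ASSUMPTION: the actual donor selection rule is not given in the text).
    Returns None if no household of that size has been accepted yet.
    """
    for donor in reversed(accepted_households):
        if len(donor) == expected_size:
            # Deep copy so later changes do not alter the donor household.
            return copy.deepcopy(donor)
    return None


accepted = [
    [{"age": 41, "sex": "F"}, {"age": 43, "sex": "M"}],
    [{"age": 70, "sex": "F"}],
    [{"age": 29, "sex": "M"}, {"age": 27, "sex": "F"}],
]
print(substitute_household(2, accepted))
```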
Table A. Unadjusted Standard Error for Estimated Totals
[Based on a 1-in-6 simple random sample]

Estimated total¹ (rows) by size of publication area² (columns)

| Estimated total | 500 | 1,000 | 2,500 | 5,000 | 10,000 | 25,000 | 50,000 | 100,000 | 250,000 | 500,000 | 1,000,000 | 5,000,000 | 10,000,000 | 25,000,000 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 15 | 15 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 |
| 100 | 20 | 21 | 22 | 22 | 22 | 22 | 22 | 22 | 22 | 22 | 22 | 22 | 22 | 22 |
| 250 | 25 | 31 | 34 | 34 | 35 | 35 | 35 | 35 | 35 | 35 | 35 | 35 | 35 | 35 |
| 500 | - | 35 | 45 | 47 | 49 | 49 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 |
| 1,000 | - | - | 55 | 63 | 67 | 69 | 70 | 70 | 71 | 71 | 71 | 71 | 71 | 71 |
| 2,500 | - | - | - | 79 | 97 | 106 | 109 | 110 | 111 | 112 | 112 | 112 | 112 | 112 |
| 5,000 | - | - | - | - | 112 | 141 | 150 | 154 | 157 | 157 | 158 | 158 | 158 | 158 |
| 10,000 | - | - | - | - | - | 173 | 200 | 212 | 219 | 221 | 222 | 223 | 223 | 224 |
| 15,000 | - | - | - | - | - | 173 | 229 | 252 | 266 | 270 | 272 | 273 | 274 | 274 |
| 25,000 | - | - | - | - | - | - | 250 | 306 | 335 | 345 | 349 | 353 | 353 | 353 |
| 75,000 | - | - | - | - | - | - | - | 306 | 512 | 565 | 589 | 608 | 610 | 611 |
| 100,000 | - | - | - | - | - | - | - | - | 548 | 632 | 671 | 700 | 704 | 706 |
| 250,000 | - | - | - | - | - | - | - | - | - | 791 | 968 | 1,090 | 1,104 | 1,112 |
| 500,000 | - | - | - | - | - | - | - | - | - | - | 1,118 | 1,500 | 1,541 | 1,565 |
| 1,000,000 | - | - | - | - | - | - | - | - | - | - | - | 2,000 | 2,121 | 2,191 |
| 5,000,000 | - | - | - | - | - | - | - | - | - | - | - | - | 3,536 | 4,472 |
| 10,000,000 | - | - | - | - | - | - | - | - | - | - | - | - | - | 5,477 |

¹ For estimated totals larger than 10,000,000, the standard error is somewhat larger than the table values. Use the formula given below to calculate the standard error.
² The total count of people, housing units, households, or families in the area if the estimated total is a person, housing unit, household, or family characteristic, respectively.
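$$SE(\hat{Y}) = \sqrt{5\,\hat{Y}\left(1 - \frac{\hat{Y}}{N}\right)}$$

where $N$ is the size of the publication area and $\hat{Y}$ is the estimate of the characteristic total.

The 5 in the above equation is based on a 1-in-6 sample and is derived from the inverse of the sampling rate minus one, i.e., 5 = 6 - 1.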
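As a rough check, the entries in Table A are consistent with an unadjusted standard error of the form sqrt(5 × Y × (1 − Y/N)), where Y is the estimated total and N is the size of the publication area. The sketch below assumes that form; the function and parameter names are illustrative only.

```python
import math


def se_total(estimate: float, area_size: float, inverse_rate: float = 6.0) -> float:
    """Unadjusted standard error of an estimated total.

    estimate:     estimated total of the characteristic (Y)
    area_size:    size of the publication area (N)
    inverse_rate: inverse of the sampling rate (6 for a 1-in-6 sample)
    """
    factor = inverse_rate - 1.0  # the "5" in the formula: 6 - 1
    return math.sqrt(factor * estimate * (1.0 - estimate / area_size))


# Spot checks against Table A (values rounded to whole numbers).
print(round(se_total(50, 500)))                 # table entry: 15
print(round(se_total(2_500, 5_000)))            # table entry: 79
print(round(se_total(10_000_000, 25_000_000)))  # table entry: 5,477
```

The same function covers estimated totals larger than 10,000,000, for which the table directs the reader to the formula.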
Table B. Unadjusted Standard Error in Percentage Points for Estimated Percentages
[Based on a 1-in-6 simple random sample]

Estimated percentage (rows) by base of estimated percentage¹ (columns)

| Estimated percentage | 500 | 750 | 1,000 | 1,500 | 2,500 | 5,000 | 7,500 | 10,000 | 25,000 | 50,000 | 100,000 | 250,000 | 500,000 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 or 98 | 1.4 | 1.1 | 1.0 | 0.8 | 0.6 | 0.4 | 0.4 | 0.3 | 0.2 | 0.1 | 0.1 | 0.1 | 0.0 |
| 5 or 95 | 2.2 | 1.8 | 1.5 | 1.3 | 1.0 | 0.7 | 0.6 | 0.5 | 0.3 | 0.2 | 0.2 | 0.1 | 0.1 |
| 10 or 90 | 3.0 | 2.4 | 2.1 | 1.7 | 1.3 | 0.9 | 0.8 | 0.7 | 0.4 | 0.3 | 0.2 | 0.1 | 0.1 |
| 15 or 85 | 3.6 | 2.9 | 2.5 | 2.1 | 1.6 | 1.1 | 0.9 | 0.8 | 0.5 | 0.4 | 0.3 | 0.2 | 0.1 |
| 20 or 80 | 4.0 | 3.3 | 2.8 | 2.3 | 1.8 | 1.3 | 1.0 | 0.9 | 0.6 | 0.4 | 0.3 | 0.2 | 0.1 |
| 25 or 75 | 4.3 | 3.5 | 3.1 | 2.5 | 1.9 | 1.4 | 1.1 | 1.0 | 0.6 | 0.4 | 0.3 | 0.2 | 0.1 |
| 30 or 70 | 4.6 | 3.7 | 3.2 | 2.6 | 2.0 | 1.4 | 1.2 | 1.0 | 0.6 | 0.5 | 0.3 | 0.2 | 0.1 |
| 35 or 65 | 4.8 | 3.9 | 3.4 | 2.8 | 2.1 | 1.5 | 1.2 | 1.1 | 0.7 | 0.5 | 0.3 | 0.2 | 0.2 |
| 50 | 5.0 | 4.1 | 3.5 | 2.9 | 2.2 | 1.6 | 1.3 | 1.1 | 0.7 | 0.5 | 0.4 | 0.2 | 0.2 |

¹ For a percentage and/or base of percentage not shown in the table, use the formula given below to calculate the standard error. Use this table only for proportions; that is, where the numerator is a subset of the denominator.
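$$SE(\hat{p}) = \sqrt{\frac{5}{B}\,\hat{p}\,(100 - \hat{p})}$$

where $B$ is the base of the estimated percentage and $\hat{p}$ is the estimated percentage.

The 5 in the above equation is based on a 1-in-6 sample and is derived from the inverse of the sampling rate minus one, i.e., 5 = 6 - 1.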
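The entries in Table B are likewise consistent with a standard error of the form sqrt((5/B) × p × (100 − p)), where p is the estimated percentage and B is its base. The sketch below assumes that form and can be used for percentages or bases not shown in the table; the function and parameter names are illustrative only.

```python
import math


def se_percent(percentage: float, base: float, inverse_rate: float = 6.0) -> float:
    """Unadjusted standard error, in percentage points, of an estimated percentage.

    percentage:   estimated percentage (p), on a 0 to 100 scale
    base:         base of the estimated percentage (B)
    inverse_rate: inverse of the sampling rate (6 for a 1-in-6 sample)
    """
    factor = inverse_rate - 1.0  # the "5" in the formula: 6 - 1
    return math.sqrt(factor / base * percentage * (100.0 - percentage))


# Spot checks against Table B (values rounded to one decimal place).
print(round(se_percent(2, 500), 1))    # table entry: 1.4
print(round(se_percent(50, 500), 1))   # table entry: 5.0
print(round(se_percent(35, 750), 1))   # table entry: 3.9
```

Because the formula is symmetric in p and 100 − p, an estimate of 98 percent has the same standard error as one of 2 percent, which is why the table pairs those rows.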