The following 5-year Summary File example is also applicable to the 1-year Summary File. Let's say that you want to create Table B08406, "Sex of Workers by Means of Transportation to Work for Workplace Geography," for the state of Alaska. Which files do you need? How do you read the files?
You will need the following files from www2.census.gov/acs2009_5yr/summaryfile/:
- The Sequence_Number_and_Table_Number_Lookup.xls spreadsheet
- The zipped file 20095ak0003000.zip containing the estimate file e20095ak0003000.txt, the margin of error file m20095ak0003000.txt, and the geography file g20095ak.txt. All these data files are in folder 2005-2009_ACSSF_By_State_All_Tables
Start with the Sequence_Number_and_Table_Number_Lookup.xls spreadsheet. Under the "Tblid" column, look for the value "B08406". You will see that the "Sequence Number" is 3, which is written as the four-digit value "0003" in file names. This means that the estimates you are looking for are in the data file "e20095ak0003000.txt". How do you know this is the right file? The file name tells you: the "e" stands for estimate, "2009" is the year, "5" means that these are 5-year estimates, "ak" is the state (Alaska), and "0003" is the sequence number (the sequence that contains the data for Table B08406). Likewise, you need the "m20095ak0003000.txt" file for the margins of error.
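The naming convention just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `summary_file_name` and its parameters are hypothetical, and the trailing "000" is copied unchanged from the example file names because its meaning is not decoded in this section.

```python
def summary_file_name(kind: str, year: int, period: int, state: str, sequence: int) -> str:
    """Build a data file name such as e20095ak0003000.txt from its parts."""
    if kind not in ("e", "m"):
        raise ValueError("kind must be 'e' (estimate) or 'm' (margin of error)")
    # The sequence number is zero-padded to four digits (3 becomes "0003").
    # The trailing "000" is taken as-is from the example file names.
    return f"{kind}{year}{period}{state.lower()}{sequence:04d}000.txt"


estimate_file = summary_file_name("e", 2009, 5, "ak", 3)
moe_file = summary_file_name("m", 2009, 5, "ak", 3)
print(estimate_file)
print(moe_file)
```

For the Alaska example, the two names produced are the estimate and margin of error files named in the text.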
Next, click on the folder 2005-2009_ACSSF_By_State_All_Tables. Then click on the file Alaska_All_Geographies_Not_Tracts_Block_Groups.zip and extract the files e20095ak0003000.txt, m20095ak0003000.txt, and g20095ak.txt.
When you open the estimate file, e20095ak0003000.txt, you will see the following comma-delimited fields on the first line:
ACSSF,2009e5,ak,000,0003,0000001,333471,266201,220762,45439,36132,5602,3705,...
The first six fields - from "ACSSF" to "0000001" - are identifiers:
- The first field tells you that this is an ACS Summary File
- The second tells you that these data are five-year estimates for the period 2005-2009 (notice the "e" after "2009" and the "5" at the end)
- The third tells you the state, e.g., "ak" is Alaska
- The fourth is an iteration number 000
- The fifth is the sequence number 0003
- The last is a logical record code LOGRECNO 0000001. Use LOGRECNO to determine the geographic area within a state.
These six identifiers begin each new line in the estimate file, and the same holds true for the margin of error files. You can compare these identifiers with those in the respective margin of error file, m20095ak0003000.txt.
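As a rough sketch of how the six identifiers might be separated from the data fields, the Python below splits one line on commas. The class name `RecordId`, its attribute names, and the function `split_record` are illustrative choices, not part of the Summary File itself. The sample line is the first record quoted above, cut off where the text elides the remaining fields.

```python
from typing import NamedTuple


class RecordId(NamedTuple):
    fileid: str
    filetype: str
    stusab: str
    chariter: str
    sequence: str
    logrecno: str


def split_record(line: str):
    """Split one comma-delimited line into (identifiers, data fields)."""
    fields = line.rstrip("\n").split(",")
    # Identifiers stay as strings so leading zeros ("0003", "0000001") survive.
    return RecordId(*fields[:6]), fields[6:]


# First line of e20095ak0003000.txt, limited to the values quoted in the text.
line = "ACSSF,2009e5,ak,000,0003,0000001,333471,266201,220762,45439,36132,5602,3705"
ident, data = split_record(line)
print(ident.stusab, ident.sequence, ident.logrecno, data[0])
```

The same function could be applied to a line from the margin of error file to compare its identifiers with those of the estimate file.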
Then use the geography file for Alaska to determine the location within the state to which the data refer. The appropriate file is g20095ak.txt, where "g" stands for geography, "2009" is the year, "5" is the period estimate (in this case, 5-year estimate), and "ak" is the state. The geography file, g20095ak.txt, defines the LOGRECNO. Each LOGRECNO in this file specifies a geographic area pertaining to the state. For example, a LOGRECNO of "0000001" means the state of Alaska; a LOGRECNO of "0000002" means just the urban areas in Alaska; a LOGRECNO of "0000003" refers to just rural areas in Alaska. (Each state geography file also contains the lower-case FIPS State Code.) Please be aware that each state has its own geography file. For more information, see Chapter 2.4.
Glancing back at the Sequence Number and Table Number Lookup file, you will see that Table B08406 in sequence "0003" begins at the seventh position. From this point forward, for 51 fields (the count is given in the same file), each field corresponds to the value of a "line number" in the table. So, field number seven, the 333471 value, corresponds to line number one, which is "Total". Field number eight, the 266201 value, refers to line number two, which is "Car, Truck, or Van." Field number nine, the 220762 value, corresponds to line number three, which is "Drove alone." This continues all the way up to line number 51, at which point Table B08406 ends.
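The mapping from field positions to table line numbers can be illustrated with a short Python sketch. The function name `table_cells` is hypothetical, and the code assumes every cell holds a plain integer, which may not be true of every record in a real file. Because the text quotes only the first seven data values, the sample yields seven cells rather than the full 51.

```python
def table_cells(fields, start_position, cell_count):
    """Return {line_number: value} for a table starting at a 1-based field position."""
    first = start_position - 1  # convert the 1-based position to a 0-based index
    cells = fields[first:first + cell_count]
    return {line_no: int(value) for line_no, value in enumerate(cells, start=1)}


fields = "ACSSF,2009e5,ak,000,0003,0000001,333471,266201,220762,45439,36132,5602,3705".split(",")

# Start position 7 and 51 cells come from the lookup spreadsheet, as described in the text.
b08406 = table_cells(fields, start_position=7, cell_count=51)

labels = {1: "Total", 2: "Car, truck, or van", 3: "Drove alone"}
for line_no, label in labels.items():
    print(line_no, label, b08406[line_no])
```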
Were you to read all these files into a computer program using software such as SAS, you could translate the first nine fields of e20095ak0003000.txt as follows:
TABLE B08406: SEX OF WORKERS BY MEANS OF TRANSPORTATION TO WORK FOR WORKPLACE GEOGRAPHY

| FILEID | FILETYPE | STUSAB | CHARITER | SEQUENCE | LOGRECNO | Total | Car, Truck, or Van | Drove Alone |
|---|---|---|---|---|---|---|---|---|
| ACSSF | 2009e5 | ak | 0 | 0003 | 0000001 | 333471 | 266201 | 220762 |
| ACSSF | 2009e5 | ak | 0 | 0003 | 0000013 | 2440 | 340 | 252 |
| ACSSF | 2009e5 | ak | 0 | 0003 | 0000014 | 5266 | 1046 | 620 |
| ACSSF | 2009e5 | ak | 0 | 0003 | 0000015 | 151263 | 135626 | 114936 |
| ACSSF | 2009e5 | ak | 0 | 0003 | 0000016 | 6421 | 2455 | 1586 |
Merging the geography file, the table shell, and the estimate and margin of error files together creates an excerpt of Table B08406, shown below:
| Table ID | Line Number | Sequence Number | Table Title | Estimates | Margin of Error |
|---|---|---|---|---|---|
| B08406 | | 003 | SEX OF WORKERS BY MEANS OF TRANSPORTATION TO WORK FOR WORKPLACE GEOGRAPHY | | |
| B08406 | | 003 | Universe: Workers 16 years and over | | |
| B08406 | 1 | 003 | Total: | 333,471 | +/-2,630 |
| B08406 | 2 | 003 | Car, truck, or van: | 266,201 | +/-2,589 |
| B08406 | 3 | 003 | Drove alone | 220,762 | +/-2,439 |
| B08406 | 4 | 003 | Carpooled: | 45,439 | +/-1,795 |
| B08406 | 5 | 003 | In 2-person carpool | 36,132 | +/-1,596 |
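One way to pair each estimate with its margin of error is sketched below in Python. The function name `merge_record` is hypothetical. The margin of error record is reconstructed from the values in the excerpt above rather than copied from the real file, and its second field ("2009m5") is an assumption. For that reason the check compares only the state, iteration, sequence number, and LOGRECNO.

```python
def merge_record(est_fields, moe_fields, start_position=7):
    """Pair estimates with margins of error for one logical record."""
    # Fields 3 to 6 (state, iteration, sequence, LOGRECNO) must agree in both files.
    if est_fields[2:6] != moe_fields[2:6]:
        raise ValueError("estimate and margin of error records do not match")
    first = start_position - 1
    pairs = zip(est_fields[first:], moe_fields[first:])
    return [(line_no, int(e), int(m)) for line_no, (e, m) in enumerate(pairs, start=1)]


est = "ACSSF,2009e5,ak,000,0003,0000001,333471,266201,220762,45439,36132".split(",")
# Reconstructed for illustration; "2009m5" is assumed, the values come from the excerpt.
moe = "ACSSF,2009m5,ak,000,0003,0000001,2630,2589,2439,1795,1596".split(",")

rows = merge_record(est, moe)
for line_no, e, m in rows:
    print(f"B08406 line {line_no}: {e:,} +/-{m:,}")
```

Attaching the line labels from the table shell and the area name from the geography file would follow the same pattern, keyed on line number and LOGRECNO respectively.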