Design and Methodology: American Community Survey
Chapter 3. Frame Development
The sampling frame used for the American Community Survey (ACS) is an extract from the national Master Address File (MAF), which is maintained by the U.S. Census Bureau and is the source of addresses for the ACS, other Census Bureau demographic surveys, and the decennial census. The MAF is the Census Bureau's official inventory of known living quarters (housing units [HUs] and group quarters [GQs] facilities) and selected nonresidential units (public, private, and commercial) in the United States and Puerto Rico. It contains mailing and location address information, geocodes, and other attribute information about each living quarter. (A geocoded address is one for which state, county, census tract, and block have been identified.)
The MAF is linked to the Topologically Integrated Geographic Encoding and Referencing (TIGER®) system. TIGER® is a database containing a digital representation of all census-required map features and related attributes. It is a resource for the production of maps, data tabulation, and the automated assignment of addresses to geographic locations in geocoding.
The initial MAF was created for Census 2000 using multiple sources, including the 1990 Address Control File, the U.S. Postal Service's (USPS's) Delivery Sequence File (DSF), field listing operations, and addresses supplied by local governments through partnership operations. The MAF was used as the initial frame for the ACS, in its state of existence at the conclusion of Census 2000. The Census Bureau continues to update the MAF using the DSF and various automated, clerical, and field operations, such as the Demographic Area Address Listing (DAAL).
The remainder of this chapter provides detailed information on the development of the ACS sampling frame. Section B provides basic information about the MAF and its contents. Sections C and D describe the MAF development and update activities for HUs in the United States and Puerto Rico. Section E describes the MAF development and ACS GQ data collection activities. Finally, Section F describes the ACS extracts from the MAF.
Master Address File Content
The MAF is the Census Bureau's official inventory of known HUs and GQs in the United States and Puerto Rico. Each HU and GQ is represented by a separate MAF record that contains some or all of the following information: geographic codes, a mailing and/or location address, the physical state of the unit or any relationship to other units, residential or commercial status, latitude and longitude coordinates, and source and history information indicating the operation(s) (see Section C) that add/update the record. This information is gathered from the MAF and provided to ACS in files called MAF extracts (see Section F).
The geographic codes in the MAF, some of which come from the TIGER® database, identify a variety of areas, including states, counties, county subdivisions, places,1 American Indian areas, Alaska Native areas, Hawaiian Homelands, census tracts, block groups, and blocks. Two of the MAF's important geographic code sets are the Census 2000 tabulation geography set, based on the January 1, 2000, legal boundaries, and the current geography set, based on the January 1 legal boundaries of the most recent year (for example, MAF extracts received in July 2007 reflect legal boundaries as of January 1, 2007). The geographic codes associated with each MAF record are assigned by the TIGER® database. Because each record contains a variety of geographic codes, it is possible to sort MAF records according to different geographic hierarchies. ACS operations generally require sorting by state, county, census tract, and block.
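As an illustration of sorting by geographic hierarchy, the following Python sketch orders a few records by state, county, census tract, and block. The MafRecord type, its field names, and the sample codes are hypothetical; they are not the actual MAF extract layout.

```python
from typing import List, NamedTuple


class MafRecord(NamedTuple):
    """Hypothetical, simplified record; real MAF records carry many more fields."""
    state: str    # state code
    county: str   # county code
    tract: str    # census tract code
    block: str    # block code
    address: str


def sort_by_geography(records: List[MafRecord]) -> List[MafRecord]:
    """Return records ordered by state, county, census tract, and block."""
    return sorted(records, key=lambda r: (r.state, r.county, r.tract, r.block))


records = [
    MafRecord("24", "033", "800100", "2001", "201 Main Street"),
    MafRecord("11", "001", "000100", "1003", "77 West St."),
    MafRecord("24", "033", "800100", "1005", "RR 2, Box 9999"),
]
ordered = sort_by_geography(records)
```

Because the codes are stored as fixed-width strings in this sketch, a plain tuple comparison reproduces the hierarchy; a different hierarchy (for example, state and place) would only require a different sort key.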
The MAF contains both city-style and non-city-style mailing addresses. A city-style address is one that uses a structure number and street name format; for example, 201 Main Street, Anytown, ST 99988. Additionally, city-style addresses usually appear in a numeric sequence along a street and often follow parity conventions, such as all odd numbers occurring on one side of the street and even numbers on the other side. They often contain information used to uniquely identify individual units in multiple-unit structures, such as apartment buildings or rooming houses. These are known as unit designators, and are part of the mailing address. A non-city-style mailing address is one that uses a rural route and box number format, a post office (PO) box format, or a general delivery format. Examples of these types of addresses are RR 2, Box 9999, Anytown, ST 99988; P.O. Box 123, Anytown, ST 99988; and T. Smith, General Delivery, Anytown, ST 99988.
In the United States, city-style addresses are most prevalent in urban and suburban areas, and accounted for 94.4 percent of all residential addresses in the MAF at the conclusion of Census 2000. Most city-style addresses represent both the mailing and location addresses of the unit. City-style addresses are not always mailing addresses, however. Some residents at city-style addresses receive their mail at those addresses, while others use non-city-style addresses (Census 2000b). For example, a resident could have a location address of 77 West St. and a mailing address of P.O. Box 123. In other cases, city-style addresses ("E-911 addresses") have been established so that state emergency service providers can find a house even though mail is delivered to a rural route and box number.
Non-city-style mailing addresses are prevalent in rural areas and represented approximately 2.5 percent of all residential addresses in the MAF at the conclusion of Census 2000. Because these addresses do not provide specific information about the location of a unit, finding a rural route and box number address in the field can be difficult. To help locate non-city-style addresses in the field, the MAF often contains a location description of the unit and its latitude and longitude coordinates.2 The presence of this information in the MAF makes field follow-up operations possible.
Both city-style and non-city-style addresses can be either residential or nonresidential. A residential address represents a housing unit in which a person or persons live or could live. A nonresidential address represents a structure, or a unit within a structure, that is used for a purpose other than residence. While the MAF includes many nonresidential addresses, it is not a comprehensive source of such addresses (Census 2000b).
The MAF also contains some address records that are classified as incomplete because they lack a complete city-style or non-city-style address. Records in this category often are just a description of the unit's location, and usually its latitude and longitude. This incomplete category accounted for the remaining 3.1 percent of the United States residential addresses in the MAF at the conclusion of Census 2000.
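The three address categories described in this section (city-style, non-city-style, and incomplete) can be illustrated with a small Python sketch. The regular expressions and the classify_address function are assumptions made for illustration; they are deliberately simplistic and do not represent the Census Bureau's address standardization or classification rules.

```python
import re

# Simplified, hypothetical patterns; real address standardization is far more involved.
RURAL_ROUTE = re.compile(r"^\s*(RR|RURAL ROUTE)\s*\d+\b.*\bBOX\s*\w+", re.IGNORECASE)
PO_BOX = re.compile(r"^\s*P\.?\s*O\.?\s*BOX\s+\w+", re.IGNORECASE)
GENERAL_DELIVERY = re.compile(r"\bGENERAL\s+DELIVERY\b", re.IGNORECASE)
CITY_STYLE = re.compile(r"^\s*\d+\s+\S+")  # structure number followed by a street name


def classify_address(text: str) -> str:
    """Classify an address string as city-style, non-city-style, or incomplete."""
    if RURAL_ROUTE.search(text) or PO_BOX.search(text) or GENERAL_DELIVERY.search(text):
        return "non-city-style"
    if CITY_STYLE.search(text):
        return "city-style"
    # Anything else (for example, a location description) is treated as incomplete.
    return "incomplete"
```

Applied to the examples in this section, the sketch labels "201 Main Street, Anytown, ST 99988" as city-style, the rural route, PO box, and general delivery examples as non-city-style, and a location description such as the one in footnote 2 as incomplete.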
For details on the MAF, including its content and structure, see Census (2000b).
Footnotes:
1Place is defined by the Census Bureau as "a concentration of population either legally bounded as an incorporated place, or delineated for statistical purposes as a census designated place (in Puerto Rico, a comunidad or zona urbana)." See census designated place, consolidated city, incorporated place, independent city, and independent place under "Glossary of Basic Geographic and Related Terms - Census 2000".
2For example, "E side of St. Hwy, white house with green trim, garage on left side."
Master Address File Development and Updating for the United States Housing Unit Inventory
MAF Development in the United States
For the 1990 decennial and earlier censuses, address lists were compiled from several sources (commercial vendors, field listings, and others). Before 1990, these lists were not maintained or updated after a census was completed. Following the 1990 census, the Census Bureau decided to develop and maintain a master address list to support the decennial census and other Census Bureau survey programs in order to avoid the need to rebuild the address list prior to each census.
The MAF was created by merging city-style addresses from the 1990 Address Control File;3 field listing operations;4 the USPS's DSF; and addresses supplied by local governments through partnership operations, such as the Local Update of Census Addresses (LUCA)5 and other Census 2000 activities, including the Be Counted Campaign.6 At the conclusion of Census 2000, the MAF contained a complete inventory of known HUs nationwide.
Footnotes:
3The Address Control File is the residential address list used in the 1990 Census to label questionnaires, control the mail response check-in operation, and determine the response follow-up workload (Census 2000, pp. XVII-1).
4In areas where addresses were predominantly non-city-style, the Census Bureau created address lists through a door-to-door canvassing operation (Census 2000, pp. VI-2).
5The 1999 phase of the LUCA program occurred from early March through mid-May 1999 and involved thousands of local and tribal governments that reviewed more than 10 million addresses. The program was intended to cover more than 85 percent of the living quarter addresses in the United States in advance of Census 2000. The Census Bureau validated the results of the local or tribal changes by rechecking the Census 2000 address list for all blocks in which the participating governments questioned the number of living quarter addresses.
6The Be Counted program provided a means to include in Census 2000 those people who may not have received a census questionnaire or believed they were not included on one. The program also provided an opportunity for people who had no usual address on Census Day to be counted. The Be Counted forms were available in English, Spanish, Chinese, Korean, Tagalog, and Vietnamese. For more information, see Carter (2001).
MAF Improvement Activities and Operations
MAF maintenance is an ongoing and complex task. New HUs are built continually, older units are demolished, and the institution of addressing schemes that allow emergency response personnel to find HUs with non-city-style mailing addresses renders many older addresses obsolete. Maintenance of the MAF occurs through a coordinated combination of automated, clerical, and field operations designed to improve existing MAF records and keep up with the nation's changing housing stock and associated addresses. With the completion of Census 2000, the Census Bureau implemented several short-term, one-time operations to improve the quality of the MAF. These operations included count question resolution (CQR), MAF/TIGER® reconciliation, and address corrections from rural directories. For the most part, these operations were implemented to improve the addresses recognized in Census 2000 and their associated characteristics. Some ongoing improvement operations are designed to deal with errors remaining from Census 2000, while others aim to keep pace with post-Census 2000 address development. In the remainder of this section, several ongoing operations are discussed, including DSF updates, Master Address File Geocoding Office Resolution (MAFGOR), ACS nonresponse follow-up updates, and Demographic Area Address Listing (DAAL) updates. We also discuss the Community Address Updating System (CAUS), which has been employed in rural areas. Table 3.1 summarizes the development and improvement activities.
Table 3.1 Master Address File Development and Improvement
Initial Input | Improvements (Post-2000)
1990 Decennial Census Address Control File | DSF updates
USPS Delivery Sequence File (DSF) | Master Address File Geocoding Office Resolution (MAFGOR)
Local government updates | ACS nonresponse follow-up
Other Census 2000 activities | Community Address Updating System (CAUS)
 | Other Demographic Area Address Listing (DAAL) operations
The DSF is the USPS's master list of all delivery-point addresses served by postal carriers. The file contains specific data coded for each record, a standardized address and ZIP code, and codes that indicate how the address is served by mail delivery (for example, carrier route and the sequential order in which the address is serviced on that route). The DSF record for a particular address also includes a code for delivery type that indicates whether the address is business or residential. After Census 2000, the DSF became the primary source of new city-style addresses used to update the MAF. DSF addresses are not used for updating non-city-style addresses in the MAF because those addresses might provide different (and unmatchable) address representations for HUs whose addresses already exist in the MAF. New versions of the DSF are shared with the Census Bureau twice a year, and updates or refreshes to the MAF are made at those times.
When DSF updates do not match an existing MAF record, a new record is created in the MAF. These new records, which could be new HUs, are then compared to the USPS Locatable Address Conversion Service (LACS), which indicates whether the new record is merely an address change or is new housing. In this way, the process can identify duplicate records for the same address. For additional details on the MAF update process via the DSF, see Hilts (2005).
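The matching logic described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Census Bureau's update software (see Hilts 2005 for the actual specification): addresses are assumed to be already normalized strings, the MAF is represented as a set, and the LACS information is represented by a hypothetical dictionary that maps a new address to the older address it replaces.

```python
from typing import Dict, Iterable, List, Set, Tuple


def apply_dsf_update(
    maf_addresses: Set[str],
    dsf_addresses: Iterable[str],
    lacs_conversions: Dict[str, str],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split unmatched DSF addresses into new housing and address changes.

    lacs_conversions is a hypothetical stand-in for LACS data:
    it maps a new address to the old address it replaced.
    """
    new_housing: List[str] = []
    address_changes: List[Tuple[str, str]] = []
    for address in dsf_addresses:
        if address in maf_addresses:
            continue  # already matches an existing MAF record
        old_address = lacs_conversions.get(address)
        if old_address is not None and old_address in maf_addresses:
            # Same unit under a new address: a potential duplicate, not new housing.
            address_changes.append((old_address, address))
        else:
            new_housing.append(address)
    return new_housing, address_changes


maf = {"201 MAIN ST", "RR 2 BOX 9999"}
dsf = ["201 MAIN ST", "15 OAK AVE", "500 ELM RD"]
lacs = {"500 ELM RD": "RR 2 BOX 9999"}
new_housing, address_changes = apply_dsf_update(maf, dsf, lacs)
```

In this example, "15 OAK AVE" is treated as new housing, while "500 ELM RD" is flagged as a readdressing of an existing record rather than a second housing unit.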
MAFGOR is an ongoing clerical operation in all Census Bureau regional offices, in which geographic clerks examine groups of addresses, or "address clusters," that do not geocode to the TIGER® database. Reference materials available commercially, from local governments, and on the Internet are used to add or correct street features, street feature names, or the address ranges associated with streets in the TIGER® database. This process increases the Census Bureau's ability to assign block geocodes to DSF addresses. At present, MAFGOR operations are suspended until the 2010 Census Address Canvassing and field follow-up activities are completed.
Address Updates From ACS Nonresponse Follow-Up
Field representatives (FRs) can obtain address corrections for each HU visited during the personal visit nonresponse follow-up phase of the ACS. This follow-up is completed for a sample of addresses. The MAF is updated to reflect these corrections.
For additional details on the MAF update process for ACS updates collected at time of interview, see Hanks, et al. (2008).
DAAL is a combination of operations, systems, and procedures associated with coverage improvement, address list development, and automated listing for the CAUS and the demographic household surveys. The objective of DAAL is to update the inventory of HUs, GQs, and street features in preparation for sample selection for the ACS and surveys such as the Current Population Survey (CPS), the National Health Interview Survey (NHIS), and the Survey of Income and Program Participation (SIPP).
In a listing operation such as DAAL, a defined land area (usually a census tabulation block) is traveled in a systematic manner, while an FR records the location and address of every structure where a person lives or could live. Listings for DAAL are conducted on laptop computers using the Automated Listing and Mapping Instrument (ALMI) software. The ALMI uses extracts from the current MAF and TIGER® databases as inputs. Functionality in the ALMI allows users to edit, add, delete, and verify addresses, streets, and other map features; view a list of addresses associated with the selected geography; and view and denote the location of HUs on the electronic map. Compared to information once collected by paper and pencil, ALMI allows for the standardization of data collected through edits and defined data entry fields, standardization of field procedures, efficiencies in data transfer, and timely reflection of the address and feature updates in MAF and TIGER®. For details on DAAL, see Perrone (2005).
The CAUS program is designed specifically to address ACS coverage concerns. The Census Bureau recognized that the DSF, being the primary source of ACS frame updates, does not adequately account for changes in predominantly rural areas of the nation where city-style addresses generally are not used for mail delivery. CAUS, an automated field data collection operation, was designed to provide a rural counterpart to the update of city-style addresses received from the DSF. CAUS improved coverage of the ACS by (1) adding addresses that exist but do not appear in the DSF, (2) adding non-city-style addresses in the DSF that do not appear on the MAF, (3) adding addresses in the DSF that also appear in the MAF but are erroneously excluded from the ACS frame, and (4) deleting addresses that appear in the MAF but are erroneously included in the ACS frame.
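One way to read the four CAUS coverage actions is as set operations on address lists for a block. The Python sketch below is only an interpretation made for illustration: the function name, the dictionary keys, and the treatment of the field listing as ground truth are assumptions, and the sketch ignores the distinction between city-style and non-city-style addresses. For the actual methodology, see Dean (2005).

```python
from typing import Dict, Set


def caus_updates(
    field_listing: Set[str],  # addresses a lister found to exist (assumed ground truth)
    dsf: Set[str],            # addresses in the Delivery Sequence File
    maf: Set[str],            # addresses in the MAF
    acs_frame: Set[str],      # addresses currently in the ACS frame
) -> Dict[str, Set[str]]:
    """Group candidate updates by the four coverage actions listed above."""
    return {
        # (1) exist in the field but appear in neither the DSF nor the MAF
        "add_not_in_dsf": field_listing - dsf - maf,
        # (2) appear in the DSF but not in the MAF
        "add_dsf_not_in_maf": (field_listing & dsf) - maf,
        # (3) appear in the DSF and the MAF but are missing from the ACS frame
        "add_to_frame": (field_listing & dsf & maf) - acs_frame,
        # (4) appear in the MAF and the ACS frame but were not found in the field
        "delete_from_frame": (maf & acs_frame) - field_listing,
    }


updates = caus_updates(
    field_listing={"A", "B", "C"},
    dsf={"B", "C"},
    maf={"C", "X"},
    acs_frame={"X"},
)
```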
Implemented in September 2003, CAUS focused its efforts on census blocks with high concentrations of non-city-style addresses and suspected growth in the HU inventory. Of the approximately 8.2 million blocks nationwide, the CAUS universe comprised the 750,000 blocks where DSF updates are not used to provide adequate coverage. CAUS blocks were selected by a model-based method that used information gained from previous field data collection efforts and administrative records to predict where CAUS work was needed. At present, the CAUS program is suspended until the 2010 Census Address Canvassing and field follow-up activities are completed. For details on the CAUS program and its block selection methodology, see Dean (2005).
All of these MAF improvement activities and operations contribute to the overall update of the MAF. Its continual evaluation and updating are planned and will be described in future releases of this report.
It is expected that the 2010 Census address canvassing and enumeration operations will improve the coverage and quality of the MAF. Field operations to support the 2010 Census will enable HU and GQ updates, additions, and deletions to be identified, collected, and used to update the MAF. The Census Bureau began its Census 2010 operations in 2007. The operations will include several nationwide field canvassing and enumeration operations and will obtain address data through cooperative efforts with tribal, county, and local governments to enhance the MAF. The MAF extracts used by the ACS for sample selection will be improved by these operations. ACS and Census 2010 planners are working together closely to assess the impact of the decennial operations on the ACS.
Master Address File Development and Updating for Puerto Rico
The Census Bureau created an initial MAF for Puerto Rico through field listing operations. This MAF did not include mailing addresses because, in Puerto Rico, Census 2000 used an Update/Leave methodology through which a census questionnaire was delivered by an enumerator to each living quarter. The MAF update activities that took place from 2002 to 2004 were focused on developing mailing addresses, updating address information, and improving coverage through yearly updates.
MAF Development in Puerto Rico
MAF development in Puerto Rico also used the Census 2000 operations as its foundation. These operations in Puerto Rico included address listing, Update/Leave, the LUCA, and the Be Counted Campaign.
For details on Census 2000 for Puerto Rico, see Census Bureau (2004b). The Census 2000 procedures and processing systems were designed to capture, process, transfer, and store information for the conventional three-line mailing address. Mailing addresses in Puerto Rico generally incorporate the urbanization name (neighborhood equivalent), which creates a four-line address. Use of the urbanization name eliminates the confusion created when street names are repeated in adjacent communities. In some instances, the urbanization name is used in lieu of the street name.
The differences between the standard three-line address and the four-line format used in Puerto Rico created problems during the early MAF building stages. The resulting file structure for the Puerto Rico MAF was the same as that used for states in the United States, so it did not contain the additional fields required to handle the more complex Puerto Rico mailing address. These processing problems did not adversely impact Census 2000 operations in the United States because the record structure was designed to accommodate the standard U.S. three-line address. However, in Puerto Rico, where questionnaire mailout was originally planned as the primary means of collecting data, the three-line address format turned out to be problematic. As a result, it is not possible to calculate the percentage of city-style, non-city-style, and incomplete addresses in Puerto Rico from Census 2000 processes.
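The structural difference between the two address formats can be sketched as a record with an optional urbanization field. The MailingAddress class, its field names, the makeup of the three lines (recipient, street, and city/state/ZIP), the placement of the urbanization line above the street line, and the sample values are all assumptions made for illustration; they do not describe the actual MAF record layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MailingAddress:
    """Hypothetical mailing address record with an optional urbanization line."""
    recipient: str
    street: str
    city_state_zip: str
    urbanization: Optional[str] = None  # absent in a conventional three-line address

    def lines(self) -> List[str]:
        """Return the address as printed lines: three, or four with an urbanization."""
        out = [self.recipient]
        if self.urbanization:
            out.append(self.urbanization)
        out.extend([self.street, self.city_state_zip])
        return out


three_line = MailingAddress("Resident", "201 Main Street", "Anytown, ST 99988")
four_line = MailingAddress("Resident", "10 Calle 1", "Anytown, PR 00999",
                           urbanization="URB Ejemplo")
```

A record structure without the urbanization field has no place to store the second line of the four-line address, which is the kind of limitation described above.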
MAF Improvement Activities and Operations in Puerto Rico
Because of these address formatting issues, the MAF for Puerto Rico as it existed at the conclusion of Census 2000 required significant work before it could be used by the ACS. The Census Bureau had to revise the address information in the Puerto Rico MAF. This effort involved splitting the address information into the various fields required to construct a mailing address using Puerto Rico addressing conventions.
The Census Bureau contracted for updating the list of addresses in the Puerto Rico MAF. Approximately 64,000 new Puerto Rico HUs have been added to the MAF since Census 2000, with each address geocoded to a municipio, tract, and block. The Census Bureau also worked with the USPS DSF for Puerto Rico to extract information on new HU addresses. Matching the USPS file to the existing MAF was only partially successful because of inconsistent naming conventions, missing information in the MAF, and the existence of different house numbering schemes (USPS versus local schemes).
Data collection activities in Puerto Rico began in November 2004. The Census Bureau is pursuing options for the ongoing collection of address updates in Puerto Rico. This may include operations comparable to those that exist in the United States, such as DSF updates, MAFGOR, and CAUS. Future versions of this document will include discussions of these operations and MAF development and updating in Puerto Rico.
Master Address File Development and Updating for Special Places and Group Quarters in the United States and Puerto Rico
MAF Development for Special Places and GQs
In preparation for Census 2000, the Census Bureau developed an inventory of special places (SPs) and GQs. SPs are places such as prisons, hotels, migrant farm camps, and universities. GQs are contained within SPs, and include college and university dormitories and hospital/prison wards. The SP/GQ inventory was developed using data from internal Census Bureau lists, administrative lists obtained from various federal agencies, and numerous Census 2000 operations such as address listing, block canvassing, and the SP/GQ Facility Questionnaire operation. Responses to the SP/GQ Facility Questionnaire identified GQs and any HUs associated with the SP. Similar to the HU MAF development process, local and tribal governments had an opportunity to review the SP address list. In August 2000, after the enumeration of GQ facilities, the address and identification information for each GQ was incorporated into the MAF.
MAF Improvement Activities and Operations for Special Places and GQs
As with the HU side of the MAF, maintenance of the GQ universe is an ongoing and complex task. The earlier section on MAF Improvement Activities and Operations for HUs mentions short-term, one-time operations (such as CQR and MAF/TIGER® reconciliation) that also updated GQ information. Additionally, the Census Bureau completed a GQ geocoding correction operation to fix errors (mostly census block geocodes) associated with college dormitories in the MAF and TIGER®. Information on the new GQ facilities and updated address information for existing GQ facilities are collected on an ongoing basis by listing operations such as DAAL, which also includes the CAUS in rural areas. This information is used to update the MAF. Additionally, it is likely that DSF updates of city-style address areas are providing the Census Bureau with new GQ addresses; however, the DSF does not identify such an address as a GQ facility.
A process to supplement these activities was developed to create an updated GQ universe from which to select the ACS sample. The ACS GQ universe for 2007 was constructed by merging the updated SP/GQ inventory file, extracts from the MAF, and a file of those seasonal GQs that were closed on April 1, 2000 (but might have been open if visited at another time of year). To supplement the ACS GQ universe, the Census Bureau obtained a file of federal prisons and detention centers from the Bureau of Prisons and a file from the U.S. Department of Defense containing military bases and vessels. The Census Bureau also conducted Internet research to identify new migrant worker locations, new state prisons, and state prisons that had closed. ACS FRs use the Group Quarters Facility Questionnaire (GQFQ) to collect updated address and geographic location information. The ACS will use the updates collected via the GQFQ to provide more accurate information for subsequent visits to a facility, as well as to update the ACS GQ universe. For more information about the GQFQ, see the section titled Group Quarters Facility Questionnaire-Initial GQ Contact in Section B.2 of Chapter 8.
In addition to the major decennial operations that will collect and provide updates for GQs, ACS and Census 2010 planners are evaluating the feasibility of a repeatable operation to extract information on new GQ facilities from administrative sources, including data provided by members of the Federal and State Cooperative Program for Population Estimates (FSCPE). If this approach is successful, it likely will provide a cost-effective mechanism for updating the GQ universe for the ACS during the intercensal years. For more information on SP and GQ issues, see Bates (2006a).
American Community Survey Extracts from the Master Address File
The MAF data are provided to ACS in files called MAF extracts. These MAF extracts contain a subset of the data items in the MAF. The major classifications of variables included in the MAF extracts are: address variables, geocode variables, and source and status variables (see Section B). The MAF, as an inventory of living quarters (HUs and GQs) and some nonresidential units, is a dynamic entity. It contains millions of addresses that reflect ongoing additions, deletions, and changes; these include current addresses, as well as those determined to no longer exist. MAF users, such as the ACS, define the set of valid addresses for their programs.
Since the ACS frame must be as complete as possible, filtering rules are applied during the creation of the ACS extracts to minimize both overcoverage and undercoverage and obtain an inclusive listing of addresses. For example, the ACS includes units that represent new construction, some of which may not exist yet. The ACS also includes housing units that are not geocoded, that is, units whose addresses cannot be linked to a county, census tract, and block. In addition, the ACS includes units that are "excluded from delivery statistics" (EDS); these often are units under construction, where the housing unit has an address but the USPS is not yet delivering to it. In this regard, the ACS filtering rules differ from those for Census 2000 and the 2004 Census Test, both of which excluded EDS and ungeocoded addresses. The 2006 Census Test filter included EDS, but excluded ungeocoded records.
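The differences among these filters can be sketched as a small policy object in Python. The sketch covers only the two attributes discussed in this paragraph (EDS status and geocoding status); the actual filters contain additional conditions, and the class names, field names, and constants are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterPolicy:
    """Hypothetical summary of how a filter treats EDS and ungeocoded units."""
    include_eds: bool
    include_ungeocoded: bool


@dataclass(frozen=True)
class Unit:
    """Hypothetical, simplified address record."""
    address: str
    geocoded: bool
    eds: bool  # "excluded from delivery statistics"


# Policies as described in the paragraph above (other filter conditions omitted).
ACS_FILTER = FilterPolicy(include_eds=True, include_ungeocoded=True)
CENSUS_2000_FILTER = FilterPolicy(include_eds=False, include_ungeocoded=False)
CENSUS_TEST_2006_FILTER = FilterPolicy(include_eds=True, include_ungeocoded=False)


def passes(unit: Unit, policy: FilterPolicy) -> bool:
    """Return True if the unit is kept under the given policy."""
    if unit.eds and not policy.include_eds:
        return False
    if not unit.geocoded and not policy.include_ungeocoded:
        return False
    return True


under_construction = Unit("10 New Court", geocoded=True, eds=True)
ungeocoded = Unit("RR 2, Box 9999", geocoded=False, eds=False)
```

Under this sketch, both sample units pass the ACS policy, neither passes the Census 2000 policy, and only the EDS unit passes the 2006 Census Test policy.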
The filter is reviewed each year and may be enhanced as the ACS learns about its sample addresses and more about the coverage and content of the MAF. For a record to be eligible for the ACS survey, it must meet the conditions set forth in the filter. In general, the ACS sampling frame contains several classes of units, including HUs that existed during Census 2000, post-census additions from the DSF, additions from the DAAL, CQR additions and reinstatements, additions from special censuses and census tests, and Census 2000 deletions that persist in the DSF. Filtering rules change, and with them, the ACS frame. One change was implemented in 2003 when ungeocoded addresses in counties not part of mail-out/mail-back areas (areas where mail is the major mode of data collection) were excluded from the ACS sample.
As discussed above, the ACS attempts to create a sampling frame that is as accurate as possible by minimizing both overcoverage and undercoverage. In the process, the ACS filter rules can lead to net overcoverage, reflecting some duplicate and ineligible units. This overcoverage has been estimated to be approximately 2.0 to 3.7 percent for the years 2002 through 2006; see Hakanson (2007). For details on the ACS requirements for MAF extracts, see Bates (2006b). For more information on the ACS sample selection, see Chapter 4. For a description of data collection procedures for these different kinds of addresses, see Chapter 7. For details on the MAF, its coverage, and the implications of extract rules on the ACS frame, see Shapiro and Waksberg (1999) and Hakanson (2007).
Bates, Lawrence M. (2006a). "Creating the Group Quarters Universe for the American Community Survey for Sample Year 2007." Internal U.S. Census Bureau Memorandum From D. Whitford to L. Blumerman, Draft, Washington, DC, October 30, 2006.
Bates, Lawrence M. (2006b). "Geographic Products Requirements for the American Community Survey. REVISED for July 2006 Delivery." Internal U.S. Census Bureau Memorandum From D. Kostanich to R. LaMacchia, Draft, Washington, DC, June 19, 2006.
Carter, Nathan E. (2001). "Be Counted Campaign for Census 2000." Proceedings of the Annual Meeting of the American Statistical Association, August 5−9, 2001. Washington, DC: U.S. Census Bureau, DSSD.
Dean, Jared (2005). "Updating the Master Address File: Analysis of Adding Addresses via the Community Address Updating System." Washington, DC.
Hakanson, Amanda (2007). "National Estimate of Coverage of the MAF for 2006," Internal U.S. Census Bureau Memorandum From D. Whitford to R. LaMacchia, Washington, DC, September 28, 2007.
Hanks, Shawn C., Jeremy Hilts, Daniel Keefe, Paul L. Riley, Daniel Sweeney, and Alicia Wentela (2008). "Software Requirements Specification for Address Updates From the Demographic Area Address Listing (DAAL) Operations." Version 1.0, Washington, DC, March 26, 2008.
Hilts, Jeremy (2005). "Software Requirement Specification for Updating the Master Address File From the U.S. Postal Service's Delivery Sequence File." Version 7.0, Washington, DC, April 18, 2005.
Perrone, Susan (2005). "Final Report for the Assessment of the Demographic Area Address Listing (DAAL) Program." Internal U.S. Census Bureau Memorandum From R. Killion to R. LaMacchia, Washington, DC, November 9, 2005.
Shapiro, Gary and Joseph Waksberg (1999). "Coverage Analysis for the American Community Survey Memo." Final Report Submitted by Westat to the U.S. Census Bureau, Washington, DC, November 1999.
U.S. Census Bureau (2000). "Census 2000 Operational Plan." Washington, DC, December 2000.
U.S. Census Bureau (2000b). "MAF Basics." Washington, DC, 2000.
U.S. Census Bureau (2004b). "Census 2000 Topic Report No. 14: Puerto Rico." Washington, DC, 2004.