International Journal of Traffic and Transportation Engineering

p-ISSN: 2325-0062    e-ISSN: 2325-0070

2013;  2(3): 51-54

doi:10.5923/j.ijtte.20130203.05

Validation of Disaggregate Methodologies for National Level Freight Data

Michael Anderson1, Lisa Blanchard2, Lauren Neppel2, Tahmina Khan1

1Civil and Environmental Engineering, University of Alabama in Huntsville, Huntsville, 35899, USA

2Center for Management and Economic Research, University of Alabama in Huntsville, Huntsville, 35899, USA

Correspondence to: Michael Anderson, Civil and Environmental Engineering, University of Alabama in Huntsville, Huntsville, 35899, USA.

Copyright © 2012 Scientific & Academic Publishing. All Rights Reserved.

Abstract

As freight data is typically available in large, spatially aggregated databases, methodologies for disaggregating the data have been developed as a means to determine freight flow data for smaller geographic areas. This paper examines the validity of disaggregating freight data using both value of sales data and employment data by county as disaggregation metrics for specific commodities. The paper applies both techniques to selected states within the Freight Analysis Framework Version 3 Database (FAF3). The paper concludes that some commodities within the FAF3 database can be accurately disaggregated using value of sales data while other commodities can be accurately disaggregated using employment data. Thus a combination of disaggregation techniques might yield the best results when disaggregating large national freight data to the county level.

Keywords: Freight, Forecasting, Statistical Validation

Cite this paper: Michael Anderson, Lisa Blanchard, Lauren Neppel, Tahmina Khan, Validation of Disaggregate Methodologies for National Level Freight Data, International Journal of Traffic and Transportation Engineering, Vol. 2 No. 3, 2013, pp. 51-54. doi: 10.5923/j.ijtte.20130203.05.

1. Introduction and Background

Freight data is typically disseminated through highly aggregated national databases, such as the Freight Analysis Framework Version 3 Database (FAF3), that are not practical for local area transportation professionals because the zones cover large spatial areas, often multi-county urban areas, portions of states, or entire states. These highly aggregated national zones are too large to yield specific freight flow data for individual counties. As a result, these freight databases are not directly applicable to local planning efforts, and freight is often ignored in the planning process.
Several studies have been undertaken regarding the methodologies to disaggregate the FAF database including [1], [2], and [3]. Though validation efforts were attempted by a variety of means, including assignment through models and comparisons to other databases, a main limitation of these studies was the inability to verify that the disaggregation technique was successful. The goal of this paper is to develop a mechanism to evaluate the accuracy of the disaggregation techniques and test the methodology on two common disaggregation techniques.

1.1. FAF3 Database

The Federal Highway Administration’s Freight Analysis Framework, Version 3 (FAF3) is a publicly available database that provides estimates of freight movements for 131 geographical regions (FAZs), for 43 commodities (see Table 1) and 8 modes of transport [4]. These estimates are reported in both annual kilotons and millions of dollars of value shipped.
Figure 1. FAF3 Domestic Geographic Zones
The disaggregation of originating freight data is often performed using employment as the disaggregation factor [2], [3]. This variable is used because it mimics traditional trip generation methodologies, and the prevailing thought is that the higher the number of employees in an industry, the greater the amount of production and freight movement. However, an alternate report documented a disaggregation technique applied to the FAF2 database for Alabama, where the 2 zones representing the state were disaggregated into 67 individual counties using value of sales [1]. That methodology attempted to incorporate factors of production into the disaggregation, essentially introducing the impact of automation, which allows for greater production with fewer people [5], [6]. The results of these efforts showed value of sales was a key variable that could be used for disaggregation of the FAF3 data from the zone to the county level.
Table 1. Commodities included in FAF3 database
     

1.2. Scope of Study

The goal of this paper is to examine the efficacy of using either value of sales data or employment data as the disaggregation variable. The paper uses a validation procedure to verify the accuracy of disaggregating different commodities. This paper presents a brief introduction and review of related work, the data used in the validation, and the analysis used to validate the techniques, and it develops conclusions related to the disaggregation techniques.

2. Study Methodology

This study was designed to validate the two disaggregation techniques developed. The first technique uses value of sales data and handles disaggregation as:
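$$\text{Tonnage}_{ij} = \text{Tonnage}_{j} \times \frac{\text{Value of Sales}_{ij}}{\sum_{i} \text{Value of Sales}_{ij}} \tag{1}$$
Where:
Tonnage i j = total tons of freight for county i of commodity j,
Tonnage j = total tons of freight for the state of commodity j, taken from the FAF3 database,
Value of Sales i j = Value of sales for county i of commodity j, and
∑ (Value of Sales for all i) j = Total value of sales for the state of commodity j.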
The second technique uses commodity-specific employment, or total employment if the commodity-specific data is unavailable, and handles the disaggregation as:
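$$\text{Tonnage}_{ij} = \text{Tonnage}_{j} \times \frac{\text{Employment}_{ij}}{\sum_{i} \text{Employment}_{ij}} \tag{2}$$
Where:
Tonnage i j = total tons of freight for county i of commodity j,
Tonnage j = total tons of freight for the state of commodity j, taken from the FAF3 database,
Employment i j = Employment for county i of commodity j, and
∑ (Employment for all i) j = Total employment for the state of commodity j.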
This study uses the FAF3 database in selected states that contain an urbanized area of sufficient size to warrant its own zone, so that the state has more than one zone. The disaggregation method was applied to all counties in the state using an aggregated, state-wide value of tonnage moved from the FAF3 database. The value of sales data and employment data were used to disaggregate the state-wide tonnage into individual counties. Then, after disaggregating all the counties in the state, the counties that comprise the FAF3 urban zones were aggregated and compared to the original FAF3 data entry. The hypothesis was that, if the disaggregation technique was appropriate, the aggregated counties in the FAF3 zone would have a tonnage similar to that reported for the urban FAF3 zone for the specific commodity.
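The disaggregate-then-reaggregate check described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, county labels, and all numeric values are hypothetical, and the weights may be either value of sales or employment for a single commodity.

```python
from typing import Dict, Iterable


def disaggregate(state_tons: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Split state-wide tonnage across counties in proportion to a weight
    (value of sales or employment) for one commodity."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {county: state_tons * w / total for county, w in weights.items()}


def reaggregate(county_tons: Dict[str, float], zone_counties: Iterable[str]) -> float:
    """Sum the county estimates that fall inside one FAF3 urban zone."""
    return sum(county_tons[c] for c in zone_counties)


# Hypothetical inputs: one commodity, three counties, one urban zone (A + B).
state_tons = 1000.0                              # assumed state-wide tonnage
sales = {"A": 50.0, "B": 30.0, "C": 20.0}        # assumed value of sales by county
urban_zone = ["A", "B"]                          # assumed counties in the urban zone

county_tons = disaggregate(state_tons, sales)
urban_estimate = reaggregate(county_tons, urban_zone)

# urban_estimate is then compared with the tonnage FAF3 reports for that zone.
print(county_tons, urban_estimate)
```

The same `disaggregate` function covers both techniques because only the weight changes; the comparison between `urban_estimate` and the reported FAF3 zone tonnage is what the validation relies on.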

2.1. Selection of States

The approach to validation applied the methodology presented above to disaggregation techniques that used either employment or value of sales. Several states were considered for selection based on two key attributes: the states had to contain a manageable number of counties for efficient data processing, and they needed to contain at least two FAF3 freight analysis zones. Five states were selected for use in this study: Colorado, Indiana, Oklahoma, Tennessee, and Utah.

2.2. Employment Data

The employment data were obtained from the Bureau of Labor Statistics and the U.S. Energy Information Administration. The employment data were available using the industry-specific coding of the North American Industry Classification System (NAICS). To use this data with the FAF3 data, which is classified by the Standard Classification of Transported Goods (SCTG), an SCTG-NAICS cross-reference was developed to allow the use of data from both systems. The system used to cross-reference the data is shown in Table 2. The two systems do not follow a natural 1-to-1 mapping. A special challenge was presented by the SCTG codes that mapped to more than one NAICS code (SCTG 30 mapped to NAICS 313, 314, and 315; SCTG 35 mapped to NAICS 334 and 335). In these two cases, the NAICS codes were grouped so that all the economic data referred to the smallest NAICS code for that group. That is, all economic data for NAICS 314 and 315 were assigned to 313, and data for NAICS 335 were assigned to 334.
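The grouping rule for the two one-to-many cases can be illustrated with a small sketch. Only the mappings stated in the text (SCTG 30 and SCTG 35) are encoded; the rest of the cross-reference is not reproduced here, and the function names and employment figures are hypothetical.

```python
from collections import defaultdict
from typing import Dict

# NAICS codes folded into the smallest code of their group, as stated in the text.
NAICS_GROUP = {"314": "313", "315": "313", "335": "334"}

# SCTG code -> representative (smallest) NAICS code, for the two cases in the text.
SCTG_TO_NAICS = {"30": "313", "35": "334"}


def group_naics(employment: Dict[str, float]) -> Dict[str, float]:
    """Roll employment up so grouped NAICS codes are reported under one code."""
    grouped: Dict[str, float] = defaultdict(float)
    for code, value in employment.items():
        grouped[NAICS_GROUP.get(code, code)] += value
    return dict(grouped)


def employment_for_sctg(sctg: str, employment: Dict[str, float]) -> float:
    """Employment to use as the disaggregation weight for one SCTG commodity."""
    return group_naics(employment).get(SCTG_TO_NAICS[sctg], 0.0)


# Hypothetical county employment by NAICS code.
county_employment = {"313": 100.0, "314": 40.0, "315": 10.0, "334": 200.0, "335": 50.0}

print(employment_for_sctg("30", county_employment))  # textiles group: 313 + 314 + 315
print(employment_for_sctg("35", county_employment))  # electronics group: 334 + 335
```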
Table 2. Cross-reference between SCTG and NAICS
     

2.3. Value of Sales Data

The value of sales data were obtained from a variety of sources. Manufacturing values are published in the Economic Census of the US Census Bureau for each state, metropolitan area, and county that contains manufacturing enterprises. If the actual data were suppressed due to privacy concerns, the values were estimated by taking the portion of the statewide value of sales not already accounted for and allocating it to the remaining counties based on employment in that industry, if available.
The Census of Agriculture published by the US Department of Agriculture contains a comprehensive summary of agricultural statistics for every county. Included in the series are value of sales data for various types of crops and animals sold from a particular county. The 2007 Census values were used for this project.
For mining and timber data, the US Geological Survey, the US Energy Information Administration, and state departments of forest resources were used. Data from the Bureau of Labor Statistics and population data from the US Census Bureau were used to fill in gaps and provide proxies to estimate value of sales, as described above for manufacturing.
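One way to read the estimation of suppressed manufacturing values is sketched below. The text does not give the exact procedure, so this is an assumption: the unaccounted-for statewide value is split among the suppressed counties in proportion to industry employment. The function name and all figures are hypothetical.

```python
from typing import Dict, Optional


def fill_suppressed(
    state_total: float,
    reported: Dict[str, Optional[float]],
    employment: Dict[str, float],
) -> Dict[str, Optional[float]]:
    """Estimate suppressed county values (None) from the statewide remainder,
    allocated in proportion to industry employment in the suppressed counties."""
    known = sum(v for v in reported.values() if v is not None)
    remainder = state_total - known
    suppressed = [c for c, v in reported.items() if v is None]
    emp_total = sum(employment.get(c, 0.0) for c in suppressed)

    filled = dict(reported)
    if emp_total > 0:
        for c in suppressed:
            filled[c] = remainder * employment.get(c, 0.0) / emp_total
    # If no employment data is available, suppressed values are left as None.
    return filled


# Hypothetical example: counties B and C are suppressed.
reported = {"A": 600.0, "B": None, "C": None}
employment = {"B": 30.0, "C": 10.0}
filled = fill_suppressed(1000.0, reported, employment)
print(filled)
```

Leaving values as `None` when employment is missing mirrors the "if available" qualifier in the text; any other fallback would be a further assumption.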
Table 3. Commodity Specific Correlation Coefficients for the Disaggregation Techniques
     

3. Results and Conclusions

The commodity-specific correlation coefficients for each of the disaggregation methodologies are presented in Table 3. The results in Table 3 show that which disaggregation variable performs better, value of sales or employment, depends on the nature of the industry. Certain commodities (pharmaceuticals, gravel, metallic ores and other foodstuff) show a distinct advantage when using value of sales data, while other commodities (meat/seafood, gasoline, fertilizer, wood products) show a distinct advantage when employment is used. Several commodities show a direct linkage with both value of sales data and employment data when disaggregating tonnage shipped, allowing either variable to provide reasonable results.
Overall, the contribution of this paper is the identification of which commodities perform better when disaggregating large datasets using either employment or value of sales. The paper identifies several commodities that perform better using one of the two disaggregation techniques, leading to a collection of preferred disaggregation variables for converting national level databases to county level data for use in local and statewide transportation planning efforts.
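The text does not state how the correlation coefficients were computed. As a sketch under that caveat, the following assumes a Pearson correlation, for one commodity, between the re-aggregated county estimates and the tonnage FAF3 reports for the same urban zones. The function name and all tonnage values are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical tonnage for one commodity across five urban FAF3 zones.
faf3_reported = [800.0, 450.0, 1200.0, 300.0, 950.0]   # assumed FAF3 zone values
reaggregated = [760.0, 500.0, 1100.0, 350.0, 900.0]    # assumed county estimates, summed

r = pearson(faf3_reported, reaggregated)
print(round(r, 3))
```

Computing `r` once with value of sales weights and once with employment weights would give the two per-commodity figures that a table like Table 3 compares.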

References

[1]  Harris and Anderson. FAF2 Pilot Project – Utilization of FAF2 Data by State and Local Governmental Agencies. Alabama Department of Transportation Final Report. February 2009.
[2]  Opie, K., Rowinski, and L.N. Spasovic. Commodity-Specific Disaggregation of 2002 Freight Analysis Framework Data to County Level for New Jersey. In Transportation Research Record: Journal of the Transportation Research Board, No. 2121, Transportation Research Board of the National Academies, Washington, D.C., 2009, pp. 128-134.
[3]  Viswanathan, K., D.F. Beagan, V. Mysore, and N.N. Srinivasan. Disaggregating Freight Analysis Framework Version 2 Data for Florida: Methodology and Results. In Transportation Research Record: Journal of the Transportation Research Board, No. 2049, Transportation Research Board of the National Academies, Washington, D.C., 2008, pp. 167-175.
[4]  FAF3 documentation page. http://faf.ornl.gov/fafweb/Documentation.aspx (August 1, 2010).
[5]  Anderson, M., G. Harris, S. Jeereddy, S. Gholston, J. Swain and N. Schoening. Using a Federal Database and New Factors for Disaggregation of Freight to a Local Level. Proceedings of the 10th International Conference on Application of Advanced Technologies in Transportation, Athens, Greece, May 2008.
[6]  Anderson, M., G. Harris, L. Blanchard, and L. J. Neppel. Using a Federal Database and Local Industry Sector Knowledge to Develop Future Freight Forecasts. Conference Proceedings, Tools of Trade Conference, September 2010.