Add Standard Groupings
Agency standard category schemes
Because sample sizes are limited, breakdowns of PSRC’s Household Travel
Survey data often use category groupings that are coarser than the most
granular level of detail reported.
psrc.travelsurvey functions provide the most commonly used groupings,
which can be considered the agency standard for the variables of
interest. Although each new variable is specific to one unit of analysis
(i.e., hh, person, day, trip, or vehicle), the first and only argument to
each grouping function is the entire hts_data object, so that calls can
be piped. When more than one standard grouping exists for a variable, the
function returns all of them; they are distinguished by a suffix denoting
the number of categories (for example, age_bin3 and age_bin5).
library(psrc.travelsurvey)
library(magrittr)
library(data.table)
# Specify which variables to retrieve
vars <- c("age", "hhsize")
# Retrieve the data
hts_data <- get_psrc_hts(survey_year=2023, survey_vars = vars)
# Add standard groupings for age and household size
# -- Notice age and hhsize use different tables (person, hh)
# -- but you don't need to worry about that; the package knows
hts_data <- hts_data %>%
  hts_bin_age() %>%
  hts_bin_hhsize()
hts_data$person[, head(.SD), .SDcols=patterns("^age")]
#>            age       age_bin5       age_bin3
#>          <ord>         <fctr>         <fctr>
#> 1: 45-54 years    45-64 Years    18-64 Years
#> 2: 16-17 years Under 18 Years Under 18 Years
#> 3: 12-15 years Under 18 Years Under 18 Years
#> 4: 35-44 years    25-44 Years    18-64 Years
#> 5: 25-34 years    25-44 Years    18-64 Years
#> 6: 25-34 years    25-44 Years    18-64 Years
Each of these functions requires its disaggregate input variable to have
been requested in the earlier get_psrc_hts() call. Both the input
variable and the output variable(s) are included in the resulting
hts_data object.
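The pattern these functions follow (take the whole object, require the input variable, return the whole object with both input and output columns) can be sketched on toy data. This is only an illustration under stated assumptions, not the package's implementation: `toy_hts` and `bin_hhsize_sketch()` are hypothetical names, the list-of-data.tables structure is inferred from the `hts_data$person` usage in the example, and `hhsize` is treated as an integer for simplicity.

```r
library(data.table)

# Toy stand-in for an hts_data object: a named list of data.tables.
# (Assumed structure; the real object has more tables and columns.)
toy_hts <- list(
  hh     = data.table(household_id = 1:3, hhsize = c(1L, 2L, 5L)),
  person = data.table(person_id = 1:4, household_id = c(1L, 2L, 2L, 3L))
)

# Hypothetical helper, NOT part of psrc.travelsurvey
bin_hhsize_sketch <- function(hts) {
  # Fail early if the disaggregate input variable was not requested
  if (!"hhsize" %in% names(hts$hh)) {
    stop("`hhsize` not found in the hh table; request it when retrieving the data")
  }
  hts$hh <- copy(hts$hh)  # avoid modifying the caller's table by reference
  # Collapse household sizes of 4 or more into a single "4+" category
  hts$hh[, hhsize_bin4 := factor(
    fifelse(hhsize >= 4L, "4+", as.character(hhsize)),
    levels = c("1", "2", "3", "4+")
  )]
  hts  # return the whole object so that calls can be chained
}

toy_binned <- bin_hhsize_sketch(toy_hts)
toy_binned$hh
```

Because the helper returns the full list, it could be chained with `%>%` in the same way as the package functions, regardless of which table holds the variable.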
The following functions are currently available, listed with their required input variable(s), output variable(s), and breakpoints.
| Function | Input variable | Output variable(s) | Breakpoints |
|---|---|---|---|
| hts_bin_income() | hhincome_broad | hhincome_bin3 / hhincome_bin5 | $50K ]( $100K or $25K ]( $50K ]( $75K ]( $100K |
| hts_bin_hhsize() | hhsize | hhsize_bin4 | 1 • 2 • 3 • 4+ |
| hts_bin_vehicle_count() | vehicle_count | vehicle_count_bin4 | 1 • 2 • 3 • 4+ |
| hts_bin_rent_own() | rent_own | rent_own_bin2 | rent • own |
| hts_bin_age() | age | adult / age_bin3 / age_bin5 | 18[ or 18 )[ 65 or 18 )[ 25 )[ 45 )[ 65 |
| hts_bin_worker() | employment | worker | worker • non-worker |
| hts_bin_edu() | education | edu_bin2 | Less )[ Bachelors+ |
| hts_bin_gender() | gender | gender_bin3 | m • f • nb/oth |
| hts_bin_sexuality() | sexuality | sexuality_bin3 | hetero • LGBTQ+ • Missing |
| hts_bin_commute_freq() | commute_freq | commute_freq_bin6 | Less • 1 • 2 • 3 • 4 • 5+ days/week |
| hts_bin_commute_mode() | commute_mode | commute_mode_bin5 | SOV • HOV • trnst • w/b • oth |
| hts_bin_lum_sector() | industry | lum_sector | (13 sectors) |
| hts_bin_industry_sector() | industry | industry_sector | (8 sectors) |
| hts_bin_telecommute_freq() | telecommute_freq | telecommute_freq_bin4 | Less )[ 1 )[ 3 )[ 5+ |
| hts_bin_telework_time() | telework_time | telework_time_bin3 | Less )[ 3 )[ 6 hrs |
| hts_bin_dest_purpose() | dest_purpose | dest_purpose_bin4 / dest_purpose_bin9 | Home • Work • Social/Rec/Eat • Errand |
| hts_bin_mode() | mode_characterization | mode_basic | Walk/Bike/Micromobility • Transit • Carpool • Drive alone |
| hts_bin_transit_acc_mode() | mode_acc, mode_characterization | transit_mode_acc | Walked or jogged • Bike/Micromobility • Vehicular |
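
As a rough illustration of how interval breakpoints translate into categories, the sketch below uses base R `cut()` on made-up numeric ages. It reads the age_bin3 grouping as cut points at 18 and 65, each belonging to the upper bin, which is consistent with the `Under 18 Years` and `18-64 Years` labels in the example output. The numeric ages and the `65+ Years` label are assumptions for illustration; the survey itself reports age in categories, and this is not how the package computes its groupings.

```r
# Made-up numeric ages (hypothetical; the survey reports age in categories)
age_years <- c(12, 17, 18, 40, 64, 65, 80)

# right = FALSE makes each interval closed on the left,
# so 18 falls in [18, 65) and 65 falls in [65, Inf)
age_bin3_sketch <- cut(
  age_years,
  breaks = c(-Inf, 18, 65, Inf),
  labels = c("Under 18 Years", "18-64 Years", "65+ Years"),  # last label is assumed
  right  = FALSE
)

# Count of toy records per group
table(age_bin3_sketch)
```

With `right = FALSE`, a value equal to a breakpoint is assigned to the higher group; switching to the default `right = TRUE` would instead place 18 and 65 in the lower groups.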