Set Operations
GeoIDs.jl supports standard set operations on GEOID sets, so you can combine, narrow, and subtract geographic areas by name. The operations follow ordinary set theory, and each one creates a new GEOID set that can be saved and versioned.
Available Operations
The package provides four primary set operations:
- Union: Combines all GEOIDs from multiple sets
- Intersection: Keeps only GEOIDs that appear in all sets
- Difference: Keeps GEOIDs from the first set that don't appear in the second set
- Symmetric Difference: Keeps GEOIDs that appear in exactly one of the two sets
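The semantics of these four operations match Julia's built-in collection functions. The sketch below applies `union`, `intersect`, `setdiff`, and `symdiff` from Base to plain vectors of sample GEOID strings. It only illustrates the set logic; it does not call GeoIDs.jl or touch the database, and the variable names and GEOID values are illustrative assumptions.

```julia
# Sample 5-digit county GEOID strings, for illustration only
a = ["12011", "12086", "12099"]
b = ["12086", "12099", "13121"]

all_ids     = union(a, b)      # every GEOID from either vector, duplicates removed
shared_ids  = intersect(a, b)  # GEOIDs present in both vectors
only_in_a   = setdiff(a, b)    # GEOIDs in a that are not in b
in_one_only = symdiff(a, b)    # GEOIDs in exactly one of the two vectors
```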
Union Operation
The union operation combines all GEOIDs from multiple sets, removing duplicates.
union_geoid_sets(set_names::Vector{String}, output_name::String, description::String="") -> Vector{String}
Example
# Combine western and mountain counties into a single set
combined = union_geoid_sets(
    ["western_counties", "mountain_counties"],
    "western_mountain_counties",
    "Combined western and mountain regions"
)

# Combine multiple regions
combined_regions = union_geoid_sets(
    ["south_florida", "central_florida", "east_coast"],
    "analysis_region",
    "Combined analysis region"
)
Parameters
- set_names: Vector of GEOID set names to combine
- output_name: Name of the new GEOID set to create
- description: Optional description for the new set
- Returns: Vector of GEOIDs in the resulting union set
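To make the duplicate removal concrete, here is a minimal in-memory sketch of a union over several named sets. The `stored` dictionary and the `union_sketch` helper are hypothetical stand-ins, not part of GeoIDs.jl, and the GEOID values are sample data; how the package actually stores and reads sets is not shown in this document.

```julia
# Hypothetical stand-in for stored sets: name => GEOIDs (not the package's storage)
stored = Dict(
    "south_florida"   => ["12011", "12086", "12099"],
    "central_florida" => ["12095", "12097", "12099"],
    "east_coast"      => ["12099", "12009"],
)

# Hypothetical helper: fold Base.union over the named sets
union_sketch(names) = reduce(union, (stored[n] for n in names); init = String[])

combined = union_sketch(["south_florida", "central_florida", "east_coast"])

# The inputs hold 8 entries in total, but "12099" appears in all three sets
total_entries = sum(length(stored[n]) for n in keys(stored))
```

Because overlapping GEOIDs are kept only once, the result can be shorter than the sum of the input lengths.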
Intersection Operation
The intersection operation keeps only GEOIDs that appear in all of the input sets.
intersect_geoid_sets(set_names::Vector{String}, output_name::String, description::String="") -> Vector{String}
Example
# Find counties that are both in Florida and coastal
common = intersect_geoid_sets(
    ["florida_counties", "coastal_counties"],
    "florida_coastal_counties",
    "Florida counties on the coast"
)

# Find counties that are in all three sets
three_way_intersection = intersect_geoid_sets(
    ["high_population", "high_income", "urban_counties"],
    "affluent_urban_counties",
    "Wealthy, populated urban counties"
)
Parameters
- set_names: Vector of GEOID set names to intersect
- output_name: Name of the new GEOID set to create
- description: Optional description for the new set
- Returns: Vector of GEOIDs in the resulting intersection set
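Each additional input set can only shrink an intersection, and sets with no overlap produce an empty result. The sketch below shows both cases with `intersect` from Julia's Base on plain vectors of sample GEOIDs. It is an illustration of the set logic only; whether `intersect_geoid_sets` stores an empty set or raises an error in that situation is not stated here and is worth checking before relying on it.

```julia
# Sample GEOID vectors, for illustration only
high_population = ["12086", "12011", "06037", "48201"]
high_income     = ["06037", "12011", "51059"]
urban_counties  = ["12011", "06037", "36061"]

# Base.intersect accepts any number of collections
in_all_three = intersect(high_population, high_income, urban_counties)

# Intersecting with a set that shares nothing yields an empty vector
none_shared = intersect(in_all_three, ["36061"])
```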
Difference Operation
The difference operation (sometimes called "set subtraction") keeps GEOIDs that are in the first set but not in the second set.
difference_geoid_sets(base_set::String, subtract_set::String, output_name::String, description::String="") -> Vector{String}
Example
# Get all Florida counties that are not coastal
non_coastal_florida = difference_geoid_sets(
    "florida_counties",
    "coastal_counties",
    "florida_inland_counties",
    "Florida counties that are not on the coast"
)

# Get all high population counties that are not in the west
non_western_populous = difference_geoid_sets(
    "high_population_counties",
    "western_counties",
    "eastern_populous_counties"
)
Parameters
- base_set: Name of the GEOID set to start with
- subtract_set: Name of the GEOID set to subtract
- output_name: Name of the new GEOID set to create
- description: Optional description for the new set
- Returns: Vector of GEOIDs in the resulting difference set
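Unlike union and intersection, difference depends on argument order: swapping the base set and the subtracted set gives a different result. A small sketch with `setdiff` from Julia's Base, using sample GEOID vectors rather than stored sets:

```julia
# Sample GEOID vectors, for illustration only
florida = ["12011", "12086", "12001"]
coastal = ["12011", "12086", "13051"]

inland_florida      = setdiff(florida, coastal)  # in florida, not in coastal
coastal_not_florida = setdiff(coastal, florida)  # in coastal, not in florida
```

The same applies to the first two arguments of `difference_geoid_sets`: the base set comes first and the set to subtract comes second.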
Symmetric Difference Operation
The symmetric difference operation keeps GEOIDs that appear in exactly one of the two input sets (not in both).
symmetric_difference_geoid_sets(set1::String, set2::String, output_name::String, description::String="") -> Vector{String}
Example
# Find counties that are either eastern or coastal, but not both
exclusive_regions = symmetric_difference_geoid_sets(
    "eastern_counties",
    "coastal_counties",
    "exclusive_regions",
    "Counties that are either eastern or coastal, but not both"
)

# Find counties that are in exactly one of these two states
fl_ga_exclusive = symmetric_difference_geoid_sets(
    "florida_counties",
    "georgia_counties",
    "fl_ga_exclusive"
)
Parameters
- set1: Name of the first GEOID set
- set2: Name of the second GEOID set
- output_name: Name of the new GEOID set to create
- description: Optional description for the new set
- Returns: Vector of GEOIDs in the resulting symmetric difference set
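A symmetric difference can also be read as "the union minus the intersection", or as the union of the two one-sided differences. The sketch below checks both identities with Julia's Base functions on sample GEOID vectors. It illustrates the definition only and makes no claim about how GeoIDs.jl computes the result internally.

```julia
# Sample GEOID vectors, for illustration only
eastern = ["12011", "13051", "37129"]
coastal = ["13051", "37129", "06037"]

exclusive = symdiff(eastern, coastal)

# Equivalent formulations of the same set
via_union_minus_intersection =
    setdiff(union(eastern, coastal), intersect(eastern, coastal))
via_two_differences =
    union(setdiff(eastern, coastal), setdiff(coastal, eastern))

# Compare as sets, since element order is not part of the definition
same_result = Set(exclusive) == Set(via_union_minus_intersection) ==
              Set(via_two_differences)
```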
Combining Operations
You can chain set operations together to create complex geographic definitions:
# Get Florida coastal counties first
fl_coastal = intersect_geoid_sets(
    ["florida_counties", "coastal_counties"],
    "florida_coastal_counties"
)

# Get Florida coastal counties that are not in the southern region
central_coastal = difference_geoid_sets(
    "florida_coastal_counties",
    "south_florida",
    "central_florida_coastal"
)

# Combine with a specific county that we want to include
final_region = union_geoid_sets(
    ["central_florida_coastal", "individual_counties"],
    "analysis_region"
)
Persistence and Versioning
All set operations create new GEOID sets that are:
- Stored in the database
- Versioned like any other GEOID set
- Available for further operations or analysis
This means you can:
- Track how your geographic definitions were created
- Modify the resulting sets with normal GEOID set operations
- Roll back changes if needed
- Export and import the sets with their full history