opendatatoronto is an R interface to the City of Toronto Open Data Portal. The goal of the package is to help read data directly into R without needing to manually download it via the portal.
In the portal, datasets are called packages. You can see a list of available packages by using list_packages(). This shows metadata about each package, including the topics (i.e., tags) it covers, a description, any civic issues it addresses, how many resources it has (and their formats), how often it is refreshed, and when it was last refreshed.
library(opendatatoronto)
packages <- list_packages(limit = 10)
packages
#> # A tibble: 10 x 10
#> title id topics civic_issues excerpt dataset_category num_resources
#> <chr> <chr> <chr> <chr> <chr> <chr> <int>
#> 1 Dail… 8a6e… City … Affordable … "Daily… Table 8
#> 2 Body… c405… City … <NA> "This … Table 2
#> 3 Stre… 1db3… City … Mobility "Trans… Map 1
#> 4 Stre… 74f6… City … <NA> "Publi… Map 1
#> 5 Stre… 821f… City … <NA> "Publi… Map 1
#> 6 Stre… ccfd… City … <NA> "Poste… Map 1
#> 7 Stre… cf70… City … <NA> "Poste… Map 1
#> 8 Stre… 99b1… City … <NA> "Infor… Map 1
#> 9 Stre… 71e6… Trans… <NA> "Bike … Map 1
#> 10 Stre… 0c4e… City … <NA> "Bench… Map 1
#> # … with 3 more variables: formats <chr>, refresh_rate <chr>,
#> # last_refreshed <date>
Or, you can search packages by title using search_packages():
apartment_packages <- search_packages("Apartment")
apartment_packages
#> # A tibble: 2 x 10
#> title id topics civic_issues excerpt dataset_category num_resources formats
#> <chr> <chr> <chr> <chr> <chr> <chr> <int> <chr>
#> 1 Apar… 2b98… Busin… Affordable … "This … Table 1 XML,JS…
#> 2 Apar… 4ef8… Locat… Affordable … "This … Table 1 CSV,JS…
#> # … with 2 more variables: refresh_rate <chr>, last_refreshed <date>
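Because both functions return an ordinary tibble, the listing can be narrowed with standard data-frame subsetting. The following is a minimal sketch, assuming the column names shown in the output above (title, dataset_category, num_resources). keep_category() is a hypothetical helper, not part of opendatatoronto, and the small data frame is an illustrative stand-in so the snippet runs without a network connection.

```r
# Hypothetical helper (not part of opendatatoronto): keep only the packages
# in one dataset category, e.g. "Map" or "Table".
keep_category <- function(packages, category) {
  keep <- !is.na(packages$dataset_category) &
    packages$dataset_category == category
  packages[keep, , drop = FALSE]
}

# Stand-in with the same column names as the list_packages() output above;
# the values are illustrative only.
example_packages <- data.frame(
  title = c("Package A", "Package B", "Package C"),
  dataset_category = c("Table", "Map", "Map"),
  num_resources = c(8L, 1L, 1L),
  stringsAsFactors = FALSE
)

map_packages <- keep_category(example_packages, "Map")
map_packages$title

# With a live connection, the same helper could be applied to the real listing:
# map_packages <- keep_category(list_packages(limit = 10), "Map")
```

Plain bracket subsetting is used here so that the sketch needs no packages beyond base R; it works on tibbles as well as data frames.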
You can also see metadata for one specific package using show_package():
show_package("996cfe8d-fb35-40ce-b569-698d51fc683b")
#> # A tibble: 1 x 10
#> title id topics civic_issues excerpt dataset_category num_resources formats
#> <chr> <chr> <chr> <chr> <chr> <chr> <int> <chr>
#> 1 TTC … 996c… <NA> <NA> <NA> <NA> 35 <NA>
#> # … with 2 more variables: refresh_rate <chr>, last_refreshed <date>
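A search can return several rows (the "Apartment" search above returned two), so it may help to pin down a single package before looking up its id. The sketch below is one possible way to do that, assuming the title and id columns shown in the output above. pick_package() is a hypothetical helper, not part of opendatatoronto, and the data frame is an illustrative stand-in for a search result.

```r
# Hypothetical helper (not part of opendatatoronto): return the one row whose
# title matches exactly, and stop if there is no match or more than one.
pick_package <- function(packages, title) {
  keep <- !is.na(packages$title) & packages$title == title
  hit <- packages[keep, , drop = FALSE]
  if (nrow(hit) != 1L) {
    stop("Expected exactly one package titled '", title,
         "', found ", nrow(hit))
  }
  hit
}

# Stand-in for a search result; titles and ids are illustrative only.
example_search <- data.frame(
  title = c("Example Dataset One", "Example Dataset Two"),
  id = c("id-one", "id-two"),
  stringsAsFactors = FALSE
)

chosen <- pick_package(example_search, "Example Dataset Two")
chosen$id

# With a live connection, the id could then be handed to show_package():
# show_package(pick_package(search_packages("Apartment"), "<exact title>")$id)
```

Failing loudly on zero or multiple matches avoids silently inspecting the wrong package when titles are similar.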
Within a package, there are a number of resources, e.g. CSV, XLSX, JSON, SHP files, and more. Resources are the actual “data”.
For a given package, you can get a list of resources using list_package_resources(), either by passing a package found via search_packages() or list_packages():
library(dplyr) # provides the %>% pipe used below
apartment_building_registration_package <- search_packages("Apartment Building Registration")
apartment_building_registration_resources <- apartment_building_registration_package %>%
  list_package_resources()
apartment_building_registration_resources
#> # A tibble: 1 x 4
#> name id format last_modified
#> <chr> <chr> <chr> <date>
#> 1 Apartment Building Registra… 3ad76a8c-0518-4df2-b94e-8c7… CSV 2020-03-06
or by passing the package’s portal URL directly:
list_package_resources("https://open.toronto.ca/dataset/apartment-building-registration/")
#> # A tibble: 1 x 4
#> name id format last_modified
#> <chr> <chr> <chr> <date>
#> 1 Apartment Building Registra… 3ad76a8c-0518-4df2-b94e-8c7… CSV 2020-03-06
Finally (and most usefully!), you can download the resource (i.e., the actual data) directly into R using get_resource():
apartment_building_registration_data <- apartment_building_registration_resources %>%
  get_resource()
apartment_building_registration_data
#> # A tibble: 3,454 x 70
#> `_id` AIR_CONDITIONIN… AMENITIES_AVAIL… ANNUAL_FIRE_ALA… ANNUAL_FIRE_PUM…
#> <int> <chr> <chr> <lgl> <lgl>
#> 1 6905 NONE <NA> NA NA
#> 2 6906 NONE <NA> NA NA
#> 3 6907 NONE <NA> NA NA
#> 4 6908 INDIVIDUAL UNITS <NA> NA NA
#> 5 6909 CENTRAL AIR <NA> NA NA
#> 6 6910 CENTRAL AIR Indoor recreati… NA NA
#> 7 6911 NONE <NA> NA NA
#> 8 6912 NONE <NA> NA NA
#> 9 6913 NONE Child play area NA NA
#> 10 6914 NONE <NA> NA NA
#> # … with 3,444 more rows, and 65 more variables:
#> # APPROVED_FIRE_SAFETY_PLAN <lgl>, BALCONIES <chr>,
#> # BARRIER_FREE_ACCESSIBILTY_ENTR <chr>, BIKE_PARKING <chr>,
#> # CONFIRMED_STOREYS <lgl>, CONFIRMED_UNITS <lgl>,
#> # DATE_OF_LAST_INSPECTION_BY_TSSA <lgl>,
#> # DESCRIPTION_OF_CHILD_PLAY_AREA <lgl>,
#> # DESCRIPTION_OF_INDOOR_EXERCISE_ROOM <lgl>,
#> # DESCRIPTION_OF_OUTDOOR_REC_FACILITIES <lgl>, ELEVATOR_PARTS_REPLACED <lgl>,
#> # ELEVATOR_STATUS <lgl>, EMERG_POWER_SUPPLY_TEST_RECORDS <lgl>,
#> # EXTERIOR_FIRE_ESCAPE <chr>, FACILITIES_AVAILABLE <lgl>, FIRE_ALARM <chr>,
#> # GARBAGE_CHUTES <chr>, GREEN_BIN_LOCATION <lgl>,
#> # HEATING_EQUIPMENT_STATUS <lgl>, HEATING_EQUIPMENT_YEAR_INSTALLED <lgl>,
#> # HEATING_TYPE <chr>, INDOOR_GARBAGE_STORAGE_AREA <lgl>, INTERCOM <chr>,
#> # IS_THERE_A_COOLING_ROOM <lgl>, IS_THERE_EMERGENCY_POWER <lgl>,
#> # LAUNDRY_ROOM <chr>, LAUNDRY_ROOM_HOURS_OF_OPERATION <lgl>,
#> # LAUNDRY_ROOM_LOCATION <lgl>, LOCKER_OR_STORAGE_ROOM <chr>,
#> # NO_BARRIER_FREE_ACCESSBLE_UNITS <lgl>,
#> # NO_OF_ACCESSIBLE_PARKING_SPACES <lgl>, NO_OF_ELEVATORS <chr>,
#> # NO_OF_LAUNDRY_ROOM_MACHINES <lgl>, NON_SMOKING_BUILDING <lgl>,
#> # OUTDOOR_GARBAGE_STORAGE_AREA <lgl>, PARKING_TYPE <chr>, PCODE <lgl>,
#> # PET_RESTRICTIONS <lgl>, PETS_ALLOWED <chr>,
#> # PROP_MANAGEMENT_COMPANY_NAME <chr>, PROPERTY_TYPE <chr>,
#> # RECYCLING_BINS_LOCATION <lgl>, RSN <int>,
#> # SEPARATE_GAS_METERS_EACH_UNIT <chr>, SEPARATE_HYDRO_METER_EACH_UNIT <chr>,
#> # SEPARATE_WATER_METERS_EA_UNIT <chr>, SITE_ADDRESS <chr>,
#> # SPRINKLER_SYSTEM <chr>, SPRINKLER_SYSTEM_TEST_RECORD <lgl>,
#> # SPRINKLER_SYSTEM_YEAR_INSTALLED <lgl>, TSSA_TEST_RECORDS <lgl>,
#> # VISITOR_PARKING <chr>, WARD <chr>, WINDOW_TYPE <chr>, YEAR_BUILT <chr>,
#> # YEAR_OF_REPLACEMENT <lgl>, YEAR_REGISTERED <chr>, NO_OF_STOREYS <int>,
#> # `IS_THERE_EMERGENCY_POWER?` <chr>, `NON-SMOKING_BUILDING` <chr>,
#> # NO_OF_UNITS <int>, NO_OF_ACCESSIBLEPARKING_SPACES <int>,
#> # `FACILITIES_AVAILABLE?` <chr>, `IS_THERE_A_COOLING_ROOM?` <chr>,
#> # NO_BARRIERFREE_ACCESSBLE_UNITS <int>
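When a package holds several resources (the TTC package shown earlier lists 35), it can help to choose one resource before calling get_resource(). The following is a minimal sketch, assuming the name, format, and last_modified columns shown in the list_package_resources() output above. latest_resource() is a hypothetical helper, not part of opendatatoronto, and the data frame is an illustrative stand-in for a resource listing.

```r
# Hypothetical helper (not part of opendatatoronto): keep the most recently
# modified resource of a given format, matching the format case-insensitively.
latest_resource <- function(resources, format) {
  keep <- !is.na(resources$format) &
    toupper(resources$format) == toupper(format)
  hits <- resources[keep, , drop = FALSE]
  if (nrow(hits) == 0L) {
    stop("No resource with format '", format, "'")
  }
  # which.max() returns the first maximum, so ties resolve to the earliest row.
  hits[which.max(as.numeric(hits$last_modified)), , drop = FALSE]
}

# Stand-in for a resource listing; names and dates are illustrative only.
example_resources <- data.frame(
  name = c("example-2019", "example-2020", "example-2020"),
  format = c("CSV", "CSV", "XLSX"),
  last_modified = as.Date(c("2019-01-15", "2020-03-06", "2020-03-06")),
  stringsAsFactors = FALSE
)

newest_csv <- latest_resource(example_resources, "csv")
newest_csv$name

# With a live connection, the selected row could be passed on for download:
# get_resource(latest_resource(apartment_building_registration_resources, "CSV"))
```

Selecting by format and date is only one possible rule; filtering on the name column would work the same way when the resource of interest is known by name.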
The opendatatoronto package can currently handle the download of CSV, XLS/XLSX, XML, JSON, SHP, and GeoJSON resources, as well as ZIP resources that contain multiple files. For more information, see the following vignettes: