Major Crime Indicators (MCI)
Toronto Police Service Public Safety Data Portal
“MCI_2014_to_2018.csv”
http://data.torontopolice.on.ca/pages/glossary:
For the most part, the statistics on the following pages use an incident-based counting method. Generally, each type of major crime that occurred during an incident will be counted. For example, if an assault and a break and enter took place in the same incident, they would be counted once in each category. Statistics Canada also presents incident-based crime statistics, but generally counts only the most serious offence per incident. Some other police services present their crime statistics using the offence-based method, which counts all offences in each incident. It is important to note these differences when comparing our crime statistics to those provided by Statistics Canada or by other police agencies.
Assault. The direct or indirect application of force to another person, or the attempt or threat to apply force to another person, without that person’s consent.
Robbery. The act of taking property from another person or business by the use of force or intimidation in the presence of the victim.
Break and Enter. The act of entering a place with the intent to commit an indictable offence therein.
Auto Theft. The act of taking another person’s vehicle (not including attempts). Auto Theft figures represent the number of vehicles stolen.
Theft Over. The act of stealing property in excess of $5,000 (excluding auto theft).
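The three counting methods mentioned in the glossary note can be compared on a toy example. The sketch below uses made-up incidents and a placeholder seriousness ranking (neither comes from the Toronto Police Service or Statistics Canada), purely to show how the totals diverge.

```r
# Made-up offences: one row per offence recorded within an incident.
offences <- data.frame(
  incident = c("A", "A", "A", "B", "C", "C"),
  category = c("Assault", "Assault", "Break and Enter",
               "Robbery", "Assault", "Theft Over"),
  stringsAsFactors = FALSE
)

# Incident-based (the portal's method): each category counts once per incident.
incident_based <- table(unique(offences)$category)

# Offence-based: every offence in every incident counts.
offence_based <- table(offences$category)

# Most serious offence only: one count per incident.
# The ranking is a placeholder for illustration, not an official scale.
seriousness <- c("Robbery" = 1, "Assault" = 2,
                 "Break and Enter" = 3, "Theft Over" = 4)
most_serious <- tapply(offences$category, offences$incident,
                       function(x) x[which.min(seriousness[x])])
most_serious_based <- table(most_serious)

incident_based
offence_based
most_serious_based
```

Incident A contains two assaults and one break and enter: it adds one assault under the incident-based method, two under the offence-based method, and a single count for its top-ranked category under the most-serious-offence method.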
Statistics Canada does not release data at the level of Toronto’s social planning neighbourhoods. Neighbourhood level data for 2016 are initially calculated by summing data for the Census Tracts which comprise each neighbourhood.
“neighbourhood-profiles-2016-csv.csv”
library(data.table) #fread, setcolorder, rbindlist
library(sp) #used by rgdal
library(rgdal) #readOGR
library(ggplot2) #fortify
library(plyr) #join
library(scales) #percent_format, pretty_breaks
library(ggmap) #theme_nothing
library(rgeos) #gCentroid
library(forecast) #autoplot ts, auto.arima
MCI_dt <- fread("MCI_2014_to_2018.csv")
str(MCI_dt)
Classes ‘data.table’ and 'data.frame': 167525 obs. of 29 variables:
$ X : num -79.3 -79.5 -79.5 -79.6 -79.5 ...
$ Y : num 43.7 43.8 43.7 43.7 43.7 ...
$ Index_ : int 214 215 216 217 218 219 220 221 222 223 ...
$ event_unique_id : chr "GO-20141948968" "GO-20141950728" "GO-20141956416" "GO-20141956867" ...
$ occurrencedate : chr "2014-04-24T11:29:00.000Z" "2014-04-24T13:00:00.000Z" "2014-04-25T13:20:00.000Z" "2014-04-24T17:00:00.000Z" ...
$ reporteddate : chr "2014-04-24T12:46:00.000Z" "2014-04-24T15:58:00.000Z" "2014-04-25T13:52:00.000Z" "2014-04-25T10:30:00.000Z" ...
$ premisetype : chr "Commercial" "House" "Apartment" "Outside" ...
$ ucr_code : int 1610 2120 1430 1430 1430 1430 1430 1420 1420 1420 ...
$ ucr_ext : int 200 200 100 100 100 100 100 100 100 100 ...
$ offence : chr "Robbery - Mugging" "B&E" "Assault" "Assault" ...
$ reportedyear : int 2014 2014 2014 2014 2014 2014 2014 2014 2014 2014 ...
$ reportedmonth : chr "April" "April" "April" "April" ...
$ reportedday : int 24 24 25 25 25 25 3 3 3 3 ...
$ reporteddayofyear : int 114 114 115 115 115 115 123 123 123 123 ...
$ reporteddayofweek : chr "Thursday" "Thursday" "Friday" "Friday" ...
$ reportedhour : int 12 15 13 10 16 22 3 4 4 4 ...
$ occurrenceyear : int 2014 2014 2014 2014 2014 2014 2014 2014 2014 2014 ...
$ occurrencemonth : chr "April" "April" "April" "April" ...
$ occurrenceday : int 24 24 25 24 25 25 3 3 3 3 ...
$ occurrencedayofyear: int 114 114 115 114 115 115 123 123 123 123 ...
$ occurrencedayofweek: chr "Thursday" "Thursday" "Friday" "Thursday" ...
$ occurrencehour : int 11 13 13 17 16 22 1 4 4 4 ...
$ MCI : chr "Robbery" "Break and Enter" "Assault" "Assault" ...
$ Division : chr "D55" "D31" "D12" "D23" ...
$ Hood_ID : int 68 24 30 4 114 73 64 79 79 79 ...
$ Neighbourhood : chr "North Riverdale (68)" "Black Creek (24)" "Brookhaven-Amesbury (30)" "Rexdale-Kipling (4)" ...
$ Lat : num 43.7 43.8 43.7 43.7 43.7 ...
$ Long : num -79.3 -79.5 -79.5 -79.6 -79.5 ...
$ ObjectId : int 1 2 3 4 5 6 7 8 9 10 ...
- attr(*, ".internal.selfref")=<externalptr>
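fread() leaves occurrencedate and reporteddate as character strings. A minimal sketch of parsing them, assuming every value follows the ISO 8601 pattern shown in the str() output; the two sample strings are copied from the first record above, and the variable names are illustrative.

```r
# First record's timestamps, copied from the str() output above.
occurrence <- "2014-04-24T11:29:00.000Z"
reported   <- "2014-04-24T12:46:00.000Z"

# %OS accepts fractional seconds on input; the trailing "Z" is matched literally.
# tz = "UTC" keeps the clock time exactly as written in the string.
iso_format <- "%Y-%m-%dT%H:%M:%OSZ"
occurrence_time <- as.POSIXct(occurrence, format = iso_format, tz = "UTC")
reported_time   <- as.POSIXct(reported,   format = iso_format, tz = "UTC")

# Delay between occurrence and report, in minutes.
reporting_delay <- difftime(reported_time, occurrence_time, units = "mins")
reporting_delay
```

Parsed this way, the hours (11 and 12) agree with occurrencehour and reportedhour for the same record, which suggests those columns were derived from the strings as written.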
unique(MCI_dt$premisetype)
[1] "Commercial" "House" "Apartment" "Outside" "Other"
sort(unique(MCI_dt$ucr_code))
[1] 1410 1420 1430 1440 1450 1455 1457 1460 1461 1462 1470 1475 1480 1610 2120 2121 2125 2130 2132
[20] 2133 2135
sort(unique(MCI_dt$Hood_ID))
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
[25] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
[49] 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
[73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96
[97] 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
[121] 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140
unique(MCI_dt$MCI)
[1] "Robbery" "Break and Enter" "Assault" "Theft Over" "Auto Theft"
Uniform Crime Reporting Survey (UCR). UCR Incident-Based Survey: RDC User Manual:
- 1410 - Aggravated Assault – Level 3
- 1420 - Assault with Weapon or Causing Bodily Harm – Level 2
- 1430 - Assault – Level 1
- 1440 - Unlawfully Causing Bodily Harm
- 1450 - Discharge Firearm with Intent
- 1455 - Using Firearm/Imitation of Firearm in commission of offence
- 1457 - Pointing a Firearm
- 1460 - Assault Against Peace-Public Officer
- 1461 – Assault against Peace Officer with a Weapon or Causing Bodily Harm
- 1462 – Aggravated Assault against Peace Officer
- 1470 - Criminal Negligence Causing Bodily Harm
- 1475 – Trap Likely To or Causing Bodily Harm
- 1480 - Other Assaults
- 1610 - Robbery
- 2120 - Break and Enter
- 2121 – Break and Enter to Steal Firearm
- 2125 – Break and Enter of a Motor Vehicle to obtain a Firearm
- 2130 – Theft over $5,000
- 2132 – Theft over $5,000 from a Motor Vehicle
- 2133 – Shoplifting over $5,000
- 2135 – Theft of a Motor Vehicle
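To make the numeric ucr_code values readable in tables and plots, the descriptions listed above can be kept in a named lookup vector. A small sketch: the name ucr_labels is illustrative, only a subset of the 21 codes is included, and the sample codes are the first four values shown in the str() output.

```r
# Subset of the UCR violation codes listed above, keyed by code.
ucr_labels <- c(
  "1420" = "Assault with Weapon or Causing Bodily Harm – Level 2",
  "1430" = "Assault – Level 1",
  "1610" = "Robbery",
  "2120" = "Break and Enter",
  "2135" = "Theft of a Motor Vehicle"
)

# First four ucr_code values from the str() output.
codes <- c(1610, 2120, 1430, 1430)

# Index by character so each code is matched by name, not by position.
labels <- unname(ucr_labels[as.character(codes)])
labels
```

With the full list entered, the same indexing could add a description column, for example MCI_dt[, ucr_label := ucr_labels[as.character(ucr_code)]].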
Assault refers to three levels of physical assaults which include the following categories:
Common assault, (section 265). This includes the Criminal Code category assault (level 1). This is the least serious form of assault and includes pushing, slapping, punching, and face-to-face verbal threats.
Major assault levels 2 and 3, (sections 267, 268). This includes more serious forms of assault, i.e. assault with a weapon or causing bodily harm (level 2) and aggravated assault (level 3). Assault level 2 involves carrying, using or threatening to use a weapon against someone or causing someone bodily harm. Assault level 3 involves wounding, maiming, disfiguring or endangering the life of someone.
Criminal Code (R.S.C., 1985, c. C-46)
bodily harm means any hurt or injury to a person that interferes with the health or comfort of the person and that is more than merely transient or trifling in nature; (lésions corporelles)
Criminal negligence
219 (1) Every one is criminally negligent who
(a) in doing anything, or
(b) in omitting to do anything that it is his duty to do,
shows wanton or reckless disregard for the lives or safety of other persons.
Assault
265 (1) A person commits an assault when
(a) without the consent of another person, he applies force intentionally to that other person, directly or indirectly;
(b) he attempts or threatens, by an act or a gesture, to apply force to another person, if he has, or causes that other person to believe on reasonable grounds that he has, present ability to effect his purpose; or
(c) while openly wearing or carrying a weapon or an imitation thereof, he accosts or impedes another person or begs.
Breaking and entering with intent, committing offence or breaking out
348 (1) Every one who
(a) breaks and enters a place with intent to commit an indictable offence therein,
(b) breaks and enters a place and commits an indictable offence therein, or
(c) breaks out of a place after
(i) committing an indictable offence therein, or
(ii) entering the place with intent to commit an indictable offence therein,
is guilty
(d) if the offence is committed in relation to a dwelling-house, of an indictable offence and liable to imprisonment for life, and
(e) if the offence is committed in relation to a place other than a dwelling-house, of an indictable offence and liable to imprisonment for a term not exceeding ten years or of an offence punishable on summary conviction.
Robbery
343 Every one commits robbery who
(a) steals, and for the purpose of extorting whatever is stolen or to prevent or overcome resistance to the stealing, uses violence or threats of violence to a person or property;
(b) steals from any person and, at the time he steals or immediately before or immediately thereafter, wounds, beats, strikes or uses any personal violence to that person;
(c) assaults any person with intent to steal from him; or
(d) steals from any person while armed with an offensive weapon or imitation thereof.
Breaking and Entering in Canada - 2002, Juristat, Statistics Canada – Catalogue no. 85-002-XPE, Vol. 24, no. 5, page 1:
In 2002, over 31,000 persons were charged with B&E, the vast majority of whom were male (91%). Four in ten persons charged with B&E were youths. For property and violent crimes overall, youths represented 26% and 16% of persons charged, respectively.
Mathieu Charron, Neighbourhood Characteristics and the Distribution of Police-reported Crime in the City of Toronto, Canadian Centre For Justice Statistics, Statistics Canada, Catalogue no. 85-561-M, no. 18.
p. 11:
Crimes reported to the police are not randomly distributed throughout Toronto, but are concentrated in certain areas. An examination of local crime rates (the relationship between the number of crimes and the population at a local level) shows that the rates of violent crime are higher near the downtown core and in the east and northwest areas of the city (Map 5; See ‘Mapping techniques’ in the Methodology section for technical details.), which correspond roughly to the neighbourhoods along the Canadian National railway and to the areas where residents earn the lowest individual incomes (Map 3). There are some hot spots within these areas that have higher rates. Some of these are Danforth, downtown east side and the intersections of Lawrence and Morningside, Jane and Finch, and Jane and Eglinton.
p. 12:
In contrast, in the north area along Yonge Street, where residents earn a higher income, the violent crime rate is much lower than average. The business district—the Bay Street area where most of the workers in the finance and insurance industry are employed—has a violent crime rate well below the average for the city of Toronto. This differs from most of the other Canadian cities that have been the focus of studies, where the violent crime rate in the centre was high (Fitzgerald et al. 2004; Wallace et al. 2006; Kitchen 2006; Charron 2008). A similar situation was noted in Montréal, where the crime hot spots were spread out in many areas of the city (Savoie et al. 2006). The results suggest that the complex social geography of large cities like Toronto and Montréal is related to the spatial organization of crime.
pp. 12-13:
Several neighbourhood characteristics vary according to the local police-reported crime rate. Neighbourhoods with a high rate of violent crime are more densely populated and have a higher percentage of residents living in multi-unit dwellings. They also have the highest percentages of children (under the age of 15), renters, single-parent families and visible minorities. The residents of these neighbourhoods are also less likely to have a university degree, more likely to earn a lower wage, and more likely to live in low-income households.
p. 23:
As for demographic characteristics, rates of harassment and common assault increase with the proportion of children (under 15) and of young men (aged 20 to 29). Rates of sexual assaults, threats, major assaults and robberies decrease as the proportion of people aged 65 and older increases.
p. 24:
Motor vehicle theft rates are higher in neighbourhoods with higher proportions of children (under 15) and young men aged 20 to 29. They are also higher in neighbourhoods where access to socio-economic resources is limited or where there is a subway or train station, as well as in clusters of commercial and manufacturing activity.
p. 25:
The spatial structure of breaking and entering varies essentially with urban and economic activity characteristics. More specifically, results show that breaking and entering is relatively more frequent in central neighbourhoods, with high commercial activity, but less so in areas with high numbers of office jobs (Table 9).
p. 26:
Uttering threats, major assault and drug offences showed the closest association with access to socio-economic resources. Other strong links were noted for mischief, motor vehicle theft, robbery, sexual assault and common assault. Only other thefts (which exclude shoplifting, theft from a motor vehicle and motor vehicle theft) and breaking and entering were not significantly associated with access to socio-economic resources. Economic vulnerability was associated with generally serious violent crimes: robbery, major assault, sexual assault and uttering threats. It was not related to common assault, harassment or any type of property crime.
Neighbourhoods are aggregated from census tracts.
neighbourhoods_dt <- fread("neighbourhood-profiles-2016-csv.csv")
neighbourhoods_dt <- neighbourhoods_dt[, -c("_id", "Category", "Topic", "Data Source", "City of Toronto")]
Select census variables.
v <- c("Neighbourhood Number",
"Population, 2016",
"Land area in square kilometres",
"Children (0-14 years)",
"Seniors (65+ years)",
"Private households by household size",
"Average household size",
"In low income based on the Low-income cut-offs, after tax (LICO-AT)",
"Prevalence of low income based on the Low-income cut-offs, after tax (LICO-AT) (%)",
"Renter",
"Spending 30% or more of income on shelter costs",
"University certificate, diploma or degree at bachelor level or above",
"Unemployed (Males)",
"Unemployment rate (Males)",
"Public transit",
"Walked",
"Employment income: Average amount ($)",
"Social assistance benefits: Population with an amount")
Check that “Renter” occurs only once among the characteristics, so that it is counted for households only and not also for persons.
length(grep("Renter", neighbourhoods_dt$Characteristic))
[1] 1
neighbourhoods_v <-
neighbourhoods_dt[Characteristic %in% v,]
neighbourhoods_v <- transpose(neighbourhoods_v)
head(neighbourhoods_v)
Land area is in square kilometres. Children are those aged 0 to 14. Households_unaffordable is the number of households spending 30% or more of income on shelter costs: see Canada Mortgage and Housing Corporation, About Affordable Housing in Canada.
neighbourhoods_census <- neighbourhoods_v[!1,.(id=V1, Population=V2, Land_area=V3, Children=V4,
Seniors=V5, Households=V6, Average_household_size=V7,
LICO=V8, LICO_prevalence=V9, Renters=V10,
Households_unaffordable=V11,
Unemployed_males=V12, Unemployment_rate_males=V13,
Public_transit_to_work=V14, Walk_to_work=V15,
Average_employment_income=V16,
Social_assistance_recipients=V17)]
head(neighbourhoods_census)
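The V1 to V17 mapping above is positional: %in% keeps rows in the order they appear in the file, not the order of v, so the mapping holds only while that order is unchanged. A hedged alternative is to let transpose() carry the names across. The sketch uses a made-up two-neighbourhood table (names and figures are invented) and assumes data.table 1.12.4 or later for the keep.names and make.names arguments.

```r
library(data.table)

# Made-up profile table in the same long layout: one row per characteristic,
# one column per neighbourhood. All figures are invented.
toy <- data.table(
  Characteristic = c("Neighbourhood Number", "Population, 2016", "Renter"),
  `Hood A` = c("1", "1,200", "300"),
  `Hood B` = c("2", "3,400", "900")
)

# make.names: the column whose values become the new column names.
# keep.names: name of the new column that holds the old column names.
wide <- transpose(toy, keep.names = "Neighbourhood", make.names = "Characteristic")

# Rename by name, so row order in the source file no longer matters.
setnames(wide,
         old = c("Neighbourhood Number", "Population, 2016", "Renter"),
         new = c("id", "Population", "Renters"))
wide
```

The values are still character strings at this point, so the comma stripping and type conversion are needed afterwards either way.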
neighbourhoods_census$id <- as.integer(neighbourhoods_census$id)
neighbourhoods_census$Population <- as.integer(gsub(",", "", neighbourhoods_census$Population))
neighbourhoods_census$Land_area <- as.numeric(neighbourhoods_census$Land_area)
neighbourhoods_census$Children <- as.integer(gsub(",", "", neighbourhoods_census$Children))
neighbourhoods_census$Seniors <- as.integer(gsub(",", "", neighbourhoods_census$Seniors))
neighbourhoods_census$Households <- as.integer(neighbourhoods_census$Households)
neighbourhoods_census$Average_household_size <- as.numeric(neighbourhoods_census$Average_household_size)
neighbourhoods_census$LICO <- as.integer(gsub(",", "", neighbourhoods_census$LICO))
neighbourhoods_census$LICO_prevalence <- as.numeric(neighbourhoods_census$LICO_prevalence)
neighbourhoods_census$Renters <- as.integer(gsub(",", "", neighbourhoods_census$Renters))
neighbourhoods_census$Households_unaffordable <- as.integer(gsub(",", "", neighbourhoods_census$Households_unaffordable))
neighbourhoods_census$Unemployed_males <- as.integer(gsub(",", "", neighbourhoods_census$Unemployed_males))
neighbourhoods_census$Unemployment_rate_males <- as.numeric(neighbourhoods_census$Unemployment_rate_males)
neighbourhoods_census$Public_transit_to_work <- as.integer(gsub(",", "", neighbourhoods_census$Public_transit_to_work))
neighbourhoods_census$Walk_to_work <- as.integer(gsub(",", "", neighbourhoods_census$Walk_to_work))
neighbourhoods_census$Average_employment_income <- as.numeric(gsub(",", "", neighbourhoods_census$Average_employment_income))
neighbourhoods_census$Social_assistance_recipients <- as.integer(gsub(",", "", neighbourhoods_census$Social_assistance_recipients))
neighbourhoods_census <- neighbourhoods_census[order(id)]
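Most of the conversions above repeat one pattern: strip the thousands separators, then coerce to integer. A small helper keeps that in one place. parse_count is a hypothetical name, and the sample strings are illustrative.

```r
# Strip thousands separators, then convert to integer.
parse_count <- function(x) as.integer(gsub(",", "", x, fixed = TRUE))

parse_count(c("33,312", "870", "1,290"))
```

It could then be applied column by column, for example with data.table's set() inside a loop over the names of the count columns.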
str(neighbourhoods_census)
Classes ‘data.table’ and 'data.frame': 140 obs. of 17 variables:
$ id : int 1 2 3 4 5 6 7 8 9 10 ...
$ Population : int 33312 32954 10360 10529 9456 22000 22156 10948 15535 11051 ...
$ Land_area : num 29.81 4.52 3.31 2.49 2.86 ...
$ Children : int 5060 7090 1730 1640 1805 4240 3555 1450 2120 1770 ...
$ Seniors : int 4980 3560 1880 1730 1275 3585 4905 3045 3290 2025 ...
$ Households : int 10280 9880 3280 3845 3220 7785 8510 4135 6260 3865 ...
$ Average_household_size : num 3.2 3.32 3.09 2.69 2.93 2.82 2.6 2.45 2.43 2.86 ...
$ LICO : int 4550 7140 1485 1640 1695 4340 2470 1090 1250 660 ...
$ LICO_prevalence : num 13.8 21.8 14.7 15.8 17.9 19.7 11.2 10.8 8.2 6 ...
$ Renters : int 3275 5455 1245 1685 1470 3735 3925 1620 2745 595 ...
$ Households_unaffordable : int 3270 3715 1065 1185 1080 2730 2645 1325 1900 750 ...
$ Unemployed_males : int 870 890 260 290 245 475 440 215 250 205 ...
$ Unemployment_rate_males : num 9.2 11.4 9.8 10.4 10.5 8.8 7.8 8.4 5.7 6.7 ...
$ Public_transit_to_work : int 4380 4110 1030 1345 1330 2665 2380 1200 2010 950 ...
$ Walk_to_work : int 425 385 110 150 70 270 140 65 175 75 ...
$ Average_employment_income : num 33340 28126 34385 35988 33188 ...
$ Social_assistance_recipients: int 1290 2915 650 720 705 1710 840 410 370 145 ...
- attr(*, ".internal.selfref")=<externalptr>
MCI_2018 <- MCI_dt[reportedyear==2018]
MCI_2018_nbd <- MCI_2018[, c("MCI", "Hood_ID")]
str(MCI_2018_nbd)
Classes ‘data.table’ and 'data.frame': 36303 obs. of 2 variables:
$ MCI : chr "Assault" "Robbery" "Break and Enter" "Break and Enter" ...
$ Hood_ID: int 75 86 132 121 121 1 122 77 86 31 ...
- attr(*, ".internal.selfref")=<externalptr>
The MCI dataset classifies reports as Assault, Auto Theft, Break and Enter, Robbery, and Theft Over.
MCI_2018_grouped <- MCI_2018_nbd[,.(Number_of_reports=.N),by=.(id=Hood_ID, category=MCI)]
MCI_2018_grouped <- MCI_2018_grouped[order(id)]
Assault_MCI <- MCI_2018_grouped[category=="Assault", .(Assault_reports=sum(Number_of_reports)), by=.(id)]
Auto_theft_MCI <- MCI_2018_grouped[category=="Auto Theft", .(Auto_theft_reports=sum(Number_of_reports)), by=.(id)]
BE_MCI <- MCI_2018_grouped[category=="Break and Enter", .(BE_reports=sum(Number_of_reports)), by=.(id)]
Robbery_MCI <- MCI_2018_grouped[category=="Robbery", .(Robbery_reports=sum(Number_of_reports)), by=.(id)]
Theft_over_MCI <- MCI_2018_grouped[category=="Theft Over", .(Theft_over_reports=sum(Number_of_reports)), by=.(id)]
neighbourhoods_merged <- neighbourhoods_census
neighbourhoods_merged <- merge(neighbourhoods_merged, Assault_MCI, by="id", all=TRUE)
neighbourhoods_merged <- merge(neighbourhoods_merged, Robbery_MCI, by="id", all=TRUE)
neighbourhoods_merged <- merge(neighbourhoods_merged, BE_MCI, by="id", all=TRUE)
neighbourhoods_merged <- merge(neighbourhoods_merged, Theft_over_MCI, by="id", all=TRUE)
neighbourhoods_merged <- merge(neighbourhoods_merged, Auto_theft_MCI, by="id", all=TRUE)
# Robbery and Theft Over have no reports in some neighbourhoods, so the merges leave NA there; set those counts to 0
neighbourhoods_merged[is.na(neighbourhoods_merged)] <- 0
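The per-category filtering, the five merges, and the NA replacement can also be expressed as a single reshaping step. A sketch with dcast() from data.table on made-up rows: with fun.aggregate = length, combinations that never occur are filled with 0, so the counts stay integer.

```r
library(data.table)

# Made-up reports: one row per report, same two columns as MCI_2018_nbd.
toy_reports <- data.table(
  MCI     = c("Assault", "Assault", "Robbery", "Assault"),
  Hood_ID = c(1L, 1L, 1L, 2L)
)

# One row per neighbourhood, one column per MCI category, cells = report counts.
counts_wide <- dcast(toy_reports, Hood_ID ~ MCI,
                     fun.aggregate = length, value.var = "MCI")
counts_wide
```

A neighbourhood with no reports in any category would still be absent from this table, so a merge with the census table using all=TRUE remains necessary to keep all 140 rows.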
str(neighbourhoods_merged)
Classes ‘data.table’ and 'data.frame': 140 obs. of 22 variables:
$ id : int 1 2 3 4 5 6 7 8 9 10 ...
$ Population : int 33312 32954 10360 10529 9456 22000 22156 10948 15535 11051 ...
$ Land_area : num 29.81 4.52 3.31 2.49 2.86 ...
$ Children : int 5060 7090 1730 1640 1805 4240 3555 1450 2120 1770 ...
$ Seniors : int 4980 3560 1880 1730 1275 3585 4905 3045 3290 2025 ...
$ Households : int 10280 9880 3280 3845 3220 7785 8510 4135 6260 3865 ...
$ Average_household_size : num 3.2 3.32 3.09 2.69 2.93 2.82 2.6 2.45 2.43 2.86 ...
$ LICO : int 4550 7140 1485 1640 1695 4340 2470 1090 1250 660 ...
$ LICO_prevalence : num 13.8 21.8 14.7 15.8 17.9 19.7 11.2 10.8 8.2 6 ...
$ Renters : int 3275 5455 1245 1685 1470 3735 3925 1620 2745 595 ...
$ Households_unaffordable : int 3270 3715 1065 1185 1080 2730 2645 1325 1900 750 ...
$ Unemployed_males : int 870 890 260 290 245 475 440 215 250 205 ...
$ Unemployment_rate_males : num 9.2 11.4 9.8 10.4 10.5 8.8 7.8 8.4 5.7 6.7 ...
$ Public_transit_to_work : int 4380 4110 1030 1345 1330 2665 2380 1200 2010 950 ...
$ Walk_to_work : int 425 385 110 150 70 270 140 65 175 75 ...
$ Average_employment_income : num 33340 28126 34385 35988 33188 ...
$ Social_assistance_recipients: int 1290 2915 650 720 705 1710 840 410 370 145 ...
$ Assault_reports : int 284 259 56 72 75 101 75 46 18 17 ...
$ Robbery_reports : num 69 73 11 25 15 18 16 6 7 2 ...
$ BE_reports : int 154 28 18 28 7 40 65 31 44 22 ...
$ Theft_over_reports : num 50 3 2 4 1 3 4 3 5 5 ...
$ Auto_theft_reports : int 495 73 46 54 37 57 51 16 18 20 ...
- attr(*, ".internal.selfref")=<externalptr>
- attr(*, "sorted")= chr "id"
Calculate ratios of 2018 MCI reports to 2016 census population
neighbourhoods_merged$Assault_ratio <- neighbourhoods_merged$Assault_reports/neighbourhoods_merged$Population
neighbourhoods_merged$Auto_theft_ratio <- neighbourhoods_merged$Auto_theft_reports/neighbourhoods_merged$Population
neighbourhoods_merged$BE_ratio <- neighbourhoods_merged$BE_reports/neighbourhoods_merged$Population
neighbourhoods_merged$Robbery_ratio <- neighbourhoods_merged$Robbery_reports/neighbourhoods_merged$Population
neighbourhoods_merged$Theft_over_ratio <- neighbourhoods_merged$Theft_over_reports/neighbourhoods_merged$Population
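Ratios like these are small fractions, and Statistics Canada conventionally quotes crime rates per 100,000 population instead. A sketch of the conversion using the first neighbourhood's values from the str() output above (284 assault reports, population 33,312); the variable names are illustrative.

```r
# Values for neighbourhood 1, copied from the str() output above.
assault_reports <- 284
population      <- 33312

# Reports per 100,000 residents.
assault_rate <- assault_reports / population * 1e5
round(assault_rate)
```

For this neighbourhood the result is about 853 assault reports per 100,000 residents.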
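Calculate averages across neighbourhoods (unweighted means of the 140 neighbourhood values), then ratios of census variables to population, to households, or to those averages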
toronto_avg_household_size <- neighbourhoods_merged[, mean(Average_household_size)]
toronto_avg_employment_income <- neighbourhoods_merged[, mean(Average_employment_income)]
toronto_avg_unemployment_rate_males <- neighbourhoods_merged[, mean(Unemployment_rate_males)]
toronto_avg_household_size
[1] 2.491643
toronto_avg_employment_income
[1] 55698.18
toronto_avg_unemployment_rate_males
[1] 8.108571
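These three figures are unweighted means of the 140 neighbourhood values, so a small neighbourhood counts as much as a large one. If a city-wide figure is wanted, weighting by the relevant denominator is one option. A sketch with the first three neighbourhoods' household sizes and household counts from the str() output; the variable names are illustrative.

```r
# First three neighbourhoods, copied from the str() output above.
avg_household_size <- c(3.2, 3.32, 3.09)
households         <- c(10280, 9880, 3280)

unweighted <- mean(avg_household_size)
weighted   <- weighted.mean(avg_household_size, w = households)
c(unweighted = unweighted, weighted = weighted)
```

Here the weighted mean is slightly higher, because the two larger neighbourhoods also have the larger households.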
neighbourhoods_merged$Children_ratio <- neighbourhoods_merged$Children/neighbourhoods_merged$Population
neighbourhoods_merged$Seniors_ratio <- neighbourhoods_merged$Seniors/neighbourhoods_merged$Population
neighbourhoods_merged$Renters_ratio <- neighbourhoods_merged$Renters/neighbourhoods_merged$Households
neighbourhoods_merged$Households_unaffordable_ratio <- neighbourhoods_merged$Households_unaffordable/neighbourhoods_merged$Households
neighbourhoods_merged$Public_transit_to_work_ratio <- neighbourhoods_merged$Public_transit_to_work/neighbourhoods_merged$Population
neighbourhoods_merged$Social_assistance_recipients_ratio <- neighbourhoods_merged$Social_assistance_recipients/neighbourhoods_merged$Population
neighbourhoods_merged$Average_household_size_ratio <- neighbourhoods_merged$Average_household_size/toronto_avg_household_size
neighbourhoods_merged$Average_employment_income_ratio <- neighbourhoods_merged$Average_employment_income/toronto_avg_employment_income
neighbourhoods_merged$Unemployment_rate_males_ratio <- neighbourhoods_merged$Unemployment_rate_males/toronto_avg_unemployment_rate_males
Neighbourhoods (WGS84). City of Toronto, Social Development, Finance & Administration
“neighbourhoods_planning_areas_wgs84.zip”
nbds <- readOGR("C:/Users/14165/Desktop/Shapefiles/neighbourhoods_planning_areas_wgs84", "NEIGHBORHOODS_WGS84")
OGR data source with driver: ESRI Shapefile
Source: "C:\Users\14165\Desktop\Shapefiles\neighbourhoods_planning_areas_wgs84", layer: "NEIGHBORHOODS_WGS84"
with 140 features
It has 2 fields
Add “id” column
nbds@data$id <- as.integer(nbds@data$AREA_S_CD)
Make centroids of each neighbourhood, for placing labels when plotting
nbds.centroids <- as.data.frame(gCentroid(nbds, byid = TRUE))
Add “id” column
nbds.centroids$id <- nbds@data$id
Shapefile processing
nbds.points = fortify(nbds, region = "id")
nbds.df = join(nbds.points, nbds@data, by = "id")
nbds_MCI <- merge(nbds.df, neighbourhoods_merged, by = "id")
Make and plot choropleth
p.LICO_percent <- ggplot() +
geom_polygon(data = nbds_MCI,
aes(x = long, y = lat, group = group, fill = LICO_prevalence/100),
color = "black", size = 0.2) +
coord_map() +
scale_fill_distiller(name="LICO prevalence", labels=percent_format(accuracy=1), palette = "RdPu", trans = "reverse", breaks = pretty_breaks(n = 10)) +
theme_nothing(legend = TRUE) +
labs(title="Prevalence of low income (LICO-AT), 2016 Census") +
geom_text(aes(x=x,y=y, group=NULL, label=id), data = nbds.centroids, size = 2)
p.LICO_percent + guides(fill = guide_legend(reverse = TRUE))
Make and plot choropleth
p.assaults <- ggplot() +
geom_polygon(data = nbds_MCI,
aes(x = long, y = lat, group = group, fill = Assault_reports),
color = "black", size = 0.2) +
coord_map() +
scale_fill_distiller(name="Assaults", palette = "YlOrRd", trans = "reverse", breaks = pretty_breaks(n = 8)) +
theme_nothing(legend = TRUE) +
labs(title="Number of assault reports in Toronto by neighbourhood, 2018") +
geom_text(aes(x=x,y=y, group=NULL, label=id), data = nbds.centroids, size = 2)
p.assaults + guides(fill = guide_legend(reverse = TRUE))
p.assaults_ratio <- ggplot() +
geom_polygon(data = nbds_MCI,
aes(x = long, y = lat, group = group, fill = Assault_ratio),
color = "black", size = 0.2) +
coord_map() +
scale_fill_distiller(name="Assaults/Population", palette = "Reds", trans = "reverse", breaks = pretty_breaks(n = 8)) +
theme_nothing(legend = TRUE) +
labs(title="Number of assault reports/population count in Toronto by neighbourhood, 2018") +
geom_text(aes(x=x,y=y, group=NULL, label=id), data = nbds.centroids, size = 2)
p.assaults_ratio + guides(fill = guide_legend(reverse = TRUE))
https://stats.stackexchange.com/questions/31726/scatterplot-with-contour-heat-overlay
https://gist.github.com/lmullen/8375785
https://gis.stackexchange.com/questions/165974/r-fortify-causing-polygons-to-tear
MCI_Assault_XY <- MCI_dt[MCI=="Assault",.(X,Y)]
MCI_Auto_Theft_XY <- MCI_dt[MCI=="Auto Theft",.(X,Y)]
MCI_BE_XY <- MCI_dt[MCI=="Break and Enter",.(X,Y)]
MCI_Robbery_XY <- MCI_dt[MCI=="Robbery", .(X,Y)]
MCI_Theft_Over_XY <- MCI_dt[MCI=="Theft Over", .(X,Y)]
Read shapefiles
torontoBoundary_wgs84 <- readOGR("C:/Users/14165/Desktop/Shapefiles/torontoBoundary_wgs84", "citygcs_regional_mun_wgs84")
OGR data source with driver: ESRI Shapefile
Source: "C:\Users\14165\Desktop\Shapefiles\torontoBoundary_wgs84", layer: "citygcs_regional_mun_wgs84"
with 1 features
It has 3 fields
Integer64 fields read as strings: AREA_ID OBJECTID
TTC_subway_lines_wgs84 <- readOGR("C:/Users/14165/Desktop/Shapefiles/TTC_subway_lines_wgs84", "TTC_SUBWAY_LINES_WGS84")
OGR data source with driver: ESRI Shapefile
Source: "C:\Users\14165\Desktop\Shapefiles\TTC_subway_lines_wgs84", layer: "TTC_SUBWAY_LINES_WGS84"
with 4 features
It has 3 fields
Integer64 fields read as strings: RID
centreline_wgs84 <- readOGR("C:/Users/14165/Desktop/Shapefiles/centreline_wgs84", "CENTRELINE_WGS84")
OGR data source with driver: ESRI Shapefile
Source: "C:\Users\14165\Desktop\Shapefiles\centreline_wgs84", layer: "CENTRELINE_WGS84"
with 69378 features
It has 17 fields
Integer64 fields read as strings: GEO_ID LFN_ID FNODE TNODE
Every linear feature has a feature code (FCODE), defined as follows:
201100 Highway
201101 Highway Ramp
201200 Major Arterial Road
201201 Major Arterial Road Ramp
201300 Minor Arterial Road
201301 Minor Arterial Road Ramp
201400 Collector Road
201401 Collector Road Ramp
201500 Local Road
201600 Other Road
201601 Other Ramp
201700 Laneways
201800 Pending
201803 Access Road
201801 Busway
202001 Major Railway
202002 Minor Railway
202003 Railway under construction/proposed
203001 River
203002 Creek/Tributary
204001 Trail
204002 Walkway
205001 Hydro Line
206001 Major Shoreline
206002 Minor Shoreline (Land locked)
centreline_wgs84_major <- centreline_wgs84[centreline_wgs84@data$FCODE %in% c(201100, 201200, 201300, 201400),]
torontoBoundary_wgs84.df <- fortify(torontoBoundary_wgs84)
Regions defined for each Polygons
TTC_subway_lines_wgs84.df <- fortify(TTC_subway_lines_wgs84)
centreline_wgs84_major.df <- fortify(centreline_wgs84_major)
ggplot()+geom_path(data = centreline_wgs84_major.df, aes(x = long, y = lat, group = group),
color = 'black', size = .2)
ggplot() +
geom_polygon(data = torontoBoundary_wgs84.df, aes(x = long, y = lat, group = group),
color = 'black', size = 1, fill=NA) +
geom_path(data = TTC_subway_lines_wgs84.df, aes(x = long, y = lat, group = group),
color = 'red', size = 1) +
geom_path(data = centreline_wgs84_major.df, aes(x = long, y = lat, group = group),
color = 'black', size = .1) +
stat_density2d(data=MCI_Assault_XY, aes(x=X, y=Y, fill=..level..), alpha=0.2, geom = 'polygon', colour = 'black', contour=TRUE) +
scale_fill_continuous(low="yellow",high="red")+
theme_nothing(legend = TRUE) +
labs(title="Assault Reports in Toronto, 2014-2018")
ggplot() +
geom_polygon(data = torontoBoundary_wgs84.df, aes(x = long, y = lat, group = group),
color = 'black', size = 1, fill=NA) +
geom_path(data = TTC_subway_lines_wgs84.df, aes(x = long, y = lat, group = group),
color = 'red', size = 1) +
geom_path(data = centreline_wgs84_major.df, aes(x = long, y = lat, group = group),
color = 'black', size = .1) +
stat_density2d(data=MCI_Auto_Theft_XY, aes(x=X, y=Y, fill=..level..), alpha=0.2, geom = 'polygon', colour = 'black', contour=TRUE) +
scale_fill_continuous(low="yellow",high="red") +
theme_nothing(legend = TRUE) +
labs(title="Auto Theft Reports in Toronto, 2014-2018")
Change bandwidth parameter h
ggplot() +
geom_polygon(data = torontoBoundary_wgs84.df, aes(x = long, y = lat, group = group),
color = 'black', size = 1, fill=NA) +
geom_path(data = TTC_subway_lines_wgs84.df, aes(x = long, y = lat, group = group),
color = 'red', size = 1) +
geom_path(data = centreline_wgs84_major.df, aes(x = long, y = lat, group = group),
color = 'black', size = .1) +
stat_density2d(data=MCI_Auto_Theft_XY, aes(x=X, y=Y, fill=..level..), alpha=0.2, h=0.05, n=300, geom = 'polygon', colour = 'black', contour=TRUE) +
scale_fill_continuous(low="yellow",high="red") +
theme_nothing(legend = TRUE) +
labs(title="Auto Theft Reports in Toronto, 2014-2018. h=0.05")
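stat_density2d() estimates the surface with MASS::kde2d(), and when h is not supplied it takes a bandwidth for each axis from MASS::bandwidth.nrd(). The effect of h can be checked outside ggplot2. A sketch on synthetic points (two made-up clusters, not the MCI data); the narrow value 0.01 is arbitrary.

```r
library(MASS)  # kde2d, bandwidth.nrd

set.seed(1)
# Two synthetic clusters in longitude/latitude-like coordinates.
x <- c(rnorm(500, -79.40, 0.02), rnorm(500, -79.30, 0.02))
y <- c(rnorm(500,  43.70, 0.02), rnorm(500,  43.75, 0.02))

# Default rule-of-thumb bandwidths, one per axis.
h_default <- c(bandwidth.nrd(x), bandwidth.nrd(y))

# Same points, same grid, two bandwidths.
kde_default <- kde2d(x, y, h = h_default, n = 100)
kde_narrow  <- kde2d(x, y, h = 0.01,      n = 100)

# A narrower bandwidth smooths less, so the surface has sharper, higher peaks.
c(default_peak = max(kde_default$z), narrow_peak = max(kde_narrow$z))
```

A smaller h therefore produces more, tighter contours around local clusters, while a larger h merges them into broader regions.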
ggplot() +
geom_polygon(data = torontoBoundary_wgs84.df, aes(x = long, y = lat, group = group),
color = 'black', size = 1, fill=NA) +
geom_path(data = TTC_subway_lines_wgs84.df, aes(x = long, y = lat, group = group),
color = 'red', size = 1) +
geom_path(data = centreline_wgs84_major.df, aes(x = long, y = lat, group = group),
color = 'black', size = .1) +
stat_density2d(data=MCI_BE_XY, aes(x=X, y=Y, fill=..level..), alpha=0.2, geom = 'polygon', colour = 'black', contour=TRUE) +
scale_fill_continuous(low="yellow",high="red") +
theme_nothing(legend = TRUE) +
labs(title="Break and Enter Reports in Toronto, 2014-2018")
ggplot() +
geom_polygon(data = torontoBoundary_wgs84.df, aes(x = long, y = lat, group = group),
color = 'black', size = 1, fill=NA) +
geom_path(data = TTC_subway_lines_wgs84.df, aes(x = long, y = lat, group = group),
color = 'red', size = 1) +
geom_path(data = centreline_wgs84_major.df, aes(x = long, y = lat, group = group),
color = 'black', size = .1) +
stat_density2d(data=MCI_Robbery_XY, aes(x=X, y=Y, fill=..level..), alpha=0.2, geom = 'polygon', colour = 'black', contour=TRUE) +
scale_fill_continuous(low="yellow",high="red") +
theme_nothing(legend = TRUE) +
labs(title="Robbery Reports in Toronto, 2014-2018")
ggplot() +
geom_polygon(data = torontoBoundary_wgs84.df, aes(x = long, y = lat, group = group),
color = 'black', size = 1, fill=NA) +
geom_path(data = TTC_subway_lines_wgs84.df, aes(x = long, y = lat, group = group),
color = 'red', size = 1) +
geom_path(data = centreline_wgs84_major.df, aes(x = long, y = lat, group = group),
color = 'black', size = .1) +
stat_density2d(data=MCI_Theft_Over_XY, aes(x=X, y=Y, fill=..level..), alpha=0.2, geom = 'polygon', colour = 'black', contour=TRUE) +
scale_fill_continuous(low="yellow",high="red") +
theme_nothing(legend = TRUE) +
labs(title="Theft Over $5000 Reports in Toronto, 2014-2018")
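The density maps above differ only in the point data, the title and the optional h and n arguments. A possible way to avoid repeating the layers is a small helper. plot_mci_density is a hypothetical name that does not appear elsewhere in this document, and the sketch keeps the same ggplot2 syntax used above.

```r
library(ggplot2) # ggplot, geom_polygon, geom_path, stat_density2d
library(ggmap)   # theme_nothing

# Hypothetical helper (not part of the original analysis): same layers as the
# maps above, with the point data and the title as arguments.
# Extra arguments such as h or n are passed on to stat_density2d.
plot_mci_density <- function(points_xy, title, boundary, subway, roads, ...) {
  ggplot() +
    geom_polygon(data = boundary, aes(x = long, y = lat, group = group),
                 color = 'black', size = 1, fill = NA) +
    geom_path(data = subway, aes(x = long, y = lat, group = group),
              color = 'red', size = 1) +
    geom_path(data = roads, aes(x = long, y = lat, group = group),
              color = 'black', size = .1) +
    stat_density2d(data = points_xy, aes(x = X, y = Y, fill = ..level..),
                   alpha = 0.2, geom = 'polygon', colour = 'black',
                   contour = TRUE, ...) +
    scale_fill_continuous(low = "yellow", high = "red") +
    theme_nothing(legend = TRUE) +
    labs(title = title)
}

# Intended use with the objects defined earlier in this document:
# plot_mci_density(MCI_Robbery_XY, "Robbery Reports in Toronto, 2014-2018",
#                  torontoBoundary_wgs84.df, TTC_subway_lines_wgs84.df,
#                  centreline_wgs84_major.df)
```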
str(neighbourhoods_merged)
Classes ‘data.table’ and 'data.frame': 140 obs. of 36 variables:
$ id : int 1 2 3 4 5 6 7 8 9 10 ...
$ Population : int 33312 32954 10360 10529 9456 22000 22156 10948 15535 11051 ...
$ Land_area : num 29.81 4.52 3.31 2.49 2.86 ...
$ Children : int 5060 7090 1730 1640 1805 4240 3555 1450 2120 1770 ...
$ Seniors : int 4980 3560 1880 1730 1275 3585 4905 3045 3290 2025 ...
$ Households : int 10280 9880 3280 3845 3220 7785 8510 4135 6260 3865 ...
$ Average_household_size : num 3.2 3.32 3.09 2.69 2.93 2.82 2.6 2.45 2.43 2.86 ...
$ LICO : int 4550 7140 1485 1640 1695 4340 2470 1090 1250 660 ...
$ LICO_prevalence : num 13.8 21.8 14.7 15.8 17.9 19.7 11.2 10.8 8.2 6 ...
$ Renters : int 3275 5455 1245 1685 1470 3735 3925 1620 2745 595 ...
$ Households_unaffordable : int 3270 3715 1065 1185 1080 2730 2645 1325 1900 750 ...
$ Unemployed_males : int 870 890 260 290 245 475 440 215 250 205 ...
$ Unemployment_rate_males : num 9.2 11.4 9.8 10.4 10.5 8.8 7.8 8.4 5.7 6.7 ...
$ Public_transit_to_work : int 4380 4110 1030 1345 1330 2665 2380 1200 2010 950 ...
$ Walk_to_work : int 425 385 110 150 70 270 140 65 175 75 ...
$ Average_employment_income : num 33340 28126 34385 35988 33188 ...
$ Social_assistance_recipients : int 1290 2915 650 720 705 1710 840 410 370 145 ...
$ Assault_reports : int 284 259 56 72 75 101 75 46 18 17 ...
$ Robbery_reports : num 69 73 11 25 15 18 16 6 7 2 ...
$ BE_reports : int 154 28 18 28 7 40 65 31 44 22 ...
$ Theft_over_reports : num 50 3 2 4 1 3 4 3 5 5 ...
$ Auto_theft_reports : int 495 73 46 54 37 57 51 16 18 20 ...
$ Assault_ratio : num 0.00853 0.00786 0.00541 0.00684 0.00793 ...
$ Auto_theft_ratio : num 0.01486 0.00222 0.00444 0.00513 0.00391 ...
$ BE_ratio : num 0.00462 0.00085 0.00174 0.00266 0.00074 ...
$ Robbery_ratio : num 0.00207 0.00222 0.00106 0.00237 0.00159 ...
$ Theft_over_ratio : num 0.001501 0.000091 0.000193 0.00038 0.000106 ...
$ Children_ratio : num 0.152 0.215 0.167 0.156 0.191 ...
$ Seniors_ratio : num 0.149 0.108 0.181 0.164 0.135 ...
$ Renters_ratio : num 0.319 0.552 0.38 0.438 0.457 ...
$ Households_unaffordable_ratio : num 0.318 0.376 0.325 0.308 0.335 ...
$ Public_transit_to_work_ratio : num 0.1315 0.1247 0.0994 0.1277 0.1407 ...
$ Social_assistance_recipients_ratio: num 0.0387 0.0885 0.0627 0.0684 0.0746 ...
$ Average_household_size_ratio : num 1.28 1.33 1.24 1.08 1.18 ...
$ Average_employment_income_ratio : num 0.599 0.505 0.617 0.646 0.596 ...
$ Unemployment_rate_males_ratio : num 1.13 1.41 1.21 1.28 1.29 ...
- attr(*, ".internal.selfref")=<externalptr>
- attr(*, "sorted")= chr "id"
neighbourhoods_ratios <-
neighbourhoods_merged[, c("Assault_ratio",
"Auto_theft_ratio",
"BE_ratio",
"Robbery_ratio",
"Theft_over_ratio",
"Average_household_size",
"LICO_prevalence",
"Children_ratio",
"Seniors_ratio",
"Renters_ratio",
"Public_transit_to_work_ratio",
"Social_assistance_recipients_ratio",
"Average_household_size_ratio",
"Average_employment_income_ratio",
"Unemployment_rate_males_ratio")]
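Compute correlations. The correlations between the MCI ratios and the demographic ratios I have selected are weak to moderate: the largest in absolute value is 0.54 (Assault_ratio with LICO_prevalence), and most are below 0.4 in absolute value. Average_household_size and Average_household_size_ratio are perfectly correlated (1.0), so they carry the same information.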
cor(as.matrix(neighbourhoods_ratios))
Assault_ratio Auto_theft_ratio BE_ratio Robbery_ratio
Assault_ratio 1.0000000 0.123934572 0.644161811 0.8466125
Auto_theft_ratio 0.1239346 1.000000000 0.177476235 0.2621782
BE_ratio 0.6441618 0.177476235 1.000000000 0.5913394
Robbery_ratio 0.8466125 0.262178218 0.591339393 1.0000000
Theft_over_ratio 0.6465254 0.356012795 0.692406847 0.5872684
Average_household_size -0.3220854 0.337037837 -0.318775580 -0.2446828
LICO_prevalence 0.5444735 -0.115264927 0.273804880 0.4082138
Children_ratio -0.3295457 0.127253992 -0.454609155 -0.2649427
Seniors_ratio -0.3511057 0.028090974 -0.050108996 -0.2796248
Renters_ratio 0.3868284 -0.194433863 0.179501393 0.2721201
Public_transit_to_work_ratio 0.1356827 -0.260180807 -0.026893672 0.1262704
Social_assistance_recipients_ratio 0.4263891 0.003831698 0.002276327 0.3276343
Average_household_size_ratio -0.3220854 0.337037837 -0.318775580 -0.2446828
Average_employment_income_ratio -0.1899689 -0.095820295 0.158920915 -0.1389625
Unemployment_rate_males_ratio 0.1713490 0.118402893 -0.067685554 0.1195029
Theft_over_ratio Average_household_size LICO_prevalence
Assault_ratio 0.64652538 -0.32208540 0.5444735
Auto_theft_ratio 0.35601280 0.33703784 -0.1152649
BE_ratio 0.69240685 -0.31877558 0.2738049
Robbery_ratio 0.58726840 -0.24468277 0.4082138
Theft_over_ratio 1.00000000 -0.25953552 0.2641209
Average_household_size -0.25953552 1.00000000 -0.1531719
LICO_prevalence 0.26412090 -0.15317188 1.0000000
Children_ratio -0.42530132 0.62537285 -0.1074141
Seniors_ratio -0.12776436 0.20648236 -0.3889159
Renters_ratio 0.13890737 -0.55429267 0.6618480
Public_transit_to_work_ratio -0.01837898 -0.55228205 0.2120016
Social_assistance_recipients_ratio -0.01122991 0.05802879 0.6612614
Average_household_size_ratio -0.25953552 1.00000000 -0.1531719
Average_employment_income_ratio 0.03930987 -0.24239320 -0.4525759
Unemployment_rate_males_ratio -0.05914277 0.46818268 0.5901170
Children_ratio Seniors_ratio Renters_ratio
Assault_ratio -0.32954565 -0.35110573 0.3868284
Auto_theft_ratio 0.12725399 0.02809097 -0.1944339
BE_ratio -0.45460916 -0.05010900 0.1795014
Robbery_ratio -0.26494269 -0.27962482 0.2721201
Theft_over_ratio -0.42530132 -0.12776436 0.1389074
Average_household_size 0.62537285 0.20648236 -0.5542927
LICO_prevalence -0.10741409 -0.38891587 0.6618480
Children_ratio 1.00000000 -0.14210117 -0.1488113
Seniors_ratio -0.14210117 1.00000000 -0.4107354
Renters_ratio -0.14881132 -0.41073539 1.0000000
Public_transit_to_work_ratio -0.18119889 -0.45232803 0.5736128
Social_assistance_recipients_ratio 0.29844040 -0.42513857 0.5328017
Average_household_size_ratio 0.62537285 0.20648236 -0.5542927
Average_employment_income_ratio -0.06997932 0.18820202 -0.2084157
Unemployment_rate_males_ratio 0.38403635 -0.11106102 0.2099765
Public_transit_to_work_ratio Social_assistance_recipients_ratio
Assault_ratio 0.13568268 0.426389126
Auto_theft_ratio -0.26018081 0.003831698
BE_ratio -0.02689367 0.002276327
Robbery_ratio 0.12627040 0.327634301
Theft_over_ratio -0.01837898 -0.011229909
Average_household_size -0.55228205 0.058028794
LICO_prevalence 0.21200157 0.661261447
Children_ratio -0.18119889 0.298440398
Seniors_ratio -0.45232803 -0.425138569
Renters_ratio 0.57361284 0.532801690
Public_transit_to_work_ratio 1.00000000 0.189770279
Social_assistance_recipients_ratio 0.18977028 1.000000000
Average_household_size_ratio -0.55228205 0.058028794
Average_employment_income_ratio -0.11905638 -0.514634851
Unemployment_rate_males_ratio -0.16862450 0.604488407
Average_household_size_ratio Average_employment_income_ratio
Assault_ratio -0.32208540 -0.18996891
Auto_theft_ratio 0.33703784 -0.09582029
BE_ratio -0.31877558 0.15892091
Robbery_ratio -0.24468277 -0.13896248
Theft_over_ratio -0.25953552 0.03930987
Average_household_size 1.00000000 -0.24239320
LICO_prevalence -0.15317188 -0.45257594
Children_ratio 0.62537285 -0.06997932
Seniors_ratio 0.20648236 0.18820202
Renters_ratio -0.55429267 -0.20841573
Public_transit_to_work_ratio -0.55228205 -0.11905638
Social_assistance_recipients_ratio 0.05802879 -0.51463485
Average_household_size_ratio 1.00000000 -0.24239320
Average_employment_income_ratio -0.24239320 1.00000000
Unemployment_rate_males_ratio 0.46818268 -0.43807384
Unemployment_rate_males_ratio
Assault_ratio 0.17134895
Auto_theft_ratio 0.11840289
BE_ratio -0.06768555
Robbery_ratio 0.11950289
Theft_over_ratio -0.05914277
Average_household_size 0.46818268
LICO_prevalence 0.59011703
Children_ratio 0.38403635
Seniors_ratio -0.11106102
Renters_ratio 0.20997653
Public_transit_to_work_ratio -0.16862450
Social_assistance_recipients_ratio 0.60448841
Average_household_size_ratio 0.46818268
Average_employment_income_ratio -0.43807384
Unemployment_rate_males_ratio 1.00000000
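The full 15 x 15 matrix mixes crime-to-crime, demographic-to-demographic and crime-to-demographic correlations. A sketch of how to print only the crime-to-demographic block follows. The data frame is a hypothetical stand-in with random values and only four of the columns, so the numbers it prints are not results.

```r
set.seed(42)
# Hypothetical stand-in for neighbourhoods_ratios: 140 rows, four of its columns
ratios <- data.frame(Assault_ratio   = runif(140),
                     Robbery_ratio   = runif(140),
                     LICO_prevalence = runif(140),
                     Renters_ratio   = runif(140))

crime_vars <- c("Assault_ratio", "Robbery_ratio")
demographic_vars <- setdiff(names(ratios), crime_vars)

# cor(x, y) correlates each column of x with each column of y,
# giving one row per crime ratio and one column per demographic ratio
cross_cor <- cor(as.matrix(ratios[, crime_vars]),
                 as.matrix(ratios[, demographic_vars]))
round(cross_cor, 2)
```

With a data.table such as neighbourhoods_ratios, columns named in a character vector are selected with the `..` prefix, for example neighbourhoods_ratios[, ..crime_vars].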
str(MCI_dt)
Classes ‘data.table’ and 'data.frame': 167525 obs. of 29 variables:
$ X : num -79.3 -79.5 -79.5 -79.6 -79.5 ...
$ Y : num 43.7 43.8 43.7 43.7 43.7 ...
$ Index_ : int 214 215 216 217 218 219 220 221 222 223 ...
$ event_unique_id : chr "GO-20141948968" "GO-20141950728" "GO-20141956416" "GO-20141956867" ...
$ occurrencedate : chr "2014-04-24T11:29:00.000Z" "2014-04-24T13:00:00.000Z" "2014-04-25T13:20:00.000Z" "2014-04-24T17:00:00.000Z" ...
$ reporteddate : chr "2014-04-24T12:46:00.000Z" "2014-04-24T15:58:00.000Z" "2014-04-25T13:52:00.000Z" "2014-04-25T10:30:00.000Z" ...
$ premisetype : chr "Commercial" "House" "Apartment" "Outside" ...
$ ucr_code : int 1610 2120 1430 1430 1430 1430 1430 1420 1420 1420 ...
$ ucr_ext : int 200 200 100 100 100 100 100 100 100 100 ...
$ offence : chr "Robbery - Mugging" "B&E" "Assault" "Assault" ...
$ reportedyear : int 2014 2014 2014 2014 2014 2014 2014 2014 2014 2014 ...
$ reportedmonth : chr "April" "April" "April" "April" ...
$ reportedday : int 24 24 25 25 25 25 3 3 3 3 ...
$ reporteddayofyear : int 114 114 115 115 115 115 123 123 123 123 ...
$ reporteddayofweek : chr "Thursday" "Thursday" "Friday" "Friday" ...
$ reportedhour : int 12 15 13 10 16 22 3 4 4 4 ...
$ occurrenceyear : int 2014 2014 2014 2014 2014 2014 2014 2014 2014 2014 ...
$ occurrencemonth : chr "April" "April" "April" "April" ...
$ occurrenceday : int 24 24 25 24 25 25 3 3 3 3 ...
$ occurrencedayofyear: int 114 114 115 114 115 115 123 123 123 123 ...
$ occurrencedayofweek: chr "Thursday" "Thursday" "Friday" "Thursday" ...
$ occurrencehour : int 11 13 13 17 16 22 1 4 4 4 ...
$ MCI : chr "Robbery" "Break and Enter" "Assault" "Assault" ...
$ Division : chr "D55" "D31" "D12" "D23" ...
$ Hood_ID : int 68 24 30 4 114 73 64 79 79 79 ...
$ Neighbourhood : chr "North Riverdale (68)" "Black Creek (24)" "Brookhaven-Amesbury (30)" "Rexdale-Kipling (4)" ...
$ Lat : num 43.7 43.8 43.7 43.7 43.7 ...
$ Long : num -79.3 -79.5 -79.5 -79.6 -79.5 ...
$ ObjectId : int 1 2 3 4 5 6 7 8 9 10 ...
- attr(*, ".internal.selfref")=<externalptr>
- attr(*, "index")= int
..- attr(*, "__reportedyear")= int 1 2 3 4 5 6 7 8 9 10 ...
..- attr(*, "__MCI")= int 3 4 5 6 7 8 9 10 11 12 ...
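# Keep only the columns needed for monthly counts
MCI_dt_dates <- MCI_dt[,.(reportedyear,reportedmonth,MCI)]
# Convert month names ("April") to month numbers (4) so rows can be sorted chronologically
MCI_dt_dates$reportedmonth <- match(MCI_dt_dates$reportedmonth, month.name)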
str(MCI_dt_dates)
Classes ‘data.table’ and 'data.frame': 167525 obs. of 3 variables:
$ reportedyear : int 2014 2014 2014 2014 2014 2014 2014 2014 2014 2014 ...
$ reportedmonth: int 4 4 4 4 4 4 5 5 5 5 ...
$ MCI : chr "Robbery" "Break and Enter" "Assault" "Assault" ...
- attr(*, ".internal.selfref")=<externalptr>
# Count assault reports per reported year and month, then sort chronologically
Assault_dt <- MCI_dt_dates[MCI=="Assault",.N, by = .(reportedyear,reportedmonth)]
Assault_dt <- Assault_dt[order(reportedyear,reportedmonth)]
str(Assault_dt)
Classes ‘data.table’ and 'data.frame': 60 obs. of 3 variables:
$ reportedyear : int 2014 2014 2014 2014 2014 2014 2014 2014 2014 2014 ...
$ reportedmonth: int 1 2 3 4 5 6 7 8 9 10 ...
$ N : int 1188 1162 1228 1232 1502 1556 1377 1469 1490 1409 ...
- attr(*, ".internal.selfref")=<externalptr>
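ts() assumes the counts are consecutive months. The 60 rows above match the 5 x 12 months expected for 2014 to 2018, so no month is missing for assaults. For a rarer category, a month with zero reports would be absent from the grouped table and would shift every later value. A minimal sketch of one way to guard against that, using made-up counts:

```r
library(data.table) # data.table, CJ

# Hypothetical monthly counts with March 2014 missing
counts <- data.table(reportedyear  = 2014L,
                     reportedmonth = c(1L, 2L, 4L),
                     N             = c(10L, 12L, 9L))

# Every year-month combination that should be present
grid <- CJ(reportedyear = 2014L, reportedmonth = 1:4)

# Join the counts onto the full grid; months without reports get NA
complete <- counts[grid, on = .(reportedyear, reportedmonth)]
# Treat a missing month as zero reports
complete[is.na(N), N := 0L]

series <- ts(complete$N, start = c(2014, 1), frequency = 12)
series
```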
library(forecast) #auto.arima, forecast, autoplot
# Monthly series starting in January 2014
Assault.ts <- ts(Assault_dt$N, start = 2014, frequency = 12)
autoplot(Assault.ts)
Assault.ts.components <- decompose(Assault.ts)
autoplot(Assault.ts.components)
Assault.ts.stl <- stl(Assault.ts, s.window = "periodic")
autoplot(Assault.ts.stl)
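decompose() estimates the trend with a moving average and the seasonal figure by averaging each calendar month across years. stl() uses loess smoothing instead, and s.window = "periodic" forces the seasonal pattern to be identical in every year. The sketch below compares the two seasonal components on a synthetic monthly series; the series is hypothetical and only mimics the shape of the assault counts.

```r
set.seed(7)
months <- 1:60
# Hypothetical series: linear trend + yearly cycle + noise
y <- ts(1300 + 2 * months + 150 * sin(2 * pi * months / 12) + rnorm(60, sd = 40),
        start = 2014, frequency = 12)

classical <- decompose(y)                    # moving-average decomposition
loess_fit <- stl(y, s.window = "periodic")   # loess-based decomposition

seasonal_classical <- as.numeric(classical$seasonal)
seasonal_stl       <- as.numeric(loess_fit$time.series[, "seasonal"])

# Agreement between the two seasonal estimates
cor(seasonal_classical, seasonal_stl)
```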
Assault.ts.arima <- auto.arima(Assault.ts)
Assault.ts.arima
Series: Assault.ts
ARIMA(0,1,2)(1,1,0)[12]
Coefficients:
ma1 ma2 sar1
-1.1118 0.3465 -0.5257
s.e. 0.1713 0.1833 0.1308
sigma^2 estimated as 5311: log likelihood=-269.35
AIC=546.7 AICc=547.65 BIC=554.1
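Written out with the backshift operator B (B y_t = y_{t-1}), and assuming the sign convention of R's arima() for the coefficients, the selected ARIMA(0,1,2)(1,1,0)[12] model for the monthly assault counts y_t is:

```latex
(1 - \Phi_1 B^{12})\,(1 - B)\,(1 - B^{12})\, y_t
  = (1 + \theta_1 B + \theta_2 B^{2})\, \varepsilon_t,
\qquad
\theta_1 = -1.1118,\quad \theta_2 = 0.3465,\quad \Phi_1 = -0.5257,\quad
\operatorname{Var}(\varepsilon_t) \approx 5311
```

The factors (1 - B) and (1 - B^{12}) are the one regular and one seasonal difference, theta_1 and theta_2 are ma1 and ma2, and Phi_1 is sar1.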
Assault.ts.arima.forecast <- forecast(Assault.ts.arima, level = c(95), h = 12)
autoplot(Assault.ts.arima.forecast)
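The forecast call above asks for 12 monthly point forecasts with a 95% interval. A sketch of the same idea with base R only, assuming a normal approximation for the interval. It uses the built-in AirPassengers series as a stand-in for Assault.ts and fixes the model orders by hand instead of selecting them with auto.arima.

```r
# Built-in monthly series used as a stand-in (not the crime data)
y <- AirPassengers

# Same orders as the model selected above, fitted with stats::arima
fit <- arima(y, order = c(0, 1, 2),
             seasonal = list(order = c(1, 1, 0), period = 12))

# Point forecasts and standard errors for the next 12 months
pred <- predict(fit, n.ahead = 12)

# 95% interval under the normal approximation (assumption)
z <- qnorm(0.975)
lower <- pred$pred - z * pred$se
upper <- pred$pred + z * pred$se

cbind(forecast = pred$pred, lower = lower, upper = upper)
```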
# Count Theft Over reports per reported year and month, then sort chronologically
Theft_Over_dt <- MCI_dt_dates[MCI=="Theft Over",.N, by = .(reportedyear,reportedmonth)]
Theft_Over_dt <- Theft_Over_dt[order(reportedyear,reportedmonth)]
Theft_Over.ts <- ts(Theft_Over_dt$N, start = 2014, frequency = 12)
Theft_Over.ts.components <- decompose(Theft_Over.ts)
Theft_Over.ts.stl <- stl(Theft_Over.ts, s.window = "periodic")
autoplot(Theft_Over.ts.components)
autoplot(Theft_Over.ts.stl)
Theft_Over.ts.arima <- auto.arima(Theft_Over.ts)
Theft_Over.ts.arima
Series: Theft_Over.ts
ARIMA(1,1,1)
Coefficients:
ar1 ma1
0.3119 -0.8859
s.e. 0.1429 0.0635
sigma^2 estimated as 176.4: log likelihood=-235.79
AIC=477.58 AICc=478.02 BIC=483.81
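For Theft Over, auto.arima selected a non-seasonal ARIMA(1,1,1). With the backshift operator B, and again assuming the sign convention of R's arima(), the model for the monthly counts y_t is:

```latex
(1 - \phi_1 B)\,(1 - B)\, y_t = (1 + \theta_1 B)\, \varepsilon_t,
\qquad
\phi_1 = 0.3119,\quad \theta_1 = -0.8859,\quad
\operatorname{Var}(\varepsilon_t) \approx 176.4
```

Unlike the assault model, it has no seasonal difference and no seasonal AR or MA term.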