
Source and descriptions of datasets

Ontario Community Health Profiles Partnership (OCHPP)
http://www.ontariohealthprofiles.ca/

Data — LHIN 7 (Toronto Central and City of Toronto) Neighbourhoods, Ontario Sub-Regions and LHINs

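We use three OCHPP datasets: Primary Care: Enrolment and Continuity of Care (2011/13), Adult Health and Disease (2016/17), and Sexual Health: Chlamydia cases, Gonorrhea cases (2013/16).

We quote from the description of the data: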

The Registered Persons Database (RPDB) as the source of population
http://www.ontariohealthprofiles.ca/o_documents/aboutTheDataON/RPDB_vs_Census_Oct_26_2018.pdf

Two main population sources

There are two main sources that provide the most reliable estimates of Ontario’s population. For purposes of OCHPP reporting, we use the Registered Persons Database (RPDB) as the source (denominator) for all of our health-related indicators. While another major source for identifying the population (denominator) is Statistics Canada’s Census counts (Census), the OCHPP has determined that the RPDB provides a more consistent measure for the calculation of rates for health indicators and conditions. We explain why and provide examples to illustrate.

RPDB

The RPDB (database) provides basic demographic information about anyone who has ever received an Ontario Health Insurance Plan (OHIP) card. OHIP cards include a unique Health Card Number (HCN) to identify a person’s age, sex and address, including postal code. The postal codes used at ICES come mainly from HCNs. Information from the health card is stored in the RPDB. Health cards are usually renewed every 5 years, a process that helps to ensure that information in the system is periodically refreshed, such as address, for example, should a person move and update the card within that time frame. The system may also be updated more frequently if an individual interacts with the healthcare system between renewal periods. This allows for a more current source of location-based data for health reporting.

Census

Another source of data on where people live can come from Statistics Canada (Stats Can). Stats Can collects data on individuals living in Canada through the Census, a survey conducted every 5 years. Stats Can reports who lives in a given area at one point in time – i.e. based on the information an individual provides when completing the Census.

RPDB vs. Census

While at the provincial level, the number of people we identify in the RPDB as living in Ontario does not differ much from Census estimates, differences are more pronounced at smaller areas such as neighbourhoods or local areas. This particular issue is most evident in areas of high migration where we have observed large differences in rates using RPDB vs. the Census in part due to population mobility.
For example, in some areas of Ontario, particularly in larger urban centres, newcomers to the area often settle first “downtown” but over time, may move from downtown to outlying areas. The majority of people who move do not change or update their health card until renewal time so they stay in the RPDB with their original health card information including the address and postal code of their “downtown” address.
The same is true of Census population counts: when someone moves, they are still considered living at their original address and comprise the population of that area until the next Census is taken (every 5 years).
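To make the denominator issue concrete, the following sketch computes the same rate with two different population counts. All numbers are hypothetical and are not taken from OCHPP data; `rate_per_100` is a helper introduced only for this illustration.

```r
# Hypothetical counts for one neighbourhood (not OCHPP data)
cases      <- 120     # numerator, e.g. people with a condition
pop_rpdb   <- 10000   # assumed RPDB population
pop_census <- 8000    # assumed Census population

# Rate per 100 residents
rate_per_100 <- function(numerator, denominator) 100 * numerator / denominator

rate_rpdb   <- rate_per_100(cases, pop_rpdb)
rate_census <- rate_per_100(cases, pop_census)

# Relative difference between the two rates
relative_difference <- (rate_census - rate_rpdb) / rate_rpdb

print(c(rpdb = rate_rpdb, census = rate_census, rel_diff = relative_difference))
```

With these assumed counts the numerator is identical, yet the rate changes by a quarter purely because of the denominator.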

Shapefile for City of Toronto’s 140 neighbourhoods

We now import and process a shapefile for Toronto’s 140 neighbourhoods.

https://www.toronto.ca/city-government/data-research-maps/open-data/open-data-catalogue/#a45bd45a-ede8-730e-1abc-93105b2c439f

http://opendata.toronto.ca/gcc/neighbourhoods_planning_areas_wgs84.zip

Neighbourhoods

Owner: Social Development, Finance & Administration

Currency: June 2014

Neighbourhoods (WGS84)

library(data.table) #fread
library(plyr) #join
library(ggplot2) #ggplot, fortify
library(sp) #used by rgdal
library(rgdal) #readOGR
library(rgeos) #gCentroid
library(scales) #scale_fill_distiller, percent_format
library(ggmap) #theme_nothing
library(Hmisc) #rcorr

Input shapefile

nbds.sh <- readOGR("C:/Users/14165/Desktop/ArcGIS/SHAPEFILES/neighbourhoods_planning_areas_wgs84", "NEIGHBORHOODS_WGS84")
OGR data source with driver: ESRI Shapefile 
Source: "C:\Users\14165\Desktop\ArcGIS\SHAPEFILES\neighbourhoods_planning_areas_wgs84", layer: "NEIGHBORHOODS_WGS84"
with 140 features
It has 2 fields

Add “id” column

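# Convert via character so that factor level codes are never mistaken for the area codes
nbds.sh@data$id <- as.integer(as.character(nbds.sh@data$AREA_S_CD))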
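When a code column such as AREA_S_CD is converted to integer, it matters whether the column was read as a factor or as character. The sketch below uses made-up codes (not the Toronto data) to show the difference; whether AREA_S_CD is a factor depends on the readOGR settings, which is an assumption here.

```r
# Made-up codes with unequal gaps, stored as a factor
codes <- factor(c("010", "002", "100"))

level_codes  <- as.integer(codes)                # positions in the sorted levels
numeric_ids  <- as.integer(as.character(codes))  # the numbers the labels spell out

print(level_codes)
print(numeric_ids)

# Zero-padded, gap-free codes are a special case where both conversions agree
padded <- factor(c("003", "001", "002"))
print(identical(as.integer(padded), as.integer(as.character(padded))))
```

For zero-padded codes that run from 001 to 140 without gaps, both conversions give the same result, but going through `as.character()` does not rely on that coincidence.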

Make centroids of each neighbourhood, for placing labels when plotting

nbds.sh.centroids  <- as.data.frame(gCentroid(nbds.sh, byid = TRUE))

Add “id” column to the centroids

nbds.sh.centroids$id <- nbds.sh@data$id
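For readers unfamiliar with centroids, the following sketch computes the area-weighted centroid of a single planar polygon with the standard shoelace-based formula. It is an illustration only and makes no claim about how gCentroid is implemented; `polygon_centroid` is a hypothetical helper, and the square is toy data.

```r
# Area-weighted centroid of a simple polygon given its vertices in order
polygon_centroid <- function(x, y) {
  x_next <- c(x[-1], x[1])          # next vertex, wrapping around to the first
  y_next <- c(y[-1], y[1])
  cross  <- x * y_next - x_next * y # shoelace cross products
  area   <- sum(cross) / 2          # signed area
  c(x = sum((x + x_next) * cross) / (6 * area),
    y = sum((y + y_next) * cross) / (6 * area))
}

# Toy polygon: a 2 x 2 square with its lower-left corner at the origin
square_centroid <- polygon_centroid(x = c(0, 2, 2, 0), y = c(0, 0, 2, 2))
print(square_centroid)
```

The centroid of the toy square is its midpoint, which is where a label would be placed.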

Shapefile processing

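# Convert the polygons to a long data frame of vertices, one row per point
nbds.sh.points <- fortify(nbds.sh, region = "id")

# Attach the neighbourhood attributes to every vertex
nbds.sh.df <- join(nbds.sh.points, nbds.sh@data, by = "id")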

We are now done with shapefile processing.

We copied selected columns from sheets of the .xls/.xlsx files to new sheets, renamed the columns by replacing spaces and hyphens with underscores and spelling out symbols (for example, “+” became “plus”), and exported the new sheets to .csv files.
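The renaming was done by hand in the spreadsheet. As a sketch of how the same convention could be applied programmatically, the hypothetical helper `clean_names` below maps headers to the style of the column names used in this document. The input headers are assumed examples of what the original sheets might contain.

```r
# Hypothetical helper: spell out "+" and replace spaces/hyphens with underscores
clean_names <- function(x) {
  x <- gsub("\\+", "_plus", x)  # "3+" becomes "3_plus"
  x <- gsub("[ -]+", "_", x)    # runs of spaces or hyphens become one underscore
  x
}

# Assumed raw headers, for illustration only
raw_headers <- c("Population 19+",
                 "Non-Enrolled Population",
                 "Enrolled Population with 3+ Visits")

cleaned <- clean_names(raw_headers)
print(cleaned)
```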

We shall visualize selected data from these tables on a map. Then we compute correlations.

Primary Care: Enrolment and Continuity of Care 2011/13

1_pc_Continuity_NonRostered_Rostered_Patients_neighb_2013_LHIN_7.xlsx!Continuity_Enrolled_NonEnrolled:

Primary Care: Enrolment and Continuity of Care (Both sexes, Ages 19+) for Toronto Neighbourhoods and Toronto Central LHIN, 2011/12 to 2012/13 (April 1, 2011-March 31, 2013)

Data sources: Population - Ontario Health Insurance Plan (OHIP) physician claims, Client Agency Provider Enrolment (CAPE) tables, Registered Persons’ Database, Community Health Centre client encounter data.

Population Enrolled: rostered with a Patient Enrolment Model (PEM) or registered with a Community Health Centre (CHC).

Population Non-Enrolled: not rostered with a Patient Enrolment Model (PEM) or registered with a Community Health Centre (CHC).

“id”, “Population_19_plus”, “Enrolled_Population”, “Enrolled_Population_with_No_Visits”, “Enrolled_Population_with_1_or_2_Visits”, “Enrolled_Population_with_3_plus_Visits”, “Non_Enrolled_Population”, “Non_Enrolled_Population_with_No_Visits”, “Non_Enrolled_Population_with_1_or_2_Visits”, “Non_Enrolled_Population_with_3_plus_Visits”, “Total_Population_Enrolled_and_Non_Enrolled”, “Total_Population_with_No_Visits”, “Total_Population_with_1_or_2_Visits”, “Total_Population_with_3_plus_Visits”

1_pc_Continuity_NonRostered_Rostered_Patients_neighb_2013_LHIN_7.csv

patients <- fread("1_pc_Continuity_NonRostered_Rostered_Patients_neighb_2013_LHIN_7.csv")
str(patients)
Classes ‘data.table’ and 'data.frame':  140 obs. of  14 variables:
 $ id                                        : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Population_19_plus                        : int  27552 25140 8548 8497 7834 17733 16996 9076 12298 8922 ...
 $ Enrolled_Population                       : int  20095 17636 5982 6593 5748 13110 13037 6728 8856 6695 ...
 $ Enrolled_Population_with_No_Visits        : int  1030 775 285 356 290 640 655 364 530 405 ...
 $ Enrolled_Population_with_1_or_2_Visits    : int  2145 1723 697 855 735 1492 1632 864 1346 1047 ...
 $ Enrolled_Population_with_3_plus_Visits    : int  16920 15138 5000 5382 4723 10978 10750 5500 6980 5243 ...
 $ Non_Enrolled_Population                   : int  7457 7504 2566 1904 2086 4623 3959 2348 3442 2227 ...
 $ Non_Enrolled_Population_with_No_Visits    : int  1258 1232 411 405 365 900 771 430 652 434 ...
 $ Non_Enrolled_Population_with_1_or_2_Visits: int  1509 1483 485 411 411 1005 862 515 772 552 ...
 $ Non_Enrolled_Population_with_3_plus_Visits: int  4690 4789 1670 1088 1310 2718 2326 1403 2018 1241 ...
 $ Total_Population_Enrolled_and_Non_Enrolled: int  27552 25140 8548 8497 7834 17733 16996 9076 12298 8922 ...
 $ Total_Population_with_No_Visits           : int  2288 2007 696 761 655 1540 1426 794 1182 839 ...
 $ Total_Population_with_1_or_2_Visits       : int  3654 3206 1182 1266 1146 2497 2494 1379 2118 1599 ...
 $ Total_Population_with_3_plus_Visits       : int  21610 19927 6670 6470 6033 13696 13076 6903 8998 6484 ...
 - attr(*, ".internal.selfref")=<externalptr> 

Create ratio columns for selected variables

patients$Ratio_Non_Enrolled_Population <- patients$Non_Enrolled_Population/patients$Population_19_plus

patients$Ratio_Total_Population_with_No_Visits <- patients$Total_Population_with_No_Visits/patients$Population_19_plus

patients$Ratio_Total_Population_with_3_plus_Visits <- patients$Total_Population_with_3_plus_Visits/patients$Population_19_plus

patients_ratios <- patients[,.(id,Ratio_Non_Enrolled_Population,Ratio_Total_Population_with_No_Visits,Ratio_Total_Population_with_3_plus_Visits)]
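Before mapping ratios it can help to confirm that the counts are internally consistent. The sketch below copies the first three rows from the str() output above into a small data frame (`patients_head` is a name introduced here) and checks that the enrolment and visit categories each add up to the 19+ population.

```r
# First three neighbourhoods, values copied from the str() output above
patients_head <- data.frame(
  id                                  = 1:3,
  Population_19_plus                  = c(27552, 25140, 8548),
  Enrolled_Population                 = c(20095, 17636, 5982),
  Non_Enrolled_Population             = c(7457, 7504, 2566),
  Total_Population_with_No_Visits     = c(2288, 2007, 696),
  Total_Population_with_1_or_2_Visits = c(3654, 3206, 1182),
  Total_Population_with_3_plus_Visits = c(21610, 19927, 6670)
)

# Enrolled plus non-enrolled should equal the 19+ population
enrolment_ok <- with(patients_head,
                     Enrolled_Population + Non_Enrolled_Population == Population_19_plus)

# The three visit categories should also partition the 19+ population
visit_cols <- c("Total_Population_with_No_Visits",
                "Total_Population_with_1_or_2_Visits",
                "Total_Population_with_3_plus_Visits")
visits_ok <- rowSums(patients_head[visit_cols]) == patients_head$Population_19_plus

# Ratio of non-enrolled population, rounded for comparison with later output
non_enrolled_ratio <- round(with(patients_head,
                                 Non_Enrolled_Population / Population_19_plus), 3)

print(enrolment_ok)
print(visits_ok)
print(non_enrolled_ratio)
```

Because the visit categories partition the population, the no-visits and 3+ visits ratios are constrained to move against each other, which is worth keeping in mind when reading their correlation.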

Ratio_Non_Enrolled_Population

Ratio_Non_Enrolled_Population <- patients_ratios[, .(id,Ratio_Non_Enrolled_Population)]

Ratio_Non_Enrolled_Population.sh <- merge(nbds.sh.df, Ratio_Non_Enrolled_Population, by = "id")

Make graphics object

p.Ratio_Non_Enrolled_Population <- ggplot() +
  geom_polygon(data = Ratio_Non_Enrolled_Population.sh, 
               aes(x = long, y = lat, group = group, fill = Ratio_Non_Enrolled_Population), 
               color = "black", size = 0.2) + 
  coord_map() + 
  scale_fill_distiller(name="Ratio", labels=percent_format(accuracy=1), palette = "PuRd", trans = "reverse", breaks = pretty_breaks(n = 8))+
  theme_nothing(legend = TRUE) + 
  labs(title="Ratio of total population non-enrolled") +
  geom_text(aes(x=x,y=y, group=NULL, label=id), data = nbds.sh.centroids, size = 2)

Plot graphics object

p.Ratio_Non_Enrolled_Population + guides(fill = guide_legend(reverse=TRUE))

Ratio_Total_Population_with_No_Visits

Ratio_Total_Population_with_No_Visits <- patients_ratios[, .(id,Ratio_Total_Population_with_No_Visits)]

Ratio_Total_Population_with_No_Visits.sh <- merge(nbds.sh.df, Ratio_Total_Population_with_No_Visits, by = "id")

Make graphics object

p.Ratio_Total_Population_with_No_Visits <- ggplot() +
  geom_polygon(data = Ratio_Total_Population_with_No_Visits.sh, 
               aes(x = long, y = lat, group = group, fill = Ratio_Total_Population_with_No_Visits), 
               color = "black", size = 0.2) + 
  coord_map() + 
  scale_fill_distiller(name="Ratio", labels=percent_format(accuracy=1), palette = "PuRd", trans = "reverse", breaks = pretty_breaks(n = 8)) + 
  theme_nothing(legend = TRUE) + 
  labs(title="Ratio of total population with no visits") +
  geom_text(aes(x=x,y=y, group=NULL, label=id), data = nbds.sh.centroids, size = 2)

Plot graphics object

p.Ratio_Total_Population_with_No_Visits + guides(fill = guide_legend(reverse=TRUE))

Ratio_Total_Population_with_3_plus_Visits

Ratio_Total_Population_with_3_plus_Visits <- patients_ratios[, .(id,Ratio_Total_Population_with_3_plus_Visits)]

Ratio_Total_Population_with_3_plus_Visits.sh <- merge(nbds.sh.df, Ratio_Total_Population_with_3_plus_Visits, by = "id")

Make graphics object

p.Ratio_Total_Population_with_3_plus_Visits <- ggplot() +
  geom_polygon(data = Ratio_Total_Population_with_3_plus_Visits.sh, 
               aes(x = long, y = lat, group = group, fill = Ratio_Total_Population_with_3_plus_Visits), 
               color = "black", size = 0.2) + 
  coord_map() + 
  scale_fill_distiller(name="Ratio", labels=percent_format(accuracy=1), palette = "Greens", trans = "reverse", breaks = pretty_breaks(n = 8))+
  theme_nothing(legend = TRUE) + 
  labs(title="Ratio of total population with 3+ visits") +
  geom_text(aes(x=x,y=y, group=NULL, label=id), data = nbds.sh.centroids, size = 2)

Plot graphics object

p.Ratio_Total_Population_with_3_plus_Visits + guides(fill = guide_legend(reverse=TRUE))
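The three maps above repeat the same plotting code with a different column, palette and title. One possible way to reduce the repetition is a small wrapper, sketched below under stated assumptions: `plot_ratio_map` is a hypothetical helper, `coord_quickmap()` and `theme_void()` are swapped in for `coord_map()` and `theme_nothing()` so that the sketch depends only on ggplot2 and scales, the border width is left at its default, and the two squares are toy data standing in for neighbourhood polygons.

```r
library(ggplot2)
library(scales)

# Hypothetical wrapper around the choropleth code used in this document
plot_ratio_map <- function(map_df, fill_col, title, centroids,
                           palette = "PuRd", legend_name = "Ratio") {
  ggplot() +
    geom_polygon(data = map_df,
                 aes(x = long, y = lat, group = group, fill = .data[[fill_col]]),
                 color = "black") +
    coord_quickmap() +
    scale_fill_distiller(name = legend_name,
                         labels = percent_format(accuracy = 1),
                         palette = palette, trans = "reverse",
                         breaks = pretty_breaks(n = 8)) +
    theme_void() +
    labs(title = title) +
    geom_text(aes(x = x, y = y, label = id), data = centroids, size = 2) +
    guides(fill = guide_legend(reverse = TRUE))
}

# Toy input: two unit squares in the long format produced by fortify()
toy_map <- data.frame(
  long  = c(0, 1, 1, 0, 1, 2, 2, 1),
  lat   = c(0, 0, 1, 1, 0, 0, 1, 1),
  group = rep(c("1.1", "2.1"), each = 4),
  id    = rep(1:2, each = 4),
  ratio = rep(c(0.2, 0.3), each = 4)
)
toy_centroids <- data.frame(x = c(0.5, 1.5), y = c(0.5, 0.5), id = 1:2)

p_toy <- plot_ratio_map(toy_map, "ratio", "Toy ratio", toy_centroids)
```

With the real data, a call might look like `plot_ratio_map(Ratio_Non_Enrolled_Population.sh, "Ratio_Non_Enrolled_Population", "Ratio of total population non-enrolled", nbds.sh.centroids)`.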

Adult Health and Disease 2016/17

1_AHD_2017_RPDB_Neighb_LHIN_7.xlsx!Diabetes_Neighb_Toronto:

Demographics - Denominator: Ontario Ministry of Health and Long-Term Care Registered Persons Database (RPDB), population aged 20+ who were alive and living in Ontario on April 1st, 2016

Numerator: derived from validated disease registries maintained by the Institute for Clinical Evaluative Sciences (ICES)

1_AHD_2017_RPDB_Neighb_LHIN_7.xlsx!COPD_Neighb_Toronto:

Demographics - Denominator: Ontario Ministry of Health and Long-Term Care Registered Persons Database (RPDB), population aged 35+ who were alive and living in Ontario on April 1st, 2016

Numerator: derived from validated disease registries maintained by the Institute for Clinical Evaluative Sciences (ICES)

“id”, “People_with_Diabetes_20_plus”, “Total_Population_2016_RPDB_20_plus”, “Prevalence_per_hundred_Diabetes_20_plus”, “People_with_COPD_35_plus”, “Total_Population_2016_RPDB_35_plus”, “Prevalence_per_hundred_COPD_35_plus”

1_AHD_2017_RPDB_Neighb_LHIN_7.csv

chronic <- fread("1_AHD_2017_RPDB_Neighb_LHIN_7.csv")

chronic$Ratio_People_with_Diabetes_20_plus <- chronic$People_with_Diabetes_20_plus/chronic$Total_Population_2016_RPDB_20_plus

chronic$Ratio_People_with_COPD_35_plus <- chronic$People_with_COPD_35_plus/chronic$Total_Population_2016_RPDB_35_plus

chronic_ratios <- chronic[,.(id,Ratio_People_with_Diabetes_20_plus,Ratio_People_with_COPD_35_plus)]
Ratio_People_with_Diabetes_20_plus <- chronic_ratios[, .(id,Ratio_People_with_Diabetes_20_plus)]

Ratio_People_with_Diabetes_20_plus.sh <- merge(nbds.sh.df, Ratio_People_with_Diabetes_20_plus, by = "id")

Make and plot graphics object

p.Ratio_People_with_Diabetes_20_plus <- ggplot() +
  geom_polygon(data = Ratio_People_with_Diabetes_20_plus.sh, 
               aes(x = long, y = lat, group = group, fill = Ratio_People_with_Diabetes_20_plus), 
               color = "black", size = 0.2) + 
  coord_map() + 
  scale_fill_distiller(name="Ratio", labels=percent_format(accuracy=1), palette = "PuRd", trans = "reverse",
                       breaks = pretty_breaks(n = 8)) + 
  theme_nothing(legend = TRUE)+
  labs(title="Ratio of people with diabetes, 20+") +
  geom_text(aes(x=x,y=y, group=NULL, label=id), data = nbds.sh.centroids, size = 2)

p.Ratio_People_with_Diabetes_20_plus + guides(fill = guide_legend(reverse=TRUE))

Ratio_People_with_COPD_35_plus <- chronic_ratios[, .(id,Ratio_People_with_COPD_35_plus)]

Ratio_People_with_COPD_35_plus.sh <- merge(nbds.sh.df, Ratio_People_with_COPD_35_plus, by = "id")

Make and plot graphics object

p.Ratio_People_with_COPD_35_plus <- ggplot() +
  geom_polygon(data = Ratio_People_with_COPD_35_plus.sh, 
               aes(x = long, y = lat, group = group, fill = Ratio_People_with_COPD_35_plus), 
               color = "black", size = 0.2) + 
  coord_map() + 
  scale_fill_distiller(name="Ratio", labels=percent_format(accuracy=1), palette = "PuRd", trans = "reverse",
                       breaks = pretty_breaks(n = 8))+
  theme_nothing(legend = TRUE) + 
  labs(title="Ratio of people with COPD, 35+") +
  geom_text(aes(x=x,y=y, group=NULL, label=id), data = nbds.sh.centroids, size = 2)

p.Ratio_People_with_COPD_35_plus + guides(fill = guide_legend(reverse=TRUE))

Sexual Health: Chlamydia cases, Gonorrhea cases 2013/16

2_sh_neighb_chlam_gonor_2013-2016_LHIN_7.xlsx!Chlamydia_Male_Female_Tor_neigh:

Number of Chlamydia Cases for Age 15 years and over by Gender (Male, Female), Toronto Neighbourhoods, 2013 to 2016 Calendar Years Combined

Demographics - Denominator: Based on 2011 Census population estimates. (Statistics Canada, 2011 Census of Population).

Numerator: Number of Chlamydia cases for 4 year (2013 to 2016) observation period. Data source: Data as of July 20, 2017, Toronto Public Health integrated Public Health Information System [iPHIS].

  • Average annual rate of Chlamydia cases (/100,000) Males (2013 to 2016), All Ages 15+

  • Average annual rate of Chlamydia cases (/100,000) Females (2013 to 2016), All Ages 15+

“id”, “Chlamydia_cases_male”, “Population_male”, “Chlamydia_cases_female”, “Population_female”

2_sh_neighb_chlam_gonor_2013-2016_LHIN_7.csv

chlamydia <- fread("2_sh_neighb_chlam_gonor_2013-2016_LHIN_7.csv")

chlamydia$Ratio_Chlamydia_cases_male <- chlamydia$Chlamydia_cases_male/chlamydia$Population_male

chlamydia$Ratio_Chlamydia_cases_female <- chlamydia$Chlamydia_cases_female/chlamydia$Population_female

chlamydia_ratios <- chlamydia[,.(id,Ratio_Chlamydia_cases_male,Ratio_Chlamydia_cases_female)]
Ratio_Chlamydia_cases_male <- chlamydia_ratios[, .(id,Ratio_Chlamydia_cases_male)]

Ratio_Chlamydia_cases_male.sh <- merge(nbds.sh.df, Ratio_Chlamydia_cases_male, by = "id")

Make and plot graphics object

p.Ratio_Chlamydia_cases_male <- ggplot() +
  geom_polygon(data = Ratio_Chlamydia_cases_male.sh, 
               aes(x = long, y = lat, group = group, fill = Ratio_Chlamydia_cases_male), 
               color = "black", size = 0.2) + 
  coord_map() + 
  scale_fill_distiller(name="Cases/population", labels=percent_format(accuracy=1), palette = "YlOrBr", trans = "reverse", breaks = pretty_breaks(n = 8)) + 
  theme_nothing(legend = TRUE) + 
  labs(title="Number of male chlamydia cases/male population") +
  geom_text(aes(x=x,y=y, group=NULL, label=id), data = nbds.sh.centroids, size = 2)

p.Ratio_Chlamydia_cases_male + guides(fill = guide_legend(reverse=TRUE))
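Note that the chlamydia ratios divide the cases of all four years (2013 to 2016) by a single population count, so they are cumulative rather than annual. The sketch below converts such a ratio to an average annual rate per 100,000, the unit used in the data description. The three input values are the first male ratios as printed in the str() output of the merged table in the Correlations section; since they are rounded, the results are approximate.

```r
# First three male ratios as printed (rounded) in the str() output
cumulative_ratio <- c(0.0199, 0.0199, 0.0125)

years_observed <- 4  # 2013 to 2016 calendar years combined

# Average annual rate per 100,000 population
annual_rate_per_100k <- cumulative_ratio / years_observed * 1e5

print(annual_rate_per_100k)
```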

Correlations

# Join the three ratio tables on the neighbourhood id
OCHPP_ratios <- merge(merge(patients_ratios, chronic_ratios, by = "id"), chlamydia_ratios, by = "id")
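By default, merge() performs an inner join, so a neighbourhood missing from any one table would silently drop out of the result. The sketch below shows this on toy data frames (all names and values are made up) and how `all = TRUE` keeps unmatched rows; with the real tables, checking that the merged result still has 140 rows serves the same purpose.

```r
# Toy tables; the third one lacks id 2
ratios_a <- data.frame(id = 1:3, ratio_a = c(0.10, 0.20, 0.30))
ratios_b <- data.frame(id = 1:3, ratio_b = c(0.40, 0.50, 0.60))
ratios_c <- data.frame(id = c(1L, 3L), ratio_c = c(0.70, 0.90))

tables <- list(ratios_a, ratios_b, ratios_c)

# Inner join across all tables: id 2 is dropped
merged_inner <- Reduce(function(x, y) merge(x, y, by = "id"), tables)

# Full outer join: id 2 is kept, with NA where data are missing
merged_outer <- Reduce(function(x, y) merge(x, y, by = "id", all = TRUE), tables)

print(merged_inner)
print(merged_outer)
```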

Structure of data table:

str(OCHPP_ratios)
Classes ‘data.table’ and 'data.frame':  140 obs. of  8 variables:
 $ id                                       : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Ratio_Non_Enrolled_Population            : num  0.271 0.298 0.3 0.224 0.266 ...
 $ Ratio_Total_Population_with_No_Visits    : num  0.083 0.0798 0.0814 0.0896 0.0836 ...
 $ Ratio_Total_Population_with_3_plus_Visits: num  0.784 0.793 0.78 0.761 0.77 ...
 $ Ratio_People_with_Diabetes_20_plus       : num  0.182 0.184 0.177 0.175 0.17 ...
 $ Ratio_People_with_COPD_35_plus           : num  0.0684 0.0658 0.0926 0.1047 0.097 ...
 $ Ratio_Chlamydia_cases_male               : num  0.0199 0.0199 0.0125 0.0114 0.0181 ...
 $ Ratio_Chlamydia_cases_female             : num  0.0232 0.0277 0.023 0.0263 0.0289 ...
 - attr(*, ".internal.selfref")=<externalptr> 
 - attr(*, "sorted")= chr "id"

Drop “id” column

OCHPP_ratios_noid <- OCHPP_ratios[,-c("id")]

str(OCHPP_ratios_noid)
Classes ‘data.table’ and 'data.frame':  140 obs. of  7 variables:
 $ Ratio_Non_Enrolled_Population            : num  0.271 0.298 0.3 0.224 0.266 ...
 $ Ratio_Total_Population_with_No_Visits    : num  0.083 0.0798 0.0814 0.0896 0.0836 ...
 $ Ratio_Total_Population_with_3_plus_Visits: num  0.784 0.793 0.78 0.761 0.77 ...
 $ Ratio_People_with_Diabetes_20_plus       : num  0.182 0.184 0.177 0.175 0.17 ...
 $ Ratio_People_with_COPD_35_plus           : num  0.0684 0.0658 0.0926 0.1047 0.097 ...
 $ Ratio_Chlamydia_cases_male               : num  0.0199 0.0199 0.0125 0.0114 0.0181 ...
 $ Ratio_Chlamydia_cases_female             : num  0.0232 0.0277 0.023 0.0263 0.0289 ...
 - attr(*, ".internal.selfref")=<externalptr> 

Compute correlations

rcorr(as.matrix(OCHPP_ratios_noid), type="pearson")
                                          Ratio_Non_Enrolled_Population
Ratio_Non_Enrolled_Population                                      1.00
Ratio_Total_Population_with_No_Visits                              0.45
Ratio_Total_Population_with_3_plus_Visits                         -0.32
Ratio_People_with_Diabetes_20_plus                                -0.12
Ratio_People_with_COPD_35_plus                                     0.14
Ratio_Chlamydia_cases_male                                         0.53
Ratio_Chlamydia_cases_female                                       0.51
                                          Ratio_Total_Population_with_No_Visits
Ratio_Non_Enrolled_Population                                              0.45
Ratio_Total_Population_with_No_Visits                                      1.00
Ratio_Total_Population_with_3_plus_Visits                                 -0.94
Ratio_People_with_Diabetes_20_plus                                        -0.77
Ratio_People_with_COPD_35_plus                                            -0.27
Ratio_Chlamydia_cases_male                                                 0.39
Ratio_Chlamydia_cases_female                                               0.08
                                          Ratio_Total_Population_with_3_plus_Visits
Ratio_Non_Enrolled_Population                                                 -0.32
Ratio_Total_Population_with_No_Visits                                         -0.94
Ratio_Total_Population_with_3_plus_Visits                                      1.00
Ratio_People_with_Diabetes_20_plus                                             0.91
Ratio_People_with_COPD_35_plus                                                 0.34
Ratio_Chlamydia_cases_male                                                    -0.23
Ratio_Chlamydia_cases_female                                                   0.10
                                          Ratio_People_with_Diabetes_20_plus
Ratio_Non_Enrolled_Population                                          -0.12
Ratio_Total_Population_with_No_Visits                                  -0.77
Ratio_Total_Population_with_3_plus_Visits                               0.91
Ratio_People_with_Diabetes_20_plus                                      1.00
Ratio_People_with_COPD_35_plus                                          0.38
Ratio_Chlamydia_cases_male                                             -0.05
Ratio_Chlamydia_cases_female                                            0.31
                                          Ratio_People_with_COPD_35_plus Ratio_Chlamydia_cases_male
Ratio_Non_Enrolled_Population                                       0.14                       0.53
Ratio_Total_Population_with_No_Visits                              -0.27                       0.39
Ratio_Total_Population_with_3_plus_Visits                           0.34                      -0.23
Ratio_People_with_Diabetes_20_plus                                  0.38                      -0.05
Ratio_People_with_COPD_35_plus                                      1.00                       0.13
Ratio_Chlamydia_cases_male                                          0.13                       1.00
Ratio_Chlamydia_cases_female                                        0.23                       0.75
                                          Ratio_Chlamydia_cases_female
Ratio_Non_Enrolled_Population                                     0.51
Ratio_Total_Population_with_No_Visits                             0.08
Ratio_Total_Population_with_3_plus_Visits                         0.10
Ratio_People_with_Diabetes_20_plus                                0.31
Ratio_People_with_COPD_35_plus                                    0.23
Ratio_Chlamydia_cases_male                                        0.75
Ratio_Chlamydia_cases_female                                      1.00

n= 140 


P
                                          Ratio_Non_Enrolled_Population
Ratio_Non_Enrolled_Population                                          
Ratio_Total_Population_with_No_Visits     0.0000                       
Ratio_Total_Population_with_3_plus_Visits 0.0001                       
Ratio_People_with_Diabetes_20_plus        0.1670                       
Ratio_People_with_COPD_35_plus            0.0889                       
Ratio_Chlamydia_cases_male                0.0000                       
Ratio_Chlamydia_cases_female              0.0000                       
                                          Ratio_Total_Population_with_No_Visits
Ratio_Non_Enrolled_Population             0.0000                               
Ratio_Total_Population_with_No_Visits                                          
Ratio_Total_Population_with_3_plus_Visits 0.0000                               
Ratio_People_with_Diabetes_20_plus        0.0000                               
Ratio_People_with_COPD_35_plus            0.0014                               
Ratio_Chlamydia_cases_male                0.0000                               
Ratio_Chlamydia_cases_female              0.3191                               
                                          Ratio_Total_Population_with_3_plus_Visits
Ratio_Non_Enrolled_Population             0.0001                                   
Ratio_Total_Population_with_No_Visits     0.0000                                   
Ratio_Total_Population_with_3_plus_Visits                                          
Ratio_People_with_Diabetes_20_plus        0.0000                                   
Ratio_People_with_COPD_35_plus            0.0000                                   
Ratio_Chlamydia_cases_male                0.0054                                   
Ratio_Chlamydia_cases_female              0.2544                                   
                                          Ratio_People_with_Diabetes_20_plus
Ratio_Non_Enrolled_Population             0.1670                            
Ratio_Total_Population_with_No_Visits     0.0000                            
Ratio_Total_Population_with_3_plus_Visits 0.0000                            
Ratio_People_with_Diabetes_20_plus                                          
Ratio_People_with_COPD_35_plus            0.0000                            
Ratio_Chlamydia_cases_male                0.5450                            
Ratio_Chlamydia_cases_female              0.0002                            
                                          Ratio_People_with_COPD_35_plus Ratio_Chlamydia_cases_male
Ratio_Non_Enrolled_Population             0.0889                         0.0000                    
Ratio_Total_Population_with_No_Visits     0.0014                         0.0000                    
Ratio_Total_Population_with_3_plus_Visits 0.0000                         0.0054                    
Ratio_People_with_Diabetes_20_plus        0.0000                         0.5450                    
Ratio_People_with_COPD_35_plus                                           0.1224                    
Ratio_Chlamydia_cases_male                0.1224                                                   
Ratio_Chlamydia_cases_female              0.0054                         0.0000                    
                                          Ratio_Chlamydia_cases_female
Ratio_Non_Enrolled_Population             0.0000                      
Ratio_Total_Population_with_No_Visits     0.3191                      
Ratio_Total_Population_with_3_plus_Visits 0.2544                      
Ratio_People_with_Diabetes_20_plus        0.0002                      
Ratio_People_with_COPD_35_plus            0.0054                      
Ratio_Chlamydia_cases_male                0.0000                      
Ratio_Chlamydia_cases_female                                          
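The P matrix is printed to four decimals, so an entry of 0.0000 means p < 0.00005 rather than exactly zero. For a Pearson correlation the p-value can be obtained from r and n through the t distribution; the sketch below illustrates this on simulated data (not the OCHPP ratios) and compares the result with base R's cor.test().

```r
set.seed(1)
n <- 140                      # same sample size as the neighbourhood data
x <- rnorm(n)
y <- 0.2 * x + rnorm(n)       # simulated, weakly correlated variables

r <- cor(x, y)

# t statistic for H0: correlation = 0, with n - 2 degrees of freedom
t_stat  <- r * sqrt((n - 2) / (1 - r^2))
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

p_reference <- cor.test(x, y)$p.value

print(c(r = r, t = t_stat, p = p_value, p_cor.test = p_reference))
```

With 21 distinct pairs tested, some adjustment for multiple comparisons, for example with p.adjust(), may be worth considering before interpreting borderline values.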

Linear regression

lm_chlam <- lm(Ratio_Chlamydia_cases_male ~ Ratio_Chlamydia_cases_female, data = OCHPP_ratios_noid)
summary(lm_chlam)

Call:
lm(formula = Ratio_Chlamydia_cases_male ~ Ratio_Chlamydia_cases_female, 
    data = OCHPP_ratios_noid)

Residuals:
       Min         1Q     Median         3Q        Max 
-0.0095561 -0.0027092 -0.0006196  0.0014847  0.0314658 

Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
(Intercept)                  -0.0003589  0.0012921  -0.278    0.782    
Ratio_Chlamydia_cases_female  0.8114156  0.0611285  13.274   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.005647 on 138 degrees of freedom
Multiple R-squared:  0.5608,    Adjusted R-squared:  0.5576 
F-statistic: 176.2 on 1 and 138 DF,  p-value: < 2.2e-16
plot(lm_chlam)
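In a regression with a single predictor, R-squared equals the squared Pearson correlation and the F statistic equals the squared t value of the slope. The sketch below first checks the printed values above against these identities and then confirms them on simulated data (the simulated variables are not OCHPP data).

```r
# Values copied from the summary() output above
r_squared_printed <- 0.5608
t_value_printed   <- 13.274

implied_r <- round(sqrt(r_squared_printed), 2)  # compare with 0.75 in the rcorr output
implied_f <- round(t_value_printed^2, 1)        # compare with the F-statistic of 176.2

print(c(implied_r = implied_r, implied_f = implied_f))

# The same identities on simulated data
set.seed(2)
x <- rnorm(140)
y <- 0.8 * x + rnorm(140)
fit <- lm(y ~ x)
fit_summary <- summary(fit)

r2_from_lm  <- fit_summary$r.squared
r2_from_cor <- cor(x, y)^2
f_from_lm   <- unname(fit_summary$fstatistic["value"])
f_from_t    <- fit_summary$coefficients["x", "t value"]^2
```

The regression therefore carries the same information as the correlation of 0.75 between the male and female ratios; what it adds are the residuals mapped in the next section.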

Choropleth of residuals

str(lm_chlam$residuals)
 Named num [1:140] 0.00145 -0.00222 -0.00587 -0.00956 -0.005 ...
 - attr(*, "names")= chr [1:140] "1" "2" "3" "4" ...

Make dataframe of residuals with “id” column

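# Take the ids from the modelled table instead of assuming they run from 1 to 140 in row order
lm_chlam_df <- data.frame(id = OCHPP_ratios$id, residual = lm_chlam$residuals)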
head(lm_chlam_df)
lm_chlam_df.sh <- merge(nbds.sh.df, lm_chlam_df, by = "id")

Make and plot graphics object

p.lm_chlam_df <- ggplot() +
  geom_polygon(data = lm_chlam_df.sh, 
               aes(x = long, y = lat, group = group, fill = residual), 
               color = "black", size = 0.2) + 
  coord_map() + 
  scale_fill_distiller(name="Residuals", palette = "Blues", trans = "reverse", breaks = pretty_breaks(n = 8)) + 
  theme_nothing(legend = TRUE) + 
  labs(title="Residuals of male chlamydia cases ~ female chlamydia cases") +
  geom_text(aes(x=x,y=y, group=NULL, label=id), data = nbds.sh.centroids, size = 2)

p.lm_chlam_df + guides(fill = guide_legend(reverse=TRUE))

