amfAR Data: Distance to Treatment Facility amfAR is a non-profit which publishes data related to the HIV epidemic. The data set we utilized contains the average distance from each county to an opioid related treatment facility.
New York State Health Data: Opioid related Treatments and Deaths This dataset details opioid related treatments, including both inpatient and outpatient treatment, for opioid diagnoses from 2010 to 2015. The data is then categorized by county, year, whether the county is urban or rural, and the insurance of the patient. An additional data set in the New York health data which included the number of opioid related deaths from 2003 to 2017 was also utilized.
ARCOS Data: Pills Prescribed by County The ARCOS dataset is published by the Washington Post as part of a database which tracks opioid prescriptions from manufacturer to patient between 2006 and 2012. We utilized the data related to the number of pills prescribed to patients in each U.S. county, filtering for the counties in New York state.
Government Data: Medication Take-Back Facilities Finally, we used government data which provided the location of medication take-back facilities. This data was limited to counties in New York state, and the distance was mapped against opioid related deaths and treatments. The medication take-back facility data was collected in 2017.
In our analysis, we created a series of graphs which detail how the opioid crisis is affecting New Yorkers in different areas of the state over time. We broke down the death count by age, race, gender, and county.
Below, we see that there are significantly more men than women with opioid related deaths. In recent years that differential has only increased.
# read in data #
opioid_death_data = janitor::clean_names(read_csv('./data/opioid_related_deaths_2.csv'))
opioid_treatment_distance = janitor::clean_names(read_csv('./data/distance_to_treatment.csv'))
opioid_demographics = janitor::clean_names(read_csv("./data/opioid_demographics.csv"))
opioid_er_data = read_csv('./data/opioid_related_visits.csv') %>%
janitor::clean_names()
prod_county = arcos::summarized_county_annual(state = "NY", key = "WaPo") %>%
janitor::clean_names()
county_pop = arcos::county_population(state = "NY", key = "WaPo") %>%
janitor::clean_names()
# clean opioid death data #
opioid_death_data = opioid_death_data %>%
filter(year >= 2010) %>%
mutate(county = recode(county, "Kings (Brooklyn)" = "Kings",
"New York (Manhattan)" = "New York",
"St Lawrence" = "St. Lawrence"))
# clean opioid er data #
opioid_er_data = opioid_er_data %>%
select(year, patient_county_name, rural_urban, payer, er_opioid, inpatient_total_opioid, er_inpatient_total_opioid, outpatient_opioid, overall_opioid) %>%
rename(county = patient_county_name)
# Combine Data Sets #
opioid_total_data = left_join(opioid_er_data, opioid_death_data, by = c('county', 'year')) %>%
arrange(county, year)
opioid_demographics %>%
filter(region == "NYC") %>%
group_by(year, sex) %>%
summarize(
count = sum(opioid_poisoning_deaths)
) %>%
plot_ly(
x = ~year,
y = ~count,
type = "bar",
color = ~sex,
alpha = 0.5) %>%
layout(
title = "Opioid Related Deaths by Year and Gender",
xaxis = list(title = "Year"),
yaxis = list(title = "Count")
)
Non-Hispanic whites are more likely to die of opioid poisoning compared to people of other races/ethnicities, on average. Though there has been an increase in opioid related deaths for all ethnicities, white non-Hispanic people continue to be at higher risk for opioid poisoning.
deaths_age_grp = read_csv("./data/deaths_age_group.csv") %>%
janitor::clean_names() %>%
pivot_wider(
names_from = age_group,
values_from = opioid_poisoning_deaths
) %>%
janitor::clean_names()
deaths_by_year = deaths_age_grp %>%
select(year, region, race_or_ethnicity, sex, total) %>%
group_by(year, race_or_ethnicity) %>%
summarize(
count = sum(total)
) %>%
mutate(
race_or_ethnicity = factor(race_or_ethnicity, levels = c("Not Stated", "Other Non Hispanic", "Black Non Hispanic", "Hispanic", "White Non Hispanic"))
)
deaths_by_year %>%
plot_ly(x = ~year, y = ~count, type = "bar",
color = ~race_or_ethnicity, alpha = 0.5) %>%
layout(
title = "Opioid Related Deaths by Year and Race",
xaxis = list(title = "Year"),
yaxis = list(title = "Count")
)
We see an exponential increase of opioid poisoning deaths between the ages of 20 and 64 from 2004 and 2016. All other ages remained relatively stable at low levels of opioid related deaths between those years.
deaths_by_age = read_csv("./data/deaths_age_group.csv") %>%
janitor::clean_names() %>%
select(-region, -race_or_ethnicity, -sex) %>%
filter(age_group != "Total") %>%
group_by(year, age_group) %>%
summarize(
count = sum(opioid_poisoning_deaths)
) %>%
ungroup()
deaths_by_age %>%
plot_ly(x = ~year, y = ~count, type = "scatter", mode = "lines+markers",
color = ~factor(age_group), alpha = 0.5) %>%
layout(
title = "Opioid Related Deaths by Year and Age Group",
xaxis = list(title = "Year"),
yaxis = list(title = "Count")
)
Below is a graph of the opioid related death rate by county. We see that Schuyler county has the lowest death rate, while Sullivan county has the highest.
total_death = opioid_death_data %>%
group_by(county) %>%
summarize(
death_sum = sum(opioid_poisoning_deaths)
) %>%
mutate(
county = toupper(county)
)
total_pop = county_pop %>%
select (buyer_county, year, population) %>%
rename(county = buyer_county) %>%
group_by(county) %>%
summarize(
total_pop = sum(population)
)
total_death_pop = left_join(total_death, total_pop) %>%
mutate(
county = factor(county),
death_pop = death_sum/total_pop* 1000000,
county = fct_reorder(county, death_pop)
)
total_death_pop %>%
plot_ly(
x = ~county, y = ~death_pop, split = ~county,
type = 'bar', alpha = 0.5
) %>%
layout(
title = "Opioid Related Deaths by County",
xaxis = list(
title = "County"),
yaxis = list(
title = "Death per 1 Million People"
)
)
This project was done in collaboration with other students.