Tucson Police Reported Crime Data Analysis

Proposal

Author: VIZards

Affiliation: School of Information, University of Arizona

if(!require(pacman))
  install.packages("pacman")

pacman::p_load(countdown,
               tidyverse,
               scales,
               ggthemes,
               gt,
               tidytuesdayR)


Introduction and Data

Data Source

The Tucson Police Reported Crime dataset is sourced from the Tucson Police Department’s official crime analysis portal (https://policeanalysis.tucsonaz.gov/pages/reported-crimes).

Reason for Dataset Selection

  1. Community Impact: Analyzing crime data can have a direct impact on the safety of Tucson residents. By understanding patterns and hotspots over the years, this project could help inform decisions about community safety and resource allocation, giving it real-world value.
  2. Tackling Real-World Questions: With this data, we can explore questions such as, “Are certain crimes more common in specific areas or at certain times?” This analysis could lead to practical insights for crime prevention and public awareness, making the project both useful and engaging.

Data Collection

The data was originally collected by the Tucson Police Department through police reports and crime records. It includes various reported crime incidents within the city of Tucson, categorized by type of crime, time of occurrence, and geographic location. The data is continually updated as new reports are filed, ensuring that it reflects recent crime activities.

Description of Observations

The dataset comprises observations related to different types of crimes such as theft, assault, burglary, and more, with attributes including:

- Crime Type: The nature of the crime (e.g., theft, assault).

- Date and Time: The timestamp of when the incident occurred.

- Location: The police division and ward within Tucson where the incident was reported.

- Incident ID: Unique identifier for each crime report.

This dataset allows for analysis of crime trends over time and across various regions in Tucson, providing insights into patterns and frequencies of different types of criminal activities.

Ethical Concerns

This data contains sensitive information related to crime incidents and could potentially be used to stigmatize certain areas or populations.

As such, care must be taken to:

- Ensure anonymity and privacy of the individuals involved in reported incidents.

- Avoid misinterpretation or misuse of the data that could reinforce negative stereotypes about specific areas.

- Highlight the context of the data analysis to focus on insights that can aid in improving public safety rather than assigning blame.

Research Questions

1. What are the patterns of crime incidents across different types of crimes (e.g., theft, assault) over time in Tucson?

Importance

This question aims to identify trends in crime activity, such as peak times or the types of crime that occur most frequently. Understanding these trends can inform community safety initiatives.

Types of Variables
  • Crime Type : Categorical
  • Date/Time : Quantitative - Time Series
  • Frequency of Occurrences : Quantitative

2. How are different types of crimes distributed spatially across the wards and divisions of Tucson over the years?

Importance

Understanding the geographic distribution of crimes can help law enforcement and the community identify hotspots and allocate resources to areas with higher crime rates. It also provides insights into the social dynamics of different regions.

Types of Variables
  • Location (Ward/Division): Categorical
  • Crime Type : Categorical
  • Number of Incidents : Quantitative

Glimpse of Data

# Load the data
crime_data <- read.csv("data/Tucson_Police_Reported_Crimes.csv")

# Glimpse of the data
glimpse(crime_data)
Rows: 181,688
Columns: 14
$ IncidentID         <dbl> 1800330010, 807190154, 1801010138, 1801010111, 1801…
$ DateOccurred       <chr> "2018/03/30 00:00:00+00", "2018/06/15 00:00:00+00",…
$ Year               <int> 2018, 2018, 2018, 2018, 2018, 2020, 2018, 2018, 201…
$ Month              <chr> "March", "June", "January", "January", "January", "…
$ Day                <chr> "Fri", "Fri", "Mon", "Mon", "Mon", "Thu", "Mon", "M…
$ TimeOccur          <chr> "2054", "0243", "0412", "0324", "0950", "0136", "07…
$ Division           <chr> "Midtown", "South", "Midtown", "South", "South", "S…
$ Ward               <int> 6, 1, 3, 1, 5, 5, 2, 3, 5, 5, 6, 1, 5, 6, 6, 4, 5, …
$ UCR                <int> 5, 1, 3, 4, 6, 1, 6, 6, 6, 1, 6, 6, 1, 1, 3, 3, 8, …
$ UCRDescription     <chr> "05 - BURGLARY", "01 - HOMICIDE", "03 - ROBBERY", "…
$ Offense            <int> 501, 101, 304, 413, 610, 101, 607, 603, 607, 101, 6…
$ OffenseDescription <chr> "Burglary - Force", "Criminal Homicide - Murder", "…
$ CallSource         <chr> "", "", "Call For Service", "Call For Service", "Ca…
$ ESRI_OID           <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, …
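
# Count reports per crime type and find the latest occurrence date for each
crime_summary <- crime_data %>%
  group_by(UCRDescription) %>%
  summarize(
    count = n(),
    # DateOccurred is stored as text such as "2018/03/30 00:00:00+00"
    most_recent = max(as.Date(DateOccurred, format = "%Y/%m/%d %H:%M:%S"))
  ) %>%
  # Most frequently reported crime types first
  arrange(desc(count))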


crime_summary_table <- crime_summary %>%
  gt() %>%
  tab_header(
    title = "Summary of Reported Crimes in Tucson",
    subtitle = "Count and Most Recent Occurrence by Crime Type"
  ) %>%
  cols_label(
    UCRDescription = "Crime Type",
    count = "Number of Reports",
    most_recent = "Most Recent Reported Date"
  ) %>%
  fmt_number(
    columns = count,
    decimals = 0
  )

crime_summary_table

Summary of Reported Crimes in Tucson
Count and Most Recent Occurrence by Crime Type

Crime Type                  Number of Reports   Most Recent Reported Date
06 - LARCENY                125,045             2024-05-01
05 - BURGLARY               16,797              2024-05-01
07 - GTA                    14,859              2024-05-01
04 - ASSAULT, AGGRAVATED    13,923              2024-05-01
03 - ROBBERY                6,586               2024-04-30
02 - SEXUAL ASSAULT         2,892               2024-05-01
08 - ARSON                  1,182               2024-04-29
01 - HOMICIDE               404                 2024-04-28

Analysis Plan

Data Preparation and Cleaning

Objective: Ensure data quality by handling missing values, ensuring data consistency, and checking for duplicates.

Steps:

  1. Import the dataset and inspect for any missing or anomalous values in key fields like Crime Type, Date and Time, Location, and Incident ID.

  2. Correct data types, especially for time-related fields, converting timestamps to a consistent format.

  3. Remove any duplicates or records with incomplete critical fields (e.g., missing Crime Type or Location).
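
The cleaning steps above could look roughly like the following sketch. It runs on a small made-up data frame that reuses column names from the glimpse; the values are illustrative only, and `clean_crimes()` is a hypothetical helper, not part of the project code. Treating `IncidentID` as the deduplication key is an assumption that should be checked against the real data, since one incident may legitimately span several rows.

```r
library(dplyr)

# Toy rows that mimic the glimpse columns; values are illustrative, not real records.
toy <- data.frame(
  IncidentID     = c(1, 2, 2, 3, 4),
  DateOccurred   = c("2018/03/30 00:00:00+00", "2018/06/15 00:00:00+00",
                     "2018/06/15 00:00:00+00", "2018/01/01 00:00:00+00", NA),
  TimeOccur      = c("2054", "0243", "0243", "0412", "0950"),
  UCRDescription = c("05 - BURGLARY", "01 - HOMICIDE", "01 - HOMICIDE",
                     "03 - ROBBERY", ""),
  Ward           = c(6L, 1L, 1L, 3L, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fix types, drop incomplete rows, remove duplicates.
clean_crimes <- function(df) {
  df %>%
    mutate(
      # Keep only the date part; the time of day is stored in TimeOccur.
      date_occurred = as.Date(substr(DateOccurred, 1, 10), format = "%Y/%m/%d"),
      # "HHMM" text -> hour of day as an integer (0-23).
      hour_occurred = as.integer(substr(TimeOccur, 1, 2)),
      # Treat empty strings as missing values.
      UCRDescription = na_if(UCRDescription, "")
    ) %>%
    # Drop rows that lack a critical field.
    filter(!is.na(date_occurred), !is.na(UCRDescription), !is.na(Ward)) %>%
    # Assumption: IncidentID identifies a report; keep the first row per ID.
    distinct(IncidentID, .keep_all = TRUE)
}

cleaned <- clean_crimes(toy)
```

On the real data, the same function would be applied to `crime_data` once the deduplication key has been confirmed.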

Exploratory Data Analysis (EDA)

Objective: Gain an initial understanding of the distribution and patterns within the dataset.

Steps:

  1. Frequency Analysis: Calculate the frequency of each Crime Type and examine the yearly/monthly trends to identify any seasonal patterns.

  2. Temporal Analysis: Plot the occurrence of crime incidents over time to detect trends. Aggregate the data at different time intervals (e.g., daily, weekly, monthly) to observe fluctuations in crime rates.

  3. Spatial Analysis: Map the distribution of crime incidents across Tucson to identify crime hotspots. This includes visualizing crime locations on a map and examining the concentration in different wards or divisions.

  4. Incident Analysis by Time of Day: Segment crimes by time of day (e.g., morning, afternoon, night) to see if certain types of crimes are more common at specific times.
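
As a rough illustration of the frequency and time-of-day steps, the sketch below aggregates a toy data frame by month and by part of day. The rows are made up, and the cut points for "night", "morning", "afternoon", and "evening" are assumptions chosen for the example; the project may settle on different boundaries.

```r
library(dplyr)

# Toy rows with the glimpse column names; values are illustrative only.
toy <- data.frame(
  UCRDescription = c("06 - LARCENY", "06 - LARCENY", "05 - BURGLARY", "06 - LARCENY"),
  DateOccurred   = c("2018/03/30 00:00:00+00", "2018/03/02 00:00:00+00",
                     "2018/03/15 00:00:00+00", "2018/06/15 00:00:00+00"),
  TimeOccur      = c("2054", "0243", "0950", "1410"),
  stringsAsFactors = FALSE
)

prepared <- toy %>%
  mutate(
    date = as.Date(substr(DateOccurred, 1, 10), format = "%Y/%m/%d"),
    # First day of the month serves as the monthly bucket.
    month_start = as.Date(format(date, "%Y-%m-01")),
    hour = as.integer(substr(TimeOccur, 1, 2)),
    # Assumed buckets: night 00-05, morning 06-11, afternoon 12-17, evening 18-23.
    time_of_day = cut(hour, breaks = c(0, 6, 12, 18, 24), right = FALSE,
                      labels = c("night", "morning", "afternoon", "evening"))
  )

# Incidents per crime type per month (input for a trend line).
monthly_counts <- prepared %>%
  count(UCRDescription, month_start, name = "incidents")

# Incidents per crime type per part of day.
time_of_day_counts <- prepared %>%
  count(UCRDescription, time_of_day, name = "incidents")
```

Swapping `month_start` for a week or day bucket gives the other aggregation levels mentioned in the temporal analysis step.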

Spatial Distribution and Hotspot Analysis

Objective: Understand the geographic distribution of crime and pinpoint high-risk areas.

Steps:

  1. Use choropleth maps to show the density of incidents across Tucson’s wards or divisions.

  2. Analyze the spatial distribution of specific crime types across wards/divisions to see if certain crimes are more prevalent in particular areas.
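
A possible starting point for the ward-level comparison is a count table with each crime type's share within its ward, sketched below on made-up rows. Drawing an actual choropleth would additionally require ward boundary geometry, which is not among the columns shown in the glimpse; joining this table to such a layer is assumed as a later step.

```r
library(dplyr)

# Toy rows; Ward and UCRDescription follow the glimpse, values are illustrative.
toy <- data.frame(
  Ward = c(1L, 1L, 1L, 6L, 6L),
  UCRDescription = c("06 - LARCENY", "06 - LARCENY", "05 - BURGLARY",
                     "06 - LARCENY", "03 - ROBBERY"),
  stringsAsFactors = FALSE
)

ward_crime <- toy %>%
  # Number of incidents for each ward and crime type.
  count(Ward, UCRDescription, name = "incidents") %>%
  group_by(Ward) %>%
  # Share of the ward's incidents attributable to each crime type.
  mutate(share_in_ward = incidents / sum(incidents)) %>%
  ungroup()
```

Comparing shares rather than raw counts helps avoid labeling a ward as a hotspot only because it records more incidents overall.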

Data Interpretation and Visualization

Objective: Present findings in a way that highlights key insights and supports evidence-based decision-making.

Steps:

  1. Develop dashboards to dynamically display crime trends, with filters for Crime Type, Time Period, and Location.

  2. Create clear visualizations, such as heat maps for spatial distribution and bar charts for frequency analysis by crime type.

  3. Include annotated charts and narrative text to explain findings, especially for non-technical stakeholders.
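
For the bar chart by crime type, one minimal sketch is shown below. It re-enters the counts from the summary table in this proposal by hand so that it runs on its own; in the project itself the chart would be built from `crime_summary` instead. The object names `crime_counts` and `p`, and the styling choices, are illustrative.

```r
library(ggplot2)

# Counts copied from the summary table in this proposal.
crime_counts <- data.frame(
  crime_type = c("06 - LARCENY", "05 - BURGLARY", "07 - GTA",
                 "04 - ASSAULT, AGGRAVATED", "03 - ROBBERY",
                 "02 - SEXUAL ASSAULT", "08 - ARSON", "01 - HOMICIDE"),
  reports = c(125045, 16797, 14859, 13923, 6586, 2892, 1182, 404)
)

# Horizontal bars ordered by count, so long labels stay readable.
p <- ggplot(crime_counts, aes(x = reorder(crime_type, reports), y = reports)) +
  geom_col() +
  coord_flip() +
  scale_y_continuous(labels = scales::comma) +
  labs(
    title = "Reported Crimes in Tucson by Crime Type",
    x = NULL,
    y = "Number of reports"
  )
```

Printing `p` renders the chart. Because larceny dominates the totals, a log scale or a separate panel for the less frequent crime types may be worth considering.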

Action Plan & Deliverables

  1. Comprehensive Report: Includes detailed explanations, visualizations, and findings on crime patterns and hotspots.

  2. Interactive Dashboard: Provides real-time insights into crime distribution, allowing users to filter by type, time, and location.

  3. Presentation of Insights: Summarizes key findings and actionable insights for stakeholders, with recommendations for improving public safety in Tucson.