Package 'rcrimeanalysis'

Title: An Implementation of Crime Analysis Methods
Description: An implementation of functions for the analysis of crime incident or records management system data. The package implements analysis algorithms scaled for city or regional crime analysis units. The package provides functions for kernel density estimation for crime heat maps, geocoding using the 'Google Maps' API, identification of repeat crime incidents, spatio-temporal map comparison across time intervals, time series analysis (forecasting and decomposition), detection of optimal parameters for the identification of near repeat incidents, and near repeat analysis with crime network linkage.
Authors: Jamie Spaulding and Keith Morris
Maintainer: Jamie Spaulding <[email protected]>
License: GPL-3
Version: 0.5.0
Built: 2025-03-10 05:51:40 UTC
Source: https://github.com/jsspaulding/rcrimeanalysis


Example data from the Chicago Data Portal

Description

A sample dataset of crime incidents in Chicago, IL from 2017-2019.

Usage

crimes

Format

A data frame with 25000 rows and 22 variables.

id

Unique identifier for the record.

case_number

The Chicago Police Department Records Division Number, which is unique to the incident.

date

Date when the incident occurred.

block

Partially redacted address where the incident occurred.

iucr

Illinois Uniform Crime Reporting code (directly linked to primary_type and description).

primary_type

The primary description of the IUCR code.

description

The secondary description of the IUCR code, a subcategory of the primary description.

location_description

Description of the location where the incident occurred.

arrest

Indicates whether an arrest was made.

domestic

Indicates whether the incident was domestic-related as defined by the Illinois Domestic Violence Act.

beat

Indicates the police beat where the incident occurred.

district

Indicates the police district where the incident occurred.

ward

The ward (City Council district) where the incident occurred.

community_area

Indicates the community area where the incident occurred.

fbi_code

Indicates the National Incident-Based Reporting System (NIBRS) crime classification.

x_coordinate

X coordinate of the incident location (State Plane Illinois East NAD 1983 projection).

y_coordinate

Y coordinate of the incident location (State Plane Illinois East NAD 1983 projection).

year

Year the incident occurred.

updated_on

Date and time the record was last updated.

latitude

The latitude of the location where the incident occurred.

longitude

The longitude of the location where the incident occurred.

location

Concatenation of latitude and longitude.

Source

https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2/data


Batch Geocoding of Physical Addresses using the Google Maps API

Description

Geocodes a location (determines latitude and longitude from a physical address) using the Google Maps API. The Google Maps API requires registered credentials (Google Cloud Platform); see the ggmap package at https://github.com/dkahle/ggmap for details. By using this function you agree to the Google Maps API Terms of Service at https://cloud.google.com/maps-platform/terms/.

Usage

geocode_address(location)

Arguments

location

a character vector of physical addresses (e.g. 1600 University Ave., Morgantown, WV)

Value

Returns a two-column matrix with the latitude and longitude of each location queried.

Author(s)

Jamie Spaulding, Keith Morris

Examples

library(ggmap) #needed to register Google Cloud Credentials
register_google("**Google Cloud Credentials Here**")
addresses <- c("Milan Puskar Stadium, Morgantown, WV","Woodburn Hall, Morgantown, WV")
geocode_address(addresses)

Identify Repeat Crime Incidents

Description

This function identifies crime incidents which occur at the same location. It returns a list of data frames, one per location, each containing the RMS data for the repeat crime incidents at that location. The expected input is based on the Chicago Police Department RMS structure.

Usage

id_repeat(data)

Arguments

data

Data frame of crime or RMS data. See provided Chicago Data Portal example for reference

Value

A list where each data frame contains repeat crime incidents for a given location.

Author(s)

Jamie Spaulding, Keith Morris

Examples

#Using provided dataset from Chicago Data Portal:
data(crimes)
crimes <- head(crimes, n = 1000)
out <- id_repeat(crimes)
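
The idea of grouping incidents by shared location can be illustrated in base R. The following is a minimal sketch, not the package's implementation: it assumes that "same location" means identical latitude and longitude values, and the toy data frame and the names loc_key, by_location, and repeats are made up for illustration.

```r
# Toy incidents; case numbers and coordinates are invented for illustration.
toy <- data.frame(
  case_number = c("A1", "A2", "A3", "A4", "A5"),
  latitude    = c(41.88, 41.88, 41.90, 41.95, 41.90),
  longitude   = c(-87.63, -87.63, -87.70, -87.60, -87.70),
  stringsAsFactors = FALSE
)

# Build one key per location (assumption: exact coordinate match defines a repeat).
loc_key <- paste(toy$latitude, toy$longitude)

# Split into one data frame per location, then keep locations with more than one incident.
by_location <- split(toy, loc_key)
repeats <- by_location[vapply(by_location, nrow, integer(1)) > 1]

# Each list element holds the records for one repeat location.
repeats
```

Real RMS data may need a tolerance or an address-based key instead of an exact coordinate match.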

Comparison of KDE Maps Across Specified Time Intervals

Description

This function calculates and compares the kernel density estimate (heat maps) of crime incident locations from two given intervals. The function returns a net difference raster which illustrates net changes between the spatial crime distributions across the specified intervals.

Usage

kde_int_comp(data, start1, end1, start2, end2)

Arguments

data

Data frame of crime or RMS data. See provided Chicago Data Portal example for reference

start1

Beginning date for the first interval of comparison

end1

Final date for the first interval of comparison

start2

Beginning date for the second interval of comparison

end2

Final date for the second interval of comparison

Value

Returns a shiny.tag.list object which contains three leaflet widgets: a widget with the calculated KDE from interval 1, a widget with the calculated KDE from interval 2, and a widget with a raster of the net differences between the KDE (heat maps) of each specified interval.

Author(s)

Jamie Spaulding, Keith Morris

Examples

#Using provided dataset from Chicago Data Portal:
data(crimes)
int_out <- kde_int_comp(crimes, start1="1/1/2017", end1="3/1/2017",
                                start2="1/1/2018", end2="3/1/2018")
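
The net difference computation described above can be sketched outside the package. This is a hedged illustration only: it uses MASS::kde2d as a stand-in density estimator (the package's own estimator and bandwidth choices may differ), assumes dates in month/day/year format as in the example call, and works on invented points. The helper in_interval() is hypothetical.

```r
library(MASS)  # provides kde2d(); assumed to be installed (it ships with standard R)

set.seed(123)
# Invented incident points: one cluster per year, slightly shifted.
toy <- data.frame(
  date      = c(rep("1/15/2017", 200), rep("1/15/2018", 200)),
  longitude = c(rnorm(200, -87.70, 0.02), rnorm(200, -87.65, 0.02)),
  latitude  = c(rnorm(200,  41.85, 0.02), rnorm(200,  41.90, 0.02))
)
toy$date <- as.Date(toy$date, format = "%m/%d/%Y")

# Hypothetical helper: TRUE for dates inside [start, end].
in_interval <- function(d, start, end) {
  d >= as.Date(start, format = "%m/%d/%Y") & d <= as.Date(end, format = "%m/%d/%Y")
}
int1 <- toy[in_interval(toy$date, "1/1/2017", "3/1/2017"), ]
int2 <- toy[in_interval(toy$date, "1/1/2018", "3/1/2018"), ]

# Both estimates must share one grid so the surfaces can be subtracted cell by cell.
lims <- c(range(toy$longitude), range(toy$latitude))
kde1 <- kde2d(int1$longitude, int1$latitude, n = 50, lims = lims)
kde2 <- kde2d(int2$longitude, int2$latitude, n = 50, lims = lims)

# Positive cells: density rose in interval 2; negative cells: density fell.
net_diff <- kde2$z - kde1$z
```

Fixing the grid limits across both intervals is the key design choice; without it the two density matrices would not be aligned.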

Kernel Density Estimation and Heat Map Generation for Crime Incidents

Description

This function computes a kernel density estimate of crime incident locations and returns a 'Leaflet' map of the incidents. The expected input is based on the Chicago Police Department RMS structure, and the function populates a pop-up window with the incident location for each incident.

Usage

kde_map(data, pts = NULL)

Arguments

data

Data frame of crime or RMS data. See provided Chicago Data Portal example for reference

pts

Logical (TRUE or FALSE). Determines whether the incident points are plotted on the map widget. If NULL, the default value is TRUE.

Value

A Leaflet map with three layers: an 'ESRI' base-map, all crime incidents plotted (with incident info pop-up windows), and a kernel density estimate of those points.

Author(s)

Jamie Spaulding, Keith Morris

Examples

#Using provided dataset from Chicago Data Portal:
data(crimes)
crimes <- head(crimes, 1000)
library('leaflet') # needed to install basemap providers
kde_map(crimes)

Near Repeat Analysis of Crime Incidents with Crime Linkage Output

Description

This function performs near repeat analysis for a set of incident locations. The user specifies distance and time thresholds, which are used to search all other incidents for near repeats. An adjacency matrix is then created for the incidents which are related under the thresholds, and that matrix is used to build an igraph graph which illustrates potentially related or linked incidents (under the near repeat thresholds).

Usage

near_repeat_analysis(
  data,
  epsg,
  dist_thresh = NULL,
  time_thresh = NULL,
  tz = NULL
)

Arguments

data

Data frame of crime or RMS data. See provided Chicago Data Portal example for reference

epsg

The EPSG Geodetic Parameter code for the area being considered. The EPSG code is used for identifying projections and performing coordinate transformations. If needed, the EPSG for an area can be found at https://spatialreference.org.

dist_thresh

The spatial distance (in meters) which defines a near repeat incident. By default this value is set to 1000 meters.

time_thresh

The temporal distance (in days) which defines a near repeat incident. By default this value is set to 7 days.

tz

Time zone of the area being examined. By default this value is set to the time zone of the system. For more information about time zones within R, see https://www.rdocumentation.org/packages/base/versions/3.6.1/topics/timezones.

Value

Returns a list of all near repeat series identified within the input data as igraph graph objects. This list can be used to generate plots of each series and to discern the near repeat linkages between the crime incidents.

Author(s)

Jamie Spaulding, Keith Morris

Examples

data(crimes)
nr_data <- head(crimes, n = 1000) #truncate dataset for near repeat analysis
out <- near_repeat_analysis(data=nr_data,tz="America/Chicago",epsg="32616")
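
The thresholding and linkage steps described above can be sketched in base R. This is an illustration under stated assumptions, not the package's code: coordinates are assumed to be already projected to meters, the incidents are invented, and connected series are found with a simple label-propagation loop instead of igraph.

```r
# Invented incidents with projected coordinates (meters) and dates.
incidents <- data.frame(
  id   = 1:5,
  x    = c(0, 300, 5000, 5200, 20000),
  y    = c(0, 400, 5000, 5100, 20000),
  date = as.Date(c("2017-01-01", "2017-01-03", "2017-01-02",
                   "2017-02-15", "2017-01-04"))
)
dist_thresh <- 1000  # meters (the documented default)
time_thresh <- 7     # days (the documented default)

# Pairwise spatial and temporal gaps between all incidents.
space_gap <- as.matrix(dist(incidents[, c("x", "y")]))
time_gap  <- abs(outer(as.numeric(incidents$date), as.numeric(incidents$date), "-"))

# Adjacency matrix: TRUE where a pair is within both thresholds.
adj <- space_gap <= dist_thresh & time_gap <= time_thresh
diag(adj) <- FALSE

# Connected components by propagating the smallest label among neighbours.
label <- seq_len(nrow(adj))
repeat {
  new_label <- vapply(seq_along(label),
                      function(i) min(label[c(i, which(adj[i, ]))]),
                      integer(1))
  if (all(new_label == label)) break
  label <- new_label
}

# Keep only series with at least two linked incidents.
series <- split(incidents$id, label)
series <- series[lengths(series) > 1]
series
```

Here incidents 1 and 2 are 500 meters and 2 days apart, so they form a series; incidents 3 and 4 are close in space but 44 days apart, so they are not linked.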

Identification of Optimal Time and Distance Parameters for Near Repeat Analysis

Description

This function evaluates the given crime incidents to recommend parameters for near repeat analysis. A series of time and distance parameters is tested in a full factorial design over the set of incident locations to determine the frequency of occurrence under each parameter combination. The results of the full factorial assessment are then modeled through interpolation, and the second derivative is calculated to determine the inflection point. The inflection point represents the change in the frequency of detected near repeat incidents. The inflection point is determined for both the time and distance domains.

Usage

near_repeat_eval(data, epsg, tz = NULL)

Arguments

data

Data frame of crime or RMS data. See provided Chicago Data Portal example for reference

epsg

The EPSG Geodetic Parameter code for the area being considered. The EPSG code is used for identifying projections and performing coordinate transformations. If needed, the EPSG for an area can be found at https://spatialreference.org.

tz

Time zone of the area being examined. By default this value is set to the time zone of the system. For more information about time zones within R, see https://www.rdocumentation.org/packages/base/versions/3.6.1/topics/timezones.

Value

Returns a data frame with one row and two columns, distance and time, giving the optimal near repeat parameter for each. Distance is given in meters and time in days.

Author(s)

Jamie Spaulding, Keith Morris

Examples

data(crimes)
nr_dat <- subset(crimes, primary_type == "BURGLARY")
pars <- near_repeat_eval(data=nr_dat, tz="America/Chicago", epsg="32616")
pars
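
The full factorial evaluation and the search for an inflection point can be sketched as follows. This is a simplified illustration, not the package's method: it skips the interpolation step and uses a discrete second difference instead, the threshold grids are arbitrary, the incidents are randomly generated, and find_inflection() is a hypothetical helper.

```r
# Randomly generated incidents: projected coordinates in meters, dates as Date.
set.seed(1)
toy <- data.frame(
  x    = runif(60, 0, 3000),
  y    = runif(60, 0, 3000),
  date = as.Date("2017-01-01") + sample(0:60, 60, replace = TRUE)
)

# Pairwise gaps; the upper triangle counts each pair once.
space_gap <- as.matrix(dist(toy[, c("x", "y")]))
time_gap  <- abs(outer(as.numeric(toy$date), as.numeric(toy$date), "-"))
pairs     <- upper.tri(space_gap)

# Full factorial design: every distance threshold crossed with every time threshold.
grid <- expand.grid(distance = seq(250, 2000, by = 250),
                    time     = seq(2, 14, by = 2))
grid$n_pairs <- mapply(
  function(d, t) sum(space_gap[pairs] <= d & time_gap[pairs] <= t),
  grid$distance, grid$time
)

# Hypothetical helper: discrete second derivative of y over x; the inflection
# estimate is the x value where that second difference is closest to zero.
find_inflection <- function(x, y) {
  d2 <- diff(y, differences = 2)
  x[which.min(abs(d2)) + 1]
}

# Marginal curve over distance at the largest time threshold.
by_dist <- grid[grid$time == max(grid$time), ]
find_inflection(by_dist$distance, by_dist$n_pairs)
```

Taking the point of smallest absolute curvature is a crude criterion; a smoothed or interpolated curve, as the description mentions, would be less sensitive to noise in the counts.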

Time Series Forecast and Decomposition for Daily Crime Data

Description

This function transforms daily crime count data and plots the resultant components of a time series which has been decomposed into seasonal, trend, and irregular components using Loess smoothing. Holt-Winters exponential smoothing is also performed for improved trend resolution since the data is in a daily format.

Usage

ts_daily_decomp(data, start)

Arguments

data

Data frame of crime or RMS data. See provided Chicago Data Portal example for reference

start

Start date for the time series being analyzed. The format is as follows: c('year', 'month', 'day'). See example below for reference.

Value

Returns an object of class "stl" with the following components:

time.series: a multiple time series with columns seasonal, trend and remainder.

weights: the final robust weights (all one if fitting is not done robustly).

call: the matched call.

win: integer (length 3 vector) with the spans used for the "s", "t", and "l" smoothers.

deg: integer (length 3) vector with the polynomial degrees for these smoothers.

jump: integer (length 3) vector with the 'jumps' (skips) used for these smoothers.

inner: number of inner iterations

Author(s)

Jamie Spaulding, Keith Morris

Examples

#Using provided dataset from Chicago Data Portal:
data(crimes)
test <- ts_daily_decomp(data = crimes, start = c(2017, 1, 1))
plot(test)
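
The transformation from incident dates to a decomposed daily series can be sketched with base R and stats::stl. This is an assumption-laden illustration rather than the package's code: the incident dates are simulated, a yearly period (frequency = 365) is assumed, and the Holt-Winters step is omitted.

```r
set.seed(42)
# Simulated incident dates spread over three years (2017 through 2019).
dates    <- as.Date("2017-01-01") + sample(0:(3 * 365 - 1), 5000, replace = TRUE)
all_days <- seq(as.Date("2017-01-01"), by = "day", length.out = 3 * 365)

# Count incidents per day, keeping days with zero incidents.
counts <- as.numeric(table(factor(as.character(dates),
                                  levels = as.character(all_days))))

# Daily series with an assumed yearly seasonal period.
daily <- ts(counts, start = c(2017, 1), frequency = 365)

# Seasonal-trend decomposition using Loess.
fit <- stl(daily, s.window = "periodic")
colnames(fit$time.series)
```

The seasonal, trend, and remainder columns add back up to the original daily counts, which is a quick sanity check on any decomposition of this kind.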

Time Series Forecast for Daily Crime Data

Description

This function transforms traditional crime data into a time series and forecasts future incident counts based on the input data over a specified duration. The forecast is computed using simple exponential smoothing with additive errors. The function returns a plot of the time series, the trend, and the upper and lower prediction limits for the forecast.

Usage

ts_forecast(data, start, duration = NULL)

Arguments

data

Data frame of crime or RMS data. See provided Chicago Data Portal example for reference

start

Start date for the time series being analyzed. The format is as follows: c('year', 'month', 'day'). See example below for reference.

duration

Number of days for the forecast. If NULL, the default duration for the forecast is 365 days.

Value

Returns a plot of the time series entered (black), a forecast over the specified duration (blue), the exponentially smoothed trend for both the input data (red) and forecast (orange), and the upper and lower bounds for the prediction interval (grey).

Author(s)

Jamie Spaulding, Keith Morris

Examples

#Using provided dataset from Chicago Data Portal:
data(crimes)
ts_forecast(crimes, start = c(2017, 1, 1))
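
Simple exponential smoothing with a prediction interval can be sketched with stats::HoltWinters. This is a stand-in for illustration: the package may use a different fitting routine, the daily counts are simulated, and the 30-day horizon is arbitrary.

```r
set.seed(2017)
# Simulated daily incident counts for two years.
days   <- seq(as.Date("2017-01-01"), by = "day", length.out = 730)
counts <- rpois(length(days), lambda = 30)
daily  <- ts(counts, start = c(2017, 1), frequency = 365)

# Disabling the trend (beta) and seasonal (gamma) terms leaves
# simple exponential smoothing of the level only.
fit <- HoltWinters(daily, beta = FALSE, gamma = FALSE)

# Forecast 30 days ahead with a 95% prediction interval.
fc <- predict(fit, n.ahead = 30, prediction.interval = TRUE, level = 0.95)
head(fc)
```

With simple exponential smoothing the point forecast is flat across the horizon, while the prediction interval widens as the horizon grows.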

Time Series Decomposition for Monthly Crime Data

Description

This function transforms traditional crime data and plots the resultant components of a time series which has been decomposed into seasonal, trend and irregular components using Loess smoothing.

Usage

ts_month_decomp(data, start)

Arguments

data

Data frame of crime or RMS data. See provided Chicago Data Portal example for reference

start

The year in which the time series data starts. The time series is assumed to be composed solely of monthly count data.

Value

Returns an object of class "stl" with the following components:

time.series: a multiple time series with columns seasonal, trend and remainder.

weights: the final robust weights (all one if fitting is not done robustly).

call: the matched call.

win: integer (length 3 vector) with the spans used for the "s", "t", and "l" smoothers.

deg: integer (length 3) vector with the polynomial degrees for these smoothers.

jump: integer (length 3) vector with the 'jumps' (skips) used for these smoothers.

inner: number of inner iterations

Author(s)

Jamie Spaulding, Keith Morris

Examples

#Using provided dataset from Chicago Data Portal:
data(crimes)
test <- ts_month_decomp(crimes, 2017)
plot(test)
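
How a start year maps onto a monthly series can be sketched in base R. This is a hedged illustration with simulated incident dates, not the package's code; it assumes counts are aggregated by calendar month and that the series begins in January of the start year.

```r
set.seed(7)
# Simulated incident dates covering 2017 through 2019.
dates <- as.Date("2017-01-01") + sample(0:1094, 3000, replace = TRUE)

# Aggregate to counts per calendar month, keeping every month in the range.
all_months     <- format(seq(as.Date("2017-01-01"), by = "month", length.out = 36), "%Y-%m")
monthly_counts <- as.numeric(table(factor(format(dates, "%Y-%m"), levels = all_months)))

# Monthly series: 12 observations per year, starting in January 2017.
monthly <- ts(monthly_counts, start = c(2017, 1), frequency = 12)

# Seasonal-trend decomposition using Loess.
fit <- stl(monthly, s.window = "periodic")
```

stl() needs at least two full periods, so a monthly series should span two years or more.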