8 Little Rock Crime Dataset
The dataset was collected from the open data source called Little Rock’s Data Hub. It is 13 MB, with 80,926 rows (observations) and 14 columns. The principal motivation for using this dataset is to build an accurate model for predicting crimes. It can be downloaded from here.
The dataset could be cleaned as follows.
Data cleaning and wrangling steps:
- Remove all duplicates in the `INCIDENT_NUMBER` column.
- Set `INCIDENT_NUMBER` as the index to uniquely identify offenses.
- Fill the missing values in `WEAPON_TYPE` with `NO WEAPON`. Fill all other missing values with `0`.
- Change `1` in `WEAPON_TYPE` to `UNKNOWN`.
- Reassign the columns to their respective types, e.g. numbers are numeric types and characters are strings.
- Parse `INCIDENT_DATE` to extract the year, month, day, hour, and minute, and create columns accordingly. Note that `AM/PM` is taken into account when computing hours.
- Create a column `CRIME_TYPE` based on the following rules:
  - If the `OFFENSE_CODE` contains `23`, the crime is `Non-Violent`.
  - If the `OFFENSE_DESCRIPTION` contains `THEFT`, the crime is `Non-Violent`.
  - All other crimes are `Violent Crime`.
- Create a column `RISK_TYPE` based on the following rules:
  - If the crime is `Non-Violent` and `NO WEAPON`, it is `Low Risk`.
  - All other crimes are `High Risk`.
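
The cleaning steps above could be sketched with pandas roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `clean_crime_data`, the timestamp layout in `DATE_FORMAT`, and the output column names `YEAR`, `MONTH`, `DAY`, `HOUR` and `MINUTE` are illustrative choices that the dataset description does not specify, and the placeholder `1` in `WEAPON_TYPE` is assumed to be stored either as text or as a number.

```python
import pandas as pd

# Assumed timestamp layout, e.g. "01/15/2017 02:30:00 PM"; adjust to the real file.
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def clean_crime_data(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning steps listed above to a raw crime DataFrame."""
    # Drop repeated incident numbers, then use the column as the index.
    df = df.drop_duplicates(subset="INCIDENT_NUMBER").set_index("INCIDENT_NUMBER")

    # Missing weapon types become "NO WEAPON"; the placeholder 1 becomes "UNKNOWN".
    weapon = df["WEAPON_TYPE"].fillna("NO WEAPON").astype(str)
    df["WEAPON_TYPE"] = weapon.replace({"1": "UNKNOWN", "1.0": "UNKNOWN"})

    # All remaining missing values are filled with 0.
    df = df.fillna(0)

    # Cast the text columns used by the rules below to strings.
    for col in ["OFFENSE_CODE", "OFFENSE_DESCRIPTION"]:
        df[col] = df[col].astype(str)

    # Parse the timestamp; %I together with %p maps 12-hour AM/PM times to 0-23.
    when = pd.to_datetime(df["INCIDENT_DATE"], format=DATE_FORMAT)
    df["YEAR"], df["MONTH"], df["DAY"] = when.dt.year, when.dt.month, when.dt.day
    df["HOUR"], df["MINUTE"] = when.dt.hour, when.dt.minute

    # CRIME_TYPE: offense code contains "23" or description contains "THEFT".
    non_violent = (
        df["OFFENSE_CODE"].str.contains("23", regex=False)
        | df["OFFENSE_DESCRIPTION"].str.contains("THEFT", regex=False)
    )
    df["CRIME_TYPE"] = non_violent.map({True: "Non-Violent", False: "Violent Crime"})

    # RISK_TYPE: only non-violent crimes with no weapon count as low risk.
    low_risk = non_violent & (df["WEAPON_TYPE"] == "NO WEAPON")
    df["RISK_TYPE"] = low_risk.map({True: "Low Risk", False: "High Risk"})
    return df
```

A call such as `clean_crime_data(pd.read_csv("crime.csv"))` would apply all steps in order, where `crime.csv` is a hypothetical local copy of the downloaded file. Filling `WEAPON_TYPE` before the general `fillna(0)` matters: reversing the order would turn missing weapon types into `0` instead of `NO WEAPON`.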