Howard County Police Department Calls for Service, 2014-2016
This visualization, created with Tableau Public, incorporates data on Howard County Police Department Calls for Service (2014-2016). It started as part of a final exam for my Business Analytics and Strategic Decision Making class, which asked me to create a data visualization in Tableau with a dataset of my choosing.
The initial dashboard - now titled ‘Top 10 Events and Trends by Selected Date Level’ - presented high-level summaries of which events generated the most calls for service, monthly trends for the top 10 events, and a breakdown of event types for the locations with the most events. It has since been updated with filters for event type, beat, and month and year, as well as options to select the granularity of the time series (e.g. year, month, week, day) and to adjust the date range displayed.
When first created, the dashboard showed a couple of disparities, such as:
- Traffic stops occurring 3 to 4 times as often as any of the remaining top 10 events
- Locations such as Washington Blvd (AKA Route 1) generating the most calls for service - mostly traffic stops - followed by obvious locations such as the county courthouse and police department, where calls consisted primarily of administrative and special assignments.
Although obvious from the visualization, these trends are ultimately not very informative or interesting, and with so many event types and locations it is difficult to tease out additional trends.
Issues with Data Quality and Coding
Several issues identified while creating this visualization relate to data quality or coding, including:
- Inconsistent coding of the ‘Date Reported’ column (e.g., 01/01/2014 versus 01-Jan-2015), causing the majority of data to be excluded when charting time series data. (UPDATE: This issue appears to have been addressed in the most recent update to the dataset on the open data portal)
- Broad versus specialized categories for ‘Computer Aided Dispatch Event Type’. The TRAFFIC STOP event type may act as a catch-all category, since more specific events such as DWI, LICENSE PLATE READER, M/V ACCIDENT, MOTOR VEHICLE VIOLATION, PROP DAMAGE, TRAFFIC HAZARD, TRAFFIC PURSUIT, and VEH CRASH could be interpreted as special sub-categories of TRAFFIC STOP. For example, when viewing monthly trends there are drops in TRAFFIC STOP events where there are spikes in TRAFFIC HAZARD or MOTOR VEHICLE VIOLATION events. This leads one to wonder whether there was actually a decrease or whether events were just inconsistently coded in each time period. With so many categories, it could be useful to develop a hierarchy so that major event categories could be summarized at a high level, allowing the user to drill down further into more specific sub-categories. Documentation on the exact meaning and methodology for each event type would also be useful for developers and data entry staff alike.
- The Location field is inconsistent in its use of primary roads, intersections, or XY coordinates. WASHINGTON BLVD (also known as Route 1) is particularly problematic as it stretches across the entire county and state and therefore dominates other entries when visualized. The dataset is filled with numerous entries including WASHINGTON BLVD, RT 1, or some intersection of WASHINGTON BLVD/RT 1. For greater flexibility in reporting and visualization, the dataset would benefit from a dedicated XY coordinates column, alleviating in part the issue of consistency for the Location field.
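As a rough illustration of how the date and location issues above could be cleaned up before loading the data into Tableau, here is a minimal Python sketch. It is not the process used for this dashboard. It assumes the two date formats quoted above are the only ones present (with the slash format read as month/day/year), that intersections are separated by a forward slash, and it uses a hypothetical alias table (`ROAD_ALIASES`) that would need to be checked against the real data.

```python
import re
from datetime import datetime

# Formats seen in the 'Date Reported' column (assumed: US month/day/year order).
DATE_FORMATS = ("%m/%d/%Y", "%d-%b-%Y")


def parse_date_reported(value):
    """Return a date for either known format, or None if neither matches."""
    text = value.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


# Hypothetical alias table: alternate spellings mapped to one canonical road name.
ROAD_ALIASES = {"RT 1": "WASHINGTON BLVD", "ROUTE 1": "WASHINGTON BLVD"}


def normalize_location(value):
    """Uppercase, collapse whitespace, and apply aliases to each road in an entry."""
    parts = [re.sub(r"\s+", " ", part).strip().upper() for part in value.split("/")]
    parts = [ROAD_ALIASES.get(part, part) for part in parts]
    # Aliasing can create duplicates (WASHINGTON BLVD/RT 1); keep the first of each.
    return "/".join(dict.fromkeys(parts))
```

Rows where `parse_date_reported` returns `None` can be set aside for manual review rather than silently dropped from the time series.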
Also recorded for each event are the assigned Beat and Statistical Reporting Area, which provide information on the general area or neighborhood in which each event occurred. At this time there is no documentation or spatial data file (e.g., shapefile, KML, GeoJSON) indicating the boundaries of these geographic regions. To enable higher-quality reporting, I set out to create a custom geospatial data file.
Creating Custom Spatial Data File
Although the County was unable to provide a spatial data file (e.g. shp, kml, geojson), they did provide the image below showing Beat assignments in Howard County, with areas and boundaries clearly labelled and colored. Since I work in a Mac environment and ArcGIS is primarily designed for Windows, I decided to use Google Earth, which is free, to create a KML file.
The process begins by importing the image into Google Earth as an image overlay and dragging the corners of the image to stretch and skew it until it matches the county boundaries as closely as possible.
We then draw a polygon in Google Earth for each of the Beat assignments, creating points along the boundaries to trace complex shapes. A name and description are provided for each polygon, and both are included in the KML export. The screenshot below shows the ‘New Polygon’ dialog, which remains open as the polygon is drawn.
Once all polygons were created, I exported ‘My Places’ from Google Earth as a KML file, which can then be imported into Tableau as a spatial data file. The initial import was problematic, as Tableau noted that the KML should provide all placemarks/polygons on a single layer. Opening the KML file in a text editor, I simply moved the series of Placemark XML elements out of their parent Folder element and reattempted the import, this time successfully.
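The manual edit described above could also be scripted. The sketch below uses only the Python standard library and assumes the export has the shape `kml > Document > Folder > Placemark` in the standard KML 2.2 namespace; the function name `flatten_kml` is my own and a real export should be inspected before relying on it.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
FOLDER = f"{{{KML_NS}}}Folder"
PLACEMARK = f"{{{KML_NS}}}Placemark"


def flatten_kml(kml_text):
    """Move every Placemark directly under Document and drop the Folder wrappers."""
    # Keep KML elements unprefixed when the tree is written back out.
    ET.register_namespace("", KML_NS)
    root = ET.fromstring(kml_text)
    document = root.find(f"{{{KML_NS}}}Document")
    if document is None:
        raise ValueError("expected a <Document> element under <kml>")

    # Collect placemarks at any depth, in document order.
    placemarks = list(document.iter(PLACEMARK))

    # Remove top-level folders and placemarks, then re-attach the placemarks flat.
    for child in list(document):
        if child.tag in (FOLDER, PLACEMARK):
            document.remove(child)
    document.extend(placemarks)

    return ET.tostring(root, encoding="unicode")
```

Other Document children such as the name and style definitions are left in place. Elements from extension namespaces may be written with generated prefixes, which is still valid XML.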
Mapping Events by Beat
Tableau allows a spatial data file to be imported as a dataset, which can then be joined or blended with the primary data source. The imported dataset includes the name and description fields specified in Google Earth, as well as the geometries of the polygons that were drawn. In this case, we link ‘Beat’ from the primary data source with ‘Name’ from the imported KML file. A custom fill map (AKA choropleth map) can then be created by double-clicking the Geometry field from the spatial data file to add the polygons to the view. From there the visualization is built as you would build a typical fill map in Tableau. Coloring is based on record counts for a given month and year. The filters provided allow the user to examine the number of events for a selected month and year, beat, or computer aided dispatch event type on both the fill map and an accompanying cross tab listing all the individual records.
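Outside of Tableau, the logic behind the fill map amounts to a count of records per beat for one month, matched against the polygon names. The following Python sketch illustrates that idea only; the sample records and beat names are made up, and the function name `events_per_beat` is hypothetical.

```python
from collections import Counter
from datetime import date

# Made-up sample rows using the 'Beat' and 'Date Reported' field names.
SAMPLE_RECORDS = [
    {"Beat": "A1", "Date Reported": date(2014, 1, 3)},
    {"Beat": "A1", "Date Reported": date(2014, 1, 9)},
    {"Beat": "B2", "Date Reported": date(2014, 1, 9)},
    {"Beat": "B2", "Date Reported": date(2014, 2, 1)},
    {"Beat": "Z9", "Date Reported": date(2014, 1, 20)},
]


def events_per_beat(records, polygon_names, year, month):
    """Count records per beat for one month and match them to polygon names.

    Returns (counts keyed by polygon name, beats with no matching polygon).
    """
    counts = Counter(
        row["Beat"]
        for row in records
        if row["Date Reported"].year == year and row["Date Reported"].month == month
    )
    # Every polygon gets a value, so beats with no events still render on the map.
    joined = {name: counts.get(name, 0) for name in polygon_names}
    # Beats present in the data but missing from the KML point to a naming mismatch.
    unmatched = sorted(set(counts) - set(polygon_names))
    return joined, unmatched
```

Checking the unmatched list is a quick way to catch polygons whose names in Google Earth do not exactly match the Beat values in the data.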
Future Research and Improvements
This mockup was particularly instructive for me in creating custom spatial data files from scratch. I found Google Earth relatively easy to work with, although drawing the custom polygons was quite cumbersome. Google Earth also seemed to lack some basic tools, such as the ability to erase points from a polygon once they were drawn, meaning that if mistakes were made you had to delete the shape and begin again. In the future I would like to test additional tools that generate, for example, GeoJSON files, which can be used more easily in JavaScript mapping libraries such as Leaflet.
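As one possible route to GeoJSON, the KML polygons drawn in Google Earth could be converted with a short script instead of being redrawn. The sketch below is an untested idea rather than part of this project. It assumes each Placemark holds a single simple Polygon with an outer boundary only, and it does not enforce the ring winding order recommended by the GeoJSON specification.

```python
import xml.etree.ElementTree as ET

KML_URI = "http://www.opengis.net/kml/2.2"
NS = {"k": KML_URI}
RING_PATH = "k:Polygon/k:outerBoundaryIs/k:LinearRing/k:coordinates"


def kml_to_geojson(kml_text):
    """Convert simple KML polygon placemarks into a GeoJSON FeatureCollection dict."""
    root = ET.fromstring(kml_text)
    features = []
    for placemark in root.iter(f"{{{KML_URI}}}Placemark"):
        coords_text = placemark.findtext(RING_PATH, namespaces=NS)
        if coords_text is None:
            continue  # not a simple polygon placemark; skipped in this sketch

        # KML stores "lon,lat[,alt]" tuples separated by whitespace.
        ring = []
        for token in coords_text.split():
            lon, lat = token.split(",")[:2]
            ring.append([float(lon), float(lat)])

        # GeoJSON rings must be closed: first and last positions identical.
        if ring and ring[0] != ring[-1]:
            ring.append(ring[0])

        features.append({
            "type": "Feature",
            "properties": {
                "name": placemark.findtext("k:name", default="", namespaces=NS),
                "description": placemark.findtext("k:description", default="", namespaces=NS),
            },
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        })
    return {"type": "FeatureCollection", "features": features}
```

The returned dictionary can be written to disk with `json.dump` and loaded into a mapping library that accepts GeoJSON.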
Although it was not the primary focus of this exercise, more effort could be spent improving the high-level summary on the second storyboard, which shows top events, trends over time, and events by location. Some ideas include:
- Filters to show top event types and monthly trends for a particular beat (UPDATED: 12/21/2017)
- Dropdown to select level of granularity for time series. The final design settled on monthly numbers, but additional charts by week or day could be developed and swapped in and out through a parameter dropdown (UPDATED 12/21/2017)
- With a better understanding of the event types, create a hierarchy that allows the visualization to drill from broad, overarching categories to more specialized event types. Visualizations such as tree maps or bar charts could then be used to select and drill down into a particular category, which would reduce the number of event types under consideration and make trends easier to identify.
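To make the hierarchy idea concrete, here is a minimal Python sketch of a two-level rollup. The grouping shown is purely an assumption for illustration: it places the traffic-related event types named earlier under a hypothetical TRAFFIC parent, and a real hierarchy would need the County's documentation on what each event type means.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from event type to a broad parent category.
EVENT_HIERARCHY = {
    "TRAFFIC STOP": "TRAFFIC",
    "DWI": "TRAFFIC",
    "LICENSE PLATE READER": "TRAFFIC",
    "M/V ACCIDENT": "TRAFFIC",
    "MOTOR VEHICLE VIOLATION": "TRAFFIC",
    "PROP DAMAGE": "TRAFFIC",
    "TRAFFIC HAZARD": "TRAFFIC",
    "TRAFFIC PURSUIT": "TRAFFIC",
    "VEH CRASH": "TRAFFIC",
}


def roll_up(event_counts):
    """Summarize counts by parent category and keep per-type detail for drill-down.

    Event types missing from the mapping are grouped under UNCATEGORIZED.
    """
    totals = Counter()
    detail = defaultdict(dict)
    for event_type, count in event_counts.items():
        category = EVENT_HIERARCHY.get(event_type, "UNCATEGORIZED")
        totals[category] += count
        detail[category][event_type] = count
    return dict(totals), dict(detail)
```

Summing at the parent level would also show whether a monthly drop in TRAFFIC STOP is offset by a rise in TRAFFIC HAZARD or MOTOR VEHICLE VIOLATION, which is the coding question raised earlier.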