#IronViz Geospatial – Philadelphia Crime Scene

Wasn’t it just yesterday that we were all having a great time in Austin during Tableau Conference 2016? Time has flown by, and it’s time for the first feeder competition. The topic: use the spatial file connector in Tableau 10.2, one of the newest features! Read more about the contest here.

How do you use it?

Find a shapefile that interests you, launch Tableau, connect to the shapefile using the spatial file connector and double click on geometry to plot the shapes on the canvas. Pretty straightforward!

As soon as the competition was announced I started my search for spatial files. I went through a number of downloads only to find that some multiline operation isn’t yet supported in Tableau. What does that even mean? My search led me to this:

So any shapefiles of transit routes, railways, tracks or bike paths were ditched right away. They all seemed to contain multi-line string geometries. The first shapefile that worked for me was this Minnesota precipitation shapefile. Simply double clicking on geometry in Tableau led to this view:

This one had to be ditched pretty quickly too, because the dataset didn’t have anything apart from the geometry itself and grid codes indicating average precipitation groupings. I tried looking for other data to join it with but had no luck.

I went through more than 30 downloads ranging from precipitation to bike paths to state parks to earthquakes to water polygons to solar installations. But nothing really sold me.

Finally found my shapefile(s)!

I started searching for Philadelphia region spatial files simply because I am loyal like that. I finally came across PASDA’s website which had a variety of spatial files for the Philadelphia region.

They have a great many shapefiles. Check it out!

Crime incidents caught my eye, and I downloaded 2006 as a sample to see if it would work for my needs. And it sure did! The great thing about this dataset is that other attributes are included, so I didn’t have to worry about joining it with other datasets to come up with an effective story. I quickly downloaded 2007 to check whether the fields matched and whether the crimes were similar or different in nature. The attributes and measures did seem to be consistent.

Earlier last week I was traveling to Denver, and it was going to be a long flight, so I decided to work on my #IronViz on the plane. Awkwardly enough, a fellow passenger asked whether I work for the FBI because I had a huge title on display: “Philadelphia Crime Scene”.

On the plane, I thought about ways to make the story both explanatory and exploratory at the same time. I wanted the top focus to be on overall crime incidents over time and how much they have changed, while also giving the user the ability to explore the data and find their own story. So here is how the top section looks. Everything is on one single sheet with a Philadelphia skyline set as the background image, and the axis is fixed at 2020 to get some extra white space for annotations.

I went through a number of iterations to effectively show how much specific crime incidents have changed over time. I settled on a highlight table that changes color and moves based on the parameter selection, to draw attention to the selected crime and fade out the crimes that are not in question. However, I still offered tooltips for the crimes that are not highlighted. The highlighted bar also shows the percentage by which the selected crime has gone up or down, which I thought worked effectively. I used a bunch of calculations to increase the font size of the selected crime so it pops.
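For anyone curious about the arithmetic behind that up/down figure, here is a minimal Python sketch of a percent change between two years. The function name and the counts are made up for illustration; they are not from the PASDA data, and the workbook itself does this with Tableau calculations rather than Python.

```python
def percent_change(first: int, last: int) -> float:
    """Percent change from the first value to the last value."""
    if first == 0:
        raise ValueError("percent change is undefined when the first value is 0")
    return (last - first) / first * 100.0


# Hypothetical yearly counts for one crime type (NOT real PASDA numbers).
counts_by_year = {2006: 1200, 2013: 900}

change = percent_change(counts_by_year[2006], counts_by_year[2013])
label = f"{change:+.1f}%"  # signed label, e.g. for a highlighted bar
```

A negative label means the selected crime went down over the period, a positive one means it went up.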

At the time, I only had 2 shapefiles, 2006 and 2007. While working on it I quickly realized this wasn’t going to work, because I would have had to create separate sheets for every year (2006 to 2013) since they were all different files. What do I do now? I didn’t want to deal with blending. Of course, I reached out to the better half of the data duo when in doubt. Adam came to my rescue, such a great friend.

Adam suggested I use QGIS to merge the shapefiles into one large shapefile. Ok, how do I do it? I had used QGIS in the past, but not for merging shapefiles into a single file. I googled it and found out how.

QGIS Steps

If you don’t already have QGIS on your computer, Adam talks about installation steps here.

I brought all the shapefiles (2006 to 2013) into QGIS using Layer > Add Layer > Add Vector Layer, as shown below:

Once the layers were all added for each year, Adam said to merge the shapefiles using the option in Vector > Data Management Tools.

But I couldn’t find that option, so I had to use this option on the MMQGIS tab instead:

Another issue with my dataset was that, once merged, nothing would identify which records came from which year’s shapefile, which could have led to inaccurate crime counts. So I had to add a new field using the attribute table in QGIS so that each shapefile’s records stay distinguishable after merging, which again Adam quickly explained how to do.

You basically right click on the vector layer that you created > open attribute table > click on the pencil on the top left and click add new field.

In the dialogue, I entered year as the name of the field and chose string as the type with a length of 4. Click OK.

Once the field was added, I simply clicked on the attribute dropdown menu and chose the new ‘year’ field I had created, entered the value of the year and clicked update all.

I clicked on the pencil again, selected save in the dialogue, and done. I repeated this process for all the shapefiles before merging them into one file with the process I mentioned above. This gave me a nice merged shapefile to work with, so I didn’t have to worry about blending or cross database joins, neither of which would have been an easy solution.
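If it helps to see the idea outside of QGIS, here is a rough Python sketch of the same two steps: tag every record with its year, then merge. It uses plain dictionaries as stand-ins for shapefile records, so it is only an illustration of the logic, not a way to actually read or write shapefiles. The `location` and `crime` keys and the sample values are hypothetical; only the `year` field (a 4-character string) comes from the steps above.

```python
def tag_and_merge(layers):
    """Tag each feature with its layer's year, then merge into one list.

    `layers` maps a year string (e.g. "2006") to a list of feature dicts.
    """
    merged = []
    for year, features in sorted(layers.items()):
        for feature in features:
            tagged = dict(feature)   # copy so the input layers stay untouched
            tagged["year"] = year    # the new 4-character string field
            merged.append(tagged)
    return merged


# Hypothetical stand-ins for two yearly layers (NOT real PASDA records).
layers = {
    "2006": [{"location": "A", "crime": "Theft"}],
    "2007": [
        {"location": "A", "crime": "Theft"},
        {"location": "B", "crime": "Burglary"},
    ],
}

merged = tag_and_merge(layers)
records_per_year = {y: sum(1 for f in merged if f["year"] == y) for y in layers}
```

Without the `year` tag, the two "A / Theft" records would be indistinguishable after merging, which is exactly the counting problem described above.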

Ok back to Tableau.

The highlight table was great for seeing how different crimes changed over time. But the competition is about using maps to effectively tell a story. So I decided to use small multiples to show the intensity of crimes by year. I double clicked on geometry, changed the mark type to point, sized the marks all the way down, added number of records on size, put a parameter-driven calculation that colors the points/crimes on color, added location to level of detail, and voila!

The calculation for color uses this:

IF [Text Gener] = [Select a Parameter] THEN [Text Gener] ELSE 'z' END
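To make the logic concrete, here is a rough Python mirror of that calculation. It is only a sketch: `color_key` is a made-up name and the crime types are hypothetical examples, not values taken from the dataset.

```python
def color_key(crime_type: str, selected: str) -> str:
    """Return the crime type if it is the selected one, otherwise 'z'."""
    return crime_type if crime_type == selected else "z"


# Hypothetical crime types (NOT taken from the PASDA data).
crime_types = ["Aggravated Assault", "Burglary", "Thefts"]

# Every unselected crime collapses into the single placeholder value 'z'.
keys = [color_key(c, "Burglary") for c in crime_types]

# Sorting the distinct values alphabetically puts 'z' after the
# capitalized crime name.
distinct_sorted = sorted(set(keys))
```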

I used else ‘z’ so the marks that are selected always appear up front and are not hidden at the back. I initially duplicated the sheet to use other 4 years because I wanted to use callouts for location with max selected crimes in between the 2 maps like so:

But smartie pants Adam suggested using trellis on the same sheet, that way the circles can be dynamically sized based on selected crime. So the final view looks like this:

This is the calculation that is driving the size of the bubbles on the map above:

IF ATTR([Color]) = 'z' THEN 1
ELSEIF ATTR([Color]) != 'z' THEN SUM([Number of Records]) END
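In plain terms: unselected marks get a constant size of 1, and marks for the selected crime are sized by their record count. A small Python sketch of the same branching is below. The function name and sample values are hypothetical, and the handling of a missing value (returning `None`, standing in for Tableau’s NULL when no branch matches) is my assumption about how the calculation behaves.

```python
def bubble_size(color_value, record_count):
    """Mirror of the size calculation: 1 for 'z', record count otherwise."""
    if color_value is None:
        # Assumption: a missing value matches neither branch, so no size.
        return None
    if color_value == "z":
        return 1
    return record_count


# Hypothetical marks as (color value, number of records) pairs.
marks = [("z", 40), ("Burglary", 12), ("z", 7)]
sizes = [bubble_size(color, count) for color, count in marks]
```

Because every unselected mark shrinks to the same small size, only the selected crime’s bubbles vary, which is what lets the circles resize dynamically with the parameter.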

I initially had hover labels, so hovering over one point/location on the map highlighted the same point/location for the other years. But that was extremely slow on Tableau Public, so I reverted to tooltips, which seem more responsive.

So far so good. I however still thought I could add more about which locations are crime heavy and let the user identify which zones to avoid by giving that option to them. Before this I had never used the slide out containers explained very nicely by Robert Rouse here. Adam also had a template created using Robert’s technique which he kindly passed along. I had to resize the containers because Adam’s template left white space at the right of the dashboard when the container is collapsed which didn’t work for my needs. I wanted no space, so I adjusted the negative co-ordinates in a way that when the slide out is open the container on the right shifts outside of the dashboard space and when it is closed it fills the entire space on the dashboard. Download the workbook if you’re interested in seeing how the pixels/height and the width of the containers are assigned.

In the slide out container, I again went through a number of ways to show which are high crime versus low crime zones. The average incidents per location is different so it didn’t make sense to show average incidents and segregate them in groupings of high and low crime zones. I decided to simply show top ‘n’ locations to avoid in Philadelphia. I also gave an option to search a street keyword and added that to context so the filtered list shows top n locations to avoid for specific streets selected by the user.
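The order of operations matters here: the street keyword has to filter the data first, and only then is the top n picked, which is what adding the filter to context achieves. Here is a minimal Python sketch of that order. The function name and the addresses are invented for illustration and are not from the dataset.

```python
from collections import Counter


def top_n_locations(incidents, n, street_keyword=""):
    """Filter locations by keyword first, then rank by incident count."""
    keyword = street_keyword.lower()
    # Step 1: keep only locations whose name contains the keyword.
    matching = [loc for loc in incidents if keyword in loc.lower()]
    # Step 2: count incidents per location and take the n largest.
    return Counter(matching).most_common(n)


# Hypothetical incident locations, one entry per incident (NOT real data).
incidents = [
    "100 Market St", "100 Market St",
    "200 Broad St", "200 Broad St", "200 Broad St",
    "300 Market St",
]

overall_top = top_n_locations(incidents, 1)
market_top = top_n_locations(incidents, 2, "market")
```

If the ranking ran before the keyword filter, a search for "market" with n = 1 would return nothing, because the overall top location is on a different street.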

Because the tiny container is only 300 pixels wide, I wanted to maximize the use of tooltips. So I went ahead and showed yearly crime incidents using ASCII characters to indicate the percent of total crimes per location. This is what the view looks like when the menu is opened, and when you hover over a location the tooltips show yearly detail:

Learn more about creating bar charts in tooltips here.
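The idea behind those text bars is simple: repeat a character in proportion to each year’s share of the total. Here is a small Python sketch of that idea. The function name, the `#` character, the bar width and the counts are all my own choices for illustration, and in the workbook this is done with Tableau calculations rather than Python.

```python
def ascii_bar(value, total, width=20, char="#"):
    """Return a text bar whose length is proportional to value / total."""
    if total <= 0:
        return ""
    return char * round(value / total * width)


# Hypothetical yearly incident counts for one location (NOT real data).
yearly = {2006: 10, 2007: 30}
total = sum(yearly.values())

lines = [
    f"{year} {ascii_bar(count, total)} {count / total:.0%}"
    for year, count in yearly.items()
]
tooltip_text = "\n".join(lines)
```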

Finally, I decided to use the spatial connector with Tableau’s device designer feature and created a mobile view with just the elements from my tiny collapsible container. I added an overall map to show crime incidents in Philadelphia and designed it for iPhone 5 so that it scales fine on larger devices. Curtis Harris, the #IronViz champion himself, helped out a lot with feedback and suggestions along the way. He suggested adding population info to normalize the data. Here is the mobile view:

NOTE: The crime numbers are simply raw numbers and have not been normalized with population data, because my data is location specific and population is by neighborhood. I did not have the bandwidth to find population data and map it to each location, so it has been left out of this #IronViz entry, which focuses on spatial mapping. I might go back and add neighborhood data at a later point.

This entry is very unlike my usual public work (it is all tiling, no floating at all) but I think I am happy with the overall effect with the use of parameters, level of detail calcs, slide out containers, ASCII tooltips, spatial file connector and device designer all in one workbook.

I learnt a great deal with this entry. Winning is secondary, learning is primary. Huge thanks to Adam Crahen, Josh Tapley and Curtis Harris for their help, suggestions and encouragement along the way. I am glad I participated and I look forward to seeing other entries. It’s going to be a great round of outstanding entries!

I went through a number of iterations with my entry, with tremendous support from Curtis and Adam. Here is what it looked like before I sought feedback:

And here is what the final view looks like now, after all the valuable feedback I received.

Thanks for reading. Reach out if you have questions and good luck to all those participating in this round of #IronViz.

Check out the interactive version of this entry here.
