Another ‘BAN’ of Records for #MakeoverMonday


‘BAN’ = Big-Ass Number.

Thanks for inventing the term, Steve Wexler.

Just how big is big-ass?

That’s right, 784 million records!  How often do you get to look at a data set of that size?

How often do you do that on a live connection to a data source? On WiFi?

How often is that data source based 4,009 miles away?

If you didn’t answer ‘never’ to at least one of those questions, I don’t believe you.


Getting Started

Last time we had an EXASOL data source, I was so impressed with the speed.  This time, I had an even better experience on a data set roughly four times larger.

Why?  Last time I was a kid in a candy shop, trying to push the data source to its limits as much as possible.  I wanted to see just how far I could push it and still get great performance.  I don’t think I got to its limits, by the way.  I tried custom SQL connections to union the 200M+ row data source to itself and a bunch of other crazy stuff.

This time, I wanted to just fly.  I already knew EXASOL was fast.  I already knew it could handle pretty much anything I was going to throw at it.  I just wanted to rip through this data like I would any other.  I built tons of exploratory views looking at the different dimensions and their distinct counts, plus bins of the measures, to get an idea of what was there.  Most of this is throwaway exploratory work.  Rarely does anything I build within this 20-30 minute period ever make it into my final visualization.

Of course, some of this is just knowing not to try and render millions of points on a map, but why would you want to do that anyway?  If you have never worked with a data source of this size, check out Eva’s helpful tips here.

Speaking of maps, I will be surprised if anyone has a useful map this week.  I don’t think it will be useful analysis unless it is normalized with population data or takes into account the number, type and distance of facilities in a postcode.  In other words, more work than I wanted to take on this week.  So this map lover did not build one.  Well, I lied, I built this map to tell others not to build any.  Oh, and the great arc map up above.  Whatever, I like maps.

However, there is some geo data so I expect to see a lot of these on the #MakeoverMonday hashtag.  Whoo!


So what did you build, Adam?

This past week, Pooja and I collaborated on a #VizForSocialGood for May Project Gardens.  You can check it out here.  I started the viz and sent it to Pooja.  When I sent it, I was pretty excited about what I had going on.  I honestly didn’t think Pooja would have many changes, but we are trying to do more work together so I sent it and said tear it up.  She completely changed the top half and blew me away.  The combination of our styles just worked so much better than my original!

Pooja has a very unique design style.  What is amazing is that she rarely/never does anything crazy.  If you download her workbooks, everything is very simple to execute.  She just does a lot of simple things that add up to something amazing.  She also has a large sphere of influence.  Whatever design she chooses for #MakeoverMonday, you see about 20 of it the following week (Waffles/dotted text boxes anyone?)

Michael Mixon coined the #Poojatastic hashtag after we published this viz.  So this week I was going to build something #Poojatastic.


 1. Clean, Crisp, and Balanced – #Poojatastic

I decided to work with Pooja’s floated/dotted text borders to connect the title and charts.  I think it is very simple yet elegant at the same time. I wanted this visualization to be very balanced.  I have always been pretty picky with floating designs, but everything lines up perfectly.  The borders are consistent, a dual axis up on top to balance the text and spacing, the titles and footers are the same size and weight, and the headers are on the top and bottom.  Clean, Crisp and Balanced.

I also knew this probably couldn’t be published on Tableau Public, so I downloaded a Google Font called Share and focused on a static visual.  Don’t worry though, the tooltips are sweet.  Just ask Mike Cisneros if I would leave those on default…

Is it #Poojatastic?  Probably not.  She would never have done a radar chart.  However, I think I made it simple to understand by labeling the start and end points with the month, labeling each pane with a footer below, the use of a good title floating over a custom color legend, and Pooja lines to link it all together.  I am happy with it, but will let you be the judge if I succeeded with this visualization.

Also, Sarah Bartlett commented that these costs are not the costs felt by patients.  I knew this to be true from the glossary of terms, but it is an important distinction that perhaps I did not fully appreciate or make clear.  I am sure these costs come back to the residents as nothing is free, but from a day-to-day patient perspective, they only pay £8.60 per prescribed item regardless.

We continued our chat on DM where I complained about the cost of my kid’s glasses and my desire to be a UK health tourist (just kidding).  Full disclosure, the chat included GIFs.


 2. This is just for the fun of it. Really.

I had already done the work to create the radar chart calculations for the first visualization.  You know I built about 20 different versions with them.  But in the end, it was going to be static and small, so I simplified it to a single ring for each month per year.  But that doesn’t mean I couldn’t publish another view.

This chart includes every chemical name and an average cost per month from 2011-2016.  The further a point is from the center, the higher the average cost.  January is at 12 o’clock and the months are in order clockwise.  Now, is this a great chart?  No, it isn’t.  It is busy, but I challenge you to find a clearer way to display outliers.  Most people don’t understand box-and-whisker plots, and most times you can’t see all the points, so we introduce an arbitrary jitter.  Is that better?  No.  I acknowledge the downsides of this chart, but whatever, this one was just for fun.
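For anyone curious about the geometry, here is a minimal Python sketch of the layout described above: January at 12 o’clock, months running clockwise, and distance from the center growing with average cost.  This is not the Tableau calculation from the workbook; the function name `radar_xy` and the choice to scale by a `max_cost` value are assumptions made purely for illustration.

```python
import math


def radar_xy(month, avg_cost, max_cost):
    """Place one point on a 12-spoke radar chart.

    month:    1 (January) through 12 (December)
    avg_cost: the value to plot; larger values sit further from the center
    max_cost: hypothetical scaling value so the radius lands in [0, 1]
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    if max_cost <= 0:
        raise ValueError("max_cost must be positive")

    # Angle measured clockwise from 12 o'clock: January = 0, April = 90 degrees.
    angle = 2 * math.pi * (month - 1) / 12
    radius = avg_cost / max_cost

    # sin for x and cos for y (the reverse of the usual convention)
    # is what puts zero at the top and makes the months run clockwise.
    x = radius * math.sin(angle)
    y = radius * math.cos(angle)
    return x, y


if __name__ == "__main__":
    # Made-up values, only to show the call.
    for month in (1, 4, 7, 10):
        print(month, radar_xy(month, 8.0, 10.0))
```

Swapping `sin` and `cos` relative to the textbook polar formula is the whole trick; everything else is just scaling.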

I think it turned out pretty badass and really does do a good job of highlighting the fluctuating costs when using the interactive workbook.  Check out my Huge Ass Radar GIF.


 3. Finding a story

I already mentioned I had built a ton of exploratory views.  One of them was comparing the growth in the number of prescriptions from January 2011.  I noticed one clear drug standing out with a ridiculously high growth rate.  I researched this drug and found it was approved for use in April 2012.  I filtered the data to start at that point and created the table calc on every chemical name.  This showed prescriptions of Apixaban grew over 9.9M% from April 2012 to December 2016.
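The growth figure above is a percent difference from the first value in the filtered range.  As a rough sketch of that arithmetic (in Python rather than a Tableau table calc, with a hypothetical function name and made-up monthly counts, not the actual prescription data):

```python
def pct_growth_from_first(counts):
    """Percent change of each value relative to the first value in the list.

    counts: prescription counts per period, oldest first (hypothetical input).
    Returns a list of percentages; the first entry is always 0.0.
    """
    if not counts:
        return []
    base = counts[0]
    if base == 0:
        # Growth from a zero baseline is undefined, so fail loudly.
        raise ValueError("baseline count must be non-zero")
    return [(c - base) / base * 100 for c in counts]


if __name__ == "__main__":
    # Made-up numbers: a brand-new drug starts from a tiny baseline,
    # so even modest absolute volumes turn into enormous percentages.
    new_drug = [2, 50, 400, 2000]
    established_drug = [100000, 101000, 103000, 104000]
    print(pct_growth_from_first(new_drug))
    print(pct_growth_from_first(established_drug))
```

Because the baseline sits in the denominator, a drug that starts near zero will always dwarf established drugs on this measure.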

I tried to dress this story up with some information about the drug like when it was approved, what it is used for, and the chemical structure.  I thought this was pretty cool.

Then, my conscience (Mike Cisneros) DM’d me and said this chart just shows that a new drug will be prescribed.  He’s not wrong.  I immediately agreed that not every line is on even footing because most of the other drugs were established and their high growth periods in the number of prescriptions probably occurred prior to 2012.  So Mike, ya got a point.  Still, it is an amazing growth rate that none of the other 1,916 drugs in the data set could boast.  Then again, he told me he is married to a biostatistician in pharma, so he (more like his wife) is probably right.


 4. One we can all relate to or maybe not…

Recently, I came down with the stomach flu.  If you didn’t already know, this is not the real flu.  I was complaining to Pooja and Sarah on DM one day and, let’s say, they were less than comforting that I was sick.  They diagnosed me with #ManFlu.  So that is the backstory with this visualization.  Some people just don’t believe in the flu, I guess…

When I was playing with all the chemicals and prescription counts over time, I turned off the stacked marks.  #Boom – viz number 4 was born.  What the hell were all those spikes?  What was that drug?  Well, it turned out to be a vaccine as the chemical was influenza.  I knew this would be a good chart to publish.  It was 3am so I decided to have a little fun with it.

Andy Kriebel had just talked about using questions in titles in his weekly wrap-up from the previous week.  It is something I have done in the past, and something I thought would work well here.  What was the point of this chart?  See, another question.  The point was that one drug has a clear seasonal pattern that stands above all in the data set of 1,916 drugs.  So that became the title.  This is a simple area chart with stacked marks off and some formatting.  It might just be my favorite from the week.

My conscience (Mike Cisneros) also fact checked and approved (I think) of this visualization.


My takeaways this week

I love the Tableau community.  We are a very approachable group.  I really appreciate and value Sarah and Mike (and his wife) giving me their feedback on my visualizations.  This is really the only way to improve our skills.  If you are working in a bubble or don’t accept feedback, you’ll never get better.  So I encourage you to solicit and accept feedback whenever possible.  Also, give some every now and again.  It probably helps that I’ve shared beers with Sarah and Mike (and his wife during a recent trip to DC) on more than one occasion, but look for people to give you their honest opinion.  Stay humble!

I love that EXASOL gave us this opportunity to work with this large data set.  It was a great learning opportunity.  Also, to be honest, it is freaking fast.  It was like I was working with Superstore the whole time.  I think I waited a few seconds once or twice.  There really is nothing else to say.  I wait way longer for my work computer to boot up than EXASOL took to process any of my queries.

Also, I should not have drunk coffee at 6pm yesterday.  I was up until 4am vizzing and replying to folks in the UK who had just woken up to start their Sunday!

3 thoughts on “Another ‘BAN’ of Records for #MakeoverMonday”

  1. Great post Adam! Really enjoyable last night watching you produce viz after viz.

    1. Adam Crahen

      Thanks Andy! So many stories in these big data sets!

  2. Enjoyed reading your post and visualizations, thank you! I’m still in the process of learning myself and finding the MakeOverMondays a great source of inspiration (and challenge!) to explore and learn the various ways in which Tableau works.
    I was curious how you do some of the visual design elements like the lines that bracket your text? Anything you can refer me to that shows how you guys do that?
