3/10/2020

A Journey to Custom Subtotals with Table Calculations

This past weekend Tableau Zen Master Klaus Schulte published a great blog post showing four different ways of Adding Custom Sub-Totals in Tableau. In this blog post, he demonstrates how to do this with data densification and table calculations. I posted my own solution later that evening, which I will describe below, but I wanted to discuss the journey to get to the solution.

We often see finished visualizations or technical blog posts relating to Tableau, but when we see the end result, we don't see the hours of iteration that it can take to get there. Those iterations can include dead ends, mistakes and sometimes overlooking the simplest of solutions. So in an effort to show this process more, I thought I would walk through the process that I took. Keep in mind that I didn't record every step along the way, but I will highlight my process as best I can.

In addition, I would point out that we often find these solutions through community work and collaboration. Had Klaus not posted this blog post, which references work by Rody Zakovich, then I surely wouldn't be writing this post. As with many Tableau techniques, we build on the work of many others that come before us.

When I read Klaus's blog post, I suspected there was an easier way to do this with tableau calculations. I can't pinpoint exactly why I thought this, but there were a good many steps in his process and some long formulas. So I download Klaus's workbook to see exactly what he did with the Tableau Table calculations.

Let's start with the goal. This is what Klaus created. A table that shows the individual values for a subcategory, but instead of the typical subtotals, he has a running sum for the subtotals for each category. This is one of those things that would take 30 seconds in Excel, but as we'll see, there's a bit more involved in Tableau.

Using the Superstore data, the first thing I did was create a basic table with

My next series of steps were based on this logic; "if the number is in the Subcategory sequence then use the Sum([Sales]), otherwise use a Running_Sum(Sum([Sales]))". After a few quick attempts at a calculation around that logic, I decided to add a

This was my first breadcrumb because this isolates the pieces for each part, one sum and one running sum.

My next step involved a few calculations. First, I tried a single calculation using the logic I outlined above.

Calculated Field:

Formula:

else sum([Sales])

end

While the logic might appear to make sense, this was a failed attempt, because no matter how I set Compute Using or adjusted the table calculation, it didn't solve the problem. This formula doesn't calculate the running sum as I need it. We'll come back to this, because this failure is actually a step in the solution that comes back later.

My next step was to break the calculation into two parts so that I could see what was going on for each part of the calculation. I figured that if I could break them up them I could control them easier for sum and running sum.

Calculated Field:

Formula:

else sum([Sales])

end

Calculated Field:

Formula:

else null

end

This

Calculated Field:

Formula:

Here's what all of these calculations look look like at this point.

This was my first glimmer of hope for the correct result. In fact, it's a variation of the solution where the values are in two separate columns. I'd call that a win. When I remove all of the other columns that I created it looks like this.

I went back to add

Calculated Field:

Formula:

else Sum([Sales])

end

At this point it looks like a step backwards. It's now the running sum down the entire table. Not exactly what we are looking for.

However, when I changed the "Compute Using" to

This is the moment where I explained, "Yes!" while making a fist, and my family watching TV looked at me funny. Don't judge. This is exciting stuff. These ARE the droids we've been looking for.

My next thought was to clean up some of these calculations. Remember back at the beginning when the Running_Sum of Sales calculation wasn't getting the result I wanted in

Calculated Field:

Formula:

else Sum([Sales])

end

After making that change I verified that calculation was still working as expected. Yep, still good. What was a road block early on in the process and incorrect now helped me simplify the formula.

If both conditions of the IF statement produce the same result, Sum([Sales]), then I can just use Sum([Sales]) into the Running Sum Calculation and do this all with a single calculation. I didn't need this IF calculation at all. It doesn't get much simpler than this.

Calculated Field:

Formula:

I now have the result I was looking for in a single calculation. And it makes complete sense. It's very simple. It computes the running sum on

The next step is dealing with the label. Klaus's workbook used this solution from Rody Zakovich.

Calculated Field:

Formula:

THEN MIN("Running Total after "+[Category])

ELSE MIN([Sub-Category])

END

I knew I could do this same calculation with the size() function that I used in my earlier iterations. Those earlier steps paid off again.

Calculated Field:

Formula:

THEN MIN("Running Total after "+[Category])

ELSE MIN([Sub-Category])

END

The key to this one is setting the Compute Using to Subcategory for the Label, instead of at Category like we did for the Running Sum Calc. This solution allows me to remove First() and Last() and therefore Measure Names and Measure Values. I'm down to two calculations, one for the numbers and one for the label. I adjusted the "spacer" calculation, and ended up doing it in a different way than Klaus did.

I also experimented with making the table using a dual-axis circle so that I could leverage the labels of the circles and control alignment. This approach eliminates the need for the spacer calculations.

I hope you find this information useful. If you have any questions feel free to email me at Jeff@DataPlusScience.com

Jeffrey A. Shaffer

Follow on Twitter @HighVizAbility