T-SQL

Calculating average value of multiple columns

It’s an easy task to calculate the average AVG() value for all rows in one or more columns. But it starts to be a little bit more interesting when we should calculate the average value for more columns on the same row. There are basically two options on how to do that. One of them is really cool using the row constructor and may be suitable for dynamic SQL because we don’t need to calculate the number of columns involved in the calculation.

Let’s start with the creation of some sample data:

First, we will calculate the average traditional way over all values in a column:

There is no surprise at all.

But how should we calculate the average value for all columns in a row (except the RowId column)? Sure, we can do it mathematically and explicitly:

What I don’t like on this solution is the constant (4) as the number of columns involved in the calculation. It’s the value we have calculated outside the query itself and then used it inside it.

There is a better solution: We can use the row constructor VALUES() together with the AVG() function to get the same and correct result without any “artificial” constant to be used:

What do you think about it?

It can be used in case of dynamic SQL effectively without the need to calculate the number of columns involved and handling the constant dynamically:

The VALUES() limitation of maximum 1000 rows used at once is overridden there because VALUES() is used in a subquery. You can read more about this problem in this post.

As a next step, it will be interesting to compare both solutions performance-wise on a wide table with let’s say > 100 columns.

Leave a Reply

Your email address will not be published. Required fields are marked *