Saturday 5 May 2012

MS SQL Correlated Subqueries


Well you can learn something new every day so I am finished up for the day and thought I would write about it.

I was writing a report for a customer and they wanted a count of values in a column. Simple stuff right, you just do a quick:

SELECT column1, COUNT(column3) FROM table1 GROUP BY column1

You end up with something like:
Value1  100
Value2  50
Value3  75

The problem here is what to do when you want to count the occurrences of the unique values in the column and have these counts output as individual columns in your report. Let’s say you have a data import job and the status of the each row you import is either going to be Pending, Completed or Failed.


So for each unique job you want a count of the rows that have yet to be inserted (Pending), have been inserted (Completed) or could not be inserted (Failed).

So what you want is something like:

Job ID
Rows Pending Insert
Rows Completed Insert
Rows Failed Insert
1
100
500
0
2
50
1000
55
3
0
2000
5

So this is where you can use Correlated Subqueries.
In the middle between your source data table and your target data table you have a tracking table that has a structure:
Dbo.TrackingTable
  • JobID
  • RowChangeSet
  • Status
The values in the Status column are going to be 1 = Pending, 2 = Completed, 3 = Failed
In order to get the report you would have a query like the following:

SELECT
GETDATE() AS 'Report As Of', 
tt.JobID AS 'Job ID',
       (SELECT COUNT(tt2.Status) FROM TrackingTable tt2 WHERE tt2.JobID = tt.JobID and tt2.Status = 1) AS 'Rows Pending Insert',
       (SELECT COUNT(tt2.Status) FROM TrackingTable tt2 WHERE tt2.JobID = tt.JobID and tt2.Status = 2) AS 'Rows Completed Insert',
       (SELECT COUNT(tt2.Status) FROM TrackingTable tt2 WHERE tt2.JobID = tt.JobID and tt2.Status = 3) AS 'Rows Failed Insert'
FROM
       TrackingTable tt
GROUP BY
       tt.JobID

You will notice that I am specifying in the WHERE clause that the sub query is dependent on the value in the main query. This is to say that the sub query is going to match where tt2.JobID = tt.JobID. You can look at where I am setting the alias tt and tt2 to see how this is working.

I hope you find this useful and if you do, show us some f and G+1 love, if you don’t then tell us why and we’ll work on that. If you have any questions/comments then leave them below.

No comments:

Post a Comment