Home Optimise T-sql query
Reply: 3

Optimise T-sql query

300baud
1#
300baud Published in 2017-12-06 21:49:32Z

The query I need to develop is to list out sites which have no records (retail transactions) for a given date. I've been able to accomplish this for a specific site, but I know the query can be written more efficiently using joins but my foray into inners and outers hasn't been much fun.

Here's what I have so far:

DECLARE @StartDate DATE = '2017-11-01',
    @EndDate DATE = '2017-11-30';

SELECT tx.Txndate, count(*) as txCount
FROM [Report].[dbo].[FactTransactions] tx 
where tx.Site = 2
and tx.TxnDate between @StartDate and @EndDate
group by tx.TxnDate
union all
select db.daybookdate, 0 as txCount
from DimDaybook db
where db.daybookdate between @StartDate and @EndDate
and NOT EXISTS (SELECT 1 FROM [Report].[dbo].[FactTransactions] AS t WHERE t.TxnDate = db.daybookdate and t.txndate between @StartDate and @EndDate and t.site = 2) 
order by tx.Txndate

This returns me the result set for site 2

Txndate txCount
2017-11-01  1691
2017-11-02  1657
2017-11-03  1835
2017-11-04  1587
2017-11-05  1489
2017-11-06  1544
2017-11-07  1525
2017-11-08  1782
2017-11-09  1848
2017-11-10  1990
2017-11-11  0
2017-11-12  0

What I'd really like to have is a result set that looks like this and runs much much quicker than my hack (currently around 2mins - for context the transactions table has 83,486,412 records)

Site    Date    Transactions
2   11/11/17    0
2   12/11/27    0
3   12/11/17    0
22  1/11/17     0
Gordon Linoff
2#
Gordon Linoff Reply to 2017-12-06 21:53:36Z

Generate all the rows using a cross join. Then filter out the ones with data:

select s.site, db.daybookdate
from DimDaybook db cross join
     (select distinct site
      from [Report].[dbo].[FactTransactions] ft
     ) s
where db.daybookdate between @StartDate and @EndDate and
      not exists (select 1 
                  from [Report].[dbo].[FactTransactions] ft 
                  where ft.TxnDate = db.daybookdate and
                        ft.site = s.site
                 ) 
order by tx.Txndate
Emilio Ceroleni
3#
Emilio Ceroleni Reply to 2017-12-06 22:05:37Z

One way could be the following:

DECLARE @StartDate DATE = '2017-11-01', @EndDate DATE = '2017-11-30'; 
With cte as (
 SELECT tx.Txndate, count(*) as txCount FROM 
[Report].[dbo].[FactTransactions] tx where tx.Site = 2 and tx.TxnDate between @StartDate and @EndDate group by tx.TxnDate)
Select db.daybookdate, isnull(txCount,0) from DimDaybook db left join cte on db.daybookdate = cte.Txndate

I'm missing the site id but there's nowhere I can get it from your sample code.

Used_By_Already
4#
Used_By_Already Reply to 2017-12-06 22:18:50Z

As you already have a list of all days available (DimDaybook) all you need to do is "left join" your counts to those days and then replace any NULLs with zero.

Do be careful with date ranges, using "between" might sound like a good way to do it but it isn't the best way. Instead of trying to specify the last day of November, just specify the first of December instead, then use a combination of >= and < with your date parameters and then your date range will work for every date/time data type (datetime2, datetime, smalldatime, date)

DECLARE @StartDate date = '2017-11-01'
      , @EndDate date = '2017-12-01'; -- this has changed!

SELECT
      db.daybookdate
    , COALESCE(txCount, 0)
FROM DimDaybook db
LEFT JOIN (
                  SELECT
                        tx.Txndate
                      , COUNT(*) AS txcount
                  FROM [Report].[dbo].[FactTransactions] tx
                  WHERE tx.Site = 2
                  AND tx.TxnDate >= @StartDate AND tx.TxnDate < @EndDate
                  GROUP BY
                        tx.TxnDate
      ) c ON db.daybookdate = c.Txndate
WHERE db.daybookdate >= @StartDate AND db.daybookdate < @EndDate
AND tx.Txndate IS NULL

When you establish a date reference like '2017-11-30' the time of day is automatically set at 00:00:00+0000000. So, for any data you might encounter that has a date of '2017-11-30' but a time > 00:00:00 that data would be excluded. In short: the duration of that day is ignored. This is easily overcome by simply moving the end date "up" to the next day and using less than that date, and to do that you must avoid using "between".

You need to login account before you can post.

About| Privacy statement| Terms of Service| Advertising| Contact us| Help| Sitemap|
Processed in 0.310269 second(s) , Gzip On .

© 2016 Powered by mzan.com design MATCHINFO