Home Ignoring rows within daterange
Reply: 2

Ignoring rows within daterange

SparteyNL
1#
SparteyNL Published in 2017-11-13 20:10:31Z

I have the following data:

CREATE TABLE SampleData
(
    orderid int,
    [name] nvarchar(1),
    [date] date
);
INSERT INTO SampleData
VALUES
(1, 'a', '2017-01-01'),
(2, 'a', '2017-01-05'),
(3, 'a', '2017-02-01'),
(4, 'a', '2017-04-01'),
(5, 'a', '2017-10-01'),
(6, 'b', '2017-04-01');

I need to retrieve each new order according to the following rules:

  • The first date for a name is the 'current order' for that name
  • Orders with the same name, but less than 3 months difference with the 'current order' is considered the same order and needs to be ignored
  • 3 months or more difference with the 'current' order is considered a new order and is now the 'current order' (in the SampleData orderid 1 and 4 need to be compared instead of 3 and 4, because 3 is not the current order)
  • If the name and date are the same, then the row with the lowest orderid is the superior order

So with the sample data I need the following result:

id name, date
1  a     2017-01-01
4  a     2017-04-01
5  a     2017-10-01
6  b     2017-04-01

I tried several approaches, but without success. Any idea's on how I can achieve this?

SQLUser44
2#
SQLUser44 Reply to 2017-11-14 00:00:23Z

Below is a quick fix solution that can be built upon if your code scales beyond the sample data provided. I will state beforehand that this isn't the prettiest solution but it does return the result set you indicated you were after.

If anything, you may want to consider looking into T-SQL Window Functions as well as Analytic Functions. I will advice that that they don't play well with all datatypes.

My goal with the solution below was to rank the rows while partitioning by name and order by the date field. Thus you have something similar to your order id but the rank is specific to the customer who placed the order.

I'll do my best to answer any questions:

if object_id('tempdb..#tmp_SampleData','u') is not null
    drop table #tmp_SampleData

CREATE TABLE #tmp_SampleData
(
    orderid int,
    [name] nvarchar(1),
    [date] date
);
INSERT INTO #tmp_SampleData
VALUES
(1, 'a', '2017-01-01'),
(2, 'a', '2017-01-05'),
(3, 'a', '2017-02-01'),
(4, 'a', '2017-04-01'),
(5, 'a', '2017-10-01'),
(6, 'b', '2017-04-01');

if object_id('tempdb..#tmp_iter','u') is not null
    drop table #tmp_iter 

select 
 orderid
,name
,date
,rank() over (partition by name order by date) [Rank] 
,lag(orderid,1,0) over (partition by name order by date) [LagRank]
--,rank() over (partition by name order by date desc) [ReverseRank]
into #tmp_Iter 
from #tmp_SampleData

if object_id('tempdb..#tmp_final','u') is not null
    drop table #tmp_final 


select 
 i.orderid
,i.name
,i.date
,datediff(month,i.date,i2.date) [MonthsPassed]
into #tmp_final 
from #tmp_Iter i
    left join #tmp_Iter i2 
        on i.Rank = i2.LagRank

select * 
from #tmp_final 
where 1=1
and MonthsPassed > 3 
or MonthsPassed = 0 
or MonthsPassed < 0 
or MonthsPassed is null 
SparteyNL
3#
SparteyNL Reply to 2017-11-14 12:30:36Z

@SQLUser44, thanks for your input. Unfortunately your code is not working. The result for the table below should be orderid's 1,6,7,8 and 9. Yours results in 1,2,3,5,6,7,8 and 9.

INSERT INTO #tmp_SampleData
VALUES
(1,'a','2017-01-01'),
(2,'a','2017-01-08'),
(3,'a','2017-05-01'),
(4,'a','2017-01-05'),
(5,'a','2017-02-01'),
(6,'b','2017-01-01'),
(7,'b','2017-09-01'),
(8,'c','2017-10-01'),
(9,'a','2017-04-01');

I came up with the following that works, but I think it will lack performance...

if object_id('tempdb..#tmp_SampleData','u') is not null
drop table #tmp_SampleData

CREATE TABLE #tmp_SampleData
(
    orderid int,
    [name] nvarchar(1),
    [date] date
);
INSERT INTO #tmp_SampleData
VALUES
(1,'a','2017-01-01'),
(2,'a','2017-01-08'),
(3,'a','2017-05-01'),
(4,'a','2017-01-05'),
(5,'a','2017-02-01'),
(6,'b','2017-01-01'),
(7,'b','2017-09-01'),
(8,'c','2017-10-01'),
(9,'a','2017-04-01');

DECLARE Test_Cursor CURSOR FOR  
SELECT * FROM #tmp_SampleData ORDER BY [name], [date];
OPEN Test_Cursor;

DECLARE @orderid int;
DECLARE @name nvarchar(255); 
DECLARE @date date; 

FETCH NEXT FROM Test_Cursor INTO @orderid, @name, @date;

DECLARE @current_date date = @date;
DECLARE @current_name nvarchar(255) = @name;
DECLARE @listOfIDs TABLE (orderid int);
INSERT @listOfIDs values(@orderid);

WHILE @@FETCH_STATUS = 0  
    BEGIN
       IF(@name = @current_name AND DATEDIFF(MONTH, @current_date, @date) >= 3)
          BEGIN
             SET @current_date = @date
             INSERT @listOfIDs values(@orderid)
          END
       IF(@name != @current_name)
          BEGIN
             SET @current_name = @name
             SET @current_date = @date
             INSERT @listOfIDs values(@orderid)
          END
    FETCH NEXT FROM Test_Cursor INTO @orderid, @name, @date; 
    END;  
CLOSE Test_Cursor;
DEALLOCATE Test_Cursor;
SELECT * FROM #tmp_SampleData WHERE orderid IN (SELECT orderid FROM @listOfIDs);

Better performing alternatives are very welcome!

You need to login account before you can post.

About| Privacy statement| Terms of Service| Advertising| Contact us| Help| Sitemap|
Processed in 0.298402 second(s) , Gzip On .

© 2016 Powered by mzan.com design MATCHINFO