Consolidating Contiguous Time Period Rows into one!!!

Consolidating Contiguous Time Period Rows into one!!!

Post by Adam Byrn » Wed, 11 Feb 2004 13:01:12



Hello.  This has been annoying me for a while now, and I'm not getting very
far (and a deadline looms!).

I have a table which contains time spans, so it has two columns of relevance
here: Start (datetime) and Finish (datetime),

I need a query that consolidates contiguous (adjacent) time spans into one,
so if the table has four rows:

2003-11-18 17:30:00, 2003-11-18 18:00:00
2003-11-18 18:00:00, 2003-11-18 19:00:00
2003-11-18 19:00:00, 2003-11-19 00:00:00
2003-11-19 07:30:00, 2003-11-18 08:30:00

the query should return the following:

2003-11-18 17:30:00, 2003-11-19 00:00:00
2003-11-19 07:30:00, 2003-11-18 08:30:00

Any ideas would be greatly appreciated!

Thanks,

-adam byrne

 
 
 

Consolidating Contiguous Time Period Rows into one!!!

Post by Jacco Schalkwij » Wed, 11 Feb 2004 14:05:57


Hi Adam,

There was a typo in the last row of your data, make sure that you have a
check constraint on the table to ensure that finish is later than start.

You can consolidate the timespans with the following query:

SELECT start, (SELECT MIN(finish) FROM #t t2
 WHERE t2.finish >= t1.start
 AND NOT EXISTS (SELECT NULL FROM #t t3
  WHERE t3.start = t2.finish)) AS finish
FROM #t t1
WHERE NOT EXISTS(SELECT NULL FROM #t t4
 WHERE t1.start = t4.finish)

--
Jacco Schalkwijk
SQL Server MVP


Quote:> Hello.  This has been annoying me for a while now, and I'm not getting
very
> far (and a deadline looms!).

> I have a table which contains time spans, so it has two columns of
relevance
> here: Start (datetime) and Finish (datetime),

> I need a query that consolidates contiguous (adjacent) time spans into
one,
> so if the table has four rows:

> 2003-11-18 17:30:00, 2003-11-18 18:00:00
> 2003-11-18 18:00:00, 2003-11-18 19:00:00
> 2003-11-18 19:00:00, 2003-11-19 00:00:00
> 2003-11-19 07:30:00, 2003-11-18 08:30:00

> the query should return the following:

> 2003-11-18 17:30:00, 2003-11-19 00:00:00
> 2003-11-19 07:30:00, 2003-11-18 08:30:00

> Any ideas would be greatly appreciated!

> Thanks,

> -adam byrne


 
 
 

Consolidating Contiguous Time Period Rows into one!!!

Post by Adam Byrn » Wed, 11 Feb 2004 18:52:03


Sorry about the typo!  And thanks a million for the speedy reply.

-adam


> Hi Adam,

> There was a typo in the last row of your data, make sure that you have a
> check constraint on the table to ensure that finish is later than start.

> You can consolidate the timespans with the following query:

> SELECT start, (SELECT MIN(finish) FROM #t t2
>  WHERE t2.finish >= t1.start
>  AND NOT EXISTS (SELECT NULL FROM #t t3
>   WHERE t3.start = t2.finish)) AS finish
> FROM #t t1
> WHERE NOT EXISTS(SELECT NULL FROM #t t4
>  WHERE t1.start = t4.finish)

> --
> Jacco Schalkwijk
> SQL Server MVP



> > Hello.  This has been annoying me for a while now, and I'm not getting
> very
> > far (and a deadline looms!).

> > I have a table which contains time spans, so it has two columns of
> relevance
> > here: Start (datetime) and Finish (datetime),

> > I need a query that consolidates contiguous (adjacent) time spans into
> one,
> > so if the table has four rows:

> > 2003-11-18 17:30:00, 2003-11-18 18:00:00
> > 2003-11-18 18:00:00, 2003-11-18 19:00:00
> > 2003-11-18 19:00:00, 2003-11-19 00:00:00
> > 2003-11-19 07:30:00, 2003-11-18 08:30:00

> > the query should return the following:

> > 2003-11-18 17:30:00, 2003-11-19 00:00:00
> > 2003-11-19 07:30:00, 2003-11-18 08:30:00

> > Any ideas would be greatly appreciated!

> > Thanks,

> > -adam byrne

 
 
 

Consolidating Contiguous Time Period Rows into one!!!

Post by Joe Celk » Wed, 11 Feb 2004 19:32:06


Quote:>>  have a table which contains time spans, so it has two columns of

relevance here: Start (datetime) and Finish (datetime)  .. <<

Can any of the durations overlap?  Are there any uniqueness constraints
on start and finish times?  Did you remember to add CHECK (start <
finish) in the real DDL?  Thre are good reasons the standard Netiquette
is to post DDL here.  So let me do it for you:

"Cut & Paste" from SQL FOR SMARTIES, Chapter 4:

Given a series of jobs that can start and stop at any time, how can you
be sure that an employee doing all these jobs was really working without
any gaps?  Let's build a table of timesheets for one employee.  

 CREATE TABLE Timesheets
 (job CHAR(5) NOT NULL PRIMARY KEY,
  start_date DATE NOT NULL,
  end_date DATE NOT NULL,
  CHECK(start_date < end_date));

INSERT INTO Timesheets VALUES ('j1', '1998-01-01', '1998-01-03');
INSERT INTO Timesheets VALUES ('j2', '1998-01-06', '1998-01-10');
INSERT INTO Timesheets VALUES ('j3', '1998-01-05', '1998-01-08');
INSERT INTO Timesheets VALUES ('j4', '1998-01-20', '1998-01-25');
INSERT INTO Timesheets VALUES ('j5', '1998-01-18', '1998-01-23');
INSERT INTO Timesheets VALUES ('j6', '1998-02-01', '1998-02-05');
INSERT INTO Timesheets VALUES ('j7', '1998-02-03', '1998-02-08');
INSERT INTO Timesheets VALUES ('j8', '1998-02-07', '1998-02-11');
INSERT INTO Timesheets VALUES ('j9', '1998-02-09', '1998-02-10');
INSERT INTO Timesheets VALUES ('j10', '1998-02-01', '1998-02-11');
INSERT INTO Timesheets VALUES ('j11', '1998-03-01', '1998-03-05');
INSERT INTO Timesheets VALUES ('j12', '1998-03-04', '1998-03-09');
INSERT INTO Timesheets VALUES ('j13', '1998-03-08', '1998-03-14');
INSERT INTO Timesheets VALUES ('j14', '1998-03-13', '1998-03-20');

The most immediate answer is to build a search condition for all of the
characteristics of a continuous time period  

This algorithm is due to Mike Arney, a DBA at BORN Consulting.  It uses
derived tables to get the extreme start and ending dates of a contiguous
run of durations.

SELECT Early.start_date, MIN(Latest.end_date)
  FROM (SELECT DISTINCT start_date
          FROM Timesheets AS T1
         WHERE NOT EXISTS
               (SELECT *
                  FROM Timesheets AS T2
                 WHERE T2.start_date < T1.start_date
                   AND T2.end_date >= T1.start_date)
       ) AS Early (start_date)
       INNER JOIN
       (SELECT DISTINCT end_date
          FROM Timesheets AS T3
         WHERE NOT EXISTS
              (SELECT *
                 FROM Timesheets AS T4
                WHERE T4.end_date > T3.end_date
                  AND T4.start_date <= T3.end_date)
      ) AS Latest (end_date)
      ON Early.start_date <= Latest.end_date
GROUP BY Early.start_date;

  Result
  start_date     end_date
  ===========================
  '1998-01-01'   '1998-01-03'
  '1998-01-05'   '1998-01-10'
  '1998-01-18'   '1998-01-25'
  '1998-02-01'   '1998-02-11'
  '1998-03-01'   '1998-03-20'

See if ou can get a copy of the Snodgrass book on temp*queries in
SQL; it just went out of print, so hurry.

--CELKO--
 ===========================
 Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, datatypes, etc. in your
schema are.

*** Sent via Developersdex http://www.veryComputer.com/ ***
Don't just participate in USENET...get rewarded for it!