## Table Design question

### Table Design question

I'm still trying to learn about table design and wonder if someone could
help me with what must be a common problem.

In our analytical laboratory, samples for testing arrive in batches. Once a
batch is processed it is marked as closed. At many stages in our database,
we would like to see lists of all open batches. So we have two options:
(These two examples assume we have six batches and that the first three are
closed).

1. Use a 'Closed' Column in the Batch table
============================================
CREATE TABLE Batches
(
Batch_ID int IDENTITY(1,1) PRIMARY KEY,
BatchID char(10) NOT NULL,
Closed bit DEFAULT 0
)

INSERT INTO Batches (BatchID, Closed) VALUES ('Batch001', 1)
INSERT INTO Batches (BatchID, Closed) VALUES ('Batch002', 1)
INSERT INTO Batches (BatchID, Closed) VALUES ('Batch003', 1)
INSERT INTO Batches (BatchID, Closed) VALUES ('Batch004', 0)
INSERT INTO Batches (BatchID, Closed) VALUES ('Batch005', 0)
INSERT INTO Batches (BatchID, Closed) VALUES ('Batch006', 0)

SELECT * FROM Batches WHERE Closed = 0
=============================================

2. Use a separate ClosedBatches  table
=============================================
DROP TABLE Batches
GO
CREATE TABLE Batches
(
Batch_ID int IDENTITY(1,1) PRIMARY KEY,
BatchID char(10) NOT NULL
)

INSERT INTO Batches (BatchID) VALUES ('Batch001')
INSERT INTO Batches (BatchID) VALUES ('Batch002')
INSERT INTO Batches (BatchID) VALUES ('Batch003')
INSERT INTO Batches (BatchID) VALUES ('Batch004')
INSERT INTO Batches (BatchID) VALUES ('Batch005')
INSERT INTO Batches (BatchID) VALUES ('Batch006')

CREATE TABLE ClosedBatches
(
ClosedBatch_ID int IDENTITY(1,1) PRIMARY KEY,
Batch_ID int REFERENCES Batches(Batch_ID)
)

INSERT INTO ClosedBatches (Batch_ID) VALUES (1)
INSERT INTO ClosedBatches (Batch_ID) VALUES (2)
INSERT INTO ClosedBatches (Batch_ID) VALUES (3)

SELECT * FROM Batches WHERE Batch_ID NOT IN (SELECT Batch_ID
FROM ClosedBatches)
======================================

I think that 2. must be a better method than 1. I'd like to know if, in
fact, it is, and if there may be a third method that I haven't thought of.

John WB

### Table Design question

John:

Quote:> Use separate ClosedBatches and OpenBatches  tables
> =============================================
> CREATE TABLE BatchesOpen
> (
>  Batch_ID int IDENTITY(1,1) PRIMARY KEY,
>  BatchID char(10) NOT NULL
> )

> INSERT INTO BatchesOpen (BatchID) VALUES ('Batch004')
> INSERT INTO BatchesOpen (BatchID) VALUES ('Batch005')
> INSERT INTO BatchesOpen (BatchID) VALUES ('Batch006')

> CREATE TABLE BatchesClosed
> (
>  Batch_ID int IDENTITY(1,1) PRIMARY KEY,
>  BatchID char(10) NOT NULL
> )

> INSERT INTO BatchesClosed (Batch_ID) VALUES ('Batch001')
> INSERT INTO BatchesClosed (Batch_ID) VALUES ('Batch002')
> INSERT INTO BatchesClosed (Batch_ID) VALUES ('Batch003')

> SELECT * FROM BatchesOpen
> ======================================

When an Open batch is processed, its record is appended to BatchesClosed and
deleted from Batches Open, via the following stored procedure:

AS
INSERT INTO BatchesClosed

1. Both tables as small or smaller than with any other option
2. Queries of closed or open batches are straightforward

Larry Calame

- Show quoted text -

Quote:> I'm still trying to learn about table design and wonder if someone could
> help me with what must be a common problem.

> In our analytical laboratory, samples for testing arrive in batches. Once
a
> batch is processed it is marked as closed. At many stages in our database,
> we would like to see lists of all open batches. So we have two options:
> (These two examples assume we have six batches and that the first three
are
> closed).

> 1. Use a 'Closed' Column in the Batch table
> ============================================
> CREATE TABLE Batches
> (
>  Batch_ID int IDENTITY(1,1) PRIMARY KEY,
>  BatchID char(10) NOT NULL,
>  Closed bit DEFAULT 0
> )

> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch001', 1)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch002', 1)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch003', 1)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch004', 0)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch005', 0)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch006', 0)

> SELECT * FROM Batches WHERE Closed = 0
> =============================================

> 2. Use a separate ClosedBatches  table
> =============================================
> DROP TABLE Batches
> GO
> CREATE TABLE Batches
> (
>  Batch_ID int IDENTITY(1,1) PRIMARY KEY,
>  BatchID char(10) NOT NULL
> )

> INSERT INTO Batches (BatchID) VALUES ('Batch001')
> INSERT INTO Batches (BatchID) VALUES ('Batch002')
> INSERT INTO Batches (BatchID) VALUES ('Batch003')
> INSERT INTO Batches (BatchID) VALUES ('Batch004')
> INSERT INTO Batches (BatchID) VALUES ('Batch005')
> INSERT INTO Batches (BatchID) VALUES ('Batch006')

> CREATE TABLE ClosedBatches
> (
>  ClosedBatch_ID int IDENTITY(1,1) PRIMARY KEY,
>  Batch_ID int REFERENCES Batches(Batch_ID)
> )

> INSERT INTO ClosedBatches (Batch_ID) VALUES (1)
> INSERT INTO ClosedBatches (Batch_ID) VALUES (2)
> INSERT INTO ClosedBatches (Batch_ID) VALUES (3)

> SELECT * FROM Batches WHERE Batch_ID NOT IN (SELECT Batch_ID
>  FROM ClosedBatches)
> ======================================

> I think that 2. must be a better method than 1. I'd like to know if, in
> fact, it is, and if there may be a third method that I haven't thought of.

> John WB

### Table Design question

John,

Basic rule of thumb ... ask 3 software developers for their opinions on almost
any topic, and expect at least 5 different answers ...

Personally, I like option #1.  Unless you expect to have huge numbers of rows,
I'd be surprised if you would have a significant performance issue by keeping
everything in one table. It just seems "cleaner" to me to have a single table
holding the data.

I would however suggest that you consider changing the Closed column from a bit
to either an int or a char (1). Somewhere, at sometime, someone is going to want
a third status ... Voided, Cancelled, Contaminated, something like that. An int
or char(1) gives you more flexibility.

Again, the above is my personal preference ... I have no intention of starting a
"holy" war over data modeling issues  :-)

---------------------------------------------------------------
BP Margolin
When posting, inclusion of SQL (CREATE TABLE ..., INSERT ..., etc) which can be
cut and pasted into Query Analyzer is appreciated.

Quote:> I'm still trying to learn about table design and wonder if someone could
> help me with what must be a common problem.

> In our analytical laboratory, samples for testing arrive in batches. Once a
> batch is processed it is marked as closed. At many stages in our database,
> we would like to see lists of all open batches. So we have two options:
> (These two examples assume we have six batches and that the first three are
> closed).

> 1. Use a 'Closed' Column in the Batch table
> ============================================
> CREATE TABLE Batches
> (
>  Batch_ID int IDENTITY(1,1) PRIMARY KEY,
>  BatchID char(10) NOT NULL,
>  Closed bit DEFAULT 0
> )

> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch001', 1)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch002', 1)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch003', 1)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch004', 0)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch005', 0)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch006', 0)

> SELECT * FROM Batches WHERE Closed = 0
> =============================================

> 2. Use a separate ClosedBatches  table
> =============================================
> DROP TABLE Batches
> GO
> CREATE TABLE Batches
> (
>  Batch_ID int IDENTITY(1,1) PRIMARY KEY,
>  BatchID char(10) NOT NULL
> )

> INSERT INTO Batches (BatchID) VALUES ('Batch001')
> INSERT INTO Batches (BatchID) VALUES ('Batch002')
> INSERT INTO Batches (BatchID) VALUES ('Batch003')
> INSERT INTO Batches (BatchID) VALUES ('Batch004')
> INSERT INTO Batches (BatchID) VALUES ('Batch005')
> INSERT INTO Batches (BatchID) VALUES ('Batch006')

> CREATE TABLE ClosedBatches
> (
>  ClosedBatch_ID int IDENTITY(1,1) PRIMARY KEY,
>  Batch_ID int REFERENCES Batches(Batch_ID)
> )

> INSERT INTO ClosedBatches (Batch_ID) VALUES (1)
> INSERT INTO ClosedBatches (Batch_ID) VALUES (2)
> INSERT INTO ClosedBatches (Batch_ID) VALUES (3)

> SELECT * FROM Batches WHERE Batch_ID NOT IN (SELECT Batch_ID
>  FROM ClosedBatches)
> ======================================

> I think that 2. must be a better method than 1. I'd like to know if, in
> fact, it is, and if there may be a third method that I haven't thought of.

> John WB

### Table Design question

John:

For what it's worth, I actually agree with BP.

I sugeested method #3 as addressing performance concerns better than method
#2, but at the expense of reducing the simplicity/ease of maintenance of
method #1.

Larry Calame

Quote:> I'm still trying to learn about table design and wonder if someone could
> help me with what must be a common problem.

> In our analytical laboratory, samples for testing arrive in batches. Once
a
> batch is processed it is marked as closed. At many stages in our database,
> we would like to see lists of all open batches. So we have two options:
> (These two examples assume we have six batches and that the first three
are
> closed).

> 1. Use a 'Closed' Column in the Batch table
> ============================================
> CREATE TABLE Batches
> (
>  Batch_ID int IDENTITY(1,1) PRIMARY KEY,
>  BatchID char(10) NOT NULL,
>  Closed bit DEFAULT 0
> )

> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch001', 1)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch002', 1)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch003', 1)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch004', 0)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch005', 0)
> INSERT INTO Batches (BatchID, Closed) VALUES ('Batch006', 0)

> SELECT * FROM Batches WHERE Closed = 0
> =============================================

> 2. Use a separate ClosedBatches  table
> =============================================
> DROP TABLE Batches
> GO
> CREATE TABLE Batches
> (
>  Batch_ID int IDENTITY(1,1) PRIMARY KEY,
>  BatchID char(10) NOT NULL
> )

> INSERT INTO Batches (BatchID) VALUES ('Batch001')
> INSERT INTO Batches (BatchID) VALUES ('Batch002')
> INSERT INTO Batches (BatchID) VALUES ('Batch003')
> INSERT INTO Batches (BatchID) VALUES ('Batch004')
> INSERT INTO Batches (BatchID) VALUES ('Batch005')
> INSERT INTO Batches (BatchID) VALUES ('Batch006')

> CREATE TABLE ClosedBatches
> (
>  ClosedBatch_ID int IDENTITY(1,1) PRIMARY KEY,
>  Batch_ID int REFERENCES Batches(Batch_ID)
> )

> INSERT INTO ClosedBatches (Batch_ID) VALUES (1)
> INSERT INTO ClosedBatches (Batch_ID) VALUES (2)
> INSERT INTO ClosedBatches (Batch_ID) VALUES (3)

> SELECT * FROM Batches WHERE Batch_ID NOT IN (SELECT Batch_ID
>  FROM ClosedBatches)
> ======================================

> I think that 2. must be a better method than 1. I'd like to know if, in
> fact, it is, and if there may be a third method that I haven't thought of.

> John WB

### Table Design question

Thanks Larry, I'm going to stick with option #1 for now, as you and BP have
suggested. If performance ever becomes in issue, (which I doubt), I like
your elegant way of dealing with it. You could even modify it by turning it
into a trigger. Keep the 'Closed' column in the table, and when it changes
from 0 to 1 have the trigger update the Open and Closed tables.

> John:

> For what it's worth, I actually agree with BP.

> I sugeested method #3 as addressing performance concerns better than
method
> #2, but at the expense of reducing the simplicity/ease of maintenance of
> method #1.

> Larry Calame

> > I'm still trying to learn about table design and wonder if someone could
> > help me with what must be a common problem.

> > In our analytical laboratory, samples for testing arrive in batches.
Once
> a
> > batch is processed it is marked as closed. At many stages in our
database,
> > we would like to see lists of all open batches. So we have two options:
> > (These two examples assume we have six batches and that the first three
> are
> > closed).

> > 1. Use a 'Closed' Column in the Batch table
> > ============================================
> > CREATE TABLE Batches
> > (
> >  Batch_ID int IDENTITY(1,1) PRIMARY KEY,
> >  BatchID char(10) NOT NULL,
> >  Closed bit DEFAULT 0
> > )

> > INSERT INTO Batches (BatchID, Closed) VALUES ('Batch001', 1)
> > INSERT INTO Batches (BatchID, Closed) VALUES ('Batch002', 1)
> > INSERT INTO Batches (BatchID, Closed) VALUES ('Batch003', 1)
> > INSERT INTO Batches (BatchID, Closed) VALUES ('Batch004', 0)
> > INSERT INTO Batches (BatchID, Closed) VALUES ('Batch005', 0)
> > INSERT INTO Batches (BatchID, Closed) VALUES ('Batch006', 0)

> > SELECT * FROM Batches WHERE Closed = 0
> > =============================================

> > 2. Use a separate ClosedBatches  table
> > =============================================
> > DROP TABLE Batches
> > GO
> > CREATE TABLE Batches
> > (
> >  Batch_ID int IDENTITY(1,1) PRIMARY KEY,
> >  BatchID char(10) NOT NULL
> > )

> > INSERT INTO Batches (BatchID) VALUES ('Batch001')
> > INSERT INTO Batches (BatchID) VALUES ('Batch002')
> > INSERT INTO Batches (BatchID) VALUES ('Batch003')
> > INSERT INTO Batches (BatchID) VALUES ('Batch004')
> > INSERT INTO Batches (BatchID) VALUES ('Batch005')
> > INSERT INTO Batches (BatchID) VALUES ('Batch006')

> > CREATE TABLE ClosedBatches
> > (
> >  ClosedBatch_ID int IDENTITY(1,1) PRIMARY KEY,
> >  Batch_ID int REFERENCES Batches(Batch_ID)
> > )

> > INSERT INTO ClosedBatches (Batch_ID) VALUES (1)
> > INSERT INTO ClosedBatches (Batch_ID) VALUES (2)
> > INSERT INTO ClosedBatches (Batch_ID) VALUES (3)

> > SELECT * FROM Batches WHERE Batch_ID NOT IN (SELECT Batch_ID
> >  FROM ClosedBatches)
> > ======================================

> > I think that 2. must be a better method than 1. I'd like to know if, in
> > fact, it is, and if there may be a third method that I haven't thought
of.

> > John WB

### Table Design question

question, and I appreciate getting opinions on something like this. It's one
of those nagging questions that keep you awake at night, wondering if your
database will work just fine or if, in fact, you have built a monster that
would give Joe Celko apoplexy.

After I wrote the post I tested each scenario with 10,000 records and
couldn't detect a difference in speed, so I'm going to stick to option #1,
and change data types as you suggested.

Once again, thanks.

> John,

> Basic rule of thumb ... ask 3 software developers for their opinions on
almost
> any topic, and expect at least 5 different answers ...

> Personally, I like option #1.  Unless you expect to have huge numbers of
rows,
> I'd be surprised if you would have a significant performance issue by
keeping
> everything in one table. It just seems "cleaner" to me to have a single
table
> holding the data.

> I would however suggest that you consider changing the Closed column from
a bit
> to either an int or a char (1). Somewhere, at sometime, someone is going
to want
> a third status ... Voided, Cancelled, Contaminated, something like that.
An int
> or char(1) gives you more flexibility.

> Again, the above is my personal preference ... I have no intention of
starting a
> "holy" war over data modeling issues  :-)

> ---------------------------------------------------------------
> BP Margolin
> When posting, inclusion of SQL (CREATE TABLE ..., INSERT ..., etc) which
can be
> cut and pasted into Query Analyzer is appreciated.

> > I'm still trying to learn about table design and wonder if someone could
> > help me with what must be a common problem.

> > In our analytical laboratory, samples for testing arrive in batches.
Once a
> > batch is processed it is marked as closed. At many stages in our
database,
> > we would like to see lists of all open batches. So we have two options:
> > (These two examples assume we have six batches and that the first three
are
> > closed).

> > 1. Use a 'Closed' Column in the Batch table
> > ============================================
> > CREATE TABLE Batches
> > (
> >  Batch_ID int IDENTITY(1,1) PRIMARY KEY,
> >  BatchID char(10) NOT NULL,
> >  Closed bit DEFAULT 0
> > )

> > INSERT INTO Batches (BatchID, Closed) VALUES ('Batch001', 1)
> > INSERT INTO Batches (BatchID, Closed) VALUES ('Batch002', 1)
> > INSERT INTO Batches (BatchID, Closed) VALUES ('Batch003', 1)
> > INSERT INTO Batches (BatchID, Closed) VALUES ('Batch004', 0)
> > INSERT INTO Batches (BatchID, Closed) VALUES ('Batch005', 0)
> > INSERT INTO Batches (BatchID, Closed) VALUES ('Batch006', 0)

> > SELECT * FROM Batches WHERE Closed = 0
> > =============================================

> > 2. Use a separate ClosedBatches  table
> > =============================================
> > DROP TABLE Batches
> > GO
> > CREATE TABLE Batches
> > (
> >  Batch_ID int IDENTITY(1,1) PRIMARY KEY,
> >  BatchID char(10) NOT NULL
> > )

> > INSERT INTO Batches (BatchID) VALUES ('Batch001')
> > INSERT INTO Batches (BatchID) VALUES ('Batch002')
> > INSERT INTO Batches (BatchID) VALUES ('Batch003')
> > INSERT INTO Batches (BatchID) VALUES ('Batch004')
> > INSERT INTO Batches (BatchID) VALUES ('Batch005')
> > INSERT INTO Batches (BatchID) VALUES ('Batch006')

> > CREATE TABLE ClosedBatches
> > (
> >  ClosedBatch_ID int IDENTITY(1,1) PRIMARY KEY,
> >  Batch_ID int REFERENCES Batches(Batch_ID)
> > )

> > INSERT INTO ClosedBatches (Batch_ID) VALUES (1)
> > INSERT INTO ClosedBatches (Batch_ID) VALUES (2)
> > INSERT INTO ClosedBatches (Batch_ID) VALUES (3)

> > SELECT * FROM Batches WHERE Batch_ID NOT IN (SELECT Batch_ID
> >  FROM ClosedBatches)
> > ======================================

> > I think that 2. must be a better method than 1. I'd like to know if, in
> > fact, it is, and if there may be a third method that I haven't thought
of.

> > John WB

### Table Design question

On Sat, 24 Jun 2000 16:27:28 -0400, "BPMargolin"

[snip]

Quote:>Personally, I like option #1.

[snip]

Me too.

Quote:>I would however suggest that you consider changing the Closed column from a bit
>to either an int or a char (1).

[snip]

I agree.  And I'd consider making that column Status instead.  I'd
also build a related table for Status now rather than later,
especially if I were using an int.

--
Mike Sherrill
Information Management Systems

I have a number of code tables (20+) that all have the same structure
(code - (char(3)), description(char 50), etc.).  What is more advantageous:

1) mutiple code tables
2) a single code table that combines all code values.  I would have to add
another column "TYPE" to uniquely identify the code type.

I am trying to look at it from a database performance perspective, which I
think lends itself to the "Single Code Table" solution.  The number of
tables is obviously reduced as well as the number of indices and the space
required for each.   Where is my logic incorrect?

From an ease of use standpoint, I believe I would have to create individual
views based on the type of code (for application drop-down purposes)  but I
would only have to build one maintenance facility for the table.

Any help would be appreciated.
Response by e-mail also appreciated.

Steve Predenkoski

CONVISTA, Inc.
(888)275-4477