Design Strategy Question

Post by Tim Klu » Thu, 23 Apr 1998 04:00:00



Our system has several tables which, when normalized, have a large number
of foreign keys. Given the limit of 16 joins in a single query, we have had
a difficult time creating queries and/or stored procedures to retrieve
records efficiently and quickly. When we are required to retrieve many
records, performance deteriorates rapidly.

Has anyone had experience dealing with query result sets that depend on a
large number of joins? If so, what design / implementation solutions have
worked best? I would prefer not to denormalize for a number of reasons,
however some of our performance issues are fairly significant.

 
 
 

Design Strategy Question

Post by Greg Michelso » Thu, 23 Apr 1998 04:00:00


Given your restrictions, there are only a couple of things I can think of:

1.  Use views
2.  Construct temporary tables (possibly every day, with the ## global option)
and pre-join the most important tables - the ones with single-column primary
keys, for example.
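
A minimal sketch of option 2, assuming hypothetical Orders, Customers, and
Products tables; the idea is to pre-join the busiest tables into a global
temporary table once a day so the reporting queries themselves need far
fewer joins:

-- Hypothetical table and column names, for illustration only.
-- Pre-join the most heavily used tables into a global temp table (##)
-- once a day; reports then query ##OrderSummary instead of repeating
-- these joins.
SELECT o.OrderID,
       o.OrderDate,
       c.CustomerName,
       p.ProductName
INTO   ##OrderSummary
FROM   Orders o
JOIN   Customers c ON c.CustomerID = o.CustomerID
JOIN   Products  p ON p.ProductID  = o.ProductID

-- Index the columns the reports filter on.
CREATE INDEX ix_OrderSummary_OrderDate ON ##OrderSummary (OrderDate)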

 
 
 

Design Strategy Question

Post by Roy Harv » Thu, 23 Apr 1998 04:00:00


Greg,

Quote:>1.  Use views

Views do NOT get around the SQL Server limit of 16 tables in one
query.

Roy

 
 
 

Design Strategy Question

Post by Greg Michelso » Thu, 23 Apr 1998 04:00:00


true, but I believe they are faster in some situations than ad hoc queries

 
 
 

Design Strategy Question

Post by Greg » Fri, 24 Apr 1998 04:00:00


You have run smack into a common decision support system issue.

Since you do not want to denormalize, it sounds like the data is your
original transaction source (OLTP).  You are in need of a data warehouse,
which prepares the data for decision support.  Optimizing for transaction
processing and for analytical processing are different problems.

During the data preparation for the warehouse, you may denormalize some
fields that are used often in queries (this is a point of contention...many
argue for a normalized warehouse).  You can also be more aggressive about
adding indexes to the warehouse, since you are not concerned about slowing
down end users who are adding records (when you load the data warehouse,
drop all indexes first, add your data, then re-add the indexes).

You can schedule the data warehouse load to run each night (assuming your
hardware can process your data that quickly).
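
A minimal sketch of that nightly load, with hypothetical warehouse table,
source table, and index names; the pattern is drop the indexes, insert the
new data, then rebuild the indexes:

-- Hypothetical table, column, and index names, for illustration only.
-- Nightly warehouse load: drop indexes so the insert runs fast,
-- load the new rows, then rebuild the indexes for query speed.
DROP INDEX SalesFact.ix_SalesFact_OrderDate

INSERT INTO SalesFact (OrderID, OrderDate, CustomerName, Amount)
SELECT o.OrderID, o.OrderDate, c.CustomerName, o.Amount
FROM   Orders o
JOIN   Customers c ON c.CustomerID = o.CustomerID
WHERE  o.OrderDate >= DATEADD(day, -1, GETDATE())

CREATE INDEX ix_SalesFact_OrderDate ON SalesFact (OrderDate)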

Hope this helps,
Greg
APS

>Our system has several tables which, when normalized, have a large number
>of foreign keys. Given the limit of 16 joins in a single query, we have had
>a difficult time creating queries and/or stored procedures to retrieve
>records efficiently and quickly. When we are required to retrieve many
>records, performance deteriorates rapidly.
>
>Has anyone had experience dealing with query result sets that depend on a
>large number of joins? If so, what design / implementation solutions have
>worked best? I would prefer not to denormalize for a number of reasons,
>however some of our performance issues are fairly significant.
>

 
 
 

Design Strategy Question

Post by Roy Harv » Fri, 24 Apr 1998 04:00:00


Greg,

Quote:>true, but I believe they are faster in some situations than ad hoc queries

At least in SQL Server, views are performance neutral.  When the SQL
command that references a view is compiled, the SELECT in the view is
combined with the SQL command using the view into a single SQL
command.  This is optimized the same way it would have been if that
combined command had been entered in the first place.
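
A small example of what that expansion means in practice, with hypothetical
view and table names:

-- Hypothetical view, table, and column names, for illustration only.
CREATE VIEW CustomerOrders AS
SELECT c.CustomerID, c.CustomerName, o.OrderID, o.OrderDate
FROM   Customers c
JOIN   Orders o ON o.CustomerID = c.CustomerID
GO

-- A query against the view...
SELECT * FROM CustomerOrders WHERE OrderDate >= '19980101'

-- ...is compiled and optimized as if it had been written directly:
SELECT c.CustomerID, c.CustomerName, o.OrderID, o.OrderDate
FROM   Customers c
JOIN   Orders o ON o.CustomerID = c.CustomerID
WHERE  o.OrderDate >= '19980101'

Note that both base tables still count toward the 16-table limit, which is
why a view cannot work around it.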

Roy

 
 
 

Design Strategy Question

Post by dougde » Sat, 25 Apr 1998 04:00:00


My two cents: views are not easy to manage in a production environment if
you are making code/structure changes on a regular basis.  Who will ensure
they are re-compiled?  Who will check dependencies?  It is so easy to forget
to verify these things when you add a column, change a key, or add or modify
a structure.  I'd only use views as a last resort.
doug

 
 
 

Design Strategy Question

Post by Diederik Dellae » Sat, 25 Apr 1998 04:00:00


On Wed, 22 Apr 1998 13:46:44 -0700, "Greg Michelson"

Is there a way to do pre-joins in SQL Server, or is that why you
create the temporary tables?

Quote:>Given your restrictions, there are only a couple of things I can think of:

>1.  Use views
>2.  Construct temporary tables (possibly every day, with the ## global option)
>and pre-join the most important tables - the ones with single-column primary
>keys, for example.

 
 
 

Design Strategy Question

Post by Samuel Mese » Sun, 26 Apr 1998 04:00:00


My suggestion is:

1. If your database is not set to "truncate log on checkpoint", create a
new database just to process this kind of query, and leave enough space to
hold "reports" for all users.
The point of the "truncate log on checkpoint" flag is simply to avoid having
to dump transactions in the production database.

2. Create one table with all of the columns you need in the usual query. If
the purpose of the select is just to present data to the end user, make all
of the columns allow NULL.

3. Add another column to hold the user ID.

4. Create one stored procedure for each user select, with one parameter for
each criterion the select uses; the last parameter is the user ID.

In the stored procedure, issue an INSERT ... SELECT from the table that most
of the data will come from, and fill the USERID column with the stored
procedure parameter.

Then issue UPDATEs against that table to fill the other columns that belong
to other tables. Don't forget to include the columns that will be used as
primary keys so you can join to the other tables, and do not forget to
restrict the WHERE clause of each UPDATE by USERID.

Item 1 matters here: if the database does not have that option set, every
UPDATE will fill the transaction log.

5. The stored procedure then issues a SELECT * (or whatever columns interest
the end user), again filtered by USERID.

6. The last step in the stored procedure is to DELETE all data for that
USERID.

I've been using this approach to prepare complex reports and it's very fast.
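
A minimal sketch of the approach, assuming hypothetical table, column, and
procedure names (ReportStage, Orders, Customers, usp_BuildReport); every
statement is scoped by USERID so concurrent users do not collide:

-- Hypothetical table, column, and procedure names, for illustration only.
-- Shared staging table for report rows, keyed by USERID.
CREATE TABLE ReportStage (
    USERID       int          NOT NULL,
    OrderID      int          NULL,
    CustomerID   int          NULL,
    CustomerName varchar(60)  NULL,
    Amount       money        NULL
)
GO

CREATE PROCEDURE usp_BuildReport
    @StartDate datetime,
    @UserID    int
AS
    -- Step 4: insert the driving rows, stamped with the caller's user ID.
    INSERT INTO ReportStage (USERID, OrderID, CustomerID, Amount)
    SELECT @UserID, o.OrderID, o.CustomerID, o.Amount
    FROM   Orders o
    WHERE  o.OrderDate >= @StartDate

    -- Fill in columns that come from other tables, restricted to this user.
    UPDATE r
    SET    CustomerName = c.CustomerName
    FROM   ReportStage r
    JOIN   Customers c ON c.CustomerID = r.CustomerID
    WHERE  r.USERID = @UserID

    -- Step 5: return the finished rows for this user.
    SELECT * FROM ReportStage WHERE USERID = @UserID

    -- Step 6: clean up this user's rows.
    DELETE FROM ReportStage WHERE USERID = @UserID
GO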

Good Luck

 
 
 

Design Strategy Question

Post by mashed potatoe » Thu, 30 Apr 1998 04:00:00


As I understand the documentation, there is no hard limit on joins, but
there is a practical limit, meaning that performance will degrade severely
after about 16 joins. At least, that is my understanding.

You might consider using a denormalized copy of the data. Run a batch at
night to denormalize the data into OLAP tables, then run your queries
against the denormalized tables.

Hope that helps
Michael

 
 
 

Design Strategy Question

Post by Luciano Spigue » Thu, 30 Apr 1998 04:00:00


No, it will not just degrade: 16 joins IS the limit (see the BOL, SELECT
statement).

 
 
 

1. Backup and Recovery strategy and DB design

Hi all,

I need some help/ opinions/ suggestions for the following  matter:

We are expecting the size of our database to be about 200 GB.

The current design is:
4 tablespaces, each tablespace has one datafile

Tablespace no 1:
Cmacdata contains about 160 tables
Size: 5 GB

Tablespace no 2:
CmacIndex contains indexes for the 160 tables
Size: 2 GB

Tablespace no 3:
CmacMeter contains 3 big tables, each one partitioned by month
Size: 10 GB

Tablespace no 4:
CmacMeterIndex contains indexes for the 3 big tables (they are
partitioned too)
Size: 4 GB

Is this a good design, considering database performance and backup
and recovery performance?

Isn't it better to have as many datafiles as possible, especially for the
partitioning? Isn't it better to have a separate datafile for each month
(partition)? And if so, how is that going to affect database performance,
backup, and recovery?

Also, for the backup and recovery strategy the plan is:
The database is in archive mode.
Every day we will back up the transaction files (logfiles).
Every week we will back up the whole database (I don't know how long it
is going to take); we can do the backup for tablespace no 1 on
Saturday and for tablespace no 2 on Sunday, and so on.
Now, is this a good plan? If yes or no, why?
And how about incremental backups: incremental levels, non-cumulative
and cumulative?

One last thing :

Our business requirements say that the data should be available for 7
years. Now, because I don't want the database and datafiles to explode,
I think (I might be wrong) I need to back up the data for each year and
save it to tape, then remove that year's data from the database
using delete from table where ..., and if people need the data I
have to restore it again.

Firstly, I don't know if this logic is good or bad.
Secondly, if it is OK, how can I implement it?

I really appreciate your help,
Thanks in advance
Regards,
Khaled


2. Oracle 7 used book

3. table design strategy

4. locked oracle.jdbc.driver.OraclePreparedStatement

5. Replication Strategy Question

6. OQY file format

7. More Questions on Encapsulation Strategy...

8. Retrieve records via a SP using ADO

9. MSSQL Backup Strategy Question?

10. Text file import strategy - DTS newbie question

11. Strategy question - DTS, .NET, OLAP and how to build our data warehouse

12. Question: Sybase vs Oracle replication strategy

13. Workday mapping strategy question.