If your data fit into the star schema, RedBrick is the answer.
I've been DBA'ing Sybase DBs at 40+ GB for years, and I am getting
ready to...

>We are in the process of selecting a database for our data warehouse. It is
>expected to start at 50GB of data with 50+ users and grow over the next 2
>years to be 500+GB of data and 500+ users. Much of the data will be obtained
>from our IBM mainframe ADABAS operational databases on a daily basis. We
>are looking at Sybase, Informix and Oracle. Does anyone have experiences
>(good or bad) that might help with our decision? In particular, how do
>these products scale through this type of growth? Thanks
Sybase scales well for a well-constructed query, and I'm a big fan of the
product. But any time you're getting into stuff that big, you have some huge
operational concerns. To put it bluntly: the scalability of your queries is
going to be the very least of your concerns.
I know that Sybase can work and work well at those sizes, but you need to
architect it well. To me, at that size, that means a number of machines and
many, many small (< 5 GB) DBs. Your queries are also going to have to be
tolerant of physical I/O.
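For illustration, splitting one big fact table by key range across several
small databases might look like this (all names and columns here are made
up):

    -- one logical "calls" table split by month, each month's piece
    -- living in its own small (< 5 GB) database
    use calls_9501
    go
    create table calls (
        call_id  int      not null,
        subsc_id int      not null,
        call_dt  datetime not null,
        duration int      not null
    )
    go

    use calls_9502
    go
    create table calls (
        call_id  int      not null,
        subsc_id int      not null,
        call_dt  datetime not null,
        duration int      not null
    )
    go

    -- queries then name the piece they need instead of scanning
    -- one monolithic table, e.g.
    select sum(duration) from calls_9501..calls where subsc_id = 12345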
E-mail me for more discussion.
Topher
Don't take this discussion out-of-band! This sort of thread is just what we
need in this newsgroup.
--
Bob Lunney | Building tomorrow's legacy
Imonics, Inc. | systems today. (tm)
>Don't take this discussion out-of-band! This sort of thread is just what we
>need in this newsgroup.
Here are a couple of things I've noticed:
1. Monolithic tables can scale well on queries if the key is
sufficiently small, unique, and well behaved. For instance, a table
with 100,000 rows and a 12-byte key takes somewhere around 4 reads.
The same key on the same table can support 4.5 million rows with the
same number of I/Os. Adding another I/O can increase that to about
3/4 of a billion. (The arithmetic is sketched below, after point 2.)
2. However, one wrong move on a monolithic table (can you say "table
scan") can flush your entire cache and set you back a bejillion cycles.
So, IMHO, it is better to divide stuff up into lots of small tables (on
some key portion which is easily dividable) and take an extra I/O or two
and do it in a two-stage query, as sketched below. This is less efficient
and a bit more complex on the code end of things, but oh so much safer.
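A two-stage query of that sort might look something like this (the
directory table and all names here are mine, purely to illustrate):

    -- stage 1: a small directory table says which physical table
    -- holds a given key range
    select  tbl_name
    from    key_directory
    where   @subsc_id between lo_key and hi_key

    -- stage 2: hit the small table the directory pointed at,
    -- e.g. subscriber_07
    select  *
    from    subscriber_07
    where   subsc_id = @subsc_id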
Besides, rebuilding a corrupt 5 GB table is more frustrating than
trying to suck a McDonald's shake through the straw they give you.
Topher
Attributes usually fall into 2 camps:
1. Often read/changed.
2. Rarely read/changed.

Example: table subscriber.

Class 1:
    subsc_id
    subsc_nme
    ...

Class 2:
    subsc_mothers_maiden_name

It is often beneficial to place the class 1 attributes in a separate table
and only access the class 2 attributes when required. The benefit here comes
from the greater number of rows per page in the class 1 table.
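A sketch of that split (column types are my guesses; only subsc_id,
subsc_nme, and subsc_mothers_maiden_name come from the example above):

    -- class 1: hot columns, so many rows fit per page
    create table subscriber (
        subsc_id  int         not null,
        subsc_nme varchar(40) not null
        -- ... other frequently read/changed columns
    )

    -- class 2: cold columns, fetched on demand via the shared key
    create table subscriber_rare (
        subsc_id                  int         not null,
        subsc_mothers_maiden_name varchar(40) null
    )

    -- join in the class 2 attributes only when required
    select  s.subsc_nme, r.subsc_mothers_maiden_name
    from    subscriber s, subscriber_rare r
    where   s.subsc_id = r.subsc_id
    and     s.subsc_id = @subsc_id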
Gus Fraser.