If your data fit into the star schema, RedBrick is the answer.
I've been DBA'ing Sybase DBs at 40+ GB for years. To quote the original post:
>We are in the process of selecting a database for our data warehouse. It is
>expected to start at 50GB of data with 50+ users and grow over the next 2
>years to be 500+GB of data and 500+ users. Much of the data will be obtained
>from our IBM mainframe ADABAS operational databases on a daily basis. We
>are looking at Sybase, Informix and Oracle. Does anyone have experiences
>(good or bad) that might help with our decision. In particular, how do
>these products scale through this type of growth. Thanks
Sybase scales well for a well-constructed query, and I'm a big fan of the
product. But anytime you're getting into stuff that big you have some huge
operational concerns. To put it bluntly: the scalability of your queries is
going to be the very least of your concerns.
I know that Sybase can work, and work well, at those sizes, but you need to
architect it well. To me, at that size, that means a number of machines and
many, many small (< 5 GB) DBs. Your queries are also going to have to be
tolerant of physical i/o.
E-mail me for more discussion.
>Don't take this discussion out-of-band! This sort of thread is just
>what's needed in this newsgroup.
Here's a couple of things I've noticed:
1. Monolithic tables can scale well on queries if the key is
sufficiently small, unique and well behaved. As in:
a table with 100,000 rows and a key of 12 bytes takes somewhere
around 4 reads. The same key on the same table can support 4.5 million
rows with the same number of i/o's. Adding another i/o can increase
that to about 3/4 of a billion.
2. However, one wrong move on a monolithic table (can you say "table
scan") can flush your entire cache and set you back a bejillion cycles.
So, IMHO, it is better to divide stuff up into lots of small tables (on
some key portion which is easily divisible), take an extra i/o or
two, and do it in a two-stage query. This is less efficient and a bit
more complex on the code end of things, but oh so much safer.
Besides, rebuilding a corrupt 5 GB table is more frustrating than
trying to suck a McDonald's shake through the straw they give you.
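One way to read the two-stage idea: stage one computes which small table the key lives in, stage two queries only that table. Here's a minimal sketch of the routing step — the table names, partition count, and hash-on-key split are all made up for illustration:

```python
N_TABLES = 16   # hypothetical number of small tables

def table_for(cust_id: int) -> str:
    """Stage 1: route a key to its small table (hash partitioning).
    Table naming scheme is hypothetical."""
    return f"orders_{cust_id % N_TABLES:02d}"

def stage_two_sql(cust_id: int) -> str:
    """Stage 2: the query then touches exactly one small table."""
    return f"SELECT * FROM {table_for(cust_id)} WHERE cust_id = {cust_id}"

print(stage_two_sql(4711))  # routes to orders_07 only
```

The safety payoff is the point made above: a botched query can only table-scan one small table's pages, not blow out the whole cache.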
Attributes usually fall into 2 camps:
1. Often read/changed.
2. Rarely read/changed.
It is often beneficial to place class 1 attributes in a separate table and only access the class 2 attributes
when required. The benefit here comes from the greater number of rows per page in the class 1 table.
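The rows-per-page benefit is easy to quantify. A sketch with assumed row widths (neither figure is from the post): if the full row is 200 bytes but the key plus the often-accessed columns fit in 40 bytes, the narrow table packs about 5x more rows per 2 KB page, so scanning the hot attributes costs about 5x fewer reads.

```python
PAGE_SIZE = 2048   # classic Sybase page size
FULL_ROW = 200     # assumed width of the full row, in bytes
HOT_ROW = 40       # assumed width of key + often-used columns
ROWS = 1_000_000

def pages_needed(row_width: int, rows: int = ROWS) -> int:
    """Rows don't span pages, so per-page capacity is
    floor(page / row); total pages is the ceiling of rows / capacity."""
    rows_per_page = PAGE_SIZE // row_width
    return -(-rows // rows_per_page)   # ceiling division

print(pages_needed(FULL_ROW))  # 100,000 pages for the wide table
print(pages_needed(HOT_ROW))   # 19,608 pages for the narrow one
```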