Unix machines for large databases

Post by Bill » Wed, 02 May 1990 04:44:00



Help! By Friday we need to know if there is a Unix-based box that
can work as a very high-performance data-base server.  Yes, there are a
million Unix boxes out there, but a data-base server has to be able to
cope with concurrent access by multiple (possibly several hundred)
users.  Simple file locking isn't good enough -- users should never
have to read "file locked" error messages.  Also, the disk performance
should be very good.

What is the nature of the data base?  Would you believe we're not
sure?  The amount of data will probably be very large, and it may or
may not be based on the relational model.  Why am I asking such a
vague question?  Because we are trying to develop corporate funding
for a project that is still in the jello stage of conception (you can
see it, and it glistens, but you still can't get a good grip on it).
What we want to do is to be able to talk about existing Unix-based
solutions to very large data sharing problems.

Names we know about:
 Gould -- fast disks, but is there data-sharing software?
 Most other Big Unix Boxes in the world -- ditto above comment.
 Tandem -- is this Unix based?
 Stratus -- is this Unix based,  is this a good database machine?

I'm nervous about posting this, because I expect every Unix box maker
to tell me about their great machines.  That's fine, but are there
proven very-large data-sharing/data-base applications on those
machines?  Are the disks very high performance?  SCSI ports probably
won't hack it.   We will consider both uniprocessor and multiprocessor
solutions.  Non-Unix solutions, while interesting, are not the topic
of this posting.  Please mail to me, don't post.  If others express an
interest, and if the response is informative, I'll post a summary.

Bill O'Farrell, Northeast Parallel Architectures Center at Syracuse University

 
 
 

Unix machines for large databases

Post by Philip K » Wed, 02 May 1990 22:21:00


Bill -

Please get in touch with me if you can.  The return mail path I got
out of news was completely worthless, as usual, and I don't want to
clutter the net with this stuff.  You should be able to get in touch
with me through the uucp paths in my signature - they're the only way,
as far as I know, to get here from anywhere else.

BTW, in case you couldn't guess, we're doing gigabyte-database OLTP
applications on UNIX boxes here at Hopkins...

                                                                 Phil Kos
...!decvax!decuac!\                                   Information Systems
  ...!uunet!mimsy!aplcen!osiris!phil           The Johns Hopkins Hospital
...!allegra!/                                               Baltimore, MD

 
 
 

Unix machines for large databases

Post by Mark P. Diamo » Wed, 02 May 1990 05:05:00



> Help! By Friday we need to know if there is a Unix-based box that
> can work as a very high-performance data-base server.  

Take a look at the Sequent Symmetry.  This tightly coupled
UNIX multiple processor is an optimum machine for running
Relational Databases.  In a recent project with Relational
Technology a six* processor Symmetry achieved 104 Transactions
per second running the Debit Credit Benchmark (TP1) on
a fully sized 1.1G Byte database, at a price/performance ratio
of about 1/8 that of Tandem  (all of this was verified by
the independent Codd & Date consulting group).  This is the
fastest (by a factor of about three) any UNIX box has achieved
for this benchmark.  RTI, Oracle, Informix and Unify run Sequent
for their in-house applications.

Mark <>

PS  If anyone would like a full write up of this project send me
your postal address.
<>                                  <>                                  <>
Mark P. Diamond                  {sun, cbosgd, amdahl, mtxinu}!rtech!markd
from Sequent Computer Systems onsite at Relational Technology

 
 
 

Unix machines for large databases

Post by Stephen Samu » Wed, 02 May 1990 12:12:00



> Help! By Friday we need to know if there is a Unix-based box that
> can work as a very high-performance data-base server.  Yes, there are a

I think (from the propaganda I've heard) that something like Oracle might sorta
fit your bill. One of the ways that they do this is by use of raw disk I/O
rather than putting the data base into the filesystem space.
  I assume that SMD drives are fast enough for you?

--
-------------
 Stephen Samuel                         Disclaimer: You betcha!
  {ihnp4,ubc-vision,seismo!mnetor,vax135}!alberta!edm!steve

 
 
 

Unix machines for large databases

Post by Lee Sail » Wed, 02 May 1990 21:16:00



>Help! By Friday we need to know if there is a Unix-based box that
>can work as a very high-performance data-base server.  Yes, there are a

Sure there is.  You can run Unix on a Cray.
 
 
 

Unix machines for large databases

Post by Eric Berg » Thu, 03 May 1990 00:36:00




>> Help! By Friday we need to know if there is a Unix-based box that
>> can work as a very high-performance data-base server.  

>Take a look at the Sequent Symmetry.  This tightly coupled
>UNIX multiple processor is an optimum machine for running
>Relational Databases.

        Rather a sweeping statement...

>In a recent project with Relational
>Technology a six* processor Symmetry achieved 104 Transactions
>per second running the Debit Credit Benchmark (TP1) on
>a fully sized 1.1G Byte database, at a price/performance ratio
>of about 1/8 that of Tandem  (all of this was verified by
>the independent Codd & Date consulting group).  This is the  
>fastest (by a factor of about three) any UNIX box has achieved
>for this benchmark.

        Just to make sure we are comparing apples and apples here, I'm
a little surprised by the 1.1 Gbyte figure. How many tuples were you running
with in account? The "standard" 1,000,000 (with 1000 teller and 100 branch
tuples) or did you scale it up? If scaled up, it is probably not appropriate
to compare against any other tests that have been run, since the decreased
contention on the branch relation will improve performance. Did you run
with just one history relation, or did you split that up, and if so,
into how many pieces? I assume that this was with journaling turned on?
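
The contention argument can be made concrete with a rough sketch in Python. It assumes, purely for illustration, that each in-flight transaction picks its branch uniformly at random and that any two transactions on the same branch row conflict; the function name and the figure of 10 concurrent transactions are hypothetical, not taken from any benchmark run.

```python
def branch_collision_probability(in_flight, branches):
    """Chance that at least two of `in_flight` concurrent transactions
    touch the same branch row, assuming uniform random branch choice."""
    if in_flight > branches:
        return 1.0  # more transactions than rows: a clash is certain
    p_all_distinct = 1.0
    for i in range(in_flight):
        p_all_distinct *= (branches - i) / branches
    return 1.0 - p_all_distinct


# Compare a 100-branch database with one scaled up to 1,000 branches.
for branches in (100, 1000):
    p = branch_collision_probability(10, branches)
    print(f"{branches:5d} branches: P(conflict) = {p:.3f}")
```

Under these assumptions the scaled-up database sees noticeably fewer branch conflicts for the same number of concurrent transactions, which is why results from differently sized databases are hard to compare.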

>RTI, Oracle, Informix and Unify run Sequent
>for their in-house applications.

        In general, claiming that a database vendor is using one particular
platform or another for in-house applications is similar to claiming
that Bell Labs has bought your computer - it's not a very exclusive
club. Oracle has purchased several Pyramid 9840s for their world-wide
sales applications. Several of the database companies use Pyramids
for their file servers.

        Obviously I am biased, but claiming that any computer is "optimum"
for so broad a range of possible uses as relational database applications
seems a little questionable.

 
 
 

Unix machines for large databases

Post by G.Pavl » Wed, 02 May 1990 01:55:00



> ...................... In a recent project with Relational
> Technology a six* processor Symmetry achieved 104 Transactions
> per second running the Debit Credit Benchmark (TP1) on
> a fully sized 1.1G Byte database, at a price/performance ratio
> of about 1/8 that of Tandem  (all of this was verified by
> the independent Codd & Date consulting group). ................

> PS  If anyone would like a full write up of this project send me
> your postal address.

  I do not doubt the benchmark and I would appreciate a copy of the write-up.
  But there has been a working relationship of one sort or another between
  the C&D group and RTI for a long time.  So "independent" is overstating
 things a bit......
 
 
 

Unix machines for large databases

Post by G.Pavl » Wed, 02 May 1990 14:06:00




> > Help! By Friday we need to know if there is a Unix-based box that
> > can work as a very high-performance data-base server.  Yes, there are a

> I think (from the propaganda I've heard) that something like Oracle might sorta
> fit your bill. One of the ways that they do this is by use of raw disk I/O
> rather than putting the data base into the filesystem space.

  But using raw disk I/O per se doesn't guarantee anything, does it?  I think
  that the most relevant part of your message was the phrase in the parens.

    greg pavlov, fstrf, amherst, ny

 
 
 

Unix machines for large databases

Post by Elliott S. Fra » Wed, 02 May 1990 18:52:00




>>Help! By Friday we need to know if there is a Unix-based box that
>>can work as a very high-performance data-base server.  Yes, there are a

>Sure there is.  You can run Unix on a Cray.

Or on an Amdahl 5890/5990 running UTS.  You may have a floor space problem
past several Tb.
--
Elliott Frank      ...!{hplabs,ames,sun}!amdahl!esf00     (408) 746-6384
               or ....!{bnrmtv,drivax,hoptoad}!amdahl!esf00

[the above opinions are strictly mine, if anyone's.]
[the above signature may or may not be repeated, depending upon some
inscrutable property of the mailer-of-the-week.]

 
 
 

Unix machines for large databases

Post by Dave Kello » Wed, 02 May 1990 09:09:00


I can understand Eric's skepticism because if someone told me 6 months
ago that INGRES would exceed 100 TPS I might have asked them if they
bumped their head on the way to the office.

However, I know what Mark Diamond says is true because I was in the room
with him, along with Tom Sawyer from Codd & Date consulting, when INGRES
hit 104 TPS.

To appease any cynics I'll list the one caveat of the benchmark first:

        * The INGRES system (running on a Sequent Symmetry machine) which
          hit 104 TPS was running a prototype version of RTI's next release.
          As part of normal prototyping activity we asked ourselves "Just
          how fast can this go?"  We convinced Sequent to let RTI use a
          large Symmetry machine, and we were off...

Eric was surprised about the 1+ Gigabyte database size.  In fact, the
benchmark was run with a DebitCredit defined 100 TPS sized database.  Before
continuing, a little background on the DebitCredit benchmark is in order.

DebitCredit is a well-defined standard benchmark and was written in the
late 1970's  by Jim Gray and about 20 other database professionals.  The
paper was eventually published in DATAMATION under the title "A Measure of
Transaction Processing" by the authors "Anon et al."  Rumour has it the
authors wished to remain secret due to flame-ups that occurred after Dave
DeWitt and Dina Bitton wrote their paper on DBMS benchmarking.

DebitCredit is one of three benchmarks described in the paper, and various
degenerate forms of DebitCredit  have become loosely known in the industry
as "TP1."  The problem with TP1, and the ensuing "TPS" (transactions/second)
measurements, is that most vendors size the database irregularly (i.e. smaller
than DebitCredit defines).  Thus, as Eric points out, when comparing TPS
measurements one is often comparing apples and oranges.

For the Silver Bullet benchmarks, to which Mark refers, the database was
sized at 100 TPS, or 10 million 100-byte account records, 10,000 100-byte
teller records, and 1,000 100-byte branch records.  Thus, a real purist
would rob RTI of the 104 TPS (and grant only 100 TPS) because the database
was sized for 100 TPS.  (If you do the multiplication you'll see that
the account relation alone is 1 gigabyte of data.)
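
For anyone who wants to check the multiplication, here is a minimal sketch in Python. It assumes the per-TPS scaling implied by the figures above (100,000 account, 100 teller and 10 branch records per TPS, 100 bytes each) and ignores the history relation and any index or page overhead; the names are hypothetical.

```python
RECORD_BYTES = 100  # record size quoted above

# Records per 1 TPS of rated throughput, inferred from the 100 TPS figures.
RECORDS_PER_TPS = {"account": 100_000, "teller": 100, "branch": 10}


def debitcredit_sizes(tps):
    """Return raw data bytes per relation for a database sized at `tps`."""
    return {name: count * tps * RECORD_BYTES
            for name, count in RECORDS_PER_TPS.items()}


sizes = debitcredit_sizes(100)
for name, nbytes in sizes.items():
    print(f"{name:8s} {nbytes:>13,d} bytes")
print(f"{'total':8s} {sum(sizes.values()):>13,d} bytes")
```

The account relation dominates: the teller and branch relations together add only about a megabyte.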

Overall, the benchmark conformed to DebitCredit standards quite well,
including the submission of transactions via a network.  However, there
were a few things we didn't do 100% to the DebitCredit spec.  But then
again, we did a couple to exceed the spec.  In any case, the auditor's
report is being published tomorrow so all DebitCredit whizzes can take
a look.

In conclusion, I saw one "pop-off" on the net (flame semi-on) which questioned
the integrity of the auditor since "Codd and Date and RTI have always had
a good working relationship..." I'll reply to that with

        * If we wanted to pay someone to lie we wouldn't have paid
          Codd and Date's rates!  ;-)

        * Mr. Sawyer was the auditor of Tandem's 208 TPS benchmark.

        * I personally hope that he is not on the net to see this random
          * on his character.

        * If you read his report you'll see that he is certainly impartial.

Finally, if you're interested in seeing the benchmark report you can reply
to this message with a postal address and I'll do my best to get you a copy.

Dave Kellogg
ucbvax!rtech!davek (might need a mtxinu before the rtech)

"Hmmm.  We hit 100 TPS, can I go to bed now??"

 
 
 

Unix machines for large databases

Post by Charles Simmo » Wed, 02 May 1990 15:16:00




>>Help! By Friday we need to know if there is a Unix-based box that
>>can work as a very high-performance data-base server.  Yes, there are a

>Sure there is.  You can run Unix on a Cray.

Unless you have a real big need for the vector processor of the Cray,
an Amdahl machine may well provide better performance at a lower cost.

-- Cs

 
 
 

Unix machines for large databases

Post by David Kepp » Thu, 03 May 1990 00:14:00





>>>Help! By Friday we need to know if there is a Unix-based box that
>>>can work as a very high-performance data-base server.  Yes, there are
>>Sure there is.  You can run Unix on a Cray.
>Or on an Amdahl 5890/5990 running UTS.  You may have a floor space
>problem past several Tb.

Check out optical disk drives.  I believe DEC is now selling them
for the VAX line; I'd imagine that most other vendors have similar
products in mind.  They can solve your floor space problems well
beyond "several Tb", and, being write-once-read-many (WORM) are well-
suited to an application requiring a permanent history.  Typically
they are large enough so that you don't fill them very fast even if
you don't care about a permanent record.

    ;-D on  ( Bliss is Bliss, Ignorance is Ignorance, I'm happy )  Pardo

 
 
 

Unix machines for large databases

Post by news softwa » Wed, 02 May 1990 05:47:00





#> I think (from the propaganda I've heard) that something like Oracle
#> might sorta fit your bill. One of the ways that they do this is by use
#> of raw disk I/O rather than putting the data base into the filesystem space.
#>
#   But using raw disk i/o per se doesn't guarantee anything, does it?  I think

It tends to promise that address locality implies spatial locality. This is
a nice assumption to be able to make when you want to improve your speed.
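
A small sketch in Python of what that assumption buys you. It assumes a fixed block size of 4096 bytes (an arbitrary choice) and uses an ordinary temporary file as a stand-in for a raw device, so it only illustrates the addressing: on a real raw partition, block n sits at byte offset n * BLOCK_SIZE and neighbouring block numbers are neighbours on the device, whereas a file inside a filesystem may have its blocks scattered. The function names are hypothetical.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_offset(block_no):
    """Byte offset of a logical block on a linearly addressed device."""
    return block_no * BLOCK_SIZE


def read_block(fd, block_no):
    """Read one block at its computed offset without moving the file pointer."""
    return os.pread(fd, BLOCK_SIZE, block_offset(block_no))


def demo():
    # A regular file stands in for the raw device here.
    with tempfile.NamedTemporaryFile() as f:
        for i in range(4):
            f.write(bytes([i]) * BLOCK_SIZE)  # block i is filled with byte i
        f.flush()
        fd = os.open(f.name, os.O_RDONLY)
        try:
            return [read_block(fd, n)[0] for n in range(4)]
        finally:
            os.close(fd)


print(demo())
```

Because the offset is a pure function of the block number, a database doing its own I/O can lay out related pages at adjacent block numbers and expect them to be physically adjacent.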
--
-------------
 Stephen Samuel
  {ihnp4,ubc-vision,vax135}!alberta!edm!steve

 
 
 

Unix machines for large databases

Post by Jim Milbe » Wed, 02 May 1990 16:28:00




>> Help! By Friday we need to know if there is a Unix-based box that
>> can work as a very high-performance data-base server.  

>Take a look at the Sequent Symmetry.  This tightly coupled
>UNIX multiple processor is an optimum machine for running
>Relational Databases.

Relational Technology has been working with several unix-based
multiprocessing machines with the INGRES product.

Significant performance gains can be had using cpu/io improvements
that these vendors offer.

Pyramid is also offering significant features in a balanced fashion
(cpu, disk i/o and terminal i/o) that INGRES can take advantage of.

No selling here, but INGRES is hot, and can take advantage of the
multiprocessing capabilities that Pyramid and Sequent and others offer.

** Opinions are my own and not necessarily RTI's.

Jim Milbery, RTI Technical Support, Burlington, MA

 
 
 

Unix machines for large databases

Post by Paul F » Wed, 02 May 1990 18:07:00



>Help! By Friday we need to know if there is a Unix-based box that
>can work as a very high-performance data-base server.  Yes, there are a
>million Unix boxes out there, but a data-base server has to be able to
>cope with concurrent access by multiple (possibly several hundred)
>users.  Simple file locking isn't good enough -- users should never
>have to read "file locked" error messages.  Also, the disk performance
>should be very good.

I have no answers for you but I do have some questions ...

Is your database mainly for reads, or for reads and writes? If it's mainly for
reading information, then having a large disk cache (e.g. 8 MB) will far
outweigh the speed of the disk.
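
A back-of-the-envelope sketch in Python of why the cache can matter more than the drive. It uses the usual weighted-average model of access time; the hit ratio and the millisecond figures are made-up illustrative values, not measurements of any real system, and the function name is hypothetical.

```python
def effective_access_ms(hit_ratio, cache_ms, disk_ms):
    """Average read time when a fraction `hit_ratio` of reads hit the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


# Hypothetical comparison: a slow disk behind a good cache
# versus a disk twice as fast with no cache hits at all.
slow_disk_cached = effective_access_ms(0.9, 0.1, 30.0)
fast_disk_uncached = effective_access_ms(0.0, 0.1, 15.0)
print(f"slow disk, 90% hits: {slow_disk_cached:.2f} ms")
print(f"fast disk, no hits:  {fast_disk_uncached:.2f} ms")
```

With these assumed numbers the cached slow disk comes out well ahead; the picture changes for write-heavy loads, where the cache helps far less.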

If you really are going to try to support hundreds of users, then one
major problem will be finding a network interface that can reliably
support this many virtual circuits. One of the biggest problems with
all machines seems to be the limit on the number of concurrent sessions.
Not only is a large amount of memory (i.e. several K) needed per
session, but also one has to consider things like the size of the
ARP tables.

=====================
     //        o      All opinions are my own.
   (O)        ( )     The powers that be ...
  /    \_____( )
 o  \         |
    /\____\__/      

 
 
 
