Our company has produced a full Image Server for image registration, retrieval,
processing and
archiving images, pages and Proofs for the Print Industry.
We haven't tried MSQL Server - we use ORACLE database - but I doubt whether it
could handle large images.
In addition, because Browsers load ASYNCHRONICALLY, keeping images in a file
system is that they will load at their
own leisure and do not all have to be prepared in advance in asp, with db
retrieval. On a page of 500 thumbnails, or when
upploading large images, this is a very big boon, even if the overall access
time turn out to be longer.
We have Internet and Windows Clients for manipulating the images.
Our database can reach up to 100,000 images kept out on RAIDs.
In our industry each image is kept in 3 sizes (technically called 'OPI' in the
Print Industry):
hi-res up to 2000dpi (<=100Mb)
lo-res 72dpi(<= 1Mb)
thumbnails <10K
We went through all your considerations.
Although we used to keep thumbnails in the database, because we wanted these to
be directly HREFed for our
Internet Clients we stuck with the FS. Our database emulates and enhances the
File System folders, with
all our management software.
> Vinod -
> Rob's already given you a great (and more experienced) viewpoint, but I've
> got a bit of real-world experience to add. I've been building a mapping
> application, and I currently have 3,300,000 small images stored in SQL
> Server. The images are served, via ASP and a VB db access component, out to
> a browser client.
> I debated the filesystem/db issue for a long time, and I'm glad I put them
> in the database. I think I was bogged down by the "conventional wisdom"
> that "putting images in a database is slow" (usually given without any
> argument as to *why*).
> First, the maintenance issue Rob mentioned is important. I built a rather
> sophisticated image loading tool, as I had to perform a lot of operations on
> the images as they went it. Guess what - despite my best efforts and
> brilliant coding skills, it was buggy. It crashed. I was *VERY* glad that
> I had SQL Server's transactional integrity on the back end; whenever my
> application crashed, all its work disappeared and life was as if it had
> never run. That was worth a LOT. From an ongoing admin point of view, it's
> a big win.
> The "application" vs. "database" and "filesystem" vs. "database" arguments
> are, IMHO, pretty specious. If you retrieve a bag of bits from disk inside
> a SQL Server database, SQL Server has to find them, read them, and squirt
> them out the net to you. If you retrieve a bag of bits from disk inside an
> NT Server filesystem, NT Server has to find them, read them, and squirt them
> out the net to you. Code is code - unless someone did some path analysis or
> profiling, it's hard to see a big difference between the two.
> The obvious difference is that SQL Server is still using some of the NT
> Server filesystem (naturally) to read and write from the disk. At that
> point, I said - "Well, at the I/O level, SQL and NT are both using NT Server
> low-level filesystem code, so there's no difference. So how does each
> decide which disk sectors to read?" The key here is that by storing your
> images in the filesystem, you're using NT as your "database" to search for
> those files. NT isn't too bad as a filesystem, but it's not designed (the
> way SQL Server is) to be as user-tunable. How do you influence the way NT
> stores things? In SQL Server you can create indexes to find things. you can
> create clustered indexes to control where things are stored, and in fact you
> can index the same image as many different ways as you like. You can't do
> that in NT. SQL Server gives you query optimization and caching for free.
> HOWEVER, writing the code to get images in and out of SQL Server (especially
> if you've got tools and utility libraries that work with images in the
> filesystem) is painfully ill-documented. It's not really that hard at all,
> but it can be hard to figure out, and it depends a lot on the details of
> your image pipeline. But I would rather trade some code I have to figure
> out once for performance and maintenance gains I see every day.
> - Ed