Storing Images in Blobs versus Storing in Filesystem - Performance Issues

Storing Images in Blobs versus Storing in Filesystem - Performance Issues

Post by Vinod Subramania » Mon, 26 Jul 1999 04:00:00



Hi !

We are planning to launch a Real Estate Web Site in the UK which
will allow prospective sellers to upload images of their property to
a SQL Server 7.0 database through ASP.

We expect at least 300,000 images to be uploaded in the first 6 months

If any of you have already done this then I would like feedback on
1. Performance Issues as to the approaches we need to take
    in terms of speed of display of images in the client browser
2. Will storing images in the filesystem be slower than storing them
   in a database if we have 300,000 images ?
3. How do we design the Site Maintainence module if we store
    images in the database ?

All feedback is welcome

Thanks in advance
Vinod Subramaniam

         __________________________________      
        | V I N O D  S U B R A M A N I A M |
        | Associate Consultant (SEB)       |
        | NIIT Limited, 8, Balaji Estate,  |
        | Kalkaji, New Delhi-19, INDIA     |
        | http://www.niit.com              |    


     / )| PHONE: 91-11-6203357             |( \
  __(  (|__________________________________|)  )__
 ((( \  \   / _)                    ( \   /  / )))
 (\\\ \  \_/  /                     \  \_/  / ///)
  \          /                       \          /
    \      _/                          \_      /
     /    /                              \     \
    /    /                                \     \
----     -----------------------------------     --------
Personal: http://www.geocities.com/SiliconValley/Way/2219

Do NOT let them deceive you with the legitimization of their myth!

 
 
 

Storing Images in Blobs versus Storing in Filesystem - Performance Issues

Post by Jay McVinne » Mon, 26 Jul 1999 04:00:00


<personal opinion>
I recommend storing the location in the database and keeping the images in the filesystem.  remember, either way they have to pass through your web server to be delivered.  do you really want to create network traffic for this?  Even if the web server and the sql server are the same machine, you are still having to pass the images through two applications rather than one if you use the filesystem.
</personal opinion>

HTH
Jay McVinney


  Hi !

  We are planning to launch a Real Estate Web Site in the UK which
  will allow prospective sellers to upload images of their property to
  a SQL Server 7.0 database through ASP.

  We expect at least 300,000 images to be uploaded in the first 6 months

  If any of you have already done this then I would like feedback on
  1. Performance Issues as to the approaches we need to take
      in terms of speed of display of images in the client browser
  2. Will storing images in the filesystem be slower than storing them
     in a database if we have 300,000 images ?
  3. How do we design the Site Maintainence module if we store
      images in the database ?

  All feedback is welcome

  Thanks in advance
  Vinod Subramaniam

           __________________________________      
          | V I N O D  S U B R A M A N I A M |
          | Associate Consultant (SEB)       |
          | NIIT Limited, 8, Balaji Estate,  |
          | Kalkaji, New Delhi-19, INDIA     |
          | http://www.niit.com              |    


       / )| PHONE: 91-11-6203357             |( \
    __(  (|__________________________________|)  )__
   ((( \  \   / _)                    ( \   /  / )))
   (\\\ \  \_/  /                     \  \_/  / ///)
    \          /                       \          /
      \      _/                          \_      /
       /    /                              \     \
      /    /                                \     \
  ----     -----------------------------------     --------
  Personal: http://www.geocities.com/SiliconValley/Way/2219

  Do NOT let them deceive you with the legitimization of their myth!

 
 
 

Storing Images in Blobs versus Storing in Filesystem - Performance Issues

Post by Manohar Kamath [MVP » Tue, 27 Jul 1999 04:00:00


Plus of course, the additional code (and pain) to get the image in and out
of SQL server. With file system, it is so easy to add/delete and replace the
images. And with some good naming standards, the maintenance is easy.

--
Manohar Kamath
Microsoft MVP - ASP
www.kamath.com Active Server Corner


<personal opinion>
I recommend storing the location in the database and keeping the images in
the filesystem.  remember, either way they have to pass through your web
server to be delivered.  do you really want to create network traffic for
this?  Even if the web server and the sql server are the same machine, you
are still having to pass the images through two applications rather than one
if you use the filesystem.
</personal opinion>

HTH
Jay McVinney


Hi !

We are planning to launch a Real Estate Web Site in the UK which
will allow prospective sellers to upload images of their property to
a SQL Server 7.0 database through ASP.

We expect at least 300,000 images to be uploaded in the first 6 months

If any of you have already done this then I would like feedback on
1. Performance Issues as to the approaches we need to take
    in terms of speed of display of images in the client browser
2. Will storing images in the filesystem be slower than storing them
   in a database if we have 300,000 images ?
3. How do we design the Site Maintainence module if we store
    images in the database ?

All feedback is welcome

Thanks in advance
Vinod Subramaniam

         __________________________________
        | V I N O D  S U B R A M A N I A M |
        | Associate Consultant (SEB)       |
        | NIIT Limited, 8, Balaji Estate,  |
        | Kalkaji, New Delhi-19, INDIA     |
        | http://www.niit.com              |


     / )| PHONE: 91-11-6203357             |( \
  __(  (|__________________________________|)  )__
 ((( \  \   / _)                    ( \   /  / )))
 (\\\ \  \_/  /                     \  \_/  / ///)
  \          /                       \          /
    \      _/                          \_      /
     /    /                              \     \
    /    /                                \     \
----     -----------------------------------     --------
Personal: http://www.geocities.com/SiliconValley/Way/2219

Do NOT let them deceive you with the legitimization of their myth!

 
 
 

Storing Images in Blobs versus Storing in Filesystem - Performance Issues

Post by R.MacFadye » Tue, 27 Jul 1999 04:00:00


Hi,

I have to disagree with storing images in a filesystem. My background has
included a certain amount of OLTP which may be biasing me.

The problems faced by storage in OS files are not inconsiderable. Backup
(easy enough). Transaction consistance (what happens when a server crashes).
What happens when out of disk space. Do you want to rely on NTFS to "lookup"
a filename - if not how will you segregate the files on disk (how many
disks, directories, and subdirectories - how would you move this to another
machine). How do you coordinate backup's with restores (oops restored the DB
and not the images, or worse the images and not the DB). How can you trace
the origin of a particular file if need be - who last updated it? who
created it? Unless the NT permissions are tight who knows where a particular
file came from. For that matter what about production support - system
crashs and a developer has to go poking around - did they add extra files to
the directory, rename a file, delete one, move one? What would happen if you
later restored - would everything be "ok"?

Storing images in a DB solves some of these. And it's all within the scope
of one data manipulate tool - backups, restores and so forth don't need to
be coordinated with filesystem restores. Auditing can be implemented within
the db if required (triggers and so on). Plus you wouldn't necesarily need a
really wizkid NT administrator to write scripts - your wizkid SQL
developer's could do it.

On to performance issues:

In terms of net traffic between a web server and a local sql server wouldn't
you have to be very very busy to exceed the capacity of several 100mb net
cards? Heck if you're that busy you can afford some sort of fiber optic
connection between the systems.

From a strict performance point of view I would not bet anything on either
method being substaintially faster than the other. I would be really
supprised if there was. With 300,000 images you'll have to do some clever
disk/directory mapping. Having one directory with several thousand pics is
probably to many - NTFS still does some sort of "lookup by key" on the
filename. NTFS was not built to be a database - does any know if it uses a
binary tree or a straight linear lookup?

Though on another thought line.... Building the client response what would
be the most likly way to include the image? Probably an <IMG> tag... which
needs a URL (a file)... which means fishing the image out of the DB into a
temp directory(ies). Hum... already having the files available would
certainly make this faster. Maybe the right answer is to do both? Or perhaps
when ever a user queries an image (or property ala search engine) the image
isn't copied into a temp directroy but a "cache directory" - who's files
older than 30 days get deleted nightly. This is kinda elegant in that the
cache doesn't need backup/restore, but a bit wasteful if more images are
cached than returned to the user (search engine returns 10 hits, but they
only click through on one).

On to how to get the images in:

Depending on languages etc storing the images in the actual db can be pretty
easy. Using VB/ADO (and possibly ASP/ADO) you can easily tuck a StdPicture
into a PropertyBag and use ADO's AppendChunk on the PropertyBag's contents.
I've used this quite sucessfully on small images (512x512 monochrome - 32k
each). How a StdPicture handles a Jpeg/Gif when saved to a property bag I
don't know (it would be pretty ugly the save to propertybag just saved the
raw .bmp).

At any rate that's just my 2 cents worth. If it was me I'd cover my but such
that switching from one method to another would cause the least amount of
code changes (and be prepared to switch methods during testing/early
production if needed).

Hope it helps.

Rob MacFadyen

 
 
 

Storing Images in Blobs versus Storing in Filesystem - Performance Issues

Post by Ed McNierne » Mon, 16 Aug 1999 04:00:00


Vinod -

Rob's already given you a great (and more experienced) viewpoint, but I've
got a bit of real-world experience to add.  I've been building a mapping
application, and I currently have 3,300,000 small images stored in SQL
Server.  The images are served, via ASP and a VB db access component, out to
a browser client.

I debated the filesystem/db issue for a long time, and I'm glad I put them
in the database.  I think I was bogged down by the "conventional wisdom"
that "putting images in a database is slow" (usually given without any
argument as to *why*).

First, the maintenance issue Rob mentioned is important.  I built a rather
sophisticated image loading tool, as I had to perform a lot of operations on
the images as they went it.  Guess what - despite my best efforts and
brilliant coding skills, it was buggy.  It crashed.  I was *VERY* glad that
I had SQL Server's transactional integrity on the back end; whenever my
application crashed, all its work disappeared and life was as if it had
never run.  That was worth a LOT.  From an ongoing admin point of view, it's
a big win.

The "application" vs. "database" and "filesystem" vs. "database" arguments
are, IMHO, pretty specious.  If you retrieve a bag of bits from disk inside
a SQL Server database, SQL Server has to find them, read them, and squirt
them out the net to you.  If you retrieve a bag of bits from disk inside an
NT Server filesystem, NT Server has to find them, read them, and squirt them
out the net to you.  Code is code - unless someone did some path analysis or
profiling, it's hard to see a big difference between the two.

The obvious difference is that SQL Server is still using some of the NT
Server filesystem (naturally) to read and write from the disk.  At that
point, I said - "Well, at the I/O level, SQL and NT are both using NT Server
low-level filesystem code, so there's no difference.  So how does each
decide which disk sectors to read?"  The key here is that by storing your
images in the filesystem, you're using NT as your "database" to search for
those files.  NT isn't too bad as a filesystem, but it's not designed (the
way SQL Server is) to be as user-tunable.  How do you influence the way NT
stores things?  In SQL Server you can create indexes to find things. you can
create clustered indexes to control where things are stored, and in fact you
can index the same image as many different ways as you like.  You can't do
that in NT.  SQL Server gives you query optimization and caching for free.

HOWEVER, writing the code to get images in and out of SQL Server (especially
if you've got tools and utility libraries that work with images in the
filesystem) is painfully ill-documented.  It's not really that hard at all,
but it can be hard to figure out, and it depends a lot on the details of
your image pipeline.  But I would rather trade some code I have to figure
out once for performance and maintenance gains I see every day.

     - Ed

 
 
 

Storing Images in Blobs versus Storing in Filesystem - Performance Issues

Post by avron polako » Tue, 17 Aug 1999 04:00:00


Our company has produced a full Image Server for image registration, retrieval,
processing and
archiving images, pages and Proofs for the Print Industry.

We haven't tried MSQL Server - we use ORACLE database - but I doubt whether it
could handle large images.
In addition, because Browsers load ASYNCHRONICALLY, keeping images in a file
system is that they will load at their
own leisure and do not all have to be prepared in advance in asp, with db
retrieval. On a page of 500 thumbnails, or when
upploading large images, this is a very big boon, even if the overall access
time turn out to be longer.

We have Internet and Windows Clients for manipulating the images.
Our database can reach up to 100,000 images kept out on RAIDs.

In our industry each image is kept in 3 sizes (technically called 'OPI' in the
Print Industry):

hi-res up to 2000dpi (<=100Mb)
lo-res          72dpi(<= 1Mb)
thumbnails   <10K

We went through all your considerations.
Although we used to keep thumbnails in the database, because we wanted these to
be directly HREFed for our
Internet Clients we stuck with the FS. Our database emulates and enhances the
File System folders, with
all our management software.


> Vinod -

> Rob's already given you a great (and more experienced) viewpoint, but I've
> got a bit of real-world experience to add.  I've been building a mapping
> application, and I currently have 3,300,000 small images stored in SQL
> Server.  The images are served, via ASP and a VB db access component, out to
> a browser client.

> I debated the filesystem/db issue for a long time, and I'm glad I put them
> in the database.  I think I was bogged down by the "conventional wisdom"
> that "putting images in a database is slow" (usually given without any
> argument as to *why*).

> First, the maintenance issue Rob mentioned is important.  I built a rather
> sophisticated image loading tool, as I had to perform a lot of operations on
> the images as they went it.  Guess what - despite my best efforts and
> brilliant coding skills, it was buggy.  It crashed.  I was *VERY* glad that
> I had SQL Server's transactional integrity on the back end; whenever my
> application crashed, all its work disappeared and life was as if it had
> never run.  That was worth a LOT.  From an ongoing admin point of view, it's
> a big win.

> The "application" vs. "database" and "filesystem" vs. "database" arguments
> are, IMHO, pretty specious.  If you retrieve a bag of bits from disk inside
> a SQL Server database, SQL Server has to find them, read them, and squirt
> them out the net to you.  If you retrieve a bag of bits from disk inside an
> NT Server filesystem, NT Server has to find them, read them, and squirt them
> out the net to you.  Code is code - unless someone did some path analysis or
> profiling, it's hard to see a big difference between the two.

> The obvious difference is that SQL Server is still using some of the NT
> Server filesystem (naturally) to read and write from the disk.  At that
> point, I said - "Well, at the I/O level, SQL and NT are both using NT Server
> low-level filesystem code, so there's no difference.  So how does each
> decide which disk sectors to read?"  The key here is that by storing your
> images in the filesystem, you're using NT as your "database" to search for
> those files.  NT isn't too bad as a filesystem, but it's not designed (the
> way SQL Server is) to be as user-tunable.  How do you influence the way NT
> stores things?  In SQL Server you can create indexes to find things. you can
> create clustered indexes to control where things are stored, and in fact you
> can index the same image as many different ways as you like.  You can't do
> that in NT.  SQL Server gives you query optimization and caching for free.

> HOWEVER, writing the code to get images in and out of SQL Server (especially
> if you've got tools and utility libraries that work with images in the
> filesystem) is painfully ill-documented.  It's not really that hard at all,
> but it can be hard to figure out, and it depends a lot on the details of
> your image pipeline.  But I would rather trade some code I have to figure
> out once for performance and maintenance gains I see every day.

>      - Ed

 
 
 

Storing Images in Blobs versus Storing in Filesystem - Performance Issues

Post by Ed McNierne » Tue, 17 Aug 1999 04:00:00


I'm certainly not suggesting it's a solution for everyone, nor does it
"prove" much about storing images in SQL Server, but you ought to take a
look at http://terraserver.microsoft.com - it's about a terabyte worth of
high-resolution images, stored in SQL Server, delivered over the web to many
thousands of users a day.  It handles the "asynchronous" issue by breaking
each image into lots of tiles - each tile is generated by a separate
request, so the image loads tile-wise asynchronously into the client.

Now it's also supported by an awfully big chunk of hardware...

I've been thinking some more about my comments earlier, and I think it comes
down to indexing and flexibility.  If you only need to index the images one
way, and access them one way with straightforward queries, then you can
probably do that better and quicker using the file system.  If you've got a
more complicated index scheme, or if you've got a LOT of images, you can
probably do the same thing if you've got someone around who's good at
designing index/access methods.  But if you've got LOTS of images, with
different access methods, and you want flexibility in how you access the
images, at a certain point you'll end up building a general-purpose
index/query/storage mechanism.  In my case, I happen to know that the guys
on the SQL Server team are better at doing that than I am <g>, so that makes
SQL Server a good choice.  But your mileage may vary...

     - Ed

 
 
 

Storing Images in Blobs versus Storing in Filesystem - Performance Issues

Post by avron polako » Tue, 17 Aug 1999 04:00:00


Sounds interesting. Can a column really contain 10s of megabytes of data?

ORACLE 8i is now supposed to be a server that can handle multi-media.
Being a multi-server: database server, webserver, application server and file
server,
all bundled in one, it seems all the boundaries have broken down. I don't know
how
they do it. We have developed for the last three years and couldn't very easily
change
our database.


> I'm certainly not suggesting it's a solution for everyone, nor does it
> "prove" much about storing images in SQL Server, but you ought to take a
> look at http://terraserver.microsoft.com - it's about a terabyte worth of
> high-resolution images, stored in SQL Server, delivered over the web to many
> thousands of users a day.  It handles the "asynchronous" issue by breaking
> each image into lots of tiles - each tile is generated by a separate
> request, so the image loads tile-wise asynchronously into the client.

> Now it's also supported by an awfully big chunk of hardware...

> I've been thinking some more about my comments earlier, and I think it comes
> down to indexing and flexibility.  If you only need to index the images one
> way, and access them one way with straightforward queries, then you can
> probably do that better and quicker using the file system.  If you've got a
> more complicated index scheme, or if you've got a LOT of images, you can
> probably do the same thing if you've got someone around who's good at
> designing index/access methods.  But if you've got LOTS of images, with
> different access methods, and you want flexibility in how you access the
> images, at a certain point you'll end up building a general-purpose
> index/query/storage mechanism.  In my case, I happen to know that the guys
> on the SQL Server team are better at doing that than I am <g>, so that makes
> SQL Server a good choice.  But your mileage may vary...

>      - Ed

 
 
 

Storing Images in Blobs versus Storing in Filesystem - Performance Issues

Post by Tim » Mon, 23 Aug 1999 04:00:00


I also have been thinking of this issue as we are developing a portal type
app.. I will be using SA fileupload and hope it supports some method of
checking for the size of the image so that the user can only upload a file
of a certain size.... I will write code that only allows users to upload one
file or N files according to how much they pay us.. (Isn't the US great)..

I think with SQL server 7.0 I am going to store the images in a database
blob field.. Retrieval has to be faster, but more expensive I would think in
terms of Memory..etc.. Hardware is cheap!  (and my old boss hated me about
this with VB) .. (You can tell I am self taught non-degreed ASP / VB
dude..hehe) .. I am more concerned about retrieval than any other issue..
and as the original answer to the post noted.  Retrieval from the disk is
retrieval from the disk, but a database will be NT in the retrieval horse
race every time I would think..

Just my 2 cents worth..

Tim


> Sounds interesting. Can a column really contain 10s of megabytes of data?

> ORACLE 8i is now supposed to be a server that can handle multi-media.
> Being a multi-server: database server, webserver, application server and
file
> server,
> all bundled in one, it seems all the boundaries have broken down. I don't
know
> how
> they do it. We have developed for the last three years and couldn't very
easily
> change
> our database.


> > I'm certainly not suggesting it's a solution for everyone, nor does it
> > "prove" much about storing images in SQL Server, but you ought to take a
> > look at http://terraserver.microsoft.com - it's about a terabyte worth
of
> > high-resolution images, stored in SQL Server, delivered over the web to
many
> > thousands of users a day.  It handles the "asynchronous" issue by
breaking
> > each image into lots of tiles - each tile is generated by a separate
> > request, so the image loads tile-wise asynchronously into the client.

> > Now it's also supported by an awfully big chunk of hardware...

> > I've been thinking some more about my comments earlier, and I think it
comes
> > down to indexing and flexibility.  If you only need to index the images
one
> > way, and access them one way with straightforward queries, then you can
> > probably do that better and quicker using the file system.  If you've
got a
> > more complicated index scheme, or if you've got a LOT of images, you can
> > probably do the same thing if you've got someone around who's good at
> > designing index/access methods.  But if you've got LOTS of images, with
> > different access methods, and you want flexibility in how you access the
> > images, at a certain point you'll end up building a general-purpose
> > index/query/storage mechanism.  In my case, I happen to know that the
guys
> > on the SQL Server team are better at doing that than I am <g>, so that
makes
> > SQL Server a good choice.  But your mileage may vary...

> >      - Ed

 
 
 

Storing Images in Blobs versus Storing in Filesystem - Performance Issues

Post by Brad Kingsle » Tue, 24 Aug 1999 04:00:00



Quote:> Sounds interesting. Can a column really contain 10s of megabytes of data?

Supposedly SQL7 image data type can hold up to 1 GB of data.

-Brad Kingsley

ASP/SQL7/COM Hosting
http://www.ORCSWeb.com/

 
 
 

1. Storing image as BLOB or only storing URL to fileshare

It can be somewhat cumbersome, but not too difficult.

Other issues to think about:

Security of the data.
Database consistency issues (what if someone deletes the file in the URL
version?  What if you do a restore of your database and there are missing or
bad links now?)
Backup of the images?

Rick Sawtell

2. ADO Connection Pooling Question

3. storing images in SQL Server vs. filesystem

4. SYBASE 10 Memory usage compare with 4.0.1

5. Blobs versus Filesystem?

6. raw devices

7. Performance speed - Variables versus Parameters in Stored Procedures

8. Database link to a SQL Server possible?

9. Help! - Stored procedure speed performance versus queries

10. Store and Retrieve image file to and from MSSQL BLOB

11. Images: Store as BLOB or in the File system

12. Store Blobs to image/Text columns via Delphi 2.0 /BDE 3.5

13. Storing Image/BLOB