| I've got a quick question concerning the convolution & histogram
| extensions under the SGI implementation of OpenGL. I'm using RE3
| hardware under IRIX 6.2. ...
Do you mean InfiniteReality (IR)?
| ... The routines don't seem much faster (if at
| all) than the pure CPU counterparts that I coded up. ...
It wouldn't surprise me greatly to hear of a situation in which
host-based histogramming is faster than histogramming in the pipe, if
you're using very small images (64x64 or less) or monochrome images
with large histogram tables (greater than 4K entries). The cost of
moving the image into the pipe and the histogram table out of the pipe
might exceed the cost of performing the histogram on the host.
It does surprise me to hear of a situation in which host-based
convolution is faster than convolution in the pipe. (Unless you're
running into a data-transfer bottleneck of some kind, or you're
waiting for the convolution results to be read back to the host.)
| ... Does anyone know
| if these two extensions actually use the graphics hardware or are they
| implemented using the CPU instead.
On IR, there's essentially no fallback to software; nearly everything
is in either silicon or microcode. Both histogramming and convolution
are implemented by microcode and hardware assist in the GEs.
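For comparison, the pipe path amounts to defining the table, enabling the histogram unit, and pushing the pixels through the pixel path. In rough pseudocode (untested sketch, assuming the EXT_histogram entry points; not your code):

```
/* sketch only -- EXT_histogram usage */
glHistogramEXT(GL_HISTOGRAM_EXT, 256, GL_LUMINANCE, GL_TRUE);
glEnable(GL_HISTOGRAM_EXT);     /* histogram during pixel transfer */
glDrawPixels(w, h, GL_LUMINANCE, GL_UNSIGNED_BYTE, image);
glGetHistogramEXT(GL_HISTOGRAM_EXT, GL_TRUE,
                  GL_LUMINANCE, GL_UNSIGNED_INT, counts);
/* sink = GL_TRUE above means the pixels never reach the
   framebuffer; the glGetHistogramEXT readback is where small
   images pay most of their latency */
```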
Without knowing more about your code, it's hard to say what's
happening. If you're processing small images and waiting to read back
the results, performance might be dominated by pipeline latency. If
you're processing large images, then some mode setting might be
knocking you off the fast path.
Also, does your host-based convolution code use floating-point
arithmetic? If not, the pipe may be performing the convolution at a
higher precision than you need.