## Fast Persp. Texture mapping

### Fast Persp. Texture mapping

Gentlemen,

Recently, I implemented an algorithm to do perspective texture mapping at
a reasonable speed, using 8- or 16-pixel spans, and using the FPU to do
the necessary divides while the integer processor draws the pixels. I
am rather content with this code, but I am wondering: would it be
possible to extend the idea of 16-pixel spans to 16x16-pixel squares? It
would greatly reduce the number of divides! Has anyone tried to
implement such an algorithm?

Jacco Bikker.
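
For readers who have not seen the span technique, here is a minimal sketch of what the post describes, not the poster's actual code. It assumes u/z, v/z and 1/z are stepped linearly along the scanline and that z stays positive; the names `ScanGradients`, `SPAN` and `perspective_scanline` are made up for illustration. Plain C cannot show the FPU divide overlapping the integer pixel loop, so the sketch only shows where the one reciprocal per span goes.

```c
#include <assert.h>

/* Hypothetical per-scanline state: u/z, v/z and 1/z at the first pixel,
   plus their per-pixel steps. All three are linear in screen space. */
typedef struct {
    float uoz, voz, ooz;      /* u/z, v/z, 1/z at the first pixel */
    float duoz, dvoz, dooz;   /* change per pixel along the scanline */
} ScanGradients;

enum { SPAN = 16 };

/* Writes one (u, v) pair per pixel. Exact perspective is computed only
   at span boundaries; pixels in between are linearly interpolated.
   Assumes 1/z stays positive over the whole scanline. */
void perspective_scanline(const ScanGradients *g, int width,
                          float *out_u, float *out_v)
{
    assert(g && out_u && out_v && width > 0);

    float uoz = g->uoz, voz = g->voz, ooz = g->ooz;
    float z = 1.0f / ooz;                 /* reciprocal at the start */
    float u0 = uoz * z, v0 = voz * z;

    for (int x = 0; x < width; ) {
        int len = width - x < SPAN ? width - x : SPAN;

        /* Step to the end of this span and do the one reciprocal there. */
        uoz += g->duoz * len;
        voz += g->dvoz * len;
        ooz += g->dooz * len;
        z = 1.0f / ooz;
        float u1 = uoz * z, v1 = voz * z;

        /* Linear steps inside the span (a shift in fixed-point code
           when len is a power of two). */
        float du = (u1 - u0) / len, dv = (v1 - v0) / len;
        float u = u0, v = v0;
        for (int i = 0; i < len; ++i) {
            out_u[x + i] = u;
            out_v[x + i] = v;
            u += du;
            v += dv;
        }
        x += len;
        u0 = u1;
        v0 = v1;
    }
}
```

The values at pixels 0, 16, 32, ... are exact; everything in between carries the usual affine error, which grows with span length.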

### Fast Persp. Texture mapping

>Recently, I implemented an algorithm to do perspective texture mapping at
>a reasonable speed, using 8- or 16-pixel spans, and using the FPU to do
>the necessary divides while the integer processor draws the pixels. I
>am rather content with this code, but I am wondering: would it be
>possible to extend the idea of 16-pixel spans to 16x16-pixel squares? It
>would greatly reduce the number of divides! Has anyone tried to
>implement such an algorithm?

>Jacco Bikker.

There is no need to restrict the shape to a square.  Let it be a trapezoid
as usual and you can do your perspective divides every 8 scanlines, say.
This will make a significant difference only if your scanline setup is
incredibly fast.

--
Daniel Phillips

### Fast Persp. Texture mapping

[.....]

> There is no need to restrict the shape to a square.  Let it be a trapezoid
> as usual and you can do your perspective divides every 8 scanlines, say.
> This will make a significant difference only if your scanline setup is
> incredibly fast.

> --
> Daniel Phillips

I think there will be quite a significant speedup when using MMX, since
you then can't do the divides on the FPU. There will be one divide every
256 pixels instead of every 16. The squares must be aligned on 16-pixel
boundaries, and drawing must be done sequentially, square by square, to
avoid cache misses.
If you divide the polygon into rectangles, triangles will be left over on
the boundary, so you should consider dividing the polygon into rectangles
and triangles.
If someone knows a nice algorithm for doing this subdivision, please mail
me.

Karlis.
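
A rough sketch of the interior-tile case, assuming perspective-correct (u, v) values are already known at the four corners of an aligned 16x16 square. The names `UV`, `TILE` and `fill_tile` are hypothetical, and the boundary triangles mentioned above are not handled. If corner values are shared between neighbouring tiles, each tile needs on average about one new corner, which is where the "one divide every 256 pixels" figure comes from.

```c
#include <assert.h>

enum { TILE = 16 };

typedef struct { float u, v; } UV;

/* Fills a TILE x TILE block of (u, v) values by bilinear interpolation
   between the values at the four tile corners (top-left, top-right,
   bottom-left, bottom-right). out is row-major with `stride` entries
   per row. No divides are needed inside the tile. */
void fill_tile(UV tl, UV tr, UV bl, UV br, UV *out, int stride)
{
    assert(out && stride >= TILE);
    const float inv = 1.0f / TILE;

    /* Left and right edge values, stepped down the tile row by row. */
    float lu = tl.u, lv = tl.v, ru = tr.u, rv = tr.v;
    float dlu = (bl.u - tl.u) * inv, dlv = (bl.v - tl.v) * inv;
    float dru = (br.u - tr.u) * inv, drv = (br.v - tr.v) * inv;

    for (int y = 0; y < TILE; ++y) {
        float u = lu, v = lv;
        float du = (ru - lu) * inv, dv = (rv - lv) * inv;
        for (int x = 0; x < TILE; ++x) {
            out[y * stride + x].u = u;
            out[y * stride + x].v = v;
            u += du;
            v += dv;
        }
        lu += dlu;
        lv += dlv;
        ru += dru;
        rv += drv;
    }
}
```

Writing a whole tile before moving to the next one keeps both the framebuffer rows and the touched texture region small, which is the cache argument made in the post.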

### Fast Persp. Texture mapping

> [.....]
> > There is no need to restrict the shape to a square.  Let it be a trapezoid
> > as usual and you can do your perspective divides every 8 scanlines, say.
> > This will make a significant difference only if your scanline setup is
> > incredibly fast.

> > --
> > Daniel Phillips

> I think there will be quite a significant speedup when using MMX, since
> you then can't do the divides on the FPU. There will be one divide every
> 256 pixels instead of every 16. The squares must be aligned on 16-pixel
> boundaries, and drawing must be done sequentially, square by square, to
> avoid cache misses.
> If you divide the polygon into rectangles, triangles will be left over on
> the boundary, so you should consider dividing the polygon into rectangles
> and triangles.
> If someone knows a nice algorithm for doing this subdivision, please mail
> me.

> Karlis.

To avoid the (hard-to-implement) triangles, you might consider
pre-rendering a scaled-down version of your polygon that is only 1/16 the
size of the original polygon. This scaled version of the polygon can then
be rendered using the necessary 2 divides per pixel. The resulting data
(not texels, but texel addresses) can then be used to interpolate the
spans. The good part of this approach is that you can also interpolate a
three-pixel span when you know only the addresses of the 1st and the 16th
pixel... That saves you a lot of * stuff.

Jacco.
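
One way to read the "scaled version" idea is as a coarse grid of exact texture coordinates, one per 16 screen pixels. The sketch below assumes the polygon is planar, so that u/z, v/z and 1/z are each affine in screen x and y; `Affine`, `STEP`, `eval` and `build_uv_grid` are names invented here. It also uses one reciprocal and two multiplies per grid point instead of the two divides mentioned in the post, which gives the same result up to rounding.

```c
#include <assert.h>

enum { STEP = 16 };

/* Hypothetical representation of a quantity that is affine in screen
   space: value(x, y) = a + bx * x + by * y. */
typedef struct { float a, bx, by; } Affine;

static float eval(const Affine *f, float x, float y)
{
    return f->a + f->bx * x + f->by * y;
}

/* Computes exact (u, v) at grid points spaced STEP pixels apart.
   grid_u and grid_v each hold gw * gh entries, row-major.
   Assumes 1/z is nonzero at every grid point. */
void build_uv_grid(const Affine *uoz, const Affine *voz, const Affine *ooz,
                   int gw, int gh, float *grid_u, float *grid_v)
{
    assert(uoz && voz && ooz && grid_u && grid_v && gw > 0 && gh > 0);

    for (int gy = 0; gy < gh; ++gy) {
        for (int gx = 0; gx < gw; ++gx) {
            float x = (float)(gx * STEP);
            float y = (float)(gy * STEP);
            float z = 1.0f / eval(ooz, x, y);   /* one reciprocal per grid point */
            grid_u[gy * gw + gx] = eval(uoz, x, y) * z;
            grid_v[gy * gw + gx] = eval(voz, x, y) * z;
        }
    }
}
```

The span or tile renderer can then look up its endpoints in the grid and interpolate between them, including for short leftover spans at the polygon edge.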

### Fast Persp. Texture mapping

> Gentlemen,

> Recently, I implemented an algorithm to do perspective texture mapping at
> a reasonable speed, using 8- or 16-pixel spans, and using the FPU to do
> the necessary divides while the integer processor draws the pixels. I
> am rather content with this code, but I am wondering: would it be
> possible to extend the idea of 16-pixel spans to 16x16-pixel squares? It
> would greatly reduce the number of divides! Has anyone tried to
> implement such an algorithm?

> Jacco Bikker.

It probably wouldn't speed things up as much as you expect, since on most
machines the divide is free as long as it's implemented correctly.
While the 16 pixels are being drawn in the integer pipes of the
processor, the next divide is in the FPU pipe and is completed before it
is required; that's the point of doing it in the FPU. It's going to be the
integer work between the divides that is the bottleneck.
--
______________________________________________________________________
|\_____________________________________________________________________\
||  Robin Patenall                           Token Coder NetGamer Soc. |
||  NetGamer Soc   http://www.ee.surrey.ac.uk/Societies/netgamer/      |
||  Homepage       http://www.ee.surrey.ac.uk/Personal/ee41rp/         |
\|_____________________________________________________________________|

### Fast Persp. Texture mapping

> > Gentlemen,

> > Recently, I implemented an algorithm to do perspective texture mapping at
> > a reasonable speed, using 8- or 16-pixel spans, and using the FPU to do
> > the necessary divides while the integer processor draws the pixels. I
> > am rather content with this code, but I am wondering: would it be
> > possible to extend the idea of 16-pixel spans to 16x16-pixel squares? It
> > would greatly reduce the number of divides! Has anyone tried to
> > implement such an algorithm?

> > Jacco Bikker.

> It probably wouldn't speed things up as much as you expect, since on most
> machines the divide is free as long as it's implemented correctly.
> While the 16 pixels are being drawn in the integer pipes of the
> processor, the next divide is in the FPU pipe and is completed before it
> is required; that's the point of doing it in the FPU. It's going to be the
> integer work between the divides that is the bottleneck.

1. You don't exactly get the DIVs for free, not even when you do
integer work in parallel. I'm not sure what the minimum
cycle cost of an FDIV is if you have enough integer work to fill
the wait; does anyone have details on this?
2. It's not possible to do the FDIVs in parallel if you want to use the
MMX features of your processor to speed up 16-bit color. So for those
processors, tiled perspective-mapped rendering could matter a lot.

Greets,
Jacco

### Fast Persp. Texture mapping

> 1. You don't exactly get the DIVs for free, not even when you do
> integer work in parallel. I'm not sure what the minimum
> cycle cost of an FDIV is if you have enough integer work to fill
> the wait; does anyone have details on this?

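When it is overlapped with enough integer code, an FDIV effectively costs
about one cycle; the rest of its latency is hidden behind the integer
work. Hardly a hog.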

Regards,

--
Jon Beltran de Heredia               http://bips.bi.ehu.es/~jon
-                                                             -
I'm using self discipline, see http://www.eiffel.com/discipline

Hi everyone,
Here is my problem. I'm writing a software-rendering 3D engine.
I've implemented perspective mapping and I've run into a serious
problem with triangles that have one vertex with Z<0 and one vertex with
Z>0. In perspective mapping, we linearly interpolate the values of 1/Z. If
z1<0 and z2>0, then (1/z1) < 0 and (1/z2) > 0. So the linear
interpolation of those values makes (1/z) pass through 0. Now let's
invert again and we find Z = (+/-) infinity! What a mistake for a Z that
should be in [z1; z2]!
I hope I've been clear! And I hope you'll be able to help... thanks.
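
One common way around this, offered as a sketch and not as anything from the thread, is to clip the polygon against a near plane z = z_near > 0 in view space before projecting, so that 1/z never changes sign. The `Vertex` layout and the function names below are assumptions. Texture coordinates are interpolated linearly along the edge, which is valid because this happens before the perspective divide.

```c
#include <assert.h>

/* Hypothetical view-space vertex: position plus texture coordinates. */
typedef struct { float x, y, z, u, v; } Vertex;

/* Point where edge a-b crosses the plane z = z_near. The two vertices
   must lie on opposite sides of the plane, so a->z != b->z. */
static Vertex intersect_near(const Vertex *a, const Vertex *b, float z_near)
{
    float t = (z_near - a->z) / (b->z - a->z);   /* 0 at a, 1 at b */
    Vertex r;
    r.x = a->x + t * (b->x - a->x);
    r.y = a->y + t * (b->y - a->y);
    r.z = z_near;
    r.u = a->u + t * (b->u - a->u);
    r.v = a->v + t * (b->v - a->v);
    return r;
}

/* Clips a convex polygon of n vertices against z >= z_near
   (one pass of Sutherland-Hodgman clipping). out must have room for
   n + 1 vertices. Returns the number of vertices written, which is 0
   if the polygon is entirely behind the plane. */
int clip_polygon_near(const Vertex *in, int n, float z_near, Vertex *out)
{
    assert(in && out && n >= 3 && z_near > 0.0f);

    int count = 0;
    for (int i = 0; i < n; ++i) {
        const Vertex *cur = &in[i];
        const Vertex *nxt = &in[(i + 1) % n];
        int cur_in = cur->z >= z_near;
        int nxt_in = nxt->z >= z_near;

        if (cur_in)
            out[count++] = *cur;                 /* keep visible vertex */
        if (cur_in != nxt_in)
            out[count++] = intersect_near(cur, nxt, z_near);
    }
    return count;
}
```

After clipping, every remaining vertex has z >= z_near, so the interpolated 1/z stays within [1/z_max, 1/z_near] and the inversion is well behaved.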