|
3D Engine 5 - Z-buffer and texture mapping
Hallelujah ! Believe in god ? Praise him for next part of 3d-engine serial. I'm here again
with new topic (and lots of jams, but i'm not here to share these ...) Today i'd like to tell
you how does Z-buffer and texture mapping work.
Now it's time to thank to Mats Byggmastar a.k.a. MRI / Doomsday for his didactic (;-))
document called "Fatmap" (Fast affine texture mapping), you can find original on the internet.
It's much more optimalisated, but unfortunately it's not perspective correct (some of you don't
know what that is, but you soon will - so get to it !)
I think the first one who did realtime texturemapping was >doommaker< mr. Carmack,
who could optimalise it such way so we could play wolfenstein or doom even at these slow computers.
Maybe you're thinking about connection of textures and z-buffer - it's simple : interpolation.
Z-buffer is memory of same size as screen, but inside are z-values of pixels. When you're about
to draw pixel, you compare it's z-value with z-value of previously drawn pixel and when it's
nearer, you overdraw old pixel (and update z-value). It's quite simple, but quite slow and since
z-buffer requires floats also quite big in memory (but both problems was solved with 3d-cards)
Big advantage of z-buffer is perfect object intersection :
With invention of z-buffer also begins our everlasting fight with fillrate. Fillrate is amount
of pixels you're able to draw in a second. With z-buffer you have to compute z-value of each
pixel and compare it with the older. That's also quite big amount of data to be transfered.
Now, we need some function to tell what the z-value in our particular pixel is. We will use
the plane equation again. What plane ? Our plane is defined by three points of face and their
coordinates are perspective correct x and y and linear (just after transformation) z.
That's also connection with texturemapping. We will have two more planes for u and v coordinate,
that tells us position of pixel in texture ! Now we will look at the plane equation :
We can tell how much Z is :
Now we can see that some numbers can be precomputed :
Where dzdx is delta-z-per-delta-x, dzdy is delta-z-per-delta-y and begz is z-coordinate in
point x = 0, y = 0 (upper left screen corner). Now it's simple - for every next line
add dzdy, for every pixel on that line add dzdx and we have our z.
To find begz shouldn't be hard :
So we have our first c-function today :
struct Vertex {
float x, y, z;
float u, v;
float illum;
Vector normal;
};
struct Polygon {
int n_vertex_num;
Vertex vertex[7];
};
float didx, didy, begi;
int Compute_Interpolation_Consts(POLYGON *p)
{
float x[2], y[2], l[2];
float a, b, c;
x[0] = p->vtx[1].x - p->vtx[0].x;
x[1] = p->vtx[2].x - p->vtx[0].x;
y[0] = p->vtx[1].y - p->vtx[0].y;
y[1] = p->vtx[2].y - p->vtx[0].y;
l[0] = (float )p->vtx[1].illum - (float )p->vtx[0].illum;
l[1] = (float )p->vtx[2].illum - (float )p->vtx[0].illum;
a = y[0] * l[1] - y[1] * l[0];
b = l[0] * x[1] - l[1] * x[0];
c = x[0] * y[1] - x[1] * y[0];
if (c == 0)
return 0;
didx = - a / c;
didy = - b / c;
begi = (float )p->vtx[0].illum - p->vtx[0].x * didx - p->vtx[0].y * didy;
return 1;
}
int DrawPolygon(Polygon *p, unsigned __int32 *video, unsigned __int32 color)
{
int start_y, left_x, right_x;
int didx_fp, k, r_k;
float illum;
int l, c;
if (!Create_Edges(p))
return 0;
if (!Compute_Interpolation_Consts(p))
return 0;
start_y = p_right_edge->y;
left_x = p_left_edge->x;
right_x = p_right_edge->x;
video += n_Width * start_y;
illum = begi + didy * start_y;
didx_fp = (int )(didx * 65536);
while (1) {
l = left_x >> 16;
c = right_x >> 16;
k = (int )(illum * 65536) + c * didx_fp;
for (; c < l; c ++) {
r_k = k >> 16;
if (r_k > 255)
r_k = 255;
if (r_k < 0)
r_k = 0;
video[c] = ((((color & 0xff0000) * r_k) & 0xff000000) +
(((color & 0x00ff00) * r_k) & 0x00ff0000) +
(((color & 0x0000ff) * r_k) & 0x0000ff00)) >> 8;
k += didx_fp;
}
video += n_Width;
illum += didy;
if (-- p_left_edge->num == 0) {
if (p_left_edge -- == left_edge_buff)
return 1;
left_x = p_left_edge->x;
} else
left_x += p_left_edge->dxdy;
if (-- p_right_edge->num == 0) {
p_right_edge ++;
right_x = p_right_edge->x;
} else
right_x += p_right_edge->dxdy;
}
return 1;
}
|
It's not going to be this simple, because of perspective distortion. You can see that
texturemapping (ie. any surface-interpolated value) won't be linear :
You can see that nearer quads are bigger than the far-off ones ...
But i can't make you down. So i'll put here smooth shading example. It's practically the
same like before, but we simply average neighbour faces normals to get vertex pseudonormals,
that are used for lighting. So we have lighting value per vertex, not per face like before.
(note we must include illumination to edge-splitting function in clipping pyramid !)
You can see function ComputeVertexNormals() in load3ds.cpp module.
The example is running with directx 7, the only difference is that you must link ddraw.lib
and dxguid.lib. (and speed, of course) :
Smooth engine - Font engine + Antialiassing
Hope you like it :-), i had to do some work ... It's still using painter's
algorithm, simple font-engine is included and it's using something-like quincunx for
image antialiassing. But now we move on ...
Perspective correct mapping
I already told you mapping won't be linear. But how to make it linear ? We will use the fact
that z itself is not linear, but 1 / z is linear. When interpolating
more coordinates, we will have to multiply them by 1 / z. But that means we will have to
recompute them back when drawing ... well, we wouldn't have to do it with z, we would just
inverted comparation sign when drawing (far objects will have bigger z than near, but smaller 1 / z !)
But when texturemapping we would have to do two divisions per pixel ! (assume 2d texture) Allright,
if you wan't write high qulity rasterizer you will have to. But for realtime maniacs like me
(and some of you) we have a simple optimalization. (don't say Carmack, don't say Carmack ... well it was propably him ...) We will do division each 16-th pixel and
between them just linearly interpolate with certain error, that will be bigger on short and dip
surfaces :
We won't normally recognize that error, only when looking parallel with some surface when close to it.
(but collision detection can fix some of that case simply by don't letting camera to be too close)
Now look at some code :
int n_Width, n_Height;
__int32 *z_buffer;
float dzdx, dzdy, begz;
int Compute_Interpolation_Consts(Polygon *p_poly)
{
float x[2], y[2], z[2];
float a, b, c;
x[0] = p_poly->vertex[1].x - p_poly->vertex[0].x;
x[1] = p_poly->vertex[2].x - p_poly->vertex[0].x;
y[0] = p_poly->vertex[1].y - p_poly->vertex[0].y;
y[1] = p_poly->vertex[2].y - p_poly->vertex[0].y;
z[0] = (1 / p_poly->vertex[1].z) - (1 / p_poly->vertex[0].z);
z[1] = (1 / p_poly->vertex[2].z) - (1 / p_poly->vertex[0].z);
a = y[0] * z[1] - y[1] * z[0];
b = z[0] * x[1] - z[1] * x[0];
c = x[0] * y[1] - x[1] * y[0];
if (c == 0)
return 0;
dzdx = - a / c;
dzdy = - b / c;
begz = (1 / p_poly->vertex[0].z) -
p_poly->vertex[0].x * dzdx -
p_poly->vertex[0].y * dzdy;
return 1;
}
void Interpolate_Segment(int z1, int dz, unsigned __int32 *video,
unsigned __int32 *v2, __int32 *z_buff, unsigned __int32 color)
{
do {
if ((unsigned )(*z_buff) > (unsigned )z1){
*video = color;
*z_buff = z1;
}
video ++;
z_buff ++;
z1 += dz;
} while(video < v2);
}
void Draw_Segment(int c, int l, float z, __int32 *z_buff,
unsigned __int32 *video, unsigned __int32 color)
{
int dz, z1, z2;
z1 = (int )(0x10000 / z);
while(c >= 16) {
z += dzdx * 16;
z2 = (int )(0x10000 / z);
dz = (z2 - z1) / 16;
Interpolate_Segment(z1, dz, video, video + 16, z_buff, color);
c -= 16;
z_buff += 16;
video += 16;
z1 = z2;
}
if (c > 0) { z += dzdx * c;
z2 = (int )(0x10000 / z);
dz = (z2 - z1) / 16;
Interpolate_Segment(z1, dz, video, video + c, z_buff, color);
}
}
int DrawPolygon(Polygon *p, unsigned __int32 *video, unsigned __int32 color)
{
int start_y, left_x, right_x;
__int32 *z_buff;
int l, c;
float z;
if (!Create_Edges(p))
return 0;
if (!Compute_Interpolation_Consts(p))
return 0;
start_y = p_right_edge->y;
left_x = p_left_edge->x;
right_x = p_right_edge->x;
video += n_Width * start_y;
z_buff = &z_buffer[n_Width * start_y];
z = begz + dzdy * start_y;
while(1) {
c = left_x >> 16;
l = right_x >> 16;
c -= l ++;
if (c > 0)
Draw_Segment(c, l, z + dzdx * l, z_buff + l, video + l, color);
video += n_Width;
z_buff += n_Width;
z += dzdy;
if (-- p_left_edge->num == 0) {
if (p_left_edge -- == left_edge_buff)
return 1;
left_x = p_left_edge->x;
} else
left_x += p_left_edge->dxdy;
if (-- p_right_edge->num == 0) {
p_right_edge ++;
right_x = p_right_edge->x;
} else
right_x += p_right_edge->dxdy;
}
return 1;
}
void Clean_ZBuffer()
{
int i, m;
m = Width * Height;
for (i = 0; i > m; i ++)
z_buffer[i] = 0xffffffff;
}
|
Have a look at the picture :
Z_Buffer + Flat shading
And we have engine with z-buffering. Works pretty cool and accurate (and slow).
Major disadvantage of z-buffer is overdraw. Some pixels could (and will be) overdrawn
even arround 10 times ! So in the next part we will introduce the BSP tree, used also
to precompute visibility. Maybe you ask - well, i have 3D card, i don't need that. But
when you do some optimalisations, you will be able to afford for example more detail
actor models ...
But now the textures ... Maybe you noticed the font engine contains function for bitmap loading.
I will put it here just to be complete :
#include <windows.h>
#include <stdio.h>
struct BMP {
unsigned __int32 *bytes; int width, height; };
BMP *Load_TrueColorBMP(char *path)
{
BITMAPFILEHEADER bmfh;
BITMAPINFOHEADER bmih;
int Width, Height;
char *lpbBuf;
int BuffLen;
FILE *BMPfp;
BMP *tmp;
int i, j;
if ((BMPfp = fopen(path, "rb")) == NULL)
return NULL;
fread(&bmfh, sizeof (bmfh), 1, BMPfp);
fread(&bmih, sizeof (bmih), 1, BMPfp);
if (bmfh.bfType != ('B'|('M' << 8)))
return NULL;
if (bmih.biBitCount == 24 && bmih.biCompression == BI_RGB) {
if ((tmp = (BMP*)malloc(sizeof (BMP))) == NULL)
return NULL;
Width = bmih.biWidth;
Height = bmih.biHeight;
tmp->n_Width = Width;
tmp->n_Height = Height;
if ((tmp->bytes = (unsigned __int32 *)malloc(Width * Height * 4)) == NULL)
return NULL;
BuffLen = ((Width * 3 + 3) >> 2) << 2;
if ((lpbBuf = (char *)malloc(BuffLen)) == NULL)
return NULL;
for (i = Height - 1; i >= 0; i --) {
if ((j = fread(lpbBuf, 1, BuffLen, BMPfp)) != BuffLen)
return NULL;
for (j = 0; j < Width; j ++) {
tmp->bytes[j + Width * i] = RGB(lpbBuf[j * 3],
lpbBuf[j * 3 + 1], lpbBuf[j * 3 + 2]);
}
}
} else
return NULL;
free(lpbBuf);
return tmp;
}
|
I won't go deep ... first read headers, check if it's really bitmap, if it's 24bpp and
uncompressed and then alloc memory and read pixels.
So, you should know how to make perspective non-correct mapping. But how to make perspective
correct ? We will certeinly want to use linear interpolation, because it's fast. So we will
make from non-linear values linear values. How ? By multiplying 1 / z :
int Compute_Interpolation_Consts(Polygon *p_poly)
{
float x[2], y[2], z[2], u[2], v[2];
float a, b, c;
x[0] = p_poly->vertex[1].x - p_poly->vertex[0].x;
x[1] = p_poly->vertex[2].x - p_poly->vertex[0].x;
y[0] = p_poly->vertex[1].y - p_poly->vertex[0].y;
y[1] = p_poly->vertex[2].y - p_poly->vertex[0].y;
z[0] = (1 / p_poly->vertex[1].z) - (1 / p_poly->vertex[0].z);
z[1] = (1 / p_poly->vertex[2].z) - (1 / p_poly->vertex[0].z);
u[0] = 1 / p_poly->vertex[1].z * p_poly->vertex[1].u -
1 / p_poly->vertex[0].z * p_poly->vertex[0].u;
u[1] = 1 / p_poly->vertex[2].z * p_poly->vertex[2].u -
1 / p_poly->vertex[0].z * p_poly->vertex[0].u;
v[0] = 1 / p_poly->vertex[1].z * p_poly->vertex[1].v -
1 / p_poly->vertex[0].z * p_poly->vertex[0].v;
v[1] = 1 / p_poly->vertex[2].z * p_poly->vertex[2].v -
1 / p_poly->vertex[0].z * p_poly->vertex[0].v;
a = y[0] * z[1] - y[1] * z[0];
b = z[0] * x[1] - z[1] * x[0];
c = x[0] * y[1] - x[1] * y[0];
if (c == 0)
return 0;
dzdx = - a / c;
dzdy = - b / c;
begz = (1 / p_poly->vertex[0].z) -
p_poly->vertex[0].x * dzdx -
p_poly->vertex[0].y * dzdy;
a = y[0] * u[1] - y[1] * u[0];
b = u[0] * x[1] - u[1] * x[0];
dudx = - a / c;
dudy = - b / c;
begu = (1 / p_poly->vertex[0].z * p_poly->vertex[0].u) -
p_poly->vertex[0].x * dudx - p_poly->vertex[0].y * dudy;
a = y[0] * v[1] - y[1] * v[0];
b = v[0] * x[1] - v[1] * x[0];
dvdx = - a / c;
dvdy = - b / c;
begv = (1 / p_poly->vertex[0].z * p_poly->vertex[0].v) -
p_poly->vertex[0].x * dvdx - p_poly->vertex[0].y * dvdy;
return 1;
}
|
Next we use almost the same source for drawing polygons as for z-buffer
(but interpolate u-s and v-s moreover), but we will need to modify routine
for drawing segments :
int DrawPolygon(Polygon *p, unsigned __int32 *video)
{
int start_y, left_x, right_x;
__int32 *z_buff;
float z, u, v;
int l, c;
if (!Create_Edges(p))
return 0;
if (!Compute_Interpolation_Consts(p))
return 0;
start_y = p_right_edge->y;
left_x = p_left_edge->x;
right_x = p_right_edge->x;
video += n_Width * start_y;
z_buff = &z_buffer[n_Width * start_y];
z = begz + dzdy * start_y;
u = begu + dudy * start_y;
v = begv + dvdy * start_y;
while(1) {
c = left_x > 16;
l = right_x > 16;
c -= l ++;
if (c &ht; 0) {
Draw_Segment(c, l, u + dudx * l, v + dvdx * l,
z + dzdx * l, z_buff + l, video + l);
}
video += n_Width;
z_buff += n_Width;
z += dzdy;
u += dudy;
v += dvdy;
if (-- p_left_edge->num == 0) {
if (p_left_edge -- == left_edge_buff)
return 1;
left_x = p_left_edge->x;
} else
left_x += p_left_edge->dxdy;
if (-- p_right_edge->num == 0) {
p_right_edge ++;
right_x = p_right_edge->x;
} else
right_x += p_right_edge->dxdy;
}
return 1;
}
void Draw_Segment(int c, int l, float u, float v, float z,
__int32 *z_buff, unsigned __int32 *video)
{
float dz, z1, z2;
int du, u1, u2;
int dv, v1, v2;
z1 = (0x10000 / z);
u1 = (int )(u * z1);
v1 = (int )(v * z1);
while(c >= 16) {
z += dzdx * 16;
u += dudx * 16;
v += dvdx * 16;
z2 = (0x10000 / z);
u2 = (int )(u * z2);
v2 = (int )(v * z2);
dz = (z2 - z1) / 16;
du = (u2 - u1) / 16;
dv = (v2 - v1) / 16;
Interpolate_Segment(u1, v1, z1, du, dv, dz, video, video + 16, z_buff);
c -= 16;
z_buff += 16;
video += 16;
z1 = z2;
u1 = u2;
v1 = v2;
}
if (c > 0) { z += dzdx * c;
u += dudx * c;
v += dvdx * c; z2 = (0x10000 / z);
u2 = (int )(u * z2);
v2 = (int )(v * z2);
dz = (z2 - z1) / c;
du = (u2 - u1) / c;
dv = (v2 - v1) / c;
Interpolate_Segment(u1, v1, z1, du, dv, dz, video, video + c, z_buff);
}
}
|
So, Interpolate_Segment() gets coordinates of texture and z in 16:16 fixed point.
We could normally multiply them to get the index of pixel, but there is simple trick how to
make it faster. You propably know texture has to have size of power of two. So the multiplying
can be replaced with bit shift and modulo division (because the texture can repeat multiple
times on a single surface) by logical ands :
void Interpolate_Segment(int u1, int v1, int z1,
int du, int dv, int dz, unsigned __int32 *video,
unsigned __int32 *v2, __int32 *z_buff)
{
do {
if ((unsigned )(*z_buff) > (unsigned )z1) {
*video = _texture[((u1 >> 16) & 0xff) + ((v1 >> 8) & 0xff00)];
*z_buff = z1;
}
video ++;
z_buff ++;
z1 += dz;
u1 += du;
v1 += dv;
} while(video < v2);
}
|
So, that's our oldloop from z-buffer code, but instead of color the texture is drawn.
What the code means ? " u1 >> 16" is conversion of U (x-coordinate of texture) from
16:16fp to int and " & 0xff" means texture will be 256 pixels wide. " v1 >> 8" means
conversion of V (y-coord) from 16:16fp to int and at the same time multiply by 256 to get
the exact index in texture. The " & 0xff00" means texture will be 256 pixels high,
but V is already multiplied by 256 ...
It shouldn't be hard to replace these numbers by variables and support textures of variable size,
not only 256 per 256. So first " >> 16" will be always there, " & 0xff" will be replaced
with " & (tex_width - 1)". " >> 8" will be replaced with
" >> (16 - log2(tex_height))"
and final " & 0xff00" replace with " & ((tex_height - 1) * tex_width)".
Just one advice - don't use log(tex_height) / log(2) to obtain logarithm with base of 2,
it's sometimes inaccurate and texture will be displayed in half resolution. (the result is 7.9999, but
when rounding, the result will be 7 ... - and using round() seems not good to me) Write your
own log 2() based on some loop ...
So stop reading now, get your source and get outa here (it's gonna blow ! ;-)) :
Z_Buffer + Texturemapping
Valid HTML 4.01
|