Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

UraniumAnchor posted:

I'm trying to work out how to draw terrain as a crossover pattern of triangles, like so:

code:

+--+--+--+--+-- etc
| /|\ | /|\ |
|/ | \|/ | \|
+--+--+--+--+
|\ | /|\ | /|
| \|/ | \|/ |
+--+--+--+--+
|
etc

As one long triangle strip. I think I have a basic method using degenerate triangles, but I'm curious if the extra triangles get optimized out by modern GPUs.

I'm also wondering if there's a better method, I worked out that I'd have 6 extra triangles for every four tiles, but my vertices would go from 24 to 16. (Plus a bit more per row, but that's probably set.)

Are the extra degenerate triangles generally going to be less costly than the extra vertices from having the triangles be specified as TRIANGLES rather than TRIANGLE_STRIP? Let's assume the terrain mesh is fairly beefy, somewhere on the order of a couple thousand height points square.

Is there any reason you have to use that pattern, as opposed to something like

code:
+--+--+--+--+-- etc
| /| /| /| /|
|/ |/ |/ |/ |
+--+--+--+--+--
| /| /| /| /|
|/ |/ |/ |/ |
+--+--+--+--+--
|
etc
?

UraniumAnchor
May 21, 2006

Not a walrus.
I'm pretty sure somebody else in the thread mentioned that the crossover pattern was better, but I can't find the post now. (Maybe it was in the gamedev thread)

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

UraniumAnchor posted:

... As one long triangle strip.
I have a related question (that might also be partly an answer to your question) though - are you using indexed primitives? Since you're using strips you must be able to reuse the vertices, you can't be wanting to change normals or textures at the points, and since each vertex is used in four triangles you'd save a lot more points with indexing than you would with doing a strip (though I believe you could do both if you really wanted, using the strip would mean fewer indices).

I once did a terrain map using Hubis's pattern and it tended to look kind of stripey, I'm guessing the crossover pattern makes that happen less?

roomforthetuna fucked around with this message at 23:37 on May 4, 2010

UraniumAnchor
May 21, 2006

Not a walrus.
Yes I'm using indices. And yeah, the stripe thing is what I remember seeing in the example, if I could find it.

Edit: Discussion started here.

UraniumAnchor fucked around with this message at 00:05 on May 5, 2010

haveblue
Aug 15, 2005



Toilet Rascal
A tristrip saves just under 2 vertices per triangle over raw triangles, so one or two degenerates on a row of several thousand visible is practically no overhead at all.

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!

Hubis posted:

Is there any reason you have to use that pattern, as opposed to something like
Using anything other than crossover gives you "diamond" or "stripe" artifacting caused by diagonals running parallel to their neighbors and verts having mixed 45/90 degree angles with other verts instead of being consistent.
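
For reference, a minimal sketch of how the crossover pattern could be built as a plain indexed triangle list. It assumes the vertices are stored row-major with (cols + 1) per row; `crossoverIndices` and its parameters are made-up names for illustration, not anything from the posts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a triangle-list index buffer for a grid of cols x rows tiles.
// Vertices are assumed row-major, (cols + 1) per row.
// The diagonal flips whenever (x + y) is odd, which gives the crossover
// pattern instead of diagonals that all run parallel.
std::vector<std::uint32_t> crossoverIndices(std::uint32_t cols, std::uint32_t rows) {
    assert(cols > 0 && rows > 0);
    const std::uint32_t stride = cols + 1;  // vertices per row
    std::vector<std::uint32_t> out;
    out.reserve(static_cast<std::size_t>(cols) * rows * 6);
    for (std::uint32_t y = 0; y < rows; ++y) {
        for (std::uint32_t x = 0; x < cols; ++x) {
            const std::uint32_t tl = y * stride + x;  // top-left
            const std::uint32_t tr = tl + 1;          // top-right
            const std::uint32_t bl = tl + stride;     // bottom-left
            const std::uint32_t br = bl + 1;          // bottom-right
            if ((x + y) % 2 == 0) {
                // "/" diagonal, bottom-left to top-right
                out.insert(out.end(), {tl, bl, tr, tr, bl, br});
            } else {
                // "\" diagonal, top-left to bottom-right
                out.insert(out.end(), {tl, bl, br, tl, br, tr});
            }
        }
    }
    return out;
}
```

All four triangle cases are emitted with the same winding, so backface culling behaves consistently. A single row of four tiles produces 24 indices.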

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

haveblue posted:

A tristrip saves just under 2 vertices per triangle over raw triangles, so one or two degenerates on a row of several thousand visible is practically no overhead at all.
You're missing that he said there's 6 degenerates for every 4 tiles, because of the pattern (I think it's something like every couple of triangles one has to go anticlockwise or be a line so is wasted, to make that pattern into a working strip). So it's not "one or two degenerates over a row of several thousand", it's "a few hundred degenerates over a row of several thousand".

I don't know enough about hardware to say reliably, but my guess would be that despite the saving on indices, it would be a loss on speed to render that pattern as a strip. Checking an extra triangle for non-render conditions must surely be slower than just getting two already-transformed points from indices rather than using the most recent two already-transformed points. With indices is a triangle strip even faster at all, or is it just a saving on size?

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
I doubt it matters either way. Memory bandwidth and usage from vertex arrays are microscopic compared to framebuffers and textures, and the vertex transformation cache removes any redundancy advantage.

UraniumAnchor
May 21, 2006

Not a walrus.

roomforthetuna posted:

You're missing that he said there's 6 degenerates for every 4 tiles, because of the pattern (I think it's something like every couple of triangles one has to go anticlockwise or be a line so is wasted, to make that pattern into a working strip). So it's not "one or two degenerates over a row of several thousand", it's "a few hundred degenerates over a row of several thousand".

code:
0  1  2  3  4
+--+--+--+--+
| /|\ | /|\ |
|/ | \|/ | \|
+--+--+--+--+
5  6  7  8  9
In order to get every triangle to render properly as a strip, you need something like this: (C is CCW, c is CW)

code:
 C0    c1    C2    c3    C4    c5    C6    c7
0,5,1|5,1,6|1,6,7|7,1,2|2,7,3|7,3,8|3,8,9|9,3,4
So this is what I figured out is an index buffer that works, maybe somebody can point out why it's not 'optimal' given these constraints: (d means degenerate)

code:
  C   C   C   C   C   C   C
 /0\ /2\ /d\ /d\ /4\ /6\ /d\
0,5,1,6,7,7,1,2,2,7,3,8,9,9,3,4
   \1/ \d/ \3/ \d/ \5/ \d/ \7/  
    c   c   c   c   c   c   c
Pattern seems to be RRRddRdd, with each pair of quads needing 2 degenerates and then 2 more to attach to the next pair if there is one, or 6(?) more to start a new row.

That seems remarkably high. In this particular simulation once the indices are created they never move even if the height of a particular vertex is fluid.

Maybe I'll just try it both ways and see which is better, the only real difference is in the index construction and the type flag.
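
One way to sanity-check a strip like the one above is to expand it on the CPU and count what comes out. The sketch below is not from the posts: `expandStrip` and `StripStats` are hypothetical names, and it assumes the usual strip rule that every second triangle has its winding flipped.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct StripStats {
    std::vector<std::array<std::uint32_t, 3>> triangles;  // non-degenerate only
    std::size_t degenerates = 0;
};

// Expand a triangle strip into individual triangles.
// Odd-numbered strip triangles get their first two indices swapped so that
// every emitted triangle shares the winding of triangle 0.
// A triangle that repeats an index has zero area; it is counted, not emitted.
StripStats expandStrip(const std::vector<std::uint32_t>& strip) {
    StripStats stats;
    for (std::size_t i = 0; i + 2 < strip.size(); ++i) {
        std::uint32_t a = strip[i];
        std::uint32_t b = strip[i + 1];
        std::uint32_t c = strip[i + 2];
        if (a == b || b == c || a == c) {
            ++stats.degenerates;
            continue;
        }
        if (i % 2 == 1) {
            std::swap(a, b);
        }
        stats.triangles.push_back({a, b, c});
    }
    return stats;
}
```

Fed the 16-index strip 0,5,1,6,7,7,1,2,2,7,3,8,9,9,3,4, this reports 8 real triangles and 6 degenerates, which agrees with the RRRddRdd pattern.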

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

UraniumAnchor posted:

Maybe I'll just try it both ways and see which is better, the only real difference is in the index construction and the type flag.
That's funny, I typed that as a suggestion for a way to get a good answer, then wondered whether you didn't want to do that because the results might differ significantly on different hardware so a theory answer might be better, or it might be a lot harder than I envisioned, and also thought suggesting it seemed really patronising, so in the end I closed the tab without posting.

Let us know how it benchmarks out, I'd certainly be interested. And if you make it benchmark with a standalone-ish binary I can try it on a DX10-capable NVidia laptop chip and/or a crappy Intel on-board chipset to see if those results differ from whatever you have.

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
If you're doing it on an NVIDIA card then use NV_primitive_restart.

If you're doing it on anything else you should just use GL_TRIANGLES.

UraniumAnchor
May 21, 2006

Not a walrus.
I'm doing it on an Nvidia card in OpenGL, but I'm curious how it specs out on a wider range of hardware. This is also doubling as a CUDA experiment eventually.

HauntedRobot
Jun 22, 2002

an excellent mod
a simple map to my heart
now give me tilt shift
You could do

code:
A--C--+--+--+
|*/|\ | /|\ |
|/*|*\|/ | \|
B--D--E--+--+
|\ |*/|\ | /|
| \|/*| \|/ |
+--F--G--I--+
| /|\*|*/|\ |
|/ | \|/*|*\|
+--+--H--J--K
|\ | /|\ |*/|
| \|/ | \|/*|
+--+--+--L--M
ABCDEFGHIJKLM to get the * triangles, and then you just have to deal with BDF, EGI, HJL, etc. Any reason you'd not use fans though?

HauntedRobot fucked around with this message at 10:11 on May 5, 2010

UraniumAnchor
May 21, 2006

Not a walrus.

HauntedRobot posted:


ABCDEFGHIJKLM to get the * triangles, and then you just have to deal with BDF, EGI, HJL, etc. Any reason you'd not use fans though?

Well supposedly D3D10 doesn't support them, for one thing. I haven't done enough D3D in general to know.

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
Trifans probably got removed because they wind up being really inflexible on high poly count models if not outright worthless. Without primitive restart, it's just asking for trouble.

Incidentally, D3D10 does natively support primitive restart, which it calls "strip-cut indexes":

http://msdn.microsoft.com/en-us/library/bb205133%28VS.85%29.aspx

haveblue
Aug 15, 2005



Toilet Rascal
The only thing trifans are good for is drawing 2D circles, as far as I know.

Rav
Nov 5, 2000

haveblue posted:

The only thing trifans are good for is drawing 2D circles, as far as I know.

Trifans were mostly useful for software clipping (they were used in most PS2 engines behind the scenes). I can't think of any other place I've seen them used to great effect in the past eight years.

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

Rav posted:

Trifans were mostly useful for software clipping (they were used in most PS2 engines behind the scenes). I can't think of any other place I've seen them used to great effect in the past eight years.
A trifan is basically the sequence of points around the perimeter of any convex 2D shape - they're kind of useful in that respect for collision-detection, or as haveblue said, for circles (it's fun that intuitively you'd do a trifan from the center point, around the circumference, but it'd save you one point to just go clockwise around the circumference making long thin slivers). I've never seen a trifan used for anything that isn't flat, though they conceivably could be used to render a foldup fan I suppose.
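
To make the center-versus-circumference comparison concrete, here is a small sketch that expands a fan into triangles. `expandFan` is a hypothetical helper and the octagon numbering (rim points 0 to 7, center point 8) is only an example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand a triangle fan: every triangle shares the first index, and each
// consecutive pair of the remaining indices supplies the other two corners.
std::vector<std::array<std::uint32_t, 3>> expandFan(const std::vector<std::uint32_t>& fan) {
    std::vector<std::array<std::uint32_t, 3>> triangles;
    for (std::size_t i = 1; i + 1 < fan.size(); ++i) {
        triangles.push_back({fan[0], fan[i], fan[i + 1]});
    }
    return triangles;
}
```

For an octagon, the center fan 8,0,1,2,3,4,5,6,7,0 needs nine distinct points and gives eight triangles, while the rim-only fan 0,1,2,3,4,5,6,7 needs eight points and gives six longer, thinner triangles covering the same area.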

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
The usefulness of trifans dates back to when the number of verts you passed was a major issue and trifans let you group together certain sets of triangles that tristrips couldn't.

The gradual shift towards index buffers, draw call minimization, and vertex transformation caches has made them mostly obsolete, though. Tristrips would be obsolete too, but primitive restart SORT OF saves them.

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

OneEightHundred posted:

Using anything other than crossover gives you "diamond" or "stripe" artifacting caused by diagonals running parallel to their neighbors and verts having mixed 45/90 degree angles with other verts instead of being consistent.

Oh! Right, that conversation.

Well, anyways, if you're using indexed triangles and issuing them in a fairly local order (such as adjacent triangles across a strip), you're essentially going to be getting no benefit from tri-strips.

Why? Because the hardware caches the vertex shader output for a specific vertex index, so that when you request another triangle, you don't have to transform its vertices if their index is already in the cache. The size of the post-transform cache varies with the number of attributes you're passing down, but in general it's big enough that it saves a lot of work. Thus, a tri-strip/fan will only benefit you if your index buffers are eating up a lot of memory/bandwidth. If you're really rendering hundreds of thousands of triangles per frame, you'd get better results dicing up your scene and frustum culling.

So TL;DR -- don't worry about strips/fans as a first-step optimization, use indexed primitives, and issue your triangles in a local pattern.
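
A rough way to see the caching effect is to simulate it on an index list. This is only a sketch under stated assumptions: real hardware does not necessarily use a FIFO policy, the cache size is a free parameter here, and `countCacheMisses` is an invented name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Count how many vertex transforms an index stream would need, assuming a
// FIFO post-transform cache with room for cacheSize vertices.
// Every miss stands for one vertex shader run; hits are free.
std::size_t countCacheMisses(const std::vector<std::uint32_t>& indices, std::size_t cacheSize) {
    assert(cacheSize > 0);
    std::deque<std::uint32_t> cache;
    std::size_t misses = 0;
    for (std::uint32_t index : indices) {
        if (std::find(cache.begin(), cache.end(), index) != cache.end()) {
            continue;  // already transformed and still cached
        }
        ++misses;
        cache.push_back(index);
        if (cache.size() > cacheSize) {
            cache.pop_front();  // evict the oldest entry
        }
    }
    return misses;
}
```

With a cache large enough to hold the working set, a locally ordered triangle list transforms each vertex once, which is the same amount of vertex work a strip would do.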

Rav
Nov 5, 2000

roomforthetuna posted:

A trifan is basically the sequence of points around the perimeter of any convex 2D shape - they're kind of useful in that respect for collision-detection, or as haveblue said, for circles (it's fun that intuitively you'd do a trifan from the center point, around the circumference, but it'd save you one point to just go clockwise around the circumference making long thin slivers). I've never seen a trifan used for anything that isn't flat, though they conceivably could be used to render a foldup fan I suppose.

Maybe for 2D, but I was answering the question with regard to 3D graphics engines, of which I have never seen one that used them for reasons other than software screen clipping (as I said, all PS2 engines) or attribute caching.

I'm not talking about pre or post xform vert caches (as has been mentioned, index buffers fix a lot of that). But some somewhat modern GPUs have a VERY tiny cache to pass attributes from the post-xform cache to the pixel shading processing unit. In that case, triangle fans can help for some types of objects in some situations.

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!

Rav posted:

Maybe for 2D, but I was answering the question with regard to 3D graphics engines, of which I have never seen one that used them for reasons other than software screen clipping (as I said, all PS2 engines) or attribute caching.
Early hardware-accelerated engines used them a lot because (as mentioned) it reduced the number of verts you had to throw at the API. Index buffers and transform caches remove that advantage though.

quote:

But some somewhat modern GPUs have a VERY tiny cache to pass attributes from the post-xform cache to the pixel shading processing unit. In that case, triangle fans can help for some types of objects in some situations.
No, they won't, because if you're drawing a tri fan then (just like a tri strip) two of the verts you're drawing were on the last triangle. There is no GPU that won't have that still in cache.

hey mom its 420
May 12, 2007

I import a 1024 x 1024 heightmap to generate a terrain in OpenGL. Anyway, once I display it, my computer comes to a crawl. Is 1024 x 1024 just too much to display at once or are there some optimizations that I should be doing? For now I just go over the vertices and draw triangles. Just asking generally, but I can provide more info though.

haveblue
Aug 15, 2005



Toilet Rascal

Bonus posted:

I import a 1024 x 1024 heightmap to generate a terrain in OpenGL. Anyway, once I display it, my computer comes to a crawl. Is 1024 x 1024 just too much to display at once or are there some optimizations that I should be doing? For now I just go over the vertices and draw triangles. Just asking generally, but I can provide more info though.

What method of submitting geometry are you using? glBegin/glVertex/glEnd is much too slow for that workload, it should be in a vertex array or VBO. If you're already doing that, then what 3D hardware do you have? That might strain a low-end integrated chip like an Intel GMA but it shouldn't make a real recent GPU break a sweat.

hey mom its 420
May 12, 2007

I'm just using glBegin/glVertex/glEnd for each triangle. Does using a vertex array or VBO really help that much? I have an 8800 GT so the card itself shouldn't be a problem. I'll try using vertex arrays and see how it goes.

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
You should really get away from glBegin/glEnd, they suck horribly and should have been purged from the API in the D3D8 days.

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

Bonus posted:

I'm just using glBegin/glVertex/glEnd for each triangle. Does using a vertex array or VBO really help that much? I have an 8800 GT so the card itself shouldn't be a problem. I'll try using vertex arrays and see how it goes.

Yeah, it helps a lot if you're at all CPU-bound (which you will be if you're rendering 2 million triangles that way).

What you'll want to do is dice it up into 256x256 tiles when you load it (so each tile has indices that will fit in a 16-bit int) and generate an indexed vertex array for each tile. The indexing will let you take advantage of the aforementioned caching, the vertex arrays will let you avoid the driver/CPU overhead of a ton of gl calls, and the VBOs will save you the PCI-E bandwidth of streaming your entire heightmap over the bus every frame. If you want, you can even chunk the tiles down smaller to 64x64, and do frustum culling on them as well.
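
A minimal sketch of the per-tile index generation, to show why 256x256 is the limit for 16-bit indices. `tileIndices` is a hypothetical helper; it uses one fixed diagonal direction for brevity and leaves out how neighbouring tiles share their border samples.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a 16-bit triangle-list index buffer for one tile of w x h vertices,
// stored row-major. 256 x 256 = 65536 vertices, so the largest index is
// 65535, the biggest value a 16-bit unsigned index can hold.
std::vector<std::uint16_t> tileIndices(std::uint32_t w, std::uint32_t h) {
    assert(w >= 2 && h >= 2);
    assert(w * h <= 65536u);  // otherwise indices would overflow 16 bits
    std::vector<std::uint16_t> out;
    out.reserve(static_cast<std::size_t>(w - 1) * (h - 1) * 6);
    for (std::uint32_t y = 0; y + 1 < h; ++y) {
        for (std::uint32_t x = 0; x + 1 < w; ++x) {
            const std::uint32_t tl = y * w + x;  // top-left
            const std::uint32_t tr = tl + 1;     // top-right
            const std::uint32_t bl = tl + w;     // bottom-left
            const std::uint32_t br = bl + 1;     // bottom-right
            for (std::uint32_t v : {tl, bl, tr, tr, bl, br}) {
                out.push_back(static_cast<std::uint16_t>(v));
            }
        }
    }
    return out;
}
```

Because every tile of the same size has the same connectivity, one such index buffer could in principle be reused for all tiles, with only the vertex data differing per tile.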

Hubis fucked around with this message at 04:43 on May 6, 2010

UraniumAnchor
May 21, 2006

Not a walrus.
If my objects are drawing with a backwards Z-order what are some things I should be checking?

I've written poo poo like this before so I'm mystified why my terrain is drawing in front of the stuff that's supposed to be sitting on top of it. I've got backface culling on, CCW winding, an ortho projection that rotates according to mouse movements.

And the terrain always manages to draw in front of the water sitting on top of it. I know the water is drawing properly because if I make the water super deep it shows up properly way above the terrain.



You can see the hill on the left in the top shot showing through the hill in front of it after I rotate clockwise 90-ish degrees.

I'm sure I'm missing something really stupidly simple but I've never had this issue and not all of the code here is mine. It's for a class project and I thought I'd be clever and get some extra credit by modifying the draw code to use vertex arrays instead of display lists. Everything else about the assignment works so I'm confused what the hell I broke.

hey mom its 420
May 12, 2007

Ah, alright, thanks for all the advice guys! Sounds like I have a lot of stuff I can try, so I'm gonna go ahead and play with those optimizations, it looks like it should be fun.

heeen
May 14, 2005

CAT NEVER STOPS

Bonus posted:

I'm just using glBegin/glVertex/glEnd for each triangle. Does using a vertex array or VBO really help that much? I have an 8800 GT so the card itself shouldn't be a problem. I'll try using vertex arrays and see how it goes.

Each glVertex call sends one triple of floats over the bus to the GPU. 1024 * 1024 height points drawn as triangles means 1024*1024*2*3 ~ 6 million calls per frame.

Next step would be glDrawArrays, which means telling opengl with a single call to render triangles from 6 million vertex positions.

Next you'd use an index buffer, so you have a buffer of 1024*1024 vertices (1024*1024*3*sizeof(float)) and an index buffer telling you which vertices each triangle is made up of, which would be roughly 1024*1024*2*3*sizeof(int) (two triangles per grid cell, three indices each). Of course you can optimize by using tri-strips, which use the previous two vertices plus one new vertex to form the next triangle. You can also optimize for the vertex transform cache, which holds the last 24? 36? vertices after they went through transformations (or the vertex shader).

On top of that you'd put the vertex buffer and the index buffer entirely on gpu memory, which is what VBOs are. Basically you request a block of memory from the gpu, upload your data, and tell opengl to use the on-gpu-buffer for your subsequent render calls.
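
The sizes involved can be worked out with a few lines of arithmetic. This sketch assumes position-only vertices (three 4-byte floats), 32-bit indices, and a square grid of n x n height samples; `terrainBytes` and `MeshBytes` are names made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct MeshBytes {
    std::size_t nonIndexed;  // every triangle corner stored as a full vertex
    std::size_t indexed;     // shared vertices plus a 32-bit index list
};

// n is the number of height samples per side, so there are (n - 1)^2 grid
// cells, each split into two triangles with three corners apiece.
MeshBytes terrainBytes(std::size_t n) {
    assert(n >= 2);
    const std::size_t vertexBytes = 3 * sizeof(float);  // x, y, z only
    const std::size_t corners = (n - 1) * (n - 1) * 2 * 3;
    const std::size_t vertices = n * n;
    return {corners * vertexBytes,
            vertices * vertexBytes + corners * sizeof(std::uint32_t)};
}
```

For n = 1024 that comes to about 75 MB without indexing versus about 38 MB with it, before any per-tile 16-bit index tricks.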

Rav
Nov 5, 2000

OneEightHundred posted:

Early hardware-accelerated engines used them a lot because (as mentioned) it reduced the number of verts you had to throw at the API. Index buffers and transform caches remove that advantage though.

No, they won't, because if you're drawing a tri fan then (just like a tri strip) two of the verts you're drawing were on the last triangle. There is no GPU that won't have that still in cache.

Just an fyi, pre and post xform caches are limited in size, and having to use a lot of degenerate triangles to get the same shape that can be done much more optimally with a fan, can cause some performance problems depending on the pipeline. And there are cards that do indeed optimize based on the last 2 or 3 verts when passing data to the pixel shader attribute unit. Though when talking about getting any benefit from that, your scene must be very attribute bound which should never be the case if you are doing intelligent LODing.

heeen
May 14, 2005

CAT NEVER STOPS
What's the latest word in shadowing techniques? Does anyone have some performance numbers on shadow volumes vs. (omnidirectional) shadowmapping?

Is there a better technique for shadowmaps than 6 passes for the six sides for a depth cubemap? I've heard the term "unrolled cubemap" somewhere, what's that about?

OneEightHundred: any chance you know LordHavoc from irc?

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!

heeen posted:

What's the latest word in shadowing techniques?
It's up in the air, the big challenge right now is filtering. Doing shadow softening the "right way" is expensive as gently caress, but per-pixel PCF is expensive too. There are things like VSM which let you do it cheaply, but they come with drawbacks of their own. Bungie covered this sort of thing at SIGGRAPH:
http://www.bungie.net/images/Inside/publications/siggraph/Bungie/SIGGRAPH09_LightingResearch.pptx

quote:

Is there a better technique for shadowmaps than 6 passes for the six sides for a depth cubemap?
One way to avoid it is to render shadowmaps for (groups of?) recipient objects instead of light sources, but that has its own drawbacks.

quote:

OneEightHundred: any chance you know LordHavoc from irc?
Yes.

OneEightHundred fucked around with this message at 02:00 on May 7, 2010

Hubis
May 18, 2003

Boy, I wish we had one of those doomsday machines...

heeen posted:

Is there a better technique for shadowmaps than 6 passes for the six sides for a depth cubemap? I've heard the term "unrolled cubemap" somewhere, what's that about?


If you have access to it, one way to avoid the multiple passes is to use multiple render targets and the geometry shader to duplicate input geometry and assign it to the appropriate render target

hey mom its 420
May 12, 2007

I optimized my terrain drawing by using display lists (since it doesn't change anyway). Now it runs fine. Is this acceptable or should I still look into vertex arrays and VBOs?

Also, I've started getting some strange segfaults in my little OpenGL-based game. Since I have no idea where they could originate from, I thought I'd run valgrind on it. But when I run valgrind on it, I just get a million errors, a lot of them from seemingly innocent function calls like glEnable(GL_COLOR_MATERIAL) and such. Has anyone here ever used valgrind on OpenGL applications?

chunkles
Aug 14, 2005

i am completely immersed in darkness
as i turn my body away from the sun

Bonus posted:

Also, I've started getting some strange segfaults in my little OpenGL-based game. Since I have no idea where they could originate from, I thought I'd run valgrind on it. But when I run valgrind on it, I just get a million errors, a lot of them from seemingly innocent function calls like glEnable(GL_COLOR_MATERIAL) and such. Has anyone here ever used valgrind on OpenGL applications?

What OS/driver are you using? If you're using X11 and nvidia, try passing --smc-check=all to valgrind.

UraniumAnchor
May 21, 2006

Not a walrus.

Bonus posted:

I optimized my terrain drawing by using display lists (since it doesn't change anyway). Now it runs fine. Is this acceptable or should I still look into vertex arrays and VBOs?

From what I understand, vertex arrays and display lists still live in 'client' memory (i.e. main RAM) and VBOs live in 'server' memory (i.e. video RAM), so if your geometry doesn't change much, or you can change it with a vertex shader, it's better to store it in a VBO. How much difference does this make? No idea.

In general, display lists are antiquated so for anything remotely serious you should look into arrays, at the very least.

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
Display lists are just as much of a dinosaur as glBegin/glEnd.

Vertex arrays out of system memory are indeed held in client memory, and they are also deprecated, but they're at least useful for getting a feel for how that section of the API works.

For best performance, you should be using VBOs for storage and glDrawElements/glDrawRangeElements to draw.

Spite
Jul 27, 2001

Small chance of that...

Bonus posted:

I optimized my terrain drawing by using display lists (since it doesn't change anyway). Now it runs fine. Is this acceptable or should I still look into vertex arrays and VBOs?

Also, I've started getting some strange segfaults in my little OpenGL-based game. Since I have no idea where they could originate from, I thought I'd run valgrind on it. But when I run valgrind on it, I just get a million errors, a lot of them from seemingly innocent function calls like glEnable(GL_COLOR_MATERIAL) and such. Has anyone here ever used valgrind on OpenGL applications?

Which OS? If OSX, use libgmalloc and gdb to find your error. You're probably passing a pointer to something that's too big or too small. I see a ton of errors in apps that give a bad pointer to stuff like glVertexPointer.

And for the love of God, don't use display lists EVER. As said, use VBOs with STATIC_DRAW for your terrain. Chunk it up so you can cull out parts - the fastest triangle is the one you don't have to draw. You can still use one VBO and DrawRangeElements per piece.

Note that VBOs don't necessarily HAVE to be in VRAM - the driver makes that decision and will page stuff on and off based on load and pressure. STATIC_DRAW hinted buffers will very likely stay resident on the card though. Geometry tends to use much less space than texture data, as well.

EDIT: and now I realize most of this information is redundant with what's already been posted.

Spite fucked around with this message at 09:23 on May 7, 2010

hey mom its 420
May 12, 2007

poo poo, you guys are awesome. This is pretty much my first time doing anything with 3D graphics, so I'm still getting the feel for things and I don't know what's what. I figured the knowledge I got from tutorials around the net would be antiquated. I'll use VBOs then.

I was using valgrind on Ubuntu. I figured out the error, it was just a stupid mistake of putting some data that I loaded from 3ds models on the stack instead of on the heap, so naturally the stack blew. I put it on the heap now and it works fine. I'll try it again with --smc-check=all just to see if it tells me some useful stuff that could come in handy.
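
For anyone hitting the same thing, a sketch of the general fix, not the actual code from the game: `Vertex` and `loadVertices` are invented for illustration, and the point is only that a std::vector keeps its elements on the heap while a large local array lives on the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex {
    float x, y, z;
};

// Hypothetical loader. A local array such as `Vertex buffer[1000000];`
// would need roughly 12 MB of stack, more than the default stack limit on
// many systems, and could crash on entry to the function.
// std::vector allocates its elements on the heap, so only the small vector
// object itself sits on the stack.
std::vector<Vertex> loadVertices(std::size_t count) {
    std::vector<Vertex> vertices(count);  // value-initialized to zeros
    // ... fill from the model file here ...
    return vertices;
}
```

Returning the vector by value is cheap here because the heap buffer is moved rather than copied.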

hey mom its 420 fucked around with this message at 09:57 on May 7, 2010
