Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

ShinAli posted:

Not really a programming question, but I'd like to know what you know-more-than-I-dos think.

I remember a while back some ATi guy saying something like how much better it'd be if developers were able to program directly to the GPU rather than depending on the graphics company's implementation of a library, with a chorus of major engine developers saying that would be pretty awesome, then the ATi guy immediately backpedaling and going "HEY WHAT I MEANT TO SAY WAS OH BOY ISN'T DIRECTX JUST GREAT?"

Has there been much consideration about going down this route, given that graphics companies agree on some common instruction set? Would there be too many drawbacks? Given that everyone agrees on an instruction set, we could still have all those libraries, except their development could be more transparent; more than that, developers wouldn't have to be so restricted to them.

Is this just a case where it's what everyone wants, it's just that nVidia and AMD/ATi wouldn't go along with it? Or is this thinking flawed?
I think the problem is that the ATi guy was saying "wouldn't it be better if you had <the ability to write your own 'assembly' for each GPU your consumers might use>," and the developers heard "wouldn't it be better if you had <a set of magic 'go fast' commands>."

shodanjr_gr
Nov 20, 2007
I've run into an OpenGL/OpenCL multithreading/resource sharing question.

I have an app that uses multiple threads to poll a bunch of Kinects for depth/rgb frames. It then renders the frames into separate OpenGL contexts (right now there is a 1-1 correspondence between an image and an OpenGL context).

To my understanding it is possible to get OpenGL contexts to share display lists and textures (I'm using Qt for the UI and it offers such functionality and bone stock OpenGL does it as well). However, I haven't found it explicitly stated anywhere that more than two contexts can share resources.

Additionally, I plan to add some OpenCL functionality that basically does computation on these images and outputs results that I also want to be able to render in the aforementioned OpenGL contexts. Now, OpenCL allows you to interop with OpenGL by defining a single OpenGL context to share resources with.

The overarching question is whether I can "chain" resource sharing between contexts. As in, when my application starts, create a single "parent" OpenGL context, then have ALL other OpenGL and OpenCL contexts (that may reside in other threads) actually share resources with that "parent" context and as an extension share resources with each other?

Paniolo
Oct 9, 2007

Heads will roll.
Why are you using separate OpenGL contexts for everything?

shodanjr_gr
Nov 20, 2007

Paniolo posted:

Why are you using separate OpenGL contexts for everything?

I have different widget classes that handle visualizing different types of data and Qt makes no guarantee that they will share an OpenGL context.

Spite
Jul 27, 2001

Small chance of that...

Paniolo posted:

Why are you using separate OpenGL contexts for everything?

If you are using multiple contexts you MUST use a separate context per thread.

To answer the main question, yes, multiple contexts can share resources. Note that not everything is shared, like Fences (but Syncs are). Check the spec for specifics.
If you are trying to do something like
Context A, B, C are all shared
Context B modifies a texture
Context A and C get the change
That should work.
Create context A. Create contexts B and C as contexts shared with A. That should allow you to do this.

Keep in mind you've now created a slew of race conditions, so watch your locks!
Also, you have to make sure commands have been flushed to the GPU on the current context before expecting the results to show up elsewhere.
For example:
Context A calls glTexSubImage2D on Texture 1
Context B calls glBindTexture(1), Draw

This will have undefined results. You must do the following:

Context A calls glTexSubImage2D on Texture 1
Context A calls glFlush
Context B calls glBindTexture(1), Draw

On OSX, you must also call glBindTexture before changes are picked up, not sure about windows. It's probably vendor-specific. And again, you need a context per thread. Do not try to access the same context from separate threads; only pain and suffering awaits.
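Roughly, in WGL terms (just a sketch with made-up names, error checking omitted; in a real app each context stays current on its own thread, usually with its own DC):
code:
#include <windows.h>
#include <GL/gl.h>

/* Assumed to exist already: a window, a shared texture name, and pixel data. */
void share_and_update(HWND hwnd, GLuint sharedTex, int w, int h, const void *pixels)
{
    HDC dc = GetDC(hwnd);

    /* Create a "parent" context A, then make B and C share its object space. */
    HGLRC ctxA = wglCreateContext(dc);
    HGLRC ctxB = wglCreateContext(dc);
    HGLRC ctxC = wglCreateContext(dc);
    wglShareLists(ctxA, ctxB);   /* B shares A's textures, buffers, etc. */
    wglShareLists(ctxA, ctxC);   /* C shares them too, so A, B and C all see the same objects */

    /* "Thread 1" work (ctxA current): modify the texture, then flush so the
       commands actually reach the GPU before another context samples it. */
    wglMakeCurrent(dc, ctxA);
    glBindTexture(GL_TEXTURE_2D, sharedTex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, w, h, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
    glFlush();

    /* "Thread 2" work (ctxB current): rebind before drawing so the update is picked up. */
    wglMakeCurrent(dc, ctxB);
    glBindTexture(GL_TEXTURE_2D, sharedTex);
    /* ... draw ... */
}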

You probably want to re-architect your design, as it doesn't sound very good to me. Also, can't Qt pass you an opaque pointer? You can't hold a single context there and lock around its use? Or have a single background thread doing the rendering?

Spite fucked around with this message at 03:24 on Nov 3, 2011

shodanjr_gr
Nov 20, 2007

Spite posted:

If you are using multiple contexts you MUST use a separate context per thread.

To answer the main question, yes, multiple contexts can share resources. Note that not everything is shared, like Fences (but Syncs are). Check the spec for specifics.
If you are trying to do something like
Context A, B, C are all shared
Context B modifies a texture
Context A and C get the change
That should work.
Create context A. Create contexts B and C as contexts shared with A. That should allow you to do this.


That's what I ended up doing. I create a context on application launch and then any other contexts that are initialized share resources with that one context.

quote:

Keep in mind you've now created a slew of race conditions, so watch your locks!
Also, you have to make sure commands have been flushed to the GPU on the current context before expecting the results to show up elsewhere.
For example:
Context A calls glTexSubImage2D on Texture 1
Context B calls glBindTexture(1), Draw

This will have undefined results. You must do the following:

Context A calls glTexSubImage2D on Texture 1
Context A calls glFlush
Context B calls glBindTexture(1), Draw

On OSX, you must also call glBindTexture before changes are picked up, not sure about windows. It's probably vendor-specific. And again, you need a context per thread. Do not try to access the same context from separate threads; only pain and suffering awaits.
I already have wrappers around all GPU-specific resources and the convention is that all of them must be locked for Read/Write before access. I am also figuring out a way to ensure consistency between the GPU and CPU versions of a resource (potentially have "LockForWriteCPU" and "LockForWriteGPU" functions along with a state variable and then an unlock function that updates the relevant copy of the data based on the lock state).

quote:

You probably want to re-architect your design, as it doesn't sound very good to me. Also, can't Qt pass you an opaque pointer? You can't hold a single context there and lock around its use? Or have a single background thread doing the rendering?
There are a couple of issues. First of all, I am using multiple widgets for rendering and I am not guaranteed that I will get the same context for all widgets (even if they are in the same thread, I believe), and I plan on multithreading those as well since they tank the UI thread. Additionally, I plan on having an OpenCL context in a separate thread. That's on top of a bunch of threads that produce resources for me. I wanted to minimize the amount of "global" locking that takes place in this scheme, hence this exercise...

Spite
Jul 27, 2001

Small chance of that...

shodanjr_gr posted:

That's what I ended up doing. I create a context on application launch and then any other contexts that are initialized share resources with that one context.

I already have wrappers around all GPU-specific resources and the convention is that all of them must be locked for Read/Write before access. I am also figuring out a way to ensure consistency between the GPU and CPU versions of a resource (potentially have "LockForWriteCPU" and "LockForWriteGPU" functions along with a state variable and then an unlock function that updates the relevant copy of the data based on the lock state).

There are a couple of issues. First of all, I am using multiple widgets for rendering and I am not guaranteed that I will get the same context for all widgets (even if they are in the same thread, I believe), and I plan on multithreading those as well since they tank the UI thread. Additionally, I plan on having an OpenCL context in a separate thread. That's on top of a bunch of threads that produce resources for me. I wanted to minimize the amount of "global" locking that takes place in this scheme, hence this exercise...

This is way complicated. Have you tried a single OpenGL context that does all your GL rendering and passes the result back to your widgets? The widgets can update themselves as the rendering comes back. Remember: there's only one GPU so multiple threads will not help with the actual rendering.

Also GPU and CPU resources are separate from each other unless you are using CLIENT_STORAGE or just mapping the buffers and using that CPU side. You can track what needs to be uploaded to the GPU by just making dirty bits and setting them. Multiple threads should not be trying to update the same GPU object at once in general - that gets nasty very fast.

shodanjr_gr
Nov 20, 2007

Spite posted:

This is way complicated. Have you tried a single OpenGL context that does all your GL rendering and passes the result back to your widgets? The widgets can update themselves as the rendering comes back. Remember: there's only one GPU so multiple threads will not help with the actual rendering.

You mean having a single class that manages the context, where rendering requests get posted to that class, which does render-to-texture for each request and then returns the texture to a basic widget for display?


quote:

Also GPU and CPU resources are separate from each other unless you are using CLIENT_STORAGE or just mapping the buffers and using that CPU side. You can track what needs to be uploaded to the GPU by just making dirty bits and setting them. Multiple threads should not be trying to update the same GPU object at once in general - that gets nasty very fast.

That's what I plan on doing... basically have each wrapper for my resources carry a CPU-side version ID and a GPU-side version ID and then a function that ensures consistency between the two versions. I am also providing a per-resource lock, so more than one thread should never be locking the same resource for writing at the same time (either on the CPU or the GPU side).
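Roughly the shape I have in mind, as a sketch (made-up names; the actual upload/readback calls are elided):
code:
#include <algorithm>
#include <mutex>

class GPUResource {
public:
    void LockForRead()     { m_lock.lock(); }
    void LockForWriteCPU() { m_lock.lock(); ++m_cpuVersion; }
    void LockForWriteGPU() { m_lock.lock(); ++m_gpuVersion; }

    void Unlock() {
        // Whichever side was written last wins; sync the other copy before releasing.
        if (m_cpuVersion > m_gpuVersion)
            uploadToGPU();       // e.g. glTexSubImage2D / glBufferSubData
        else if (m_gpuVersion > m_cpuVersion)
            downloadToCPU();     // e.g. glGetTexImage or a PBO readback
        m_cpuVersion = m_gpuVersion = std::max(m_cpuVersion, m_gpuVersion);
        m_lock.unlock();
    }

private:
    void uploadToGPU()   { /* ... */ }
    void downloadToCPU() { /* ... */ }

    std::mutex m_lock;
    unsigned m_cpuVersion = 0;
    unsigned m_gpuVersion = 0;
};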

Spite
Jul 27, 2001

Small chance of that...

shodanjr_gr posted:

You mean having a single class that manages the context, where rendering requests get posted to that class, which does render-to-texture for each request and then returns the texture to a basic widget for display?

That's how I'd approach it myself. Just post requests to this thread/object and have it pass back whatever is requested or necessary for the UI to draw. I'm making quite a few assumptions because I don't know exactly what your architecture needs to be, but it will vastly simplify your locking and GPU handling to do it that way.
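Something shaped like this is what I mean (a sketch with made-up types; the GL context would belong to this one thread):
code:
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>

struct RenderRequest {
    int sceneId;                                    // whatever identifies the job
    std::function<void(unsigned texture)> deliver;  // hands the finished texture back to a widget
};

class RenderThread {
public:
    void post(RenderRequest req) {                  // called from the UI / producer threads
        std::lock_guard<std::mutex> lock(m_mutex);
        m_queue.push_back(std::move(req));
        m_wake.notify_one();
    }

    void run() {                                    // the only place GL commands are issued
        // makeContextCurrent();                    // the context lives on this thread
        for (;;) {
            RenderRequest req;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_wake.wait(lock, [this] { return !m_queue.empty(); });
                req = std::move(m_queue.front());
                m_queue.pop_front();
            }
            unsigned tex = renderToTexture(req.sceneId);  // would do the FBO render and return the texture name
            req.deliver(tex);                       // widget just displays it, no GL of its own
        }
    }

private:
    unsigned renderToTexture(int /*sceneId*/) { return 0; /* ...stubbed... */ }

    std::mutex m_mutex;
    std::condition_variable m_wake;
    std::deque<RenderRequest> m_queue;
};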

As for the second part, do you think multiple threads will be accessing the same CPU object simultaneously often? You can always make your CPU stuff more volatile - if you don't mind taking copies of stuff and passing that around/modifying the copy instead of locking. Your program's needs will better determine which will be more efficient.

Orzo
Sep 3, 2004

IT! IT is confusing! Say your goddamn pronouns!
Crosspost from the game development thread, perhaps the folks here might have some opinions...

Has anyone had problems with graphics 'stuttering?' I have a very simple test application (C#, SlimDX) where a textured quad (doesn't matter if I use the built-in Sprite class or manually fill my own vertex buffer) moves from left to right over and over, and about once per second there is a visible 'jump' of a few frames. I feel like I've tried everything and cannot figure out what it is. Garbage collection every frame, vsync on and off, etc etc. Eventually I tried the SlimDX samples that come with the SDK and found that they had the same problem, so now I think it's something wrong with my computer. I tried it on my laptop and had a similar--but not quite as extreme--problem. It jumps a little bit, but not quite as dramatic as on my main machine. Updated my video card (9800 GTX+) drivers to no avail. Turned on/off debug runtime in the DirectX control panel.

I'm totally at a loss! I know this isn't perfect information, I'm just hoping someone has seen these symptoms before and knows what it is.

Spite
Jul 27, 2001

Small chance of that...
Yikes, that's hard. It could be one of approximately a million things. I'm not familiar with SlimDX, but it's quite possible it's something with that? Do other 3D apps and games run fine? Have you tried writing a simple C D3D or OpenGL app that does the same thing?

HolaMundo
Apr 22, 2004
uragay

sponge would own me in soccer :(
Stupid question incoming:

I'm learning GLSL using the orange book.
I couldn't use glAttachShader because it wouldn't find it, and googling around I found out that to use GLSL with SDL (which I'm using in an application in which I wanna use shaders) you have to set up some extensions and stuff.
I'm guessing the functions in this extension do more or less the same as the ones mentioned in the orange book.

My question is what's the difference between using the ARB extension ones (glAttachObjectARB for example) and the ones mentioned in the book, like glAttachShader (which I'm not sure how to use with SDL).

I've googled but I haven't found any information on which is the "right" way to do it, or what's the difference between them (if there's any).


edit: After reading this http://www.opengl.org/wiki/Getting_started#Getting_Functions I downloaded GLEW, which I'm now using to load all these functions. Gonna try to write a simple shader and see if it works.
Why does it have to be so hard? :(

HolaMundo fucked around with this message at 22:50 on Nov 13, 2011

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
edit: Okay, originally I said there was no difference, which is true, but you should have access to glAttachShader as long as you're using OpenGL 2.0 drivers. Static linking to the opengl32.lib library will only give you access to the symbols exposed by that library; you probably need to use something like SDL_GL_GetProcAddress to get other functions. Linking against the library is flaky in general, and it's not uncommon for applications to get every single function call they plan on using through GetProcAddress.
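For example (a sketch; it assumes SDL_opengl.h pulls in the PFNGLATTACHSHADERPROC typedef, otherwise include glext.h yourself, and it has to run after the GL window is set up):
code:
#include <SDL.h>
#include <SDL_opengl.h>

PFNGLATTACHSHADERPROC my_glAttachShader = NULL;

void load_gl_functions(void)
{
    /* opengl32.lib only exports the GL 1.1 entry points; everything newer
       has to be fetched from the driver at runtime. */
    my_glAttachShader = (PFNGLATTACHSHADERPROC)SDL_GL_GetProcAddress("glAttachShader");
    if (!my_glAttachShader) {
        /* the driver doesn't expose GL 2.0 - fall back or bail */
    }
}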

OneEightHundred fucked around with this message at 07:40 on Nov 14, 2011

HolaMundo
Apr 22, 2004
uragay

sponge would own me in soccer :(
After using GLEW (which does all the GetProcAddress fun stuff) I got it working.
I'm wondering why all the hassle (ok, maybe it's not that bad, heh) of having to use GetProcAddress to get the functions?
Not like it matters now that I've got it working, just being curious.

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
Because linking the library assumes the functions are there. It has some dangerous implications in that it can cause your app to flat-out not start if it's looking for functions that don't exist even if you don't plan on using them. Also because I think the opengl32.lib that ships with VS only has the 1.1 API calls.

Slurps Mad Rips
Jan 25, 2009

Bwaltow!

OneEightHundred posted:

Because linking the library assumes the functions are there. It has some dangerous implications in that it can cause your app to flat-out not start if it's looking for functions that don't exist even if you don't plan on using them. Also because I think the opengl32.lib that ships with VS only has the 1.1 API calls.

Couldn't someone create a stub opengl3.lib file for use? (I assume not, but AFAIK stub .lib files only contain the function names right?)

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
Yes, you could do that. You'd still want to use GetProcAddress for anything that you weren't sure would be available (i.e. calls exposed by extensions).

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed
Unless you're running into some GLEW-related problem I don't really see a reason to bother, though.

Spite
Jul 27, 2001

Small chance of that...

OneEightHundred posted:

Yes, you could do that. You'd still want to use GetProcAddress for anything that you weren't sure would be available (i.e. calls exposed by extensions).

Not really. What's happening is that wglGetProcAddress asks the driver for its implementations of the functions. So you'd need a .lib for each driver and probably each build of each driver. Then you'd need a separate exe for each and it would be nasty.

This is also why you have to create a dummy context, get the new context creation function and create a new context if you want to use GL3+. It sucks.
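The dance looks roughly like this (a sketch, error checking omitted; the constants and the PFN typedef come from wglext.h):
code:
#include <windows.h>
#include <GL/gl.h>
#include <GL/wglext.h>

HGLRC create_gl3_context(HDC dc)
{
    /* 1. Create a throwaway legacy context just so wglGetProcAddress works. */
    HGLRC dummy = wglCreateContext(dc);
    wglMakeCurrent(dc, dummy);

    /* 2. Grab the real context-creation entry point from the driver. */
    PFNWGLCREATECONTEXTATTRIBSARBPROC wglCreateContextAttribsARB =
        (PFNWGLCREATECONTEXTATTRIBSARBPROC)wglGetProcAddress("wglCreateContextAttribsARB");

    /* 3. Ask for the context you actually wanted. */
    const int attribs[] = {
        WGL_CONTEXT_MAJOR_VERSION_ARB, 3,
        WGL_CONTEXT_MINOR_VERSION_ARB, 2,
        0
    };
    HGLRC real = wglCreateContextAttribsARB(dc, NULL, attribs);

    /* 4. Throw the dummy away and switch to the real one. */
    wglMakeCurrent(NULL, NULL);
    wglDeleteContext(dummy);
    wglMakeCurrent(dc, real);
    return real;
}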

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
Oh right I forgot that you have to use wglGetProcAddress and regular GetProcAddress on the DLL doesn't really work.

Yeah basically just have a structure that contains all of the functions you want and use that. It doesn't take that long to add new PFN*PROC types and add API calls as you go since there are only like 100 functions in the API worth using any more. Function pointer types for all of the extensions are already in glext.h
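Something like this (a sketch; the PFN*PROC typedefs are the ones glext.h already provides, and it has to run with a context current):
code:
#include <windows.h>
#include <GL/gl.h>
#include <GL/glext.h>

/* Grow this struct as you start using more of the API. */
struct GLFuncs {
    PFNGLCREATEPROGRAMPROC CreateProgram;
    PFNGLATTACHSHADERPROC  AttachShader;
    PFNGLLINKPROGRAMPROC   LinkProgram;
    PFNGLUSEPROGRAMPROC    UseProgram;
};

void load_gl_funcs(struct GLFuncs *gl)
{
    /* wglGetProcAddress asks the current driver for its implementations. */
    gl->CreateProgram = (PFNGLCREATEPROGRAMPROC)wglGetProcAddress("glCreateProgram");
    gl->AttachShader  = (PFNGLATTACHSHADERPROC)wglGetProcAddress("glAttachShader");
    gl->LinkProgram   = (PFNGLLINKPROGRAMPROC)wglGetProcAddress("glLinkProgram");
    gl->UseProgram    = (PFNGLUSEPROGRAMPROC)wglGetProcAddress("glUseProgram");
}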

Paniolo
Oct 9, 2007

Heads will roll.
Or just use GLEW, because that's what it already does and it's a solved problem.

Slurps Mad Rips
Jan 25, 2009

Bwaltow!

Spite posted:

Not really. What's happening is that wglGetProcAddress asks the driver for its implementations of the functions. So you'd need a .lib for each driver and probably each build of each driver. Then you'd need a separate exe for each and it would be nasty.

This is also why you have to create a dummy context, get the new context creation function and create a new context if you want to use GL3+. It sucks.

OneEightHundred posted:

Oh right I forgot that you have to use wglGetProcAddress and regular GetProcAddress on the DLL doesn't really work.

Yeah basically just have a structure that contains all of the functions you want and use that. It doesn't take that long to add new PFN*PROC types and add API calls as you go since there are only like 100 functions in the API worth using any more. Function pointer types for all of the extensions are already in glext.h

I had a feeling this is what was being done. (Especially since I did a dumpbin to a .def file of the ATI drivers, and only about 24 functions are actually present within the OpenGL driver DLL, some of which are for EGL :raise:)

Paniolo posted:

Or just use GLEW, because that's what it already does and it's a solved problem.

There's also the gl3w library which focuses on using and loading only opengl 3/4 functions.


One question though, on a slightly related matter. Since OpenCL works well with OpenGL, and there are a variety of implementations out there, writing for each implementation and distributing multiple executables isn't a factor, because they didn't go the route of a clGetProcAddress-style system, right?

Paniolo
Oct 9, 2007

Heads will roll.

SAHChandler posted:

One question though, on a slightly related matter. Since OpenCL works well with OpenGL, and there are a variety of implementations out there, writing for each implementation and distributing multiple executables isn't a factor, because they didn't go the route of a clGetProcAddress-style system, right?

It is a factor precisely because they didn't go that route. I could be wrong because my experience with OpenCL is fairly limited but I believe currently if you link with the nVidia libs your executable will only run on a machine with nVidia drivers, and the same goes for the Intel and ATI implementations.

Orzo
Sep 3, 2004

IT! IT is confusing! Say your goddamn pronouns!

Orzo posted:

Crosspost from the game development thread, perhaps the folks here might have some opinions...

Has anyone had problems with graphics 'stuttering?' I have a very simple test application (C#, SlimDX) where a textured quad (doesn't matter if I use the built-in Sprite class or manually fill my own vertex buffer) moves from left to right over and over, and about once per second there is a visible 'jump' of a few frames. I feel like I've tried everything and cannot figure out what it is. Garbage collection every frame, vsync on and off, etc etc. Eventually I tried the SlimDX samples that come with the SDK and found that they had the same problem, so now I think it's something wrong with my computer. I tried it on my laptop and had a similar--but not quite as extreme--problem. It jumps a little bit, but not quite as dramatic as on my main machine. Updated my video card (9800 GTX+) drivers to no avail. Turned on/off debug runtime in the DirectX control panel.

I'm totally at a loss! I know this isn't perfect information, I'm just hoping someone has seen these symptoms before and knows what it is.
I just wanted to reply to my own post because someone in the Game Development thread figured it out and maybe someday someone else will have this problem and appreciate the answer:

I forgot I had f.lux installed and running in the background. Disabling it does not fix the problem, but exiting out of the program does.

Spite
Jul 27, 2001

Small chance of that...

SAHChandler posted:

I had a feeling this is what was being done. (Especially since I did a dumpbin to a .def file of the ATI drivers, and only about 24 functions are actually present within the OpenGL driver DLL, some of which are for EGL :raise:)


There's also the gl3w library which focuses on using and loading only opengl 3/4 functions.


One question though, on a slightly related matter. Since OpenCL works well with OpenGL, and there are a variety of implementations out there, writing for each implementation and distributing multiple executables isn't a factor, because they didn't go the route of a clGetProcAddress-style system, right?

There's an extension for ES2 compatibility, which is probably why they are there.

As for OpenCL: also keep in mind that OpenCL is driven by Apple, which controls its driver stack and can release headers that work with all its parts.

Slurps Mad Rips
Jan 25, 2009

Bwaltow!

Paniolo posted:

It is a factor precisely because they didn't go that route. I could be wrong because my experience with OpenCL is fairly limited but I believe currently if you link with the nVidia libs your executable will only run on a machine with nVidia drivers, and the same goes for the Intel and ATI implementations.

I've looked at the headers and such, and I think the only issue would be anything involving the linker, but I assume that because it's dynamic linking, all the function calls can be resolved at runtime. This is something that no documentation out there seems to talk about, as everyone who is writing OpenCL apps seems to assume people will be able to download and compile the source (which solves the distribution issue).

HolaMundo
Apr 22, 2004
uragay

sponge would own me in soccer :(
More newbie questions!
I've been learning GLSL for the past few days and it's been kinda confusing, since sometimes you find stuff for different versions of GLSL and I can't keep track of what should or shouldn't work.
I understand the new versions of GLSL have significant changes from the older ones in how they work, so I'm just using version 110 to teach me the basic stuff.

Started by writing a simple fragment shader which just draws the whole screen in one color, that was easy.
Next I decided to implement texture mapping (so I can try doing bump mapping later) but I'm having a hard time figuring out why my code is working / why it wasn't working before.

Vertex shader:
code:
#version 110
varying vec2 vTexCoord;

void main(void)
{
   vTexCoord = gl_MultiTexCoord0.xy;
   gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
}
Fragment shader:
code:
#version 110
uniform sampler2D myTexture;
varying vec2 vTexCoord;

void main()
{
	gl_FragColor = texture2D(myTexture, vTexCoord);
}
From what I understood you have to somehow tell GLSL which texture you want, using uniform variables, so I was doing this:
code:
glBindTexture(GL_TEXTURE_2D,texMadera); //texMadera is a texture previously loaded.
GLint loc1 = glGetUniformLocation(p,"myTexture");
glUniform1i(loc1,texMadera);
This wasn't working and all I got was a black screen. After trying some stuff I randomly decided to comment out the glGetUniformLocation and glUniform1i lines and it worked. It also worked using glUniform1i(loc1, 0).

Is the fragment shader just setting the sampler2D variable to whatever texture opengl is currently using?
I'm guessing this is the case, since I can change the name of the variable in the fragment shader and it still works but I don't understand why it wouldn't work with glUniform1i(loc1, texMadera).

Gah, this ended being kinda long.
tldr: how the hell does my fragment shader know which texture I'm using?

HiriseSoftware
Dec 3, 2004

Two tips for the wise:
1. Buy an AK-97 assault rifle.
2. If there's someone hanging around your neighborhood you don't know, shoot him.
You tell the shader what texture UNIT you're using and you still bind that texture in your C code. That's why passing 0 seemed to work for you. You don't give it the texture ID.
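In other words, something like this (a sketch using the names from your post):
code:
/* Put the texture on unit 0, then tell the sampler uniform "unit 0", not the texture ID. */
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, texMadera);

GLint loc1 = glGetUniformLocation(p, "myTexture");
glUniform1i(loc1, 0);   /* 0 = texture unit, NOT texMadera */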

HolaMundo
Apr 22, 2004
uragay

sponge would own me in soccer :(
Doh!
Thanks :downs:

But why does it work if I change the variable name (of the sampler2D) in the fragment shader? Does it default to whatever texture is bound to unit 0 or something like that?

I'm guessing for bump mapping you could bind the normal map to a different unit than the actual texture and pass both to the shader.

HiriseSoftware
Dec 3, 2004

Two tips for the wise:
1. Buy an AK-97 assault rifle.
2. If there's someone hanging around your neighborhood you don't know, shoot him.

HolaMundo posted:

Doh!
Thanks :downs:

But why does it work if I change the variable name (of the sampler2D) in the fragment shader? Does it default to whatever texture is bound to unit 0 or something like that?

I'm guessing for bump mapping you could bind the normal map to a different unit than the actual texture and pass both to the shader.

It's probably defaulting the value of that variable to 0, which is the first texture unit. And yes, for bump mapping, you would have multiple texture units and thus multiple sampler2D variables.

Spite
Jul 27, 2001

Small chance of that...
A sampler2D is essentially a number that corresponds to the texture unit. Most GPUs have 16 these days.
i.e.:

glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, tex);

Will bind the texture 'tex' to unit 0.
Then you set the sampler to 0 and it will sample from that texture.

As for defaults, it will depend on the implementation. It should default to 0, but I'd check the spec to be sure.
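For the bump mapping case, that would look something like this (a sketch; the texture and uniform names are made up):
code:
/* Diffuse texture on unit 0, normal map on unit 1. */
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, diffuseTex);
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, normalTex);

/* Each sampler2D uniform gets the number of the unit it should read from. */
glUniform1i(glGetUniformLocation(prog, "diffuseMap"), 0);
glUniform1i(glGetUniformLocation(prog, "normalMap"), 1);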

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!
Is there a site with a solid rundown of GPU feature support these days? Delphi3D is hosed and it's a pain in the rear end trying to figure out what old hardware supports what render formats, what is supported as an input but not as a render target, what supports depth read and what only supports compare, and all of these other goofy pre-DX10 inconsistencies.

Spite
Jul 27, 2001

Small chance of that...
If you want to do some cross referencing between GL and DX, try:
http://developer.apple.com/graphicsimaging/opengl/capabilities/

I'd hope everything relatively recent/worth developing for supports depth read :)

OneEightHundred
Feb 28, 2008

Soon, we will be unstoppable!

Spite posted:

I'd hope everything relatively recent/worth developing for supports depth read :)
Yeah looks like I was confusing the X1000-series' lack of PCF with what looks like pre-PS2.0 inability to sample depth textures as floats.

That looks good though, guess everything I care about supports FP16 color buffers. :toot:

Spite
Jul 27, 2001

Small chance of that...

OneEightHundred posted:

Yeah looks like I was confusing the X1000-series' lack of PCF with what looks like pre-PS2.0 inability to sample depth textures as floats.

That looks good though, guess everything I care about supports FP16 color buffers. :toot:

The ATI X1xxx series doesn't let you filter float textures in hardware, from what I recall. There's always room for a dirty shader hack though!

HolaMundo
Apr 22, 2004
uragay

sponge would own me in soccer :(
How should I handle having multiple shaders?
For example, I now have a normal mapping shader (which also handles texture mapping), but let's say I want to render something else with a texture but no normal map, how would I do that? Having another shader just for texture mapping seems stupid.
Also thought about having a bool to turn normal mapping on or off but it doesn't seem right either.

Mata
Dec 23, 2003

HolaMundo posted:

How should I handle having multiple shaders?
For example, I now have a normal mapping shader (which also handles texture mapping), but let's say I want to render something else with a texture but no normal map, how would I do that? Having another shader just for texture mapping seems stupid.
Also thought about having a bool to turn normal mapping on or off but it doesn't seem right either.

Yeah I think you're supposed to avoid conditionals and branching so the bool thing is not the best solution. I do something similar to what you want by using different techniques to compile different combinations of vertex and pixel shaders. In your example maybe you'd have one technique with normal mapping and one technique without it.
Maybe it would be better to do it with multiple passes instead of techniques, but I haven't really done stuff in multiple passes.

Mata fucked around with this message at 18:57 on Nov 23, 2011

haveblue
Aug 15, 2005



Toilet Rascal
It's almost always preferable to run the extra calculations in all cases and multiply the result by 0 if you don't want it to contribute to the fragment, rather than use a true conditional.
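For instance, folding the normal mapping case above into one fragment shader (a sketch; the varyings and uniform names are made up):
code:
#version 110
uniform sampler2D diffuseMap;
uniform sampler2D normalMap;
uniform float bumpWeight;   // set to 0.0 or 1.0 from the app
varying vec2 vTexCoord;
varying vec3 vLightDir;

void main()
{
    vec3 base = texture2D(diffuseMap, vTexCoord).rgb;
    // Always do the normal-map math...
    vec3 n = normalize(texture2D(normalMap, vTexCoord).rgb * 2.0 - 1.0);
    float bumpDiffuse = max(dot(n, normalize(vLightDir)), 0.0);
    // ...then weight its contribution by 0 or 1 instead of branching.
    float lighting = mix(1.0, bumpDiffuse, bumpWeight);
    gl_FragColor = vec4(base * lighting, 1.0);
}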

zzz
May 10, 2008
I haven't touched GPU stuff in a while, but I was under the impression that static branches based on global uniform variables will be optimized away by all modern compilers/drivers and never get executed on the GPU, so it wouldn't make a significant difference either way...?

Best way to find out is benchmark both, I guess :)

HolaMundo
Apr 22, 2004
uragay

sponge would own me in soccer :(
Thanks for the answers.
What I'm doing now is have two separate shaders (one for texture mapping, the other for normal mapping) and switch between them depending on what I want to render.
