Should I use integer ID or pointers for my opaque

Posted 2019-06-20 16:54

Question:

I'm writing an abstraction layer on top of several graphics APIs (DirectX 9 and DirectX 11) and I would like your opinion.

Traditionally I would create a base class for each concept I want to abstract.
So in typical OO fashion I would have for example a class Shader and 2 subclasses DX9Shader and DX11Shader.

I would repeat the process for textures and so on, and when I need to instantiate them, an abstract factory returns the appropriate subclass depending on the current graphics API.
Following RAII, the returned pointer would be wrapped in a std::shared_ptr.
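For illustration, the factory approach described above might look roughly like this. It is only a sketch: `GraphicsApi`, `createShader` and `backendName` are hypothetical names, and the subclasses are empty stand-ins for the real DirectX wrappers.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical selector for the active backend (illustrative name).
enum class GraphicsApi { DX9, DX11 };

// Abstract interface: the only type the client names.
class Shader {
public:
    virtual ~Shader() = default;
    virtual std::string backendName() const = 0;
};

// One subclass per backend; real ones would wrap the D3D objects.
class DX9Shader : public Shader {
public:
    std::string backendName() const override { return "DX9"; }
};

class DX11Shader : public Shader {
public:
    std::string backendName() const override { return "DX11"; }
};

// Factory: picks the subclass for the given API and returns shared ownership.
inline std::shared_ptr<Shader> createShader(GraphicsApi api) {
    switch (api) {
        case GraphicsApi::DX9:  return std::make_shared<DX9Shader>();
        case GraphicsApi::DX11: return std::make_shared<DX11Shader>();
    }
    return nullptr;  // not reached for valid enum values
}

// Example use: the client never sees DX9Shader or DX11Shader.
inline void exampleUse() {
    std::shared_ptr<Shader> shader = createShader(GraphicsApi::DX11);
    assert(shader && shader->backendName() == "DX11");
}
```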

So far so good but in my case there are a few problems with this approach:

  1. I need to come up with a public interface that encapsulates the functionality of both APIs (and other APIs in the future).
  2. The derived classes are stored in separate DLLs (one for DX9, one for DX11, etc.), and having a shared_ptr to them in the client is a curse: on exit the graphics DLLs are unloaded, and if the client still holds a shared_ptr to one of the graphics objects, boom, it crashes by calling code from an unloaded DLL.

This prompted me to redesign the way I do things. I thought I could just return raw pointers to the resources and have the graphics API clean up after itself, but that still leaves the issue of dangling pointers on the client side, as well as the interface problem. I even considered manual reference counting like COM, but I thought that would be a step backwards (correct me if I'm wrong; coming from the shared_ptr world, manual reference counting seems primitive).

Then I saw the work of Humus where all his graphics classes are represented by integer IDs (much like what OpenGL does). Creating a new object only returns its integer ID, and stores the pointer internally; it's all perfectly opaque!

The classes that represent the abstraction (such as DX9Shader etc...) are all hidden behind the device API which is the only interface.
If one wants to set a texture, it's just a matter of calling device->SetTexture(ID) and the rest happens behind the scenes.
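A minimal sketch of the ID-based design, assuming a device that owns every texture and hands out only integers. This is not Humus's actual code; `Device::CreateTexture`, `BoundTextureName` and the reserved ID 0 are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using TextureID = std::uint32_t;          // opaque to the client
constexpr TextureID kInvalidTexture = 0;  // assumption: 0 means "no texture"

class Device {
public:
    // Creates a texture and returns only its ID; the object stays inside the device.
    TextureID CreateTexture(std::string name) {
        textures_.push_back(std::make_unique<Texture>(Texture{std::move(name)}));
        return static_cast<TextureID>(textures_.size());  // IDs start at 1
    }

    // Resolves the ID internally; returns false for an unknown ID.
    bool SetTexture(TextureID id) {
        if (id == kInvalidTexture || id > textures_.size()) return false;
        bound_ = textures_[id - 1].get();
        return true;
    }

    // Name of the currently bound texture, or an empty string if none is bound.
    std::string BoundTextureName() const {
        return bound_ ? bound_->name : std::string();
    }

private:
    struct Texture { std::string name; };  // stand-in for DX9Texture / DX11Texture
    std::vector<std::unique_ptr<Texture>> textures_;
    const Texture* bound_ = nullptr;
};

// Example use: the client holds nothing but integers.
inline void exampleUse() {
    Device device;
    TextureID brick = device.CreateTexture("brick");
    assert(device.SetTexture(brick));  // known ID: bound
    assert(!device.SetTexture(42));    // unknown ID: rejected at runtime
}
```

Note how every resource type adds another create/set pair to `Device`, which is where the do-it-all class comes from.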

The downside is that the hidden part of the API is bloated, a lot of boilerplate code is required to make it work, and I'm not a fan of a do-it-all class.

Any ideas or thoughts?

Answer 1:

You say that the main problem is that a DLL is unloaded while you still hold a pointer to its internals. Well... don't do that. You have a class instance whose members are implemented in that DLL. It is fundamentally an error for that DLL to be unloaded so long as those class instances exist.

You therefore need to be responsible in how you use this abstraction, just as you need to be responsible with any code you load from a DLL: anything that comes from the DLL must be cleaned up before you unload the DLL. How you do that is up to you. You could keep an internal reference count that is incremented for every object the DLL returns, and only unload the DLL after all referenced objects have gone away. Or anything, really.
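One way to picture the reference-count idea is the sketch below. `BackendModule` and `Resource` are hypothetical names; the module here only flips a flag where a real implementation would call FreeLibrary, and the counter is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for a loaded graphics DLL; a real one would hold the module handle.
class BackendModule {
public:
    void objectCreated()   { ++liveObjects_; }
    void objectDestroyed() { assert(liveObjects_ > 0); --liveObjects_; }

    // Unloads only when nothing created by the module is still alive.
    bool tryUnload() {
        if (liveObjects_ != 0) return false;
        loaded_ = false;  // a real implementation would call FreeLibrary here
        return true;
    }

    bool loaded() const { return loaded_; }
    std::size_t liveObjects() const { return liveObjects_; }

private:
    std::size_t liveObjects_ = 0;
    bool loaded_ = true;
};

// Object handed out by the module; it registers itself for its whole lifetime.
class Resource {
public:
    explicit Resource(BackendModule& module) : module_(module) { module_.objectCreated(); }
    ~Resource() { module_.objectDestroyed(); }
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;

private:
    BackendModule& module_;
};
```

With this in place, a request to unload while a `Resource` is still alive is simply refused rather than leaving the client with code that no longer exists.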

After all, even if you use these opaque numbers or whatever, what happens if you call one of those API functions on that number when the DLL is unloaded? Oops... So it doesn't really buy you any protection. You have to be responsible either way.

The downsides of the number method that you may not be thinking about are:

  • Reduced ability to know what an object actually is. API calls can fail because you passed a number that isn't really an object. Or worse, what happens if you pass a shader object into a function that takes a texture? Maybe we're talking about a function that takes a shader and a texture, and you accidentally forget the order of the arguments? The rules of C++ wouldn't allow that code to even compile if those were object pointers. But with integers? It's all good; you'd only get runtime errors.

  • Performance. Every API call has to look the number up in a hashtable or similar structure to get an actual pointer to work with. If the lookup is a direct array index, the cost is probably fairly minor. But it's still an indirection. And since your abstraction seems very low-level, any performance loss at this level can really hurt in performance-critical situations.

  • Lack of RAII and other scoping mechanisms. Sure, you could write a shared_ptr-esque object that would create and delete them. But you wouldn't have to do that if you were using an actual pointer.
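The type-safety point in the first bullet can be made concrete with a small sketch. Everything here is hypothetical, including the rule that IDs below 100 are shaders; it only serves to show where the mistake gets caught.

```cpp
#include <cassert>
#include <cstdint>

// Integer IDs: two aliases for the same type, so the compiler cannot tell them apart.
using ShaderID  = std::uint32_t;
using TextureID = std::uint32_t;

// Hypothetical registry rule: IDs below 100 are shaders, 100 and above are textures.
inline bool isShader(std::uint32_t id)  { return id < 100; }
inline bool isTexture(std::uint32_t id) { return id >= 100; }

// A mix-up can only be detected at runtime.
inline bool bindById(ShaderID shader, TextureID texture) {
    return isShader(shader) && isTexture(texture);
}

// Pointer version: distinct types, so swapped arguments do not compile.
struct Shader {};
struct Texture {};
inline bool bindByPointer(const Shader* shader, const Texture* texture) {
    return shader != nullptr && texture != nullptr;
}

inline void example() {
    ShaderID shader = 1;
    TextureID texture = 100;
    assert(bindById(shader, texture));   // correct order
    assert(!bindById(texture, shader));  // swapped: compiles, fails only at runtime

    Shader s;
    Texture t;
    assert(bindByPointer(&s, &t));
    // bindByPointer(&t, &s);            // swapped: rejected by the compiler
}
```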

It just doesn't seem worthwhile.



Answer 2:

Does it matter? To the user of the object, it is just an opaque handle. Its actual implementation type doesn't matter, as long as I can pass the handle to your API functions and have them do stuff with the object.

You can change the implementation of these handles easily, so make it whatever is easier for you now.

Just declare the handle type as a typedef of either a pointer or an integer, and make sure that all client code uses the typedef name. That way the client code doesn't depend on the specific type you chose to represent your handles.
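As a rough sketch of that suggestion, with hypothetical names (`TextureHandle`, `CreateTexture`, `IsValid`) and placeholder function bodies:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical public header: the only place that knows what a handle really is.
// Changing this line to `typedef struct TextureImpl* TextureHandle;` would not
// require edits in client code that uses only the typedef name.
typedef std::uint32_t TextureHandle;

// Placeholder API functions for the example.
inline TextureHandle CreateTexture() {
    static TextureHandle next = 0;
    return ++next;  // hands out 1, 2, 3, ...
}

// A value-initialized handle (0 for an integer, null for a pointer) means "none".
inline bool IsValid(TextureHandle handle) { return handle != TextureHandle(); }

// Client code: never spells out the underlying type.
inline bool clientExample() {
    TextureHandle texture = CreateTexture();
    assert(IsValid(texture));
    return IsValid(texture);
}
```

The indirection holds only as long as clients avoid arithmetic, dereferencing, or anything else that works for one representation but not the other.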

Go for the simple solution now, and if/when you run into problems because that was too simple, change it.



Answer 3:

Regarding your point 2: the client is always unloaded before the libraries.

Every process has a library dependency tree, with the .exe as the root, user DLLs at the intermediate levels, and system libraries at the lowest level. The process is loaded from the lowest level up, so the root (the .exe) is loaded last. It is unloaded starting from the root, so the low-level libraries are unloaded last. This ordering exists to prevent the situation you describe.

Of course, if you load and unload libraries manually, this order no longer applies, and you are responsible for keeping pointers valid.