I'm always referencing DLLs in my C# code, but they have remained somewhat of a mystery which I would like to clarify. This is a sort of brain dump of questions regarding DLLs.
I understand a DLL is a dynamically linked library, which means that another program can access this library at run time to get "functionality". However, consider the following ASP.NET project with Web.dll and Business.dll (Web.dll is the front end functionality and it references Business.dll for types and methods).
1. At what point does Web.dll dynamically link to Business.dll? You notice a lot of HDD thrashing in Windows for seemingly small tasks when using Word (etc.), and I reckon that Word is going off and dynamically linking in functionality from other DLLs?
1a. Additionally, what loads and links the DLL - the OS or some run time framework such as the .NET framework?
1b. What is the process of "linking"? Are compatibility checks made? Loading into the same memory? What does linking actually mean?
2. What actually executes the code in the DLL? Does it get executed by the processor or is there another stage of translation or compilation before the processor will understand the code inside the DLL?
2a. In the case of a DLL built in C# .NET, what is running this: the .NET framework or the operating system directly?
3. Does a DLL from Linux work on a Windows system (if such a thing exists), or are they operating system specific?
4. Are DLLs specific to a particular framework? Can a DLL built using C# .NET be used by a DLL built with, for example, Borland C++?
4a. If the answer to 4 is "no" then what is the point of a DLL? Why don't the various frameworks use their own formats for linked files? For example: an .exe built in .NET knows that a file type of .abc is something that it can link into its code.
5. Going back to the Web.dll / Business.dll example: to get a class type of customer I need to reference Business.dll from Web.dll. This must mean that Business.dll contains some sort of a specification as to what a customer class actually is. If I had compiled my Business.dll file in, say, Delphi: would C# understand it and be able to create a customer class, or is there some sort of header info or something that says "hey sorry you can only use me from another Delphi DLL"?
5a. Same applies for methods; can I write a CreateInvoice() method in a DLL, compile it in C++, and then access and run it from C#? What stops me from doing this, or allows it?
6. On the subject of DLL hijacking, surely the replacement (bad) DLL must contain the exact method signatures and types as the one that is being hijacked. I suppose this wouldn't be hard to do if you could find out what methods were available in the original DLL.
6a. What in my C# program is deciding if I can access another DLL? If my hijacked DLL contained exactly the same methods and types as the original but it was compiled in another language, would it work?
7. What is DLL importing and DLL registration?
First of all, you need to understand the difference between two very different kinds of DLLs. Microsoft decided to go with the same file extensions (.exe and .dll) for both .NET (managed code) and native code; however, managed-code DLLs and native DLLs are very different inside.
1) At what point does web.dll dynamically link to business.dll? You
notice a lot in Windows HDD thrashing for seemingly small tasks when
using Word etc and I reckon that this Word going off and dynamically
linking in functionality from other DLL's?
1) In the case of .NET, DLLs are usually loaded on demand, when the first method that uses anything from the DLL is about to run. This is why you can get a FileNotFoundException or TypeLoadException almost anywhere in your code if a DLL can't be loaded. When something like Word suddenly starts accessing the HDD a lot, it's likely swapping (reading back data that was swapped out to disk to make room in RAM).
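As a rough analogy only (this is Python, not the CLR's actual mechanism), load-on-first-use can be sketched with a proxy that imports a module the first time one of its members is touched. The class name LazyModule and the module names are hypothetical stand-ins. Note that a missing module fails at first use rather than at startup, much like the load exceptions described above.

```python
import importlib


class LazyModule:
    """Hypothetical proxy: imports the named module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None  # nothing is loaded yet

    @property
    def is_loaded(self):
        return self._module is not None

    def __getattr__(self, attr):
        # Only called for attributes not found on the proxy itself,
        # i.e. for members that live in the real module.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


business = LazyModule("colorsys")      # stand-in for Business.dll
print(business.is_loaded)              # False: referenced, but not loaded
business.rgb_to_hsv(1.0, 0.0, 0.0)     # first use triggers the load
print(business.is_loaded)              # True

missing = LazyModule("no_such_module_for_demo")  # no error yet
try:
    missing.anything                   # the failure surfaces at first use
except ModuleNotFoundError as exc:
    print("load failed on first use:", exc)
```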
1a) Additionally what loads and links the DLL - the O/S or some
runtime framework such as the .Net framework?
1a) In the case of managed DLLs, the .NET runtime (the CLR) is what loads, JIT compiles (compiles the .NET bytecode, called IL, into native code) and links the DLLs. In the case of native DLLs it's a component of the operating system that loads and links the DLL (no compilation is necessary because native DLLs already contain native code).
1b) What is the process of "linking"? Are checks made that there is
compatibility? Loading into the same memory? What does linking
actually mean?
1b) Linking is when references (e.g. method calls) in the calling code to symbols (e.g. methods) in the DLL are replaced with the actual addresses of the things in the DLL. This is necessary because the eventual addresses of the things in the DLL cannot be known before it's been loaded into memory.
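To make that concrete, here is a toy model of the idea, not how any real loader is implemented. The Library class, the offsets, the base address and the GetCustomer symbol are all made up for illustration. A library only knows its symbols' offsets; absolute addresses exist once a base address has been chosen at load time, and only then can the caller's symbolic references be patched.

```python
from dataclasses import dataclass


@dataclass
class Library:
    """Toy stand-in for a DLL: symbol names mapped to offsets within the file."""
    name: str
    exports: dict


def load(library: Library, base_address: int) -> dict:
    """Pretend to map the library at base_address; return absolute addresses."""
    return {symbol: base_address + offset
            for symbol, offset in library.exports.items()}


def link(references: list, resolved: dict) -> list:
    """Replace each symbolic reference with its address, or fail loudly."""
    patched = []
    for symbol in references:
        if symbol not in resolved:
            raise LookupError(f"unresolved symbol: {symbol}")
        patched.append(resolved[symbol])
    return patched


business = Library("Business.dll", {"CreateInvoice": 0x10, "GetCustomer": 0x40})
addresses = load(business, base_address=0x7000_0000)

# Web.dll's calls, as names before linking and as addresses after.
calls = ["GetCustomer", "CreateInvoice"]
print([hex(a) for a in link(calls, addresses)])  # ['0x70000040', '0x70000010']
```

Loading the same library at a different base address would give different final addresses, which is why this step cannot be done ahead of time.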
2) What actually executes the code in the DLL? Does it get executed by
the processor or is there another stage of translation or compilation
before the processor will understand the code inside the DLL?
2) On Windows, .exe files and .dll files are nearly identical in format. Native .exe and .dll files contain native code (the same stuff the processor executes), so there's no need to translate. Managed .exe and .dll files contain .NET bytecode which is first JIT compiled (translated into native code).
2a) In the case of a DLL built from C# .net what is running this? The
.Net framework or the operating system directly?
2a) After the code has been JIT compiled, it's run in exactly the same way as any other code.
3) Does a DLL from say Linux work on a Windows system (if such a thing
exists) or are they operating system specific?
3) Managed DLLs might work as-is, as long as the frameworks on both platforms are up to date and whoever wrote the DLL didn't deliberately break compatibility by using native calls. Native DLLs will not work as-is, because the file formats are different (Windows uses PE, Linux uses ELF), even though the machine code inside is the same if they're both for the same processor architecture. By the way, on Linux, "DLLs" are known as .so (shared object) files.
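The format difference is visible in the first few bytes of a file. As a small sketch (the function name is my own, and it only checks the leading magic bytes rather than validating the whole file): Windows PE files begin with the DOS "MZ" stub, while Linux ELF files begin with 0x7F followed by "ELF".

```python
def identify_binary_format(data: bytes) -> str:
    """Guess a binary's container format from its leading magic bytes."""
    if data[:4] == b"\x7fELF":
        return "ELF"        # Linux executables and .so files
    if data[:2] == b"MZ":
        return "PE"         # Windows .exe/.dll (MZ stub precedes the PE header)
    return "unknown"


print(identify_binary_format(b"MZ\x90\x00"))           # PE
print(identify_binary_format(b"\x7fELF\x02\x01\x01"))  # ELF
print(identify_binary_format(b"#!/bin/sh\n"))          # unknown
```

To try it on a real file, pass in something like `open(path, "rb").read(4)`.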
4) Are they specific to a particular framework? Can a DLL built using
C# .Net be used by a DLL built with Borland C++ (example only)?
4) Managed DLLs are particular to the .NET framework, but naturally they work with any compatible language. Native DLLs are compatible as long as everyone uses the same conventions: calling conventions (how function arguments are passed at the machine code level), symbol naming, etc.
5) Going back to the web.dll / business.dll example. To get a class
type of customer I need to reference business.dll from web.dll. This
must mean that business.dll contains a specification of some sort of
what a customer class actually is. If I had compiled my business.dll
file in say Delphi would C# understand it and be able to create a
customer class - or is there some sort of header info or something
that says "hey sorry you can only use me from another delphi dll".
5) Managed DLLs contain a full description of every class, method, field, etc. they contain. AFAIK Delphi doesn't support .NET, so it would create native DLLs, which can't be used in .NET straightforwardly. You will probably be able to call functions with P/Invoke, but class definitions will not be found. I don't use Delphi so I don't know how it stores type information with DLLs. C++, for example, relies on header (.h) files which contain the type declarations and must be distributed with the DLL.
6) On the subject of DLL hijacking, surely the replacement (bad) DLL
must contain the exact method signatures, types as the one that is
being hijacked. I suppose this wouldn't be hard to do if you could find
out what methods etc were available in the original DLL.
6) Indeed, it's not hard to do if you can easily switch the DLL. Code signing can be used to avoid this. In order for someone to replace a signed DLL, they would have to know the signing key, which is kept secret.
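Real code signing relies on asymmetric keys and is handled by the platform, so the following is not that. It is a deliberately simpler, related check (pinning a known SHA-256 digest) that shows why a swapped file is detectable even when its method signatures match. The function names and the byte strings standing in for file contents are hypothetical.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Hex digest of the file contents."""
    return hashlib.sha256(data).hexdigest()


def is_expected_library(data: bytes, expected_digest: str) -> bool:
    """True only if the contents hash to the digest recorded at build time."""
    return hmac.compare_digest(sha256_of(data), expected_digest)


original = b"bytes of the genuine Business.dll"    # stand-in contents
pinned_digest = sha256_of(original)                # recorded when shipping

hijacked = b"same exports, different code inside"  # stand-in contents
print(is_expected_library(original, pinned_digest))   # True
print(is_expected_library(hijacked, pinned_digest))   # False
```

Unlike a signature, a pinned digest has to be updated for every legitimate new version of the DLL, which is one reason signing is the usual approach.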
6a) A bit of a repeat question here but this goes back to what in my
C# program is deciding if I can access another DLL? If my hijacked DLL
contained exactly the same methods and types as the original but it
was compiled in another language would it work?
6a) It would work as long as it's a managed DLL, made with any .NET language.
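- What is DLL importing? and dll registration?
"DLL importing" can mean many things; usually it means referencing a DLL file and using things in it. In C# it often refers specifically to the DllImport attribute, which is how P/Invoke calls into native DLLs are declared.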
DLL registration is something that's done on Windows to globally register DLL files as COM components to make them available to any software on the system.
A .dll file contains compiled code you can use in your application.
Sometimes the tool used to compile the .dll matters, sometimes not. If you can reference the .dll in your project, it doesn't matter which tool was used to code the .dll's exposed functions.
The linking happens at run time, unlike statically linked code (such as the classes in your own project), which is bound at build time.
You can think of a .dll as a black box that provides something your application needs that you don't want to write yourself. Yes, someone understanding the .dll's signature could create another .dll file with different code inside it and your calling application couldn't know the difference.
HTH
1) At what point does web.dll dynamically link to business.dll? You
notice a lot in Windows HDD thrashing for seemingly small tasks when
using Word etc and I reckon that this Word going off and dynamically
linking in functionality from other DLL's?
1) I think you are confusing linking with loading. Linking is when references are checked and resolved, to be sure that what is asked for is actually available. Loading is when parts of the DLL are read into memory (they may later be paged out to the pagefile and read back in). This is the HD activity you are seeing.
Dynamic linking is different from static linking in that in static linking, all the object code is put into the main .exe at link time. With dynamic linking, the object code is put into a separate file (the dll) and it is loaded at a different time from the .exe.
Dynamic linking can be implicit (i.e. the app links with an import library), or explicit (i.e. the app uses LoadLibrary or LoadLibraryEx to load the DLL).
In the implicit case, /DELAYLOAD can be used to postpone the loading of the DLL until the app actually needs it. Otherwise, at least some parts of it are loaded (mapped into the process address space) as part of process initialization. The DLL can also request that it never be unloaded while the process is active.
COM uses LoadLibrary to load COM dlls. Note that even in the implicit case, the system is using something similar to LoadLibrary to load the dll either at process startup or on first use.
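Explicit loading can be tried out from a scripting language. The sketch below uses Python's ctypes module, which plays a role analogous to calling LoadLibrary and then GetProcAddress: the library is opened at run time and a function is looked up by name. The helper names are my own, and the choice of the C runtime and strlen is only for illustration.

```python
import ctypes
import sys


def load_c_runtime():
    """Explicitly load the C runtime library at run time."""
    if sys.platform == "win32":
        # Assumption: msvcrt.dll is present, as on typical Windows installs.
        return ctypes.CDLL("msvcrt")
    # On Linux/macOS, None yields a handle to the libraries already loaded
    # into this process, which include the C library.
    return ctypes.CDLL(None)


def c_strlen(text: bytes) -> int:
    libc = load_c_runtime()
    strlen = libc.strlen                 # look the symbol up by name
    strlen.argtypes = [ctypes.c_char_p]  # describe the native signature
    strlen.restype = ctypes.c_size_t
    return strlen(text)


print(c_strlen(b"Business.dll"))  # 12
```

Nothing checks that the declared argument and return types match the real function; getting them wrong is the caller's problem, which is the "same conventions" requirement for native DLLs in practice.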
2) What actually executes the code in the DLL? Does it get executed by
the processor or is there another stage of translation or compilation
before the processor will understand the code inside the DLL?
2) Dlls contain object code just like .exes. The format of the dll file is almost identical to the format of an exe file. I have heard that there is only one bit that is different in the headers of the two files.
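The bit in question is, as far as I know, the IMAGE_FILE_DLL flag (0x2000) in the Characteristics field of the PE file header, although real DLLs and EXEs usually differ in other ways too, such as having an export table. A minimal sketch that reads the flag follows; make_fake_pe is a hypothetical helper that builds just enough of a header to exercise the parser, not a loadable file.

```python
import struct

IMAGE_FILE_DLL = 0x2000  # Characteristics flag marking the image as a DLL


def is_dll(data: bytes) -> bool:
    """Return True if the PE header's Characteristics has the DLL flag set."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # The 4-byte little-endian offset of the PE signature lives at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # Characteristics is the last 2 bytes of the 20-byte COFF file header.
    (characteristics,) = struct.unpack_from("<H", data, pe_offset + 4 + 18)
    return bool(characteristics & IMAGE_FILE_DLL)


def make_fake_pe(characteristics: int) -> bytes:
    """Build a minimal, non-loadable header block for demonstration only."""
    header = bytearray(0x40 + 4 + 20)
    header[0:2] = b"MZ"
    struct.pack_into("<I", header, 0x3C, 0x40)   # PE signature at 0x40
    header[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", header, 0x40 + 4 + 18, characteristics)
    return bytes(header)


print(is_dll(make_fake_pe(0x2002)))  # True: executable image + DLL flag
print(is_dll(make_fake_pe(0x0002)))  # False: executable image only
```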
In the case of a DLL built from C# .NET, the .NET framework is running it.
3) Does a DLL from say Linux work on a Windows system (if such a thing
exists) or are they operating system specific?
3) DLLs are platform specific.
4) Are they specific to a particular framework? Can a DLL built using
C# .Net be used by a DLL built with Borland C++ (example only)?
4) Dlls can interoperate with other frameworks if special care is taken or some additional glue code is written.
Dlls are very useful when a company sells multiple products that have overlapping capabilities. For instance, I maintain a raster i/o dll that is used by more than 30 different products at the company. If you have multiple products installed, one upgrade of the dll can upgrade all the products to new raster formats.
5) Going back to the web.dll / business.dll example. To get a class
type of customer I need to reference business.dll from web.dll. This
must mean that business.dll contains a specification of some sort of
what a customer class actually is. If I had compiled my business.dll
file in say Delphi would C# understand it and be able to create a
customer class - or is there some sort of header info or something
that says "hey sorry you can only use me from another delphi dll".
5) Depending on the platform, the capabilities of a DLL are presented in various ways: through .h files, .tlb files, or, on .NET, metadata stored in the assembly itself.
6) On the subject of DLL hijacking, surely the replacement (bad) DLL
must contain the exact method signatures, types as the one that is
being hijacked. I suppose this wouldn't be hard to do if you could find
out what methods etc were available in the original DLL.
6) dumpbin /exports and dumpbin /imports are interesting commands to run on .exe and .dll files.