How to read a winmd (WinRT metadata file)?

Posted 2020-07-17 05:53

A WinMD is a binary metadata file that contains everything you need to learn about the namespaces, types, classes, methods, and parameters available in a native WinRT DLL.

From Windows Runtime design:

The Windows Runtime is exposed using API metadata (.winmd files). This is the same format used by the .NET framework (Ecma-335). The underlying binary contract makes it easy for you to access the Windows Runtime APIs directly in the development language of your choice.

Each .winmd file exposes one or more namespaces. These namespaces are grouped by the functionality that they provide. A namespace contains types such as classes, structures, and enumerations.

Great; how do I access it?

Winmd is COM

WinRT under the hood is still COM, and Winmd (Windows Metadata) is WinRT's modern version of the old TLB (type library) files from COM.

| COM                        | WinRT                          |
|----------------------------|--------------------------------|
| CoInitialize               | RoInitialize                   |
| CoCreateInstance(ProgID)¹  | RoActivateInstance(ClassName)  |
| *.tlb                      | *.winmd                        |
| compiled from idl          | compiled from idl              |
| HKCR\Classes\[ProgID]      | HKLM\Software\Microsoft\WindowsRuntime\ActivatableClassId\[ClassName] |
| Code stored in native dll  | Code stored in native dll      |
| DllGetClassObject          | DllGetClassObject              |
| Is native code             | Is native code                 |
| IUnknown                   | IUnknown (and IInspectable)    |
| stdcall calling convention | stdcall calling convention     |
| Everything returns HRESULT | Everything returns HRESULT     |
| LoadTypeLib(*.tlb)         | ???(*.winmd)                   |

Reading metadata from a COM tlb

Given a COM tlb file (e.g. stdole2.tlb), you can use various Windows functions to parse it and get information out of it.

A call to LoadTypeLib gets you an ITypeLib interface:

ITypeLib tlb = LoadTypeLib("c:\Windows\system32\stdole2.tlb");

And then you can start iterating everything in the type library

for (int i = 0; i < tlb.GetTypeInfoCount(); i++)
{
   ITypeInfo typeInfo = tlb.GetTypeInfo(i);
   TYPEATTR typeAttr = typeInfo.GetTypeAttr();

   switch (typeAttr.typekind)
   {
      case TKIND_ENUM:      LoadEnum(typeInfo, typeAttr); break;
      case TKIND_DISPATCH:
      case TKIND_INTERFACE: LoadInterface(typeInfo, typeAttr); break;
      case TKIND_COCLASS:   LoadCoClass(typeInfo, typeAttr); break;
      default:              break; //Unknown
   }
   typeInfo.ReleaseTypeAttr(typeAttr);
}

How do we do the same with *.winmd files in the WinRT world?

From Larry Osterman:

From the idl files we produce a winmd file. A winmd file is the canonical definition of the type. And that's what get handed off to the language projections. The language projections read the winmd files, and they know how to take the contents of that winmd file - which is a binary file - and then project that and produce the appropriate language constructs for that language.

They all read that winmd file. It happens to be an ECMA-335 metadata-only assembly. That's the technical detail of the packaging file format.

One of the nice things about producing winmds, because it's regular, we can now build tooling to sort, collate, combine, the methods and types in a winmd file.

Loading metadata from a winmd

I've tried using RoGetMetaDataFile to load a WinMD. But RoGetMetaDataFile is not meant to let you process a winmd file directly. It is meant to let you discover information about a type that you already know exists - and you know its name.

Calling RoGetMetaDataFile fails if you pass it a winmd filename:

HSTRING name = CreateWindowsString("C:\Windows\System32\WinMetadata\Windows.Globalization.winmd");
IMetaDataImport2 mdImport;
mdTypeDef mdType;

HRESULT hr = RoGetMetaDataFile(name, null, null, out mdImport, out mdType);


0x80073D54
The process has no package identity

Which corresponds to AppModel error code:

#define APPMODEL_ERROR_NO_PACKAGE        15700L
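
The link between the two numbers is the HRESULT_FROM_WIN32 macro: a positive Win32 error code is wrapped with the failure bit and FACILITY_WIN32 (7). A minimal sketch of that arithmetic, in Python purely for brevity (the function names are mine, not part of any Windows API):

```python
FACILITY_WIN32 = 7
APPMODEL_ERROR_NO_PACKAGE = 15700


def hresult_from_win32(code):
    """Mirror HRESULT_FROM_WIN32 for positive Win32 error codes."""
    return 0x80000000 | (FACILITY_WIN32 << 16) | (code & 0xFFFF)


def win32_from_hresult(hr):
    """Return the wrapped Win32 code, or None if hr is not a failed
    FACILITY_WIN32 HRESULT."""
    is_failure = bool(hr & 0x80000000)
    facility = (hr >> 16) & 0x1FFF
    if is_failure and facility == FACILITY_WIN32:
        return hr & 0xFFFF
    return None


if __name__ == "__main__":
    print(hex(hresult_from_win32(APPMODEL_ERROR_NO_PACKAGE)))  # 0x80073d54
    print(win32_from_hresult(0x80073D54))                      # 15700
```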

But RoGetMetaDataFile does succeed if you pass a class:

RoGetMetaDataFile("Windows.Globalization.Calendar", ...);

MetaData Dispenser

There was a suggestion to use MetaDataGetDispenser to create an IMetaDataDispenser.

IMetaDataDispenser dispenser;
MetaDataGetDispenser(CLSID_CorMetaDataDispenser, IMetaDataDispenser, out dispenser);

Presumably you can use the OpenScope method to open a winmd file:

Opens an existing, on-disk file and maps its metadata into memory.
The file must contain common language runtime (CLR) metadata.

Where the first parameter (Scope) is "The name of the file to be opened."

So we try:

IUnknown unk;
dispenser.OpenScope(name, ofRead, IID_?????, out unk);

Except I don't know what interface I'm supposed to be asking for; the documentation won't say. It does remark:

The in-memory copy of the metadata can be queried using methods from one of the "import" interfaces, or added to using methods from the one of the "emit" interfaces.

The author who put the emphasis on the words "import" and "emit" is probably trying to provide a clue - without outright giving away the answer.

Bonus Chatter

  • I don't know the namespaces or types in the winmd (that's what we're trying to figure out)
  • with WinRT I'm not running managed code inside a CLR; this is for native code

The hypothetical motivation we can use for this question is that we're going to be creating a projection for a language that doesn't have one yet (e.g. ada, bpl, b, c). The other hypothetical motivation is to allow an IDE to be able to display metadata contents of a winmd file.

Also, remember that WinRT is not related to .NET in any way.

  • It is not managed code.
  • It does not exist in an assembly.
  • It does not run inside a .NET runtime.
  • But since .NET already provides you a way to interop with COM (and given that WinRT is COM), you are able to call WinRT classes from your managed code.

Many people seem to think WinRT is another name for .NET. WinRT does not use, require, or operate in .NET, C#, a .NET framework, or a .NET runtime.

  • WinRT is to native code
  • as .NET Framework Class Library is to managed code

WinRT is a class library for native code. .NET people already have their own class library.

Bonus Question

What are the functions in native mscoree.dll that let you process the metadata of an ECMA-335 binary file?

Bonus Reading

2 Answers
别忘想泡老子
#2 · 2020-07-17 06:38

.winmd files follow ECMA-335 standard, so any code able to read .NET assemblies can read .winmd files.

Two options I've used personally were Mono.Cecil and System.Reflection.Metadata. I personally found Mono.Cecil to be easier to work with.
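
To give a feel for how approachable the format is: every ECMA-335 file has a metadata root that starts with the signature "BSJB", followed by a version string and a list of stream headers (ECMA-335, partition II, section 24.2.1). A rough sketch of reading that header, in Python for brevity. It assumes you have already located the root inside the PE image via the CLI header, which is not shown here, and parse_metadata_root is a made-up name:

```python
import struct

METADATA_SIGNATURE = 0x424A5342  # the bytes "BSJB" read as a little-endian DWORD


def parse_metadata_root(buf):
    """Parse an ECMA-335 metadata root held in `buf` (bytes).

    Returns (version_string, [stream names]).
    """
    signature, _major, _minor, _reserved, length = struct.unpack_from("<IHHII", buf, 0)
    if signature != METADATA_SIGNATURE:
        raise ValueError("not an ECMA-335 metadata root")

    pos = 16
    # The version string occupies `length` bytes, null terminated and padded.
    version = buf[pos:pos + length].split(b"\0", 1)[0].decode("utf-8")
    pos += length

    _flags, stream_count = struct.unpack_from("<HH", buf, pos)
    pos += 4

    names = []
    for _ in range(stream_count):
        _offset, _size = struct.unpack_from("<II", buf, pos)
        pos += 8
        end = buf.index(b"\0", pos)
        names.append(buf[pos:end].decode("ascii"))
        # The name plus its terminator is padded to a 4-byte boundary.
        pos += (end - pos + 1 + 3) & ~3
    return version, names
```

The stream names you would expect to see in a real file are the usual ECMA-335 ones such as "#~" (the metadata tables) and "#Strings".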

啃猪蹄的小仙女
#3 · 2020-07-17 06:44

One problem is that there are two sets of documentation for IMetaDataDispenser::OpenScope:

The Windows Runtime version of the documentation doesn't say which interfaces are valid:

riid

The IID of the desired metadata interface to be returned; the caller will use the interface to import (read) or emit (write) metadata.

The .NET Framework version does offer documentation:

riid

[in] The IID of the desired metadata interface to be returned; the caller will use the interface to import (read) or emit (write) metadata.

The value of riid must specify one of the "import" or "emit" interfaces. Valid values are:

  • IID_IMetaDataImport
  • IID_IMetaDataImport2
  • IID_IMetaDataAssemblyImport
  • IID_IMetaDataEmit
  • IID_IMetaDataEmit2
  • IID_IMetaDataAssemblyEmit

So now we can start to put everything together.


  1. Create your metadata dispenser:

    IMetaDataDispenser dispenser;
    MetaDataGetDispenser(CLSID_CorMetaDataDispenser, IMetaDataDispenser, out dispenser);
    
  2. Use OpenScope to specify the *.winmd file you want to read. We ask for the IMetaDataImport interface, because we want to import (read) data from a winmd rather than emit (write) it:

    //Open the winmd file we want to dump
    String filename = "C:\Windows\System32\WinMetadata\Windows.Globalization.winmd";
    
    IMetaDataImport reader; //IMetaDataImport2 adds support for generics
    dispenser.OpenScope(filename, ofRead, IMetaDataImport, out reader); //"Import" is used to read metadata. "Emit" is used to write metadata.
    
  3. Once you have the metadata importer, you can start to enumerate all the types in the metadata file:

    Pointer hEnum = null;   //HCORENUM; must start out null
    mdTypeDef typeID;
    UInt32 nRead;
    while (reader.EnumTypeDefs(ref hEnum, out typeID, 1, out nRead) == S_OK)
    {
       ProcessToken(reader, typeID);
    }
    reader.CloseEnum(hEnum);
    
  4. And now for each typeID in the winmd you can get various properties:

    void ProcessToken(IMetaDataImport reader, mdTypeDef typeID)
    {
       //Get three interesting properties of the token:
       String      typeName;       //e.g. "Windows.Globalization.NumberFormatting.DecimalFormatter"
       UInt32      ancestorTypeID; //the token of this type's ancestor (e.g. Object, Interface, System.ValueType, System.Enum)
       CorTypeAttr flags;          //various flags about the type (e.g. public, private, is an interface)
    
       GetTypeInfo(reader, typeID, out typeName, out ancestorTypeID, out flags);
    }
    

And there's some trickery needed when getting information about a type:

  • if the type is defined in the winmd itself: use GetTypeDefProps
  • if the type is a "reference" to a type that exists in another winmd: use GetTypeRefProps

One way to tell the difference is to try reading the type properties as if the token were a type definition, using GetTypeDefProps, and check the return value:

  • if it returns S_OK it's a type definition
  • if it returns anything else (e.g. S_FALSE), retry it as a type reference
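
An alternative to probing with GetTypeDefProps is to look at the token itself. Every metadata token carries its table in the top byte (this is what the TypeFromToken and RidFromToken macros in corhdr.h extract), so a TypeDef, TypeRef, or TypeSpec can be told apart without calling the importer. A small sketch, in Python for brevity; the constant values are the CorTokenType ones from corhdr.h, the helper names are my own:

```python
# Token-type constants (CorTokenType in corhdr.h).
MDT_TYPE_REF = 0x01000000
MDT_TYPE_DEF = 0x02000000
MDT_TYPE_SPEC = 0x1B000000

_KINDS = {
    MDT_TYPE_REF: "TypeRef",
    MDT_TYPE_DEF: "TypeDef",
    MDT_TYPE_SPEC: "TypeSpec",
}


def type_from_token(token):
    """Equivalent of TypeFromToken: keep only the table byte."""
    return token & 0xFF000000


def rid_from_token(token):
    """Equivalent of RidFromToken: the row number inside that table."""
    return token & 0x00FFFFFF


def describe_token(token):
    """Return (kind, row) where kind is TypeRef, TypeDef, TypeSpec or other."""
    return _KINDS.get(type_from_token(token), "other"), rid_from_token(token)


if __name__ == "__main__":
    print(describe_token(0x02000005))  # ('TypeDef', 5)
    print(describe_token(0x01000004))  # ('TypeRef', 4)
```

This matters mostly for the ancestor token: EnumTypeDefs only hands out TypeDef tokens, but the base type of a definition may live in another winmd and then shows up as a TypeRef.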

    1. Get the properties of the type, including:

      • typeName: e.g. "Windows.Globalization.NumberFormatting.DecimalFormatter"
      • ancestorTypeID: e.g. 0x10000004
      • flags: e.g. 0x00004101

     

    void GetTypeInfo(IMetaDataImport reader, mdToken typeID,
          out String typeName, out mdToken ancestorTypeID, out CorTypeAttr flags)
    {
       HRESULT hr;
       UInt32 nRead;

       //First call with no buffer, to learn the length of the name
       hr = reader.GetTypeDefProps(typeID, null, 0, out nRead, out flags, out ancestorTypeID);
       if (hr == S_OK)
       {
          //Allocate buffer for name
          SetLength(typeName, nRead);
          reader.GetTypeDefProps(typeID, typeName, Length(typeName),
                out nRead, out flags, out ancestorTypeID);
          return;
       }

       //We couldn't find it as a type **definition**.
       //Try again as a type **reference**.
       //A reference has a resolution scope instead of flags and a base type.
       mdToken resolutionScope;
       flags = 0;
       ancestorTypeID = mdTokenNil;
       hr = reader.GetTypeRefProps(typeID, out resolutionScope, null, 0, out nRead);
       if (hr == S_OK)
       {
          //Allocate buffer for name
          SetLength(typeName, nRead);
          reader.GetTypeRefProps(typeID, out resolutionScope, typeName, Length(typeName),
                out nRead);
          return;
       }
    }
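
The flags value returned above is a CorTypeAttr bit field. As a sketch of how the example value 0x00004101 breaks down (Python for brevity; only a handful of the corhdr.h flags are listed, and decode_type_flags is a made-up helper name):

```python
# A few CorTypeAttr values from corhdr.h (not the complete set).
TD_VISIBILITY_MASK = 0x00000007
TD_PUBLIC = 0x00000001
TD_INTERFACE = 0x00000020
TD_ABSTRACT = 0x00000080
TD_SEALED = 0x00000100
TD_WINDOWS_RUNTIME = 0x00004000

_BIT_NAMES = (
    (TD_INTERFACE, "interface"),
    (TD_ABSTRACT, "abstract"),
    (TD_SEALED, "sealed"),
    (TD_WINDOWS_RUNTIME, "windows-runtime"),
)


def decode_type_flags(flags):
    """Return readable names for the CorTypeAttr bits this sketch knows about."""
    visibility = flags & TD_VISIBILITY_MASK
    if visibility == 0:
        names = ["not public"]
    elif visibility == TD_PUBLIC:
        names = ["public"]
    else:
        names = ["nested"]  # values 2..7 are the nested visibilities
    for bit, name in _BIT_NAMES:
        if flags & bit:
            names.append(name)
    return names


if __name__ == "__main__":
    print(decode_type_flags(0x00004101))  # ['public', 'sealed', 'windows-runtime']
```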
    

There are some other interesting gotchas if you're trying to decipher types. In the Windows Runtime, everything is either fundamentally:

  • an interface
  • or a class

Structs and enums are also classes, but each descends from a specific class:

  • interface
  • class
    • System.ValueType --> struct
    • System.Enum --> enum
    • class
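
That hierarchy translates into a short decision rule. A sketch in Python for brevity, assuming you have already resolved the ancestor token to a full type name (classify_type and base_type_name are hypothetical names; 0x20 is tdInterface from corhdr.h):

```python
TD_INTERFACE = 0x00000020  # tdInterface in CorTypeAttr


def classify_type(flags, base_type_name):
    """Classify a type as interface, struct, enum or class.

    flags          -- the CorTypeAttr value of the type
    base_type_name -- full name of the ancestor type, or None if there is none
    """
    if flags & TD_INTERFACE:
        return "interface"
    if base_type_name == "System.ValueType":
        return "struct"
    if base_type_name == "System.Enum":
        return "enum"
    return "class"


if __name__ == "__main__":
    print(classify_type(0x00004101, "System.Object"))     # class
    print(classify_type(0x00004101, "System.Enum"))       # enum
    print(classify_type(0x000040A1, None))                # interface
```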

Invaluable assistance came from:

which I believe is the only documentation in existence on reading metadata from an ECMA-335 assembly using Microsoft's API.
