Where is the PEM file format specified?

Posted 2019-01-11 05:56

Question:

I need to parse .PEM files.
I know that the standard for "Privacy-enhanced Electronic Mail" is defined in RFCs 1421-1424. But they don't seem to mention some of the text I find inside OpenSSL .pem files (e.g. "Key Attributes", "BEGIN CERTIFICATE", etc.). Is this an OpenSSL-specific format?

Answer 1:

It's often beneficial to look at an existing implementation and see what they do. OpenSSL/LibreSSL defines these BEGIN and END markers in crypto/pem/pem.h. For example, the current LibreSSL has the following:

#define PEM_STRING_X509_OLD "X509 CERTIFICATE"
#define PEM_STRING_X509     "CERTIFICATE"
#define PEM_STRING_X509_PAIR    "CERTIFICATE PAIR"
#define PEM_STRING_X509_TRUSTED "TRUSTED CERTIFICATE"
#define PEM_STRING_X509_REQ_OLD "NEW CERTIFICATE REQUEST"
#define PEM_STRING_X509_REQ "CERTIFICATE REQUEST"
#define PEM_STRING_X509_CRL "X509 CRL"
#define PEM_STRING_EVP_PKEY "ANY PRIVATE KEY"
#define PEM_STRING_PUBLIC   "PUBLIC KEY"
#define PEM_STRING_RSA      "RSA PRIVATE KEY"
#define PEM_STRING_RSA_PUBLIC   "RSA PUBLIC KEY"
#define PEM_STRING_DSA      "DSA PRIVATE KEY"
#define PEM_STRING_DSA_PUBLIC   "DSA PUBLIC KEY"
#define PEM_STRING_PKCS7    "PKCS7"
#define PEM_STRING_PKCS7_SIGNED "PKCS #7 SIGNED DATA"
#define PEM_STRING_PKCS8    "ENCRYPTED PRIVATE KEY"
#define PEM_STRING_PKCS8INF "PRIVATE KEY"
#define PEM_STRING_DHPARAMS "DH PARAMETERS"
#define PEM_STRING_SSL_SESSION  "SSL SESSION PARAMETERS"
#define PEM_STRING_DSAPARAMS    "DSA PARAMETERS"
#define PEM_STRING_ECDSA_PUBLIC "ECDSA PUBLIC KEY"
#define PEM_STRING_ECPARAMETERS "EC PARAMETERS"
#define PEM_STRING_ECPRIVATEKEY "EC PRIVATE KEY"
#define PEM_STRING_PARAMETERS   "PARAMETERS"
#define PEM_STRING_CMS      "CMS"

As far as I know, there is no master list of BEGIN/END markers. Each implementation defines them as needed, and if you want to interoperate with that implementation, you add its strings to your own parser.
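As a rough illustration of that "add the string to your own" approach, here is a minimal Python sketch. The names `KNOWN_LABELS`, `BEGIN_RE` and `find_labels` are made up for this example, and the label set is only a subset copied from the header excerpt above, not a complete list.

```python
import re

# A subset of the labels from the pem.h excerpt above; extend as needed.
KNOWN_LABELS = {
    "CERTIFICATE",
    "X509 CERTIFICATE",
    "CERTIFICATE REQUEST",
    "NEW CERTIFICATE REQUEST",
    "X509 CRL",
    "PUBLIC KEY",
    "PRIVATE KEY",
    "ENCRYPTED PRIVATE KEY",
    "RSA PRIVATE KEY",
    "EC PRIVATE KEY",
}

# Matches a whole "-----BEGIN <label>-----" line and captures the label.
BEGIN_RE = re.compile(r"^-----BEGIN ([A-Z0-9 #]+)-----[ \t\r]*$", re.MULTILINE)


def find_labels(text):
    """Return (label, is_known) for every BEGIN marker in text, in order."""
    return [(label, label in KNOWN_LABELS) for label in BEGIN_RE.findall(text)]
```

An unknown label is reported rather than rejected, so the caller can decide whether to skip the block or add the string to the set.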



Answer 2:

Updated answer for 2015: As users have already answered twice, before moderator @royhowie deleted the answers: there is now RFC 7468 which defines the PEM headers. The following quote is only a small part, and you should read the actual spec, which will likely stay on the internet for far longer than StackOverflow will.

However @royhowie deletes every answer that points to the RFC as 'link only' unless it has some text. So here is some text:

  7. Textual Encoding of PKCS #10 Certification Request Syntax

    PKCS #10 Certification Requests are encoded using the "CERTIFICATE REQUEST" label. The encoded data MUST be a BER (DER strongly preferred; see Appendix B) encoded ASN.1 CertificationRequest structure as described in [RFC2986].

-----BEGIN CERTIFICATE REQUEST-----
MIIBWDCCAQcCAQAwTjELMAkGA1UEBhMCU0UxJzAlBgNVBAoTHlNpbW9uIEpvc2Vm
c3NvbiBEYXRha29uc3VsdCBBQjEWMBQGA1UEAxMNam9zZWZzc29uLm9yZzBOMBAG
ByqGSM49AgEGBSuBBAAhAzoABLLPSkuXY0l66MbxVJ3Mot5FCFuqQfn6dTs+9/CM
EOlSwVej77tj56kj9R/j9Q+LfysX8FO9I5p3oGIwYAYJKoZIhvcNAQkOMVMwUTAY
BgNVHREEETAPgg1qb3NlZnNzb24ub3JnMAwGA1UdEwEB/wQCMAAwDwYDVR0PAQH/
BAUDAwegADAWBgNVHSUBAf8EDDAKBggrBgEFBQcDATAKBggqhkjOPQQDAgM/ADA8
AhxBvfhxPFfbBbsE1NoFmCUczOFApEuQVUw3ZP69AhwWXk3dgSUsKnuwL5g/ftAY
dEQc8B8jAcnuOrfU
-----END CERTIFICATE REQUEST-----

Figure 9: PKCS #10 Example

The label "NEW CERTIFICATE REQUEST" is also in wide use. Generators conforming to this document MUST generate "CERTIFICATE REQUEST" labels. Parsers MAY treat "NEW CERTIFICATE REQUEST" as equivalent to "CERTIFICATE REQUEST".
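A parser that takes up the MAY in the quoted passage could map the older label onto the standard one before dispatching on it. The following is only a sketch: `LABEL_ALIASES` and `canonical_label` are hypothetical names, and the table holds just the one equivalence stated in the quote.

```python
# Alias table: only the equivalence stated in the quoted passage is listed.
LABEL_ALIASES = {
    "NEW CERTIFICATE REQUEST": "CERTIFICATE REQUEST",
}


def canonical_label(label):
    """Map a tolerated legacy label to its standard form; pass others through."""
    return LABEL_ALIASES.get(label, label)
```

Output code would always write the canonical label, in line with the MUST for generators.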



Answer 3:

To get you started: as far as I know, any human-readable part (words and so on) is there so that a human operator can see what the certificate in question is, its expiry dates, etc., for a quick manual check. So you can ignore that.

You'll want to parse what's between the BEGIN-END blocks.

Inside, you'll find a Base64 encoded entity that you need to Base64 decode into bytes. These bytes represent a DER encoded certificate/key/etc. I'm not sure what good libraries you could use for parsing the DER data.
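The extract-and-decode step described above can be sketched in Python with only the standard library. The names `PEM_BLOCK_RE` and `iter_pem_blocks` are invented for this example. It assumes the block body is pure Base64; blocks that carry legacy header lines inside the markers (such as "Proc-Type: 4,ENCRYPTED" in encrypted OpenSSL keys) are not handled and will raise an error.

```python
import base64
import re

# A BEGIN line, the body, and an END line carrying the same label.
PEM_BLOCK_RE = re.compile(
    r"-----BEGIN ([^-\r\n]+)-----\r?\n(.*?)-----END \1-----",
    re.DOTALL,
)


def iter_pem_blocks(text):
    """Yield (label, der_bytes) for each BEGIN/END block found in text.

    Anything outside the markers (human-readable dumps, comments) is ignored.
    """
    for match in PEM_BLOCK_RE.finditer(text):
        label, body = match.group(1), match.group(2)
        # Drop line breaks and other whitespace before decoding.
        b64 = "".join(body.split())
        # validate=True raises binascii.Error on non-Base64 characters,
        # which is what happens if the body contains header lines.
        yield label, base64.b64decode(b64, validate=True)
```

The returned bytes are the raw DER data. For certificates, requests and most keys, the first byte is 0x30, the tag of an ASN.1 SEQUENCE, which makes a cheap sanity check before handing the bytes to an ASN.1 parser.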

As a test to understand what data is inside each block, you can paste what's between the BEGIN-END blocks to this site which does ASN.1 decoding in JavaScript:

http://lapo.it/asn1js/

That said, I wouldn't paste production private keys into any website, even though this one appears to do the decoding purely in client-side JavaScript.

Base64: http://en.wikipedia.org/wiki/Base64

DER: http://en.wikipedia.org/wiki/Distinguished_Encoding_Rules

ASN.1: http://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One



Answer 4:

I found an old thread regarding this issue. It looks like there is no "official" standard format for the encapsulation boundaries and the best way to determine this is by guessing the contents based on well-known keywords you find in the BEGIN statement.

As answered by indiv, for the full list of the keywords, refer to the OpenSSL crypto/pem/pem.h header file.



Answer 5:

I am unsure if it's specific to OpenSSL, but the documentation for PEM Encryption Format may be what you're looking for.



Answer 6:

Where is the PEM file format specified?

There is no one place. It depends on the standard. You can even make up your own encapsulation boundaries and use them in your own software.

As @indiv stated, OpenSSL has a fairly comprehensive list at <openssl dir>/crypto/pem/pem.h.

Someone asked the PKIX Working Group to provide a list like you are asking for back in 2006. The working group declined. See PEM file format rfc draft request.