I went through the math in the "worked example" in the RSA wiki page: https://en.wikipedia.org/wiki/RSA_(algorithm) and understood it entirely. For the remainder of this question, I will use math variables consistent with the wiki page.
I'm on a Unix machine, and in the ~/.ssh directory I see these files:
id_rsa
id_rsa.pub
and I want to connect the theory with the practice.
What exactly is in id_rsa? If I cat it
cat id_rsa
I get a big jumble of characters. Is this some representation of the number n = pq? What representation is it exactly? Base64? If so, is id_rsa.pub supposed to be some representation of the numbers e and n?
In general, I'm trying to connect the theory of RSA with the actual practice as implemented through the ssh program on Unix machines. Any answers or pointers in the right direction would be greatly appreciated.
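id_rsa (in the traditional PEM format, the one that begins with -----BEGIN RSA PRIVATE KEY-----) is a Base64 encoding of a DER-encoded structure. The ASN.1 syntax for that DER-encoded structure is the RSAPrivateKey type described in RFC 3447 (aka PKCS #1), Appendix A.1.2.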
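As a rough illustration of that first layer, here is a minimal Python sketch that strips the PEM armor lines and Base64-decodes the body into raw DER bytes. It assumes a traditional, unencrypted PEM file; newer OpenSSH versions write a different container (-----BEGIN OPENSSH PRIVATE KEY-----) by default, which this does not handle. The function name pem_to_der and the tiny demo payload are my own, made up for illustration.

```python
import base64


def pem_to_der(pem_text: str) -> bytes:
    """Strip the PEM armor lines and Base64-decode the body into DER bytes.

    Assumption: the key is unencrypted, so there are no "Proc-Type:" or
    "DEK-Info:" header lines between the BEGIN line and the Base64 body.
    """
    body_lines = []
    for raw_line in pem_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and the -----BEGIN ...----- / -----END ...----- lines.
        if not line or line.startswith("-----"):
            continue
        body_lines.append(line)
    return base64.b64decode("".join(body_lines))


# Toy payload (NOT a real key): a SEQUENCE holding a single INTEGER 0.
demo_der = bytes.fromhex("3003020100")
demo_pem = (
    "-----BEGIN RSA PRIVATE KEY-----\n"
    + base64.b64encode(demo_der).decode("ascii")
    + "\n-----END RSA PRIVATE KEY-----\n"
)

print(pem_to_der(demo_pem).hex())  # 3003020100
```

Running the same function on the text of a real traditional-format id_rsa should give bytes whose hex dump starts with the SEQUENCE tag 30.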
DER encoding uses a tag-length-value notation. So here's a sample private key:
Here's the hex encoding:
The hex encoding starts with 30 82 02 5c. The 30 is the SEQUENCE tag. The 82 02 5c that follows encodes the length. The high bit of the first length byte is set (0x82 & 0x80), which means the length is in the "long form", and the low seven bits (0x82 & 0x7f = 2) say that the next two bytes hold the length. So the actual length of the SEQUENCE is 0x025c, i.e. 604 bytes. After that comes the value.
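The length rule is easy to check in code. Below is a minimal Python sketch of reading one tag-length header; the function name read_der_header is hypothetical, and it assumes single-byte tags (which is all an RSA private key uses) and does not validate its input.

```python
def read_der_header(data: bytes, offset: int = 0):
    """Return (tag, length, header_size) for the TLV that starts at offset.

    Assumption: the tag fits in one byte (no high-tag-number form).
    """
    tag = data[offset]
    first_len_byte = data[offset + 1]
    if first_len_byte & 0x80 == 0:
        # Short form: the byte itself is the length (0..127).
        return tag, first_len_byte, 2
    # Long form: the low 7 bits give the number of length bytes that follow.
    num_len_bytes = first_len_byte & 0x7F
    start = offset + 2
    length = int.from_bytes(data[start:start + num_len_bytes], "big")
    return tag, length, 2 + num_len_bytes


tag, length, header_size = read_der_header(bytes.fromhex("3082025c"))
print(hex(tag), length, header_size)  # 0x30 604 4
```

So the four header bytes 30 82 02 5c announce a SEQUENCE whose contents occupy the next 604 bytes.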
Then you get to the version. 02 is the INTEGER tag, 01 is the length and 00 is the value. A version of 0 means it's a two-prime key as opposed to a multi-prime key.
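The remaining fields follow the same pattern: after the version come eight more INTEGERs, which RFC 3447 names modulus, publicExponent, privateExponent, prime1, prime2, exponent1, exponent2 and coefficient. Here is a minimal Python sketch that walks such a structure and maps the fields back to the math variables. To keep it short it uses a hand-built toy key with p = 61, q = 53, e = 17, d = 413 (far too small for real use) rather than a real id_rsa, and the helper names are my own. It ignores the optional multi-prime fields.

```python
# Short variable names for the RSAPrivateKey fields, in encoding order.
# RFC 3447 names: version, modulus, publicExponent, privateExponent,
# prime1, prime2, exponent1, exponent2, coefficient.
FIELDS = ["version", "n", "e", "d", "p", "q", "dP", "dQ", "qInv"]


def read_header(data: bytes, offset: int):
    """Return (tag, length, header_size); assumes single-byte tags."""
    tag = data[offset]
    first = data[offset + 1]
    if first & 0x80 == 0:
        return tag, first, 2  # short form
    count = first & 0x7F  # long form: number of length bytes
    length = int.from_bytes(data[offset + 2:offset + 2 + count], "big")
    return tag, length, 2 + count


def parse_rsa_private_key(der: bytes) -> dict:
    """Walk the outer SEQUENCE and decode the INTEGERs inside it."""
    tag, seq_len, hdr = read_header(der, 0)
    if tag != 0x30:
        raise ValueError("expected a SEQUENCE (tag 0x30)")
    offset, end = hdr, hdr + seq_len
    values = []
    while offset < end and len(values) < len(FIELDS):
        tag, length, hdr = read_header(der, offset)
        if tag != 0x02:
            raise ValueError("expected an INTEGER (tag 0x02)")
        start = offset + hdr
        # DER INTEGERs are big-endian two's complement.
        values.append(int.from_bytes(der[start:start + length], "big", signed=True))
        offset = start + length
    return dict(zip(FIELDS, values))


# Hand-built toy key (an assumption for illustration, not a real key).
toy_der = bytes.fromhex(
    "301d"      # SEQUENCE, 29 bytes of content
    "020100"    # version = 0
    "02020ca1"  # n = 3233
    "020111"    # e = 17
    "0202019d"  # d = 413
    "02013d"    # p = 61
    "020135"    # q = 53
    "020135"    # dP = d mod (p-1) = 53
    "020131"    # dQ = d mod (q-1) = 49
    "020126"    # qInv = q^-1 mod p = 38
)

key = parse_rsa_private_key(toy_der)
print(key)
assert key["n"] == key["p"] * key["q"]
```

The point is that the private key file carries n, e, d, p and q together, plus three precomputed values that speed up decryption via the Chinese remainder theorem.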
More info on the Distinguished Encoding Rules.
Trying to understand ASN.1 as a whole is a lot more complicated, and much of it is unnecessary for the purpose of understanding the formatting of RSA private keys. For X.509 it becomes more necessary, but RSA keys aren't nearly as complicated, formatting-wise, as X.509 certs.
Hope that helps!