I'm trying to write a proxy server for logging and shaping traffic from sites hosted on an IIS farm. The hosting guidelines say:
Add or configure your proxy server to allow requests from the web servers to internet resources. Be sure that you log requests from the web servers.
Only allow the web server to make proxy requests to the internet, not to the internal network. So, if the destination of the request is the internet, it should be allowed to go through proxy. But if the application is trying to request a resource or server on the internal network, it should be prevented.
I'm using FiddlerCore (the library that drives Fiddler) which lets me inspect requests before they are sent, and again once the response headers are returned (at which point I have the host IP).
What can I do to determine if a request is being made locally or to the internet? Currently I'm blacklisting known internal IPs, but it doesn't seem right.
What's wrong with blacklisting internal IPs? Typically you'd just consider all the RFC 1918 non-routable networks to be internal (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) and call it good. If you're actually publicly routable across the board, well, then the same applies but add your publicly routable network(s) to the list. Am I missing something?
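For what it's worth, here is a minimal sketch of that check in Python rather than FiddlerCore's own C#, using only the standard `ipaddress` module. The function name `is_internal` and the `extra_networks` parameter are made up for illustration; the three ranges are the RFC 1918 ones listed above, and anything you pass in `extra_networks` stands in for your own publicly routable networks.

```python
import ipaddress

# RFC 1918 private ranges named in the answer above.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_internal(ip_text, extra_networks=()):
    """Return True if ip_text falls inside any network treated as internal.

    extra_networks is a hypothetical hook for site-specific ranges,
    for example a publicly routable block that is still "inside".
    """
    ip = ipaddress.ip_address(ip_text)
    networks = list(RFC1918_NETWORKS)
    networks += [ipaddress.ip_network(n) for n in extra_networks]
    # Membership is False when the address and network versions differ,
    # so an IPv6 address never matches these IPv4 ranges.
    return any(ip in net for net in networks)
```

You would call this with the host IP you get once the response headers are back, e.g. `is_internal("10.1.2.3")`, and refuse to relay the response when it returns True. Note that this sketch does not cover loopback, link-local, or IPv6 private ranges; add those to the list if they matter in your environment.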
HTTP/1.1 requires the client to send a Host header with every request. So you can decide whether the destination is local or on the internet by inspecting the Host header.
If you have clients that violate this (no Host header specified), you are entitled to reject them with 400 (Bad Request). It is worth testing first whether any of your clients fail to follow the protocol.
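A rough sketch of that idea, again in Python rather than FiddlerCore's C#: pull the host out of the Host header and answer 400 when it is missing or unparsable. The names `host_from_headers` and `classify` are hypothetical, and the headers are assumed to arrive as a plain dict of name to value.

```python
from urllib.parse import urlsplit


def host_from_headers(headers):
    """Return the lower-cased host from the Host header, or None."""
    for name, value in headers.items():
        if name.lower() == "host":
            try:
                # Prefixing "//" lets urlsplit separate host from port,
                # including bracketed IPv6 literals such as [::1]:8080.
                return urlsplit("//" + value.strip()).hostname
            except ValueError:
                return None  # malformed value, e.g. unbalanced brackets
    return None


def classify(headers):
    """Return ("reject", 400) without a usable host, else ("forward", host)."""
    host = host_from_headers(headers)
    if host is None:
        return ("reject", 400)
    return ("forward", host)
```

The host you get back may be a name rather than an address, so it still has to be resolved (or you wait for the host IP at the response-header stage) before you can compare it against your internal ranges.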
The problem, generally, is deciding which addresses are "internal" and which are external. RFC 1918 is one mechanism by which you can do that, but IE uses a different strategy; see http://msdn.microsoft.com/en-us/library/bb250483(v=vs.85).aspx for more on how IE does it.