As I described in a previous question, I have an assignment to write a proxy server. It partially works now, but I still have a problem handling gzipped content. I store the HttpResponse in a String, and it appears I can't do that with gzipped content. However, the headers are text that I need to parse, and they all come from the same InputStream. My question is: what do I have to do to handle binary responses correctly while still parsing the headers as strings?
>> Please see the edit below before you look at the code.
Here's the Response class implementation:
public class Response {
private String fullResponse = "";
private BufferedReader reader;
private boolean busy = true;
private int responseCode;
private CacheControl cacheControl;
public Response(String input) {
this(new ByteArrayInputStream(input.getBytes()));
}
public Response(InputStream input) {
reader = new BufferedReader(new InputStreamReader(input));
try {
while (!reader.ready());//wait for initialization.
String line;
while ((line = reader.readLine()) != null) {
fullResponse += "\r\n" + line;
if (HttpPatterns.RESPONSE_CODE.matches(line)) {
responseCode = (Integer) HttpPatterns.RESPONSE_CODE.process(line);
} else if (HttpPatterns.CACHE_CONTROL.matches(line)) {
cacheControl = (CacheControl) HttpPatterns.CACHE_CONTROL.process(line);
}
}
reader.close();
fullResponse = "\r\n" + fullResponse.trim() + "\r\n\r\n";
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
busy = false;
}
public CacheControl getCacheControl() {
return cacheControl;
}
public String getFullResponse() {
return fullResponse;
}
public boolean isBusy() {
return busy;
}
public int getResponseCode() {
return responseCode;
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result
+ ((fullResponse == null) ? 0 : fullResponse.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (!(obj instanceof Response))
return false;
Response other = (Response) obj;
if (fullResponse == null) {
if (other.fullResponse != null)
return false;
} else if (!fullResponse.equals(other.fullResponse))
return false;
return true;
}
@Override
public String toString() {
return "Response\n==============================\n" + fullResponse;
}
}
And here's HttpPatterns:
public enum HttpPatterns {
RESPONSE_CODE("^HTTP/1\\.1 (\\d+) .*$"),
CACHE_CONTROL("^Cache-Control: (\\w+)$"),
HOST("^Host: (\\w+)$"),
REQUEST_HEADER("(GET|POST) ([^\\s]+) ([^\\s]+)$"),
ACCEPT_ENCODING("^Accept-Encoding: .*$");
private final Pattern pattern;
HttpPatterns(String regex) {
pattern = Pattern.compile(regex);
}
public boolean matches(String expression) {
return pattern.matcher(expression).matches();
}
public Object process(String expression) {
Matcher matcher = pattern.matcher(expression);
if (!matcher.matches()) {
throw new RuntimeException("Called `process`, but the expression doesn't match. Call `matches` first.");
}
if (this == RESPONSE_CODE) {
return Integer.parseInt(matcher.group(1));
} else if (this == CACHE_CONTROL) {
return CacheControl.parseString(matcher.group(1));
} else if (this == HOST) {
return matcher.group(1);
} else if (this == REQUEST_HEADER) {
return new RequestHeader(RequestType.parseString(matcher.group(1)), matcher.group(2), matcher.group(3));
} else { //never happens
return null;
}
}
}
EDIT
I tried implementing this according to the suggestions, but it's not working and I'm becoming desperate. When I try to view an image I get the following message from the browser:
The image “http://www.google.com/images/logos/ps_logo2.png” cannot be displayed because it contains errors.
Here's the log:
Request
==============================
GET http://www.google.com/images/logos/ps_logo2.png HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Cookie: PREF=ID=31f95dd7f42dfc7d:TM=1303507626:LM=1303507626:S=D4kIZ6rGFrlOUWlm
Not Reading from the Cache!!!!
I am going to try to connect to: www.google.com at port 80
Connected.
Writing to the server's buffer...
flushed.
Getting a response...
Got a binary response!
contentLength = 26209; headers.length() = 312; responseLength = 12136; fullResponse length = 12136
Got a response!
Writing to the Cache!!!!
I am going to write the following response:
HTTP/1.1 200 OK
Content-Type: image/png
Last-Modified: Thu, 05 Aug 2010 22:54:44 GMT
Date: Wed, 04 May 2011 15:05:30 GMT
Expires: Wed, 04 May 2011 15:05:30 GMT
Cache-Control: private, max-age=31536000
X-Content-Type-Options: nosniff
Server: sffe
Content-Length: 26209
X-XSS-Protection: 1; mode=block
Response body is binary and was truncated.
Finished with request!
Here's the new Response class:
public class Response {
private String headers = "";
private BufferedReader reader;
private boolean busy = true;
private int responseCode;
private CacheControl cacheControl;
private InputStream fullResponse;
private ContentEncoding encoding = ContentEncoding.TEXT;
private ContentType contentType = ContentType.TEXT;
private int contentLength;
public Response(String input) {
this(new ByteArrayInputStream(input.getBytes()));
}
public Response(InputStream input) {
ByteArrayOutputStream tempStream = new ByteArrayOutputStream();
InputStreamReader inputReader = new InputStreamReader(input);
try {
while (!inputReader.ready());
int responseLength = 0;
while (inputReader.ready()) {
tempStream.write(inputReader.read());
responseLength++;
}
/*
* Read the headers
*/
reader = new BufferedReader(new InputStreamReader(new ByteArrayInputStream(tempStream.toByteArray())));
while (!reader.ready());//wait for initialization.
String line;
while ((line = reader.readLine()) != null) {
headers += "\r\n" + line;
if (HttpPatterns.RESPONSE_CODE.matches(line)) {
responseCode = (Integer) HttpPatterns.RESPONSE_CODE.process(line);
} else if (HttpPatterns.CACHE_CONTROL.matches(line)) {
cacheControl = (CacheControl) HttpPatterns.CACHE_CONTROL.process(line);
} else if (HttpPatterns.CONTENT_ENCODING.matches(line)) {
encoding = (ContentEncoding) HttpPatterns.CONTENT_ENCODING.process(line);
} else if (HttpPatterns.CONTENT_TYPE.matches(line)) {
contentType = (ContentType) HttpPatterns.CONTENT_TYPE.process(line);
} else if (HttpPatterns.CONTENT_LENGTH.matches(line)) {
contentLength = (Integer) HttpPatterns.CONTENT_LENGTH.process(line);
} else if (line.isEmpty()) {
break;
}
}
InputStreamReader streamReader = new InputStreamReader(new ByteArrayInputStream(tempStream.toByteArray()));
while (!reader.ready());//wait for initialization.
//Now let's get the rest
ByteArrayOutputStream out = new ByteArrayOutputStream();
int counter = 0;
while (streamReader.ready() && counter < (responseLength - contentLength)) {
out.write((char) streamReader.read());
counter++;
}
if (encoding == ContentEncoding.BINARY || contentType == ContentType.BINARY) {
System.out.println("Got a binary response!");
while (streamReader.ready()) {
out.write(streamReader.read());
}
} else {
System.out.println("Got a text response!");
while (streamReader.ready()) {
out.write((char) streamReader.read());
}
}
fullResponse = new ByteArrayInputStream(out.toByteArray());
System.out.println("\n\ncontentLength = " + contentLength +
"; headers.length() = " + headers.length() +
"; responseLength = " + responseLength +
"; fullResponse length = " + out.toByteArray().length + "\n\n");
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
busy = false;
}
}
And here's the ProxyServer class:
class ProxyServer {
public void start() {
while (true) {
Socket serverSocket;
Socket clientSocket;
OutputStreamWriter toClient;
BufferedWriter toServer;
try {
//The client is meant to put data on the port, read the socket.
clientSocket = listeningSocket.accept();
Request request = new Request(clientSocket.getInputStream());
//System.out.println("Accepted a request!\n" + request);
while(request.busy);
//Make a connection to a real proxy.
//Host & Port - should be read from the request
URL url = null;
try {
url = new URL(request.getRequestURL());
} catch (MalformedURLException e){
url = new URL("http:\\"+request.getRequestHost()+request.getRequestURL());
}
System.out.println(request);
//remove entry from cache if needed
if (!request.getCacheControl().equals(CacheControl.CACHE) && cache.containsRequest(request)) {
cache.remove(request);
}
Response response = null;
if (request.getRequestType() == RequestType.GET && request.getCacheControl().equals(CacheControl.CACHE) && cache.containsRequest(request)) {
System.out.println("Reading from the Cache!!!!");
response = cache.get(request);
} else {
System.out.println("Not Reading from the Cache!!!!");
//Get the response from the destination
int remotePort = (url.getPort() == -1) ? 80 : url.getPort();
System.out.println("I am going to try to connect to: " + url.getHost() + " at port " + remotePort);
serverSocket = new Socket(url.getHost(), remotePort);
System.out.println("Connected.");
serverSocket.setSoTimeout(50000);
//write to the server - keep it open.
System.out.println("Writing to the server's buffer...");
toServer = new BufferedWriter(new OutputStreamWriter(serverSocket.getOutputStream()));
toServer.write(request.getFullRequest());
toServer.flush();
System.out.println("flushed.");
System.out.println("Getting a response...");
response = new Response(serverSocket.getInputStream());
//System.out.println("Got a response!\n" + response);
System.out.println("Got a response!\n");
//wait for the response
while(response.isBusy());
}
if (request.getRequestType() == RequestType.GET && request.getCacheControl().equals(CacheControl.CACHE) && response.getResponseCode() == 200) {
System.out.println("Writing to the Cache!!!!");
cache.put(request, response);
}
else System.out.println("Not Writing to the Cache!!!!");
response = filter.filter(response);
// Return the response to the client
toClient = new OutputStreamWriter(clientSocket.getOutputStream());
System.out.println("I am going to write the following response:\n" + response);
BufferedReader responseReader = new BufferedReader(new InputStreamReader(response.getFullResponse()));
while (responseReader.ready()) {
toClient.write(responseReader.read());
}
toClient.flush();
toClient.close();
clientSocket.close();
System.out.println("Finished with request!");
} catch (IOException e) {
e.printStackTrace();
continue;
}
}
}
}
I would appreciate any and all feedback/insight/suggestion regarding how to solve this, and would of course prefer some actual code.
Jersey, a high-level web framework, may save your day. You no longer have to manage gzip content, headers, etc. yourself.
The following code gets the image used in your example and saves it to disk. Then it verifies that the saved image is equal to the downloaded one:
You will need two maven dependencies to run it:
Store it in a byte array:
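A minimal sketch of what storing the response in a byte array could look like. The class and method names (StreamBytes, readFully) are made up for illustration, and the sketch assumes the server closes the connection when the response is complete; with keep-alive connections you would have to stop after Content-Length bytes instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the original code.
final class StreamBytes {
    private StreamBytes() {}

    /** Reads the stream until end-of-stream and returns every byte unchanged. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        // read() returns -1 only at end of stream. ready()/available() merely
        // report whether a read would block, so they cannot signal "done".
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```

Because no Reader or String is involved, bytes above 0x7F (as found in PNG or gzip data) are not run through a character decoder and come out exactly as they went in.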
A more detailed process:
Look for \r\n\r\n in the buffer. You can write a helper function, for example static int arrayIndexOf(byte[] haystack, int offset, int length, byte[] needle)
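One possible implementation of such a helper, as a plain linear scan. The wrapper class name ByteSearch is made up, and the parameter semantics are an assumption: offset and length describe the region of haystack to search, and the result is an index into haystack, or -1 if the needle does not occur entirely inside that region.

```java
// Hypothetical wrapper class; only the method signature comes from the answer.
final class ByteSearch {
    private ByteSearch() {}

    /**
     * Returns the index of the first occurrence of needle that lies completely
     * within haystack[offset .. offset + length - 1], or -1 if there is none.
     */
    static int arrayIndexOf(byte[] haystack, int offset, int length, byte[] needle) {
        int lastStart = offset + length - needle.length; // last index a match can start at
        for (int i = offset; i <= lastStart; i++) {
            int j = 0;
            while (j < needle.length && haystack[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i; // every needle byte matched
            }
        }
        return -1;
    }
}
```

Passing the number of bytes read so far as length keeps the search away from the unused tail of the buffer.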
Edit:
You are not following the steps I suggested. inputReader.ready() is the wrong way to detect the phases of the response. There is no guarantee that the headers will be sent in a single burst.
I tried to write a schematic in code (except for the arrayIndexOf function).
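A rough sketch of how reading in two phases could look, not the code originally posted here. The class name RawResponse and its members are made up for illustration. It assumes the body length is given by Content-Length or by the server closing the connection, and it does not handle chunked transfer encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical class: headers are kept as text, the body as untouched bytes.
final class RawResponse {
    final String headers; // status line and header lines, without the blank line
    final byte[] body;    // body exactly as received (possibly still gzipped)

    private RawResponse(String headers, byte[] body) {
        this.headers = headers;
        this.body = body;
    }

    static RawResponse read(InputStream in) throws IOException {
        byte[] separator = {'\r', '\n', '\r', '\n'};
        byte[] buf = new byte[8192];
        int filled = 0;
        int headerEnd = -1;
        // Phase 1: keep reading until the blank line appears, however many
        // bursts the headers arrive in.
        while (headerEnd < 0) {
            if (filled == buf.length) {
                buf = Arrays.copyOf(buf, buf.length * 2);
            }
            int n = in.read(buf, filled, buf.length - filled);
            if (n == -1) {
                throw new EOFException("stream ended before the end of the headers");
            }
            filled += n;
            headerEnd = indexOf(buf, filled, separator);
        }
        // ISO-8859-1 maps every byte to exactly one char, so decoding cannot fail.
        String headers = new String(buf, 0, headerEnd, StandardCharsets.ISO_8859_1);
        int bodyStart = headerEnd + separator.length;

        // Phase 2: the body. Some of it may already be sitting in buf.
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        body.write(buf, bodyStart, filled - bodyStart);
        int expected = contentLength(headers); // -1 means "read until close"
        byte[] chunk = new byte[8192];
        while (expected < 0 || body.size() < expected) {
            int want = expected < 0 ? chunk.length : Math.min(chunk.length, expected - body.size());
            int n = in.read(chunk, 0, want);
            if (n == -1) {
                break; // connection closed
            }
            body.write(chunk, 0, n);
        }
        return new RawResponse(headers, body.toByteArray());
    }

    private static int contentLength(String headers) {
        for (String line : headers.split("\r\n")) {
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                return Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        return -1;
    }

    private static int indexOf(byte[] data, int length, byte[] needle) {
        for (int i = 0; i + needle.length <= length; i++) {
            int j = 0;
            while (j < needle.length && data[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;
            }
        }
        return -1;
    }
}
```

To forward the response, write the header text, the blank line, and then body straight to the client's OutputStream; wrapping that stream in a Writer would corrupt the binary data again.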
The arrayIndexOf method could look something like this (there are probably faster versions):
I had the same problem. I commented out the line which adds the header accept gzip:
...and it worked!
You basically need to parse the response headers as text, and the rest as binary. It's slightly tricky to do so, as you can't just create an InputStreamReader around the stream - that will read more data than you want. You'll quite possibly need to read data into a byte array and then call Encoding.GetString manually. Alternatively, if you've read data into a byte array already you could always create a ByteArrayInputStream around that, then an InputStreamReader on top... but you'll need to work out how far the headers go before you get to the body of the response, which you should keep as binary data.
After reading the headers with BufferedReader you'll need to detect if the Content-Encoding header is set to gzip. If it is, to read the body you'll have to switch to using the InputStream and wrap it with a GZIPInputStream to decode the body. The tricky part however is the fact that the BufferedReader will have buffered past the headers into the body and the underlying InputStream will be ahead of where you need it.
What you could do is wrap the initial InputStream with a BufferedInputStream and call mark() on it before you begin processing the headers. When you're done processing the headers call reset(). Then read that stream until you hit the empty line between headers and the body. Now wrap it with the GZIPInputStream to process the body.
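A sketch of a simplified variant of this idea. Instead of mark() and reset(), it reads the header lines one byte at a time from the BufferedInputStream, so nothing is consumed past the blank line and no reset is needed; the body is then wrapped in a GZIPInputStream when Content-Encoding is gzip. The names GzipAwareBody, readLine and openBody are made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class, not part of the original code.
final class GzipAwareBody {
    private GzipAwareBody() {}

    /** Reads one line as ISO-8859-1 text without consuming anything after it; null at end of stream. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                break;          // end of line
            }
            if (b != '\r') {
                line.write(b);  // drop the CR of the CRLF pair
            }
        }
        if (b == -1 && line.size() == 0) {
            return null;
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Consumes the headers and returns a stream positioned at the start of the body. */
    static InputStream openBody(InputStream raw) throws IOException {
        // Buffering keeps the single-byte reads in readLine cheap.
        InputStream in = new BufferedInputStream(raw);
        boolean gzip = false;
        String line;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0
                    && line.substring(0, colon).trim().equalsIgnoreCase("Content-Encoding")
                    && line.substring(colon + 1).trim().equalsIgnoreCase("gzip")) {
                gzip = true;
            }
        }
        // The stream now sits right after the blank line.
        return gzip ? new GZIPInputStream(in) : in;
    }
}
```

Decoding is only needed if the proxy has to inspect or filter the body. A proxy that merely forwards the response can pass the gzipped bytes through unchanged; if it forwards the decoded body instead, the Content-Encoding and Content-Length headers would have to be adjusted to match.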