I need a servlet filter that will capture all input, then mangle that input, inserting a special token in every form. Imagine that the filter is tied to all requests (E.g. url-pattern=*
). I have the code for capture of content, but it doesn't seem like the RequestWrapper
is robust enough to capture all input. Some input returns zero bytes and then I can't "stream" that content back to the user. For example, we are still using Struts 1.3.10 and any Struts code does not "capture" properly, we get zero byte content. I believe it is because of how Struts handles forwards. If there is a forward involved in the request, I wonder if the capture code below will work. Here is all the code, do you have an approach that will capture any type of content that is meant for streaming to the user.
<filter>
<filter-name>Filter</filter-name>
<filter-class>mybrokenCaptureHtml.TokenFilter</filter-class>
</filter>
<filter-mapping>
<filter-name>Filter</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>
package mybrokenCaptureHtml;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;
public class TokenFilter implements Filter {
@Override
public void destroy() {
}
public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain chain) throws IOException, ServletException {
HttpServletRequest request = (HttpServletRequest) servletRequest;
HttpServletResponse response = (HttpServletResponse) servletResponse;
try {
final MyResponseWrapper responseWrapper = new MyResponseWrapper((HttpServletResponse) response);
chain.doFilter(request, responseWrapper);
// **HERE DEPENDING ON THE SERVLET OR APPLICATION CODE (STRUTS, WICKET), the response returns an empty string //
// Especiall struts, is there something in their forwards that would cause an error?
final byte [] bytes = responseWrapper.toByteArray();
// For some applications that hit this filter
// ZERO BYTE DATA is returned, this is bad, but SOME
// CODE, the data is captured.
final String origHtml = new String(bytes);
final String newHtml = origHtml.replaceAll("(?i)</(\\s)*form(\\s)*>", "<input type=\"hidden\" name=\"zval\" value=\"fromSiteZ123\"/></form>");
response.getOutputStream().write(newHtml.getBytes());
} catch(final Exception e) {
e.printStackTrace();
}
return;
}
@Override
public void init(FilterConfig filterConfig) throws ServletException {
}
static class MyResponseWrapper extends HttpServletResponseWrapper {
private final MyPrintWriter pw = new MyPrintWriter();
public byte [] toByteArray() {
return pw.toByteArray();
}
public MyResponseWrapper(HttpServletResponse response) {
super(response);
}
@Override
public PrintWriter getWriter() {
return pw.getWriter();
}
@Override
public ServletOutputStream getOutputStream() {
return pw.getStream();
}
private static class MyPrintWriter {
private ByteArrayOutputStream baos = new ByteArrayOutputStream();
private PrintWriter pw = new PrintWriter(baos);
private ServletOutputStream sos = new MyServletStream(baos);
public PrintWriter getWriter() {
return pw;
}
public ServletOutputStream getStream() {
return sos;
}
byte[] toByteArray() {
return baos.toByteArray();
}
}
private static class MyServletStream extends ServletOutputStream {
ByteArrayOutputStream baos;
MyServletStream(final ByteArrayOutputStream baos) {
this.baos = baos;
}
@Override
public void write(final int param) throws IOException {
baos.write(param);
}
}
}
}
This is what an example Struts app may look like, for some applications (not Struts), we may capture the content. But for apps like the one below, zero bytes are returned for the HTML content but there should be content.
<%@ page contentType="text/html;charset=UTF-8" language="java" %>
<%@ taglib uri="/WEB-INF/struts-html.tld" prefix="html" %>
<%@ taglib uri="/WEB-INF/struts-bean.tld" prefix="bean" %>
<%@ taglib uri="/WEB-INF/c.tld" prefix="c" %>
<%@ taglib uri="/WEB-INF/struts-logic.tld" prefix="logic" %>
<%@ taglib uri="/WEB-INF/struts-tiles.tld" prefix="tiles" %>
<%@ taglib uri="/WEB-INF/struts-nested.tld" prefix="nested"%>
<html:html>
<head>
<title><bean:message key="myApp.customization.title" /></title>
<LINK rel="stylesheet" type="text/css" href="../theme/styles.css">
</head>
<body>
<html:form styleId="customizemyAppForm" method="post" action="/customizemyApp.do?step=submit">
<html:submit onclick="javascript:finish(this.form);" styleClass="input_small"> <bean:message key="myApp.customization.submit" /> </html:submit>
<input type="button" styleClass="input_small" width="80" style="WIDTH:80px" name="<bean:message key="myApp.customization.cancel" />" value="<bean:message key="myApp.customization.cancel" />" onclick="javascript:cancel();">
</html:form>
</body>
</html:html>
I suspect that the MyResponseWrapper
and MyPrintWriter
are not robust enough to capture all types of content.
Example servlet that would work(a):
response.getOutputStream().write(str.getBytes());
Example servlet that would not work(b):
response.getWriter().println("<html>data</html>");
Example a would get a capture, example b will not.
Here is an improved wrapper class, most of the applications will work but NOW some of the struts applications, only SOME of the response is sent to the browser.
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;
public class ByteArrayResponseWrapper extends HttpServletResponseWrapper {
private PrintWriter output = null;
private ServletOutputStream outStream = null;
private static final String NL = System.getProperty("line.separator");
public ByteArrayResponseWrapper(final HttpServletResponse response) {
super(response);
}
public String getDocument() {
InputStream in = null;
try {
in = this.getInputStream();
if (in != null) {
return getDocument(in);
}
} catch(final Exception ee) {
// ee.print;StackTrace();
} finally {
if (in != null) {
try {
in.close();
} catch (IOException e) {
//e.prin;tStackTrace();
}
}
}
return "";
}
protected String getDocument(final InputStream in) {
final StringBuffer buf = new StringBuffer();
BufferedReader br = null;
try {
String line = "";
br = new BufferedReader(new InputStreamReader(getInputStream(), this.getCharacterEncoding()));
while ((line = br.readLine()) != null) {
buf.append(line).append(NL);
}
} catch(final IOException e) {
//e.print;StackTrace();
} finally {
try {
if (br != null) {
br.close();
}
} catch (IOException ex) {
}
}
return buf.toString();
}
@Override
public PrintWriter getWriter() throws IOException {
if (output == null) {
output = new PrintWriter(new OutputStreamWriter(getOutputStream(), this.getCharacterEncoding()));
}
return output;
}
@Override
public ServletOutputStream getOutputStream() throws IOException {
if (outStream == null) {
outStream = new BufferingServletOutputStream();
}
return outStream;
}
public InputStream getInputStream() throws IOException {
final BufferingServletOutputStream out = (BufferingServletOutputStream) getOutputStream();
return new ByteArrayInputStream(out.getBuffer().toByteArray());
}
/**
* Implementation of ServletOutputStream that handles the in-memory
* buffering of the response content
*/
public static class BufferingServletOutputStream extends ServletOutputStream {
ByteArrayOutputStream out = null;
public BufferingServletOutputStream() {
this.out = new ByteArrayOutputStream();
}
public ByteArrayOutputStream getBuffer() {
return out;
}
public void write(int b) throws IOException {
out.write(b);
}
public void write(byte[] b) throws IOException {
out.write(b);
}
public void write(byte[] b, int off, int len) throws IOException {
out.write(b, off, len);
}
@Override
public void close() throws IOException {
out.close();
super.close();
}
@Override
public void flush() throws IOException {
out.flush();
super.flush();
}
}
}
I found a possible solution, in the getInputStream
method, it looks like if I call 'close' on all of the objects, e.g outStream.flush()
and outStream.close()
and then out.flush()
and out.close()
...it looks like the final bytes get written properly. it isn't intuitive but it looks like it works.
Your initial approach failed because
PrintWriter
wraps the givenByteArrayOutputStream
with aBufferedWriter
which has an internal character buffer of 8192 characters, and you neverflush()
the buffer before getting the bytes from theByteArrayOutputStream
. In other words, when less than ~8KB of data is written to thegetWriter()
of the response, the wrappedByteArrayOutputStream
actually never get filled; namely everything is still in that internal character buffer, waiting to be flushed.A fix would be to perform a
flush()
call beforetoByteArray()
in yourMyPrintWriter
:This way the internal character buffer will be flushed (i.e. it will actually write everything to the wrapped stream). This also totally explains why it works when you write to
getOutputStream()
, this step namely doesn't use thePrintWriter
and nothing gets buffered in some internal buffer.Unrelated to the concrete problem: this approach has some severe problems. It isn't respecting the response character encoding during construction of
PrintWriter
(you should actually wrap theByteArrayOutputStream
in anOutputStreamWriter
instead which can take a character encoding) and relying on the platform default, in other words, any written Unicode characters may end up in Mojibake this way and thus this approach isn't ready for World Domination.Also, this approach makes it possible to call both
getWriter()
andgetOutputStream()
on the same response, while that's considered an illegal state (precisely to avoid this kind of buffering and encoding trouble).Update as per the comment, here's a full rewrite of the response wrapper, showing the right way, hopefully in a more self-explaining way than the code you've so far:
Here's how you're supposed to use it: