I need to duplicate traffic arriving at one machine (port) and send it to two different machines (ports). TCP sessions need to be handled as well.
In the beginning I used em-proxy, but its overhead seems quite large (it uses over 50% of CPU).
Then I installed haproxy and managed to redirect traffic (but not to duplicate it). The overhead is reasonable (less than 5%).
The problem is that I could not express the following in the haproxy config file:
- listen on a specific address:port, send whatever arrives to two different machines:ports, and discard the answers from one of them.
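To make the requirement concrete, the behaviour I am after could be sketched roughly as follows in Python with asyncio. This is only an illustration, not em-proxy or haproxy code: the names pump, discard, make_handler and serve are made up for the example, and it assumes both targets are reachable and that the second (shadow) target keeps up, since a slow shadow would stall the primary path here.

```python
import asyncio


async def pump(reader, writers):
    """Copy bytes from reader to every writer, then signal EOF to each."""
    while True:
        data = await reader.read(65536)
        if not data:
            break
        for w in writers:
            w.write(data)
        for w in writers:
            await w.drain()
    for w in writers:
        if w.can_write_eof():
            w.write_eof()


async def discard(reader):
    """Read and throw away everything the shadow target answers."""
    while await reader.read(65536):
        pass


def make_handler(primary, shadow):
    """primary and shadow are (host, port) tuples (placeholders)."""
    async def handle(client_reader, client_writer):
        writers = [client_writer]
        try:
            p_reader, p_writer = await asyncio.open_connection(*primary)
            writers.append(p_writer)
            s_reader, s_writer = await asyncio.open_connection(*shadow)
            writers.append(s_writer)
            await asyncio.gather(
                pump(client_reader, [p_writer, s_writer]),  # client -> both targets
                pump(p_reader, [client_writer]),            # primary -> client
                discard(s_reader),                          # shadow answers dropped
            )
        finally:
            for w in writers:
                w.close()
    return handle


async def serve(listen, primary, shadow):
    """Listen on (host, port) and tee every connection to both targets."""
    server = await asyncio.start_server(make_handler(primary, shadow), *listen)
    async with server:
        await server.serve_forever()
```

Something like asyncio.run(serve(("0.0.0.0", 8888), ("10.0.0.1", 9000), ("10.0.0.2", 9001))) would start it; the addresses are placeholders.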
The em-proxy code for this is quite simple, but it seems to me that EventMachine generates a lot of overhead.
Before I dig into the haproxy code and try to change it (to duplicate traffic), I would like to know: is there something similar out there?
Thanks.
I have created a proxy just for this purpose.
https://github.com/chrislusf/teeproxy
Usage
./teeProxy -l :8888 -a localhost:9000 -b localhost:9001
tee-proxy is a reverse proxy. For each incoming request, it clones the request and forwards the two copies to two servers. The result from server a is returned as usual, but the result from server b is ignored.
tee-proxy handles GET, POST, and other HTTP methods.
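As a rough illustration of the behaviour described above (this is not teeproxy's actual code), a minimal sketch using only the Python standard library might look like the following. The names forward and make_tee_handler are invented for the example, and it simplifies a lot: only the status code and body of server a's response are passed back, response headers are not copied, and urllib follows redirects on its own.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Headers that should be regenerated rather than copied to the backends.
SKIP = {"host", "content-length", "connection", "keep-alive", "transfer-encoding"}


def forward(base_url, method, path, headers, body):
    """Send one copy of the request and return (status, body)."""
    req = urllib.request.Request(base_url + path, data=body,
                                 headers=headers, method=method)
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers are still answers; pass them through.
        return err.code, err.read()


def make_tee_handler(primary_url, shadow_url):
    """Build a handler that answers from primary_url and mirrors to shadow_url."""

    def fire_and_forget(*args):
        try:
            forward(shadow_url, *args)  # result from server b is ignored
        except OSError:
            pass                        # so are its failures

    class TeeHandler(BaseHTTPRequestHandler):
        def _tee(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length) if length else None
            headers = {k: v for k, v in self.headers.items()
                       if k.lower() not in SKIP}
            args = (self.command, self.path, headers, body)
            # The shadow copy goes out on its own thread so it cannot delay server a.
            threading.Thread(target=fire_and_forget, args=args, daemon=True).start()
            try:
                status, payload = forward(primary_url, *args)
            except OSError:
                status, payload = 502, b"primary backend unreachable\n"
            self.send_response(status)
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        do_GET = do_POST = do_PUT = do_DELETE = _tee

        def log_message(self, *args):
            pass  # keep the sketch quiet

    return TeeHandler
```

To mirror the usage line above, one could run ThreadingHTTPServer(("", 8888), make_tee_handler("http://localhost:9000", "http://localhost:9001")).serve_forever().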
How about the iptables experimental ROUTE target? It has a "tee" option for mirroring traffic:
http://www.netfilter.org/projects/patch-o-matic/pom-external.html#pom-external-ROUTE
Which would let you mirror traffic with something like:
iptables -A PREROUTING -t mangle -p tcp --dport 80 -j ROUTE --gw 1.2.3.4 --tee
iptables -A POSTROUTING -t mangle -p tcp --sport 80 -j ROUTE --gw 1.2.3.4 --tee
The second machine would need to be on the same subnet, and would need either to listen on the target IP address (without replying to ARP requests) or to listen promiscuously.
Try https://github.com/agnoster/duplicator.
I tried teeproxy but got strange results with some requests other than GETs.
I have also written a reverse proxy / load balancer for a similar purpose with Node.js (it is just for fun, not production ready at the moment).
https://github.com/losnir/ampel
It is very opinionated, and currently supports:
- GET: using round-robin selection (1:1)
- POST: using request splitting. There is no concept of "master" and "shadow": the first backend that responds is the one that serves the client request, and all of the other responses are discarded.
If someone finds it useful then I can improve it to be more flexible.
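The two selection strategies can be sketched as below. This is in Python rather than Node.js and is not ampel's code; round_robin, first_response and the send callable are hypothetical names for the illustration. One assumption goes beyond the description above: a backend that raises an error is skipped, so the first successful response wins.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor, as_completed


def round_robin(backends):
    """GET strategy: hand out backends one after another, forever."""
    return itertools.cycle(backends)


def first_response(backends, send):
    """POST strategy: send to every backend, keep whichever answers first.

    send is a caller-supplied function (hypothetical) that takes one
    backend and returns its response, raising on failure.
    """
    pool = ThreadPoolExecutor(max_workers=len(backends))
    futures = [pool.submit(send, b) for b in backends]
    try:
        errors = []
        for fut in as_completed(futures):
            try:
                return fut.result()   # first successful answer wins
            except Exception as exc:  # assumption: failed backends are skipped
                errors.append(exc)
        raise RuntimeError("all backends failed: %r" % errors)
    finally:
        # Do not wait for the slower backends; their responses are discarded.
        pool.shutdown(wait=False)
```

With this shape there is no fixed "master": whichever backend is fastest for a given request serves it, so consecutive POSTs may be answered by different backends.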