Duplicate TCP traffic with a proxy

Posted 2019-01-16 19:11

Question:

I need to duplicate the traffic arriving at one machine (port) and send it to two different machines (ports). The TCP sessions need to be handled properly as well.

In the beginning I used em-proxy, but its overhead seems quite large (over 50% CPU). Then I installed haproxy and managed to redirect the traffic (but not to duplicate it). Its overhead is reasonable (less than 5% CPU).

The problem is that I could not express the following in the haproxy config file:
- listen on a specific address:port, send whatever arrives to two different machine:port pairs, and discard the answers from one of them.
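For illustration, the requirement above can be sketched as a small TCP-level proxy using Python's asyncio streams. This is only a minimal sketch and not how em-proxy or haproxy work internally; the names `forward`, `discard`, `make_handler` and `start_tee_proxy` are made up for this example. Note that the mirror is written to inline, so a slow second machine would throttle the main path; a real tool would probably buffer or drop mirror data instead.

```python
import asyncio


async def forward(reader, writer, mirror=None):
    """Copy bytes from reader to writer until EOF; also copy them to
    mirror (if given) on a best-effort basis."""
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break
            writer.write(data)
            await writer.drain()
            if mirror is not None and not mirror.is_closing():
                try:
                    mirror.write(data)
                    await mirror.drain()
                except OSError:
                    pass  # a broken mirror must not break the main path
    finally:
        # Pass the end-of-stream on so the peers can finish their side.
        for w in (writer, mirror):
            try:
                if w is not None and not w.is_closing() and w.can_write_eof():
                    w.write_eof()
            except OSError:
                pass


async def discard(reader):
    """Read and throw away everything the mirror target answers."""
    try:
        while await reader.read(65536):
            pass
    except OSError:
        pass


def make_handler(primary, shadow):
    """primary and shadow are (host, port) tuples (hypothetical names)."""
    async def handle(client_r, client_w):
        try:
            pr, pw = await asyncio.open_connection(*primary)
        except OSError:
            client_w.close()
            return
        try:
            sr, sw = await asyncio.open_connection(*shadow)
        except OSError:
            sr = sw = None  # keep serving even if the mirror is down
        jobs = [forward(client_r, pw, sw), forward(pr, client_w)]
        if sr is not None:
            jobs.append(discard(sr))
        try:
            await asyncio.gather(*jobs, return_exceptions=True)
        finally:
            for w in (client_w, pw, sw):
                if w is not None:
                    w.close()
    return handle


async def start_tee_proxy(listen, primary, shadow):
    """Start listening on listen=(host, port) and return the asyncio server."""
    return await asyncio.start_server(make_handler(primary, shadow), *listen)

# Example use (assumed addresses):
#   async def main():
#       server = await start_tee_proxy(("0.0.0.0", 8888),
#                                      ("localhost", 9000), ("localhost", 9001))
#       async with server:
#           await server.serve_forever()
#   asyncio.run(main())
```

Each client connection gets its own pair of upstream connections, so the TCP session to each machine stays independent; only the answers from the first one are relayed back.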

The em-proxy code for this is quite simple, but it seems to me that EventMachine generates a lot of overhead.

Before I dig into the haproxy code and try to change it to duplicate traffic, I would like to know: is there something similar out there?

Thanks.

Answer 1:

I have created a proxy just for this purpose.

https://github.com/chrislusf/teeproxy

Usage

./teeProxy -l :8888 -a localhost:9000 -b localhost:9001

tee-proxy is a reverse proxy. For each incoming request, it clones the request into two and forwards them to two servers. The result from server a is returned as usual, but the result from server b is ignored.

tee-proxy handles GET, POST, and other HTTP methods.
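To make the request-cloning idea concrete, here is a rough sketch in Python using only the standard library. It is not teeproxy's actual code; `send`, `send_ignoring_result` and `make_tee_handler` are hypothetical names. As a simplification, only the status code and body from server a are passed back, not its response headers.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Talk to the backends directly, ignoring any proxy environment variables.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

# Headers that are recomputed for the outgoing request instead of copied.
_SKIP = {"host", "content-length", "connection", "accept-encoding"}


def send(base_url, method, path, headers, body):
    """Replay one request against base_url and return (status, body)."""
    req = urllib.request.Request(base_url + path, data=body,
                                 headers=headers, method=method)
    try:
        with _opener.open(req, timeout=10) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()  # 4xx/5xx answers are still answers
    except urllib.error.URLError:
        return 502, b""  # backend unreachable


def send_ignoring_result(*args):
    """Used for server b: whatever happens, nobody looks at the outcome."""
    try:
        send(*args)
    except Exception:
        pass


def make_tee_handler(url_a, url_b):
    class TeeHandler(BaseHTTPRequestHandler):
        def _tee(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length) if length else None
            headers = {k: v for k, v in self.headers.items()
                       if k.lower() not in _SKIP}
            request = (self.command, self.path, headers, body)
            # Clone to b in the background and do not wait for it.
            threading.Thread(target=send_ignoring_result,
                             args=(url_b, *request), daemon=True).start()
            # The answer from a is what the client gets.
            status, payload = send(url_a, *request)
            self.send_response(status)
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        do_GET = do_POST = do_PUT = do_DELETE = do_PATCH = _tee

        def log_message(self, *args):
            pass  # keep the sketch quiet

    return TeeHandler

# Example use, mirroring the command line above (assumed addresses):
#   handler = make_tee_handler("http://localhost:9000", "http://localhost:9001")
#   ThreadingHTTPServer(("", 8888), handler).serve_forever()
```

Because the whole request body is read before it is replayed, a sketch like this only suits ordinary request/response traffic, not streaming uploads.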



Answer 2:

How about the iptables experimental ROUTE target? It has a "tee" option for mirroring traffic:

http://www.netfilter.org/projects/patch-o-matic/pom-external.html#pom-external-ROUTE

Which would let you mirror traffic with something like:

iptables -A PREROUTING -t mangle -p tcp --dport 80 -j ROUTE --gw 1.2.3.4 --tee
iptables -A POSTROUTING -t mangle -p tcp --sport 80 -j ROUTE --gw 1.2.3.4 --tee

The second machine would need to be on the same subnet, and would need to either listen on the target IP address (without replying to ARP requests) or listen promiscuously.



Answer 3:

Try https://github.com/agnoster/duplicator.

I tried teeproxy but got strange results with some non-GET requests.



Answer 4:

I have also written a reverse proxy / load balancer in Node.js for a similar purpose (it is just for fun and not production-ready at the moment).

https://github.com/losnir/ampel

It is very opinionated, and currently supports:

  • GET: uses round-robin selection (1:1).
  • POST: uses request splitting. There is no concept of "master" and "shadow" -- the first backend that responds is the one that serves the client request, and all of the other responses are discarded.
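The two selection strategies listed above could look roughly like this. The sketch is in Python rather than Node.js and is not taken from ampel; `round_robin`, `first_response` and the `send` callback are hypothetical names. Cancelling the slower calls is just one way of discarding their responses.

```python
import asyncio
import itertools


def round_robin(backends):
    """Return a function that hands out backends in rotation (GET case)."""
    cycle = itertools.cycle(backends)
    return lambda: next(cycle)


async def first_response(request, backends, send):
    """Send request to every backend via the coroutine function
    send(backend, request) and return whichever result arrives first
    (POST case)."""
    tasks = [asyncio.ensure_future(send(b, request)) for b in backends]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # the remaining responses are thrown away
    # If the earliest task failed, its exception is raised here.
    return done.pop().result()
```

With a `send` coroutine that performs the actual HTTP call, a GET would go to `pick()` where `pick = round_robin(backends)`, while a POST would go through `await first_response(request, backends, send)`.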

If someone finds it useful then I can improve it to be more flexible.