How to avoid routing through local stack in Linux

Posted 2020-06-04 17:25

Question:

I have the following environment: 2 hosts, each with 2 Ethernet interfaces connected to each other (as in the diagram below):

 +---------+               +---------+                     
 |      (1)+---------------+(2)      |    
 |  host1  |               |  host2  |
 |         |               |         |
 |      (3)+---------------+(4)      |
 +---------+               +---------+

I would like to write a client/server socket tool that opens both the client and the server socket on host1. The client should send TCP packets through interface (1) and the server should listen on interface (3), so that the packets travel through host2.

Normally the Linux kernel delivers these packets through the local TCP/IP stack without sending them to host2.

I have tried the SO_BINDTODEVICE option for both server and client. The server does seem to be bound to interface (3) and no longer receives localhost traffic: I have checked that a client on host1 cannot connect, whereas a client on host2 can.

Unfortunately, the client packets are not sent out through interface (1) to interface (2) (even tcpdump on interface (1) does not see them). Of course routing is correct (I can ping (2) from (1), (4) from (1), (4) from (3) and so on).

My question is whether this can be implemented without a custom TCP/IP stack.

Maybe I should change the destination IP address used by the client to one from an outside network (so that the packets are sent via the default gateway, from interface (1) to interface (2)) and then rewrite it back to the original address in POSTROUTING? Could such a solution work?

I am writing my application in C under Debian.

Adding some more details and clarifications:

  1. Of course, the pairs (1)--(2) and (3)--(4) are in different subnets.
  2. What I want to achieve is (1)-->(2)-->(4)-->(3).
  3. host2 is a black box, so I can't install a packet forwarder there (one that would open a listening socket on interface (2) and forward the traffic to (3) through (4)); this is exactly what I want to avoid.

The main problem seems to be local delivery. When I open a socket on host1 and connect to a socket that is listening on another address of the same host, the kernel just uses the local stack to deliver the packets. See the netfilter diagram below:

 --->[1]--->[ROUTE]--->[3]--->[4]--->
             |            ^
             |            |
             |         [ROUTE]
             v            |
            [2]          [5]
             |            ^
             |            |
             v            |

Packets go through [5] NF_IP_LOCAL_OUT and [2] NF_IP_LOCAL_IN, whereas I want to force them to go through [4].

Answer 1:

Untested (should work, but I may have missed something):

Linux has several routing tables. The table named local contains routes that the kernel adds automatically for every IP address assigned to the host. You can see them with ip route show table local. Routes of type local are delivered locally, through the loopback interface. You could delete such a route and add a normal unicast route to replace it:

ip route del table local <ip> dev <NIC>
ip route add table local <ip> dev <NIC>
ip route flush cache

Now your first box will try to send IP datagrams to that IP address as if it were a remote address, i.e. it will use ARP. So your second box will either have to reply to the ARP requests (if it is acting as a router or doing proxy ARP), or you will have to add a static entry to the ARP cache:

arp -s <ip> <MAC>

Then, you will probably have to disable rp_filter on the interfaces:

echo 0 > /proc/sys/net/ipv4/conf/<NIC>/rp_filter
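Since the tool is written in C, the same setting could also be changed from the program itself. The following is a rough sketch only: the function names rp_filter_path and set_rp_filter are made up for illustration, and the write needs root. Note that the kernel uses the higher of the conf/all and conf/<NIC> values, so conf/all/rp_filter may need to be 0 as well.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the procfs path of the per-interface
 * rp_filter setting. Returns 0 on success, -1 on a bad name or if
 * the buffer is too small. */
int rp_filter_path(char *buf, size_t len, const char *ifname)
{
    int n;

    /* Reject empty names and names that would escape the directory. */
    if (ifname[0] == '\0' || strchr(ifname, '/') != NULL)
        return -1;
    n = snprintf(buf, len, "/proc/sys/net/ipv4/conf/%s/rp_filter", ifname);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Hypothetical helper: write value (0, 1 or 2) to the setting.
 * Returns 0 on success, -1 on failure. */
int set_rp_filter(const char *ifname, int value)
{
    char path[128];
    FILE *f;
    int ok;

    if (value < 0 || value > 2 ||
        rp_filter_path(path, sizeof(path), ifname) != 0)
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = fprintf(f, "%d\n", value) > 0;
    /* fclose flushes the buffer, so a rejected write is reported here. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A call such as set_rp_filter("eth0", 0) would then correspond to the echo command above, with "eth0" standing in for the real interface name.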

Then again, if this doesn't work, you could probably set up something with L2 NAT, using ebtables.



Answer 2:

For a very similar task I use the following script:

ip rule add from all lookup local # add one more local table lookup rule with high pref
ip rule del pref 0 # delete default local table lookup rule
ip route add ${ip3} via ${ip2} src ${ip1} table 100 # add correct route to some table
ip rule add from all lookup 100 pref 1000 # add rule to lookup new table before local table


Answer 3:

You can assign different subnets to the (1)-(2) and (3)-(4) pairs, and have host2 forward the packets from (2) to (3). The client on host1 will connect to the address of (2), so the local network stack will not know that the target server is actually running locally too.
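On the client side this could look roughly like the sketch below, which binds the socket to the address of interface (1) before connecting to the address of (2). The helper names make_addr and connect_from are hypothetical, and all addresses and the port are supplied by the caller; none of them come from the original setup.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: fill *sa from a dotted-quad string and a
 * host-order port. Returns 0 on success, -1 if ip is not valid IPv4. */
int make_addr(const char *ip, uint16_t port, struct sockaddr_in *sa)
{
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &sa->sin_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper: open a TCP connection from src_ip (ephemeral
 * port) to dst_ip:dst_port. Returns the socket fd, or -1 on failure. */
int connect_from(const char *src_ip, const char *dst_ip, uint16_t dst_port)
{
    struct sockaddr_in src, dst;
    int fd;

    if (make_addr(src_ip, 0, &src) != 0 ||
        make_addr(dst_ip, dst_port, &dst) != 0)
        return -1;
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    /* Fix the source address first, then connect to the remote side. */
    if (bind(fd, (struct sockaddr *)&src, sizeof(src)) != 0 ||
        connect(fd, (struct sockaddr *)&dst, sizeof(dst)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Here src_ip would be the address of interface (1) and dst_ip the address of interface (2). Whether host2 then passes the traffic on to (3) depends entirely on how host2 is configured.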