Proposal: waterfall-by-priority load balancing algorithm #82
Comments
@zehome Did you get a chance to think about this? :) Please let me know if you have any questions.
I'd be really interested to see such a thing, at least to experiment with.
@zehome Any news? If I don’t hear from you, I’ll just implement this and send a pull request if it works well.
Ok, so I've tried those ideas in the past. It's kind of the approach used by multipath mosh, except that instead of focusing on bandwidth, that one focuses on latency. First, in order to do that, you'll need to implement a flow recognition mechanism, keyed on (srcip, srcport, dstip, dstport, proto). Then, for each flow, you'll need to measure bandwidth. You'll also need to handle flow timeouts in order to keep memory usage low rather than growing through the roof. I've already partly implemented the flow recognition system, but it's not finished/polished, because the idea I was working on (per-flow re-ordering) is not working properly. Anyway, we can always try to implement it and see in practice whether it's a gain for your application. BTW, mlvpn will never be as efficient as mptcp, which does everything mlvpn should do.
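For concreteness, flow tracking along those lines might look like the sketch below. All names (`flow_key`, `flow_expire`) are invented for illustration, not taken from the MLVPN source:

```c
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* The classic 5-tuple that identifies a flow. */
struct flow_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

struct flow_entry {
    struct flow_key key;
    uint64_t bytes;          /* bytes observed for this flow */
    time_t   last_seen;      /* updated on every packet */
    struct flow_entry *next;
};

#define FLOW_TIMEOUT 30      /* seconds of inactivity before eviction */

/* Evict idle flows so the table's memory usage stays bounded. */
static void flow_expire(struct flow_entry **head, time_t now)
{
    struct flow_entry **pp = head;
    while (*pp != NULL) {
        if (now - (*pp)->last_seen > FLOW_TIMEOUT) {
            struct flow_entry *dead = *pp;
            *pp = dead->next;
            free(dead);
        } else {
            pp = &(*pp)->next;
        }
    }
}
```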
Sorry, I don’t follow. Why are you saying flow recognition is a prerequisite for waterfall-by-priority? I’ve implemented pull request #88 and tested it with two links: one gigabit ethernet link, one wifi link. With the pull request applied, I get precisely the traffic rates I’m expecting. E.g., I can successfully set ethernet to 1 MB/s and see 5 MB/s go over wifi. Conversely, when setting ethernet to 100 MB/s, no traffic goes over wifi. Am I missing a test case?
I just tried this patch set. I think you are taking the numbers in the bandwidth setting 'literally' (which is fair enough, but until now these numbers have effectively been an indication of relative bandwidth, if I understand correctly. It would be helpful to specify what units you expect the numbers in, for instance :-) )

So, having adjusted the bandwidth numbers somewhat, I ended up getting something that tried to work. However, much as I tried to adjust things, I ended up with a lot of traffic on one link (which was set to have quite low traffic) and little on the other, all tested under a heavy load. Could it be that the bandwidth measure you specify above sends everything through on one link during the first part of the second, and by the time the ACKs come back, we've run out of time in the second and move on, meaning the second link never fires up? I guess using a re-order buffer could help here, but it might have to be very large... I haven't looked at the code, but I wonder if a moving-window approach could be used?

Finally, @zehome: I notice you singing the praises of mptcp, which looks brilliant. The advantage of mlvpn is only that it's a single (and pretty simple) setup, whereas, if I understand right, for an equivalent mptcp version you would need a VPN, a SOCKS proxy, and some luck.

(I should say, in my case, I have a 4.7M and a 1.8M link (don't ask). I get a max throughput of about 4.1M and 1.3M on the two links, so I'm losing about 10% of my throughput, which means the overall benefit is verging on the marginal in my case :-( I'd love to get closer to the full bandwidth :-) )

Cheers
Thanks for giving the pull request a spin!
That’s already done (but not yet merged) in https://github.com/zehome/MLVPN/pull/80/files. The unit is bytes/s.
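For illustration, a per-tunnel section might then look like the snippet below. The `bandwidth_upload` key name is my assumption based on MLVPN's ini-style config; double-check against the docs in #80 for the authoritative spelling:

```ini
[adsl1]
remotehost = peer.example.org
remoteport = 5080
# 8 Mbit/s uplink = 1000000 bytes/s
bandwidth_upload = 1000000
```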
No. What exactly do you mean by “fail”?
You can answer that question better than I can by measuring the bandwidth on your underlying links. I’d recommend moving the discussion about mptcp into a separate issue.
Ok, I missed the bytes/s; it's what I guessed :-)

I'm not sure, it may have been experimental error :-), but when I had the loss tolerance set to e.g. 85%, the link was instantly marked as lossy... at the time, my bandwidth settings would have been very low (as I had them in KB/s), but I don't see why that should have caused links to be marked as lossy...?

For the bandwidth, I know what the throughput is; my issue is not there. What I'm suggesting is: if you are treating things e.g. "each second" (as you had indicated you might; I didn't look at your code), then I wonder if you are flooding the first link during the first part of the second (potentially causing the packet losses), then you get to a point where you have sent as many packets as you want (for the whole second), and then you pass onto the second link. However, because it takes some time to send those packets, we're almost at the end of the second, so we don't actually manage to send much out on the second link before we end the second and start again. (Again, I didn't look at your code, so perhaps it doesn't work this way...)

I thought maybe the re-order buffer might help, but it seems to me you would need a very large re-order buffer, which potentially has other negative impacts? Another approach could be to have more of a 'window/filter' system, to try and keep the 'average' bandwidth correct over time...
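One concrete form of such a 'window/filter' system would be a token bucket refilled continuously rather than once per second, so the budget is spread across the whole second instead of being burned in one initial burst. A rough sketch, with made-up names (not from the MLVPN source):

```c
#include <stddef.h>

struct pacer {
    double rate;     /* configured bandwidth, bytes/s */
    double tokens;   /* current send budget, bytes */
    double burst;    /* cap, e.g. rate / 50 for ~20 ms slices */
    double last;     /* time of the previous refill, seconds */
};

/* Refill proportionally to elapsed time and cap at `burst`,
 * so a link never dumps a whole second's allowance at once. */
static int pacer_can_send(struct pacer *p, double now, size_t pkt_len)
{
    p->tokens += (now - p->last) * p->rate;
    p->last = now;
    if (p->tokens > p->burst)
        p->tokens = p->burst;
    if (p->tokens < (double)pkt_len)
        return 0;    /* over budget: caller tries the next link */
    p->tokens -= (double)pkt_len;
    return 1;
}
```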
I suggest you have a look at the code; it should be a quick read and will give you a good grasp of what's happening (in detail). That said, your description is correct: packets are sent on each link to the extent configured, i.e. if you configure your bandwidth too high, you might run into delays/packet losses. I haven't tested that particular bit. It should be easy to verify whether this is happening: just configure half of your actual bandwidth as the limit? Let me know what you find out. I'll also do some more experimentation once I get around to it.
I did try that; it seems not to give me a very good result. One of the experiments I tried, to make bandwidth more 'accurately' reflect what was set (rather than @zehome's approach with the sort of 'buckets', which ends up with a 'quantum' effect), was to use a random-number approach, i.e. set the probability of using one interface rather than the other. So I'm guessing I'd have ended up with aabbbababaaa... whatever. I believe that's because, again, when I tried to send too many packets at a time, things got congested and nasty.

Potentially a small re-order buffer may help in this instance, but I think if the plan is to send, basically, a second's worth of traffic, then the re-order buffer would have to be very large? So what I'm saying is, I think, you want to only use one interface when traffic is low, but as soon as you have more traffic than that interface can handle, you need to intersperse the two interfaces, rather than 'burst sending'... it's just a guess though :-)

Cheers
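For reference, the random-number experiment described above amounts to something like the following (a sketch; the weights would be the configured bandwidths, and the names are invented):

```c
#include <stdint.h>
#include <stdlib.h>

/* Pick a link with probability proportional to its weight,
 * producing an interleaved pattern such as aabbbab... instead
 * of burst-sending a full second on one link. Assumes at least
 * one weight is nonzero; rand() is good enough for a sketch. */
static int pick_link_random(const uint64_t *weight, int nlinks)
{
    uint64_t total = 0;
    for (int i = 0; i < nlinks; i++)
        total += weight[i];
    uint64_t r = (uint64_t)rand() % total;
    for (int i = 0; i < nlinks; i++) {
        if (r < weight[i])
            return i;
        r -= weight[i];
    }
    return nlinks - 1;   /* unreachable when total > 0 */
}
```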
The very point of this issue/pull request is to do aaaaabbbb, i.e. entirely fill up interface after interface. I’ll do some more research when I get around to it.
Had some time to play around with this. In my experiments, the links had 0 loss, so I’m not sure why you’re seeing something different. Further, when configuring bandwidth limits equal to/below/above the actual link speed (i.e. 10 Mbit/s, 5 Mbit/s, 50 Mbit/s, all on a 10 Mbit/s link), mlvpn behaves as expected. I’ll try to use this setup next week in a situation with many different links, perhaps then I’ll be able to reproduce what you’re seeing.
It could equally be my VERY POOR quality ADSL that is causing problems with the approach...
Hi, I've recently re-visited this (I'm adding a 4g connection with a limited monthly quota to an existing pair of ADSLs).

There is a difference between UDP and TCP. On my home network, I see a fair mix of both, and I want both to work 'well'. For TCP, the SRTT is normally held relatively close to the 'idle' state. It is possible to set a 'threshold' trigger point and use that to add more bandwidth to the link, however I find this approach problematic as the threshold is very sensitive. Hence, outside of mptcp, I conclude (sadly) that using SRTT to manage tunnel selection (as in #84) is unlikely to work well for TCP connections (though it is very good for UDP connections). Leaving it as a 'backup' mechanism if no bandwidths are given would seem a good approach.

Rather, I believe it is better to use 'bandwidth' where possible. Currently MLVPN chooses bandwidth 'statically'; allowing a dynamic approach based on incoming bandwidth seems to work very nicely. To give a reasonably fast response to traffic, if I find that we exceed a given link's bandwidth, I add the full amount from the next path; I do not attempt to 'only' pass what is needed on that path. This will be efficient (and, I believe, close to optimal) for TCP connections, while for UDP connections, if they are bandwidth limited, it will effectively 'overuse' the more expensive path by a small amount. I am personally happy with this compromise. Also note, to enable TCP to use the full bandwidth, the calculation must include something like a 20% over-estimate of the bandwidth needed.

What I am seeing looks pretty impressive in terms of performance across 2 ADSL links. In a few weeks, I should be able to try it with a 4g link included in the mix and see what it does there. I have a patch for this, but it's currently rather experimental. I'll clean it up at some point and post it if people are interested.
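To make the dynamic scheme concrete, here is a sketch of the link-activation step as I read the description above: measured incoming demand, a 20% over-estimate for TCP, and whole links added at a time. All names are invented; the actual patch may differ:

```c
#include <stdint.h>

#define TCP_HEADROOM 1.2   /* ~20% over-estimate so TCP can keep growing */

/* Walk the paths in priority order and activate whole links until
 * the measured incoming demand (bytes/s) fits. When one link
 * overflows, the full capacity of the next is added, rather than
 * only the missing remainder. */
static int links_needed(const uint64_t *capacity, int nlinks,
                        uint64_t measured_demand)
{
    double need = (double)measured_demand * TCP_HEADROOM;
    double have = 0.0;
    for (int i = 0; i < nlinks; i++) {
        have += (double)capacity[i];
        if (have >= need)
            return i + 1;   /* use links 0..i */
    }
    return nlinks;          /* demand exceeds everything we have */
}
```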
Very nice. I've seen similar behavior to yours; it's a very hard thing to achieve good aggregation. I'm currently working on another project which leverages mptcp and its awesome multi-connection balancing algorithms, but with the ease of use of mlvpn. Current solutions using shadowsocks-redir are very hacky; there are better mechanisms which I intend to use (TPROXY).
Keep me posted, I'd love to see that.
Hey,
currently, MLVPN supports a weighted round robin load balancing algorithm. This is fine if the cost of all the links is roughly the same, and all links will be used equally (or proportionally to their configured bandwidth).
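(For contrast, weighted round robin boils down to something like the following sketch; the names are mine, not MLVPN's actual scheduler.)

```c
#include <stdint.h>

/* Classic weighted round robin: per cycle, each link is picked
 * `weight` times, so traffic is spread across all links in
 * proportion to their configured bandwidth, regardless of cost.
 * Assumes at least one weight is nonzero. */
struct wrr_link {
    uint32_t weight;   /* relative share, e.g. configured bandwidth */
    uint32_t credit;   /* sends remaining in the current cycle */
};

static int wrr_next(struct wrr_link *links, int nlinks)
{
    for (;;) {
        for (int i = 0; i < nlinks; i++) {
            if (links[i].credit > 0) {
                links[i].credit--;
                return i;
            }
        }
        for (int i = 0; i < nlinks; i++)   /* cycle exhausted: refill */
            links[i].credit = links[i].weight;
    }
}
```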
However, I’m in a situation where it is desirable to use a minimum number of links first and then gradually use more links as needed. Think about a setup with the following links, ordered from cheapest to most expensive:

- link1: cheap, preferred link
- link2: more expensive link
- link3: very expensive link (e.g. a metered mobile connection)
Whenever the needed bandwidth can be satisfied by using link1, only link1 should be used. If necessary, link2 can be utilized, and if absolutely need be, link3 can be utilized as well.
My proposed algorithm for this situation works as follows: each link is assigned a priority and a bandwidth limit (in bytes/s). For every packet, the scheduler walks the links in priority order and sends on the first link whose configured bandwidth has not yet been exhausted in the current accounting interval; traffic spills over to the next link only once the previous one is full (see the sketch below).
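In pseudocode-ish C, the selection step looks roughly like this (a sketch of the idea, with per-second accounting and invented names; the pull request is the authoritative version):

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

struct wf_link {
    uint64_t bandwidth;   /* configured limit, bytes/s */
    uint64_t sent;        /* bytes already sent this second */
    time_t   second;      /* the second that `sent` refers to */
};

/* Links are ordered by priority (index 0 = preferred). Pick the
 * first link with budget left this second, giving the aaaaabbbb
 * pattern: fill up one interface entirely before the next. */
static int waterfall_pick(struct wf_link *links, int nlinks,
                          size_t pkt_len, time_t now)
{
    for (int i = 0; i < nlinks; i++) {
        if (links[i].second != now) {   /* new second: reset counter */
            links[i].second = now;
            links[i].sent = 0;
        }
        if (links[i].sent + pkt_len <= links[i].bandwidth) {
            links[i].sent += pkt_len;
            return i;
        }
    }
    return nlinks - 1;   /* everything full: overload the last link */
}
```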
I have implemented this as a proof of concept (outside of MLVPN), and it works fine.
I’m willing to put in the work to implement this within MLVPN.
Would you accept such a contribution?