[talk] LACP debug question
nonesuch at longcount.org
Tue Jul 14 16:26:22 EDT 2020
I believe the issue could be that LACP responses are not being
prioritized when they are sent out. Do you know of a way to make the
sfxge driver reserve queue 0 for non-flowid packets? This is what
was done in the Chelsio fix above.
Does anyone have a clue how to do this in the driver?
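To make the idea concrete, here is a rough sketch of what "reserve queue 0 for non-flowid packets" would mean for TX queue selection. This is not code from sfxge or from the Chelsio driver. The function name pick_txq and its parameters are made up for illustration, and it assumes that locally generated LACPDUs reach the driver without a flowid set.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from any real driver.
 * has_flowid: nonzero when the stack attached a flow hash to the packet.
 * flowid:     the flow hash value (ignored when has_flowid is zero).
 * ntxq:       number of TX queues the interface has.
 */
static unsigned int
pick_txq(int has_flowid, uint32_t flowid, unsigned int ntxq)
{
	/* With fewer than two queues there is nothing to reserve. */
	if (ntxq < 2)
		return (0);

	/* Packets without a flowid (assumed to include LACPDUs) use queue 0. */
	if (!has_flowid)
		return (0);

	/* Flow-hashed traffic is spread over queues 1..ntxq-1 only. */
	return (1 + flowid % (ntxq - 1));
}
```

With this split, bulk forwarded traffic never lands on queue 0, so control packets would not sit behind it in the same TX ring.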
On Tue, Jul 14, 2020 at 4:02 PM Mark Saad <nonesuch at longcount.org> wrote:
> On Mon, Jul 13, 2020 at 7:51 PM Pete Wright <pete at nomadlogic.org> wrote:
> > On 7/11/20 9:32 AM, Mark Saad wrote:
> > > All
> > > This is a repost from net. I am looking for someone who can help me better understand this LACP disconnect that I am seeing on 12-stable. The server is a router with Solarflare NICs attached to a pair of Arista 7050s.
> > > Can you help me understand what I am looking at here? I enabled the LACP debug until I finally saw the issue I noted before. Due to some log rotation, part of the message is clipped.
> > > Here is a part; the full thing is on pastebin: https://pastebin.com/BGtbxcBf
> > ouch this looks like a PITA to debug. were you able to make any progress? in the past when i ran into LACP issues on arista (both using linux and bsd) i had to fight issues with bad optics or cables. i'm guessing the logs from the switch aren't too helpful either?
> Switch logs are saying that the server disconnected. At this point I
> know it's not the cables. I use 10G DACs and I have a variety in use:
> Arista, Fiberstore, and Mellanox. To be clear, I am seeing this across
> a bunch of routers. The only commonality here is that it is breaking on
> routers attached to switches running Arista EOS 4.20.12M or later. It
> does not happen on 4.18.5M or earlier. Arista has said there were no
> changes in how LACP works between versions, but I can't put my finger
> on it. Maybe the timing has changed?
> > i've used the sfxge cards before and found them to be pretty darn stable...
> I have found them to be fairly consistent. The only thing that sounds
> like it could be related was this thread
> > is this happening in any of the LACP modes, or just one of them?
> I only use Active / Active
> > -p
> > --
> > Pete Wright
> > pete at nomadlogic.org
> > @nomadlogicLA
> mark saad | nonesuch at longcount.org
mark saad | nonesuch at longcount.org