[talk] LACP debug question

Mark Saad nonesuch at longcount.org
Tue Jul 14 16:26:22 EDT 2020


Pete
   I believe the issue could be that LACP responses are not being
prioritized when they are sent out. Do you know of a way to make the
sfxge driver reserve queue 0 for non-flowid packets? This is what was
done in the Chelsio fix in the thread linked below.

Does anyone have a clue how to do this in the driver?
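
As a rough illustration of the idea only (this is not sfxge or cxgbe
code), the queue selection might look something like the sketch below.
The struct pkt type, its fields, and select_txq() are hypothetical names
made up for the example; a real driver would work on an mbuf and its own
TX queue structures. The assumption is that LACPDUs carry no flow id, so
they would always land on the otherwise idle queue 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packet descriptor for illustration; not the real mbuf. */
struct pkt {
	bool     has_flowid;	/* true if the stack tagged the packet with a flow id */
	uint32_t flowid;	/* only meaningful when has_flowid is true */
};

/*
 * Pick a transmit queue index in the range [0, nqueues).
 *
 * Queue 0 is reserved for packets without a flow id (assumed here to
 * include LACPDUs). Flow-tagged packets are spread over queues
 * 1 .. nqueues-1. With fewer than two queues there is nothing to
 * reserve, so everything goes to queue 0.
 */
static unsigned
select_txq(const struct pkt *p, unsigned nqueues)
{
	if (!p->has_flowid || nqueues < 2)
		return (0);
	return (1 + (p->flowid % (nqueues - 1)));
}
```

With 8 queues, a packet with no flow id maps to queue 0, and a packet
with flow id 8 maps to queue 2 (1 + 8 % 7).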

On Tue, Jul 14, 2020 at 4:02 PM Mark Saad <nonesuch at longcount.org> wrote:
>
> On Mon, Jul 13, 2020 at 7:51 PM Pete Wright <pete at nomadlogic.org> wrote:
> >
> >
> >
> > On 7/11/20 9:32 AM, Mark Saad wrote:
> >
> > All
> >   This is a repost from net. I am looking for someone who can help me better understand this LACP disconnect that I am seeing on 12-stable. The server is a router with Solarflare NICs attached to a pair of Arista 7050s.
> >
> > Can you help me understand what I am looking at here? I enabled the LACP debug until I finally saw the issue I noted before. Due to some log rotation, part of the message is clipped.
> > Here is a part; the full thing is on pastebin: https://pastebin.com/BGtbxcBf
> >
> >
> > ouch this looks like a PITA to debug.  were you able to make any progress?  in the past when i ran into LACP issues on arista (both using linux and bsd) i had to fight issues with bad optics or cables.  i'm guessing the logs from the switch aren't too helpful either?
>
> Switch logs are saying that the server disconnected. At this point I
> know it's not the cables; I use 10G DACs and I have a variety in use:
> Arista, Fiberstore, and Mellanox. To be clear, I am seeing this across
> a bunch of routers. The only commonality here is that it is breaking on
> routers attached to switches running Arista EOS 4.20.12M or later. It
> does not happen on 4.18.5M or earlier. Arista has said there were no
> changes in how LACP works between versions, but I can't put my finger
> on it. Maybe the timing has changed?
>
> > i've used the sfxge cards before and found them to be pretty darn stable...
> >
>
> I have found them to be fairly consistent. The only thing that sounds
> like it could be related was this thread:
>
> https://www.mail-archive.com/freebsd-net@freebsd.org/msg62552.html
>
> > is this happening in any of the LACP modes, or just one of them?
>
> I only use Active / Active
>
> >
> > -p
> >
> > --
> > Pete Wright
> > pete at nomadlogic.org
> > @nomadlogicLA
>
>
>
> --
> mark saad | nonesuch at longcount.org



-- 
mark saad | nonesuch at longcount.org



