bnx2x: Improve reliability in case of nested PCI errors
authorGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Fri, 22 Dec 2017 15:01:39 +0000 (13:01 -0200)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Sat, 3 Mar 2018 09:23:24 +0000 (10:23 +0100)
commitaf60c3822a721f9c2bf511ed1234e59ee4135010
tree645de3f1732dcb111ba673821cc821d412ea0199
parent78cc448e0882abffdc0c50309127d39bc2523146
bnx2x: Improve reliability in case of nested PCI errors

[ Upstream commit f7084059a9cb9e56a186e1677b1dcffd76c2cd24 ]

While in recovery process of PCI error (called EEH on PowerPC arch),
another PCI transaction could be corrupted causing a situation of
nested PCI errors. Also, this scenario could be reproduced with
error injection mechanisms (for debug purposes).

We observe that in case of nested PCI errors, bnx2x might attempt to
initialize its shmem and cause a kernel crash due to bad addresses
read from MCP. Multiple different stack traces were observed depending
on the point the second PCI error happens.

This patch avoids the crashes by:

 * failing PCI recovery in case of nested errors (since multiple
 PCI errors in a row are not expected to lead to a functional
 adapter anyway), and by,

 * preventing access to adapter FW when MCP is failed (we mark it as
 failed when shmem cannot get initialized properly).

Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Shahed Shaikh <Shahed.Shaikh@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c