Some 8086:10c9 NICs have a problem completing the ethtool loopback test.
The result looks like this:
ethtool -t eth1
The test result is FAIL
The test extra info:
Register test (offline) 0
Eeprom test (offline) 0
Interrupt test (offline) 0
Loopback test (offline) 13
Link test (on/offline) 0
A bisect clearly points to commit a95a07445e.
However that seems to only trigger the bug. While adding some printk the
problem disappeared, so this might be a timing issue. After some trial and
error I discovered that adding a small delay just before igb_write_phy_reg()
in igb_integrated_phy_loopback() allows the loopback test to succeed.
I was unable to figure out the root cause so far but I expect it to be
somewhere in the following executing path
igb_integrated_phy_loopback
->igb_write_phy_reg_igp
->igb_write_phy_reg_mdic
->igb_acquire_phy_82575
->igb_acquire_swfw_sync_82575
The problem could only be observed on 8086:10c9 NICs so far and not all
of them show the behaviour. I did not restrict the workaround to this
type of NIC as it should do no harm to other igb NICs.
With the patch below the loopback test succeeded 500 times in a row
using a NIC that would otherwise fail.
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
A bus trace shows that while executing e1000e_down, TCTL is cleared except
for the PSP bit. This occurs while in the middle of fetching a TSO packet
since the Tx packet buffer is full at that point. Before the device is
reset, the e1000_watchdog_task starts to run from the middle (it was
apparently pre-empted earlier, although that is not in the trace) and sets
TCTL.EN. At that point, 82571 transmits the corrupted packet, apparently
because TCTL.MULR was cleared in the middle of fetching a packet, which is
forbidden.
Driver should just clear TCTL.EN in e1000_reset_hw_82571 instead of
clearing the entire register, so as not to change any settings in the
middle of fetching a packet.
Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Found that commit d478eb44 was a bad commit.
If the link partner is transmitting codeword (even if NULL codeword),
then the RXCW.C bit will be set so check for RXCW.CW is unnecessary.
Ref: RH BZ 840642
Reported-by: Fabio Futigami <ffutigam@redhat.com>
Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
CC: Marcelo Ricardo Leitner <mleitner@redhat.com>
CC: stable <stable@vger.kernel.org> [2.6.38+]
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Even when they go beyond 80 characters, user visible strings should be
on one line to make them easy to grep for.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
In the original code
...
if ((adapter->hw.mac.type == e1000_i210)
|| (adapter->hw.mac.type == e1000_i210)) {
...
the second check of 'adapter->hw.mac.type' is pointless since it tests
for the exact same value as the first.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Move nvm invalid size check to before size assigned by mac_type for
82575 and later parts in get_invariants function. This fixes a problem
found on some 82576 devices where the part will not initialize because
the nvm_read function pointer ends up getting assigned to the incorrect
function.
Reported By: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
The skb->pfmemalloc flag gets set to true iff during the slab allocation
of data in __alloc_skb that the the PFMEMALLOC reserves were used. If
page splitting is used, it is possible that pages will be allocated from
the PFMEMALLOC reserve without propagating this information to the skb.
This patch propagates page->pfmemalloc from pages allocated for fragments
to the skb.
It works by reintroducing and expanding the skb_alloc_page() API to take
an skb. If the page was allocated from pfmemalloc reserves, it is
automatically copied. If the driver allocates the page before the skb, it
should call skb_propagate_pfmemalloc() after the skb is allocated to
ensure the flag is copied properly.
Failure to do so is not critical. The resulting driver may perform slower
if it is used for swap-over-NBD or swap-over-NFS but it should not result
in failure.
[davem@davemloft.net: API rename and consistency]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch resolves a "BUG: unable to handle kernel paging request at ..."
oops while dumping packet data. The issue occurs with IOMMU enabled due to
the address provided by phys_to_virt().
This patch makes use of skb->data on Tx and the virtual address of the pages
allocated for Rx.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver now offers software transmit time stamping, so it should
advertise that fact via ethtool. Compile tested only.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: e1000-devel@lists.sourceforge.net
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver now offers software transmit time stamping, so it should
advertise that fact via ethtool. Compile tested only.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: e1000-devel@lists.sourceforge.net
Signed-off-by: David S. Miller <davem@davemloft.net>
There was a previous patch to resolve issue with 82576 losing PHY setting
after PHY power down. However that previous implementation triggered speed
mismatch and occasional link lost. Now, this patch resolves both initial
PHY setting and speed mismatch issues.
Signed-off-by: Akeem G. Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that we can use 1TC DCB in the case of MSI and
legacy interrupts. The advantage to this is that it allows us to fully
support FCoE w/ DCB instead of having to drop to link flow control only
when using these interrupt modes.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for a new 82599 device that supports WoL.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
With DCB and FCoE configured extra queues may be allocated and
never used. After this patch we calculate the max correctly.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Do RAR entry accounting correctly so that errors are reported and
promisc mode is set correctly when the number of entries exceeds
the hardware limits.
This can happen with many macvlan devices attached to the PF or
by adding many fdb entries in SR-IOV modes.
Also this includes a small refactor to fdb_add() to avoid having so
many nested if/else statements after adding a check for the number
or RAR entries.
The max entries for the PF is currently 16 we allow 15 additional
entries to account for the defined MAC.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so the function ixgbe_dcb_get_tc_from_up will use the
num_tcs.pg_tcs to determine the starting value for determining a traffic
class based on a user priority. The main motivation for this change is to
address possible bad configurations in which more TCs worth of data are
populated then there are actual TCs. By limiting this value we can at
least make certain we are not providing a map with values that are out of
range.
As a result any user priorities that are setup in the configuration with a
traffic class mapping higher than what the hardware supports will be
reported as being on TC 0.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The recent changes to netdev_alloc_skb actually make it so that the size of
the buffer now actually has a more direct input on the truesize. So in
order to make best use of the piece of a page we are allocated I am
reducing the IXGBE_RX_HDR_SIZE to 256 so that our truesize will be reduced
by 256 bytes as well.
This should result in performance improvements since the number of uses per
page should increase from 4 to 6 in the case of a 4K page. In addition we
should see socket performance improvements due to the truesize dropping
to less than 1K for buffers less than 256 bytes.
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Make the function static to cleanup namespace.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <Sibai.li@intel.com
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that we can use the atr_sample_rate to determine if
we are capable of supporting ATR. The advantage to this approach is that it
allows us to now determine the setting of the IXGBE_FLAG_FDIR_HASH_CAPABLE
based on the queueing scheme, instead of the queueing scheme being based on
the flag.
Using this approach there are essentially 5 conditions that must be checked
prior to trying to enable ATR:
1. Is SR-IOV disabled?
2. Are the number of TCs <= 1?
3. Is RSS queueing limit greater than 1?
4. Is atr_sample_rate set?
5. Is Flow Director perfect filtering disabled?
If any of these conditions are enabled they should disable ATR filtering.
Note that in the case of conditions 1 through 4 being met we will set
things up for ATR queueing, however if test 5 fails we will still leave the
queues allocated for use by perfect filters. The reason for this is to
allow for us to switch back and forth between ntuple and ATR without
needing to reallocate the descriptor rings.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change adds support for handling IO errors and slot resets.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change adds a spinlock around the mailbox accesses to prevent
simultaneous access to the mailboxes.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch does two things. First it drops the unnecessary work of
searching for enabled VFs when we first bring up the adapter and instead
just uses pci_num_vf to determine how many VFs are enabled on the adapter.
The second thing it does is drop the use of vfdev from the vf_data_storage
structure. Instead we just search the entire system for a VF that has us
as it's PF, and then if that VF is assigned we indicate that the VFs are
assigned. This allows us to still check for assigned VFs even if the
vfinfo allocation has failed, or vfinfo has been freed.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is meant to fix a bug in which we were not checking for pre-existing
VFs if we were not setting the max_vfs value at driver load. What happens
now is that we always call ixgbe_enable_sriov and this checks for
pre-existing VFs ore requested VFs prior to deciding on no SR-IOV.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jerr Kirsher says:
====================
This series contains updates to ixgbe.
...
Alexander Duyck (9):
ixgbe: Use VMDq offset to indicate the default pool
ixgbe: Fix memory leak when SR-IOV VFs are direct assigned
ixgbe: Drop references to deprecated pci_ DMA api and instead use
dma_ API
ixgbe: Cleanup configuration of FCoE registers
ixgbe: Merge all FCoE percpu values into a single structure
ixgbe: Make FCoE allocation and configuration closer to how rings
work
ixgbe: Correctly set SAN MAC RAR pool to default pool of PF
ixgbe: Only enable anti-spoof on VF pools
ixgbe: Enable FCoE FSO and CRC offloads based on CAPABLE instead of
ENABLED flag
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Use PCI_VENDOR_ID_INTEL from pci_ids.h instead of creating its own
vendor ID #define.
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Bruce Allan <bruce.w.allan@intel.com>
Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Cc: Greg Rose <gregory.v.rose@intel.com>
Cc: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Cc: Alex Duyck <alexander.h.duyck@intel.com>
Cc: John Ronciak <john.ronciak@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use PCI_VENDOR_ID_INTEL from pci_ids.h instead of creating its own
vendor ID #define.
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Bruce Allan <bruce.w.allan@intel.com>
Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Cc: Greg Rose <gregory.v.rose@intel.com>
Cc: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Cc: Alex Duyck <alexander.h.duyck@intel.com>
Cc: John Ronciak <john.ronciak@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of only setting the FCOE segmentation offload and CRC offload flags
if we enable FCoE, we could just set them always since there are no
modifications needed to the hardware or adapter FCoE structure in order to
use these features.
The advantage to this is that if FCoE enablement fails, for example because
SR-IOV was enabled on 82599, we will still have use of the FCoE
segmentation offload and Tx/Rx CRC offloads which should still help to
improve the FCoE performance.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The current logic is enabling anti-spoof on all pools and then clearing
anti-spoof on just the first PF pool. The correct approach is to only set
anti-spoof on the VF pools and to leave all of the PF pools unchecked.
This allows for items such as FCoE to use adjacent pools within the PF for
transmit and receive queues without the traffic being blocked by this
security feature.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change corrects an issue in which an FCoE enabled adapter was always
setting the FCoE SAN MAC MPSAR register to 0x1. This results in the first
VF being assigned the SAN MAC address in the case of SR-IOV and as such is
incorrect. To resolve this I am adding a new function that will update the
SAN MAC pool address after reset.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch changes the behavior of the FCoE configuration so that it is
much closer to how the main body of the ixgbe driver works for ring
allocation.
The first piece is the ixgbe_fcoe_ddp_enable/disable calls. These allocate
the percpu values and if successful set the fcoe_ddp_xid value indicating
that we can support DDP.
The next piece is the ixgbe_setup/free_ddp_resources calls. These are
called on open/close and will allocate and free the DMA pools.
Finally ixgbe_configure_fcoe is now just register configuration. It can go
through and enable the registers for the FCoE redirection offload, and FIP
configuration without any interference from the DDP pool allocation.
The net result of all this is two fold. First it adds a certain amount of
exception handling. So for example if ixgbe_setup_fcoe_resources fails we
will actually generate an error in open and refuse to bring up the
interface.
Secondly it provides a much more graceful failure case than the previous
model which would skip setting up the registers for FCoE on failure to
allocate DDP resources leaving no Rx functionality enabled instead of just
disabling DDP.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change merges the 2 statistics values for noddp and noddp_ext_buff
and the dma_pool into a single structure that can be allocated per CPU.
The advantages to this are several fold. First we only need to do one
alloc_percpu call now instead of 3, so that means less overhead for
handling memory allocation failures. Secondly in the case of
ixgbe_fcoe_ddp_setup we only need to call get_cpu once which makes things a
bit cleaner since we can drop a put_cpu() from the exception path.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so we always use the FCoE redirection table. We just
set all 8 entries to the same value in the case of only having one queue
for FCoE.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The networking side of the code had already been updated to use dma_ calls
instead of the old pci_ calls. However it looks like the FCoE code was
never updated. This change goes through and moves everything from the pci
APIs to the dma APIs.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The VF driver had a memory leak that would occur if VFs were assigned to a
guest. The amount of leak would vary with the number of VFs but could max
out at about 14K per PF. To reproduce the leak all you would need to do is
enable all the VFs on the first PF. Then start a loop of loading and
unloading the driver with max_vfs=63 for the first port.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that we can use the VMDq ring feature offset value
to determine the default pool instead of using num_vfs. The reason for
this change is to avoid issues should we fail to allocate vfinfo but have
pre-existing VFs. What should happen in this case is that num_vfs will go
to 0, but the VMDq offset will contain the location of the first PF pool.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <Sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
commit 9ac32e1b firmware: convert e100 driver to request_firmware()
did a straight conversion of the in-driver ucode to external
files. This introduced the possibility of the driver failing
to enable an interface due to missing ucode. There was no
evaluation of the importance of the ucode at the time.
Based on comments in earlier versions of this driver, and in
the source code for the FreeBSD fxp driver, we can assume that
the ucode implements the "CPU Cycle Saver" feature on supported
adapters. Although generally wanted, this is an optional
feature. The ucode source is not available, preventing it from
being included in free distributions. This creates unnecessary
problems for the end users. Doing a network install based on a
free distribution installer requires the user to download and
insert the ucode into the installer.
Making the ucode optional when possible improves the user
experience and driver usability.
The ucode for some adapters include a bugfix, making it
essential. We continue to fail for these adapters unless the
ucode is available.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change is just meant to defragment the flags as there are several hole
that have been introduced since several features, or the flags for them,
have been removed.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
All of our hardware supports RSS even if it is only for a single queue. So
instead of toting around the RSS enable flag I am updating the code so that
all devices are enabled and if we want to disable RSS it is indicated via
the RSS mask.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change essentially makes it so that we can enable almost all of the
features all at once. This patch allows for the combination of SR-IOV,
DCB, and FCoE in the case of the x540. It also beefs up the SR-IOV by
adding support for RSS to the PF.
The testing matrix gets to be very complex for this patch as there are a
number of different features and subsets for queueing options. I tried to
narrow these down a bit by restricting the PF to only supporting 4TC DCB
when it is enabled in addition to SR-IOV.
Cc: Greg Rose <gregory.v.rose@intel.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change allows all pools from the default pool forward to be enabled vi
ixgbe_configure_virtualization. This is needed as we are planning to use
queues belonging to adjacent pools for FCoE when SR-IOV and FCoE are both
enabled.
In addition this patch contains some minor formatting changes as there were
a few spots that seemed to be in need of some cleanup.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In ixgbevf_get_ringparam we could run into a NULL pointer dereference
if the rings were not allocated when we attempted the call. To prevent
that we can just access the tx/rx_ring_count values instead of attempting
to access the rings to get the count.
This change corrects a memory leak and memory corruption in
ixgbevf_set_ringparam.
The memory leak was due to us not freeing the resources from the ring
before overwriting them. This change corrects the memory leak by making
certain to call ixgbe_free_tx/rx_resources on the rings prior to freeing
them.
The memory corruption was because we were replacing the rings but not
updating the q_vectors. It addresses the memory corruption by leaving the
rings in place and instead just copying the contents of the new rings into
the existing rings.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is a good bit of redundancy between the Tx checksum and segmentation
offloads. In order to reduce some of this I am moving the code for
creating a context descriptor into a separate function.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change adds the netdev to the ring structure. This allows for a
quicker transition from ring to netdev without having to go from ring to
adapter to netdev.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver is going back one step from its' previous location before
bumping tail. This is incorrect. We should just be writing the value of
next_to_use into the tail register.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We have had an issue when using ixgbe+ixgbevf and 802.1 VLAN tagging.
When attaching a VLAN to a VF, frames with a 802.1q priority appeared
untagged on the VF hence not reaching the VLAN, where frames with
priority 0 where tagged as expected and seen by the VLAN device.
This seems due to the way ixgbevf is looking up the full tag
(prio+cfi+vlan) against the adapter active_vlans, as a condition to mark
the skb tagged.
Signed-off-by: Pascal Bouchareine <pascal@gandi.net>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change updates the descriptor macros to accept pointers, updates the
name to drop the _ADV suffix, and include the IXGBEVF name in the macro.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is meant to make the code much more readable for MTQC and MRQC
configuration.
The big change is that I simplified much of the logic so that we are
essentially handling just 4 cases and their variants. In the cases where
RSS is disabled we are actually just programming the RETA table with all
1s resulting in a single queue RSS. In the case of SR-IOV I am treating
that as a subset of VMDq. This all results int he following configuration
for the hardware:
DCB
En Dis
VMDq En VMDQ/DCB VMDq/RSS
Dis DCB/RSS RSS
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change cleans up some of the logic in an attempt to try and simplify
things for how we are configuring DCB w/ RSS.
In this patch I basically did 3 things. I updated the logic for getting
the first register index. I applied the fact that all TCs get the same
number of queues to simplify the looping logic in caching the DCB ring
register. Finally I updated how we configure the RQTC register to match
the fact that all TCs are assigned the same number of queues.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It makes much more sense for us to configure the real number of Tx and Rx
queues in the ixgbe_open call than it does in ixgbe_set_num_queues. By
setting the number in ixgbe_open we can avoid a number of unecessary
updates and only have to make the calls once.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Previously we were exiting without cleaning up the memory internally on the
ixgbe_setup_rx_resources and ixgbe_setup_tx_resources calls. Instead of
forcing the caller to clean things up for us we should instead just unwind
the rings and free the memory as we go. This way we can more gracefully
clean up the rings in the event of an allocation failure.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the link status changes on the PF we need to notify the VFs. In order
to do this we should ping all of the VFs in order to trigger a link status
change on them as well.
This fixes issues in which the PF would reset, but the VF didn't because the
NAK flag was not set in the VF mailbox.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The interrupt registers accessed in ixgbevf are more similar to the igb
style registers than they are to the ixgbe style registers. As such we
would be better off setting up the code for the EICS, EIMS, EICS, EIAM, and
EIAC like we do in igb instead of ixgbe.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the VF driver is processing all of the transmits in interrupt
context. This can be messy since the Rx is all handled in NAPI and this
may result in interrupts being disabled. In order to resolve this move all
of the Tx packet processing into NAPI and combine all of the interrupt and
polling routines into just a pair of functions.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
For most cases the ixgbevf driver will only ever contain a single Tx and
single Rx queue. In order to track that it makes more sense to use a
pointer instead of using a bitmap which must be search in order to locate
the ring on an adapter index. As such I am changing the code to use
pointers and an iterator to access all rings on a given q_vector.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch addresses a kernel panic seen when setting up the interface.
Specifically we see a NULL pointer dereference on the Tx descriptor cleanup
path when enabling interrupts. This change corrects that so it cannot
occur.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change cleans up the accounting needed at the start of xmit_frame so
that we can avoid doing too much work to determine how many descriptors we
will need.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch drops the use of eitr_low and eitr_high as values being stored
in the adapter structure. Since the values have no external way to be
changed they might as well just be hard coded values and save us the space
on the adapter structure.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The IXGBE_FLAG_RX_CSUM_ENABLED flag is redundant since NETIF_F_RXCSUM is
keeping the value we want to already have. As such we can drop the
redundant flag and just make use of NETIF_F_RXCSUM.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is no need to keep a separate netdev_registered value since that is
already stored in the netdev itself.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is a large amount of code present in this driver to support features
that either do no exist or are not supported such ask packet split, DCA, or
RSC. This patch strips out almost all of that code and in the case of
conditionals based on unused flags I am flatting the code out to just the
path that would have been selected.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher says:
====================
This series contains fixes to e1000e.
...
Bruce Allan (1):
e1000e: fix test for PHY being accessible on 82577/8/9 and I217
Tushar Dave (1):
e1000e: Correct link check logic for 82571 serdes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jett Kirsher says:
====================
This series contains updates to e1000e and ixgbe.
...
Alexander Duyck (5):
ixgbe: Simplify logic for getting traffic class from user priority
ixgbe: Cleanup unpacking code for DCB
ixgbe: Populate the prio_tc_map in ixgbe_setup_tc
ixgbe: Add function for obtaining FCoE TC based on FCoE user priority
ixgbe: Merge FCoE set_num and cache_ring calls into RSS/DCB config
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit 4197aa7bb8 implements 64 bit
per ring statistics. But the driver resets the 'total_bytes' and
'total_packets' from RX and TX rings in the RX and TX interrupt
handlers to zero. This results in statistics being lost and user space
reporting RX and TX statistics as zero. This patch addresses the
issue by preventing the resetting of RX and TX ring statistics to
zero.
Signed-off-by: Narendra K <narendra_k@dell.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the existing uses of random_ether_addr to
the new eth_random_addr.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change merges the ixgbe_cache_ring_fcoe and ixgbe_set_fcoe_queues
logic into the DCB and RSS initialization calls.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In upcoming patches it will become increasingly common to need to determine
the FCoE traffic class in order to determine the correct queues for FCoE.
In order to make this easier I am adding a function for obtaining the FCoE
traffic class based on the user priority.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There were cases where the prio_tc_map was not populated when we were
calling open. This will result in us incorrectly configuring the traffic
classes when DCB is enabled. In order to correct this I have updated the
code so that we now populate the values prior to allocating the q_vectors
and calling ixgbe_open.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is meant to be a generic clean-up of the remaining functions for
unpacking data from the DCB structures. The only real changes are:
replaced the variable i with tc for functions that were looping through the
traffic classes, and added a pointer for tc_class instead of path since
that way we only need to pull the pointer once instead of once per loop.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch is meant to help simplify the logic for getting traffic classes
from user priorities. To do this I am adding a function named
ixgbe_dcb_get_tc_from_up that will go through the traffic classes in
reverse order in order to determine which traffic class contains a bit for
a given user priority.
Adding a declaration for this new function to the header so that
we have a centralized means for sorting out traffic classes belonging to
features such as FCoE.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When configuring interrupt throttling on 82574 in MSI-X mode, we need to
be programming the EITR registers instead of the ITR register.
-rc2: Renamed e1000_write_itr() to e1000e_write_itr(), fixed whitespace
issues, and removed unnecessary !! operation.
-rc3: Reduced the scope of the loop variable in e1000e_write_itr().
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cleanup code to make it more clean and readable.
Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Occasionally, the PHY can be initially inaccessible when the first read of
a PHY register, e.g. PHY_ID1, happens (signified by the returned value
0xFFFF) but subsequent accesses of the PHY work as expected. Add a retry
counter similar to how it is done in the generic e1000_get_phy_id().
Also, when the PHY is completely inaccessible (i.e. when subsequent reads
of the PHY_IDx registers returns all F's) and the MDIO access mode must be
set to slow before attempting to read the PHY ID again, the functions that
do these latter two actions expect the SW/FW/HW semaphore is not already
set so the semaphore must be released before and re-acquired after calling
them otherwise there is an unnecessarily inordinate amount of delay during
device initialization.
Reported-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
SYNCH bit and IV bit of RXCW register are sticky. Before examining these bits,
RXCW should be read twice to filter out one-time false events and have correct
values for these bits. Incorrect values of these bits in link check logic can
cause weird link stability issues if auto-negotiation fails.
CC: stable <stable@vger.kernel.org> [2.6.38+]
Reported-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There are really only 3 modes that can control the number of queues. Those
are RSS, DCB, and VMDq/SR-IOV. Currently we have things much more broken
up than they need to be for how we are configuring the rings. In order to
try and straiten some of this out I am going to start merging similar
functionality into single functions. To start with I am merging the Flow
Director ring configuration into the RSS ring configuration since Flow
Director cannot function with DCB or SR-IOV.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch replaces a switch statement for an 82598 workaround with an if
statement that only applies to 82598. In addition I am pulling out several
dead pieces of code and instead of reading the SRRCTL register and then
modifying it we are just writing a value which we generate from scratch.
Finally I am also removing any drop enable related code since that was
moved to a function of its own.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The mask value for ring features was overloaded for FCoE which can lead to
some confusion. In order to avoid any confusion I am splitting the mask
value and adding an offset value. This can be used for the start of the
FCoE rings, and in the future I hope to use it to store the start of the
registers for SR-IOV.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We are currently using indices to indicate the upper limit on a ring
feature. However since we can switch back and forth on features such as
DCB and that has effects on other features such as RSS it is preferable to
instead store the upper limit separate from the current value for the
number of rings related to the feature.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It makes much more sense for us to count q_vectors instead of MSI-X
vectors. We were using num_msix_vectors to find the number of q_vectors in
multiple places. This was wasteful since we only had one place that
actually needs the number of MSI-X vectors and that is in slow path.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Conflicts:
net/batman-adv/bridge_loop_avoidance.c
net/batman-adv/bridge_loop_avoidance.h
net/batman-adv/soft-interface.c
net/mac80211/mlme.c
With merge help from Antonio Quartulli (batman-adv) and
Stephen Rothwell (drivers/net/usb/qmi_wwan.c).
The net/mac80211/mlme.c conflict seemed easy enough, accounting for a
conversion to some new tracing macros.
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert doxygen (or similar) formatted comments to kernel-doc or
unformatted comment. Delete a few that are content-free.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix incorrect start markers, wrapped summary lines, missing section
breaks, incorrect separators, and some name mismatches. Delete
a few that are content-free.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DCB and SR-IOV cannot currently be enabled at the same time as the queueing
schemes are incompatible. If they are both enabled it will result in Tx
hangs since only the first Tx queue will be able to transmit any traffic.
This simple fix for this is to block us from enabling TCs in ixgbe_setup_tc
if SR-IOV is enabled. This change will be reverted once we can support
SR-IOV and DCB coexistence.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently only used when packet split mode is enabled with jumbo frames,
IP payload checksum (for fragmented UDP packets) is mutually exclusive with
receive hashing offload since the hardware uses the same space in the
receive descriptor for the hardware-provided packet checksum and the RSS
hash, respectively. Users currently must disable jumbos when receive
hashing offload is enabled, or vice versa, because of this incompatibility.
Since testing has shown that IP payload checksum does not provide any real
benefit, just remove it so that there is no longer a choice between jumbos
or receive hashing offload but not both as done in other Intel GbE drivers
(e.g. e1000, igb).
Also, add a missing check for IP checksum error reported by the hardware;
let the stack verify the checksum when this happens.
CC: stable <stable@vger.kernel.org> [3.4]
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using ethtool -C ethX rx-usecs 0 crashes with a divide by zero.
Refactor this function to fix this issue and make it more clear
what the intent of each conditional is. Add comment regarding
using a setting of zero.
CC: stable <stable@vger.kernel.org> [3.3+]
CC: David Ahern <daahern@cisco.com>
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/caif/caif_hsi.c
drivers/net/usb/qmi_wwan.c
The qmi_wwan merge was trivial.
The caif_hsi.c, on the other hand, was not. It's a conflict between
1c385f1fdf ("caif-hsi: Replace platform
device with ops structure.") in the net-next tree and commit
39abbaef19 ("caif-hsi: Postpone init of
HIS until open()") in the net tree.
I did my best with that one and will ask Sjur to check it out.
Signed-off-by: David S. Miller <davem@davemloft.net>
FCoE target mode was experiencing issues due to the fact that we were
sending up data frames that were padded to 60 bytes after the DDP logic had
already stripped the frame down to 52 or 56 depending on the use of VLANs.
This was resulting in the FCoE DDP logic having issues since it thought the
frame still had data in it due to the padding.
To resolve this, adding code so that we do not pad FCoE frames prior to
handling them to the stack.
CC: <stable@vger.kernel.org>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/usb/qmi_wwan.c
net/batman-adv/translation-table.c
net/ipv6/route.c
qmi_wwan.c resolution provided by Bjørn Mork.
batman-adv conflict is dealing merely with the changes
of global function names to have a proper subsystem
prefix.
ipv6's route.c conflict is merely two side-by-side additions
of network namespace methods.
Signed-off-by: David S. Miller <davem@davemloft.net>
The check for length <= 0 is bogus because length is unsigned, and network
stack never sends zero length packets (unless it is totally broken).
The check for really small packets can be optimized (using unlikely)
and calling skb_pad directly.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch cleans up the method used for determining the link speed of
devices. The old method re-wrote some logic already existing in a mac.ops
function which should be used instead. The result is much simpler to
understand and removes a strange double-check of logic, as well as reducing
code redundancy.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for 1G Fiber PHY modules (SFP+ modules). This support
comes along side support for 1G Copper PHY modules, but uses a different PHY
type (ixgbe_sfp_type_1g_sx_core).
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch updates the igb version to 4.0.1.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Our NVM image creation tools have evolved over the years and there are
multiple versions contained in them, depending on the tool used to create
them. This patch outputs the NVM versions available in ethtool -i output.
rc2: (not sure why others show in log but not in the message)
Added additional call to igb_set_fw_version per Community feedback.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Rather than spread out the complexity of the RSS queue and queue pairing
assignment logic, place it all in one location for simplicity and
readability.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Based on original patch from Richard Cochran <richardcochran@gmail.com>
Original patch caused build errors without CONFIG_IGB_1588_CLOCK and
CONFIG_PPS enabled, since the added code was not properly wrapped.
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
PTP initialization is only done on supported parts, so remove needs
same checks or it will cause crashes on systems with igb devices that
don't support PTP. This patch adds those checks to the exit function.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is a need to configure MMW_SIZE in register RTTBCNRM with a correct
value. For 82576 device, the value should be 0x14.
Signed-off-by: Lior Levy <lior.levy@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes a memory leak that was introduced in the 3.4 kernel. The
leak occurred when FCoE was enabled and traffic was passed over the FCoE
rings reserved for FCoE. The memory leak was due to us not populating the
compound page information on the order 1 pages needed for FCoE.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix Kconfig file to make sure that PTP and IGB/IXGBE are both either
in-kernel or modules, not mixed. Having the build status mixed causes
compile errors.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
i210/i211 device has only 16 RAR address filters like 82575, instead of
32 like i350. This patch removes the entries for i210/i211 in the
get_invariants function which was setting them for 32. This ensures that
they will get the default value which is the correct one.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes a potential hole when configuring the cycle counter used to
generate the nanosecond time clock. This clock is based off of the SYSTIME
registers along with the TIMINCA registers. The TIMINCA register determines
the increment to be added to the SYSTIME registers every DMA clock tick. This
register needs to be reconfigured whenever the link-speed changes. However,
the value calculated stays the same when link is down and when link is up.
Misconfiguration can occur if the link status changes due to a reset, which
causes the TIMINCA register to be reset. This reset puts the device in an
unstable state where the SYSTIME registers stop incrementing and the PTP
protocol does not function.
The solution is to double check the TIMINCA value and always reset the value
if the register is zero. This prevents a misconfiguration bug that halts the
PHC.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a potential Rx timestamp deadlock that causes the Rx
timestamping to stall indefinitely. The issue could occur when a PTP packet is
timestamped by hardware but never reaches the Rx queue. In order to prevent a
permanent loss of timestamping, the RXSTMP(L/H) registers have to be read to
unlock them. (This used to only occur when a packet that was timestamped
reached the software.) However the registers can't be read early otherwise
there is no way to correlate them to the packet.
This patch introduces a filter function which can be used to determine if a
packet should have been timestamped. Supplied with the filter setup by the
hwtstamp ioctl, check to make sure the PTP protocol and message type match the
expected values. If so, then read the timestamp registers (to free them.) At
this point check the descriptor bit, if the bit is set then we know this
packet correlates to the timestamp stored in the RXTSTAMP registers.
Otherwise, assume that packet was dropped by the hardware, and ignore this
timestamp value. However, we have at least unlocked the rxtstamp registers for
future timestamping.
Due to the way the driver handles skb data, it cannot be directly accessed. In
order to work around this, a copy of the skb data into a linear buffer is
made. From this buffer it becomes possible to read the data correctly
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When enabling the hwtstamp mode for Rx timestamping the V2 ptp event type
specific modes (Delay Request and Sync) have been rolled into the V2 all event
packet modes, in order to more accurately represent what hardware is doing.
Hardware always timestamps the Path delay packets when a V2 mode is selected,
regardless of what type was selected (in order to always support Path delay
mode). However this means the user selected modes of timestamping only Sync or
Delay Request is not truly supported. This patch correctly sets the mode for
the hwtstamp config and returns to the user that all V2 event packets will be
timestamped.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes two minor nits from Richard Cochran. The first is a case of
ambitious line wrapping that wasn't necessary. The second is to re-order the
flag checks for PPS support. Previously, the hardware test was done first, and
the interrupt flag test was done second. Now, test the interrupt flag and use
the unlikely macro.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ixgbe_sysfs.c is only needed when CONFIG_IXGBE_HWMON is configured in the
kernel.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Acked-by: Don Skidmore <Donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The flow control DV macros are used to calculate the flow control
high and low thresholds. This patch annotates these macros slightly
better and fixes the issues below.
The macro variables are renamed LINK to _max_frame_link and TC to
_max_frame_tc. This was to avoid confusion and make them more
readable. It was found that people auditing the code read TC to be
'traffic class' in the 802.1Q definition instead of the max frame
size of the tc. Hopefully it is clear now.
This audit also found the following real deviations from the
theoretical values. Fixed in this patch.
* I multiplied the DV calculations by (36/25) which always
evaluates to 1. This does not match the intended theoretical
value of 1.44.
* IXGBE_BT2KB added 1023 to account for rounding however this
really should be 8 * 1023 - 1 to account for division by 8k.
* x2 multiplication of max frame in DV calculations to account
for updated hardware recommendations.
With this patch the DV values are inline with the recommendations
in the 82599 and 82598 data sheets. Its worth noting I did not
see any dropped frames with flow control on in my experiments without
this patch. However aligning with the hardware specs and
recommendations seems like a good idea here to account for worst
case scenarios.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Based on a report from Ethan Zhao, before calling register_netdev() the
driver should be using logging macros that do not display the potentially
confusing "(unregistered net_device)" yet still display the useful driver
name and PCI bus/device/function.
Reported-by: Ethan Zhao <ethan.kernel@gmail.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The hardware bit IXGBE_RXD_STAT_VP appears to be set even when Rx
stripping is disabled. This results in passing frames up the stack
which do not have the 802.1Q tag stripped but have the tci bits
set as if it was.
Working around this with a check for the feature flag bit. I
would welcome any better ideas or a pointer to exactly which
bits in the hardware register need to be cleared to get the
IXGBE_RXD_STAT_VP bit to be set per data sheet.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
DCB can be used independent of if RX VLAN stripping is enabled
or disabled so remove erroneous check.
Also enable or disable VLAN stripping when features are applied so
hardware and feature flags are in sync.
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
commit 44abd5c127 introduced NULL pointer
dereferences when attempting to access the check_reset_block function
pointer on 8257x and 80003es2lan non-copper devices.
This fix should be applied back through 3.4.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The definition of I217_PROXY_CTRL must use the BM_PHY_REG() macro instead
of the PHY_REG() macro for PHY page 800 register 70 since it is for a PHY
register greater than the maximum allowed by the latter macro, and fix a
typo setting the I217_MEMPWR register in e1000_suspend_workarounds_ich8lan.
Also for clarity, rename a few defines as bit definitions instead of masks.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is another fixup where the data is not transfered into buffer
addressed by skb->data but into a page.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Killing reset task while adapter is resetting causes deadlock.
Only kill reset task if adapter is not resetting.
Ref bug #43132 on bugzilla.kernel.org
CC: stable@vger.kernel.org
Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Under certain scenarios, it's possible that bursty manageability traffic
over the BMC-to-OS path may overrun the internal manageability receive
buffer causing dropped manageability packets. Clearing this bit prevents
this situation by interrupting coalescing to allow manageability traffic
through.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The code seems to want to look at the last byte where the HW puts some
information. Since the skb->data area is never seen by the HW I guess it
does not work as expected. We pass the page address to the HW so I
*think* in order to get to the last byte where the information might be
one should use the page buffer and take a look.
This is of course not more than just compile tested.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
According to the comment, errata 23 says that the memory we allocate
can't cross a 64KiB boundary. In case of jumbo frames we allocate
complete pages which can never cross the 64KiB boundary because
PAGE_SIZE should be a multiple of 64KiB so we stop either before the
boundary or start after it but never cross it. Furthermore the check
seems bogus because it looks at skb->data which is not seen by the HW
at all because we only pass the DMA address of the page we allocated. So
I *think* the workaround is not required here.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This define is needed by i217.
Reported-by: Bjorn Mork <bjorn@mork.no>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds new initialization functions and device support
for i210 and i211 devices.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
82580 and later parts did not have low power setting functions. This patch
adds the specific functions, pointers and assignments for these low
power settings.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since the caller (PM resume code) is not the one holding rtnl, when taking the
'else' branch rtnl may be released at any moment, thereby defeating the whole
purpose of this code block.
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update version number to better match the version of the out of tree
driver with similar functionality.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the hwmon code was initially added it was with the assumption that a
sysfs patch would be also coming soon. Since that isn't the case some
clean up needs to be done. This patch does that.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Kernel software timestamping requires that the driver calls skb_tx_timestamp
just before passing the skb to the MAC, in order to provide the best software
timestamps. This patch adds this call for that support.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for the ethtool get_ts_info operation, which enables
access of available timestamp/timesync support for that device. It can query
which ptp clock device is associated with the particular port.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The current value of the udelay timeout for ixgbe_disable_rx_buff is too
short. This causes the security path to not not be properly disabled during
the section that is meant to have it turned off. The end result causes a race
condition that results in RX issues.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch enables the PPS system in the PHC framework, by enabling
the clock-out feature on the X540 device. Causes the SDP0 to be set as
a 1Hz clock. Also configures the timesync interrupt cause in order to
report each pulse to the PPS via the PHC framework, which can be used
for general system clock synchronization. (This allows a stable method
for tuning the general system time via the on-board SYSTIM register
based clock.)
Signed-off-by: Jacob E Keller <jacob.e.keller@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch enables hardware timestamping for use with PTP software by
extracting a ns counter from an arbitrary fixed point cycles counter.
The hardware generates SYSTIME registers using the DMA tick which
changes based on the current link speed. These SYSTIME registers are
converted to ns using the cyclecounter and timecounter structures
provided by the kernel. Using the SO_TIMESTAMPING api, software can
enable and access timestamps for PTP packets.
The SO_TIMESTAMPING API has space for 3 different kinds of timestamps,
SYS, RAW, and SOF. SYS hardware timestamps are hardware ns values that
are then scaled to the software clock. RAW hardware timestamps are the
direct raw value of the ns counter. SOF software timestamps are the
software timestamp calculated as close as possible to the software
transmit, but are not offloaded to the hardware. This patch only
supports the RAW hardware timestamps due to inefficiency of the SYS
design.
This patch also enables the PHC subsystem features for atomically
adjusting the cycle register, and adjusting the clock frequency in
parts per billion. This frequency adjustment works by slightly
adjusting the value added to the cycle registers each DMA tick. This
causes the hardware registers to overflow rapidly (approximately once
every 34 seconds, when at 10gig link). To solve this, the timecounter
structure is used, along with a timer set for every 25 seconds. This
allows for detecting register overflow and converting the cycle
counter registers into ns values needed for providing useful
timestamps to the network stack.
Only the basic required clock functions are supported at this time,
although the hardware supports some ancillary features and these could
easily be enabled in the future.
Note that use of this hardware timestamping requires modifying daemon
software to use the SO_TIMESTAMPING API for timestamps, and the
ptp_clock PHC framework for accessing the clock. The timestamps have
no relation to the system time at all, so software must use the posix
clock generated by the PHC framework instead.
Signed-off-by: Jacob E Keller <jacob.e.keller@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If the VF sends a MACVLAN request with index of zero then it is not
actually trying to add a filter. Check the index value and only
indicate that operation is not allowed when the VF is actually trying
to add a filter.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The drop enable bit can be used to improve the performance of the adapter
in the case of multiple queues being present. This performance gain is due
to the fact that some slower CPUs can cause the FIFO to backfill preventing
faster CPUs from receiving additional work. By setting the drop enable bit
we prevent this and instead just drop the packets that would have been
bound for the slower CPU.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change cleans up the logic in the priority based flow control
configuration routines. Both the 82599 and 82598 based routines perform
similar functions however they are both arranged completely differently.
This patch goes over both of them to clean up the code.
In addition I am dropping the ixgbe_fc_pfc flow control mode and instead
just replacing it with checks for if priority flow control is enabled.
This allows us to maintain some of the link flow control information which
allows for an easier transition between link and priority flow control.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Previously we would get a mailbox error and still process the message.
Instead we should exit on error.
In addition we should also be flushing the ACK of the message so that we
can guarantee that the other end is aware we have received the message
while we are processing it.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Current igb outputs registers related to TX/RX queues(ex. RDT, RDH, TDT, TDH).
But it thinks the number of RX/TX queues is 4. But 82576 has 16 RX/TX queues.
This patch modifies igb to output the rest of the registers if the device is
82576.
Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
Acked-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
During merge of net to net-next the changes in patch:
e1000e: Fix default interrupt throttle rate not set in NIC HW
got munged in param.c of the e1000e driver. This rectifies the
merge issues.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Conflicts:
drivers/net/ethernet/intel/e1000e/param.c
drivers/net/wireless/iwlwifi/iwl-agn-rx.c
drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
drivers/net/wireless/iwlwifi/iwl-trans.h
Resolved the iwlwifi conflict with mainline using 3-way diff posted
by John Linville and Stephen Rothwell. In 'net' we added a bug
fix to make iwlwifi report a more accurate skb->truesize but this
conflicted with RX path changes that happened meanwhile in net-next.
In e1000e a conflict arose in the validation code for settings of
adapter->itr. 'net-next' had more sophisticated logic so that
logic was used.
Signed-off-by: David S. Miller <davem@davemloft.net>
PFC stats are only tabulated when PFC is enabled. However in IEEE
mode the ieee_pfc pfc_tc bits were not checked and the calculation
was aborted.
This results in statistics not being reported through ethtool and
possible a false Tx hang occurring when receiving pause frames.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clear the REQ and GNT bit in the eeprom control register (EECD).
This is required if the eeprom is to be accessed with auto read
EERD register.
After a cold reset this doesn't matter but if PBIST MAC test was
executed before booting, the register was left in a dirty state
(the 2 bits where set), which caused the read operation to time out
and returning 0.
Reference (page 312):
http://download.intel.com/design/network/manuals/316080.pdf
Reported-by: Aleksandar Igic <aleksandar.igic@dektech.com.au>
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Like other supported (igp) PHYs, the driver needs to be able to force the
master/slave mode on 82577. Since the code is the same as what already
exists in the code flow for igp PHYs, move it to a new function to be
called for both flows.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
igb and ixgbe incorrectly call netdev_tx_reset_queue() from
i{gb|xgbe}_clean_tx_ring() this sort of works in most cases except
when the number of real tx queues changes. When the number of real
tx queues changes netdev_tx_reset_queue() only gets called on the
new number of queues so when we reduce the number of queues we risk
triggering the watchdog timer and repeated device resets.
So this is not only a cosmetic issue but causes real bugs. For
example enabling/disabling DCB or FCoE in ixgbe will trigger this.
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: John Bishop <johnx.bishop@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change updates the link flow control configuration so that we
correctly set the link flow control settings for DCB. Previously we would
have to call the fc_enable call 8 times, once for each packet buffer. If
we move that logic into the fc_enable call itself we can avoid multiple
unnecessary register writes.
This change also corrects an issue in which we were only shifting the water
marks for 82599 parts by 6 instead of 10. This was resulting in us only
using 1/16 of the packet buffer when flow control was enabled.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We can avoid many of the forward declarations found in ixgbe_common.c by
just reordering things so this patch does that to help cleanup the code.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change replaces the calls to put_page with calls to __free_page.
Since the FCoE code is able to access order 1 pages I thought it would be a
good idea to change things over to using __free_pages since that is the
preferred approach for freeing pages.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that ixgbe_fc_autoneg is a void and always sets the
current_mode. Previously if the link was down we would return an error,
however there is no harm in simply treating a link down case as a case in
which autoneg simply failed. This allows us to rely on the return value of
the ixgbe_fc_enable call now since there should be no cases where it
returns an error that would normally be ignored.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change reorders the mapping of rings to q_vectors in the case that the
number of rings exceeds the number of q_vectors. Previously we would
allocate the first R/N queues to the first q_vector where R is the number
of rings and N is the number of q_vectors. Instead of doing this we can do
a better job of interleaving the rings to the CPUs by assigning every Nth
ring to the q_vector.
The below tables illustrate this change for the R = 16 N = 4 case.
Before patch After patch
q_vector: 0 1 2 3 0 1 2 3
Rings: 0 4 8 12 0 1 2 3
1 5 9 13 4 5 6 7
3 6 10 14 8 9 10 11
4 7 11 15 12 13 14 15
This should improve the performance for both DCB or ATR when the number of
rings exceeds the number of q_vectors allocated by the adapter.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that we can track instances of where a packet was
dropped due to a packet being received when there are no DMA buffers
available in the ring.
For some reason this was only being enabled with RSC, however it makes
more sense to always have this feature on so that we can track any cases
where we might drop a buffer due to an Rx ring being full.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
i217 is the next-generation LOM that will be available on systems with the
Lynx Point Platform Controller Hub (PCH) chipset from Intel. This patch
provides the initial support for the device.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Version bump to 1.11.3-k.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
After this commit:
commit aacc1bea19
Author: Multanen, Eric W <eric.w.multanen@intel.com>
Date: Wed Mar 28 07:49:09 2012 +0000
ixgbe: driver fix for link flap
The BIT_APP_UPCHG bit is no longer set when ixgbe_dcbnl_set_all() is
called. This results in the FCoE app user priority never getting set
and the driver will not configure the tx_rings correctly for FCoE
packets which use the SAN MTU and FCoE offloads.
We resolve this regression by fixing ixgbe_copy_dcb_cfg() to also
check for FCoE application changes. Additionally, we can drop the
IEEE variants of get_dcb_app() because this path is never called
with the IEEE mode enabled.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It was possible for shutdown to pull the rug out from other driver entry
points. Now we just grab the rtnl lock before taking everything apart.
Thanks to Hariharan for noticing this tight race condition.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Cc: Hariharan Nagarajan <hanagara@cisco.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If the Physical Function (PF) resets after the VF has set jumbo
frame MTU then the VF jumbo frame is overwritten. Make sure the
VF driver always requests proper MTU size after reset
synchronization.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The X540 10Gig controller is capable of linking at 100Mbits - add
support for reporting that link speed.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
For the 82573, ASPM L1 gets disabled wholesale so this special-case code
is not required. For the 82574 the previous patch does the same as for
the 82573, disabling L1 on the adapter. Thus, this code is no longer
required and can be removed.
Signed-off-by: Chris Boot <bootc@bootc.net>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ASPM on the 82574 causes trouble. Currently the driver disables L0s for
this NIC but only disables L1 if the MTU is >1500. This patch simply
causes L1 to be disabled regardless of the MTU setting.
Signed-off-by: Chris Boot <bootc@bootc.net>
Cc: "Wyborny, Carolyn" <carolyn.wyborny@intel.com>
Cc: Nix <nix@esperi.org.uk>
Link: https://lkml.org/lkml/2012/3/19/362
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Previously, IPv6 extension header parsing was disabled for all devices
supported by e1000e when using packet split mode. However, as per a
silicon errata, only certain devices need this restriction and will need
to disable IPv6 extension header parsing for all modes.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
For 82574 and 82583 devices, resolve an intermittent link issue where
the link negotiates to 100Mbps rather than 1Gbps when powering off the
PHY and powering on the PHY after several seconds.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Calling the locked versions of the read/write PHY ops function pointers
often produces excessively long lines. Shorten these as is done with
the non-locked versions of the PHY register read/write functions.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is a known issue in the 82577 and 82578 device that can cause a hang
in the device hardware during traffic stress; the current workaround in the
driver is to disable transmit flow control by default. If the user enables
transmit flow control and the device hang occurs, provide a message in the
syslog suggesting to re-enable the workaround.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
While testing the TCP changes I had to fix an issue in order to be able to
load and unload the module.
The recent patch that added thermal sensor support added a use after free
bug on module unload with an 82598 adapter in the system. To resolve the
issue I have updated the code so that when we free the info_kobj we set it
back to NULL.
I suspect there are likely other bugs present, but I will leave that for
another patch that can undergo more testing.
I am submitting this directly to net-next since this fixes a fairly serious
bug that will lock up the ixgbe module until the system is rebooted.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the user request for the number of VFs in the max_vfs parameter is
out of range then reset the value to the default value of zero. This
makes the behavior of the ixgbe driver the same as for the igb driver.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Robert Garrett <robertx.e.garrett@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If the host VMM administrator has set the virtual function device's
MAC address then also deny VF requests for MACVLAN filters.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Garrett, Robert <robertx.e.garrett@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some of our adapters have thermal data available, this patch exports
this data via hwmon sysfs interface.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some 82599 adapters contain thermal data that we can get to via
an i2c interface. These functions provide support to get at that
data. A following patch will export this data.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Secondary unicast and multicast addresses are added to the Receive
Address registers (RAR) for most parts supported by the driver. For
82579, there is only one actual RAR and a number of Shared Receive Address
registers (SHRAR) that are shared among the driver and f/w which can be
reserved and write-protected by the f/w. On this device, use the SHRARs
that are not taken by f/w for the additional addresses.
Add a MAC ops function pointer infrastructure (similar to other MAC
operations in the driver) for setting RARs, introduce a new rar_set
function for 82579 and convert the existing code that sets RARs on other
devices to a generic rar_set function.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The PHY initialization flows and assorted workarounds for 82577/8/9 done
during driver load and resume from Sx should be the same yet they are not.
Combine the current flows/workarounds into a common set of functions that
are called during the different code paths.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
An update to the EEPROM on 82579 will extend a delay in hardware to fix an
issue with WoL not working after a G3->S5 transition which is unrelated to
the driver. However, this extended delay conflicts with nominal operation
of the device when it is initialized by the driver and after every reset
of the hardware (i.e. the driver starts configuring the device before the
hardware is done with it's own configuration work). The workaround for
when the driver is in control of the device is to tell the hardware after
every reset the configuration delay should be the original shorter one.
Some pre-existing variables are renamed generically to be re-used with
new register accesses.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
With the support to bounce buffer added, the skb is coming as nonlinear in the
case of non-DDPed data frames for FCoE, which is mostly ok as the FCoE stack
would take care of that. However, for target mode, we have to set the FC CRC
and FC EOF field to allow the protocol stack to not drop the frame for the last
data frame of that sequence. So fix this by linearizing the skb first before
doing skb_put().
Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver was freeing memory in shutdown instead of remove. As a result
we were leaking memory if IEEE DCB was enabled and we loaded/unloaded the
driver. This change moves the freeing of the memory into the remove
routine where it belongs.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Maybe it's a typo, but it cause that igbvf can't be initialized successfully.
Set perm_addr value using valid dev_addr, although which is equal to hw.mac.addr.
Signed-off-by: Samuel Liao <samuelliao@tencent.com>
Signed-off-by: Shan Wei <davidshan@tencent.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch consolidates the case logic for checking whether a device supports
WoL into a single place. Previously ethtool and probe used similar logic that
was copied and maintained separately. This patch encapsulates the core logic
into a function so that a user only has to update one place.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
During igb_reset(), we initiate a hardware reset which will clear our
flow control settings. For auto-negotiation, we re-negotiate them when
linking up again, but we need to force them off properly for the forced
speed case.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Previously, a workaround was added to address a hardware bug in the
PCIm2PCI arbiter where a write by the driver of the Transmit/Receive
Descriptor Tail register could happen concurrently with a write of any
MAC CSR register by the Manageability Engine (ME) which could cause the
Tail register to have an incorrect value. The arbiter is supposed to
prevent the concurrent writes but there is a bug that can cause the Host
(driver) access to be acknowledged later than it should.
After further investigation, it was discovered that a driver write access
of any MAC CSR register after being idle for some time can be lost when
ME is accessing a MAC CSR register. When this happens, no further target
access is claimed by the MAC which could hang the system.
The workaround to check bit 24 in the FWSM register (set only when ME is
accessing a MAC CSR register) and delay for a limited amount of time until
it is cleared is now done for all driver writes of MAC CSR registers on
82579 with ME enabled. In the rare case when the driver is writing the
Tail register and ME is accessing any MAC CSR register for a duration
longer than the maximum delay, write the register and verify it has the
correct value before continuing, otherwise reset the device.
This patch also moves some pre-existing macros from the hardware-specific
header file to the more appropriate generic driver header file.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In K1 mode (a MAC/PHY interconnect power mode), the 82579 device shuts down
the Phase Lock Loop (PLL) of the interconnect to save power. When the PLL
starts working, the 82579 device may start to transfer the packet through
the interconnect before it is fully functional causing packet drops. This
workaround disables shutting down the PLL in K1 mode for 1G link speed.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Performance testing has shown that enabling DMA burst on 82574
improves performance on small packets, so enable it by default.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
80003ES2LAN has an errata such that far-end loopback may be activated by
bit errors producing a reserved symbol. In order to disable far-end
loopback quickly enough, disable it immediately following a reset.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Based on the original patch from Ying Cai <ycai@google.com>
This change ensures that the itr/itr_setting adjustment logic is used,
even for the default/compiled-in value.
Context:
When we changed the default InterruptThrottleRate value from default
(3 = dynamic mode) to 8000 for example, only adapter->itr_setting
(which controls interrupt coalescing mode) was set to 8000, but
adapter->itr (which controls the value set in NIC register) was not
updated accordingly. So from ethtool, it seemed the interrupt
throttling is enabled at 8000 intr/s, but the NIC actually was
running in dynamic mode which has lower CPU efficiency especially
when throughput is not high.
CC: Ying Cai <ycai@google.com>
CC: David Decotigny <david.decotigny@google.com>
Signed-off-by: Jeff Kirsher <jeffrey.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Following logs where seen on Systems with multiple NICs,
while using MSI interrupts as shown below:
Feb 16 15:09:32 (none) user.notice kernel: 0000:00:0d.0: lan0_0: NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 16 15:09:32 (none) user.notice kernel: 0000:40:0d.0: wan0_1: NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 16 15:09:32 (none) user.notice kernel: 0000:40:0d.0: lan0_1: NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 16 15:09:32 (none) user.warn kernel: 0000:40:0e.0: wan4_0: MSI interrupt
test failed, using legacy interrupt.
Feb 16 15:09:32 (none) user.notice kernel: 0000:00:0e.0: wan1_0: NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 16 15:09:33 (none) user.notice kernel: 0000:00:0e.0: lan1_0: NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 16 15:09:33 (none) user.notice kernel: 0000:00:0f.0: wan2_0: NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 16 15:09:33 (none) user.notice kernel: 0000:00:0f.0: lan2_0: NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 16 15:09:33 (none) user.notice kernel: 0000:40:0a.0: wan3_0: NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 16 15:09:33 (none) user.notice kernel: 0000:40:0a.0: lan3_0: NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 16 15:09:34 (none) user.notice kernel: 0000:40:0e.0: lan4_0: NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 16 15:09:34 (none) user.notice kernel: 0000:40:0f.0: wan5_0: NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 16 15:09:34 (none) user.notice kernel: 0000:40:0f.0: lan5_0: NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
This patch fixes this problem by increasing the msleep from 50 to 100.
Signed-off-by: Prasanna S Panchamukhi <ppanchamukhi@riverbed.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix merge between commit 3adadc08cc ("net ax25: Reorder ax25_exit to
remove races") and commit 0ca7a4c87d ("net ax25: Simplify and
cleanup the ax25 sysctl handling")
The former moved around the sysctl register/unregister calls, the
later simply removed them.
With help from Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes it so that we identify FCoE rings earlier than
ixgbe_set_rx_buffer_len. Instead we identify the Rx FCoE rings at
allocation time in ixgbe_alloc_q_vector.
The motivation behind this change is to avoid memory corruption when FCoE
is enabled. Without this change we were initializing the rings at 0, and
2K on systems with 4K pages, then when we bumped the buffer size to 4K with
order 1 pages we were accessing offsets 2K and 6K instead of 0 and 4K.
This was resulting in memory corruptions.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Upon resume from standby, ixgbe may trigger the ASSERT_RTNL() in
netif_set_real_num_tx_queues(). The call stack is:
netif_set_real_num_tx_queues
ixgbe_set_num_queues
ixgbe_init_interrupt_scheme
ixgbe_resume
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Conflicts:
drivers/net/ethernet/atheros/atlx/atl1.c
drivers/net/ethernet/atheros/atlx/atl1.h
Resolved a conflict between a DMA error bug fix and NAPI
support changes in the atl1 driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
The UTA table was being set to the functional equivalent of promiscuous
mode. This was resulting in traffic from the virtual function being
flooded onto the wire and the PF device. This resulted in additional
overhead for VF traffic sent to the network and in the case of traffic
sent to the PF or another VF resulted in unwanted packets on the wire.
This was actually not the intended behavior. Now that we can program
the embedded switch correctly we can remove this snippit of code. Users
who want to support this should configure the FDB correctly using the
FDB ops.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows RAR table updates while in promiscuous. With
SR-IOV enabled it is valuable to allow the RAR table to
be updated even when in promisc mode to configure forwarding
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable FDB ops on ixgbe when in SR-IOV mode.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for I2C clock stretching which is required per
SFF-8636. Customers with passive DA cables implement clock stretching
would fail without this patch.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Replace occurrences of 'if (<bool expr> == <1|0>)' with
'if ([!]<bool expr>)'
Replace occurrences of '<bool var> = (<non-bool expr>) ? true : false'
with '<bool var> = <non-bool expr>'.
Replace occurrence of '<bool var> = <non-bool expr>' with
'<bool var> = !!<non-bool expr>'
While the latter replacement is not really necessary, it is done here for
consistency and clarity. No functional changes.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Now that split strings generate checkpatch warnings (per Chapter 2 of
Documentation/CodingStyle to make it easier to grep the code for the
string) cleanup the remaining instances of them in the driver.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch enables software (and phy device) transmit time stamping.
Tested on an old PIII laptop with built in NIC.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There are times we turn of the laser before shutdown. This is a bad thing
if we want to wake on lan to work so now we make sure the laser is on
before shutdown if we support WoL.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A workaround was previously put in the driver to reset the device when
transitioning to Sx in order to activate the changed settings of the PHY
OEM bits (Low Power Link Up, or LPLU, and GbE disable configuration) for
82577/8/9 devices. After further review, it was found such a reset can
cause the 82579 to confuse which version of 82579 it actually is and broke
LPLU on all 82577/8/9 devices. The workaround during an S0->Sx transition
on 82579 (instead of resetting the PHY) is to restart auto-negotiation
after the OEM bits are configured; the restart of auto-negotiation
activates the new OEM bits as does the reset. With 82577/8, the reset is
changed to a generic reset which fixes the LPLU bits getting set wrong.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The platform is removed, so there are no users of this driver.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Bruce Allan <bruce.w.allan@intel.com>
Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Cc: Greg Rose <gregory.v.rose@intel.com>
Cc: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Cc: Alex Duyck <alexander.h.duyck@intel.com>
Cc: John Ronciak <john.ronciak@intel.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit removes the legacy timecompare code from the igb driver and
offers a tunable PHC instead.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds a source file implementing a PHC. Only the basic
clock operations have been implemented, although the hardware
would offer some ancillary functions. The code is fairly self
contained and is not yet used in the main igb driver.
Every timestamp and clock read operation must consult the overflow
counter to form a correct time value. Access to the counter is
protected by a spin lock, and the counter is implemented using the
standard cyclecounter/timecounter code.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch modifies ixgbe_get_pcie_msix_count_generic() to support
all current HW and removes the 82598 specific function.
- change the type of ixgbe_get_pcie_msix_count_generic() to u16
- include a check to make sure the maximum allowed number of vectors
is not exceeded.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some Rx and Tx specific registers are arrays indexed by the queue number.
For clarity, specify the intended queue rather than obscuring it behind a
define.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Rename NAPI polling routine and a parameter with more appropriate names,
refactor a conditional branch to get rid of an unnecessary goto/label and
fix a line exceeding 80 columns.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move the first phrase of a multi-line comment to the second line.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This allows the NIC to receive errored frames (bad FCS, etc)
and pass them up the stack. This can be useful when using
sniffers.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In rare circumstances, a descriptor writeback flush may not work if it
arrives on a specific clock cycle as a writeback request is going out.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the adapter is closed while it is simultaneously going through a
reset, it can cause a null-pointer dereference when the two different code
paths simultaneously cleanup up the Tx/Rx resources.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix up code so that changes in DCB settings
are detected only when ixgbe_dcbnl_set_all is called.
Previously, a series of 'change' commands followed by
a call to ixgbe_dcbnl_set_all() would always be handled
as a HW change - even if the net change was zero.
This patch checks for this case of no actual change and
skips going through the HW set process.
Without this fix, the link could reset and result in
a link flap.
The core change in this patch is to check for changes
in the ixgbe_copy_dcb_cfg() routine - and return
a bitmask of detected changes. The other
places where changes were detected previously can be removed.
Signed-off-by: Eric Multanen <eric.w.multanen@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update the driver version number to better match version of out of tree
driver that has similar functionality.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This was pointed out to me by Xiaojun Zhang on Source Forge.
CC: Xiaojun Zhang <zhangxiaojun@sourceforge.net>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Dan Carpenter noticed that ixgbevf initial default was different than
the rest. But the problem is broader than that, only one Intel driver (ixgb)
was doing it almost right.
The convention for default debug level should be consistent among
Intel drivers and follow established convention.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes a regression introduced by commit "e1000: do vlan
cleanup (799d531)".
Apparently some e1000 chips (not mine) are sensitive about the order of
setting vlan filter and vlan stripping/inserting functionality. So this
patch changes the order so it's the same as before vlan cleanup.
Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Pull kmap_atomic cleanup from Cong Wang.
It's been in -next for a long time, and it gets rid of the (no longer
used) second argument to k[un]map_atomic().
Fix up a few trivial conflicts in various drivers, and do an "evil
merge" to catch some new uses that have come in since Cong's tree.
* 'kmap_atomic' of git://github.com/congwang/linux: (59 commits)
feature-removal-schedule.txt: schedule the deprecated form of kmap_atomic() for removal
highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename]
drbd: remove the second argument of k[un]map_atomic()
zcache: remove the second argument of k[un]map_atomic()
gma500: remove the second argument of k[un]map_atomic()
dm: remove the second argument of k[un]map_atomic()
tomoyo: remove the second argument of k[un]map_atomic()
sunrpc: remove the second argument of k[un]map_atomic()
rds: remove the second argument of k[un]map_atomic()
net: remove the second argument of k[un]map_atomic()
mm: remove the second argument of k[un]map_atomic()
lib: remove the second argument of k[un]map_atomic()
power: remove the second argument of k[un]map_atomic()
kdb: remove the second argument of k[un]map_atomic()
udf: remove the second argument of k[un]map_atomic()
ubifs: remove the second argument of k[un]map_atomic()
squashfs: remove the second argument of k[un]map_atomic()
reiserfs: remove the second argument of k[un]map_atomic()
ocfs2: remove the second argument of k[un]map_atomic()
ntfs: remove the second argument of k[un]map_atomic()
...
This patch allows us to avoid a Tx hang when SR-IOV is enabled. This hang
can be triggered by sending small packets at a rate that was triggering Rx
missed errors from the adapter while the internal Tx switch and at least
one VF are enabled.
This was all due to the fact that under heavy stress the Rx FIFO never
drained below the flow control high water mark. This resulted in the Tx
FIFO being head of line blocked due to the fact that it relies on the flow
control high water mark to determine when it is acceptable for the Tx to
place a packet in the Rx FIFO.
The resolution for this is to set the FCRTH value to the RXPBSIZE - 32 so
that even if the ring is almost completely full we can still place Tx
packets on the Rx ring and drop incoming Rx traffic if we do not have
sufficient space available in the Rx FIFO.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Resolve namespace issues when FCoE or DCB is not enabled.
The issue is with certain configurations we end up with namespace
problems. A simple example:
ixgbe_main.c
- defines func A()
- uses func A()
ixgbe_fcoe.c
- uses func A()
ixgbe.h
- has prototype for func A()
For default (FCoE included) all is good. But when it isn't the namespace
checker complains about how func A() could be static.
To resolve this, created a ixgbe_lib file to contain functions used
by DCB/FCoE and their helper functions so that they are always in
namespace whether or not DCB/FCoE is enabled.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Remove an unnecessary #define and use memcpy
instead of a loop to copy an ethernet address.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch replaces the variable name data with the variable name features
for ixgbe_fix_features and ixgbe_set_features. This helps to make some
issues more obvious such as the fact that we were disabling Rx VLAN tag
stripping when we should have been forcing it to be enabled when DCB is
enabled.
In addition there was deprecated code present that was disabling the LRO
flag if we had the itr value set too low. I have updated this logic so
that we will now allow the LRO flag to be set, but will not enable RSC
until the rx-usecs value is high enough to allow enough time for Rx packet
coalescing.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for enabling or disabling UDP RSS via the
ethtool -N rx-flow-hash command.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch contains several fixes for formatting in regards to whitespace.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change fixes two minor issues. The first was the fact that we were
setting the return value to false twice in the set_rss_queues function.
The second is the fact that we should have been using "min_t(int," instead
of "min((int)" in set_fdir_queues.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The err_eeprom and err_sw_init tags both go to the same location. So
instead of maintaining two tags this patch combines them so we only use
err_sw_init.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change relocates the ixgbe_poll routine so it is right next to the
interrupt routine that schedules and calls it.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change just cleans up some of the logic in the service_timer function
so that we can avoid unnecessary swapping of the ready value between true to
false and back to true.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that only the 2nd cache line in the ring structure
should see frequent updates. The advantage to this is that it should
reduce the amount of cross CPU cache bouncing since only the 2nd cache line
will be changing between most network transactions.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that we store the tx_flags and protocol information
to the tx_buffer_info structure sooner. This allows us to avoid unnecessary
read/write transactions since we are placing the data in the final location
earlier.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that we always write the DMA address for the skb
itself on the same tx_buffer struct that the skb is written on. This way
we don't need the MAPPED_AS_PAGE flag and we always know it will be the
first DMA value that we will have to unmap.
In addition I have found an issue in which we were leaking a DMA mapping if
the value happened to be 0 which is possible on some platforms. In order
to resolve that I have updated the transmit path to use the length instead
of the DMA mapping in order to determine if a mapping is actually present.
One other tweak in this patch is that it only writes the olinfo information
on the first descriptor. As it turns out it isn't necessary to write it
for anything but the first descriptor so there is no need to carry it
forward.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that gso_segs and bytecount are written to the ring
sooner. This helps to simplify the logic for the two since segmentation
offloads can now update them within their own function.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of keeping a local copy of the skb on the stack for as long as long
as we do it makes sense to instead just place it on the first tx_buffer
structure so that we can save space on the stack and avoid unnecessary
read/write operations copying the pointer out of the stack and onto the
ring later.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A separate value was added to track Tx completions in order to determine if
the Tx unit was hung. However we can do the same thing using the number of
packets completed without having to add another stat to the Tx ring.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it more likely that the descriptor flags setup will use
cmov instructions instead of conditional jumps when setting up the flags.
The advantage to this is that the code should just flow a bit more
smoothly.
To do this it is necessary to set the TX_FLAGS_CSUM bit in tx_flags when
doing TSO so that we also do the checksum in addition to the segmentation
offload.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes certain that any packet we attempt to transmit will meet
minimum size requirements for the hardware.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is meant to just cleanup the logic in ixgbe_change_mtu since we
are making it unnecessarily complex due to a workaround required for 82599
when SR-IOV is enabled.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch replaces the existing Rx hot-path in the ixgbe driver with a new
implementation that is based on performing a double buffered receive. The
ixgbe driver already had something similar in place for its' packet split
path, however in that case we were still receiving the header for the
packet into the sk_buff. The big change here is the entire receive path
will receive into pages only, and then pull the header out of the page and
copy it into the sk_buff data. There are several motivations behind this
approach.
First, this allows us to avoid several cache misses as we were taking a
set of cache misses for allocating the sk_buff and then another set for
receiving data into the sk_buff. We are able to avoid these misses on
receive now as we allocate the sk_buff when data is available.
Second we are able to see a considerable performance gain when an IOMMU is
enabled because we are no longer unmapping every buffer on receive.
Instead we can delay the unmap until we are unable to use the page, and
instead we can simply call sync_single_range on the half of the page that
contains new data.
Finally we are able to drop a considerable amount of code from the driver
as we no longer have to support 2 different receive modes, packet split and
one buffer. This allows us to optimize the Rx path further since less
branching is required.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This allows the NIC to receive all frames available, including
those with bad FCS, ethernet control frames, and more.
Tested by sending frames with bad FCS.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Including bad FCS, used generate frames with bad FCS
to test other system's handling of RX of bad packets.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This allows the NIC to receive all frames available, including
those with bad FCS, un-matched vlans, ethernet control frames,
and more.
Tested by sending frames with bad FCS.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Including bad FCS, used generate frames with bad FCS
to test other system's handling of RX of bad packets.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Disabling and enabling DCB can cause FCoE hardware initialization to
occur on the incorrect traffic class when the up2tc mapping has not
yet been reconfigured.
Fix this by using the DCB configuration maps that are correct
and will be pushed at mqprio after DCB driver setup completes
successfully.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There was a race condition in the reset path where the RX buffer
could become corrupted during Fdir configuration.This is due to
a HW bug.The fix right now is to lock the buffer while we do the
fdir configuration.Since we were using similar workaround for another bug,
I moved the existing code to a function and reused it.HW team also recommended
that IXGBE_MAX_SECRX_POLL value be changed from 30 to 40.The erratum for this
bug will be published in the next release 82599 Spec Update
Signed-off-by: Atita Shirwaikar <atita.shirwaikar@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
using the form min((int)var, ver)) is replaced by min_t(int, ...)
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is clearly a typeo where we are not checking the return value from
get_link_capabilities but should. This patch corrects that.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There isn't much point in using variables to store the values of eitr_low
and eitr_high since they are not user changeable. As such I am replacing
them with the constants 10 and 20 in order to avoid any confusion on what
the values actually are.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A previous fix had gone though and disabled relaxed ordering for Rx
descriptor read fetching. This was not necessary as this functions
correctly and has no ill effects on the system.
In addition several of the defines used for the DCA control registers were
incorrect in that they indicated descriptor effects when they actually had
an impact on either data or header write back. As such I have update these
to correctly reflect either DATA or HEAD.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>