Various drivers are using implementations of ethtool_ops::get_link
that are equivalent to the default ethtool_op_get_link(). Change
them to use that instead.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver now uses Generic Receive Offload, not the older LRO.
Change references to LRO in names and comments.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Long before this driver went into mainline, it had support for
multiple TX queues per port, with lockless TX enabled. Since Linux
did not know anything of this, filling up any hardware TX queue would
stop the core TX queue and multiple hardware TX queues could fill up
before the scheduler reacted. Thus it was necessary to keep a count
of how many TX queues were stopped and to wake the core TX queue only
when all had free space again.
The driver also previously (ab)used the per-hardware-queue stopped
flag as a counter to deal with various things that can inhibit TX, but
it no longer does that.
Remove the per-channel tx_stop_count, tx_stop_lock and
per-hardware-queue stopped count and just use the networking core
queue state directly.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Call netif_napi_{add,del}() on the NAPI contexts in the new and
old channels, respectively.
Since efx_init_napi() cannot fail, make its return type void.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
If we are using a legacy interrupt, our IRQ may be shared and our
interrupt handler may be called even though interrupts are disabled on
the NIC. When we change ring sizes, we reallocate the event queue and
the interrupt handler may use an invalid pointer when called for
another device's interrupt.
Maintain a legacy_irq_enabled flag and test that at the top of the
interrupt handler. Note that this problem results from the need to
work around broken INT_ISR0 reads, and does not affect the legacy
interrupt handler for Falcon A1.
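
As a rough illustration only, the check at the top of the handler might
look like the sketch below. The structure and names (nic_sketch,
sketch_legacy_interrupt) are hypothetical stand-ins, not the driver's
own, and the memory barriers a real handler would need are omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-NIC state. */
struct nic_sketch {
	bool legacy_irq_enabled;	/* true only while eventq is valid */
	const unsigned int *eventq;	/* may be stale during reallocation */
	unsigned int events_seen;
};

enum sketch_irq_result { SKETCH_IRQ_NONE = 0, SKETCH_IRQ_HANDLED = 1 };

static enum sketch_irq_result sketch_legacy_interrupt(struct nic_sketch *nic)
{
	/* On a shared IRQ line we may be called for another device's
	 * interrupt, so test the flag before touching the event queue. */
	if (!nic->legacy_irq_enabled)
		return SKETCH_IRQ_NONE;

	nic->events_seen += nic->eventq[0];
	return SKETCH_IRQ_HANDLED;
}
```

The ring-resize path would clear the flag before freeing the event
queue and set it again once the new queue is in place.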
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Move search_depth arrays into per-table state.
Define initialisation function efx_filter_init_rx() which sets
everything apart from the match fields.
Define efx_filter_set_{ipv4_local,ipv4_full,eth_local}() to set the
match fields. This allows some simplification of callers and later
support for additional protocols and more flexible matching using
multiple calls to these functions.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The separation between filter tables is largely an internal detail
and it may be removed in future hardware. To prepare for that:
- Merge table ID with filter index to make an opaque filter ID
- Wrap efx_filter_table_clear() with a function that clears filters
from both RX tables, which is all that the current caller requires
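
One possible way to merge the two values, shown here as a sketch: pack
the table ID into the bits above the index. The 13-bit index width and
all sketch_* names are assumptions made for this example.

```c
#include <stdint.h>

/* Assumed width of the per-table index; chosen for illustration. */
#define SKETCH_INDEX_WIDTH 13
#define SKETCH_INDEX_MASK  ((1u << SKETCH_INDEX_WIDTH) - 1)

/* Combine table ID and index into one opaque filter ID. */
static inline uint32_t sketch_make_filter_id(unsigned int table_id,
					     unsigned int index)
{
	return ((uint32_t)table_id << SKETCH_INDEX_WIDTH) |
	       (index & SKETCH_INDEX_MASK);
}

/* Only the filter code itself needs to take the ID apart again. */
static inline unsigned int sketch_filter_id_table(uint32_t id)
{
	return id >> SKETCH_INDEX_WIDTH;
}

static inline unsigned int sketch_filter_id_index(uint32_t id)
{
	return id & SKETCH_INDEX_MASK;
}
```

Callers would treat the returned value as a cookie and never interpret
its bits, which leaves room to change the layout later.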
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Add message at start of self-test and increase log level of message at
end of self-test, so that any other messages produced during the
test are clearly associated with it.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Whenever we add DMA descriptors to a TX ring and update the ring
pointer, the TX DMA engine must first read the new DMA descriptors and
then start reading packet data. However, all released Solarflare 10G
controllers have a 'TX push' feature that allows us to reduce latency
by writing the first new DMA descriptor along with the pointer update.
This is only useful when the queue is empty. The hardware should
ignore the pushed descriptor if the queue is not empty, but this check
is buggy, so we must do it in software.
In order to tell whether a TX queue is empty, we need to compare the
previous transmission count (write_count) and completion count
(read_count). However, if we do that every time we update the ring
pointer then read_count may ping-pong between the caches of two CPUs
running the transmission and completion paths for the queue.
Therefore, we split the check for an empty queue between the
completion path and the transmission path:
- Add an empty_read_count field representing a point at which the
completion path saw the TX queue as empty.
- Add an old_write_count field for use on the completion path.
- On the completion path, whenever read_count reaches or passes
old_write_count the TX queue may be empty. We then read
write_count, set empty_read_count if read_count == write_count,
and update old_write_count.
- On the transmission path, we read empty_read_count. If it's set, we
compare it with the value of write_count before the current set of
descriptors was added. If they match, the queue really is empty and
we can use TX push.
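
The split check can be sketched roughly as follows. This is a
single-threaded illustration under stated assumptions: the structure,
the sketch_* names and the use of the top bit as a "valid" marker are
choices made for this example, and the barriers and cache-line
placement that matter in practice are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: top bit marks empty_read_count as set. */
#define SKETCH_EMPTY_COUNT_VALID 0x80000000u

struct sketch_txq {
	/* Completion path */
	uint32_t read_count;
	uint32_t old_write_count;
	/* Transmission path */
	uint32_t write_count;
	/* Written by completion, read and cleared by transmission */
	uint32_t empty_read_count;
};

static void sketch_txq_init(struct sketch_txq *q)
{
	q->read_count = q->old_write_count = q->write_count = 0;
	q->empty_read_count = 0 | SKETCH_EMPTY_COUNT_VALID; /* starts empty */
}

/* Completion path: 'completed' descriptors have finished. */
static void sketch_tx_completed(struct sketch_txq *q, uint32_t completed)
{
	q->read_count += completed;
	/* Wrap-safe test for "reached or passed old_write_count". */
	if (q->read_count - q->old_write_count < 0x80000000u) {
		q->old_write_count = q->write_count;
		if (q->read_count == q->old_write_count)
			q->empty_read_count =
				q->read_count | SKETCH_EMPTY_COUNT_VALID;
	}
}

/* Transmission path: add descriptors; returns true if TX push is safe. */
static bool sketch_tx_add(struct sketch_txq *q, uint32_t n_desc)
{
	uint32_t old_write = q->write_count;
	uint32_t empty = q->empty_read_count;
	bool was_empty = false;

	if (empty != 0) {
		q->empty_read_count = 0;
		/* Compare with write_count before this set was added. */
		was_empty = ((empty ^ old_write) &
			     ~SKETCH_EMPTY_COUNT_VALID) == 0;
	}
	q->write_count += n_desc;
	return was_empty;
}
```

In this sketch the transmission path touches read_count not at all, and
the completion path reads write_count only when the queue may have
drained.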
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
It is not necessary to serialise writes to the paged 128-bit
registers. However, if we don't then we must always write the last
dword separately, not as part of a qword write.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Document exactly which registers and functions have special behaviour,
and why races on writes to descriptor pointers are safe.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Place the regularly updated fields (locks, MAC stats, etc.) on a
separate cache-line from fields which are mostly constant. This
should reduce cache misses for access to the latter on the data path.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
For some reason we failed to make this change when perm_addr was
introduced.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We only have direct access to MDIO on Falcon, so move this out of
struct efx_nic.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We only have direct access to SPI on Falcon, so move all this state
out of struct efx_nic.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool EEPROM interface is really meant for exposing chip
configuration, not BootROM configuration.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the Falcon board config is invalid, we cannot proceed - we do not
have a valid board type to pass to falcon_probe_board(), and if we
kluge that to work with an unknown board then other initialisation
code will crash.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mcfw *never* sends CMDDONE when rebooting. Changing this so that it always
sends CMDDONE *before* REBOOT is easy on Siena, but it's not obvious that we
could guarantee to be able to implement this on future hardware.
Given this, I'm less convinced that the protocol should be changed.
To reiterate the failure mode: The driver sees this:
issue command
receive REBOOT event
Was that reboot event sent before the command was issued, or in
response to the command? If the former then there will be a subsequent
CMDDONE event, if the latter, then there will be no CMDDONE event.
Options to resolve this are:
1. REBOOT always completes an outstanding mcdi request, and we set
the credits count to ignore a subsequent CMDDONE event with
mismatching seqno.
2. REBOOT never completes an outstanding mcdi request. If there is
no CMDDONE event then we rely on the mcdi timeout code to complete
the outstanding request, incurring a 10s delay.
I'd argue that (2) is tidier, but that incurring a 10s delay is a little
needless. Let's go with (1).
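
A minimal sketch of option (1), assuming a simplified state structure.
The sketch_* names and fields are hypothetical, and locking and the
polled (non-event) mode are ignored.

```c
#include <stdbool.h>

/* Hypothetical, simplified MCDI state. */
struct sketch_mcdi {
	bool outstanding;	/* a request is waiting for completion */
	unsigned int seqno;	/* sequence number of the latest request */
	unsigned int credits;	/* stale CMDDONE events still to ignore */
	int resprc;		/* result of the last completed request */
};

static void sketch_issue(struct sketch_mcdi *m)
{
	++m->seqno;
	m->outstanding = true;
}

/* REBOOT always completes an outstanding request. */
static void sketch_ev_reboot(struct sketch_mcdi *m, int rc)
{
	if (m->outstanding) {
		m->outstanding = false;
		m->resprc = rc;
		++m->credits;	/* a CMDDONE may still follow */
	}
}

/* Returns true if this CMDDONE completed the outstanding request. */
static bool sketch_ev_cmddone(struct sketch_mcdi *m, unsigned int seqno,
			      int rc)
{
	if (m->outstanding && seqno == m->seqno) {
		m->outstanding = false;
		m->resprc = rc;
		return true;
	}
	/* Stale completion for a request REBOOT already finished. */
	if (m->credits)
		--m->credits;
	return false;
}
```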
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we enable PMA/PMD loopback this automatically sets RXIN_SEL
(inverse polarity for RXIN). We need to clear that bit during the
soft-reset sequence, as it is not done automatically.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We do not want to shut down the board based on a fault that has
already been cleared.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set both the 'maximum' and critical temperature limits for LM87
hardware monitors on Falcon boards. Do not shut down a port until the
critical temperature is reached, but warn as soon as the 'maximum'
temperature is reached.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some errors are expected, e.g. when sending new commands to an MC
running old firmware. Only the caller of efx_mcdi_rpc() can decide
what is a real error. Therefore log the error responses with
netif_dbg().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use vzalloc() and vzalloc_node() in net drivers
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make local functions and variables static. Do some rearrangement
of the string table stuff to put it where it gets used.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The filter engine will time out and ignore filters beyond
200-something hops. We also need to avoid infinite loops in
efx_filter_search() when the table is full.
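
To illustrate the idea, here is a sketch of a probe loop with a hard
depth limit. It uses plain linear probing and a small assumed limit
(SKETCH_MAX_SEARCH_DEPTH), neither of which is claimed to match
efx_filter_search(); all names are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed limit for the example; the hardware limit is much larger. */
#define SKETCH_MAX_SEARCH_DEPTH 5

struct sketch_slot {
	bool used;
	unsigned int key;
};

/*
 * Look up 'key' in a table whose size is a power of two. Returns the
 * slot index, or a negative errno. Because the loop is bounded by the
 * depth limit, it terminates even when every slot is in use.
 */
static int sketch_filter_search(const struct sketch_slot *table,
				unsigned int size, unsigned int key,
				bool for_insert)
{
	unsigned int index = key & (size - 1);
	unsigned int depth;

	for (depth = 1; depth <= SKETCH_MAX_SEARCH_DEPTH; depth++) {
		if (!table[index].used)
			return for_insert ? (int)index : -ENOENT;
		if (table[index].key == key)
			return (int)index;
		index = (index + 1) & (size - 1);
	}
	/* Too deep: the hardware would never find a filter stored here. */
	return for_insert ? -EBUSY : -ENOENT;
}
```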
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change "return (EXPR);" to "return EXPR;"
return is not a function, so parentheses are not required.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This board never went into production, but some engineering samples
are in use.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SFN4111T never reached production and is not being used for internal
or customer testing.
Since we have no production Falcon boards using the SFT9001 or the
GMAC, remove support for them as well.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/sfc/filter.c: In function ‘efx_probe_filters’:
drivers/net/sfc/filter.c:422: error: implicit declaration of function ‘vmalloc’
drivers/net/sfc/filter.c:422: warning: assignment makes pointer from integer without a cast
drivers/net/sfc/filter.c: In function ‘efx_remove_filters’:
drivers/net/sfc/filter.c:442: error: implicit declaration of function ‘vfree’
Signed-off-by: David S. Miller <davem@davemloft.net>
For backward compatibility, add it at the end.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This requires some reorganisation of channel setup and teardown to
ensure that we can always roll back a failed change.
Based on work by Steve Hodgson <shodgson@solarflare.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Allow the ring size to be specified as a non-power-of-two value
(for instance, to limit the number of receive buffers).
- Automatically size the event queue.
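
A sketch of how such sizing could work, assuming the hardware ring
itself must still be a power of two: round the requested count up for
the allocation, and derive the event queue size from the two ring
sizes. The minimum of 512 entries, the sizing rule for the event queue
and the sketch_* names are assumptions for this example.

```c
/* Assumed minimum hardware ring size. */
#define SKETCH_MIN_RING_ENTRIES 512u

/* Round n up to a power of two; returns 0 if that would overflow. */
static unsigned int sketch_roundup_pow_of_two(unsigned int n)
{
	unsigned int p = 1;

	if (n > 0x80000000u)
		return 0;
	while (p < n)
		p <<= 1;
	return p;
}

/* Entries to allocate for a ring that will hold 'requested' buffers. */
static unsigned int sketch_ring_entries(unsigned int requested)
{
	if (requested < SKETCH_MIN_RING_ENTRIES)
		requested = SKETCH_MIN_RING_ENTRIES;
	return sketch_roundup_pow_of_two(requested);
}

/* Event queue large enough for one event per RX and TX descriptor. */
static unsigned int sketch_eventq_entries(unsigned int rxq_requested,
					  unsigned int txq_requested)
{
	return sketch_roundup_pow_of_two(rxq_requested + txq_requested);
}
```

With these assumptions, a request for 1000 receive buffers would
allocate a 1024-entry ring but fill only 1000 of its slots.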
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will allow for reallocation of channel structures and rings.
Change module parameter separate_tx_channels to be read-only, since we
now require its value to be constant.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for changes to the way channels and queue structures
are allocated, revise the macros and functions used to look up and
iterate over them.
- Replace efx_for_each_tx_queue() with iteration over channels then TX
queues
- Replace efx_for_each_rx_queue() with iteration over channels then RX
queues (with one exception, shortly to be removed)
- Introduce efx_get_{channel,rx_queue,tx_queue}() functions to look up
channels and queues by index
- Introduce efx_channel_get_{rx,tx}_queue() functions to look up a
channel's queues
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we allocate DMA descriptor rings and event rings using
pci_alloc_consistent() which selects non-blocking behaviour from the
page allocator (GFP_ATOMIC). This is unnecessary, and since we
currently allocate a single contiguous block for each ring (up to 32
pages!) these allocations are likely to fail if there is any
significant memory pressure. Use dma_alloc_coherent() and GFP_KERNEL
instead.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rx_over_errors appears to be intended as a count of packets that
overflow a packet buffer in the NIC. Given that we implement a
cut-through receive path, this should always be 0.
rx_dropped appears to be the correct counter for packets dropped due
to lack of host buffers.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calculating rx_bad as rx_packets - rx_good is unnecessary and
incorrect, since rx_good does not include control frames (e.g.
pause frames) and rx_packets does.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fresh skbs have ip_summed set to CHECKSUM_NONE (0), so drivers do not
need to set skb->ip_summed to CHECKSUM_NONE again.
Introduce skb_checksum_none_assert() helper so that we keep this
assertion documented in driver sources.
Change most occurrences of:
skb->ip_summed = CHECKSUM_NONE;
to:
skb_checksum_none_assert(skb);
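
For illustration, a userspace sketch of such a helper, assuming the
check is compiled in only for debug builds. The structure, the
SKETCH_DEBUG switch and the sketch_* names are stand-ins and not the
kernel definitions.

```c
#include <assert.h>

enum { SKETCH_CHECKSUM_NONE = 0, SKETCH_CHECKSUM_UNNECESSARY = 1 };

/* Stand-in for the one sk_buff field that matters here. */
struct sketch_skb {
	unsigned char ip_summed;
};

/*
 * Documents, and in debug builds verifies, that a freshly allocated
 * buffer still has ip_summed == CHECKSUM_NONE. No store is performed.
 */
static inline void sketch_checksum_none_assert(const struct sketch_skb *skb)
{
#ifdef SKETCH_DEBUG
	assert(skb->ip_summed == SKETCH_CHECKSUM_NONE);
#else
	(void)skb;
#endif
}
```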
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit eedc765ca4 merged changes from
net-2.6 that added and then removed efx_nic::port_num, which was also
added in net-next-2.6. The end result should be that it is removed,
since it is now unused.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a small possibility that a reader gets incorrect values on 32
bit arches. SNMP applications could catch incorrect counters when a
32bit high part is changed by another stats consumer/provider.
One way to solve this is to add a rtnl_link_stats64 param to all
ndo_get_stats64() methods, and also add such a parameter to
dev_get_stats().
The rule is that we are not allowed to use dev->stats64 as temporary
storage for 64-bit stats; a caller-provided area (usually on the
stack) must be used instead.
Old drivers (only providing get_stats() method) need no changes.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow ethtool to query the number of RX rings, the fields used in RX
flow hashing and the hash indirection table.
Allow ethtool to update the RX flow hash indirection table.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ethtool_op_set_flags() does not check for unsupported flags, and has
no way of doing so. This means it is not suitable for use as a
default implementation of ethtool_ops::set_flags.
Add a 'supported' parameter specifying the flags that the driver and
hardware support, validate the requested flags against this, and
change all current callers to pass this parameter.
Change some other trivial implementations of ethtool_ops::set_flags to
call ethtool_op_set_flags().
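
The validation step can be sketched as below. The flag values, the
device structure and the sketch_* names are hypothetical; only the
shape of the check (reject anything outside 'supported') is taken from
the description above.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bits standing in for the ethtool flags. */
#define SKETCH_FLAG_LRO    (1u << 0)
#define SKETCH_FLAG_NTUPLE (1u << 1)
#define SKETCH_FLAG_RXHASH (1u << 2)
#define SKETCH_ALL_FLAGS \
	(SKETCH_FLAG_LRO | SKETCH_FLAG_NTUPLE | SKETCH_FLAG_RXHASH)

struct sketch_netdev {
	uint32_t features;
};

/* Apply 'data' only if every requested flag is in 'supported'. */
static int sketch_op_set_flags(struct sketch_netdev *dev, uint32_t data,
			       uint32_t supported)
{
	if (data & ~supported)
		return -EINVAL;

	dev->features = (dev->features & ~SKETCH_ALL_FLAGS) |
			(data & SKETCH_ALL_FLAGS);
	return 0;
}
```

A driver that supports only RX hashing would then pass
SKETCH_FLAG_RXHASH as 'supported' and get -EINVAL for any other flag.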
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Insertion of the Falcon hash is unreliable.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We will use this hash key for Toeplitz IPv4 hashing too.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hash appears immediately before the packet data, not at the
beginning of the buffer. This means we can easily use negative offsets
from the start of packet data, so adjust the data and length at the
top of __efx_rx_packet() instead of wherever we consume the hash.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace EFX_ERR() with netif_err(), EFX_INFO() with netif_info(),
EFX_LOG() with netif_dbg() and EFX_TRACE() and EFX_REGDUMP() with
netif_vdbg().
Replace EFX_ERR_RL(), EFX_INFO_RL() and EFX_LOG_RL() using explicit
calls to net_ratelimit().
Implement the ethtool operations to get and set message level flags,
and add a 'debug' module parameter for the initial value.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This exposes the port number to userland through sysfs.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A single shared memory region used to communicate with firmware is
mapped into both PCI PFs of the SFC9020 and SFL9021. Drivers must be
able to identify which port they are addressing in order to use the
correct sub-region. Currently we use the PCI function number, but the
PCI address may be virtualised. Use the CS_PORT_NUM register field
defined for just this purpose.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cleanup patch.
Use new __packed annotation in drivers/net/
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rx_errors is defined as 'bad packets received', but we are currently
including various overflow errors as well.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Insert a structure at the start of the shared page that
tracks the dma mapping refcnt. DMA into the next cache
line of the (shared) page (plus EFX_PAGE_IP_ALIGN).
When recycling a page, check the page refcnt. If the
page is otherwise unused, then resurrect the other
receive buffer that previously referenced the page.
Be careful not to overflow the receive ring, since we
can now resurrect n receive buffers in a row.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The cut-through design of the receive path means that packets that
fail to match the appropriate MAC filter are not discarded at the MAC
but are flagged in the completion event as 'to be discarded'. On
networks with heavy multicast traffic, this can account for a
significant proportion of received packets, so it is worthwhile to
recycle the buffer immediately in this case rather than freeing it
and then reallocating it shortly after.
The only complication here is dealing with a page shared
between two receive buffers. In that case, we need to be
careful to free the DMA mapping only when both buffers have
been freed by the kernel. This means that we can only
recycle such a page if both receive buffers are discarded.
Unfortunately, in an environment with an MTU of 1500,
rx_alloc_method=PAGE, and a mixture of discarded and
not-discarded frames hitting the same receive queue,
buffer recycling won't always be possible.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Pull the loop handling into efx_init_rx_buffers_(skb|page)
- Remove rx_queue->buf_page, and associated clean up code
- Remove unmap_addr, since unmap_addr is trivially calculable
This will allow us to recycle discarded buffers directly
from efx_rx_packet(), since we will never be in the middle of
splitting a page.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ensure that efx_fast_push_rx_descriptors() only runs
from efx_process_channel() [NAPI], or when napi_disable()
has been executed.
Reimplement the slow fill by sending an event to the
channel, so that NAPI runs, and hanging the subsequent
fast fill off the event handler. Replace the sfc_refill
workqueue and delayed work items with a timer. We do
not need to stop this timer in efx_flush_all() because
it's safe to send the event always; receiving it will
be delayed until NAPI is restarted.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Formerly, efx_test_eventq_irq() assumed it was the only user of
driver generated events. Allow it to interoperate with other users.
We can create more than 16 channels, so align event codes with
a multiple of 256 not 16.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's been observed that some PHYs (such as the qt2025c) can
do down-up-down-up transitions, presumably as PCS block lock
settles down.
The loopback selftest will start sending data immediately
after the link comes up. Work around this by waiting for
the link state to stay up for two consecutive polls, rather
than one.
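
The debounce can be sketched over a recorded series of poll results.
The function name and the array-based interface are assumptions made so
the example is self-contained; a real implementation would poll the PHY
with a delay between samples.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Given successive link poll results, return the index of the poll at
 * which the link has been up twice in a row, or -1 if that never
 * happens within n_polls.
 */
static int sketch_first_stable_poll(const bool *link_up, size_t n_polls)
{
	unsigned int consecutive = 0;
	size_t i;

	for (i = 0; i < n_polls; i++) {
		consecutive = link_up[i] ? consecutive + 1 : 0;
		if (consecutive == 2)
			return (int)i;
	}
	return -1;
}
```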
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All of the ethtool code paths keep them in sync, but we need
to ensure they are sync'd at start of day. Matches the sft9001
driver.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Under certain conditions a PHY may backpressure Falcon B0
in such a way that flushes time out. In normal circumstances
the PHY poller would fix the PHY, and the flush could complete.
But efx_nic_flush_queues() is always called after efx_stop_all(),
so the poller has been stopped. Even if this weren't the case,
how long would we have to wait for the poller to fix this? And
several callers of efx_nic_flush_queues() are about to reset
the device anyway - so we don't need to do anything.
Work around this bug by scheduling a reset. Ensure that the
MAC is never rewired back into the datapath before the reset
runs (we already ignore all rx events anyway).
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
efx_pm_freeze() sets efx->state = STATE_FINI, which means
efx_reset_work() will abort any scheduled resets.
efx_pm_thaw() should reschedule efx_reset_work() again,
since a freeze/thaw will not have reset the hardware.
This bug was spotted by inspection - there is no real world example of
this happening.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most of its members are constant capabilities, not configuration. The
new name is also consistent with the name of the pointer to it in
struct efx_nic and the names of structures used by other PHY drivers.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a core TX queue and 2 hardware TX queues for each channel.
If separate_tx_channels is set, create equal numbers of RX and TX
channels instead.
Rewrite the channel and queue iteration macros accordingly.
Eliminate efx_channel::used_flags as redundant.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This makes no immediate difference, but we definitely do not want
to test all TX queues once we allocate a pair of TX queues to each
channel.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need for this to be unsigned long; make it unsigned int.
It does need a line in kernel-doc, so add that.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently TX completions do not count towards the NAPI budget. This
means a continuous stream of TX completions can cause the polling
function to loop indefinitely with scheduling disabled. To avoid
this, follow the common practice of reporting the budget spent after
processing one ring-full of TX completions.
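
In sketch form, the accounting might look like this. The function and
parameter names are hypothetical, and rx_done is assumed never to
exceed budget.

```c
/*
 * Work to report from one NAPI poll. RX packets count one each; a
 * ring-full or more of TX completions is reported as the whole budget,
 * so the poll function is called again rather than looping on.
 */
static int sketch_poll_spent(int rx_done, unsigned int tx_done,
			     unsigned int txq_entries, int budget)
{
	int spent = rx_done;

	if (tx_done >= txq_entries)
		spent = budget;
	return spent;
}
```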
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When set, an event is not sent whenever periodic MAC statistics are
raised. This avoids unnecessary wake-ups.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>