The indirection table has the indices of the Rx queues. When we store it
during set indirection operation, we convert the indices to our internal
representation of the indices.
Our internal representation of the indices is: even indices for Tx and
uneven indices for Rx, where every Tx/Rx pair are in a consecutive order
starting from 0. For example if the driver has 3 queues (3 for Tx and 3
for Rx) then the indices are as follows:
0 1 2 3 4 5
Tx Rx Tx Rx Tx Rx
The BUG:
The issue is that when we satisfy a get request for the indirection
table, we don't convert the indices back to the original representation.
The FIX:
Simply apply the inverse function for the indices of the indirection
table after we set it.
Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first_interrupt field is accessed in ena_intr_msix_io() upon
receiving an interrupt.The rx_ring and tx_ring fields of napi can
be NULL when receiving interrupt for xdp queues. This patch fixes
the issue by moving the field to the ena_napi struct.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit implements the XDP_TX action in the ena driver. We allocate
separate tx queues for the XDP_TX. We currently allow xdp only when
there is enough queues to allocate for xdp.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit implements the basic functionality of drop/pass logic in the
ena driver.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set channels callback enables the user to change the count of queues
used by the driver using ethtool. We decided to currently support only
equal number of rx and tx queues, this might change in the future.
Also rename dev_up to dev_was_up in ena_update_queue_count() to make
it clearer.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Update ena_ethtool:ena_get_channels() to return adapter->max_io_queues
so that ethtool -l returns the correct maximum queue number.
- Change the name of ena_calc_io_queue_num() to
ena_calc_max_io_queue_num() as it returns the maximum number of io
queues and actual number of queues can be smaller if changed
by ethtool -L which is implemented in a later commit.
- Change variable name from io_queue_num to max_num_io_queues in
ena_calc_max_io_queue_num() and ena_probe().
- Make all types of variables that convey the number and sizeof queues
to be u32, for consistency with the API between the driver and the
device.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most places in the code refer to the IO queues as io_queues and not
simply queues. Examples - max_io_queues_per_vf, ENA_MAX_NUM_IO_QUEUES,
ena_destroy_all_io_queues() etc..
We are also adding the new max_num_io_queues field to struct ena_adapter
in the following commit.
The changes included in this commit are:
struct ena_adapter->num_queues => struct ena_adapter->num_io_queues
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Deleted unused 4 fields from struct ena_adapter and their only user
ena_restore_ethtool_params().
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Out of the fields {per_napi_bytes, per_napi_packets} in struct ena_ring,
only rx_ring->per_napi_packets are used to determine if napi did work
for dim.
This commit removes all other uses of these fields.
2. Remove ena_ring->moder_tbl_idx, which is not used by dim.
3. Remove all calls to ena_com_destroy_interrupt_moderation(), since all it
did was to destroy the interrupt moderation table, which is removed as
part of removing old interrupt moderation code.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the dim library for the rx adaptive interrupt moderation implementation
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 315c28d2b7 ("net: ena: ethtool: add extra properties retrieval via get_priv_flags").
As discussed at netconf and on the mailing list we can't allow
for the the abuse of private flags for exposing arbitrary device
labels.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update driver version to match device specification.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the set_ringparam() function of the ethtool interface
to enable the changing of io queue sizes.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there is not enough memory to allocate io queues the driver will
try to allocate smaller queues.
The backoff algorithm is as follows:
1. Try to allocate TX and RX and if successful.
1.1. return success
2. Divide by 2 the size of the larger of RX and TX queues (or both if their size is the same).
3. If TX or RX is smaller than 256
3.1. return failure.
4. else
4.1. go back to 1.
Also change the tx_queue_size, rx_queue_size field names in struct
adapter to requested_tx_queue_size and requested_rx_queue_size, and
use RX and TX queue 0 for actual queue sizes.
Explanation:
The original fields were useless as they were simply used to assign
values once from them to each of the queues in the adapter in ena_probe().
They could simply be deleted. However now that we have a backoff
feature, we have use for them. In case of backoff there is a difference
between the requested queue sizes and the actual sizes. Therefore there
is a need to save the requested queue size for future retries of queue
allocation (for example if allocation failed and then ifdown + ifup was
called we want to start the allocation from the original requested size of
the queues).
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use MAX_QUEUES_EXT get feature capability to query the device.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new statistics to ETHTOOL to specify if the device calculated
and validated the Rx csum.
Signed-off-by: Evgeny Shmeilin <evgeny@annapurnaLabs.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct ena_ring holds a union of free_rx_ids and free_tx_ids.
Both of the above fields mean the exact same thing and are used
exactly the same way.
Furthermore, these fields are always used with a prefix of the
type of ring. So for tx it will be tx_ring->free_tx_ids, and for
rx it will be rx_ring->free_rx_ids, which shows how redundant the
"_tx" and "_rx" parts are.
Furthermore still, this may lead to confusing code like where
tx_ring->free_rx_ids which works correctly but looks like a mess.
This commit removes the aforementioned redundancy by replacing the
free_rx/tx_ids union with a single free_ids field.
It also changes a single goto label name from err_free_tx_ids: to
err_tx_free_ids: for consistency with the above new notation.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adds a mechanism for exposing different device
properties via ethtool's priv_flags. The strings are provided
by the device and copied to user space through the driver.
In this commit we:
Add commands, structs and defines necessary for handling
extra properties
Add functions for:
Allocation/destruction of a buffer for extra properties strings.
Retreival of extra properties strings and flags from the network device.
Handle the allocation of a buffer for extra properties strings.
* Initialize buffer with extra properties strings from the
network device at driver startup.
Use ethtool's get_priv_flags to expose extra properties of
the ENA device
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update driver version due to critical bug fixes.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Improves socket memory utilization when receiving packets larger
than 128 bytes (the previous rx copybreak) and smaller than 256 bytes.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently Rx refill is done when the number of required descriptors is
above 1/8 queue size. With a default of 1024 entries per queue the
threshold is 128 descriptors.
There is intention to increase the queue size to 8196 entries.
In this case threshold of 1024 descriptors is too large and can hurt
latency.
Add another limitation to Rx threshold to be at most 256 descriptors.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set skb->ip_summed to the correct value as reported by the device.
Add counter for the case where rx csum offload is enabled but
device didn't check it.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch includes all code changes necessary in ena_netdev to enable
packet sending via the LLQ placemnt mode.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The buffer length field in the ena rx descriptor is 16 bit, and the
current driver passes a full page in each ena rx descriptor.
When PAGE_SIZE equals 64kB or more, the buffer length field becomes
zero.
To solve this issue, limit the ena Rx descriptor to use 16kB even
when allocating 64kB kernel pages. This change would not impact ena
device functionality, as 16kB is still larger than maximum MTU.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A mechanism for detection of stuck Rx/Tx rings due to missed or
misrouted interrupts.
Check if there are unhandled completion descriptors before the first
MSI-X interrupt arrived.
The check is per queue and per interrupt vector.
Once such condition is detected, driver and device reset is scheduled.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In rare cases, ena driver would reset and re-start the device,
for example, in case of misbehaving application that causes
transmit timeout
The first step in the reset procedure is to stop the Tx traffic by
calling ena_carrier_off().
After the driver have just started the device reset procedure, device
happens to send an asynchronous notification (via AENQ) to the driver
than there was a link change (to link-up state).
This link change is mapped to a call to netif_carrier_on() which
re-activates the Tx queues, violating the assumption of no tx traffic
until device reset is completed, as the reset task might still be in
the process of queues initialization, leading to an access to
uninitialized memory.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new statistic to ethtool stats that show the number of packets
without transmit acknowledgement from ENA device.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove ena_device_io_suspend/resume() methods
Those methods were intend to be used by the device to trigger
suspend/resume but eventually it was dropped.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current driver tries to allocate msix vectors as the number of the
negotiated io queues. (with another msix vector for management).
If pci_alloc_irq_vectors() fails, the driver aborts the probe
and the ENA network device is never brought up.
With this patch, the driver's logic will reduce the number of IO
queues to the number of allocated msix vectors (minus one for management)
instead of failing probe().
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ENA driver post Rx buffers through the Rx submission queue
for the ENA device to fill them with receive packets.
Each Rx buffer is marked with req_id in the Rx descriptor.
Newer ENA devices could consume the posted Rx buffer in out of order,
and as result the corresponding Rx completion queue will have Rx
completion descriptors with non contiguous req_id(s)
In this change the driver holds two rings.
The first ring (called free_rx_ids) is a mapping ring.
It holds all the unused request ids.
The values in this ring are from 0 to ring_size -1.
When the driver wants to allocate a new Rx buffer it uses the head of
free_rx_ids and uses it's value as the index for rx_buffer_info ring.
The req_id is also written to the Rx descriptor
Upon Rx completion,
The driver took the req_id from the completion descriptor and uses it
as index in rx_buffer_info.
The req_id is then return to the free_rx_ids ring.
This patch also adds statistics to inform when the driver receive out
of range or unused req_id.
Note:
free_rx_ids is only accessible from the napi handler, so no locking is
required
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For each device reset, log to the device what is the cause
the reset occur.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With this patch, ENA device can update the ena driver about
the desired timeout values:
These values are part of the "hardware hints" which are transmitted
to the driver as Asynchronous event through ENA async
event notification queue.
In case the ENA device does not support this capability,
the driver will use its own default values.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
check_for_missing_tx_completions() is called from a timer
task and looking for lost tx packets.
The old implementation accumulate all the lost tx packets
and did not check if those packets were retrieved on a later stage.
This cause to a situation where the driver reset
the device for no reason.
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the rare case where the device runs out of free rx buffer
descriptors (in case of pressure on kernel memory),
and the napi handler continuously fail to refill new Rx descriptors
until device rx queue totally runs out of all free rx buffers
to post incoming packet, leading to a deadlock:
* The device won't send interrupts since all the new
Rx packets will be dropped.
* The napi handler won't try to allocate new Rx descriptors
since allocation is part of NAPI that's not being invoked any more
The fix involves detecting this scenario and rescheduling NAPI
(to refill buffers) by the keepalive/watchdog task.
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the deprecated pci_enable_msix API in favour of its successor.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The timeouts were too agressive and sometimes cause false alarms.
Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ndo_get_stat64() can be called from atomic context, but the current
implementation sends an admin command to retrieve the statistics from
the device. This admin command can sleep.
This patch re-factors the implementation of ena_get_stats64() to use
the {rx,tx}bytes/count from the driver's inner counters, and to obtain
the rx drop counter from the asynchronous keep alive (heart bit)
event.
Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a driver for the ENA family of networking devices.
Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>