linux

Commit Graph

Author	SHA1	Message	Date
Brenden Blanco	d576acf0a2	net/mlx4_en: add page recycle to prepare rx ring for tx support The mlx4 driver by default allocates order-3 pages for the ring to consume in multiple fragments. When the device has an xdp program, this behavior will prevent tx actions since the page must be re-mapped in TODEVICE mode, which cannot be done if the page is still shared. Start by making the allocator configurable based on whether xdp is running, such that order-0 pages are always used and never shared. Since this will stress the page allocator, add a simple page cache to each rx ring. Pages in the cache are left dma-mapped, and in drop-only stress tests the page allocator is eliminated from the perf report. Note that setting an xdp program will now require the rings to be reconfigured. Before: 26.91% ksoftirqd/0 [mlx4_en] [k] mlx4_en_process_rx_cq 17.88% ksoftirqd/0 [mlx4_en] [k] mlx4_en_alloc_frags 6.00% ksoftirqd/0 [mlx4_en] [k] mlx4_en_free_frag 4.49% ksoftirqd/0 [kernel.vmlinux] [k] get_page_from_freelist 3.21% swapper [kernel.vmlinux] [k] intel_idle 2.73% ksoftirqd/0 [kernel.vmlinux] [k] bpf_map_lookup_elem 2.57% swapper [mlx4_en] [k] mlx4_en_process_rx_cq After: 31.72% swapper [kernel.vmlinux] [k] intel_idle 8.79% swapper [mlx4_en] [k] mlx4_en_process_rx_cq 7.54% swapper [kernel.vmlinux] [k] poll_idle 6.36% swapper [mlx4_core] [k] mlx4_eq_int 4.21% swapper [kernel.vmlinux] [k] tasklet_action 4.03% swapper [kernel.vmlinux] [k] cpuidle_enter_state 3.43% swapper [mlx4_en] [k] mlx4_en_prepare_rx_desc 2.18% swapper [kernel.vmlinux] [k] native_irq_return_iret 1.37% swapper [kernel.vmlinux] [k] menu_select 1.09% swapper [kernel.vmlinux] [k] bpf_map_lookup_elem Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 21:46:32 -07:00
Brenden Blanco	86af8b4191	Add sample for adding simple drop program to link Add a sample program that only drops packets at the BPF_PROG_TYPE_XDP_RX hook of a link. With the drop-only program, observed single core rate is ~20Mpps. Other tests were run, for instance without the dropcnt increment or without reading from the packet header, the packet rate was mostly unchanged. $ perf record -a samples/bpf/xdp1 $(</sys/class/net/eth0/ifindex) proto 17: 20403027 drops/s ./pktgen_sample03_burst_single_flow.sh -i $DEV -d $IP -m $MAC -t 4 Running... ctrl^C to stop Device: eth4@0 Result: OK: 11791017(c11788327+d2689) usec, 59622913 (60byte,0frags) 5056638pps 2427Mb/sec (2427186240bps) errors: 0 Device: eth4@1 Result: OK: 11791012(c11787906+d3106) usec, 60526944 (60byte,0frags) 5133311pps 2463Mb/sec (2463989280bps) errors: 0 Device: eth4@2 Result: OK: 11791019(c11788249+d2769) usec, 59868091 (60byte,0frags) 5077431pps 2437Mb/sec (2437166880bps) errors: 0 Device: eth4@3 Result: OK: 11795039(c11792403+d2636) usec, 59483181 (60byte,0frags) 5043067pps 2420Mb/sec (2420672160bps) errors: 0 perf report --no-children: 26.05% ksoftirqd/0 [mlx4_en] [k] mlx4_en_process_rx_cq 17.84% ksoftirqd/0 [mlx4_en] [k] mlx4_en_alloc_frags 5.52% ksoftirqd/0 [mlx4_en] [k] mlx4_en_free_frag 4.90% swapper [kernel.vmlinux] [k] poll_idle 4.14% ksoftirqd/0 [kernel.vmlinux] [k] get_page_from_freelist 2.78% ksoftirqd/0 [kernel.vmlinux] [k] __free_pages_ok 2.57% ksoftirqd/0 [kernel.vmlinux] [k] bpf_map_lookup_elem 2.51% swapper [mlx4_en] [k] mlx4_en_process_rx_cq 1.94% ksoftirqd/0 [kernel.vmlinux] [k] percpu_array_map_lookup_elem 1.45% swapper [mlx4_en] [k] mlx4_en_alloc_frags 1.35% ksoftirqd/0 [kernel.vmlinux] [k] free_one_page 1.33% swapper [kernel.vmlinux] [k] intel_idle 1.04% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c5c5 0.96% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c58d 0.93% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c6ee 0.92% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c6b9 0.89% ksoftirqd/0 [kernel.vmlinux] [k] __alloc_pages_nodemask 0.83% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c686 0.83% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c5d5 0.78% ksoftirqd/0 [mlx4_en] [k] mlx4_alloc_pages.isra.23 0.77% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c5b4 0.77% ksoftirqd/0 [kernel.vmlinux] [k] net_rx_action machine specs: receiver - Intel E5-1630 v3 @ 3.70GHz sender - Intel E5645 @ 2.40GHz Mellanox ConnectX-3 @40G Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 21:46:32 -07:00
Brenden Blanco	47a38e1550	net/mlx4_en: add support for fast rx drop bpf program Add support for the BPF_PROG_TYPE_XDP hook in mlx4 driver. In tc/socket bpf programs, helpers linearize skb fragments as needed when the program touches the packet data. However, in the pursuit of speed, XDP programs will not be allowed to use these slower functions, especially if it involves allocating an skb. Therefore, disallow MTU settings that would produce a multi-fragment packet that XDP programs would fail to access. Future enhancements could be done to increase the allowable MTU. The xdp program is present as a per-ring data structure, but as of yet it is not possible to set at that granularity through any ndo. Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 21:46:32 -07:00
Brenden Blanco	d1fdd91386	rtnl: add option for setting link xdp prog Sets the bpf program represented by fd as an early filter in the rx path of the netdev. The fd must have been created as BPF_PROG_TYPE_XDP. Providing a negative value as fd clears the program. Getting the fd back via rtnl is not possible, therefore reading of this value merely provides a bool whether the program is valid on the link or not. Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 21:46:32 -07:00
Brenden Blanco	a7862b4584	net: add ndo to setup/query xdp prog in adapter rx Add one new netdev op for drivers implementing the BPF_PROG_TYPE_XDP filter. The single op is used for both setup/query of the xdp program, modelled after ndo_setup_tc. Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 21:46:31 -07:00
Brenden Blanco	6a773a15a1	bpf: add XDP prog type for early driver filter Add a new bpf prog type that is intended to run in early stages of the packet rx path. Only minimal packet metadata will be available, hence a new context type, struct xdp_md, is exposed to userspace. So far only expose the packet start and end pointers, and only in read mode. An XDP program must return one of the well known enum values, all other return codes are reserved for future use. Unfortunately, this restriction is hard to enforce at verification time, so take the approach of warning at runtime when such programs are encountered. Out of bounds return codes should alias to XDP_ABORTED. Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 21:46:31 -07:00
Brenden Blanco	59d3656d5b	bpf: add bpf_prog_add api for bulk prog refcnt A subsystem may need to store many copies of a bpf program, each deserving its own reference. Rather than requiring the caller to loop one by one (with possible mid-loop failure), add a bulk bpf_prog_add api. Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 21:46:31 -07:00
David S. Miller	ddbcb79493	Merge branch 'ncsi' Gavin Shan says: ==================== NCSI Support This series rebases on David's linux-net git repo ("master" branch). It's to support NCSI stack on drivers/net/ethernet/faraday/ftgmac100.c. The implementation is based on NCSI spec (version: 1.1.0): https://www.dmtf.org/sites/default/files/standards/documents/DSP0222_1.1.0.pdf As the following figure shows and defined in NCSI spec: * The NC-SI (aka NCSI) is defined as the interface between a (Base) Management Controller (BMC) and one or multiple Network Interface Controlers (NIC) on host side. The interface is responsible for providing external network connectivity for BMC. * Each BMC can connect to multiple packages, up to 8. Each package can have multiple channels, up to 32. Every package and channel are identified by 3-bits and 5-bits in NCSI packet. * NCSI packet, encapsulated in ethernet frame, has 0x88F8 in the protocol field. The destination MAC address should be 0xFF's while the source MAC address can be arbitrary one. * NCSI packets are classified to command, response, AEN (Asynchronous Event Notification). Commands are sent from BMC to host (NIC) for configuration and information retrival. Responses, corresponding to commands, are sent from host to BMC for confirmation and requested information. One command should have one and only one response. AEN is sent from host to BMC for notification (e.g. link down on active channel) so that BMC can take appropriate action. +------------------+ +----------------------------------------------+ \| \| \| Host \| \| BMC \| \| \| \| \| \| +-------------------+ +-------------------+ \| \| +---------+ \| \| \| Package-A \| \| Package-B \| \| \| \| \| \| \| +---------+---------+ +-------------------+ \| \| \|ftgmac100\| \| \| \| Channel \| Channel \| \| Channel \| Channel \| \| +----+----+----+---+ +-+---------+---------+--+---------+---------+-+ \| \| \| \| \| \| +-----------------------------+----------------------+ The series of patches is highlighted as: The design for the patchset is highlighted as below: * The network driver uses 3 interfaces exported from NCSI stack: ncsi_register_dev() - Register (create) a associated NCSI device. ncsi_start_dev() - Bring up the NCSI device. ncsi_unregister_dev() - Destroy the registered NCSI device. * There are several data structures introduced for different objects: struct ncsi_dev - NCSI device seen by network device driver. struct ncsi_dev_priv - NCSI device seen by NCSI stack. struct ncsi_package - NCSI package which can have multiple channels. struct ncsi_channel - NCSI channel. * The NCSI stack is driven by workqueue and state machine internally. * The all available NCSI packages and channels are enumerated (probed) on the first call to ncsi_start_dev(). The NCSI topology won't change until the NCSI device is destroyed. * All available channels will be brought up When the hardware arbitration is enabled. Otherwise, only one channel is selected as active one. The NCSI internal is driven by state machine with help of a workqueue. In the meanwhile, there are 3 states for each channel which can be put into a queue requesting for configuration or suspending. Channels in the queue with inactive state set will be configured (bringup) while channels in the queue with active state will be suspended (teardown). The request configuration or suspending is being applied on the channel if it's in invisible state. * Failover, another inactive channel is selected as active, can happen when the hardware arbitration is disabled. The failover can be caused by timeout on link monitor and AEN. * NCSI stack should be configurable through netlink or another mechanism, it's not implemented in this patchset. It's something TBD. * The first NIC driver that is aware of NCSI: drivers/net/ethernet/faraday/ftgmac100.c Changelog ========= v2 -> v3: * Include (one line) change in include/uapi/linux/if_ether.h to fix build error. v1 -> v2: * Support NCSI spec v1.1.0 (3 more commands and 4 hardware arbitration modes added). * Enable AEN packets according to the supported list. * Introduce NCSI channel states and processing queue in order to support the hardware arbitration. * The hardware arbitration is supported (tested with emulated environment). * Introduce link monitor with GLS (Get Link Status) command/response as part of the error handling defined in NCSI spec. * Support IPv6 address discovery when CONFIG_IPV6 is enabled. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 20:49:18 -07:00
Gavin Shan	fc6061cf93	net/faraday: Mask PHY interrupt with NCSI mode Bogus PHY interrupts are observed. This masks the PHY interrupt when the interface works in NCSI mode as there is no attached PHY under the circumstance. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 20:49:18 -07:00
Gavin Shan	bb168e2e9e	net/faraday: Match driver according to compatible property This matches the driver with devices compatible with "faraday,ftgmac100" declared in the device tree. Originally, device's name from device tree for it. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 20:49:18 -07:00
Gavin Shan	bd466c3fb5	net/faraday: Support NCSI mode This makes ftgmac100 driver support NCSI mode. The NCSI is enabled on the interface if property "use-nc-si" or "use-ncsi" is found from the device node in device tree. * No PHY device is used when NCSI mode is enabled. * The NCSI device (struct ncsi_dev) is created when probing the device while it's enabled/started when the interface is brought up. * Hardware IP checksum dosn't work when NCSI mode is enabled. It is disabled on enabled NCSI. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 20:49:17 -07:00
Gavin Shan	113ce107af	net/faraday: Read MAC address from chip The device is assigned with random MAC address. It isn't reasonable. An valid MAC address might have been provided by (uboot) firmware by device-tree or in chip. It's reasonable to use it to maintain consistency. This uses the MAC address from device-tree or that in the chip if it's valid. Otherwise, a random MAC address is given as before. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 20:49:17 -07:00
Gavin Shan	eb4181849f	net/faraday: Helper functions to create or destroy MDIO interface This introduces two helper functions to create or destroy MDIO interface. No logical changes introduced except the proper MDIO names are given when having more than one MDIO bus. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 20:49:17 -07:00
Gavin Shan	7a82ecf4cf	net/ncsi: NCSI AEN packet handler This introduces NCSI AEN packet handlers that result in (A) the currently active channel is reconfigured; (B) Currently active channel is deconfigured and disabled, another channel is chosen as active one and configured. Case (B) won't happen if hardware arbitration has been enabled, the channel that was in active state is suspended simply. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 20:49:17 -07:00
Gavin Shan	e6f44ed6d0	net/ncsi: Package and channel management This manages NCSI packages and channels: * The available packages and channels are enumerated in the first time of calling ncsi_start_dev(). The channels' capabilities are probed in the meanwhile. The NCSI network topology won't change until the NCSI device is destroyed. * There in a queue in every NCSI device. The element in the queue, channel, is waiting for configuration (bringup) or suspending (teardown). The channel's state (inactive/active) indicates the futher action (configuration or suspending) will be applied on the channel. Another channel's state (invisible) means the requested action is being applied. * The hardware arbitration will be enabled if all available packages and channels support it. All available channels try to provide service when hardware arbitration is enabled. Otherwise, one channel is selected as the active one at once. * When channel is in active state, meaning it's providing service, a timer started to retrieve the channe's link status. If the channel's link status fails to be updated in the determined period, the channel is going to be reconfigured. It's the error handling implementation as defined in NCSI spec. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 20:49:17 -07:00
Gavin Shan	138635cc27	net/ncsi: NCSI response packet handler The NCSI response packets are sent to MC (Management Controller) from the remote end. They are responses of NCSI command packets for multiple purposes: completion status of NCSI command packets, return NCSI channel's capability or configuration etc. This defines struct to represent NCSI response packets and introduces function ncsi_rcv_rsp() which will be used to receive NCSI response packets and parse them. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 20:49:17 -07:00
Gavin Shan	6389eaa7fa	net/ncsi: NCSI command packet handler The NCSI command packets are sent from MC (Management Controller) to remote end. They are used for multiple purposes: probe existing NCSI package/channel, retrieve NCSI channel's capability, configure NCSI channel etc. This defines struct to represent NCSI command packets and introduces function ncsi_xmit_cmd(), which will be used to transmit NCSI command packet according to the request. The request is represented by struct ncsi_cmd_arg. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 20:49:16 -07:00
Gavin Shan	2d283bdd07	net/ncsi: Resource management NCSI spec (DSP0222) defines several objects: package, channel, mode, filter, version and statistics etc. This introduces the data structs to represent those objects and implement functions to manage them. Also, this introduces CONFIG_NET_NCSI for the newly implemented NCSI stack. * The user (e.g. netdev driver) dereference NCSI device by "struct ncsi_dev", which is embedded to "struct ncsi_dev_priv". The later one is used by NCSI stack internally. * Every NCSI device can have multiple packages simultaneously, up to 8 packages. It's represented by "struct ncsi_package" and identified by 3-bits ID. * Every NCSI package can have multiple channels, up to 32. It's represented by "struct ncsi_channel" and identified by 5-bits ID. * Every NCSI channel has version, statistics, various modes and filters. They are represented by "struct ncsi_channel_version", "struct ncsi_channel_stats", "struct ncsi_channel_mode" and "struct ncsi_channel_filter" separately. * Apart from AEN (Asynchronous Event Notification), the NCSI stack works in terms of command and response. This introduces "struct ncsi_req" to represent a complete NCSI transaction made of NCSI request and response. link: https://www.dmtf.org/sites/default/files/standards/documents/DSP0222_1.1.0.pdf Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 20:49:16 -07:00
David S. Miller	5e31c7019f	Merge branch 'dsa-mv88e6xxx-g2-cleanup-stp' Vivien Didelot says: ==================== net: dsa: mv88e6xxx: Global2 cleanup and STP The Marvell switches registers are organized in distinct internal SMI devices, such as PHY, Port, Global 1 or Global 2 registers sets. Since not all chips support every registers sets or have slightly differences in them (such as old 88E6060 or new 88E6390 likely to be supported soon), make the setup code clearer now by removing a few family checks and adding flags to describe the Global 2 registers map. This patchset enables basic STP support and bridging on most chips when getting rid of a few inconsistencies in chip descriptions (patch 1) and add bridge Ageing Time support to DSA and the mv88e6xxx driver. Changes v2 -> v3: - rename mv88e6xxx_update_write to mv88e6xxx_update - set fastest ageing time in use in the chip for multiple bridges, tested with a few printk Changes v1 -> v2: - add a write helper for pointer-data Update registers - add ageing time support ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:02 -07:00
Vivien Didelot	2cfcd96416	net: dsa: mv88e6xxx: add support for DSA ageing time Implement the DSA driver function to configure the bridge ageing time. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:02 -07:00
Vivien Didelot	acddbd2147	net: dsa: mv88e6xxx: add G1 helper for ageing time All Marvell switch chips from (88E6060 to 88E6390) have a ATU Control register containing bits 11:4 to configure an ATU Age Time quotient. However the coefficient used to calculate the ATU Age Time vary with the models. E.g. 88E6060, 88E6352 and 88E6390 use respectively 16, 15 and 3.75 seconds. Add a age_time_coeff to the info structure to handle this and a Global 1 helper to set the default age time of 5 minutes in the setup code. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:02 -07:00
Vivien Didelot	34a79f63bb	net: dsa: support switchdev ageing time attr Add a new function for DSA drivers to handle the switchdev SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME attribute. The ageing time is passed as milliseconds. Also because we can have multiple logical bridges on top of a physical switch and ageing time are switch-wide, call the driver function with the fastest ageing time in use on the chip instead of the requested one. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:01 -07:00
Vivien Didelot	8ec61c7f7c	net: dsa: mv88e6xxx: add cap for IRL Add capability flags to describe the presence of Ingress Rate Limit unit registers and an helper function to clear it. In the meantime, fix a few harmless issues: - 6185 and 6095 don't have such registers (reserved) - the previous code didn't wait for the IRL operation to complete Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:01 -07:00
Vivien Didelot	9bda889fae	net: dsa: mv88e6xxx: add cap for Priority Override Add flags and helpers to describe the presence of Priority Override Table (POT) related registers and simplify the setup of Global 2. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:01 -07:00
Vivien Didelot	63ed880dea	net: dsa: mv88e6xxx: add cap for PVT Add flags to describe the presence of Cross-chip Port VLAN Table (PVT) related registers and simplify the setup of Global 2. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:01 -07:00
Vivien Didelot	3b4caa1b1c	net: dsa: mv88e6xxx: rework Switch MAC setter Switches such as 88E6185 as 3 Switch MAC registers in Global 1. Newer chips such as 88E6352 have freed these registers in favor of an indirect access in a Switch MAC/WoL/WoF register in Global 2. Explicit this difference with G1 and G2 helpers and flags. Also, note that this indirect access is a single-register which doesn't require to wait for the operation to complete (like Switch MAC, Trunk Mapping, etc.), in contrary to multi-registers indirect accesses with several operations and a busy bit. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:01 -07:00
Vivien Didelot	47395ed280	net: dsa: mv88e6xxx: add cap for MGMT Enables bits Some switches provide a Rsvd2CPU mechanism used to choose which of the 16 reserved multicast destination addresses matching 01:80:c2:00:00:0x should be considered as MGMT and thus forwarded to the CPU port. Other switches extend this mechanism to also configure as MGMT the additional 16 reserved multicast addresses matching 01:80:c2:00:00:2x. This mechanism is exposed via two registers in Global 2, and an Rsvd2CPU enable bit in the management register. Newer chip (such as 88E6390) has replaced these registers with a new indirect MGMT mechanism in Global 1. The patch adds two MV88E6XXX_FLAG_G2_MGMT_EN_{0,2}X flags to describe the presence of these Global 2 registers. If 88E6390 support is added, a MV88E6XXX_FLAG_G1_MGMT_CTRL flag will be needed to setup Rsvd2CPU. Note: all switches still support in parallel the ATU Load operation with an MGMT Entry State to forward such frames in a less convenient way. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:01 -07:00
Vivien Didelot	5154041fa7	net: dsa: mv88e6xxx: extract trunk mapping The Trunk Mask and Trunk Mapping registers are two Global 2 indirect accesses to trunking configuration. Add helpers for these tables and simplify the Global 2 setup. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:00 -07:00
Vivien Didelot	f22ab64123	net: dsa: mv88e6xxx: extract device mapping The Device Mapping register is an indirect table access. Provide helpers to access this table and explicit the checking of the new DSA_RTABLE_NONE routing table value. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:00 -07:00
Vivien Didelot	9729934c4f	net: dsa: mv88e6xxx: split setup of Global 1 and 2 Separate the setup of Global 1 and Global 2 internal SMI devices and add a flag to describe the presence of this second registers set. Also rearrange the G1 setup in the registers order. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:00 -07:00
Vivien Didelot	d51c542b78	net: dsa: mv88e6xxx: remove basic function flags All 88E6xxx Marvell switches (even the old not supported yet 88E6060) have at least an ATU, per-port STP states and VLAN map, to run basic switch functions such as Spanning Tree and port based VLANs. Get rid of the related MV88E6XXX_FLAG_{ATU,PORTSTATE,VLANTABLE} flags, as they are defaults to every chip. This enables STP on 6185 and removes many inconsistencies on others. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:42:00 -07:00
Andrew Morton	183fc1537e	kernel/trace/bpf_trace.c: work around gcc-4.4.4 anon union initialization bug kernel/trace/bpf_trace.c: In function 'bpf_event_output': kernel/trace/bpf_trace.c:312: error: unknown field 'next' specified in initializer kernel/trace/bpf_trace.c:312: warning: missing braces around initializer kernel/trace/bpf_trace.c:312: warning: (near initialization for 'raw.frag.<anonymous>') Fixes: `555c8a8623` ("bpf: avoid stack copy and use skb ctx for event output") Acked-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:27:01 -07:00
Andy Lutomirski	a725ee3e44	virtio-net: Remove more stack DMA VLAN and MQ control was doing DMA from the stack. Fix it. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:25:43 -07:00
Florian Fainelli	cbce91cad4	bnxt_en: Remove locking around txr->dev_state txr->dev_state was not consistently manipulated with the acquisition of the per-queue lock, after further inspection the lock does not seem necessary, either the value is read as BNXT_DEV_STATE_CLOSING or 0. Reported-by: coverity (CID 1339583) Fixes: `c0c050c58d` ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:22:38 -07:00
David S. Miller	7e0433b395	Merge branch 'frag-udp-tunneled-skbs' Shmulik Ladkani says: ==================== net: Consider fragmentation of udp tunneled skbs in 'ip_finish_output_gso' Currently IP fragmentation of GSO segments that exceed dst mtu is considered only in the ipv4 forwarding case. There are cases where GSO skbs that are bridged and then udp-tunneled may have gso_size exceeding the egress device mtu. It makes sense to fragment them, as in the non GSOed code path. The exact cases where this behavior is needed is described and addressed in the 2nd patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:40:22 -07:00
Shmulik Ladkani	b8247f095e	net: ip_finish_output_gso: If skb_gso_network_seglen exceeds MTU, allow segmentation for local udp tunneled skbs Given: - tap0 and vxlan0 are bridged - vxlan0 stacked on eth0, eth0 having small mtu (e.g. 1400) Assume GSO skbs arriving from tap0 having a gso_size as determined by user-provided virtio_net_hdr (e.g. 1460 corresponding to VM mtu of 1500). After encapsulation these skbs have skb_gso_network_seglen that exceed eth0's ip_skb_dst_mtu. These skbs are accidentally passed to ip_finish_output2 AS IS. Alas, each final segment (segmented either by validate_xmit_skb or by hardware UFO) would be larger than eth0 mtu. As a result, those above-mtu segments get dropped on certain networks. This behavior is not aligned with the NON-GSO case: Assume a non-gso 1500-sized IP packet arrives from tap0. After encapsulation, the vxlan datagram is fragmented normally at the ip_finish_output-->ip_fragment code path. The expected behavior for the GSO case would be segmenting the "gso-oversized" skb first, then fragmenting each segment according to dst mtu, and finally passing the resulting fragments to ip_finish_output2. 'ip_finish_output_gso' already supports this "Slowpath" behavior, according to the IPSKB_FRAG_SEGS flag, which is only set during ipv4 forwarding (not set in the bridged case). In order to support the bridged case, we'll mark skbs arriving from an ingress interface that get udp-encaspulated as "allowed to be fragmented", causing their network_seglen to be validated by 'ip_finish_output_gso' (and fragment if needed). Note the TUNNEL_DONT_FRAGMENT tun_flag is still honoured (both in the gso and non-gso cases), which serves users wishing to forbid fragmentation at the udp tunnel endpoint. Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:40:22 -07:00
Shmulik Ladkani	359ebda25a	net/ipv4: Introduce IPSKB_FRAG_SEGS bit to inet_skb_parm.flags This flag indicates whether fragmentation of segments is allowed. Formerly this policy was hardcoded according to IPSKB_FORWARDED (set by either ip_forward or ipmr_forward). Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:40:22 -07:00
David S. Miller	dfa15378bd	Merge branch 'bnxt_en-NS2-Nitro' Michael Chan says: ==================== bnxt_en: Add support for NS2 Nitro. This series adds support for the embedded version of the ethernet controller (Nitro) in the North Star 2 SoC. There are a number of features not supported and a software workaround for a hardware rx bug is required for Nitro A0. Please review. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:29:41 -07:00
Prashant Sreedharan	fa853dda19	bnxt_en: Add BCM58700 PCI device ID for NS2 Nitro. A bridge device in NS2 has the same device ID as the ethernet controller. Add check to avoid probing the bridge device. Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:29:41 -07:00
Prashant Sreedharan	dc52c6c70e	bnxt_en: Workaround Nitro A0 RX hardware bug (part 4). Allocate special vnic for dropping packets not matching the RX filters. First vnic is for normal RX packets and the driver will drop all packets on the 2nd vnic. Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:29:40 -07:00
Prashant Sreedharan	10bbdaf5e4	bnxt_en: Workaround Nitro A0 hardware RX bug (part 3). Allocate napi for special vnic, packets arriving on this napi will simply be dropped and the buffers will be replenished back to the HW. Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:29:40 -07:00
Prashant Sreedharan	765951938e	bnxt_en: Workaround Nitro A0 hardware RX bug (part 2). The hardware is unable to drop rx packets not matching the RX filters. To workaround it, we create a special VNIC and configure the hardware to direct all packets not matching the filters to it. We then setup the driver to drop packets received on this VNIC. This patch creates the infrastructure for this VNIC, reserves a completion ring, and rx rings. Only shared completion ring mode is supported. The next 2 patches add a NAPI to handle packets from this VNIC and the setup of the VNIC. Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:29:40 -07:00
Prashant Sreedharan	94ce9caa0f	bnxt_en: Workaround Nitro A0 hardware RX bug (part 1). Nitro A0 has a hardware bug in the rx path. The workaround is to create a special COS context as a path for non-RSS (non-IP) packets. Without this workaround, the chip may stall when receiving RSS and non-RSS packets. Add infrastructure to allow 2 contexts (RSS and CoS) per VNIC. Allocate and configure the CoS context for Nitro A0. Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:29:40 -07:00
Prashant Sreedharan	3e8060fa83	bnxt_en: Add basic support for Nitro in North Star 2. Nitro is the embedded version of the ethernet controller in the North Star 2 SoC. Add basic code to recognize the chip ID and disable the features (ntuple, TPA, ring and port statistics) not supported on Nitro A0. Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:29:40 -07:00
David S. Miller	cb924052d8	Merge branch 'marvell-phy' Charles-Antoine Couret says: ==================== Marvell phy: fiber interface configuration Another patchset to manage correctly the fiber link for some concerned Marvell's phy like 88E1512. This patchset fixed the commit log for the third and last commits and a comment in the first commit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:05:57 -07:00
Charles-Antoine Couret	3758be3dc1	Marvell phy: add functions to suspend and resume both interfaces: fiber and copper links. These functions used standards registers in a different page for both interfaces: copper and fiber. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:05:56 -07:00
Charles-Antoine Couret	78301ebe9b	Marvell phy: add configuration of autonegociation for fiber link. To be correctly initilized, the fiber interface needs to be configured via autonegociation registers which use some customs options or registers. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:05:56 -07:00
Charles-Antoine Couret	2170fef78a	Marvell phy: add field to get errors from fiber link. Add support for the fiber receiver error counter in the statistics. Rename the current counter which is for copper errors to phy_receive_errors_copper, so it is easy to distinguish copper from fiber. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:05:56 -07:00
Charles-Antoine Couret	6cfb3bcc06	Marvell phy: check link status in case of fiber link. For concerned phy, the fiber link is checked before the copper link. According to datasheet, the link which is up is enabled. If both links are down, copper link would be used. To detect fiber link status, we used the real time status because of troubles with the copper method. Tested with Marvell 88E1512. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:05:56 -07:00
David S. Miller	bcf77d5b59	Merge branch 'renesas-dma-channel' Sergei Shtylyov says: ==================== Fix DMA channel misreporting for the Renesas Ethernet drivers Here's a set of 2 patches against DaveM's 'net.git' repo fixing up the DMA channel reporting by 'ifconfig'... ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 16:01:34 -07:00

1 2 3 4 5 ...

604798 Commits All Branches Search

604798 Commits

All Branches