linux/include/net/dsa.h

678 lines
19 KiB
C
Raw Normal View History

net: Distributed Switch Architecture protocol support Distributed Switch Architecture is a protocol for managing hardware switch chips. It consists of a set of MII management registers and commands to configure the switch, and an ethernet header format to signal which of the ports of the switch a packet was received from or is intended to be sent to. The switches that this driver supports are typically embedded in access points and routers, and a typical setup with a DSA switch looks something like this: +-----------+ +-----------+ | | RGMII | | | +-------+ +------ 1000baseT MDI ("WAN") | | | 6-port +------ 1000baseT MDI ("LAN1") | CPU | | ethernet +------ 1000baseT MDI ("LAN2") | |MIImgmt| switch +------ 1000baseT MDI ("LAN3") | +-------+ w/5 PHYs +------ 1000baseT MDI ("LAN4") | | | | +-----------+ +-----------+ The switch driver presents each port on the switch as a separate network interface to Linux, polls the switch to maintain software link state of those ports, forwards MII management interface accesses to those network interfaces (e.g. as done by ethtool) to the switch, and exposes the switch's hardware statistics counters via the appropriate Linux kernel interfaces. This initial patch supports the MII management interface register layout of the Marvell 88E6123, 88E6161 and 88E6165 switch chips, and supports the "Ethertype DSA" packet tagging format. (There is no officially registered ethertype for the Ethertype DSA packet format, so we just grab a random one. The ethertype to use is programmed into the switch, and the switch driver uses the value of ETH_P_EDSA for this, so this define can be changed at any time in the future if the one we chose is allocated to another protocol or if Ethertype DSA gets its own officially registered ethertype, and everything will continue to work.) Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Nicolas Pitre <nico@marvell.com> Tested-by: Byron Bradley <byron.bbradley@gmail.com> Tested-by: Tim Ellis <tim.ellis@mac.com> Tested-by: Peter van Valderen <linux@ddcrew.com> Tested-by: Dirk Teurlings <dirk@upexia.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07 21:44:02 +08:00
/*
* include/net/dsa.h - Driver for Distributed Switch Architecture switch chips
dsa: add switch chip cascading support The initial version of the DSA driver only supported a single switch chip per network interface, while DSA-capable switch chips can be interconnected to form a tree of switch chips. This patch adds support for multiple switch chips on a network interface. An example topology for a 16-port device with an embedded CPU is as follows: +-----+ +--------+ +--------+ | |eth0 10| switch |9 10| switch | | CPU +----------+ +-------+ | | | | chip 0 | | chip 1 | +-----+ +---++---+ +---++---+ || || || || ||1000baseT ||1000baseT ||ports 1-8 ||ports 9-16 This requires a couple of interdependent changes in the DSA layer: - The dsa platform driver data needs to be extended: there is still only one netdevice per DSA driver instance (eth0 in the example above), but each of the switch chips in the tree needs its own mii_bus device pointer, MII management bus address, and port name array. (include/net/dsa.h) The existing in-tree dsa users need some small changes to deal with this. (arch/arm) - The DSA and Ethertype DSA tagging modules need to be extended to use the DSA device ID field on receive and demultiplex the packet accordingly, and fill in the DSA device ID field on transmit according to which switch chip the packet is heading to. (net/dsa/tag_{dsa,edsa}.c) - The concept of "CPU port", which is the switch chip port that the CPU is connected to (port 10 on switch chip 0 in the example), needs to be extended with the concept of "upstream port", which is the port on the switch chip that will bring us one hop closer to the CPU (port 10 for both switch chips in the example above). - The dsa platform data needs to specify which ports on which switch chips are links to other switch chips, so that we can enable DSA tagging mode on them. (For inter-switch links, we always use non-EtherType DSA tagging, since it has lower overhead. The CPU link uses dsa or edsa tagging depending on what the 'root' switch chip supports.) This is done by specifying "dsa" for the given port in the port array. - The dsa platform data needs to be extended with information on via which port to reach any given switch chip from any given switch chip. This info is specified via the per-switch chip data struct ->rtable[] array, which gives the nexthop ports for each of the other switches in the tree. For the example topology above, the dsa platform data would look something like this: static struct dsa_chip_data sw[2] = { { .mii_bus = &foo, .sw_addr = 1, .port_names[0] = "p1", .port_names[1] = "p2", .port_names[2] = "p3", .port_names[3] = "p4", .port_names[4] = "p5", .port_names[5] = "p6", .port_names[6] = "p7", .port_names[7] = "p8", .port_names[9] = "dsa", .port_names[10] = "cpu", .rtable = (s8 []){ -1, 9, }, }, { .mii_bus = &foo, .sw_addr = 2, .port_names[0] = "p9", .port_names[1] = "p10", .port_names[2] = "p11", .port_names[3] = "p12", .port_names[4] = "p13", .port_names[5] = "p14", .port_names[6] = "p15", .port_names[7] = "p16", .port_names[10] = "dsa", .rtable = (s8 []){ 10, -1, }, }, }, static struct dsa_platform_data pd = { .netdev = &foo, .nr_switches = 2, .sw = sw, }; Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Gary Thomas <gary@mlbassoc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-20 17:52:09 +08:00
* Copyright (c) 2008-2009 Marvell Semiconductor
net: Distributed Switch Architecture protocol support Distributed Switch Architecture is a protocol for managing hardware switch chips. It consists of a set of MII management registers and commands to configure the switch, and an ethernet header format to signal which of the ports of the switch a packet was received from or is intended to be sent to. The switches that this driver supports are typically embedded in access points and routers, and a typical setup with a DSA switch looks something like this: +-----------+ +-----------+ | | RGMII | | | +-------+ +------ 1000baseT MDI ("WAN") | | | 6-port +------ 1000baseT MDI ("LAN1") | CPU | | ethernet +------ 1000baseT MDI ("LAN2") | |MIImgmt| switch +------ 1000baseT MDI ("LAN3") | +-------+ w/5 PHYs +------ 1000baseT MDI ("LAN4") | | | | +-----------+ +-----------+ The switch driver presents each port on the switch as a separate network interface to Linux, polls the switch to maintain software link state of those ports, forwards MII management interface accesses to those network interfaces (e.g. as done by ethtool) to the switch, and exposes the switch's hardware statistics counters via the appropriate Linux kernel interfaces. This initial patch supports the MII management interface register layout of the Marvell 88E6123, 88E6161 and 88E6165 switch chips, and supports the "Ethertype DSA" packet tagging format. (There is no officially registered ethertype for the Ethertype DSA packet format, so we just grab a random one. The ethertype to use is programmed into the switch, and the switch driver uses the value of ETH_P_EDSA for this, so this define can be changed at any time in the future if the one we chose is allocated to another protocol or if Ethertype DSA gets its own officially registered ethertype, and everything will continue to work.) Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Nicolas Pitre <nico@marvell.com> Tested-by: Byron Bradley <byron.bbradley@gmail.com> Tested-by: Tim Ellis <tim.ellis@mac.com> Tested-by: Peter van Valderen <linux@ddcrew.com> Tested-by: Dirk Teurlings <dirk@upexia.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07 21:44:02 +08:00
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*/
#ifndef __LINUX_NET_DSA_H
#define __LINUX_NET_DSA_H
#include <linux/if.h>
#include <linux/if_ether.h>
#include <linux/list.h>
#include <linux/notifier.h>
#include <linux/timer.h>
#include <linux/workqueue.h>
#include <linux/of.h>
#include <linux/ethtool.h>
#include <linux/net_tstamp.h>
#include <linux/phy.h>
#include <linux/platform_data/dsa.h>
#include <net/devlink.h>
#include <net/switchdev.h>
struct tc_action;
struct phy_device;
struct fixed_phy_status;
struct phylink_link_state;
#define DSA_TAG_PROTO_NONE_VALUE 0
#define DSA_TAG_PROTO_BRCM_VALUE 1
#define DSA_TAG_PROTO_BRCM_PREPEND_VALUE 2
#define DSA_TAG_PROTO_DSA_VALUE 3
#define DSA_TAG_PROTO_EDSA_VALUE 4
#define DSA_TAG_PROTO_GSWIP_VALUE 5
#define DSA_TAG_PROTO_KSZ9477_VALUE 6
#define DSA_TAG_PROTO_KSZ9893_VALUE 7
#define DSA_TAG_PROTO_LAN9303_VALUE 8
#define DSA_TAG_PROTO_MTK_VALUE 9
#define DSA_TAG_PROTO_QCA_VALUE 10
#define DSA_TAG_PROTO_TRAILER_VALUE 11
net: dsa: Optional VLAN-based port separation for switches without tagging This patch provides generic DSA code for using VLAN (802.1Q) tags for the same purpose as a dedicated switch tag for injection/extraction. It is based on the discussions and interest that has been so far expressed in https://www.spinics.net/lists/netdev/msg556125.html. Unlike all other DSA-supported tagging protocols, CONFIG_NET_DSA_TAG_8021Q does not offer a complete solution for drivers (nor can it). Instead, it provides generic code that driver can opt into calling: - dsa_8021q_xmit: Inserts a VLAN header with the specified contents. Can be called from another tagging protocol's xmit function. Currently the LAN9303 driver is inserting headers that are simply 802.1Q with custom fields, so this is an opportunity for code reuse. - dsa_8021q_rcv: Retrieves the TPID and TCI from a VLAN-tagged skb. Removing the VLAN header is left as a decision for the caller to make. - dsa_port_setup_8021q_tagging: For each user port, installs an Rx VID and a Tx VID, for proper untagged traffic identification on ingress and steering on egress. Also sets up the VLAN trunk on the upstream (CPU or DSA) port. Drivers are intentionally left to call this function explicitly, depending on the context and hardware support. The expected switch behavior and VLAN semantics should not be violated under any conditions. That is, after calling dsa_port_setup_8021q_tagging, the hardware should still pass all ingress traffic, be it tagged or untagged. For uniformity with the other tagging protocols, a module for the dsa_8021q_netdev_ops structure is registered, but the typical usage is to set up another tagging protocol which selects CONFIG_NET_DSA_TAG_8021Q, and calls the API from tag_8021q.h. Null function definitions are also provided so that a "depends on" is not forced in the Kconfig. This tagging protocol only works when switch ports are standalone, or when they are added to a VLAN-unaware bridge. It will probably remain this way for the reasons below. When added to a bridge that has vlan_filtering 1, the bridge core will install its own VLANs and reset the pvids through switchdev. For the bridge core, switchdev is a write-only pipe. All VLAN-related state is kept in the bridge core and nothing is read from DSA/switchdev or from the driver. So the bridge core will break this port separation because it will install the vlan_default_pvid into all switchdev ports. Even if we could teach the bridge driver about switchdev preference of a certain vlan_default_pvid (task difficult in itself since the current setting is per-bridge but we would need it per-port), there would still exist many other challenges. Firstly, in the DSA rcv callback, a driver would have to perform an iterative reverse lookup to find the correct switch port. That is because the port is a bridge slave, so its Rx VID (port PVID) is subject to user configuration. How would we ensure that the user doesn't reset the pvid to a different value (which would make an O(1) translation impossible), or to a non-unique value within this DSA switch tree (which would make any translation impossible)? Finally, not all switch ports are equal in DSA, and that makes it difficult for the bridge to be completely aware of this anyway. The CPU port needs to transmit tagged packets (VLAN trunk) in order for the DSA rcv code to be able to decode source information. But the bridge code has absolutely no idea which switch port is the CPU port, if nothing else then just because there is no netdevice registered by DSA for the CPU port. Also DSA does not currently allow the user to specify that they want the CPU port to do VLAN trunking anyway. VLANs are added to the CPU port using the same flags as they were added on the user port. So the VLANs installed by dsa_port_setup_8021q_tagging per driver request should remain private from the bridge's and user's perspective, and should not alter the VLAN semantics observed by the user. In the current implementation a VLAN range ending at 4095 (VLAN_N_VID) is reserved for this purpose. Each port receives a unique Rx VLAN and a unique Tx VLAN. Separate VLANs are needed for Rx and Tx because they serve different purposes: on Rx the switch must process traffic as untagged and process it with a port-based VLAN, but with care not to hinder bridging. On the other hand, the Tx VLAN is where the reachability restrictions are imposed, since by tagging frames in the xmit callback we are telling the switch onto which port to steer the frame. Some general guidance on how this support might be employed for real-life hardware (some comments made by Florian Fainelli): - If the hardware supports VLAN tag stacking, it should somehow back up its private VLAN settings when the bridge tries to override them. Then the driver could re-apply them as outer tags. Dedicating an outer tag per bridge device would allow identical inner tag VID numbers to co-exist, yet preserve broadcast domain isolation. - If the switch cannot handle VLAN tag stacking, it should disable this port separation when added as slave to a vlan_filtering bridge, in that case having reduced functionality. - Drivers for old switches that don't support the entire VLAN_N_VID range will need to rework the current range selection mechanism. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 18:19:22 +08:00
#define DSA_TAG_PROTO_8021Q_VALUE 12
enum dsa_tag_protocol {
DSA_TAG_PROTO_NONE = DSA_TAG_PROTO_NONE_VALUE,
DSA_TAG_PROTO_BRCM = DSA_TAG_PROTO_BRCM_VALUE,
DSA_TAG_PROTO_BRCM_PREPEND = DSA_TAG_PROTO_BRCM_PREPEND_VALUE,
DSA_TAG_PROTO_DSA = DSA_TAG_PROTO_DSA_VALUE,
DSA_TAG_PROTO_EDSA = DSA_TAG_PROTO_EDSA_VALUE,
DSA_TAG_PROTO_GSWIP = DSA_TAG_PROTO_GSWIP_VALUE,
DSA_TAG_PROTO_KSZ9477 = DSA_TAG_PROTO_KSZ9477_VALUE,
DSA_TAG_PROTO_KSZ9893 = DSA_TAG_PROTO_KSZ9893_VALUE,
DSA_TAG_PROTO_LAN9303 = DSA_TAG_PROTO_LAN9303_VALUE,
DSA_TAG_PROTO_MTK = DSA_TAG_PROTO_MTK_VALUE,
DSA_TAG_PROTO_QCA = DSA_TAG_PROTO_QCA_VALUE,
DSA_TAG_PROTO_TRAILER = DSA_TAG_PROTO_TRAILER_VALUE,
net: dsa: Optional VLAN-based port separation for switches without tagging This patch provides generic DSA code for using VLAN (802.1Q) tags for the same purpose as a dedicated switch tag for injection/extraction. It is based on the discussions and interest that has been so far expressed in https://www.spinics.net/lists/netdev/msg556125.html. Unlike all other DSA-supported tagging protocols, CONFIG_NET_DSA_TAG_8021Q does not offer a complete solution for drivers (nor can it). Instead, it provides generic code that driver can opt into calling: - dsa_8021q_xmit: Inserts a VLAN header with the specified contents. Can be called from another tagging protocol's xmit function. Currently the LAN9303 driver is inserting headers that are simply 802.1Q with custom fields, so this is an opportunity for code reuse. - dsa_8021q_rcv: Retrieves the TPID and TCI from a VLAN-tagged skb. Removing the VLAN header is left as a decision for the caller to make. - dsa_port_setup_8021q_tagging: For each user port, installs an Rx VID and a Tx VID, for proper untagged traffic identification on ingress and steering on egress. Also sets up the VLAN trunk on the upstream (CPU or DSA) port. Drivers are intentionally left to call this function explicitly, depending on the context and hardware support. The expected switch behavior and VLAN semantics should not be violated under any conditions. That is, after calling dsa_port_setup_8021q_tagging, the hardware should still pass all ingress traffic, be it tagged or untagged. For uniformity with the other tagging protocols, a module for the dsa_8021q_netdev_ops structure is registered, but the typical usage is to set up another tagging protocol which selects CONFIG_NET_DSA_TAG_8021Q, and calls the API from tag_8021q.h. Null function definitions are also provided so that a "depends on" is not forced in the Kconfig. This tagging protocol only works when switch ports are standalone, or when they are added to a VLAN-unaware bridge. It will probably remain this way for the reasons below. When added to a bridge that has vlan_filtering 1, the bridge core will install its own VLANs and reset the pvids through switchdev. For the bridge core, switchdev is a write-only pipe. All VLAN-related state is kept in the bridge core and nothing is read from DSA/switchdev or from the driver. So the bridge core will break this port separation because it will install the vlan_default_pvid into all switchdev ports. Even if we could teach the bridge driver about switchdev preference of a certain vlan_default_pvid (task difficult in itself since the current setting is per-bridge but we would need it per-port), there would still exist many other challenges. Firstly, in the DSA rcv callback, a driver would have to perform an iterative reverse lookup to find the correct switch port. That is because the port is a bridge slave, so its Rx VID (port PVID) is subject to user configuration. How would we ensure that the user doesn't reset the pvid to a different value (which would make an O(1) translation impossible), or to a non-unique value within this DSA switch tree (which would make any translation impossible)? Finally, not all switch ports are equal in DSA, and that makes it difficult for the bridge to be completely aware of this anyway. The CPU port needs to transmit tagged packets (VLAN trunk) in order for the DSA rcv code to be able to decode source information. But the bridge code has absolutely no idea which switch port is the CPU port, if nothing else then just because there is no netdevice registered by DSA for the CPU port. Also DSA does not currently allow the user to specify that they want the CPU port to do VLAN trunking anyway. VLANs are added to the CPU port using the same flags as they were added on the user port. So the VLANs installed by dsa_port_setup_8021q_tagging per driver request should remain private from the bridge's and user's perspective, and should not alter the VLAN semantics observed by the user. In the current implementation a VLAN range ending at 4095 (VLAN_N_VID) is reserved for this purpose. Each port receives a unique Rx VLAN and a unique Tx VLAN. Separate VLANs are needed for Rx and Tx because they serve different purposes: on Rx the switch must process traffic as untagged and process it with a port-based VLAN, but with care not to hinder bridging. On the other hand, the Tx VLAN is where the reachability restrictions are imposed, since by tagging frames in the xmit callback we are telling the switch onto which port to steer the frame. Some general guidance on how this support might be employed for real-life hardware (some comments made by Florian Fainelli): - If the hardware supports VLAN tag stacking, it should somehow back up its private VLAN settings when the bridge tries to override them. Then the driver could re-apply them as outer tags. Dedicating an outer tag per bridge device would allow identical inner tag VID numbers to co-exist, yet preserve broadcast domain isolation. - If the switch cannot handle VLAN tag stacking, it should disable this port separation when added as slave to a vlan_filtering bridge, in that case having reduced functionality. - Drivers for old switches that don't support the entire VLAN_N_VID range will need to rework the current range selection mechanism. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 18:19:22 +08:00
DSA_TAG_PROTO_8021Q = DSA_TAG_PROTO_8021Q_VALUE,
};
struct packet_type;
struct dsa_switch;
struct dsa_device_ops {
struct sk_buff *(*xmit)(struct sk_buff *skb, struct net_device *dev);
struct sk_buff *(*rcv)(struct sk_buff *skb, struct net_device *dev,
struct packet_type *pt);
int (*flow_dissect)(const struct sk_buff *skb, __be16 *proto,
int *offset);
net: dsa: Allow drivers to filter packets they can decode source port from Frames get processed by DSA and redirected to switch port net devices based on the ETH_P_XDSA multiplexed packet_type handler found by the network stack when calling eth_type_trans(). The running assumption is that once the DSA .rcv function is called, DSA is always able to decode the switch tag in order to change the skb->dev from its master. However there are tagging protocols (such as the new DSA_TAG_PROTO_SJA1105, user of DSA_TAG_PROTO_8021Q) where this assumption is not completely true, since switch tagging piggybacks on the absence of a vlan_filtering bridge. Moreover, management traffic (BPDU, PTP) for this switch doesn't rely on switch tagging, but on a different mechanism. So it would make sense to at least be able to terminate that. Having DSA receive traffic it can't decode would put it in an impossible situation: the eth_type_trans() function would invoke the DSA .rcv(), which could not change skb->dev, then eth_type_trans() would be invoked again, which again would call the DSA .rcv, and the packet would never be able to exit the DSA filter and would spiral in a loop until the whole system dies. This happens because eth_type_trans() doesn't actually look at the skb (so as to identify a potential tag) when it deems it as being ETH_P_XDSA. It just checks whether skb->dev has a DSA private pointer installed (therefore it's a DSA master) and that there exists a .rcv callback (everybody except DSA_TAG_PROTO_NONE has that). This is understandable as there are many switch tags out there, and exhaustively checking for all of them is far from ideal. The solution lies in introducing a filtering function for each tagging protocol. In the absence of a filtering function, all traffic is passed to the .rcv DSA callback. The tagging protocol should see the filtering function as a pre-validation that it can decode the incoming skb. The traffic that doesn't match the filter will bypass the DSA .rcv callback and be left on the master netdevice, which wasn't previously possible. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 18:19:23 +08:00
/* Used to determine which traffic should match the DSA filter in
* eth_type_trans, and which, if any, should bypass it and be processed
* as regular on the master net device.
*/
bool (*filter)(const struct sk_buff *skb, struct net_device *dev);
unsigned int overhead;
const char *name;
enum dsa_tag_protocol proto;
};
#define DSA_TAG_DRIVER_ALIAS "dsa_tag-"
#define MODULE_ALIAS_DSA_TAG_DRIVER(__proto) \
MODULE_ALIAS(DSA_TAG_DRIVER_ALIAS __stringify(__proto##_VALUE))
struct dsa_switch_tree {
struct list_head list;
/* Notifier chain for switch-wide events */
struct raw_notifier_head nh;
/* Tree identifier */
unsigned int index;
/* Number of switches attached to this tree */
struct kref refcount;
/* Has this tree been applied to the hardware? */
bool setup;
/*
* Configuration data for the platform device that owns
* this dsa switch tree instance.
*/
struct dsa_platform_data *pd;
/*
* The switch port to which the CPU is attached.
*/
struct dsa_port *cpu_dp;
/*
* Data for the individual switch chips.
*/
struct dsa_switch *ds[DSA_MAX_SWITCHES];
};
/* TC matchall action types, only mirroring for now */
enum dsa_port_mall_action_type {
DSA_PORT_MALL_MIRROR,
};
/* TC mirroring entry */
struct dsa_mall_mirror_tc_entry {
u8 to_local_port;
bool ingress;
};
/* TC matchall entry */
struct dsa_mall_tc_entry {
struct list_head list;
unsigned long cookie;
enum dsa_port_mall_action_type type;
union {
struct dsa_mall_mirror_tc_entry mirror;
};
};
struct dsa_port {
/* A CPU port is physically connected to a master device.
* A user port exposed to userspace has a slave device.
*/
union {
struct net_device *master;
struct net_device *slave;
};
/* CPU port tagging operations used by master or slave devices */
const struct dsa_device_ops *tag_ops;
/* Copies for faster access in master receive hot path */
struct dsa_switch_tree *dst;
struct sk_buff *(*rcv)(struct sk_buff *skb, struct net_device *dev,
struct packet_type *pt);
net: dsa: Allow drivers to filter packets they can decode source port from Frames get processed by DSA and redirected to switch port net devices based on the ETH_P_XDSA multiplexed packet_type handler found by the network stack when calling eth_type_trans(). The running assumption is that once the DSA .rcv function is called, DSA is always able to decode the switch tag in order to change the skb->dev from its master. However there are tagging protocols (such as the new DSA_TAG_PROTO_SJA1105, user of DSA_TAG_PROTO_8021Q) where this assumption is not completely true, since switch tagging piggybacks on the absence of a vlan_filtering bridge. Moreover, management traffic (BPDU, PTP) for this switch doesn't rely on switch tagging, but on a different mechanism. So it would make sense to at least be able to terminate that. Having DSA receive traffic it can't decode would put it in an impossible situation: the eth_type_trans() function would invoke the DSA .rcv(), which could not change skb->dev, then eth_type_trans() would be invoked again, which again would call the DSA .rcv, and the packet would never be able to exit the DSA filter and would spiral in a loop until the whole system dies. This happens because eth_type_trans() doesn't actually look at the skb (so as to identify a potential tag) when it deems it as being ETH_P_XDSA. It just checks whether skb->dev has a DSA private pointer installed (therefore it's a DSA master) and that there exists a .rcv callback (everybody except DSA_TAG_PROTO_NONE has that). This is understandable as there are many switch tags out there, and exhaustively checking for all of them is far from ideal. The solution lies in introducing a filtering function for each tagging protocol. In the absence of a filtering function, all traffic is passed to the .rcv DSA callback. The tagging protocol should see the filtering function as a pre-validation that it can decode the incoming skb. The traffic that doesn't match the filter will bypass the DSA .rcv callback and be left on the master netdevice, which wasn't previously possible. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 18:19:23 +08:00
bool (*filter)(const struct sk_buff *skb, struct net_device *dev);
enum {
DSA_PORT_TYPE_UNUSED = 0,
DSA_PORT_TYPE_CPU,
DSA_PORT_TYPE_DSA,
DSA_PORT_TYPE_USER,
} type;
struct dsa_switch *ds;
unsigned int index;
const char *name;
const struct dsa_port *cpu_dp;
const char *mac;
struct device_node *dn;
unsigned int ageing_time;
bool vlan_filtering;
u8 stp_state;
struct net_device *bridge_dev;
struct devlink_port devlink_port;
struct phylink *pl;
/*
* Original copy of the master netdev ethtool_ops
*/
const struct ethtool_ops *orig_ethtool_ops;
/*
* Original copy of the master netdev net_device_ops
*/
const struct net_device_ops *orig_ndo_ops;
};
struct dsa_switch {
struct device *dev;
/*
* Parent switch tree, and switch index.
*/
struct dsa_switch_tree *dst;
unsigned int index;
/* Listener for switch fabric events */
struct notifier_block nb;
/*
* Give the switch driver somewhere to hang its private data
* structure.
*/
void *priv;
/*
* Configuration data for this switch.
*/
struct dsa_chip_data *cd;
/*
* The switch operations.
*/
const struct dsa_switch_ops *ops;
/*
* An array of which element [a] indicates which port on this
* switch should be used to send packets to that are destined
* for switch a. Can be NULL if there is only one switch chip.
*/
s8 rtable[DSA_MAX_SWITCHES];
/*
* Slave mii_bus and devices for the individual ports.
*/
u32 phys_mii_mask;
struct mii_bus *slave_mii_bus;
/* Ageing Time limits in msecs */
unsigned int ageing_time_min;
unsigned int ageing_time_max;
/* devlink used to represent this switch device */
struct devlink *devlink;
/* Number of switch port queues */
unsigned int num_tx_queues;
/* Disallow bridge core from requesting different VLAN awareness
* settings on ports if not hardware-supported
*/
bool vlan_filtering_is_global;
/* In case vlan_filtering_is_global is set, the VLAN awareness state
* should be retrieved from here and not from the per-port settings.
*/
bool vlan_filtering;
unsigned long *bitmap;
unsigned long _bitmap;
/* Dynamically allocated ports, keep last */
size_t num_ports;
struct dsa_port ports[];
};
static inline const struct dsa_port *dsa_to_port(struct dsa_switch *ds, int p)
{
return &ds->ports[p];
}
static inline bool dsa_is_unused_port(struct dsa_switch *ds, int p)
{
return dsa_to_port(ds, p)->type == DSA_PORT_TYPE_UNUSED;
}
static inline bool dsa_is_cpu_port(struct dsa_switch *ds, int p)
{
return dsa_to_port(ds, p)->type == DSA_PORT_TYPE_CPU;
}
static inline bool dsa_is_dsa_port(struct dsa_switch *ds, int p)
{
return dsa_to_port(ds, p)->type == DSA_PORT_TYPE_DSA;
}
static inline bool dsa_is_user_port(struct dsa_switch *ds, int p)
{
return dsa_to_port(ds, p)->type == DSA_PORT_TYPE_USER;
}
static inline u32 dsa_user_ports(struct dsa_switch *ds)
{
u32 mask = 0;
int p;
for (p = 0; p < ds->num_ports; p++)
if (dsa_is_user_port(ds, p))
mask |= BIT(p);
return mask;
}
/* Return the local port used to reach an arbitrary switch port */
static inline unsigned int dsa_towards_port(struct dsa_switch *ds, int device,
int port)
{
if (device == ds->index)
return port;
else
return ds->rtable[device];
}
/* Return the local port used to reach the dedicated CPU port */
static inline unsigned int dsa_upstream_port(struct dsa_switch *ds, int port)
{
const struct dsa_port *dp = dsa_to_port(ds, port);
const struct dsa_port *cpu_dp = dp->cpu_dp;
if (!cpu_dp)
return port;
return dsa_towards_port(ds, cpu_dp->ds->index, cpu_dp->index);
}
static inline bool dsa_port_is_vlan_filtering(const struct dsa_port *dp)
{
const struct dsa_switch *ds = dp->ds;
if (ds->vlan_filtering_is_global)
return ds->vlan_filtering;
else
return dp->vlan_filtering;
}
typedef int dsa_fdb_dump_cb_t(const unsigned char *addr, u16 vid,
bool is_static, void *data);
struct dsa_switch_ops {
enum dsa_tag_protocol (*get_tag_protocol)(struct dsa_switch *ds,
int port);
int (*setup)(struct dsa_switch *ds);
u32 (*get_phy_flags)(struct dsa_switch *ds, int port);
/*
* Access to the switch's PHY registers.
*/
int (*phy_read)(struct dsa_switch *ds, int port, int regnum);
int (*phy_write)(struct dsa_switch *ds, int port,
int regnum, u16 val);
/*
* Link state adjustment (called from libphy)
*/
void (*adjust_link)(struct dsa_switch *ds, int port,
struct phy_device *phydev);
void (*fixed_link_update)(struct dsa_switch *ds, int port,
struct fixed_phy_status *st);
/*
* PHYLINK integration
*/
void (*phylink_validate)(struct dsa_switch *ds, int port,
unsigned long *supported,
struct phylink_link_state *state);
int (*phylink_mac_link_state)(struct dsa_switch *ds, int port,
struct phylink_link_state *state);
void (*phylink_mac_config)(struct dsa_switch *ds, int port,
unsigned int mode,
const struct phylink_link_state *state);
void (*phylink_mac_an_restart)(struct dsa_switch *ds, int port);
void (*phylink_mac_link_down)(struct dsa_switch *ds, int port,
unsigned int mode,
phy_interface_t interface);
void (*phylink_mac_link_up)(struct dsa_switch *ds, int port,
unsigned int mode,
phy_interface_t interface,
struct phy_device *phydev);
void (*phylink_fixed_state)(struct dsa_switch *ds, int port,
struct phylink_link_state *state);
/*
* ethtool hardware statistics.
*/
void (*get_strings)(struct dsa_switch *ds, int port,
u32 stringset, uint8_t *data);
void (*get_ethtool_stats)(struct dsa_switch *ds,
int port, uint64_t *data);
int (*get_sset_count)(struct dsa_switch *ds, int port, int sset);
void (*get_ethtool_phy_stats)(struct dsa_switch *ds,
int port, uint64_t *data);
/*
* ethtool Wake-on-LAN
*/
void (*get_wol)(struct dsa_switch *ds, int port,
struct ethtool_wolinfo *w);
int (*set_wol)(struct dsa_switch *ds, int port,
struct ethtool_wolinfo *w);
/*
* ethtool timestamp info
*/
int (*get_ts_info)(struct dsa_switch *ds, int port,
struct ethtool_ts_info *ts);
/*
* Suspend and resume
*/
int (*suspend)(struct dsa_switch *ds);
int (*resume)(struct dsa_switch *ds);
/*
* Port enable/disable
*/
int (*port_enable)(struct dsa_switch *ds, int port,
struct phy_device *phy);
void (*port_disable)(struct dsa_switch *ds, int port);
/*
* Port's MAC EEE settings
*/
int (*set_mac_eee)(struct dsa_switch *ds, int port,
struct ethtool_eee *e);
int (*get_mac_eee)(struct dsa_switch *ds, int port,
struct ethtool_eee *e);
/* EEPROM access */
int (*get_eeprom_len)(struct dsa_switch *ds);
int (*get_eeprom)(struct dsa_switch *ds,
struct ethtool_eeprom *eeprom, u8 *data);
int (*set_eeprom)(struct dsa_switch *ds,
struct ethtool_eeprom *eeprom, u8 *data);
/*
* Register access.
*/
int (*get_regs_len)(struct dsa_switch *ds, int port);
void (*get_regs)(struct dsa_switch *ds, int port,
struct ethtool_regs *regs, void *p);
/*
* Bridge integration
*/
int (*set_ageing_time)(struct dsa_switch *ds, unsigned int msecs);
int (*port_bridge_join)(struct dsa_switch *ds, int port,
struct net_device *bridge);
void (*port_bridge_leave)(struct dsa_switch *ds, int port,
struct net_device *bridge);
void (*port_stp_state_set)(struct dsa_switch *ds, int port,
u8 state);
void (*port_fast_age)(struct dsa_switch *ds, int port);
int (*port_egress_floods)(struct dsa_switch *ds, int port,
bool unicast, bool multicast);
/*
* VLAN support
*/
int (*port_vlan_filtering)(struct dsa_switch *ds, int port,
bool vlan_filtering);
int (*port_vlan_prepare)(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_vlan *vlan);
void (*port_vlan_add)(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_vlan *vlan);
int (*port_vlan_del)(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_vlan *vlan);
/*
* Forwarding database
*/
int (*port_fdb_add)(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
int (*port_fdb_del)(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
int (*port_fdb_dump)(struct dsa_switch *ds, int port,
dsa_fdb_dump_cb_t *cb, void *data);
/*
* Multicast database
*/
int (*port_mdb_prepare)(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_mdb *mdb);
void (*port_mdb_add)(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_mdb *mdb);
int (*port_mdb_del)(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_mdb *mdb);
/*
* RXNFC
*/
int (*get_rxnfc)(struct dsa_switch *ds, int port,
struct ethtool_rxnfc *nfc, u32 *rule_locs);
int (*set_rxnfc)(struct dsa_switch *ds, int port,
struct ethtool_rxnfc *nfc);
/*
* TC integration
*/
int (*port_mirror_add)(struct dsa_switch *ds, int port,
struct dsa_mall_mirror_tc_entry *mirror,
bool ingress);
void (*port_mirror_del)(struct dsa_switch *ds, int port,
struct dsa_mall_mirror_tc_entry *mirror);
/*
* Cross-chip operations
*/
int (*crosschip_bridge_join)(struct dsa_switch *ds, int sw_index,
int port, struct net_device *br);
void (*crosschip_bridge_leave)(struct dsa_switch *ds, int sw_index,
int port, struct net_device *br);
/*
* PTP functionality
*/
int (*port_hwtstamp_get)(struct dsa_switch *ds, int port,
struct ifreq *ifr);
int (*port_hwtstamp_set)(struct dsa_switch *ds, int port,
struct ifreq *ifr);
bool (*port_txtstamp)(struct dsa_switch *ds, int port,
struct sk_buff *clone, unsigned int type);
bool (*port_rxtstamp)(struct dsa_switch *ds, int port,
struct sk_buff *skb, unsigned int type);
};
struct dsa_switch_driver {
struct list_head list;
const struct dsa_switch_ops *ops;
};
struct net_device *dsa_dev_to_net_device(struct device *dev);
/* Keep inline for faster access in hot path */
static inline bool netdev_uses_dsa(struct net_device *dev)
{
#if IS_ENABLED(CONFIG_NET_DSA)
return dev->dsa_ptr && dev->dsa_ptr->rcv;
#endif
return false;
}
net: dsa: Allow drivers to filter packets they can decode source port from Frames get processed by DSA and redirected to switch port net devices based on the ETH_P_XDSA multiplexed packet_type handler found by the network stack when calling eth_type_trans(). The running assumption is that once the DSA .rcv function is called, DSA is always able to decode the switch tag in order to change the skb->dev from its master. However there are tagging protocols (such as the new DSA_TAG_PROTO_SJA1105, user of DSA_TAG_PROTO_8021Q) where this assumption is not completely true, since switch tagging piggybacks on the absence of a vlan_filtering bridge. Moreover, management traffic (BPDU, PTP) for this switch doesn't rely on switch tagging, but on a different mechanism. So it would make sense to at least be able to terminate that. Having DSA receive traffic it can't decode would put it in an impossible situation: the eth_type_trans() function would invoke the DSA .rcv(), which could not change skb->dev, then eth_type_trans() would be invoked again, which again would call the DSA .rcv, and the packet would never be able to exit the DSA filter and would spiral in a loop until the whole system dies. This happens because eth_type_trans() doesn't actually look at the skb (so as to identify a potential tag) when it deems it as being ETH_P_XDSA. It just checks whether skb->dev has a DSA private pointer installed (therefore it's a DSA master) and that there exists a .rcv callback (everybody except DSA_TAG_PROTO_NONE has that). This is understandable as there are many switch tags out there, and exhaustively checking for all of them is far from ideal. The solution lies in introducing a filtering function for each tagging protocol. In the absence of a filtering function, all traffic is passed to the .rcv DSA callback. The tagging protocol should see the filtering function as a pre-validation that it can decode the incoming skb. The traffic that doesn't match the filter will bypass the DSA .rcv callback and be left on the master netdevice, which wasn't previously possible. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 18:19:23 +08:00
static inline bool dsa_can_decode(const struct sk_buff *skb,
struct net_device *dev)
{
#if IS_ENABLED(CONFIG_NET_DSA)
return !dev->dsa_ptr->filter || dev->dsa_ptr->filter(skb, dev);
#endif
return false;
}
struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n);
void dsa_unregister_switch(struct dsa_switch *ds);
int dsa_register_switch(struct dsa_switch *ds);
#ifdef CONFIG_PM_SLEEP
int dsa_switch_suspend(struct dsa_switch *ds);
int dsa_switch_resume(struct dsa_switch *ds);
#else
static inline int dsa_switch_suspend(struct dsa_switch *ds)
{
return 0;
}
static inline int dsa_switch_resume(struct dsa_switch *ds)
{
return 0;
}
#endif /* CONFIG_PM_SLEEP */
enum dsa_notifier_type {
DSA_PORT_REGISTER,
DSA_PORT_UNREGISTER,
};
struct dsa_notifier_info {
struct net_device *dev;
};
struct dsa_notifier_register_info {
struct dsa_notifier_info info; /* must be first */
struct net_device *master;
unsigned int port_number;
unsigned int switch_number;
};
static inline struct net_device *
dsa_notifier_info_to_dev(const struct dsa_notifier_info *info)
{
return info->dev;
}
#if IS_ENABLED(CONFIG_NET_DSA)
int register_dsa_notifier(struct notifier_block *nb);
int unregister_dsa_notifier(struct notifier_block *nb);
int call_dsa_notifiers(unsigned long val, struct net_device *dev,
struct dsa_notifier_info *info);
#else
static inline int register_dsa_notifier(struct notifier_block *nb)
{
return 0;
}
static inline int unregister_dsa_notifier(struct notifier_block *nb)
{
return 0;
}
static inline int call_dsa_notifiers(unsigned long val, struct net_device *dev,
struct dsa_notifier_info *info)
{
return NOTIFY_DONE;
}
#endif
/* Broadcom tag specific helpers to insert and extract queue/port number */
#define BRCM_TAG_SET_PORT_QUEUE(p, q) ((p) << 8 | q)
#define BRCM_TAG_GET_PORT(v) ((v) >> 8)
#define BRCM_TAG_GET_QUEUE(v) ((v) & 0xff)
int dsa_port_get_phy_strings(struct dsa_port *dp, uint8_t *data);
int dsa_port_get_ethtool_phy_stats(struct dsa_port *dp, uint64_t *data);
int dsa_port_get_phy_sset_count(struct dsa_port *dp);
void dsa_port_phylink_mac_change(struct dsa_switch *ds, int port, bool up);
struct dsa_tag_driver {
const struct dsa_device_ops *ops;
struct list_head list;
struct module *owner;
};
void dsa_tag_drivers_register(struct dsa_tag_driver *dsa_tag_driver_array[],
unsigned int count,
struct module *owner);
void dsa_tag_drivers_unregister(struct dsa_tag_driver *dsa_tag_driver_array[],
unsigned int count);
#define dsa_tag_driver_module_drivers(__dsa_tag_drivers_array, __count) \
static int __init dsa_tag_driver_module_init(void) \
{ \
dsa_tag_drivers_register(__dsa_tag_drivers_array, __count, \
THIS_MODULE); \
return 0; \
} \
module_init(dsa_tag_driver_module_init); \
\
static void __exit dsa_tag_driver_module_exit(void) \
{ \
dsa_tag_drivers_unregister(__dsa_tag_drivers_array, __count); \
} \
module_exit(dsa_tag_driver_module_exit)
/**
* module_dsa_tag_drivers() - Helper macro for registering DSA tag
* drivers
* @__ops_array: Array of tag driver strucutres
*
* Helper macro for DSA tag drivers which do not do anything special
* in module init/exit. Each module may only use this macro once, and
* calling it replaces module_init() and module_exit().
*/
#define module_dsa_tag_drivers(__ops_array) \
dsa_tag_driver_module_drivers(__ops_array, ARRAY_SIZE(__ops_array))
#define DSA_TAG_DRIVER_NAME(__ops) dsa_tag_driver ## _ ## __ops
/* Create a static structure we can build a linked list of dsa_tag
* drivers
*/
#define DSA_TAG_DRIVER(__ops) \
static struct dsa_tag_driver DSA_TAG_DRIVER_NAME(__ops) = { \
.ops = &__ops, \
}
/**
* module_dsa_tag_driver() - Helper macro for registering a single DSA tag
* driver
* @__ops: Single tag driver structures
*
* Helper macro for DSA tag drivers which do not do anything special
* in module init/exit. Each module may only use this macro once, and
* calling it replaces module_init() and module_exit().
*/
#define module_dsa_tag_driver(__ops) \
DSA_TAG_DRIVER(__ops); \
\
static struct dsa_tag_driver *dsa_tag_driver_array[] = { \
&DSA_TAG_DRIVER_NAME(__ops) \
}; \
module_dsa_tag_drivers(dsa_tag_driver_array)
net: Distributed Switch Architecture protocol support Distributed Switch Architecture is a protocol for managing hardware switch chips. It consists of a set of MII management registers and commands to configure the switch, and an ethernet header format to signal which of the ports of the switch a packet was received from or is intended to be sent to. The switches that this driver supports are typically embedded in access points and routers, and a typical setup with a DSA switch looks something like this: +-----------+ +-----------+ | | RGMII | | | +-------+ +------ 1000baseT MDI ("WAN") | | | 6-port +------ 1000baseT MDI ("LAN1") | CPU | | ethernet +------ 1000baseT MDI ("LAN2") | |MIImgmt| switch +------ 1000baseT MDI ("LAN3") | +-------+ w/5 PHYs +------ 1000baseT MDI ("LAN4") | | | | +-----------+ +-----------+ The switch driver presents each port on the switch as a separate network interface to Linux, polls the switch to maintain software link state of those ports, forwards MII management interface accesses to those network interfaces (e.g. as done by ethtool) to the switch, and exposes the switch's hardware statistics counters via the appropriate Linux kernel interfaces. This initial patch supports the MII management interface register layout of the Marvell 88E6123, 88E6161 and 88E6165 switch chips, and supports the "Ethertype DSA" packet tagging format. (There is no officially registered ethertype for the Ethertype DSA packet format, so we just grab a random one. The ethertype to use is programmed into the switch, and the switch driver uses the value of ETH_P_EDSA for this, so this define can be changed at any time in the future if the one we chose is allocated to another protocol or if Ethertype DSA gets its own officially registered ethertype, and everything will continue to work.) Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Nicolas Pitre <nico@marvell.com> Tested-by: Byron Bradley <byron.bbradley@gmail.com> Tested-by: Tim Ellis <tim.ellis@mac.com> Tested-by: Peter van Valderen <linux@ddcrew.com> Tested-by: Dirk Teurlings <dirk@upexia.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07 21:44:02 +08:00
#endif