Merge branch 'net-ipv6-Fix-route-append-and-replace-use-cases'

David Ahern says:

====================
net/ipv6: Fix route append and replace use cases

This patch set fixes a few append and replace uses cases for IPv6 and
adds test cases that codifies the expectations of how append and replace
are expected to work. In paricular it allows a multipath route to have
a dev-only nexthop, something Thomas tried to accomplish with commit
edd7ceb782 ("ipv6: Allow non-gateway ECMP for IPv6") which had to be
reverted because of breakage, and to replace an existing FIB entry
with a reject route.

There are a number of inconsistent and surprising aspects to the Linux
API for adding, deleting, replacing and changing FIB entries. For example,
with IPv4 NLM_F_APPEND means insert the route after any existing entries
with the same key (prefix + priority + TOS for IPv4) and NLM_F_CREATE
without the append flag inserts the new route before any existing entries.

IPv6 on the other hand attempts to guess whether a new route should be
appended to an existing one, possibly creating a multipath route, or to
add a new entry after any existing ones. This applies to both the 'append'
(NLM_F_CREATE + NLM_F_APPEND) and 'prepend' (NLM_F_CREATE only) cases
meaning for IPv6 the NLM_F_APPEND is basically ignored. This guessing
whether the route should be added to a multipath route (gateway routes)
or inserted after existing entries (non-gateway based routes) means a
multipath route can not have a dev only nexthop (potentially required in
some cases - tunnels or VRF route leaking for example) and route 'replace'
is a bit adhoc treating gateway based routes and dev-only / reject routes
differently.

This has led to frustration with developers working on routing suites
such as FRR where workarounds such as delete and add are used instead of
replace.

After this patch set there are 2 differences between IPv4 and IPv6:
1. 'ip ro prepend' = NLM_F_CREATE only
    IPv4 adds the new route before any existing ones
    IPv6 adds new route after any existing ones

2. 'ip ro append' = NLM_F_CREATE|NLM_F_APPEND
   IPv4 adds the new route after any existing ones
   IPv6 adds the nexthop to existing routes converting to multipath

For the former, there are cases where we want same prefix routes added
after existing ones (e.g., multicast, prefix routes for macvlan when used
for virtual router redundancy). Requiring the APPEND flag to add a new
route to an existing one helps here but is a slight change in behavior
since prepend with gateway routes now create a separate entry.

For the latter IPv6 behavior is preferred - appending a route for the same
prefix and metric to make a multipath route, so really IPv4 not allowing an
existing route to be updated is the limiter. This will be fixed when
nexthops become separate objects - a future patch set.

Thank you to Thomas and Ido for testing earlier versions of this set, and
to Ido for providing an update to the mlxsw driver.

Changes since RFC
- cleanup wording in test script; add comments about expected failures
  and why
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is contained in:
David S. Miller 2018-05-22 14:44:20 -04:00
commit 73bf1fc58d
5 changed files with 744 additions and 105 deletions

View File

@ -5725,6 +5725,7 @@ static void mlxsw_sp_router_fib6_event_work(struct work_struct *work)
switch (fib_work->event) {
case FIB_EVENT_ENTRY_REPLACE: /* fall through */
case FIB_EVENT_ENTRY_APPEND: /* fall through */
case FIB_EVENT_ENTRY_ADD:
replace = fib_work->event == FIB_EVENT_ENTRY_REPLACE;
err = mlxsw_sp_router_fib6_add(mlxsw_sp,
@ -5831,6 +5832,7 @@ static void mlxsw_sp_router_fib6_event(struct mlxsw_sp_fib_event_work *fib_work,
switch (fib_work->event) {
case FIB_EVENT_ENTRY_REPLACE: /* fall through */
case FIB_EVENT_ENTRY_APPEND: /* fall through */
case FIB_EVENT_ENTRY_ADD: /* fall through */
case FIB_EVENT_ENTRY_DEL:
fen6_info = container_of(info, struct fib6_entry_notifier_info,

View File

@ -66,12 +66,6 @@ static inline bool rt6_need_strict(const struct in6_addr *daddr)
(IPV6_ADDR_MULTICAST | IPV6_ADDR_LINKLOCAL | IPV6_ADDR_LOOPBACK);
}
static inline bool rt6_qualify_for_ecmp(const struct fib6_info *f6i)
{
return (f6i->fib6_flags & (RTF_GATEWAY|RTF_ADDRCONF|RTF_DYNAMIC)) ==
RTF_GATEWAY;
}
void ip6_route_input(struct sk_buff *skb);
struct dst_entry *ip6_route_input_lookup(struct net *net,
struct net_device *dev,

View File

@ -934,19 +934,19 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct fib6_info *rt,
{
struct fib6_info *leaf = rcu_dereference_protected(fn->leaf,
lockdep_is_held(&rt->fib6_table->tb6_lock));
struct fib6_info *iter = NULL;
struct fib6_info *iter = NULL, *match = NULL;
struct fib6_info __rcu **ins;
struct fib6_info __rcu **fallback_ins = NULL;
int replace = (info->nlh &&
(info->nlh->nlmsg_flags & NLM_F_REPLACE));
int append = (info->nlh &&
(info->nlh->nlmsg_flags & NLM_F_APPEND));
int add = (!info->nlh ||
(info->nlh->nlmsg_flags & NLM_F_CREATE));
int found = 0;
bool rt_can_ecmp = rt6_qualify_for_ecmp(rt);
u16 nlflags = NLM_F_EXCL;
int err;
if (info->nlh && (info->nlh->nlmsg_flags & NLM_F_APPEND))
if (append)
nlflags |= NLM_F_APPEND;
ins = &fn->leaf;
@ -968,13 +968,8 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct fib6_info *rt,
nlflags &= ~NLM_F_EXCL;
if (replace) {
if (rt_can_ecmp == rt6_qualify_for_ecmp(iter)) {
found++;
break;
}
if (rt_can_ecmp)
fallback_ins = fallback_ins ?: ins;
goto next_iter;
found++;
break;
}
if (rt6_duplicate_nexthop(iter, rt)) {
@ -989,86 +984,67 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct fib6_info *rt,
fib6_metric_set(iter, RTAX_MTU, rt->fib6_pmtu);
return -EEXIST;
}
/* If we have the same destination and the same metric,
* but not the same gateway, then the route we try to
* add is sibling to this route, increment our counter
* of siblings, and later we will add our route to the
* list.
* Only static routes (which don't have flag
* RTF_EXPIRES) are used for ECMPv6.
*
* To avoid long list, we only had siblings if the
* route have a gateway.
*/
if (rt_can_ecmp &&
rt6_qualify_for_ecmp(iter))
rt->fib6_nsiblings++;
/* first route that matches */
if (!match)
match = iter;
}
if (iter->fib6_metric > rt->fib6_metric)
break;
next_iter:
ins = &iter->fib6_next;
}
if (fallback_ins && !found) {
/* No ECMP-able route found, replace first non-ECMP one */
ins = fallback_ins;
iter = rcu_dereference_protected(*ins,
lockdep_is_held(&rt->fib6_table->tb6_lock));
found++;
}
/* Reset round-robin state, if necessary */
if (ins == &fn->leaf)
fn->rr_ptr = NULL;
/* Link this route to others same route. */
if (rt->fib6_nsiblings) {
unsigned int fib6_nsiblings;
if (append && match) {
struct fib6_info *sibling, *temp_sibling;
/* Find the first route that have the same metric */
sibling = leaf;
while (sibling) {
if (sibling->fib6_metric == rt->fib6_metric &&
rt6_qualify_for_ecmp(sibling)) {
list_add_tail(&rt->fib6_siblings,
&sibling->fib6_siblings);
break;
}
sibling = rcu_dereference_protected(sibling->fib6_next,
lockdep_is_held(&rt->fib6_table->tb6_lock));
if (rt->fib6_flags & RTF_REJECT) {
NL_SET_ERR_MSG(extack,
"Can not append a REJECT route");
return -EINVAL;
} else if (match->fib6_flags & RTF_REJECT) {
NL_SET_ERR_MSG(extack,
"Can not append to a REJECT route");
return -EINVAL;
}
rt->fib6_nsiblings = match->fib6_nsiblings;
list_add_tail(&rt->fib6_siblings, &match->fib6_siblings);
match->fib6_nsiblings++;
/* For each sibling in the list, increment the counter of
* siblings. BUG() if counters does not match, list of siblings
* is broken!
*/
fib6_nsiblings = 0;
list_for_each_entry_safe(sibling, temp_sibling,
&rt->fib6_siblings, fib6_siblings) {
&match->fib6_siblings, fib6_siblings) {
sibling->fib6_nsiblings++;
BUG_ON(sibling->fib6_nsiblings != rt->fib6_nsiblings);
fib6_nsiblings++;
BUG_ON(sibling->fib6_nsiblings != match->fib6_nsiblings);
}
BUG_ON(fib6_nsiblings != rt->fib6_nsiblings);
rt6_multipath_rebalance(temp_sibling);
rt6_multipath_rebalance(match);
}
/*
* insert node
*/
if (!replace) {
enum fib_event_type event;
if (!add)
pr_warn("NLM_F_CREATE should be set when creating new route\n");
add:
nlflags |= NLM_F_CREATE;
err = call_fib6_entry_notifiers(info->nl_net,
FIB_EVENT_ENTRY_ADD,
rt, extack);
event = append ? FIB_EVENT_ENTRY_APPEND : FIB_EVENT_ENTRY_ADD;
err = call_fib6_entry_notifiers(info->nl_net, event, rt,
extack);
if (err)
return err;
@ -1086,7 +1062,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct fib6_info *rt,
}
} else {
int nsiblings;
struct fib6_info *tmp;
if (!found) {
if (add)
@ -1101,48 +1077,57 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct fib6_info *rt,
if (err)
return err;
/* if route being replaced has siblings, set tmp to
* last one, otherwise tmp is current route. this is
* used to set fib6_next for new route
*/
if (iter->fib6_nsiblings)
tmp = list_last_entry(&iter->fib6_siblings,
struct fib6_info,
fib6_siblings);
else
tmp = iter;
/* insert new route */
atomic_inc(&rt->fib6_ref);
rcu_assign_pointer(rt->fib6_node, fn);
rt->fib6_next = iter->fib6_next;
rt->fib6_next = tmp->fib6_next;
rcu_assign_pointer(*ins, rt);
if (!info->skip_notify)
inet6_rt_notify(RTM_NEWROUTE, rt, info, NLM_F_REPLACE);
if (!(fn->fn_flags & RTN_RTINFO)) {
info->nl_net->ipv6.rt6_stats->fib_route_nodes++;
fn->fn_flags |= RTN_RTINFO;
}
nsiblings = iter->fib6_nsiblings;
iter->fib6_node = NULL;
fib6_purge_rt(iter, fn, info->nl_net);
if (rcu_access_pointer(fn->rr_ptr) == iter)
fn->rr_ptr = NULL;
fib6_info_release(iter);
if (nsiblings) {
/* delete old route */
rt = iter;
if (rt->fib6_nsiblings) {
struct fib6_info *tmp;
/* Replacing an ECMP route, remove all siblings */
ins = &rt->fib6_next;
iter = rcu_dereference_protected(*ins,
lockdep_is_held(&rt->fib6_table->tb6_lock));
while (iter) {
if (iter->fib6_metric > rt->fib6_metric)
break;
if (rt6_qualify_for_ecmp(iter)) {
*ins = iter->fib6_next;
iter->fib6_node = NULL;
fib6_purge_rt(iter, fn, info->nl_net);
if (rcu_access_pointer(fn->rr_ptr) == iter)
fn->rr_ptr = NULL;
fib6_info_release(iter);
nsiblings--;
info->nl_net->ipv6.rt6_stats->fib_rt_entries--;
} else {
ins = &iter->fib6_next;
}
iter = rcu_dereference_protected(*ins,
lockdep_is_held(&rt->fib6_table->tb6_lock));
list_for_each_entry_safe(iter, tmp, &rt->fib6_siblings,
fib6_siblings) {
iter->fib6_node = NULL;
fib6_purge_rt(iter, fn, info->nl_net);
if (rcu_access_pointer(fn->rr_ptr) == iter)
fn->rr_ptr = NULL;
fib6_info_release(iter);
rt->fib6_nsiblings--;
info->nl_net->ipv6.rt6_stats->fib_rt_entries--;
}
WARN_ON(nsiblings != 0);
}
WARN_ON(rt->fib6_nsiblings != 0);
rt->fib6_node = NULL;
fib6_purge_rt(rt, fn, info->nl_net);
if (rcu_access_pointer(fn->rr_ptr) == rt)
fn->rr_ptr = NULL;
fib6_info_release(rt);
}
return 0;

View File

@ -3791,7 +3791,7 @@ static struct fib6_info *rt6_multipath_first_sibling(const struct fib6_info *rt)
lockdep_is_held(&rt->fib6_table->tb6_lock));
while (iter) {
if (iter->fib6_metric == rt->fib6_metric &&
rt6_qualify_for_ecmp(iter))
iter->fib6_nsiblings)
return iter;
iter = rcu_dereference_protected(iter->fib6_next,
lockdep_is_held(&rt->fib6_table->tb6_lock));
@ -4381,6 +4381,7 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
*/
cfg->fc_nlinfo.nlh->nlmsg_flags &= ~(NLM_F_EXCL |
NLM_F_REPLACE);
cfg->fc_nlinfo.nlh->nlmsg_flags |= NLM_F_APPEND;
nhn++;
}

View File

@ -6,8 +6,10 @@
ret=0
VERBOSE=${VERBOSE:=0}
PAUSE_ON_FAIL=${PAUSE_ON_FAIL:=no}
TESTS="unregister down carrier nexthop ipv6_rt ipv4_rt"
VERBOSE=0
PAUSE_ON_FAIL=no
PAUSE=no
IP="ip -netns testns"
log_test()
@ -18,8 +20,10 @@ log_test()
if [ ${rc} -eq ${expected} ]; then
printf " TEST: %-60s [ OK ]\n" "${msg}"
nsuccess=$((nsuccess+1))
else
ret=1
nfail=$((nfail+1))
printf " TEST: %-60s [FAIL]\n" "${msg}"
if [ "${PAUSE_ON_FAIL}" = "yes" ]; then
echo
@ -28,6 +32,13 @@ log_test()
[ "$a" = "q" ] && exit 1
fi
fi
if [ "${PAUSE}" = "yes" ]; then
echo
echo "hit enter to continue, 'q' to quit"
read a
[ "$a" = "q" ] && exit 1
fi
}
setup()
@ -563,20 +574,649 @@ fib_nexthop_test()
}
################################################################################
#
# Tests on route add and replace
fib_test()
run_cmd()
{
if [ -n "$TEST" ]; then
eval $TEST
else
fib_unreg_test
fib_down_test
fib_carrier_test
fib_nexthop_test
local cmd="$1"
local out
local stderr="2>/dev/null"
if [ "$VERBOSE" = "1" ]; then
printf " COMMAND: $cmd\n"
stderr=
fi
out=$(eval $cmd $stderr)
rc=$?
if [ "$VERBOSE" = "1" -a -n "$out" ]; then
echo " $out"
fi
[ "$VERBOSE" = "1" ] && echo
return $rc
}
# add route for a prefix, flushing any existing routes first
# expected to be the first step of a test
add_route6()
{
local pfx="$1"
local nh="$2"
local out
if [ "$VERBOSE" = "1" ]; then
echo
echo " ##################################################"
echo
fi
run_cmd "$IP -6 ro flush ${pfx}"
[ $? -ne 0 ] && exit 1
out=$($IP -6 ro ls match ${pfx})
if [ -n "$out" ]; then
echo "Failed to flush routes for prefix used for tests."
exit 1
fi
run_cmd "$IP -6 ro add ${pfx} ${nh}"
if [ $? -ne 0 ]; then
echo "Failed to add initial route for test."
exit 1
fi
}
# add initial route - used in replace route tests
add_initial_route6()
{
add_route6 "2001:db8:104::/64" "$1"
}
check_route6()
{
local pfx="2001:db8:104::/64"
local expected="$1"
local out
local rc=0
out=$($IP -6 ro ls match ${pfx} | sed -e 's/ pref medium//')
if [ -z "${out}" ]; then
if [ "$VERBOSE" = "1" ]; then
printf "\nNo route entry found\n"
printf "Expected:\n"
printf " ${expected}\n"
fi
return 1
fi
# tricky way to convert output to 1-line without ip's
# messy '\'; this drops all extra white space
out=$(echo ${out})
if [ "${out}" != "${expected}" ]; then
rc=1
if [ "${VERBOSE}" = "1" ]; then
printf " Unexpected route entry. Have:\n"
printf " ${out}\n"
printf " Expected:\n"
printf " ${expected}\n\n"
fi
fi
return $rc
}
route_cleanup()
{
$IP li del red 2>/dev/null
$IP li del dummy1 2>/dev/null
$IP li del veth1 2>/dev/null
$IP li del veth3 2>/dev/null
cleanup &> /dev/null
}
route_setup()
{
route_cleanup
setup
[ "${VERBOSE}" = "1" ] && set -x
set -e
$IP li add red up type vrf table 101
$IP li add veth1 type veth peer name veth2
$IP li add veth3 type veth peer name veth4
$IP li set veth1 up
$IP li set veth3 up
$IP li set veth2 vrf red up
$IP li set veth4 vrf red up
$IP li add dummy1 type dummy
$IP li set dummy1 vrf red up
$IP -6 addr add 2001:db8:101::1/64 dev veth1
$IP -6 addr add 2001:db8:101::2/64 dev veth2
$IP -6 addr add 2001:db8:103::1/64 dev veth3
$IP -6 addr add 2001:db8:103::2/64 dev veth4
$IP -6 addr add 2001:db8:104::1/64 dev dummy1
$IP addr add 172.16.101.1/24 dev veth1
$IP addr add 172.16.101.2/24 dev veth2
$IP addr add 172.16.103.1/24 dev veth3
$IP addr add 172.16.103.2/24 dev veth4
$IP addr add 172.16.104.1/24 dev dummy1
set +ex
}
# assumption is that basic add of a single path route works
# otherwise just adding an address on an interface is broken
ipv6_rt_add()
{
local rc
echo
echo "IPv6 route add / append tests"
# route add same prefix - fails with EEXISTS b/c ip adds NLM_F_EXCL
add_route6 "2001:db8:104::/64" "via 2001:db8:101::2"
run_cmd "$IP -6 ro add 2001:db8:104::/64 via 2001:db8:103::2"
log_test $? 2 "Attempt to add duplicate route - gw"
# route add same prefix - fails with EEXISTS b/c ip adds NLM_F_EXCL
add_route6 "2001:db8:104::/64" "via 2001:db8:101::2"
run_cmd "$IP -6 ro add 2001:db8:104::/64 dev veth3"
log_test $? 2 "Attempt to add duplicate route - dev only"
# route add same prefix - fails with EEXISTS b/c ip adds NLM_F_EXCL
add_route6 "2001:db8:104::/64" "via 2001:db8:101::2"
run_cmd "$IP -6 ro add unreachable 2001:db8:104::/64"
log_test $? 2 "Attempt to add duplicate route - reject route"
# iproute2 prepend only sets NLM_F_CREATE
# - adds a new route; does NOT convert existing route to ECMP
add_route6 "2001:db8:104::/64" "via 2001:db8:101::2"
run_cmd "$IP -6 ro prepend 2001:db8:104::/64 via 2001:db8:103::2"
check_route6 "2001:db8:104::/64 via 2001:db8:101::2 dev veth1 metric 1024 2001:db8:104::/64 via 2001:db8:103::2 dev veth3 metric 1024"
log_test $? 0 "Add new route for existing prefix (w/o NLM_F_EXCL)"
# route append with same prefix adds a new route
# - iproute2 sets NLM_F_CREATE | NLM_F_APPEND
add_route6 "2001:db8:104::/64" "via 2001:db8:101::2"
run_cmd "$IP -6 ro append 2001:db8:104::/64 via 2001:db8:103::2"
check_route6 "2001:db8:104::/64 metric 1024 nexthop via 2001:db8:101::2 dev veth1 weight 1 nexthop via 2001:db8:103::2 dev veth3 weight 1"
log_test $? 0 "Append nexthop to existing route - gw"
add_route6 "2001:db8:104::/64" "via 2001:db8:101::2"
run_cmd "$IP -6 ro append 2001:db8:104::/64 dev veth3"
check_route6 "2001:db8:104::/64 metric 1024 nexthop via 2001:db8:101::2 dev veth1 weight 1 nexthop dev veth3 weight 1"
log_test $? 0 "Append nexthop to existing route - dev only"
# multipath route can not have a nexthop that is a reject route
add_route6 "2001:db8:104::/64" "via 2001:db8:101::2"
run_cmd "$IP -6 ro append unreachable 2001:db8:104::/64"
log_test $? 2 "Append nexthop to existing route - reject route"
# reject route can not be converted to multipath route
run_cmd "$IP -6 ro flush 2001:db8:104::/64"
run_cmd "$IP -6 ro add unreachable 2001:db8:104::/64"
run_cmd "$IP -6 ro append 2001:db8:104::/64 via 2001:db8:103::2"
log_test $? 2 "Append nexthop to existing reject route - gw"
run_cmd "$IP -6 ro flush 2001:db8:104::/64"
run_cmd "$IP -6 ro add unreachable 2001:db8:104::/64"
run_cmd "$IP -6 ro append 2001:db8:104::/64 dev veth3"
log_test $? 2 "Append nexthop to existing reject route - dev only"
# insert mpath directly
add_route6 "2001:db8:104::/64" "nexthop via 2001:db8:101::2 nexthop via 2001:db8:103::2"
check_route6 "2001:db8:104::/64 metric 1024 nexthop via 2001:db8:101::2 dev veth1 weight 1 nexthop via 2001:db8:103::2 dev veth3 weight 1"
log_test $? 0 "Add multipath route"
add_route6 "2001:db8:104::/64" "nexthop via 2001:db8:101::2 nexthop via 2001:db8:103::2"
run_cmd "$IP -6 ro add 2001:db8:104::/64 nexthop via 2001:db8:101::2 nexthop via 2001:db8:103::2"
log_test $? 2 "Attempt to add duplicate multipath route"
# insert of a second route without append but different metric
add_route6 "2001:db8:104::/64" "via 2001:db8:101::2"
run_cmd "$IP -6 ro add 2001:db8:104::/64 via 2001:db8:103::2 metric 512"
rc=$?
if [ $rc -eq 0 ]; then
run_cmd "$IP -6 ro add 2001:db8:104::/64 via 2001:db8:103::3 metric 256"
rc=$?
fi
log_test $rc 0 "Route add with different metrics"
run_cmd "$IP -6 ro del 2001:db8:104::/64 metric 512"
rc=$?
if [ $rc -eq 0 ]; then
check_route6 "2001:db8:104::/64 via 2001:db8:103::3 dev veth3 metric 256 2001:db8:104::/64 via 2001:db8:101::2 dev veth1 metric 1024"
rc=$?
fi
log_test $rc 0 "Route delete with metric"
}
ipv6_rt_replace_single()
{
# single path with single path
#
add_initial_route6 "via 2001:db8:101::2"
run_cmd "$IP -6 ro replace 2001:db8:104::/64 via 2001:db8:103::2"
check_route6 "2001:db8:104::/64 via 2001:db8:103::2 dev veth3 metric 1024"
log_test $? 0 "Single path with single path"
# single path with multipath
#
add_initial_route6 "nexthop via 2001:db8:101::2"
run_cmd "$IP -6 ro replace 2001:db8:104::/64 nexthop via 2001:db8:101::3 nexthop via 2001:db8:103::2"
check_route6 "2001:db8:104::/64 metric 1024 nexthop via 2001:db8:101::3 dev veth1 weight 1 nexthop via 2001:db8:103::2 dev veth3 weight 1"
log_test $? 0 "Single path with multipath"
# single path with reject
#
add_initial_route6 "nexthop via 2001:db8:101::2"
run_cmd "$IP -6 ro replace unreachable 2001:db8:104::/64"
check_route6 "unreachable 2001:db8:104::/64 dev lo metric 1024"
log_test $? 0 "Single path with reject route"
# single path with single path using MULTIPATH attribute
#
add_initial_route6 "via 2001:db8:101::2"
run_cmd "$IP -6 ro replace 2001:db8:104::/64 nexthop via 2001:db8:103::2"
check_route6 "2001:db8:104::/64 via 2001:db8:103::2 dev veth3 metric 1024"
log_test $? 0 "Single path with single path via multipath attribute"
# route replace fails - invalid nexthop
add_initial_route6 "via 2001:db8:101::2"
run_cmd "$IP -6 ro replace 2001:db8:104::/64 via 2001:db8:104::2"
if [ $? -eq 0 ]; then
# previous command is expected to fail so if it returns 0
# that means the test failed.
log_test 0 1 "Invalid nexthop"
else
check_route6 "2001:db8:104::/64 via 2001:db8:101::2 dev veth1 metric 1024"
log_test $? 0 "Invalid nexthop"
fi
# replace non-existent route
# - note use of change versus replace since ip adds NLM_F_CREATE
# for replace
add_initial_route6 "via 2001:db8:101::2"
run_cmd "$IP -6 ro change 2001:db8:105::/64 via 2001:db8:101::2"
log_test $? 2 "Single path - replace of non-existent route"
}
ipv6_rt_replace_mpath()
{
# multipath with multipath
add_initial_route6 "nexthop via 2001:db8:101::2 nexthop via 2001:db8:103::2"
run_cmd "$IP -6 ro replace 2001:db8:104::/64 nexthop via 2001:db8:101::3 nexthop via 2001:db8:103::3"
check_route6 "2001:db8:104::/64 metric 1024 nexthop via 2001:db8:101::3 dev veth1 weight 1 nexthop via 2001:db8:103::3 dev veth3 weight 1"
log_test $? 0 "Multipath with multipath"
# multipath with single
add_initial_route6 "nexthop via 2001:db8:101::2 nexthop via 2001:db8:103::2"
run_cmd "$IP -6 ro replace 2001:db8:104::/64 via 2001:db8:101::3"
check_route6 "2001:db8:104::/64 via 2001:db8:101::3 dev veth1 metric 1024"
log_test $? 0 "Multipath with single path"
# multipath with single
add_initial_route6 "nexthop via 2001:db8:101::2 nexthop via 2001:db8:103::2"
run_cmd "$IP -6 ro replace 2001:db8:104::/64 nexthop via 2001:db8:101::3"
check_route6 "2001:db8:104::/64 via 2001:db8:101::3 dev veth1 metric 1024"
log_test $? 0 "Multipath with single path via multipath attribute"
# multipath with reject
add_initial_route6 "nexthop via 2001:db8:101::2 nexthop via 2001:db8:103::2"
run_cmd "$IP -6 ro replace unreachable 2001:db8:104::/64"
check_route6 "unreachable 2001:db8:104::/64 dev lo metric 1024"
log_test $? 0 "Multipath with reject route"
# route replace fails - invalid nexthop 1
add_initial_route6 "nexthop via 2001:db8:101::2 nexthop via 2001:db8:103::2"
run_cmd "$IP -6 ro replace 2001:db8:104::/64 nexthop via 2001:db8:111::3 nexthop via 2001:db8:103::3"
check_route6 "2001:db8:104::/64 metric 1024 nexthop via 2001:db8:101::2 dev veth1 weight 1 nexthop via 2001:db8:103::2 dev veth3 weight 1"
log_test $? 0 "Multipath - invalid first nexthop"
# route replace fails - invalid nexthop 2
add_initial_route6 "nexthop via 2001:db8:101::2 nexthop via 2001:db8:103::2"
run_cmd "$IP -6 ro replace 2001:db8:104::/64 nexthop via 2001:db8:101::3 nexthop via 2001:db8:113::3"
check_route6 "2001:db8:104::/64 metric 1024 nexthop via 2001:db8:101::2 dev veth1 weight 1 nexthop via 2001:db8:103::2 dev veth3 weight 1"
log_test $? 0 "Multipath - invalid second nexthop"
# multipath non-existent route
add_initial_route6 "nexthop via 2001:db8:101::2 nexthop via 2001:db8:103::2"
run_cmd "$IP -6 ro change 2001:db8:105::/64 nexthop via 2001:db8:101::3 nexthop via 2001:db8:103::3"
log_test $? 2 "Multipath - replace of non-existent route"
}
ipv6_rt_replace()
{
echo
echo "IPv6 route replace tests"
ipv6_rt_replace_single
ipv6_rt_replace_mpath
}
ipv6_route_test()
{
route_setup
ipv6_rt_add
ipv6_rt_replace
route_cleanup
}
# add route for a prefix, flushing any existing routes first
# expected to be the first step of a test
add_route()
{
local pfx="$1"
local nh="$2"
local out
if [ "$VERBOSE" = "1" ]; then
echo
echo " ##################################################"
echo
fi
run_cmd "$IP ro flush ${pfx}"
[ $? -ne 0 ] && exit 1
out=$($IP ro ls match ${pfx})
if [ -n "$out" ]; then
echo "Failed to flush routes for prefix used for tests."
exit 1
fi
run_cmd "$IP ro add ${pfx} ${nh}"
if [ $? -ne 0 ]; then
echo "Failed to add initial route for test."
exit 1
fi
}
# add initial route - used in replace route tests
add_initial_route()
{
add_route "172.16.104.0/24" "$1"
}
check_route()
{
local pfx="172.16.104.0/24"
local expected="$1"
local out
local rc=0
out=$($IP ro ls match ${pfx})
if [ -z "${out}" ]; then
if [ "$VERBOSE" = "1" ]; then
printf "\nNo route entry found\n"
printf "Expected:\n"
printf " ${expected}\n"
fi
return 1
fi
# tricky way to convert output to 1-line without ip's
# messy '\'; this drops all extra white space
out=$(echo ${out})
if [ "${out}" != "${expected}" ]; then
rc=1
if [ "${VERBOSE}" = "1" ]; then
printf " Unexpected route entry. Have:\n"
printf " ${out}\n"
printf " Expected:\n"
printf " ${expected}\n\n"
fi
fi
return $rc
}
# assumption is that basic add of a single path route works
# otherwise just adding an address on an interface is broken
ipv4_rt_add()
{
local rc
echo
echo "IPv4 route add / append tests"
# route add same prefix - fails with EEXISTS b/c ip adds NLM_F_EXCL
add_route "172.16.104.0/24" "via 172.16.101.2"
run_cmd "$IP ro add 172.16.104.0/24 via 172.16.103.2"
log_test $? 2 "Attempt to add duplicate route - gw"
# route add same prefix - fails with EEXISTS b/c ip adds NLM_F_EXCL
add_route "172.16.104.0/24" "via 172.16.101.2"
run_cmd "$IP ro add 172.16.104.0/24 dev veth3"
log_test $? 2 "Attempt to add duplicate route - dev only"
# route add same prefix - fails with EEXISTS b/c ip adds NLM_F_EXCL
add_route "172.16.104.0/24" "via 172.16.101.2"
run_cmd "$IP ro add unreachable 172.16.104.0/24"
log_test $? 2 "Attempt to add duplicate route - reject route"
# iproute2 prepend only sets NLM_F_CREATE
# - adds a new route; does NOT convert existing route to ECMP
add_route "172.16.104.0/24" "via 172.16.101.2"
run_cmd "$IP ro prepend 172.16.104.0/24 via 172.16.103.2"
check_route "172.16.104.0/24 via 172.16.103.2 dev veth3 172.16.104.0/24 via 172.16.101.2 dev veth1"
log_test $? 0 "Add new nexthop for existing prefix"
# route append with same prefix adds a new route
# - iproute2 sets NLM_F_CREATE | NLM_F_APPEND
add_route "172.16.104.0/24" "via 172.16.101.2"
run_cmd "$IP ro append 172.16.104.0/24 via 172.16.103.2"
check_route "172.16.104.0/24 via 172.16.101.2 dev veth1 172.16.104.0/24 via 172.16.103.2 dev veth3"
log_test $? 0 "Append nexthop to existing route - gw"
add_route "172.16.104.0/24" "via 172.16.101.2"
run_cmd "$IP ro append 172.16.104.0/24 dev veth3"
check_route "172.16.104.0/24 via 172.16.101.2 dev veth1 172.16.104.0/24 dev veth3 scope link"
log_test $? 0 "Append nexthop to existing route - dev only"
add_route "172.16.104.0/24" "via 172.16.101.2"
run_cmd "$IP ro append unreachable 172.16.104.0/24"
check_route "172.16.104.0/24 via 172.16.101.2 dev veth1 unreachable 172.16.104.0/24"
log_test $? 0 "Append nexthop to existing route - reject route"
run_cmd "$IP ro flush 172.16.104.0/24"
run_cmd "$IP ro add unreachable 172.16.104.0/24"
run_cmd "$IP ro append 172.16.104.0/24 via 172.16.103.2"
check_route "unreachable 172.16.104.0/24 172.16.104.0/24 via 172.16.103.2 dev veth3"
log_test $? 0 "Append nexthop to existing reject route - gw"
run_cmd "$IP ro flush 172.16.104.0/24"
run_cmd "$IP ro add unreachable 172.16.104.0/24"
run_cmd "$IP ro append 172.16.104.0/24 dev veth3"
check_route "unreachable 172.16.104.0/24 172.16.104.0/24 dev veth3 scope link"
log_test $? 0 "Append nexthop to existing reject route - dev only"
# insert mpath directly
add_route "172.16.104.0/24" "nexthop via 172.16.101.2 nexthop via 172.16.103.2"
check_route "172.16.104.0/24 nexthop via 172.16.101.2 dev veth1 weight 1 nexthop via 172.16.103.2 dev veth3 weight 1"
log_test $? 0 "add multipath route"
add_route "172.16.104.0/24" "nexthop via 172.16.101.2 nexthop via 172.16.103.2"
run_cmd "$IP ro add 172.16.104.0/24 nexthop via 172.16.101.2 nexthop via 172.16.103.2"
log_test $? 2 "Attempt to add duplicate multipath route"
# insert of a second route without append but different metric
add_route "172.16.104.0/24" "via 172.16.101.2"
run_cmd "$IP ro add 172.16.104.0/24 via 172.16.103.2 metric 512"
rc=$?
if [ $rc -eq 0 ]; then
run_cmd "$IP ro add 172.16.104.0/24 via 172.16.103.3 metric 256"
rc=$?
fi
log_test $rc 0 "Route add with different metrics"
run_cmd "$IP ro del 172.16.104.0/24 metric 512"
rc=$?
if [ $rc -eq 0 ]; then
check_route "172.16.104.0/24 via 172.16.101.2 dev veth1 172.16.104.0/24 via 172.16.103.3 dev veth3 metric 256"
rc=$?
fi
log_test $rc 0 "Route delete with metric"
}
ipv4_rt_replace_single()
{
# single path with single path
#
add_initial_route "via 172.16.101.2"
run_cmd "$IP ro replace 172.16.104.0/24 via 172.16.103.2"
check_route "172.16.104.0/24 via 172.16.103.2 dev veth3"
log_test $? 0 "Single path with single path"
# single path with multipath
#
add_initial_route "nexthop via 172.16.101.2"
run_cmd "$IP ro replace 172.16.104.0/24 nexthop via 172.16.101.3 nexthop via 172.16.103.2"
check_route "172.16.104.0/24 nexthop via 172.16.101.3 dev veth1 weight 1 nexthop via 172.16.103.2 dev veth3 weight 1"
log_test $? 0 "Single path with multipath"
# single path with reject
#
add_initial_route "nexthop via 172.16.101.2"
run_cmd "$IP ro replace unreachable 172.16.104.0/24"
check_route "unreachable 172.16.104.0/24"
log_test $? 0 "Single path with reject route"
# single path with single path using MULTIPATH attribute
#
add_initial_route "via 172.16.101.2"
run_cmd "$IP ro replace 172.16.104.0/24 nexthop via 172.16.103.2"
check_route "172.16.104.0/24 via 172.16.103.2 dev veth3"
log_test $? 0 "Single path with single path via multipath attribute"
# route replace fails - invalid nexthop
add_initial_route "via 172.16.101.2"
run_cmd "$IP ro replace 172.16.104.0/24 via 2001:db8:104::2"
if [ $? -eq 0 ]; then
# previous command is expected to fail so if it returns 0
# that means the test failed.
log_test 0 1 "Invalid nexthop"
else
check_route "172.16.104.0/24 via 172.16.101.2 dev veth1"
log_test $? 0 "Invalid nexthop"
fi
# replace non-existent route
# - note use of change versus replace since ip adds NLM_F_CREATE
# for replace
add_initial_route "via 172.16.101.2"
run_cmd "$IP ro change 172.16.105.0/24 via 172.16.101.2"
log_test $? 2 "Single path - replace of non-existent route"
}
ipv4_rt_replace_mpath()
{
# multipath with multipath
add_initial_route "nexthop via 172.16.101.2 nexthop via 172.16.103.2"
run_cmd "$IP ro replace 172.16.104.0/24 nexthop via 172.16.101.3 nexthop via 172.16.103.3"
check_route "172.16.104.0/24 nexthop via 172.16.101.3 dev veth1 weight 1 nexthop via 172.16.103.3 dev veth3 weight 1"
log_test $? 0 "Multipath with multipath"
# multipath with single
add_initial_route "nexthop via 172.16.101.2 nexthop via 172.16.103.2"
run_cmd "$IP ro replace 172.16.104.0/24 via 172.16.101.3"
check_route "172.16.104.0/24 via 172.16.101.3 dev veth1"
log_test $? 0 "Multipath with single path"
# multipath with single
add_initial_route "nexthop via 172.16.101.2 nexthop via 172.16.103.2"
run_cmd "$IP ro replace 172.16.104.0/24 nexthop via 172.16.101.3"
check_route "172.16.104.0/24 via 172.16.101.3 dev veth1"
log_test $? 0 "Multipath with single path via multipath attribute"
# multipath with reject
add_initial_route "nexthop via 172.16.101.2 nexthop via 172.16.103.2"
run_cmd "$IP ro replace unreachable 172.16.104.0/24"
check_route "unreachable 172.16.104.0/24"
log_test $? 0 "Multipath with reject route"
# route replace fails - invalid nexthop 1
add_initial_route "nexthop via 172.16.101.2 nexthop via 172.16.103.2"
run_cmd "$IP ro replace 172.16.104.0/24 nexthop via 172.16.111.3 nexthop via 172.16.103.3"
check_route "172.16.104.0/24 nexthop via 172.16.101.2 dev veth1 weight 1 nexthop via 172.16.103.2 dev veth3 weight 1"
log_test $? 0 "Multipath - invalid first nexthop"
# route replace fails - invalid nexthop 2
add_initial_route "nexthop via 172.16.101.2 nexthop via 172.16.103.2"
run_cmd "$IP ro replace 172.16.104.0/24 nexthop via 172.16.101.3 nexthop via 172.16.113.3"
check_route "172.16.104.0/24 nexthop via 172.16.101.2 dev veth1 weight 1 nexthop via 172.16.103.2 dev veth3 weight 1"
log_test $? 0 "Multipath - invalid second nexthop"
# multipath non-existent route
add_initial_route "nexthop via 172.16.101.2 nexthop via 172.16.103.2"
run_cmd "$IP ro change 172.16.105.0/24 nexthop via 172.16.101.3 nexthop via 172.16.103.3"
log_test $? 2 "Multipath - replace of non-existent route"
}
ipv4_rt_replace()
{
echo
echo "IPv4 route replace tests"
ipv4_rt_replace_single
ipv4_rt_replace_mpath
}
ipv4_route_test()
{
route_setup
ipv4_rt_add
ipv4_rt_replace
route_cleanup
}
################################################################################
# usage
usage()
{
cat <<EOF
usage: ${0##*/} OPTS
-t <test> Test(s) to run (default: all)
(options: $TESTS)
-p Pause on fail
-P Pause after each test before cleanup
-v verbose mode (show commands and output)
EOF
}
################################################################################
# main
while getopts :t:pPhv o
do
case $o in
t) TESTS=$OPTARG;;
p) PAUSE_ON_FAIL=yes;;
P) PAUSE=yes;;
v) VERBOSE=$(($VERBOSE + 1));;
h) usage; exit 0;;
*) usage; exit 1;;
esac
done
PEER_CMD="ip netns exec ${PEER_NS}"
# make sure we don't pause twice
[ "${PAUSE}" = "yes" ] && PAUSE_ON_FAIL=no
if [ "$(id -u)" -ne 0 ];then
echo "SKIP: Need root privileges"
exit 0
@ -596,6 +1236,23 @@ fi
# start clean
cleanup &> /dev/null
fib_test
for t in $TESTS
do
case $t in
fib_unreg_test|unregister) fib_unreg_test;;
fib_down_test|down) fib_down_test;;
fib_carrier_test|carrier) fib_carrier_test;;
fib_nexthop_test|nexthop) fib_nexthop_test;;
ipv6_route_test|ipv6_rt) ipv6_route_test;;
ipv4_route_test|ipv4_rt) ipv4_route_test;;
help) echo "Test names: $TESTS"; exit 0;;
esac
done
if [ "$TESTS" != "none" ]; then
printf "\nTests passed: %3d\n" ${nsuccess}
printf "Tests failed: %3d\n" ${nfail}
fi
exit $ret