Add TIPC_NL_NAME_TABLE_GET command to the new tipc netlink API.
This command supports dumping the name table of all nodes.
Netlink logical layout of name table response message:
-> name table
-> publication
-> type
-> lower
-> upper
-> scope
-> node
-> ref
-> key
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_NET_SET command to the new tipc netlink API.
This command can set the network id and network (tipc) address.
Netlink logical layout of network set message:
-> net
[ -> id ]
[ -> address ]
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_NET_GET command to the new tipc netlink API.
This command dumps the network id of the node.
Netlink logical layout of returned network data:
-> net
-> id
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_NODE_GET to the new tipc netlink API.
This command can dump the address and node status of all nodes in the
tipc cluster.
Netlink logical layout of returned node/address data:
-> node
-> address
-> up flag
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_MEDIA_SET command to the new tipc netlink API.
This command can set one or more link properties for a particular
media.
Netlink logical layout of bearer set message:
-> media
-> name
-> link properties
[ -> tolerance ]
[ -> priority ]
[ -> window ]
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_MEDIA_GET command to the new tipc netlink API.
This command supports dumping all information about all defined
media as well as getting all information about a specific media.
The information about a media includes name and link properties.
Netlink logical layout of media get response message:
-> media
-> name
-> link properties
-> tolerance
-> priority
-> window
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_LINK_RESET_STATS command to the new netlink API.
This command resets the link statistics for a particular link.
Netlink logical layout of link reset message:
-> link
-> name
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_LINK_SET to the new tipc netlink API.
This command can set one or more link properties for a particular
link.
Netlink logical layout of link set message:
-> link
-> name
-> properties
[ -> tolerance ]
[ -> priority ]
[ -> window ]
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_LINK_GET command to the new tipc netlink API.
This command supports dumping all information about all links
(including the broadcast link) or getting all information about a
specific link (not the broadcast link).
The information about a link includes name, transmission info,
properties and link statistics.
As the tipc broadcast link is special we unfortunately have to treat
it specially. It is a deliberate decision not to abstract the
broadcast link on this (API) level.
Netlink logical layout of link response message:
-> port
-> name
-> MTU
-> RX
-> TX
-> up flag
-> active flag
-> properties
-> priority
-> tolerance
-> window
-> statistics
-> rx_info
-> rx_fragments
-> rx_fragmented
-> rx_bundles
-> rx_bundled
-> tx_info
-> tx_fragments
-> tx_fragmented
-> tx_bundles
-> tx_bundled
-> msg_prof_tot
-> msg_len_cnt
-> msg_len_tot
-> msg_len_p0
-> msg_len_p1
-> msg_len_p2
-> msg_len_p3
-> msg_len_p4
-> msg_len_p5
-> msg_len_p6
-> rx_states
-> rx_probes
-> rx_nacks
-> rx_deferred
-> tx_states
-> tx_probes
-> tx_nacks
-> tx_acks
-> retransmitted
-> duplicates
-> link_congs
-> max_queue
-> avg_queue
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_PUBL_GET command to the new tipc netlink API.
This command supports dumping of all publications for a specific
socket.
Netlink logical layout of request message:
-> socket
-> reference
Netlink logical layout of response message:
-> publication
-> type
-> lower
-> upper
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_SOCK_GET command to the new tipc netlink API.
This command supports dumping of all available sockets with their
associated connection or publication(s). It could be extended to reply
with a single socket if the NLM_F_DUMP isn't set.
The information about a socket includes reference, address, connection
information / publication information.
Netlink logical layout of response message:
-> socket
-> reference
-> address
[
-> connection
-> node
-> socket
[
-> connected flag
-> type
-> instance
]
]
[
-> publication flag
]
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_BEARER_SET command to the new tipc netlink API.
This command can set one or more link properties for a particular
bearer.
Netlink logical layout of bearer set message:
-> bearer
-> name
-> link properties
[ -> tolerance ]
[ -> priority ]
[ -> window ]
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_BEARER_GET command to the new tipc netlink API.
This command supports dumping all data about all bearers or getting
all information about a specific bearer.
The information about a bearer includes name, link priorities and
domain.
Netlink logical layout of bearer get message:
-> bearer
-> name
Netlink logical layout of returned bearer information:
-> bearer
-> name
-> link properties
-> priority
-> tolerance
-> window
-> domain
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A new netlink API for tipc that can disable or enable a tipc bearer.
The new API is separated from the old API because of a bug in the
user space client (tipc-config). The problem is that older versions
of tipc-config has a very low receive limit and adding commands to
the legacy genl_opts struct causes the ctrl_getfamily() response
message to grow, subsequently breaking the tool.
The new API utilizes netlink policies for input validation. Where the
top-level netlink attributes are tipc-logical entities, like bearer.
The top level entities then contain nested attributes. In this case
a name, nested link properties and a domain.
Netlink commands implemented in this patch:
TIPC_NL_BEARER_ENABLE
TIPC_NL_BEARER_DISABLE
Netlink logical layout of bearer enable message:
-> bearer
-> name
[ -> domain ]
[
-> properties
-> priority
]
Netlink logical layout of bearer disable message:
-> bearer
-> name
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no reason to limit the amount of possible links to a
neighboring node to 2. If we have more then two bearers we can also
establish more links.
Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
Reviewed-By: Jon Maloy <jon.maloy@ericsson.com>
cc: Ying Xue <ying.xue@windriver.com>
cc: Erik Hugne <erik.hugne@ericsson.com>
cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
This encapsulates all of the skb_copy_datagram_iovec() callers
with call argument signature "skb, offset, msghdr->msg_iov, length".
When we move to iov_iters in the networking, the iov_iter object will
sit in the msghdr.
Having a helper like this means there will be less places to touch
during that transformation.
Based upon descriptions and patch from Al Viro.
Signed-off-by: David S. Miller <davem@davemloft.net>
Locking dependency detected below possible unsafe locking scenario:
CPU0 CPU1
T0: tipc_named_rcv() tipc_rcv()
T1: [grab nametble write lock]* [grab node lock]*
T2: tipc_update_nametbl() tipc_node_link_up()
T3: tipc_nodesub_subscribe() tipc_nametbl_publish()
T4: [grab node lock]* [grab nametble write lock]*
The opposite order of holding nametbl write lock and node lock on
above two different paths may result in a deadlock. If we move the
the updating of the name table after link state named out of node
lock, the reverse order of holding locks will be eliminated, and
as a result, the deadlock risk.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit ec8a2e5621 ("tipc: same receive
code path for connection protocol and data messages") we omitted the
the possiblilty that an arriving message extracted from a bundle buffer
may be a multicast message. Such messages need to be to be delivered to
the socket via a separate function, tipc_sk_mcast_rcv(). As a result,
small multicast messages arriving as members of a bundle buffer will be
silently dropped.
This commit corrects the error by considering this case in the function
tipc_link_bundle_rcv().
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One aim of commit 50100a5e39 ("tipc:
use pseudo message to wake up sockets after link congestion") was
to handle link congestion abatement in a uniform way for both unicast
and multicast transmit. However, the latter doesn't work correctly,
and has been broken since the referenced commit was applied.
If a user now sends a burst of multicast messages that is big
enough to cause broadcast link congestion, it will be put to sleep,
and not be waked up when the congestion abates as it should be.
This has two reasons. First, the flag that is used, TIPC_WAKEUP_USERS,
is set correctly, but in the wrong field. Instead of setting it in the
'action_flags' field of the arrival node struct, it is by mistake set
in the dummy node struct that is owned by the broadcast link, where it
will never tested for. Second, we cannot use the same flag for waking
up unicast and multicast users, since the function tipc_node_unlock()
needs to pick the wakeup pseudo messages to deliver from different
queues. It must hence be able to distinguish between the two cases.
This commit solves this problem by adding a new flag
TIPC_WAKEUP_BCAST_USERS, and a new function tipc_bclink_wakeup_user().
The latter is to be called by tipc_node_unlock() when the named flag,
now set in the correct field, is encountered.
v2: using explicit 'unsigned int' declaration instead of 'uint', as
per comment from David Miller.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the following sparse warnings:
sparse: symbol 'tipc_update_nametbl' was not declared. Should it be static?
Also, the function is changed to return bool upon success, rather than a
potentially freed pointer.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TIPC name table updates are distributed asynchronously in a cluster,
entailing a risk of certain race conditions. E.g., if two nodes
simultaneously issue conflicting (overlapping) publications, this may
not be detected until both publications have reached a third node, in
which case one of the publications will be silently dropped on that
node. Hence, we end up with an inconsistent name table.
In most cases this conflict is just a temporary race, e.g., one
node is issuing a publication under the assumption that a previous,
conflicting, publication has already been withdrawn by the other node.
However, because of the (rtt related) distributed update delay, this
may not yet hold true on all nodes. The symptom of this failure is a
syslog message: "tipc: Cannot publish {%u,%u,%u}, overlap error".
In this commit we add a resiliency queue at the receiving end of
the name table distributor. When insertion of an arriving publication
fails, we retain it in this queue for a short amount of time, assuming
that another update will arrive very soon and clear the conflict. If so
happens, we insert the publication, otherwise we drop it.
The (configurable) retention value defaults to 2000 ms. Knowing from
experience that the situation described above is extremely rare, there
is no risk that the queue will accumulate any large number of items.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to perform the same actions when processing deferred name
table updates, so this functionality is moved to a separate
function.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6c9808ce09 ("tipc: remove port_lock") accidentally involves
a potential bug: when tipc socket instance(tsk) is not got with given
reference number in tipc_sk_get(), tsk is set to NULL. Subsequently
we jump to exit label where to decrease socket reference counter
pointed by tsk pointer in tipc_sk_put(). However, As now tsk is NULL,
oops may happen because of touching a NULL pointer.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Erik Hugne <erik.hugne@ericsson.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We complete the merging of the port and socket layer by aggregating
the fields of struct tipc_port directly into struct tipc_sock, and
moving the combined structure into socket.c.
We also move all functions and macros that are not any longer
exposed to the rest of the stack into socket.c, and rename them
accordingly.
Despite the size of this commit, there are no functional changes.
We have only made such changes that are necessary due of the removal
of struct tipc_port.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The reference table is now 'socket aware' instead of being generic,
and has in reality become a socket internal table. In order to be
able to minimize the API exposed by the socket layer towards the rest
of the stack, we now move the reference table definitions and functions
into the file socket.c, and rename the functions accordingly.
There are no functional changes in this commit.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We move the inline functions in the file port.h to socket.c, and modify
their names accordingly.
We move struct tipc_port and some macros to socket.h.
Finally, we remove the file port.h.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In this commit, we move the remaining functions in port.c to
socket.c, and give them new names that correspond to their new
location. We then remove the file port.c.
There are only cosmetic changes to the moved functions.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In previous commits we have reduced usage of port_lock to a minimum,
and complemented it with usage of bh_lock_sock() at the remaining
locations. The purpose has been to remove this lock altogether, since
it largely duplicates the role of bh_lock_sock. We are now ready to do
this.
However, we still need to protect the BH callers from inadvertent
release of the socket while they hold a reference to it. We do this by
replacing port_lock by a combination of a rw-lock protecting the
reference table as such, and updating the socket reference counter while
the socket is referenced from BH. This technique is more standard and
comprehensible than the previous approach, and turns out to have a
positive effect on overall performance.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to make tipc_sock the only entity referencable from other
parts of the stack, we add a tipc_sock pointer instead of a tipc_port
pointer to the registry. As a consequence, we also let the function
tipc_port_lock() return a pointer to a tipc_sock instead of a tipc_port.
We keep the function's name for now, since the lock still is owned by
the port.
This is another step in the direction of eliminating port_lock, replacing
its usage with lock_sock() and bh_lock_sock().
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The functions tipc_port_get_ports() and tipc_port_reinit() scan over
all sockets/ports to access each of them. This is done by using a
dedicated linked list, 'tipc_socks' where all sockets are members. The
list is in turn protected by a spinlock, 'port_list_lock', while each
socket is locked by using port_lock at the moment of access.
In order to reduce complexity and risk of deadlock, we want to get
rid of the linked list and the accompanying spinlock.
This is what we do in this commit. Instead of the linked list, we use
the port registry to scan across the sockets. We also add usage of
bh_lock_sock() inside the scope of port_lock in both functions, as a
preparation for the complete removal of port_lock.
Finally, we move the functions from port.c to socket.c, and rename them
to tipc_sk_sock_show() and tipc_sk_reinit() repectively.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the latest changes to the socket/port layer the existence of
the functions tipc_port_init() and tipc_port_destroy() cannot be
justified. They are both called only once, from tipc_sk_create() and
tipc_sk_delete() respectively, and their functionality can better be
merged into the latter two functions.
This also entails that all remaining references to port_lock now are
made from inside socket.c, something that will make it easier to remove
this lock.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function tipc_acknowledge() is a remnant from the obsolete native
API. Currently, it grabs port_lock, before building an acknowledge
message and sending it to the peer.
Since all access to socket members now is protected by the socket lock,
it has become unnecessary to grab port_lock here.
In this commit, we remove the usage of port_lock, simplify the
function, and move it to socket.c, renaming it to tipc_sk_send_ack().
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tipc_port_connect()/tipc_port_disconnect() are remnants of the obsolete
native API. Their only task is to grab port_lock and call the functions
__tipc_port_connect()/__tipc_port_disconnect() respectively, which will
perform the actual state change.
Since socket/port exection now is single-threaded the use of port_lock
is not needed any more, so we can safely replace the two functions with
their lock-free counterparts.
In this commit, we remove the two functions. Furthermore, the contents
of __tipc_port_disconnect() is so trivial that we choose to eliminate
that function too, expanding its functionality into tipc_shutdown().
__tipc_port_connect() is simplified, moved to socket.c, and given the
more correct name tipc_sk_finish_conn(). Finally, we eliminate the
function auto_connect(), and expand its contents into filter_connect().
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tipc_port_shutdown() is a remnant from the now obsolete native
interface. As such it grabs port_lock in order to protect itself
from concurrent BH processing.
However, after the recent changes to the port/socket upcalls, sockets
are now basically single-threaded, and all execution, except the read-only
tipc_sk_timer(), is executing within the protection of lock_sock(). So
the use of port_lock is not needed here.
In this commit we eliminate the whole function, and merge it into its
only caller, tipc_shutdown().
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The last remaining BH upcall to the socket, apart for the message
reception function tipc_sk_rcv(), is the timer function.
We prefer to let this function continue executing in BH, since it only
does read-acces to semi-permanent data, but we make three changes to it:
1) We introduce a bh_lock_sock()/bh_unlock_sock() inside the scope
of port_lock. This is a preparation for replacing port_lock with
bh_lock_sock() at the locations where it is still used.
2) We move the function from port.c to socket.c, as a further step
of eliminating the port code level altogether.
3) We let it make use of the newly introduced tipc_msg_create()
function. This enables us to get rid of three context specific
functions (port_create_self_abort_msg() etc.) in port.c
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current implementation, each 'struct tipc_node' instance keeps
a linked list of those ports/sockets that are connected to the node
represented by that struct. The purpose of this is to let the node
object know which sockets to alert when it loses contact with its peer
node, i.e., which sockets need to have their connections aborted.
This entails an unwanted direct reference from the node structure
back to the port/socket structure, and a need to grab port_lock
when we have to make an upcall to the port. We want to get rid of
this unecessary BH entry point into the socket, and also eliminate
its use of port_lock.
In this commit, we instead let the node struct keep list of "connected
socket" structs, which each represents a connected socket, but is
allocated independently by the node at the moment of connection. If
the node loses contact with its peer node, the list is traversed, and
a "connection abort" message is created for each entry in the list. The
message is sent to it respective connected socket using the ordinary
data path, and the receiving socket aborts its connections upon reception
of the message.
This enables us to get rid of the direct reference from 'struct node' to
´struct port', and another unwanted BH access point to the latter.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current link implementation keeps a linked list of blocked ports/
sockets that is populated when there is link congestion. The purpose
of this is to let the link know which users to wake up when the
congestion abates.
This adds unnecessary complexity to the data structure and the code,
since it forces us to involve the link each time we want to delete
a socket. It also forces us to grab the spinlock port_lock within
the scope of node_lock. We want to get rid of this direct dependence,
as well as the deadlock hazard resulting from the usage of port_lock.
In this commit, we instead let the link keep list of a "wakeup" pseudo
messages for use in such situations. Those messages are sent to the
pending sockets via the ordinary message reception path, and wake up
the socket's owner when they are received.
This enables us to get rid of the 'waiting_ports' linked lists in struct
tipc_port that manifest this direct reference. As a consequence, we can
eliminate another BH entry into the socket, and hence the need to grab
port_lock. This is a further step in our effort to remove port_lock
altogether.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function tipc_msg_init() has turned out to be of limited value
in many cases. It take too few parameters to be usable for creating
a complete message, it makes too many assumptions about what the
message should be used for, and it does not allocate any buffer to
be returned to the caller.
Therefore, we now introduce the new function tipc_msg_create(), which
takes all the parameters needed to create a full message, and returns
a buffer of the requested size. The new function will be very useful
for the changes we will be doing in later commits in this series.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 3b4f302d85 ("tipc: eliminate
redundant locking") introduced a bug by removing the sanity check
for message importance, allowing programs to assign any value to
the msg_user field. This will mess up the packet reception logic
and may cause random link resets.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per comment from David Miller, we try to make the buffer reassembly
function more resilient to user errors than it is today.
- We check that the "*buf" parameter always is set, since this is
mandatory input.
- We ensure that *buf->next always is set to NULL before linking in
the buffer, instead of relying of the caller to have done this.
- We ensure that the "tail" pointer in the head buffer's control
block is initialized to NULL when the first fragment arrives.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warnings:
net/tipc/socket.c:545:5: warning:
symbol 'tipc_sk_proto_rcv' was not declared. Should it be static?
net/tipc/socket.c:2015:5: warning:
symbol 'tipc_ioctl' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we run broadcast packets over dual bearers/interfaces, the
current transmission code is flipping bearers between each sent
packet, with the purpose of leveraging the double bandwidth
available. The receiving bclink is resequencing the packets if
needed, so all messages are delivered upwards from the broadcast
link in the correct order, even if they may arrive in concurrent
interrupts.
However, at the moment of delivery upwards to the socket, we release
all spinlocks (bclink_lock, node_lock), so it is still possible
that arriving messages bypass each other before they reach the socket
queue.
We fix this by applying the same technique we are using for unicast
traffic. We use a link selector (i.e., the last bit of sending port
number) to ensure that messages from the same sender socket always are
sent over the same bearer. This guarantees sequential delivery between
socket pairs, which is sufficient to satisfy the protocol spec, as well
as all known user requirements.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the previous commit, we can now give the functions with temporary
names, such as tipc_link_xmit2(), tipc_msg_build2() etc., their proper
names.
There are no functional changes in this commit.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can now remove a number of functions which have become obsolete
and unreferenced through this commit series. There are no functional
changes in this commit.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In this commit, we convert the socket multicast send function to
directly call the new multicast/broadcast function (tipc_bclink_xmit2())
introduced in the previous commit. We do this instead of letting the
call go via the now obsolete tipc_port_mcast_xmit(), hence saving
a call level and some code complexity.
We also remove the initial destination lookup at the message sending
side, and replace that with an unconditional lookup at the receiving
side, including on the sending node itself. This makes the destination
lookup and message transfer more uniform than before.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We add a new broadcast link transmit function in bclink.c and a new
receive function in socket.c. The purpose is to move the branching
between external and internal destination down to the link layer,
just as we have done with unicast in earlier commits. We also make
use of the new link-independent fragmentation support that was
introduced in an earlier commit series.
This gives a shorter and simpler code path, and makes it possible
to obtain copy-free buffer delivery to all node local destination
sockets.
The new transmission code is added in parallel with the existing one,
and will be used by the socket multicast send function in the next
commit in this series.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>