linux

Commit Graph

Author	SHA1	Message	Date
Chuhong Yuan	a29e1012c1	RDMA/uverbs: Add a check for uverbs_attr_get to uverbs_copy_to_struct_or_zero All current callers for uverbs_copy_to_struct_or_zero() already check that the attribute exists, but it make sense to verify the result like the other functions do. Link: https://lore.kernel.org/r/20191018081533.8544-1-hslester96@gmail.com Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-10-22 16:10:03 -03:00
Alexander Potapenko	6471384af2	mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options Patch series "add init_on_alloc/init_on_free boot options", v10. Provide init_on_alloc and init_on_free boot options. These are aimed at preventing possible information leaks and making the control-flow bugs that depend on uninitialized values more deterministic. Enabling either of the options guarantees that the memory returned by the page allocator and SL[AU]B is initialized with zeroes. SLOB allocator isn't supported at the moment, as its emulation of kmem caches complicates handling of SLAB_TYPESAFE_BY_RCU caches correctly. Enabling init_on_free also guarantees that pages and heap objects are initialized right after they're freed, so it won't be possible to access stale data by using a dangling pointer. As suggested by Michal Hocko, right now we don't let the heap users to disable initialization for certain allocations. There's not enough evidence that doing so can speed up real-life cases, and introducing ways to opt-out may result in things going out of control. This patch (of 2): The new options are needed to prevent possible information leaks and make control-flow bugs that depend on uninitialized values more deterministic. This is expected to be on-by-default on Android and Chrome OS. And it gives the opportunity for anyone else to use it under distros too via the boot args. (The init_on_free feature is regularly requested by folks where memory forensics is included in their threat models.) init_on_alloc=1 makes the kernel initialize newly allocated pages and heap objects with zeroes. Initialization is done at allocation time at the places where checks for __GFP_ZERO are performed. init_on_free=1 makes the kernel initialize freed pages and heap objects with zeroes upon their deletion. This helps to ensure sensitive data doesn't leak via use-after-free accesses. Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator returns zeroed memory. The two exceptions are slab caches with constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never zero-initialized to preserve their semantics. Both init_on_alloc and init_on_free default to zero, but those defaults can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and CONFIG_INIT_ON_FREE_DEFAULT_ON. If either SLUB poisoning or page poisoning is enabled, those options take precedence over init_on_alloc and init_on_free: initialization is only applied to unpoisoned allocations. Slowdown for the new features compared to init_on_free=0, init_on_alloc=0: hackbench, init_on_free=1: +7.62% sys time (st.err 0.74%) hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%) Linux build with -j12, init_on_free=1: +8.38% wall time (st.err 0.39%) Linux build with -j12, init_on_free=1: +24.42% sys time (st.err 0.52%) Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%) Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%) The slowdown for init_on_free=0, init_on_alloc=0 compared to the baseline is within the standard error. The new features are also going to pave the way for hardware memory tagging (e.g. arm64's MTE), which will require both on_alloc and on_free hooks to set the tags for heap objects. With MTE, tagging will have the same cost as memory initialization. Although init_on_free is rather costly, there are paranoid use-cases where in-memory data lifetime is desired to be minimized. There are various arguments for/against the realism of the associated threat models, but given that we'll need the infrastructure for MTE anyway, and there are people who want wipe-on-free behavior no matter what the performance cost, it seems reasonable to include it in this series. [glider@google.com: v8] Link: http://lkml.kernel.org/r/20190626121943.131390-2-glider@google.com [glider@google.com: v9] Link: http://lkml.kernel.org/r/20190627130316.254309-2-glider@google.com [glider@google.com: v10] Link: http://lkml.kernel.org/r/20190628093131.199499-2-glider@google.com Link: http://lkml.kernel.org/r/20190617151050.92663-2-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.cz> [page and dmapool parts Acked-by: James Morris <jamorris@linux.microsoft.com>] Cc: Christoph Lameter <cl@linux.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Sandeep Patil <sspatil@android.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Jann Horn <jannh@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-07-12 11:05:46 -07:00
Gal Pressman	f89adedaf3	RDMA/uverbs: Initialize udata struct on destroy flows Cited commit introduced the udata parameter to different destroy flows but the uapi method definition does not have udata (i.e has_udata flag is not set). As a result, an uninitialized udata struct is being passed down to the driver callbacks. Fix that by clearing the driver udata even in cases where has_udata flag is not set. Fixes: `c4367a2635` ("IB: Pass uverbs_attr_bundle down ib_x destroy path") Cc: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Co-developed-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-05-02 17:07:02 -03:00
Shamir Rabinovitch	a6a3797df2	IB: Pass uverbs_attr_bundle down uobject destroy path Pass uverbs_attr_bundle down the uobject destroy path. The next patch will use this to eliminate the dependecy of the drivers in ib_x->uobject pointers. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-04-01 14:55:36 -03:00
Shamir Rabinovitch	70f06b26f0	IB: ucontext should be set properly for all cmd & ioctl paths the Attempt to use the below commit to initialize the ucontext for the uobject destroy path has shown that the below commit is incomplete. Parts were reverted and the ucontext set up in the uverbs_attr_bundle was moved to rdma_lookup_get_uobject which is called from the uobj_get_XXX macros and rdma_alloc_begin_uobject which is called when uobject is created. Fixes: `3d9dfd0603` ("IB/uverbs: Add ib_ucontext to uverbs_attr_bundle sent from ioctl and cmd flows") Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-04-01 14:55:03 -03:00
Shamir Rabinovitch	3d9dfd0603	IB/uverbs: Add ib_ucontext to uverbs_attr_bundle sent from ioctl and cmd flows Add ib_ucontext to the uverbs_attr_bundle sent down the iocl and cmd flows as soon as the flow has ib_uobject. In addition, remove rdma_get_ucontext helper function that is only used by ib_umem_get. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-02-15 11:16:21 -07:00
Jason Gunthorpe	d6f4a21f30	RDMA/uverbs: Mark ioctl responses with UVERBS_ATTR_F_VALID_OUTPUT When the ioctl interface for the write commands was introduced it did not mark the core response with UVERBS_ATTR_F_VALID_OUTPUT. This causes rdma-core in userspace to not mark the buffers as written for valgrind. Along the same lines it turns out we have always missed marking the driver data. Fixing both of these makes valgrind work properly with rdma-core and ioctl. Fixes: `4785860e04` ("RDMA/uverbs: Implement an ioctl that can call write and write_ex handlers") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2019-01-14 14:02:22 -07:00
Michael Guralnik	2e8039c656	IB/core: uverbs copy to struct or zero helper Add a helper to zero fill fields before copying data to UVERBS_ATTR_STRUCT. As UVERBS_ATTR_STRUCT can be used as an extensible struct, we want to make sure that if the user supplies us with a struct that has new fields that we are not aware of, we return them zeroed to the user. This helper should be used when using UVERBS_ATTR_STRUCT for an extendable data structure and there is a need to make sure that extended members of the struct, that the kernel doesn't handle, are returned zeroed to the user. This is needed due to the fact that UVERBS_ATTR_STRUCT allows non-zero values for members after 'last' member. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-12-20 15:18:18 -07:00
Jason Gunthorpe	4785860e04	RDMA/uverbs: Implement an ioctl that can call write and write_ex handlers Now that the handlers do not process their own udata we can make a sensible ioctl that wrappers them. The ioctl follows the same format as the write_ex() and has the user explicitly specify the core and driver in/out opaque structures and a command number. This works for all forms of write commands. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-12-18 14:12:48 -05:00
Jason Gunthorpe	07f05f40d9	RDMA/uverbs: Use uverbs_attr_bundle to pass udata for ioctl() Have the core code initialize the driver_udata if the method has a udata description. This is done using the same create_udata the handler was supposed to call. This makes ioctl consistent with the write and write_ex paths. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2018-11-26 16:48:07 -07:00
Jason Gunthorpe	15a1b4becb	RDMA/uverbs: Do not pass ib_uverbs_file to ioctl methods The uverbs_attr_bundle already contains this pointer, and most methods don't actually need it. Get rid of the redundant function argument. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2018-11-26 16:48:07 -07:00
Jason Gunthorpe	e73798f20e	RDMA/uverbs: Fix RCU annotation for radix slot deference The uapi radix tree is a write-once data structure protected by kref. Once we get to the ioctl() fop it is not possible for anything else to be writing to it, so the access should use rcu_dereference_protected. Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-10-03 16:01:40 -06:00
Guy Levi	70cd20aed0	IB/uverbs: Add IDRs array attribute type to ioctl() interface Methods sometimes need to get a flexible set of IDRs and not a strict set as can be achieved today by the conventional IDR attribute. Add a new IDRS_ARRAY attribute to the generic uverbs ioctl layer. IDRS_ARRAY points to array of idrs of the same object type and same access rights, only write and read are supported. Signed-off-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>`` Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-09-11 09:28:06 -06:00
Mark Bloch	0953fffec9	RDMA/uverbs: Add UVERBS_ATTR_CONST_IN to the specs language This makes it clear and safe to access constants passed in from user space. We define a consistent ABI of u64 for all constants, and verify that the data passed in can be represented by the type the user supplies. The expectation is this will always be used with an enum declaring the constant values, and the user will use the enum type as input to the accessor. To retrieve the attribute value we introduce two helper calls - one standard which may fail if attribute is not valid and one where caller can provide a default value which will be used in case the attribute is not valid (useful when attribute is optional). Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2018-09-05 15:14:58 -06:00
Jason Gunthorpe	4ce719f846	IB/uverbs: Do not check for device disassociation during ioctl Now that the ioctl path and uobjects are converted to use uverbs_api, it is now safe to remove the disassociation protection from the common ioctl code. This completes the work to make destroy functions continue to work even after device disassociation. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-08-13 09:17:19 -06:00
Jason Gunthorpe	3a863577a7	IB/uverbs: Use uverbs_api to unmarshal ioctl commands Convert the ioctl method syscall path to use the uverbs_api data structures. The new uapi structure includes all the same information, just in a different and more optimal way. - Use attr_bkey instead of 2 level radix trees for everything related to attributes. This includes the attribute storage, presence, and detection of missing mandatory attributes. - Avoid iterating over all attribute storage at finish, instead use find_first_bit with the attr_bkey to locate only those attrs that need cleanup. - Organize things to always run, and always rely on, cleanup. This avoids a bunch of tricky error unwind cases. - Locate the method using the radix tree, and locate the attributes using a very efficient incremental radix tree lookup - Use the precomputed destroy_bkey to handle uobject destruction - Use the precomputed allocation sizes and precomputed 'need_stack' to avoid maths in the fast path. This is optimal if userspace does not pass (many) unsupported attributes. Overall this results in much better codegen for the attribute accessors, everything is now stored in bitmaps or linear arrays indexed by attr_bkey. The compiler can compute attr_bkey values at compile time for all method attributes, meaning things like uverbs_attr_is_valid() now compile into single instruction bit tests. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-08-13 09:17:16 -06:00
Jason Gunthorpe	461bb2eee4	IB/uverbs: Add a simple allocator to uverbs_attr_bundle This is similar in spirit to devm, it keeps track of any allocations linked to this method call and ensures they are all freed when the method exits. Further, if there is space in the internal/onstack buffer then the allocator will hand out that memory and avoid an expensive call to kalloc/kfree in the syscall path. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>	2018-08-13 09:16:08 -06:00
Jason Gunthorpe	6a1f444fef	IB/uverbs: Remove the ib_uverbs_attr pointer from each attr Memory in the bundle is valuable, do not waste it holding an 8 byte pointer for the rare case of writing to a PTR_OUT. We can compute the pointer by storing a small 1 byte array offset and the base address of the uattr memory in the bundle private memory. This also means we can access the kernel's copy of the ib_uverbs_attr, so drop the copy of flags as well. Since the uattr base should be private bundle information this also de-inlines the already too big uverbs_copy_to inline and moves create_udata into uverbs_ioctl.c so they can see the private struct definition. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>	2018-08-10 16:06:24 -06:00
Jason Gunthorpe	4b3dd2bbf0	IB/uverbs: Provide implementation private memory for the uverbs_attr_bundle This already existed as the anonymous 'ctx' structure, but this was not really a useful form. Hoist this struct into bundle_priv and rework the internal things to use it instead. Move a bunch of the processing internal state into the priv and reduce the excessive use of function arguments. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>	2018-08-10 16:06:24 -06:00
Jason Gunthorpe	6b0d08f4a2	IB/uverbs: Use uverbs_api to manage the object type inside the uobject Currently the struct uverbs_obj_type stored in the ib_uobject is part of the .rodata segment of the module that defines the object. This is a problem if drivers define new uapi objects as we will be left with a dangling pointer after device disassociation. Switch the uverbs_obj_type for struct uverbs_api_object, which is allocated memory that is part of the uverbs_api and is guaranteed to always exist. Further this moves the 'type_class' into this memory which means access to the IDR/FD function pointers is also guaranteed. Drivers cannot define new types. This makes it safe to continue to use all uobjects, including driver defined ones, after disassociation. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-08-10 16:06:24 -06:00
Jason Gunthorpe	922983c2a1	IB/uverbs: Fix reading of 32 bit flags This is missing a zeroing of the high bits of flags, and is also not correct for big endian machines. Properly zero extend the 32 bit flags into the 64 bit stack variable. Reported-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Fixes: `bccd06223f` ("IB/uverbs: Add UVERBS_ATTR_FLAGS_IN to the specs language") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>	2018-08-09 15:46:07 -06:00
Jason Gunthorpe	e83f0ecdc4	IB/uverbs: Do not pass struct ib_device to the ioctl methods This does the same as the patch before, except for ioctl. The rules are the same, but for the ioctl methods the core code handles setting up the uobject. - Retrieve the ib_dev from the uobject->context->device. This is safe under ioctl as the core has already done rdma_alloc_begin_uobject and so CREATE calls are entirely protected by the rwsem. - Retrieve the ib_dev from uobject->object - Call ib_uverbs_get_ucontext() Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-08-01 14:55:48 -06:00
Jason Gunthorpe	7452a3c745	IB/uverbs: Allow RDMA_REMOVE_DESTROY to work concurrently with disassociate After all the recent structural changes this is now straightfoward, hoist the hw_destroy_rwsem up out of rdma_destroy_explicit and wrap it around the uobject write lock as well as the destroy. This is necessary as obtaining a write lock concurrently with uverbs_destroy_ufile_hw() will cause malfunction. After this change none of the destroy callbacks require the disassociate_srcu lock to be correct. This requires introducing a new lookup mode, UVERBS_LOOKUP_DESTROY as the IOCTL interface needs to hold an unlocked kref until all command verification is completed. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-08-01 14:55:48 -06:00
Jason Gunthorpe	aa72c9a5f9	IB/uverbs: Remove rdma_explicit_destroy() from the ioctl methods The core code will destroy the HW object on behalf of the method, if the method provides an implementation it must simply copy data from the stub uobj into the response. Destroy methods cannot touch the HW object. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-08-01 14:55:37 -06:00
Jason Gunthorpe	bccd06223f	IB/uverbs: Add UVERBS_ATTR_FLAGS_IN to the specs language This clearly indicates that the input is a bitwise combination of values in an enum, and identifies which enum contains the definition of the bits. Special accessors are provided that handle the mandatory validation of the allowed bits and enforce the correct type for bitwise flags. If we had introduced this at the start then the kabi would have uniformly used u64 data to pass flags, however today there is a mixture of u64 and u32 flags. All places are converted to accept both sizes and the accessor fixes it. This allows all existing flags to grow to u64 in future without any hassle. Finally all flags are, by definition, optional. If flags are not passed the accessor does not fail, but provides a value of zero. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>	2018-07-30 20:23:29 -06:00
Jason Gunthorpe	22fa27fbc6	IB/uverbs: Fix locking around struct ib_uverbs_file ucontext We have a parallel unlocked reader and writer with ib_uverbs_get_context() vs everything else, and nothing guarantees this works properly. Audit and fix all of the places that access ucontext to use one of the following locking schemes: - Call ib_uverbs_get_ucontext() under SRCU and check for failure - Access the ucontext through an struct ib_uobject context member while holding a READ or WRITE lock on the uobject. This value cannot be NULL and has no race. - Hold the ucontext_lock and check for ufile->ucontext !NULL This also re-implements ib_uverbs_get_ucontext() in a way that is safe against concurrent ib_uverbs_get_context() and disassociation. As a side effect, every access to ucontext in the commands is via ib_uverbs_get_context() with an error check, or via the uobject, so there is no longer any need for the core code to check ucontext on every command call. These checks are also removed. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-07-25 14:21:46 -06:00
Jason Gunthorpe	1250c3048c	IB/uverbs: Handle IDR and FD types without truncation Our ABI for write() uses a s32 for FDs and a u32 for IDRs, but internally we ended up implicitly casting these ABI values into an 'int'. For ioctl() we use a s64 for FDs and a u64 for IDRs, again casting to an int. The various casts to int are all missing range checks which can cause userspace values that should be considered invalid to be accepted. Fix this by making the generic lookup routine accept a s64, which does not truncate the write API's u32/s32 or the ioctl API's s64. Then push the detailed range checking down to the actual type implementations to be shared by both interfaces. Finally, change the copy of the uobj->id to sign extend into a s64, so eg, if we ever wish to return a negative value for a FD it is carried properly. This ensures that userspace values are never weirdly interpreted due to the various trunctations and everything that is really out of range gets an EINVAL. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-07-25 14:21:21 -06:00
Jason Gunthorpe	6ef1c82821	IB/uverbs: Replace ib_ucontext with ib_uverbs_file in core function calls The correct handle to refer to the idr/etc is ib_uverbs_file, revise all the core APIs to use this instead. The user API are left as wrappers that automatically convert a ucontext to a ufile for now. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2018-07-09 11:26:17 -06:00
Jason Gunthorpe	422e3d37ed	RDMA/uverbs: Combine MIN_SZ_OR_ZERO with UVERBS_ATTR_STRUCT After all the rework is done it is now possible to include single flags in the type macros. Any user of UVERBS_ATTR_STRUCT needs to zero check data past the end of the known struct to be correct, so make this mandatory, and get rid of MIN_SZ_OR_ZERO as a user flag. This changes UVERBS_ATTR_TYPE to refer to a struct of exact size with not possibility of extension, convert the few users of UVERBS_ATTR_TYPE and MIN_SZ_OR_ZERO to use UVERBS_ATTR_STRUCT. The one user of UVERBS_ATTR_STRUCT without MIN_SZ_OR_ZERO is just confused. There is some padding at the end of that struct, but userspace always provides it with the padding. The construction doesn't test if the padding is zero, so it is pointless. Just use UVERBS_ATTR_TYPE. Finally, rename min_sz_or_zero to zero_trailing to better reflect what it does and hopefully avoid such mis-uses in the future. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2018-07-04 13:47:01 -06:00
Jason Gunthorpe	83bb444233	RDMA/uverbs: Remove UA_FLAGS This bit of boilerplate isn't really necessary, we can use bitfields instead of a flags enum and the macros can then individually initialize them through the __VA_ARGS__ like everything else. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2018-07-04 13:47:01 -06:00
Jason Gunthorpe	d108dac080	RDMA/uverbs: Simplify UVERBS_ATTR family of macros Instead of using a complex cascade of macros, just directly provide the initializer list each of the declarations is trying to create. Now that the macros are simplified this also reworks the uverbs_attr_spec to be friendly to older compilers by eliminating any unnamed structures/unions inside, and removing the duplication of some fields. The structure size remains at 16 bytes which was the original motivation for some of this oddness. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2018-07-04 13:47:01 -06:00
Jason Gunthorpe	87fc2a620a	RDMA/uverbs: Store the specs_root in the struct ib_uverbs_device The specs are required to operate the uverbs file, so they belong inside the ib_uverbs_device, not inside the ib_device. The spec passed in the ib_device is just a communication from the driver and should not be used during runtime. This also changes the lifetime of the spec memory to match the ib_uverbs_device, however at this time the spec_root can still contain driver pointers after disassociation, so it cannot be used if ib_dev is NULL. This is preparation for another series. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2018-07-04 13:47:01 -06:00
Jason Gunthorpe	321d7863ac	IB/uverbs: Delete type and id from uverbs_obj_attr In this context the uobject is not allowed to be NULL, so type is the same as uobject->type, and at least for IDR, id is the same as uobject->id. FD objects should never handle the FD number outside the uAPI boundary code. Suggested-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-06-22 09:02:59 -06:00
Matan Barak	19b9def258	IB/uverbs: Allow an empty namespace in ioctl() framework The ioctl parser framework wrongly assumed that each namespace is populated. This could lead to NULL dereferences. Fix the parser to always check that a given namespace indeed exists. Fixes: `fac9658cab` ("IB/core: Add new ioctl interface") Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-06-19 10:53:02 -06:00
Matan Barak	8762d149e8	IB/uverbs: Add PTR_IN attributes that are allocated/copied automatically Adding UVERBS_ATTR_SPEC_F_ALLOC_AND_COPY flag to PTR_IN attributes. By using this flag, the parse automatically allocates and copies the user-space data. This data is accessible by using uverbs_attr_get_len and uverbs_attr_get_alloced_ptr inline accessor functions from the handler. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-06-19 10:53:02 -06:00
Matan Barak	9442d8bf1d	IB/uverbs: Refactor uverbs_finalize_objects uverbs_finalize_objects is currently used only to commit or abort objects. Since we want to add automatic allocation/free of PTR_IN attributes, moving it to uverbs_ioctl.c and renamit it to uverbs_finalize_attrs. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-06-19 10:53:02 -06:00
Matan Barak	f604db645a	IB/uverbs: Fix validating mandatory attributes Previously, if a method contained mandatory attributes in a namespace that wasn't given by the user, these attributes weren't validated. Fixing this by iterating over all specification namespaces. Fixes: `fac9658cab` ("IB/core: Add new ioctl interface") Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-04-27 13:53:41 -04:00
Matan Barak	494c5580aa	IB/uverbs: Add enum attribute type to ioctl() interface Methods sometimes need to get one attribute out of a group of pre-defined attributes. This is an enum-like behavior. Since this is a common requirement, we add a new ENUM attribute to the generic uverbs ioctl() layer. This attribute is embedded in methods, like any other attributes we currently have. ENUM attributes point to an array of standard UVERBS_ATTR_PTR_IN. The user-space encodes the enum's attribute id in the id field and the internal PTR_IN attr id in the enum_data.elem_id field. This ENUM attribute could be shared by several attributes and it can get UVERBS_ATTR_SPEC_F_MANDATORY flag, stating this attribute must be supported by the kernel, like any other attribute. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-04-04 12:06:24 -06:00
Matan Barak	c66db31113	IB/uverbs: Safely extend existing attributes Previously, we've used UVERBS_ATTR_SPEC_F_MIN_SZ for extending existing attributes. The behavior of this flag was the kernel accepts anything bigger than the minimum size it specified. This is unsafe, since in order to safely extend an attribute, we need to make sure unknown size is zeroed. Replacing UVERBS_ATTR_SPEC_F_MIN_SZ with UVERBS_ATTR_SPEC_F_MIN_SZ_OR_ZERO, which essentially checks that the unknown size is zero. In addition, attributes are now decorated with UVERBS_ATTR_TYPE and UVERBS_ATTR_STRUCT, so we can provide the minimum and known length. Users of this flag needs to use copy_from_or_zero functions/macros. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-03-19 14:45:17 -06:00
Matan Barak	1f07e08fab	IB/uverbs: Enable compact representation of uverbs_attr_spec Downstream patches extend uverbs_attr_spec with new fields. In order to save space, we move the type and flags fields to the various attribute flavors contained in the union. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-03-19 14:45:17 -06:00
Matan Barak	0ede73bc01	IB/uverbs: Extend uverbs_ioctl header with driver_id Extending uverbs_ioctl header with driver_id and another reserved field. driver_id should be used in order to identify the driver. Since every driver could have its own parsing tree, this is necessary for strace support. Downstream patches take off the EXPERIMENTAL flag from the ioctl() IB support and thus we add some reserved fields for future usage. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-03-19 14:45:17 -06:00
Matan Barak	4d39a959bc	IB/uverbs: Fix possible oops with duplicate ioctl attributes If the same attribute is listed twice by the user in the ioctl attribute list then error unwind can cause the kernel to deref garbage. This happens when an object with WRITE access is sent twice. The second parse properly fails but corrupts the state required for the error unwind it triggers. Fixing this by making duplicates in the attribute list invalid. This is not something we need to support. The ioctl interface is currently recommended to be disabled in kConfig. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-02-15 14:59:46 -07:00
Jason Gunthorpe	3624a8f025	RDMA/uverbs: Use an unambiguous errno for method not supported Returning EOPNOTSUPP is problematic because it can also be returned by the method function, and we use it in quite a few places in drivers these days. Instead, dedicate EPROTONOSUPPORT to indicate that the ioctl framework is enabled but the requested object and method are not supported by the kernel. No other case will return this code, and it lets userspace know to fall back to write(). grep says we do not use it today in drivers/infiniband subsystem. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-25 10:57:29 -05:00
Bart Van Assche	69abc735f4	RDMA/uverbs: Make the code in ib_uverbs_cmd_verbs() less confusing This patch reduces the number of #ifdefs and also avoids that smatch reports the following: drivers/infiniband/core/uverbs_ioctl.c:276: ib_uverbs_cmd_verbs() warn: if statement not indented drivers/infiniband/core/uverbs_ioctl.c:280: ib_uverbs_cmd_verbs() warn: possible memory leak of 'ctx' drivers/infiniband/core/uverbs_ioctl.c:315: ib_uverbs_cmd_verbs() warn: if statement not indented References: commit `fac9658cab` ("IB/core: Add new ioctl interface") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Acked-by: Matan Barak <matanb@mellanox.com> Cc: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-10-18 10:42:02 -04:00
Matan Barak	fac9658cab	IB/core: Add new ioctl interface In this ioctl interface, processing the command starts from properties of the command and fetching the appropriate user objects before calling the handler. Parsing and validation is done according to a specifier declared by the driver's code. In the driver, all supported objects are declared. These objects are separated to different object namepsaces. Dividing objects to namespaces is done at initialization by using the higher bits of the object ids. This initialization can mix objects declared in different places to one parsing tree using in this ioctl interface. For each object we list all supported methods. Similarly to objects, methods are separated to method namespaces too. Namespacing is done similarly to the objects case. This could be used in order to add methods to an existing object. Each method has a specific handler, which could be either a default handler or a driver specific handler. Along with the handler, a bunch of attributes are specified as well. Similarly to objects and method, attributes are namespaced and hashed by their ids at initialization too. All supported attributes are subject to automatic fetching and validation. These attributes include the command, response and the method's related objects' ids. When these entities (objects, methods and attributes) are used, the high bits of the entities ids are used in order to calculate the hash bucket index. Then, these high bits are masked out in order to have a zero based index. Since we use these high bits for both bucketing and namespacing, we get a compact representation and O(1) array access. This is mandatory for efficient dispatching. Each attribute has a type (PTR_IN, PTR_OUT, IDR and FD) and a length. Attributes could be validated through some attributes, like: () Minimum size / Exact size () Fops for FD () Object type for IDR If an IDR/fd attribute is specified, the kernel also states the object type and the required access (NEW, WRITE, READ or DESTROY). All uobject/fd management is done automatically by the infrastructure, meaning - the infrastructure will fail concurrent commands that at least one of them requires concurrent access (WRITE/DESTROY), synchronize actions with device removals (dissociate context events) and take care of reference counting (increase/decrease) for concurrent actions invocation. The reference counts on the actual kernel objects shall be handled by the handlers. objects +--------+ \| \| \| \| methods +--------+ \| \| ns method method_spec +-----+ \|len \| +--------+ +------+[d]+-------+ +----------------+[d]+------------+ \|attr1+-> \|type \| \| object +> \|method+-> \| spec +-> + attr_buckets +-> \|default_chain+--> +-----+ \|idr_type\| +--------+ +------+ \|handler\| \| \| +------------+ \|attr2\| \|access \| \| \| \| \| +-------+ +----------------+ \|driver chain\| +-----+ +--------+ \| \| \| \| +------------+ \| \| +------+ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| +--------+ [d] = Hash ids to groups using the high order bits The right types table is also chosen by using the high bits from the ids. Currently we have either default or driver specific groups. Once validation and object fetching (or creation) completed, we call the handler: int (handler)(struct ib_device ib_dev, struct ib_uverbs_file ufile, struct uverbs_attr_bundle *ctx); ctx bundles attributes of different namespaces. Each element there is an array of attributes which corresponds to one namespaces of attributes. For example, in the usually used case: ctx core +----------------------------+ +------------+ \| core: +---> \| valid \| +----------------------------+ \| cmd_attr \| \| driver: \| +------------+ \|----------------------------+--+ \| valid \| \| \| cmd_attr \| \| +------------+ \| \| valid \| \| \| obj_attr \| \| +------------+ \| \| drivers \| +------------+ +> \| valid \| \| cmd_attr \| +------------+ \| valid \| \| cmd_attr \| +------------+ \| valid \| \| obj_attr \| +------------+ Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-31 08:35:09 -04:00

45 Commits