linux/Documentation/filesystems/nfs/pnfs.txt

110 lines
4.3 KiB
Plaintext
Raw Normal View History

Reference counting in pnfs:
==========================
The are several inter-related caches. We have layouts which can
reference multiple devices, each of which can reference multiple data servers.
Each data server can be referenced by multiple devices. Each device
can be referenced by multiple layouts. To keep all of this straight,
we need to reference count.
struct pnfs_layout_hdr
----------------------
The on-the-wire command LAYOUTGET corresponds to struct
pnfs_layout_segment, usually referred to by the variable name lseg.
Each nfs_inode may hold a pointer to a cache of these layout
segments in nfsi->layout, of type struct pnfs_layout_hdr.
We reference the header for the inode pointing to it, across each
outstanding RPC call that references it (LAYOUTGET, LAYOUTRETURN,
LAYOUTCOMMIT), and for each lseg held within.
Each header is also (when non-empty) put on a list associated with
struct nfs_client (cl_layouts). Being put on this list does not bump
the reference count, as the layout is kept around by the lseg that
keeps it in the list.
deviceid_cache
--------------
lsegs reference device ids, which are resolved per nfs_client and
layout driver type. The device ids are held in a RCU cache (struct
nfs4_deviceid_cache). The cache itself is referenced across each
mount. The entries (struct nfs4_deviceid) themselves are held across
the lifetime of each lseg referencing them.
RCU is used because the deviceid is basically a write once, read many
data structure. The hlist size of 32 buckets needs better
justification, but seems reasonable given that we can have multiple
deviceid's per filesystem, and multiple filesystems per nfs_client.
The hash code is copied from the nfsd code base. A discussion of
hashing and variations of this algorithm can be found at:
http://groups.google.com/group/comp.lang.c/browse_thread/thread/9522965e2b8d3809
data server cache
-----------------
file driver devices refer to data servers, which are kept in a module
level cache. Its reference is held over the lifetime of the deviceid
pointing to it.
lseg
----
lseg maintains an extra reference corresponding to the NFS_LSEG_VALID
bit which holds it in the pnfs_layout_hdr's list. When the final lseg
is removed from the pnfs_layout_hdr's list, the NFS_LAYOUT_DESTROYED
bit is set, preventing any new lsegs from being added.
pnfs-obj: autologin: Add support for protocol autologin The pnfs-objects protocol mandates that we autologin into devices not present in the system, according to information specified in the get_device_info returned from the server. The Protocol specifies two login hints. 1. An IP address:port combination 2. A string URI which is constructed as a URL with a protocol prefix followed by :// and a string as address. For each protocol prefix the string-address format might be different. We only support the second option. The first option is just redundant to the second one. NOTE: The Kernel part of autologin does not parse the URI string. It just channels it to a user-mode script. So any new login protocols should only update the user-mode script which is a part of the nfs-utils package, but the Kernel need not change. We implement the autologin by using the call_usermodehelper() API. (Thanks to Steve Dickson <steved@redhat.com> for pointing it out) So there is no running daemon needed, and/or special setup. We Add the osd_login_prog Kernel module parameters which defaults to: /sbin/osd_login Kernel try's to upcall the program specified in osd_login_prog. If the file is not found or the execution fails Kernel will disable any farther upcalls, by zeroing out osd_login_prog, Until Admin re-enables it by setting the osd_login_prog parameter to a proper program. Also add text about the osd_login program command line API to: Documentation/filesystems/nfs/pnfs.txt and documentation of the new osd_login_prog module parameter to: Documentation/kernel-parameters.txt TODO: Add timeout option in the case osd_login program gets stuck Signed-off-by: Sachin Bhamare <sbhamare@panasas.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-20 11:47:58 +08:00
layout drivers
--------------
PNFS utilizes what is called layout drivers. The STD defines 3 basic
layout types: "files" "objects" and "blocks". For each of these types
there is a layout-driver with a common function-vectors table which
are called by the nfs-client pnfs-core to implement the different layout
types.
Files-layout-driver code is in: fs/nfs/nfs4filelayout.c && nfs4filelayoutdev.c
Objects-layout-deriver code is in: fs/nfs/objlayout/.. directory
Blocks-layout-deriver code is in: fs/nfs/blocklayout/.. directory
objects-layout setup
--------------------
As part of the full STD implementation the objlayoutdriver.ko needs, at times,
to automatically login to yet undiscovered iscsi/osd devices. For this the
driver makes up-calles to a user-mode script called *osd_login*
The path_name of the script to use is by default:
/sbin/osd_login.
This name can be overridden by the Kernel module parameter:
objlayoutdriver.osd_login_prog
If Kernel does not find the osd_login_prog path it will zero it out
and will not attempt farther logins. An admin can then write new value
to the objlayoutdriver.osd_login_prog Kernel parameter to re-enable it.
The /sbin/osd_login is part of the nfs-utils package, and should usually
be installed on distributions that support this Kernel version.
The API to the login script is as follows:
Usage: $0 -u <URI> -o <OSDNAME> -s <SYSTEMID>
Options:
-u target uri e.g. iscsi://<ip>:<port>
(allways exists)
(More protocols can be defined in the future.
The client does not interpret this string it is
passed unchanged as received from the Server)
pnfs-obj: autologin: Add support for protocol autologin The pnfs-objects protocol mandates that we autologin into devices not present in the system, according to information specified in the get_device_info returned from the server. The Protocol specifies two login hints. 1. An IP address:port combination 2. A string URI which is constructed as a URL with a protocol prefix followed by :// and a string as address. For each protocol prefix the string-address format might be different. We only support the second option. The first option is just redundant to the second one. NOTE: The Kernel part of autologin does not parse the URI string. It just channels it to a user-mode script. So any new login protocols should only update the user-mode script which is a part of the nfs-utils package, but the Kernel need not change. We implement the autologin by using the call_usermodehelper() API. (Thanks to Steve Dickson <steved@redhat.com> for pointing it out) So there is no running daemon needed, and/or special setup. We Add the osd_login_prog Kernel module parameters which defaults to: /sbin/osd_login Kernel try's to upcall the program specified in osd_login_prog. If the file is not found or the execution fails Kernel will disable any farther upcalls, by zeroing out osd_login_prog, Until Admin re-enables it by setting the osd_login_prog parameter to a proper program. Also add text about the osd_login program command line API to: Documentation/filesystems/nfs/pnfs.txt and documentation of the new osd_login_prog module parameter to: Documentation/kernel-parameters.txt TODO: Add timeout option in the case osd_login program gets stuck Signed-off-by: Sachin Bhamare <sbhamare@panasas.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-20 11:47:58 +08:00
-o osdname of the requested target OSD
(Might be empty)
(A string which denotes the OSD name, there is a
limit of 64 chars on this string)
-s systemid of the requested target OSD
(Might be empty)
(This string, if not empty is always an hex
representation of the 20 bytes osd_system_id)
blocks-layout setup
-------------------
TODO: Document the setup needs of the blocks layout driver