linux_old1

Commit Graph

Author	SHA1	Message	Date
Dan Williams	1501efadc5	md/raid: only permit hot-add of compatible integrity profiles It is not safe for an integrity profile to be changed while i/o is in-flight in the queue. Prevent adding new disks or otherwise online spares to an array if the device has an incompatible integrity profile. The original change to the blk_integrity_unregister implementation in md, commmit `c7bfced9a6` "md: suspend i/o during runtime blk_integrity_unregister" introduced an immediate hang regression. This policy of disallowing changes the integrity profile once one has been established is shared with DM. Here is an abbreviated log from a test run that: 1/ Creates a degraded raid1 with an integrity-enabled device (pmem0s) [ 59.076127] 2/ Tries to add an integrity-disabled device (pmem1m) [ 90.489209] 3/ Retries with an integrity-enabled device (pmem1s) [ 205.671277] [ 59.076127] md/raid1:md0: active with 1 out of 2 mirrors [ 59.078302] md: data integrity enabled on md0 [..] [ 90.489209] md0: incompatible integrity profile for pmem1m [..] [ 205.671277] md: super_written gets error=-5 [ 205.677386] md/raid1:md0: Disk failure on pmem1m, disabling device. [ 205.677386] md/raid1:md0: Operation continuing on 1 devices. [ 205.683037] RAID1 conf printout: [ 205.684699] --- wd:1 rd:2 [ 205.685972] disk 0, wo:0, o:1, dev:pmem0s [ 205.687562] disk 1, wo:1, o:1, dev:pmem1s [ 205.691717] md: recovery of RAID array md0 Fixes: `c7bfced9a6` ("md: suspend i/o during runtime blk_integrity_unregister") Cc: <stable@vger.kernel.org> Cc: Mike Snitzer <snitzer@redhat.com> Reported-by: NeilBrown <neilb@suse.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-14 11:49:57 +11:00
NeilBrown	7aafc405ce	Remove myself as MD Maintainer, and add to Credits. Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-14 11:49:43 +11:00
Shaohua Li	16a43f6a65	raid5-cache: handle journal hotadd in quiesce Handle journal hotadd in quiesce to avoid creating duplicated threads. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-14 11:49:43 +11:00
Shaohua Li	87d4d91616	MD: add journal with array suspended Hot add journal disk in recovery thread context brings a lot of trouble as IO could be running. Unlike spare disk hot add, adding journal disk with array suspended makes more sense and implmentation is much easier. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-14 11:49:43 +11:00
Shaohua Li	a62ab49eb5	md: set MD_HAS_JOURNAL in correct places Set MD_HAS_JOURNAL when a array is loaded or journal is initialized. This is to avoid the flags set too early in journal disk hotadd. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-14 11:49:43 +11:00
NeilBrown	274d8cbde1	md: Remove 'ready' field from mddev. This field is always set in tandem with ->pers, and when it is tested ->pers is also tested. So ->ready is not needed. It was needed once, but code rearrangement and locking changes have removed that needed. Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-07 11:01:14 +11:00
Guoqing Jiang	bb9ef71646	md: remove unnecesary md_new_event_inintr md_new_event had removed sysfs_notify since 'commit `72a23c211e` ("Make sure all changes to md/sync_action are notified.")', so we can use md_new_event and delete md_new_event_inintr. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-07 11:01:14 +11:00
Christoph Hellwig	5036c39020	raid5: allow r5l_io_unit allocations to fail And propagate the error up the stack so we can add the stripe to no_stripes_list and retry our log operation later. This avoids blocking raid5d due to reclaim, an it allows to get rid of the deadlock-prone GFP_NOFAIL allocation. shli: add missing mempool_destroy() Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:40:12 +11:00
Christoph Hellwig	e8deb63810	raid5-cache: use a mempool for the metadata block We only have a limited number in flight, so use a page based mempool. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:40:08 +11:00
Christoph Hellwig	c38d29b33b	raid5-cache: use a bio_set This allows us to make guaranteed forward progress. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:40:04 +11:00
Shaohua Li	f6b6ec5cfa	raid5-cache: add journal hot add/remove support Add support for journal disk hot add/remove. Mostly trival checks in md part. The raid5 part is a little tricky. For hot-remove, we can't wait pending write as it's called from raid5d. The wait will cause deadlock. We simplily fail the hot-remove. A hot-remove retry can success eventually since if journal disk is faulty all pending write will be failed and finish. For hot-add, since an array supporting journal but without journal disk will be marked read-only, we are safe to hot add journal without stopping IO (should be read IO, while journal only handles write IO). Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:39:57 +11:00
Deepa Dinamani	9ebc6ef188	drivers: md: use ktime_get_real_seconds() get_seconds() API is not y2038 safe on 32 bit systems and the API is deprecated. Replace it with calls to ktime_get_real_seconds() API instead. Change mddev structure types to time64_t accordingly. 32 bit signed timestamps will overflow in the year 2038. Change the user interface mdu_array_info_s structure timestamps: ctime and utime values used in ioctls GET_ARRAY_INFO and SET_ARRAY_INFO to unsigned int. This will extend the field to last until the year 2106. The long term plan is to get rid of ctime and utime values in this structure as this information can be read from the on-disk meta data directly. Clamp the tim64_t timestamps to positive values with a max of U32_MAX when returning from GET_ARRAY_INFO ioctl to accommodate above changes in the data type of timestamps to unsigned int. v0.90 on disk meta data uses u32 for maintaining time stamps. So this will also last until year 2106. Assumption is that the usage of v0.90 will be deprecated by year 2106. Timestamp fields in the on disk meta data for v1.0 version already use 64 bit data types. Remove the truncation of the bits while writing to or reading from these from the disk. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:39:53 +11:00
Arnd Bergmann	3312c951ef	md: avoid warning for 32-bit sector_t When CONFIG_LBDAF is not set, sector_t is only 32-bits wide, which means we cannot have devices with more than 2TB, and the code that is trying to handle compatibility support for large devices in md version 0.90 is meaningless but also causes a compile-time warning: drivers/md/md.c: In function 'super_90_load': drivers/md/md.c:1029:19: warning: large integer implicitly truncated to unsigned type [-Woverflow] drivers/md/md.c: In function 'super_90_rdev_size_change': drivers/md/md.c:1323:17: warning: large integer implicitly truncated to unsigned type [-Woverflow] This adds a check for CONFIG_LBDAF to avoid even getting into this code path, and also adds an explicit cast to let the compiler know it doesn't have to warn about the truncation. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:39:48 +11:00
Christoph Hellwig	ad66d445ee	raid5-cache: free meta_page earlier Once the I/O completed we don't need the meta page anymore. As the iounits can live on for a long time this reduces memory pressure a bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:39:43 +11:00
Christoph Hellwig	3848c0bcb0	raid5-cache: simplify r5l_move_io_unit_list It's only used for one kind of move, so make that explicit. Also clean up the code a bit by using list_for_each_safe. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:39:34 +11:00
Guoqing Jiang	abf3508d8f	md: update comment for md_allow_write MD_CHANGE_CLEAN had been replaced with MD_CHANGE_PENDING after commit 070dc6 ("md: resolve confusion of MD_CHANGE_CLEAN"), so make the change accordingly. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:39:26 +11:00
Guoqing Jiang	e19508fa4d	md-cluster: update comments for MD_CLUSTER_SEND_LOCKED_ALREADY 1. fix unbalanced parentheses. 2. add more description about that MD_CLUSTER_SEND_LOCKED_ALREADY will be cleared after set it in add_new_disk. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:39:21 +11:00
Guoqing Jiang	8b9277c814	md-cluster: Protect communication with mutexes Communication can happen through multiple threads. It is possible that one thread steps over another threads sequence. So, we use mutexes to protect both the send and receive sequences. Send communication is locked through state bit, MD_CLUSTER_SEND_LOCK. Communication is locked with bit manipulation in order to allow "lock and hold" for the add operation. In case of an add operation, if the lock is held, MD_CLUSTER_SEND_LOCKED_ALREADY is set. When md_update_sb() calls metadata_update_start(), it checks (in a single statement to avoid races), if the communication is already locked. If yes, it merely returns zero, else it locks the token lockresource. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:39:17 +11:00
Guoqing Jiang	15858fa5b0	md-cluster: Defer MD reloading to mddev->thread Reloading of superblock must be performed under reconfig_mutex. However, this cannot be done with md_reload_sb because it would deadlock with the message DLM lock. So, we defer it in md_check_recovery() which is executed by mddev->thread. This introduces a new flag, MD_RELOAD_SB, which if set, will reload the superblock. And good_device_nr is also added to 'struct mddev' which is used to get the num of the good device within cluster raid. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:39:10 +11:00
Guoqing Jiang	d323ef0f1a	md-cluster: update the documentation Update design documentation based on recent development. original version comes from Neil. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:39:06 +11:00
Guoqing Jiang	f6a2dc64ee	md-cluster: append some actions when change bitmap from clustered to none For clustered raid, we need to do extra actions when change bitmap to none. 1. check if all the bitmap lock could be get or not, if yes then we can continue the change since cluster raid is only active in current node. Otherwise return fail and unlock the related bitmap locks 2. set nodes to 0 and then leave cluster environment. 3. release other nodes's bitmap lock. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:38:57 +11:00
Goldwyn Rodrigues	09afd2a8d6	md-cluster: Allow spare devices to be marked as faulty If a spare device was marked faulty, it would not be reflected in receiving nodes because it would mark it as activated and continue. Continue the operation, so it may be set as faulty. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:38:51 +11:00
Goldwyn Rodrigues	54a88392cd	md-cluster: Fix the remove sequence with the new MD reload code The remove disk message does not need metadata_update_start(), but can be an independent message. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:38:42 +11:00
Guoqing Jiang	659b254fa7	md-cluster: remove a disk asynchronously from cluster environment For cluster raid, if one disk couldn't be reach in one node, then other nodes would receive the REMOVE message for the disk. In receiving node, we can't call md_kick_rdev_from_array to remove the disk from array synchronously since the disk might still be busy in this node. So let's set a ClusterRemove flag on the disk, then let the thread to do the removal job eventually. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:38:36 +11:00
Goldwyn Rodrigues	ac277c6a8a	md-cluster: Avoid the resync ping-pong If a RESYNCING message with (0,0) has been sent before, do not send it again. This avoids a resync ping pong between the nodes. We read the bitmap lockresource's LVB to figure out the previous value of the RESYNCING message. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:38:27 +11:00
Roman Gushchin	b46020aa3a	md/raid5: remove redundant check in stripe_add_to_batch_list() The stripe_add_to_batch_list() function is called only if stripe_can_batch() returned true, so there is no need for double check. Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> Cc: Neil Brown <neilb@suse.com> Cc: linux-raid@vger.kernel.org Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-06 11:38:22 +11:00
Linus Torvalds	168309855a	Linux 4.4-rc8	2016-01-03 15:15:37 -08:00
Linus Torvalds	429461608e	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus Pull MIPS build fix from Ralf Baechle: "Fix a makefile issue resulting in build breakage with older binutils. This has sat in -next for a few days, testers and buildbot are happy with it, too though if you are going for another -rc that'd certainly help ironing out a few more issues" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: VDSO: Fix build error with binutils 2.24 and earlier	2016-01-03 11:49:31 -08:00
Linus Torvalds	4e5e384c46	Merge tag 'drm-intel-fixes-2016-01-02' of git://anongit.freedesktop.org/drm-intel Pull i915 drm fixes from Jani Nikula: "Two display fixes still for v4.4. The new year's resolution is to start using signed tags per Linus' request. This one is still unsigned; I want to fix this up in our maintainer scripts instead of doing it one-off" * tag 'drm-intel-fixes-2016-01-02' of git://anongit.freedesktop.org/drm-intel: drm/i915: increase the tries for HDMI hotplug live status checking drm/i915: Unbreak check_digital_port_conflicts()	2016-01-03 11:36:26 -08:00
Linus Torvalds	9c982e86db	PCI updates for v4.4: HiSilicon host bridge driver Fix 32-bit config reads (Dongdong Liu) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWhZsJAAoJEFmIoMA60/r8JO8P/2zpQEewcPoFqtPFoAXXNVCg vRnUlSVs5kz3cj3gUjbSjannWJZXFvsFgz/V+f3nxaOSLel550ccphhdS+oyh3L+ NVeQka8nnIbsVmGvNmebxNteBXa2CTGlZB4snRHQw+n1XjacqPQOMeccN09jCXmK GBdJvs1Xs2rphGHq52cLkkqUdSCEayUiYK/4WgAzcBe8EFy5kWvbObcoBuX9/3Lm fjnoWPXYSZFr+uyW8Q5+MztrpXJeOZV/krRZjcH2NxnLr1Xs+PnrC/NNu3BZvKnH qGyLc3vMIpeYS2VGiwJDKzmahyKm4Elh1iJNoywHIGPf3o0WzjJgnsiryZWomytd nVueiL8Oy0wUxoLupnFGdBIgbNvBeQSdeqcrXzjRfYHdHn3iakQTarUpVjqDUOEW 4iO4R+Xohq6X4Yhdr9RFxg2tCLk4dJebvwRNSGwTPmDnPZqzoQmg5uK84R1QrlD7 BM/ggHPryOogmeCqr7wCifkl73pMcvlK7maKUZcTgBz1E9aCeaGbz7Nc3KhUxSYV jvP84dEBx0QN5M3523sn/TNRZsAztUaBgGJLwuLetPazOgGORZD2msMqpTCr1a3J 4TQjadvc5RWG4MBOeU9r2WdQeZdSwj/X41XVLVh3qCZaYQCz8aBGMb/PGdgWz2kZ cBunX4VY1+S/EHu4abuw =5MNY -----END PGP SIGNATURE----- Merge tag 'pci-v4.4-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI bugfix from Bjorn Helgaas: "Here's another fix for v4.4. This fixes 32-bit config reads for the HiSilicon driver. Obviously the driver is completely broken without this fix (apparently it actually was tested internally, but got broken somehow in the process of upstreaming it). Summary: HiSilicon host bridge driver Fix 32-bit config reads (Dongdong Liu)" * tag 'pci-v4.4-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: hisi: Fix hisi_pcie_cfg_read() 32-bit reads	2015-12-31 14:59:21 -08:00
Linus Torvalds	7c672dd601	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc Pull sparc fixes from David Miller: "Just some missing syscall wire ups" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: Wire up mlock2 system call. sparc: Add all necessary direct socket system calls.	2015-12-31 14:46:49 -08:00
Linus Torvalds	8f5daf2a49	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Prevent XFRM per-cpu counter updates for one namespace from being applied to another namespace. Fix from DanS treetman. 2) Fix RCU de-reference in iwl_mvm_get_key_sta_id(), from Johannes Berg. 3) Remove ethernet header assumption in nft_do_chain_netdev(), from Pablo Neira Ayuso. 4) Fix cpsw PHY ident with multiple slaves and fixed-phy, from Pascal Speck. 5) Fix use after free in sixpack_close and mkiss_close. 6) Fix VXLAN fw assertion on bnx2x, from Yuval Mintz. 7) natsemi doesn't check for DMA mapping errors, from Alexey Khoroshilov. 8) Fix inverted test in ip6addrlbl_get(), from ANdrey Ryabinin. 9) Missing initialization of needed_headroom in geneve tunnel driver, from Paolo Abeni. 10) Fix conntrack template leak in openvswitch, from Joe Stringer. 11) Mission initialization of wq->flags in sock_alloc_inode(), from Nicolai Stange. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits) sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close net, socket, socket_wq: fix missing initialization of flags drivers: net: cpsw: fix error return code openvswitch: Fix template leak in error cases. sctp: label accepted/peeled off sockets sctp: use GFP_USER for user-controlled kmalloc qlcnic: fix a loop exit condition better net: cdc_ncm: avoid changing RX/TX buffers on MTU changes geneve: initialize needed_headroom ipv6: honor ifindex in case we receive ll addresses in router advertisements addrconf: always initialize sysctl table data ipv6/addrlabel: fix ip6addrlbl_get() switchdev: bridge: Pass ageing time as clock_t instead of jiffies sh_eth: fix 16-bit descriptor field access endianness too veth: don’t modify ip_summed; doing so treats packets with bad checksums as good. net: usb: cdc_ncm: Adding Dell DW5813 LTE AT&T Mobile Broadband Card net: usb: cdc_ncm: Adding Dell DW5812 LTE Verizon Mobile Broadband Card natsemi: add checks for dma mapping errors rhashtable: Kill harmless RCU warning in rhashtable_walk_init openvswitch: correct encoding of set tunnel action attributes ...	2015-12-31 14:40:43 -08:00
David S. Miller	42d85c52f8	sparc: Wire up mlock2 system call. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-31 15:38:56 -05:00
David S. Miller	8b30ca73b7	sparc: Add all necessary direct socket system calls. The GLIBC folks would like to eliminate socketcall support eventually, and this makes sense regardless so wire them all up. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-31 15:18:02 -05:00
Xin Long	068d8bd338	sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close In sctp_close, sctp_make_abort_user may return NULL because of memory allocation failure. If this happens, it will bypass any state change and never free the assoc. The assoc has no chance to be freed and it will be kept in memory with the state it had even after the socket is closed by sctp_close(). So if sctp_make_abort_user fails to allocate memory, we should abort the asoc via sctp_primitive_ABORT as well. Just like the annotation in sctp_sf_cookie_wait_prm_abort and sctp_sf_do_9_1_prm_abort said, "Even if we can't send the ABORT due to low memory delete the TCB. This is a departure from our typical NOMEM handling". But then the chunk is NULL (low memory) and the SCTP_CMD_REPLY cmd would dereference the chunk pointer, and system crash. So we should add SCTP_CMD_REPLY cmd only when the chunk is not NULL, just like other places where it adds SCTP_CMD_REPLY cmd. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-30 16:57:16 -05:00
David S. Miller	a0ccc3f277	iwlwifi * don't load firmware that won't exist for 7260 * fix RCU splat -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQEcBAABAgAGBQJWgRwtAAoJEG4XJFUm622bds8H/0FTIXrzKiW8QfdHcBoiRctE HmZEOXg6QLJIAYndqu67I+HPg/GYhlGsChyJYcDSBKsOoAl0J5+URFZW9d5wCaaC qkUjljYa9SYQem9w65YcC4QpOWhxT6Zhj/3PEegTLvs0tDR8mfVNK8czXj5kKx9e XOaBf8nqQqVCmLlazp/HAzEwugryUA2NkTNQhoxtxHcUe4nflyKjUltPupuyD/tA VzWUxPzsvgAuoV9IIdJrvQY1EXBDGV/ZoY3x73TNDlU5uweRxaYr5ncLY08NUMQM rGgY7EvFXkCtYyzPyUVLE6hEOf5+XHAOeISbjuCvaiksZeHNxO6tuDwjJQRsxUY= =r3Py -----END PGP SIGNATURE----- Merge tag 'wireless-drivers-for-davem-2015-12-28' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== iwlwifi * don't load firmware that won't exist for 7260 * fix RCU splat ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-30 16:42:31 -05:00
Nicolai Stange	574aab1e02	net, socket, socket_wq: fix missing initialization of flags Commit `ceb5d58b21` ("net: fix sock_wake_async() rcu protection") from the current 4.4 release cycle introduced a new flags member in struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA from struct socket's flags member into that new place. Unfortunately, the new flags field is never initialized properly, at least not for the struct socket_wq instance created in sock_alloc_inode(). One particular issue I encountered because of this is that my GNU Emacs failed to draw anything on my desktop -- i.e. what I got is a transparent window, including the title bar. Bisection lead to the commit mentioned above and further investigation by means of strace told me that Emacs is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is reproducible 100% of times and the fact that properly initializing the struct socket_wq ->flags fixes the issue leads me to the conclusion that somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags, preventing my Emacs from receiving any SIGIO's due to data becoming available and it got stuck. Make sock_alloc_inode() set the newly created struct socket_wq's ->flags member to zero. Fixes: `ceb5d58b21` ("net: fix sock_wake_async() rcu protection") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-30 16:38:01 -05:00
Linus Torvalds	c6169202e4	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "Make the block layer great again. Basically three amazing fixes in this pull request, split into 4 patches. Believe me, they should go into 4.4. Two of them fix a regression, the third and last fixes an easy-to-trigger bug. - Fix a bad irq enable through null_blk, for queue_mode=1 and using timer completions. Add a block helper to restart a queue asynchronously, and use that from null_blk. From me. - Fix a performance issue in NVMe. Some devices (Intel Pxxxx) expose a stripe boundary, and performance suffers if we cross it. We took that into account for merging, but not for the newer splitting code. Fix from Keith. - Fix a kernel oops in lightnvm with multiple channels. From Matias" * 'for-linus' of git://git.kernel.dk/linux-block: lightnvm: wrong offset in bad blk lun calculation null_blk: use async queue restart helper block: add blk_start_queue_async() block: Split bios on chunk boundaries	2015-12-30 10:26:20 -08:00
Gary Wang	3d8acd1f66	drm/i915: increase the tries for HDMI hotplug live status checking The total delay of HDMI hotplug detecting with 30ms is sometimes not enoughtfor HDMI live status up with specific HDMI monitors in BSW platform. After doing experiments for following monitors, it needs 80ms at least for those worst cases. Lenovo L246 1xwA (4 failed, necessary hot-plug delay: 58/40/60/40ms) Philips HH2AP (9 failed, necessary hot-plug delay: 80/50/50/60/46/40/58/58/39ms) BENQ ET-0035-N (6 failed, necessary hot-plug delay: 60/50/50/80/80/40ms) DELL U2713HM (2 failed, necessary hot-plug delay: 58/59ms) HP HP-LP2475w (5 failed, necessary hot-plug delay: 70/50/40/60/40ms) It looks like 70-80 ms is BSW platform needs in some bad cases of the monitors at this end (8 times delay at most). Keep less than 100ms for HDCP pulse HPD low (with at least 100ms) to respond a plug out. Reviewed-by: Cooper Chiou <cooper.chiou@intel.com> Tested-by: Gary Wang <gary.c.wang@intel.com> Cc: Gavin Hindman <gavin.hindman@intel.com> Cc: Sonika Jindal <sonika.jindal@intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Shobhit Kumar <shobhit.kumar@intel.com> Signed-off-by: Gary Wang <gary.c.wang@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1450858295-12804-1-git-send-email-gary.c.wang@intel.com Tested-by: Shobhit Kumar <shobhit.kumar@intel.com> Cc: drm-intel-fixes@lists.freedesktop.org Fixes: `237ed86c69` ("drm/i915: Check live status before reading edid") Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> (cherry picked from commit `f8d03ea005`) [Jani: undo the file mode change of the original commit] Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2015-12-30 13:58:37 +02:00
Linus Torvalds	866be88a1a	Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "9 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/vmstat: fix overflow in mod_zone_page_state() ocfs2/dlm: clear migration_pending when migration target goes down mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone() ocfs2: fix flock panic issue m32r: add io*_rep helpers m32r: fix build failure arch/x86/xen/suspend.c: include xen/xen.h mm: memcontrol: fix possible memcg leak due to interrupted reclaim ocfs2: fix BUG when calculate new backup super	2015-12-29 18:04:41 -08:00
Linus Torvalds	e25bd6ca8f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fix from Al Viro: "Fix for 3.15 breakage of fcntl64() in arm OABI compat. -stable fodder" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: [PATCH] arm: fix handling of F_OFD_... in oabi_fcntl64()	2015-12-29 17:46:29 -08:00
Heiko Carstens	6cdb18ad98	mm/vmstat: fix overflow in mod_zone_page_state() mod_zone_page_state() takes a "delta" integer argument. delta contains the number of pages that should be added or subtracted from a struct zone's vm_stat field. If a zone is larger than 8TB this will cause overflows. E.g. for a zone with a size slightly larger than 8TB the line mod_zone_page_state(zone, NR_ALLOC_BATCH, zone->managed_pages); in mm/page_alloc.c:free_area_init_core() will result in a negative result for the NR_ALLOC_BATCH entry within the zone's vm_stat, since 8TB contain 0x8xxxxxxx pages which will be sign extended to a negative value. Fix this by changing the delta argument to long type. This could fix an early boot problem seen on s390, where we have a 9TB system with only one node. ZONE_DMA contains 2GB and ZONE_NORMAL the rest. The system is trying to allocate a GFP_DMA page but ZONE_DMA is completely empty, so it tries to reclaim pages in an endless loop. This was seen on a heavily patched 3.10 kernel. One possible explaination seem to be the overflows caused by mod_zone_page_state(). Unfortunately I did not have the chance to verify that this patch actually fixes the problem, since I don't have access to the system right now. However the overflow problem does exist anyway. Given the description that a system with slightly less than 8TB does work, this seems to be a candidate for the observed problem. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-12-29 17:45:49 -08:00
xuejiufei	cc28d6d80f	ocfs2/dlm: clear migration_pending when migration target goes down We have found a BUG on res->migration_pending when migrating lock resources. The situation is as follows. dlm_mark_lockres_migration res->migration_pending = 1; __dlm_lockres_reserve_ast dlm_lockres_release_ast returns with res->migration_pending remains because other threads reserve asts wait dlm_migration_can_proceed returns 1 >>>>>>> o2hb found that target goes down and remove target from domain_map dlm_migration_can_proceed returns 1 dlm_mark_lockres_migrating returns -ESHOTDOWN with res->migration_pending still remains. When reentering dlm_mark_lockres_migrating(), it will trigger the BUG_ON with res->migration_pending. So clear migration_pending when target is down. Signed-off-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-12-29 17:45:49 -08:00
Andrew Banman	5f0f2887f4	mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone() test_pages_in_a_zone() does not account for the possibility of missing sections in the given pfn range. pfn_valid_within always returns 1 when CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing sections to pass the test, leading to a kernel oops. Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check for missing sections before proceeding into the zone-check code. This also prevents a crash from offlining memory devices with missing sections. Despite this, it may be a good idea to keep the related patch '[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with missing sections' because missing sections in a memory block may lead to other problems not covered by the scope of this fix. Signed-off-by: Andrew Banman <abanman@sgi.com> Acked-by: Alex Thorlton <athorlton@sgi.com> Cc: Russ Anderson <rja@sgi.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Greg KH <greg@kroah.com> Cc: Seth Jennings <sjennings@variantweb.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-12-29 17:45:49 -08:00
Junxiao Bi	b5a8bc338e	ocfs2: fix flock panic issue Commit `4f6563677a` ("Move locks API users to locks_lock_inode_wait()") move flock/posix lock indentify code to locks_lock_inode_wait(), but missed to set fl_flags to FL_FLOCK which caused the following kernel panic on 4.4.0_rc5. kernel BUG at fs/locks.c:1895! invalid opcode: 0000 [#1] SMP Modules linked in: ocfs2(O) ocfs2_dlmfs(O) ocfs2_stack_o2cb(O) ocfs2_dlm(O) ocfs2_nodemanager(O) ocfs2_stackglue(O) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi xen_kbdfront xen_netfront xen_fbfront xen_blkfront CPU: 0 PID: 20268 Comm: flock_unit_test Tainted: G O 4.4.0-rc5-next-20151217 #1 Hardware name: Xen HVM domU, BIOS 4.3.1OVM 05/14/2014 task: ffff88007b3672c0 ti: ffff880028b58000 task.ti: ffff880028b58000 RIP: locks_lock_inode_wait+0x2e/0x160 Call Trace: ocfs2_do_flock+0x91/0x160 [ocfs2] ocfs2_flock+0x76/0xd0 [ocfs2] SyS_flock+0x10f/0x1a0 entry_SYSCALL_64_fastpath+0x12/0x71 Code: e5 41 57 41 56 49 89 fe 41 55 41 54 53 48 89 f3 48 81 ec 88 00 00 00 8b 46 40 83 e0 03 83 f8 01 0f 84 ad 00 00 00 83 f8 02 74 04 <0f> 0b eb fe 4c 8d ad 60 ff ff ff 4c 8d 7b 58 e8 0e 8e 73 00 4d RIP locks_lock_inode_wait+0x2e/0x160 RSP <ffff880028b5bce8> ---[ end trace dfca74ec9b5b274c ]--- Fixes: `4f6563677a` ("Move locks API users to locks_lock_inode_wait()") Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-12-29 17:45:49 -08:00
Sudip Mukherjee	92a8ed4c76	m32r: add io*_rep helpers m32r allmodconfig was failing with the error: error: implicit declaration of function 'read' On checking io.h it turned out that 'read' is not defined but 'readb' is defined and 'ioread8' will then obviously mean 'readb'. At the same time some of the helper functions ioreadN_rep() and iowriteN_rep() were missing which also led to the build failure. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-12-29 17:45:49 -08:00
Sudip Mukherjee	6122192eb6	m32r: fix build failure m32r allmodconfig is failing with: In file included from ../include/linux/kvm_para.h:4:0, from ../kernel/watchdog.c:26: ../include/uapi/linux/kvm_para.h:30:26: fatal error: asm/kvm_para.h: No such file or directory kvm_para.h was not included in the build. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-12-29 17:45:49 -08:00
Andrew Morton	facca61683	arch/x86/xen/suspend.c: include xen/xen.h Fix the build warning: arch/x86/xen/suspend.c: In function 'xen_arch_pre_suspend': arch/x86/xen/suspend.c:70:9: error: implicit declaration of function 'xen_pv_domain' [-Werror=implicit-function-declaration] if (xen_pv_domain()) ^ Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-12-29 17:45:49 -08:00
Vladimir Davydov	6df38689e0	mm: memcontrol: fix possible memcg leak due to interrupted reclaim Memory cgroup reclaim can be interrupted with mem_cgroup_iter_break() once enough pages have been reclaimed, in which case, in contrast to a full round-trip over a cgroup sub-tree, the current position stored in mem_cgroup_reclaim_iter of the target cgroup does not get invalidated and so is left holding the reference to the last scanned cgroup. If the target cgroup does not get scanned again (we might have just reclaimed the last page or all processes might exit and free their memory voluntary), we will leak it, because there is nobody to put the reference held by the iterator. The problem is easy to reproduce by running the following command sequence in a loop: mkdir /sys/fs/cgroup/memory/test echo 100M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes echo $$ > /sys/fs/cgroup/memory/test/cgroup.procs memhog 150M echo $$ > /sys/fs/cgroup/memory/cgroup.procs rmdir test The cgroups generated by it will never get freed. This patch fixes this issue by making mem_cgroup_iter avoid taking reference to the current position. In order not to hit use-after-free bug while running reclaim in parallel with cgroup deletion, we make use of ->css_released cgroup callback to clear references to the dying cgroup in all reclaim iterators that might refer to it. This callback is called right before scheduling rcu work which will free css, so if we access iter->position from rcu read section, we might be sure it won't go away under us. [hannes@cmpxchg.org: clean up css ref handling] Fixes: `5ac8fb31ad` ("mm: memcontrol: convert reclaim iterator to simple css refcounting") Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> [3.19+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-12-29 17:45:49 -08:00
Joseph Qi	5c9ee4cbf2	ocfs2: fix BUG when calculate new backup super When resizing, it firstly extends the last gd. Once it should backup super in the gd, it calculates new backup super and update the corresponding value. But it currently doesn't consider the situation that the backup super is already done. And in this case, it still sets the bit in gd bitmap and then decrease from bg_free_bits_count, which leads to a corrupted gd and trigger the BUG in ocfs2_block_group_set_bits: BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits); So check whether the backup super is done and then do the updates. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-12-29 17:45:49 -08:00

1 2 3 4 5 ...

562299 Commits All Branches Search

562299 Commits

All Branches