linux

Commit Graph

Author	SHA1	Message	Date
Guenter Roeck	6793abb4e8	net: dsa: Add support for switch EEPROM access On some chips it is possible to access the switch eeprom. Add infrastructure support for it. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:11 -04:00
Guenter Roeck	87e5f66b37	net: dsa/mv88e6123_61_65: Report chip temperature MV88E6123 and compatible chips support reading the chip temperature from PHY register 6:26. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:11 -04:00
Guenter Roeck	276db3b15d	net: dsa/mv88e6352: Report chip temperature MV88E6352 supports reading the chip temperature from two PHY registers, 6:26 and 6:27. Report it using the more accurate register 6:27. Also report temperature limit and alarm. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:11 -04:00
Guenter Roeck	51579c3f1a	net: dsa: Add support for reporting switch chip temperatures Some switches provide chip temperature data. Add support for reporting it through the hwmon subsystem. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:11 -04:00
Guenter Roeck	2716777b4f	net: dsa/mv88e6352: Add support for MV88E6176 MV88E6176 is mostly compatible to MV88E6352 and is documented in the same functional specification. Add support for it. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:10 -04:00
Guenter Roeck	3ad50cca39	net: dsa: Add support for Marvell 88E6352 Marvell 88E6352 is mostly compatible to MV88E6123/61/65, but requires indirect phy access. Also, its configuration registers are a bit different. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:10 -04:00
Guenter Roeck	a93e464a45	net: dsa: Report known silicon revisions for Marvell 88E6131 Report known silicon revisions when probing Marvell 88E6131 switches. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:10 -04:00
Guenter Roeck	3de6aa4c35	net: dsa: Report known silicon revisions for Marvell 88E6060 Report known silicon revisions when probing Marvell 88E6060 switches. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:10 -04:00
Guenter Roeck	734cbb5b6b	net: dsa: Don't set skb->protocol on outgoing tagged packets Setting skb->protocol to a private protocol type may result in warning messages such as e1000e 0000:00:19.0 em1: checksum_partial proto=dada! This happens if the L3 protocol is IP or IPv6 and skb->ip_summed is set to CHECKSUM_PARTIAL. Looking through the code, it appears that changing skb->protocol for transmitted packets is not necessary and may actually be harmful. For example, it prevents purposely unmodified (from a DSA perspective) network drivers from properly setting up their transmit checksum offload pointers since they inspect skb->protocol to set up the IPv4 header or IPv6 header pointers. So don't unnecessarily change the protocol field. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:10 -04:00
Greg Kroah-Hartman	d8e7d53a2f	PCI: Rename sysfs 'enabled' file back to 'enable' Back in commit `5136b2da77` ("PCI: convert bus code to use dev_groups"), I misstyped the 'enable' sysfs filename as 'enabled', which broke the userspace API. This patch fixes that issue by renaming the file back. Fixes: `5136b2da77` ("PCI: convert bus code to use dev_groups") Reported-by: Jeff Epler <jepler@unpythonic.net> Tested-by: Jeff Epler <jepler@unpythonic.net> # on v3.14-rt Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org # 3.13	2014-10-30 11:17:10 -06:00
Linus Torvalds	3a2f22b7d0	fbdev fixes for 3.18 * Fix fb console option parsing * Fixes for OMAPDSS/OMAPFB crashes related to module unloading and device/driver binding & unbinding. * Fix for OMAP HDMI PLL locking failing in certain cases * Misc minor fixes for atmel lcdfb and OMAP -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUUjoFAAoJEPo9qoy8lh71SaoP/jJR3kggzXGG47NmRp9ztzv7 HYWydxfE5OcEGXl69CqZwdeL9aT5Pgub3L2SJGH9X7aDJT/Qvo/dSir9tt9DoYO/ DreOr+opDinysdfZmbvd5+M9Or/j3iO2pyJZixrsrFmer1PiFsvY0Pqfu5MJI/xr oZOVJuxVyCF1/s8gcxSCNP4ntKY1cIacnDjPs1FxAaTeLK2478Iar/HdLGCS/1P0 mG/dnw2/vOT3P6hMnWsxgxVRk7tY1nLqIs9Z8v3fHIsJyuFlO0nB89WPFPoXZqw+ OMT7F/5KOH4rBtKVUx2JsGdFllixF1vmSRdRdlhIwVEZ5jloTCgUEmYAFnIBFufs 3Ui5GNKQ1JsEIkinFzm/h1AKGKP25hr7xQc8FNtaBWr7kp9dBZN6fD1fI2Ii+1eD hQNXzCdz1dSeEy4vvbLoV7LMvMl4LLEwKfE/42ISy2PQXGZG0KJgewf24x3n6hbS rcUsZTOm9s7CcHonPORUKJL+X5c8chLQkTApT01YlvdQPGdd9phFEOt0H8147GBf cHkMXyMcY9U/xABf9lb8raTzBQw0bUN/S5QCst+i6moPD6jOPfiJuNDZjx+X+chl FurJc38sJglx5mk3RzYhHdsFoDdYsPZAVWaS2Bg3cwdfYVlewgxuICE5RkAfOzqP P4fBow49gz8pIk6dkw1b =1WfV -----END PGP SIGNATURE----- Merge tag 'fbdev-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux Pull fbdev fixes from Tomi Valkeinen: - fix fb console option parsing - fixes for OMAPDSS/OMAPFB crashes related to module unloading and device/driver binding & unbinding. - fix for OMAP HDMI PLL locking failing in certain cases - misc minor fixes for atmel lcdfb and OMAP * tag 'fbdev-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: omap: dss: connector-analog-tv: Add missing module device table OMAPDSS: DSI: Fix PLL_SELFEQDCO field width OMAPDSS: fix dispc register dump for preload & mflag OMAPDSS: DISPC: fix mflag offset OMAPDSS: HDMI: fix regsd write OMAPDSS: HDMI: fix PLL GO bit handling OMAPFB: fix releasing overlays OMAPFB: fix overlay disable when freeing resources. OMAPDSS: apply: wait pending updates on manager disable OMAPFB: remove __exit annotation OMAPDSS: set suppress_bind_attrs OMAPFB: add missing MODULE_ALIAS() drivers: video: fbdev: atmel_lcdfb.c: remove unnecessary header video/console: Resolve several shadow warnings fbcon: Fix option parsing control flow in fb_console_setup	2014-10-30 09:34:35 -07:00
Linus Torvalds	94712927d0	sound fixes for 3.18-rc3 Although the diffstat looks scary, it's just because of the removal of the dead code (s6000), thus it must not affect anything serious. Other than that, all small fixes. The only core fix is zero-clear for a PCM compat ioctl. The rest are driver-specific, bebob, sgtl500, adau1761, intel-sst, ad1889 and a few HD-audio quirks as usual. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJUURWjAAoJEGwxgFQ9KSmkWiEP/R2n/jskNVH1Pk/9BwTpo+ND YsH1Ysni5+QfiEh4R+FJJxnaBS2nbXYGbCMG9qIoaEmkMkujMn7eoyq0QLoZVkVs tB4rEFkPwPKpZJjsPEwzgOk06FQR04qU2PbovA5dijC0mtLaaVwp5Hd/Xfvv09vQ ts3y+yA4Y7DpdIwJ727c9aeEALuZAdfAgqS755+ZDBXbY0uefEIXfV2eSj4Fuuis nWjMPdlfLAaBrwOa+h62mHkzw5jARGvEWlPC8Q9v+Z1liAJa+aDI1Pj7Ctr4NaFI DL3I5UigETjPAll90z9F0qdwV6Z8kmppw1uoZntt2OwFfyo37675hJX8AbHTjik8 BXVKwQeJkgwufHgHQBen80MQh59Whn34dJ9qKYHM5Xa+v/ex2ACxeJQIa3+1nxcd jbfszGlI2PZNZCNtYBxtcSpiXu0Sdfhfy16v4V6CCZ6kQ9hj7GQm9HrB9NrFAHED zWFbCf5ouu5M3i2j6VKLGmP0f6I/Mxb+4Rvw0KzlKKnM1eqFK/Ttqn0aHJXaTEb9 AoHbzlAgES1Lo8Ftg/xbX4J8NcAMQRkDUYpiSELFc/+LQ3rNU+Bcgf5aWa/rVnII PnClako0XPSQcRvvhVOMcJzuj0YLCr5TMtmlbDCDgPm39pgCkUoKhnJHeJ/IbCRG Ixk33L41prRJF3Yuw1CJ =1RIC -----END PGP SIGNATURE----- Merge tag 'sound-3.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Although the diffstat looks scary, it's just because of the removal of the dead code (s6000), thus it must not affect anything serious. Other than that, all small fixes. The only core fix is zero-clear for a PCM compat ioctl. The rest are driver-specific, bebob, sgtl500, adau1761, intel-sst, ad1889 and a few HD-audio quirks as usual" * tag 'sound-3.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Add workaround for CMI8888 snoop behavior ALSA: pcm: Zero-clear reserved fields of PCM status ioctl in compat mode ALSA: bebob: Uninitialized id returned by saffirepro_both_clk_src_get ALSA: hda/realtek - New SSID for Headset quirk ALSA: ad1889: Fix probable mask then right shift defects ALSA: bebob: fix wrong decoding of clock information for Terratec PHASE 88 Rack FW ALSA: hda/realtek - Update restore default value for ALC283 ALSA: hda/realtek - Update restore default value for ALC282 ASoC: fsl: use strncpy() to prevent copying of over-long names ASoC: adau1761: Fix input PGA volume ASoC: s6000: remove driver ASoC: Intel: HSW/BDW only support S16 and S24 formats. ASoC: sgtl500: Document the required supplies	2014-10-30 09:11:38 -07:00
Jan Kara	ae9e9c6aee	ext4: make ext4_ext_convert_to_initialized() return proper number of blocks ext4_ext_convert_to_initialized() can return more blocks than are actually allocated from map->m_lblk in case where initial part of the on-disk extent is zeroed out. Luckily this doesn't have serious consequences because the caller currently uses the return value only to unmap metadata buffers. Anyway this is a data corruption/exposure problem waiting to happen so fix it. Coverity-id: 1226848 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-10-30 10:53:17 -04:00
Jan Kara	4f879ca687	ext4: bail early when clearing inode journal flag fails When clearing inode journal flag, we call jbd2_journal_flush() to force all the journalled data to their final locations. Currently we ignore when this fails and continue clearing inode journal flag. This isn't a big problem because when jbd2_journal_flush() fails, journal is likely aborted anyway. But it can still lead to somewhat confusing results so rather bail out early. Coverity-id: 989044 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-10-30 10:53:17 -04:00
Jan Kara	6050d47adc	ext4: bail out from make_indexed_dir() on first error When ext4_handle_dirty_dx_node() or ext4_handle_dirty_dirent_node() fail, there's really something wrong with the fs and there's no point in continuing further. Just return error from make_indexed_dir() in that case. Also initialize frames array so that if we return early due to error, dx_release() doesn't try to dereference uninitialized memory (which could happen also due to error in do_split()). Coverity-id: 741300 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org	2014-10-30 10:53:17 -04:00
Theodore Ts'o	d48458d4a7	jbd2: use a better hash function for the revoke table The old hash function didn't work well for 64-bit block numbers, and used undefined (negative) shift right behavior. Use the generic 64-bit hash function instead. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Andrey Ryabinin <a.ryabinin@samsung.com>	2014-10-30 10:53:17 -04:00
Dmitry Monakhov	a41537e69b	ext4: prevent bugon on race between write/fcntl O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked twice inside ext4_file_write_iter() and __generic_file_write() which result in BUG_ON inside ext4_direct_IO. Let's initialize iocb->private unconditionally. TESTCASE: xfstest:generic/036 https://patchwork.ozlabs.org/patch/402445/ #TYPICAL STACK TRACE: kernel BUG at fs/ext4/inode.c:2960! invalid opcode: 0000 [#1] SMP Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod CPU: 6 PID: 5505 Comm: aio-dio-fcntl-r Not tainted 3.17.0-rc2-00176-gff5c017 #161 Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011 task: ffff88080e95a7c0 ti: ffff88080f908000 task.ti: ffff88080f908000 RIP: 0010:[<ffffffff811fabf2>] [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0 RSP: 0018:ffff88080f90bb58 EFLAGS: 00010246 RAX: 0000000000000400 RBX: ffff88080fdb2a28 RCX: 00000000a802c818 RDX: 0000040000080000 RSI: ffff88080d8aeb80 RDI: 0000000000000001 RBP: ffff88080f90bbc8 R08: 0000000000000000 R09: 0000000000001581 R10: 0000000000000000 R11: 0000000000000000 R12: ffff88080d8aeb80 R13: ffff88080f90bbf8 R14: ffff88080fdb28c8 R15: ffff88080fdb2a28 FS: 00007f23b2055700(0000) GS:ffff880818400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f23b2045000 CR3: 000000080cedf000 CR4: 00000000000407e0 Stack: ffff88080f90bb98 0000000000000000 7ffffffffffffffe ffff88080fdb2c30 0000000000000200 0000000000000200 0000000000000001 0000000000000200 ffff88080f90bbc8 ffff88080fdb2c30 ffff88080f90be08 0000000000000200 Call Trace: [<ffffffff8112ca9d>] generic_file_direct_write+0xed/0x180 [<ffffffff8112f2b2>] __generic_file_write_iter+0x222/0x370 [<ffffffff811f495b>] ext4_file_write_iter+0x34b/0x400 [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410 [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410 [<ffffffff810990e5>] ? local_clock+0x25/0x30 [<ffffffff810abd94>] ? __lock_acquire+0x274/0x700 [<ffffffff811f4610>] ? ext4_unwritten_wait+0xb0/0xb0 [<ffffffff811bd756>] aio_run_iocb+0x286/0x410 [<ffffffff810990e5>] ? local_clock+0x25/0x30 [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190 [<ffffffff811bc05b>] ? lookup_ioctx+0x4b/0xf0 [<ffffffff811bde3b>] do_io_submit+0x55b/0x740 [<ffffffff811bdcaa>] ? do_io_submit+0x3ca/0x740 [<ffffffff811be030>] SyS_io_submit+0x10/0x20 [<ffffffff815ce192>] system_call_fastpath+0x16/0x1b Code: 01 48 8b 80 f0 01 00 00 48 8b 18 49 8b 45 10 0f 85 f1 01 00 00 48 03 45 c8 48 3b 43 48 0f 8f e3 01 00 00 49 83 7c 24 18 00 75 04 <0f> 0b eb fe f0 ff 83 ec 01 00 00 49 8b 44 24 18 8b 00 85 c0 89 RIP [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0 RSP <ffff88080f90bb58> Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Cc: stable@vger.kernel.org	2014-10-30 10:53:16 -04:00
Darrick J. Wong	50460fe8c6	ext4: remove extent status procfs files if journal load fails If we can't load the journal, remove the procfs files for the extent status information file to avoid leaking resources. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org	2014-10-30 10:53:16 -04:00
Darrick J. Wong	6b992ff256	ext4: disallow changing journal_csum option during remount ext4 does not permit changing the metadata or journal checksum feature flag while mounted. Until we decide to support that, don't allow a remount to change the journal_csum flag (right now we silently fail to change anything). Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-10-30 10:53:16 -04:00
Darrick J. Wong	98c1a7593f	ext4: enable journal checksum when metadata checksum feature enabled If metadata checksumming is turned on for the FS, we need to tell the journal to use checksumming too. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org	2014-10-30 10:53:16 -04:00
Jan Kara	599a9b77ab	ext4: fix oops when loading block bitmap failed When we fail to load block bitmap in __ext4_new_inode() we will dereference NULL pointer in ext4_journal_get_write_access(). So check for error from ext4_read_block_bitmap(). Coverity-id: 989065 Cc: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-10-30 10:53:16 -04:00
Jan Kara	9378c6768e	ext4: fix overflow when updating superblock backups after resize When there are no meta block groups update_backups() will compute the backup block in 32-bit arithmetics thus possibly overflowing the block number and corrupting the filesystem. OTOH filesystems without meta block groups larger than 16 TB should be rare. Fix the problem by doing the counting in 64-bit arithmetics. Coverity-id: 741252 CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com>	2014-10-30 10:52:57 -04:00
Tomi Valkeinen	a942535d6e	Merge branch '3.18/omapdss-fixes' into 3.18/fbdev-fixes	2014-10-30 14:53:49 +02:00
Marek Belisko	4ee9d9d2c2	omap: dss: connector-analog-tv: Add missing module device table Without that fix connector-analog-tv driver isn't probed when compiled as module. Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>	2014-10-30 14:51:59 +02:00
Emil Tantilov	e3215f0ac7	ixgbe: fix race when setting advertised speed Following commands: modprobe ixgbe ifconfig ethX up ethtool -s ethX advertise 0x020 can lead to "setup link failed with code -14" error due to the setup_link call racing with the SFP detection routine in the watchdog. This patch resolves this issue by protecting the setup_link call with check for __IXGBE_IN_SFP_INIT. Reported-by: Scott Harrison <scoharr2@cisco.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-10-30 05:12:07 -07:00
Junwei Zhang	4d2fcfbcf8	ixgbe: need not repeat init skb with NULL Signed-off-by: Martin Zhang <martinbj2008@gmail.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-10-30 05:04:39 -07:00
Roman Gushchin	bc16e47f03	igb: don't reuse pages with pfmemalloc flag Incoming packet is dropped silently by sk_filter(), if the skb was allocated from pfmemalloc reserves and the corresponding socket is not marked with the SOCK_MEMALLOC flag. Igb driver allocates pages for DMA with __skb_alloc_page(), which calls alloc_pages_node() with the __GFP_MEMALLOC flag. So, in case of OOM condition, igb can get pages with pfmemalloc flag set. If an incoming packet hits the pfmemalloc page and is large enough (small packets are copying into the memory, allocated with netdev_alloc_skb_ip_align(), so they are not affected), it will be dropped. This behavior is ok under high memory pressure, but the problem is that the igb driver reuses these mapped pages. So, packets are still dropping even if all memory issues are gone and there is a plenty of free memory. In my case, some TCP sessions hang on a small percentage (< 0.1%) of machines days after OOMs. Fix this by avoiding reuse of such pages. Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> Tested-by: Aaron Brown "aaron.f.brown@intel.com" Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-10-30 04:56:52 -07:00
Francesco Ruggeri	a22bb0b9b9	e1000: unset IFF_UNICAST_FLT on WMware 82545EM VMWare's e1000 implementation does not seem to support unicast filtering. This can be observed by configuring a macvlan interface on eth0 in a VM in VMWare Fusion 5.0.5, and trying to use that interface instead of eth0. Tested on 3.16. Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-10-30 04:47:39 -07:00
Ingo Molnar	21ee24bf5b	Merge branch 'urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/urgent Pull two RCU fixes from Paul E. McKenney: " - Complete the work of commit `dd56af42bd` (rcu: Eliminate deadlock between CPU hotplug and expedited grace periods), which was intended to allow synchronize_sched_expedited() to be safely used when holding locks acquired by CPU-hotplug notifiers. This commit makes the put_online_cpus() avoid the deadlock instead of just handling the get_online_cpus(). - Complete the work of commit `35ce7f29a4` (rcu: Create rcuo kthreads only for onlined CPUs), which was intended to allow RCU to avoid allocating unneeded kthreads on systems where the firmware says that there are more CPUs than are really present. This commit makes rcu_barrier() aware of the mismatch, so that it doesn't hang waiting for non-existent CPUs. " Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-10-30 07:37:37 +01:00
Ingo Molnar	d785452c99	perf/urgent fixes: User visible: * Fix report -F (abort, in_tx, mispredict, etc) segfaults for sample.data files without branch info (Jiri Olsa) * Add patch that should have went in a previous patchkit to use global cache provided by libunwind (Namhyung Kim) * Make CPUINFO_PROC an array to support different kernels, problem detected when the information reported via /proc/cpuinfo changed on ARM (Wang Nan) * 'perf probe' --demangle typo fix and a new --quiet option (Masami Hiramatsu) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJUUOC5AAoJEBpxZoYYoA71wRYH/A5DhJq50X4KhHAr/RLcEn3T HNFUGrQKChuczZ11JM9dZiCHN8TXTdG2ql71SFXzAQ96TP7LwWVcHlAQQV7CIwQM jGFwSqPrDK2pD8EhQoOEChobGpNDAUPZclUUwCdw22cvoig0xdfjQjpNehAsxyzI jgl6wb6hF/lHuoYXFxijrV3RbGcnNVUCHnjyhKD6WTplq3EvT16hsLFaM9t/r/p/ wL7wB5Sz8IUUSCWaq63u39spGwy+qH+bgZ9fyaccxCMhBPQ5+Eo9moHIfnjwuxbP d8l7pDnCMXGiaMO1Jt3BH1C3mQr3HNFCxxrgQOOLM9VVmxi5WPcmu945tC354uw= =Klbg -----END PGP SIGNATURE----- Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Fix report -F (abort, in_tx, mispredict, etc) segfaults for sample.data files without branch info (Jiri Olsa) - Add patch that should have went in a previous patchkit to use global cache provided by libunwind (Namhyung Kim) - Make CPUINFO_PROC an array to support different kernels, problem detected when the information reported via /proc/cpuinfo changed on ARM (Wang Nan) - 'perf probe' --demangle typo fix and a new --quiet option (Masami Hiramatsu) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-10-30 07:32:34 +01:00
Linus Torvalds	a7ca10f263	Merge branch 'akpm' (incoming from Andrew Morton) Merge misc fixes from Andrew Morton: "21 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits) mm/balloon_compaction: fix deflation when compaction is disabled sh: fix sh770x SCIF memory regions zram: avoid NULL pointer access in concurrent situation mm/slab_common: don't check for duplicate cache names ocfs2: fix d_splice_alias() return code checking mm: rmap: split out page_remove_file_rmap() mm: memcontrol: fix missed end-writeback page accounting mm: page-writeback: inline account_page_dirtied() into single caller lib/bitmap.c: fix undefined shift in __bitmap_shift_{left\|right}() drivers/rtc/rtc-bq32k.c: fix register value memory-hotplug: clear pgdat which is allocated by bootmem in try_offline_node() drivers/rtc/rtc-s3c.c: fix initialization failure without rtc source clock kernel/kmod: fix use-after-free of the sub_info structure drivers/rtc/rtc-pm8xxx.c: rework to support pm8941 rtc mm, thp: fix collapsing of hugepages on madvise drivers: of: add return value to of_reserved_mem_device_init() mm: free compound page with correct order gcov: add ARM64 to GCOV_PROFILE_ALL fsnotify: next_i is freed during fsnotify_unmount_inodes. mm/compaction.c: avoid premature range skip in isolate_migratepages_range ...	2014-10-29 16:38:48 -07:00
Konstantin Khlebnikov	4d88e6f7d5	mm/balloon_compaction: fix deflation when compaction is disabled If CONFIG_BALLOON_COMPACTION=n balloon_page_insert() does not link pages with balloon and doesn't set PagePrivate flag, as a result balloon_page_dequeue() cannot get any pages because it thinks that all of them are isolated. Without balloon compaction nobody can isolate ballooned pages. It's safe to remove this check. Fixes: `d6d86c0a7f` ("mm/balloon_compaction: redesign ballooned pages management"). Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com> Reported-by: Matt Mullins <mmullins@mmlx.us> Cc: <stable@vger.kernel.org> [3.17] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:15 -07:00
Andriy Skulysh	5417421b27	sh: fix sh770x SCIF memory regions Resources scif1_resources & scif2_resources overlap. Actual SCIF region size is 0x10. This is regression from commit `d850acf975` ("sh: Declare SCIF register base and IRQ as resources") Signed-off-by: Andriy Skulysh <askulysh@gmail.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:15 -07:00
Weijie Yang	5a99e95b8d	zram: avoid NULL pointer access in concurrent situation There is a rare NULL pointer bug in mem_used_total_show() and mem_used_max_store() in concurrent situation, like this: zram is not initialized, process A is a mem_used_total reader which runs periodically, while process B try to init zram. process A process B access meta, get a NULL value init zram, done init_done() is true access meta->mem_pool, get a NULL pointer BUG This patch fixes this issue. Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:15 -07:00
Mikulas Patocka	8aba7e0a2c	mm/slab_common: don't check for duplicate cache names The SLUB cache merges caches with the same size and alignment and there was long standing bug with this behavior: - create the cache named "foo" - create the cache named "bar" (which is merged with "foo") - delete the cache named "foo" (but it stays allocated because "bar" uses it) - create the cache named "foo" again - it fails because the name "foo" is already used That bug was fixed in commit `694617474e` ("slab_common: fix the check for duplicate slab names") by not warning on duplicate cache names when the SLUB subsystem is used. Recently, cache merging was implemented the with SLAB subsystem too, in `12220dea07` ("mm/slab: support slab merge")). Therefore we need stop checking for duplicate names even for the SLAB subsystem. This patch fixes the bug by removing the check. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:15 -07:00
Richard Weinberger	d3556babd7	ocfs2: fix d_splice_alias() return code checking d_splice_alias() can return a valid dentry, NULL or an ERR_PTR. Currently the code checks not for ERR_PTR and will cuase an oops in ocfs2_dentry_attach_lock(). Fix this by using IS_ERR_OR_NULL(). Signed-off-by: Richard Weinberger <richard@nod.at> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:15 -07:00
Johannes Weiner	8186eb6a79	mm: rmap: split out page_remove_file_rmap() page_remove_rmap() has too many branches on PageAnon() and is hard to follow. Move the file part into a separate function. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:15 -07:00
Johannes Weiner	d7365e783e	mm: memcontrol: fix missed end-writeback page accounting Commit `0a31bc97c8` ("mm: memcontrol: rewrite uncharge API") changed page migration to uncharge the old page right away. The page is locked, unmapped, truncated, and off the LRU, but it could race with writeback ending, which then doesn't unaccount the page properly: test_clear_page_writeback() migration wait_on_page_writeback() TestClearPageWriteback() mem_cgroup_migrate() clear PCG_USED mem_cgroup_update_page_stat() if (PageCgroupUsed(pc)) decrease memcg pages under writeback release pc->mem_cgroup->move_lock The per-page statistics interface is heavily optimized to avoid a function call and a lookup_page_cgroup() in the file unmap fast path, which means it doesn't verify whether a page is still charged before clearing PageWriteback() and it has to do it in the stat update later. Rework it so that it looks up the page's memcg once at the beginning of the transaction and then uses it throughout. The charge will be verified before clearing PageWriteback() and migration can't uncharge the page as long as that is still set. The RCU lock will protect the memcg past uncharge. As far as losing the optimization goes, the following test results are from a microbenchmark that maps, faults, and unmaps a 4GB sparse file three times in a nested fashion, so that there are two negative passes that don't account but still go through the new transaction overhead. There is no actual difference: old: 33.195102545 seconds time elapsed ( +- 0.01% ) new: 33.199231369 seconds time elapsed ( +- 0.03% ) The time spent in page_remove_rmap()'s callees still adds up to the same, but the time spent in the function itself seems reduced: # Children Self Command Shared Object Symbol old: 0.12% 0.11% filemapstress [kernel.kallsyms] [k] page_remove_rmap new: 0.12% 0.08% filemapstress [kernel.kallsyms] [k] page_remove_rmap Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: <stable@vger.kernel.org> [3.17.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:15 -07:00
Johannes Weiner	3a3c02ecf7	mm: page-writeback: inline account_page_dirtied() into single caller A follow-up patch would have changed the call signature. To save the trouble, just fold it instead. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: <stable@vger.kernel.org> [3.17.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:14 -07:00
Jan Kara	ea5d05b34a	lib/bitmap.c: fix undefined shift in __bitmap_shift_{left\|right}() If __bitmap_shift_left() or __bitmap_shift_right() are asked to shift by a multiple of BITS_PER_LONG, they will try to shift a long value by BITS_PER_LONG bits which is undefined. Change the functions to avoid the undefined shift. Coverity id: 1192175 Coverity id: 1192174 Signed-off-by: Jan Kara <jack@suse.cz> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:14 -07:00
Pavel Machek	5a6e7599d3	drivers/rtc/rtc-bq32k.c: fix register value Fix register value in bq32000 trickle charging. Mike reported that I'm using wrong value in one trickle-charging case, and after checking docs, I must admit he's right. Signed-off-by: Pavel Machek <pavel@denx.de> Reported-by: Mike Bremford <mike@bfo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:14 -07:00
Yasuaki Ishimatsu	35dca71c1f	memory-hotplug: clear pgdat which is allocated by bootmem in try_offline_node() When hot adding the same memory after hot removal, the following messages are shown: WARNING: CPU: 20 PID: 6 at mm/page_alloc.c:4968 free_area_init_node+0x3fe/0x426() ... Call Trace: dump_stack+0x46/0x58 warn_slowpath_common+0x81/0xa0 warn_slowpath_null+0x1a/0x20 free_area_init_node+0x3fe/0x426 hotadd_new_pgdat+0x90/0x110 add_memory+0xd4/0x200 acpi_memory_device_add+0x1aa/0x289 acpi_bus_attach+0xfd/0x204 acpi_bus_attach+0x178/0x204 acpi_bus_scan+0x6a/0x90 acpi_device_hotplug+0xe8/0x418 acpi_hotplug_work_fn+0x1f/0x2b process_one_work+0x14e/0x3f0 worker_thread+0x11b/0x510 kthread+0xe1/0x100 ret_from_fork+0x7c/0xb0 The detaled explanation is as follows: When hot removing memory, pgdat is set to 0 in try_offline_node(). But if the pgdat is allocated by bootmem allocator, the clearing step is skipped. And when hot adding the same memory, the uninitialized pgdat is reused. But free_area_init_node() checks wether pgdat is set to zero. As a result, free_area_init_node() hits WARN_ON(). This patch clears pgdat which is allocated by bootmem allocator in try_offline_node(). Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Zhang Zhen <zhenzhang.zhang@huawei.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Toshi Kani <toshi.kani@hp.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:14 -07:00
Marek Szyprowski	eaf3a65908	drivers/rtc/rtc-s3c.c: fix initialization failure without rtc source clock Fix unconditional initialization failure on non-exynos3250 SoCs. Commit `df9e26d093` ("rtc: s3c: add support for RTC of Exynos3250 SoC") introduced rtc source clock support, but also added initialization failure on SoCs, which doesn't need such clock. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:14 -07:00
Martin Schwidefsky	0baf2a4dbf	kernel/kmod: fix use-after-free of the sub_info structure Found this in the message log on a s390 system: BUG kmalloc-192 (Not tainted): Poison overwritten Disabling lock debugging due to kernel taint INFO: 0x00000000684761f4-0x00000000684761f7. First byte 0xff instead of 0x6b INFO: Allocated in call_usermodehelper_setup+0x70/0x128 age=71 cpu=2 pid=648 __slab_alloc.isra.47.constprop.56+0x5f6/0x658 kmem_cache_alloc_trace+0x106/0x408 call_usermodehelper_setup+0x70/0x128 call_usermodehelper+0x62/0x90 cgroup_release_agent+0x178/0x1c0 process_one_work+0x36e/0x680 worker_thread+0x2f0/0x4f8 kthread+0x10a/0x120 kernel_thread_starter+0x6/0xc kernel_thread_starter+0x0/0xc INFO: Freed in call_usermodehelper_exec+0x110/0x1b8 age=71 cpu=2 pid=648 __slab_free+0x94/0x560 kfree+0x364/0x3e0 call_usermodehelper_exec+0x110/0x1b8 cgroup_release_agent+0x178/0x1c0 process_one_work+0x36e/0x680 worker_thread+0x2f0/0x4f8 kthread+0x10a/0x120 kernel_thread_starter+0x6/0xc kernel_thread_starter+0x0/0xc There is a use-after-free bug on the subprocess_info structure allocated by the user mode helper. In case do_execve() returns with an error ____call_usermodehelper() stores the error code to sub_info->retval, but sub_info can already have been freed. Regarding UMH_NO_WAIT, the sub_info structure can be freed by __call_usermodehelper() before the worker thread returns from do_execve(), allowing memory corruption when do_execve() failed after exec_mmap() is called. Regarding UMH_WAIT_EXEC, the call to umh_complete() allows call_usermodehelper_exec() to continue which then frees sub_info. To fix this race the code needs to make sure that the call to call_usermodehelper_freeinfo() is always done after the last store to sub_info->retval. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:14 -07:00
Stanimir Varbanov	c8d523a4b0	drivers/rtc/rtc-pm8xxx.c: rework to support pm8941 rtc Adds support for RTC device inside PM8941 PMIC. The RTC in this PMIC have two register spaces. Thus the rtc-pm8xxx is slightly reworked to reflect these differences. The register set for different PMIC chips are selected on DT compatible string base. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: simplify and fix locking in pm8xxx_rtc_set_time()] Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Josh Cartwright <joshc@codeaurora.org> Cc: Stanimir Varbanov <svarbanov@mm-sol.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:14 -07:00
David Rientjes	6d50e60cd2	mm, thp: fix collapsing of hugepages on madvise If an anonymous mapping is not allowed to fault thp memory and then madvise(MADV_HUGEPAGE) is used after fault, khugepaged will never collapse this memory into thp memory. This occurs because the madvise(2) handler for thp, hugepage_madvise(), clears VM_NOHUGEPAGE on the stack and it isn't stored in vma->vm_flags until the final action of madvise_behavior(). This causes the khugepaged_enter_vma_merge() to be a no-op in hugepage_madvise() when the vma had previously had VM_NOHUGEPAGE set. Fix this by passing the correct vma flags to the khugepaged mm slot handler. There's no chance khugepaged can run on this vma until after madvise_behavior() returns since we hold mm->mmap_sem. It would be possible to clear VM_NOHUGEPAGE directly from vma->vm_flags in hugepage_advise(), but I didn't want to introduce special case behavior into madvise_behavior(). I think it's best to just let it always set vma->vm_flags itself. Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Suleiman Souhlal <suleiman@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:14 -07:00
Marek Szyprowski	47f29df7db	drivers: of: add return value to of_reserved_mem_device_init() Driver calling of_reserved_mem_device_init() might be interested if the initialization has been successful or not, so add support for returning error code. This fixes a build warining caused by commit `7bfa5ab6fa` ("drivers: dma-coherent: add initialization from device tree"), which has been merged without this change and without fixing function return value. Fixes: `7bfa5ab6fa` ("drivers: dma-coherent: add initialization from device tree") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Josh Cartwright <joshc@codeaurora.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:14 -07:00
Yu Zhao	5ddacbe92b	mm: free compound page with correct order Compound page should be freed by put_page() or free_pages() with correct order. Not doing so will cause tail pages leaked. The compound order can be obtained by compound_order() or use HPAGE_PMD_ORDER in our case. Some people would argue the latter is faster but I prefer the former which is more general. This bug was observed not just on our servers (the worst case we saw is 11G leaked on a 48G machine) but also on our workstations running Ubuntu based distro. $ cat /proc/vmstat \| grep thp_zero_page_alloc thp_zero_page_alloc 55 thp_zero_page_alloc_failed 0 This means there is (thp_zero_page_alloc - 1) * (2M - 4K) memory leaked. Fixes: `97ae17497e` ("thp: implement refcounting for huge zero page") Signed-off-by: Yu Zhao <yuzhao@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: David Rientjes <rientjes@google.com> Cc: Bob Liu <lliubbo@gmail.com> Cc: <stable@vger.kernel.org> [3.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:14 -07:00
Riku Voipio	f601de2044	gcov: add ARM64 to GCOV_PROFILE_ALL Following up the arm testing of gcov, turns out gcov on ARM64 works fine as well. Only change needed is adding ARM64 to Kconfig depends. Tested with qemu and mach-virt Signed-off-by: Riku Voipio <riku.voipio@linaro.org> Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:14 -07:00
Jerry Hoemann	6424babfd6	fsnotify: next_i is freed during fsnotify_unmount_inodes. During file system stress testing on 3.10 and 3.12 based kernels, the umount command occasionally hung in fsnotify_unmount_inodes in the section of code: spin_lock(&inode->i_lock); if (inode->i_state & (I_FREEING\|I_WILL_FREE\|I_NEW)) { spin_unlock(&inode->i_lock); continue; } As this section of code holds the global inode_sb_list_lock, eventually the system hangs trying to acquire the lock. Multiple crash dumps showed: The inode->i_state == 0x60 and i_count == 0 and i_sb_list would point back at itself. As this is not the value of list upon entry to the function, the kernel never exits the loop. To help narrow down problem, the call to list_del_init in inode_sb_list_del was changed to list_del. This poisons the pointers in the i_sb_list and causes a kernel to panic if it transverse a freed inode. Subsequent stress testing paniced in fsnotify_unmount_inodes at the bottom of the list_for_each_entry_safe loop showing next_i had become free. We believe the root cause of the problem is that next_i is being freed during the window of time that the list_for_each_entry_safe loop temporarily releases inode_sb_list_lock to call fsnotify and fsnotify_inode_delete. The code in fsnotify_unmount_inodes attempts to prevent the freeing of inode and next_i by calling __iget. However, the code doesn't do the __iget call on next_i if i_count == 0 or if i_state & (I_FREEING \| I_WILL_FREE) The patch addresses this issue by advancing next_i in the above two cases until we either find a next_i which we can __iget or we reach the end of the list. This makes the handling of next_i more closely match the handling of the variable "inode." The time to reproduce the hang is highly variable (from hours to days.) We ran the stress test on a 3.10 kernel with the proposed patch for a week without failure. During list_for_each_entry_safe, next_i is becoming free causing the loop to never terminate. Advance next_i in those cases where __iget is not done. Signed-off-by: Jerry Hoemann <jerry.hoemann@hp.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ken Helias <kenhelias@firemail.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-29 16:33:14 -07:00

... 2 3 4 5 6 ...

481397 Commits All Branches Search

481397 Commits

All Branches