Merge tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping
Pull DMA mapping updates from Christoph Hellwig:
"A huge update this time, but a lot of that is just consolidating or
removing code:
- provide a common DMA_MAPPING_ERROR definition and avoid indirect
calls for dma_map_* error checking (a sketch of the new convention
follows this list)
- use direct calls for the DMA direct mapping case, avoiding huge
retpoline overhead for high performance workloads
- merge the swiotlb dma_map_ops into dma-direct
- provide a generic remapping DMA consistent allocator for
architectures that have devices that perform DMA that is not cache
coherent. Based on the existing arm64 implementation and also used
for csky now.
- improve the dma-debug infrastructure, including dynamic allocation
of entries (Robin Murphy)
- default to providing chaining scatterlist everywhere, with opt-outs
for the few architectures (alpha, parisc, most arm32 variants) that
can't cope with it
- misc sparc32 dma-related cleanups
- remove the dma_mark_clean arch hook used by swiotlb on ia64 and
replace it with the generic noncoherent infrastructure
- fix the return type of dma_set_max_seg_size (Niklas Söderlund)
- move the dummy dma ops for non-DMA-capable devices from arm64 to
common code (Robin Murphy)
- ensure dma_alloc_coherent returns zeroed memory to avoid kernel
data leaks through userspace. We already did this for most common
architectures, but this ensures we do it everywhere.
dma_zalloc_coherent has been deprecated and can hopefully be
removed after -rc1 with a coccinelle script"
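As an illustration of the first item above, error checking presumably
reduces to a comparison against a single sentinel instead of an indirect
->mapping_error() call; a sketch of the convention, not the exact
kernel code:

#define DMA_MAPPING_ERROR	(~(dma_addr_t)0)

static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
{
	return dma_addr == DMA_MAPPING_ERROR;	/* no indirect call needed */
}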
* tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping: (73 commits)
dma-mapping: fix inverted logic in dma_supported
dma-mapping: deprecate dma_zalloc_coherent
dma-mapping: zero memory returned from dma_alloc_*
sparc/iommu: fix ->map_sg return value
sparc/io-unit: fix ->map_sg return value
arm64: default to the direct mapping in get_arch_dma_ops
PCI: Remove unused attr variable in pci_dma_configure
ia64: only select ARCH_HAS_DMA_COHERENT_TO_PFN if swiotlb is enabled
dma-mapping: bypass indirect calls for dma-direct
vmd: use the proper dma_* APIs instead of direct methods calls
dma-direct: merge swiotlb_dma_ops into the dma_direct code
dma-direct: use dma_direct_map_page to implement dma_direct_map_sg
dma-direct: improve addressability error reporting
swiotlb: remove dma_mark_clean
swiotlb: remove SWIOTLB_MAP_ERROR
ACPI / scan: Refactor _CCA enforcement
dma-mapping: factor out dummy DMA ops
dma-mapping: always build the direct mapping code
dma-mapping: move dma_cache_sync out of line
dma-mapping: move various slow path functions out of line
...
Avoid expensive indirect calls in the fast path DMA mapping
operations by directly calling the dma_direct_* ops if we are using
the directly mapped DMA operations.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
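The shape of the bypass is presumably along these lines (a simplified
sketch of the fast path, not the verbatim patch):

static inline dma_addr_t dma_map_page_attrs(struct device *dev,
		struct page *page, size_t offset, size_t size,
		enum dma_data_direction dir, unsigned long attrs)
{
	const struct dma_map_ops *ops = get_dma_ops(dev);

	if (dma_is_direct(ops))		/* no retpoline on the hot path */
		return dma_direct_map_page(dev, page, offset, size, dir, attrs);
	return ops->map_page(dev, page, offset, size, dir, attrs);
}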
If IOC was already enabled (by the bootloader) it technically needs to
be reconfigured with an aperture base/size corresponding to the Linux
memory map, which will certainly differ from U-Boot's. But disabling and
re-enabling IOC while DMA might potentially be active is tricky
business. To avoid random memory issues later, just panic here and ask
the user to upgrade to a bootloader which doesn't enable IOC.
This was actually seen as an issue on some HSDK boards with a version of
U-Boot which enabled IOC: there were random issues later with starting
X, peripherals etc.
Also while I'm at it, replace hardcoded bits in the ARC_REG_IO_COH_PARTIAL
and ARC_REG_IO_COH_ENABLE registers with definitions.
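A minimal sketch of the guard described above (treat the exact bit
macro name as an assumption):

	/* IOC left on by the bootloader: bail rather than reconfigure live */
	if (read_aux_reg(ARC_REG_IO_COH_ENABLE) & ARC_IO_COH_ENABLE_BIT)
		panic("IOC already enabled, please upgrade bootloader!\n");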
Inspired by: https://lkml.org/lkml/2018/1/19/557
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
So far the IOC treatment was global on ARC, being turned on (or off)
for all devices in the system. With this patch, this can now be done
per device using the "dma-coherent" DT property; IOW with this patch
we can use both HW-coherent and regular DMA peripherals simultaneously.
The changes involved are many, so here is a summary:
1. Common code calls the ARC arch_setup_dma_ops() per device.
2. For coherent DMA (IOC) it plugs in the generic @dma_direct_ops, which
doesn't need any arch-specific backend: no explicit cache flushes or
MMU mappings are needed to provide uncached access.
- dma_(map|sync)_single* return early as the corresponding dma ops
callbacks are NULL in generic code.
So arch_sync_dma_*() -> dma_cache_*() need not handle the coherent
DMA case, hence drop the ARC __dma_cache_*_ioc() routines, which were
no-ops anyway.
3. For noncoherent DMA (non-IOC) the generic @dma_noncoherent_ops is
used, which in turn calls ARC-specific routines.
- arch_dma_alloc() no longer checks @ioc_enable since it is called
only for the !IOC case.
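A sketch of the per-device hookup items #1 and #2 describe (the
signature follows the arch_setup_dma_ops() API of that era; treat the
details as an approximation):

void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
			const struct iommu_ops *iommu, bool coherent)
{
	/* "dma-coherent" in DT => the device sits behind the IOC */
	if (is_isa_arcv2() && ioc_enable && coherent)
		set_dma_ops(dev, &dma_direct_ops);
	else
		set_dma_ops(dev, &dma_noncoherent_ops);
}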
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: rewrote changelog]
Fix type warnings in arch/arc/mm/cache.c.
../arch/arc/mm/cache.c: In function 'flush_anon_page':
../arch/arc/mm/cache.c:1062:55: warning: passing argument 2 of '__flush_dcache_page' makes integer from pointer without a cast [-Wint-conversion]
__flush_dcache_page((phys_addr_t)page_address(page), page_address(page));
^~~~~~~~~~~~~~~~~~
../arch/arc/mm/cache.c:1013:59: note: expected 'long unsigned int' but argument is of type 'void *'
void __flush_dcache_page(phys_addr_t paddr, unsigned long vaddr)
~~~~~~~~~~~~~~^~~~~
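The fix presumably amounts to casting the second argument to match the
prototype, along these lines (a sketch, not the verbatim patch):

	__flush_dcache_page((phys_addr_t)page_address(page),
			    (unsigned long)page_address(page));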
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: Elad Kanfi <eladkan@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Ofer Levi <oferle@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Check that SMP_CACHE_BYTES (and hence ARCH_DMA_MINALIGN) is larger
than or equal to any cache line length, by comparing it with values
previously read from the ARC cache BCR registers.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Thanks to commit 4b3ef9daa4 ("mm/swap: split swap cache into 64MB
trunks"), after swapoff the address_space associated with the swap
device will be freed. So page_mapping() users which may touch the
address_space need some kind of mechanism to prevent the address_space
from being freed while it is being accessed.
The dcache flushing functions (flush_dcache_page(), etc) in
architecture-specific code may access the address_space of the swap
device for anonymous pages in the swap cache via the page_mapping()
function. But in some cases there is no mechanism to prevent the swap
device from being swapped off, for example:
CPU1                                  CPU2
__get_user_pages()                    swapoff()
  flush_dcache_page()
    mapping = page_mapping()
      ...                               exit_swap_address_space()
      ...                                 kvfree(spaces)
      mapping_mapped(mapping)
The address space may be accessed after being freed.
But per cachetlb.txt and Russell King, flush_dcache_page() only cares
about file cache pages; for anonymous pages, flush_anon_page() should be
used. The implementation of flush_dcache_page() in all architectures
follows this too: they check whether page_mapping() is NULL and whether
mapping_mapped() is true to determine whether to flush the dcache
immediately, and they use the interval tree (mapping->i_mmap) to find
all user space mappings. But mapping_mapped() and mapping->i_mmap aren't
used by anonymous pages in the swap cache at all.
So, to fix the race between swapoff and dcache flushing,
page_mapping_file() is added to return the address_space for file cache
pages and NULL otherwise. All page_mapping() calls in the dcache
flushing functions are replaced with page_mapping_file().
[akpm@linux-foundation.org: simplify page_mapping_file(), per Mike]
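The simplified helper is presumably just a swap-cache filter in front of
page_mapping(), roughly:

static inline struct address_space *page_mapping_file(struct page *page)
{
	if (unlikely(PageSwapCache(page)))
		return NULL;	/* anon page in swap cache: no file mapping */
	return page_mapping(page);
}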
Link: http://lkml.kernel.org/r/20180305083634.15174-1-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
slc_entire_op() with the OP_FLUSH command also invalidates the SLC.
This is a preventive fix, as the only current user of slc_entire_op()
passes OP_FLUSH_N_INV, where the invalidate is actually required.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
[vgupta: fixed changelog]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[Needed for HSDK]
Currently the first page of system memory (hence the RAM base) is
assumed to be @ CONFIG_LINUX_LINK_BASE, where the kernel itself is
linked. However in case of the HSDK platform, for reasons explained in
that patch, this is not true: the kernel needs to be linked @
0x9000_0000 while DDR is still wired at 0x8000_0000. To properly account
for this 256M of RAM, we need to introduce a new option and base the
page frame accounting off of it.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: renamed CONFIG_KERNEL_RAM_BASE_ADDRESS => CONFIG_LINUX_RAM_BASE
: simplified changelog]
[Needed for HSDK]
- Currently the IOC base is hardcoded to 0x8000_0000, which is the
default value of LINUX_LINK_BASE but may not always be the case
- The IOC programming model imposes the constraint that the IOC aperture
size needs to be aligned to the IOC base address, which we were not
checking so far.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: reworked the changelog]
Some of the boot printing code had printk() calls w/o an explicit log
level. This patch introduces consistency, allowing platforms to switch
to less verbose console logging via cmdline.
NPS400 with 4K CPUs needs to avoid the per-cpu info printing for faster
bootup.
Signed-off-by: Noam Camus <noamca@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
A PAE40 configuration in hardware extends some of the address registers
for TLB/cache ops to 2 words.
So far the kernel was NOT setting the higher word if the feature was not
enabled in software, which is wrong. Those need to be set to 0 in such a
case.
Normally this would be done in the cache flush / TLB ops; however since
these registers only exist conditionally, this would have to be
conditional on a flag set at boot, which is expensive/ugly, especially
for the more common case of PAE existing but not in use.
Optimize that by zeroing them once at boot; nobody will write to them
afterwards.
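A hypothetical sketch of the one-time zeroing, borrowing the SLC region
register names from the fix below (treat the exact names as
assumptions):

	if (!is_pae40_enabled()) {
		/* high words never touched again, so zero them once */
		write_aux_reg(ARC_REG_SLC_RGN_START1, 0);
		write_aux_reg(ARC_REG_SLC_RGN_END1, 0);
	}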
Cc: stable@vger.kernel.org #4.4+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
It is necessary to explicitly set both SLC_AUX_RGN_START1 and
SLC_AUX_RGN_END1, which hold the MSB bits of the physical address of the
region start and end respectively; otherwise the SLC region operation is
executed in an unpredictable manner.
Without this patch, SLC flushes on HSDK (IOC disabled) were taking
seconds.
Cc: stable@vger.kernel.org #4.4+
Reported-by: Vladimir Kondratiev <vladimir.kondratiev@intel.com>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: PAR40 regs only written if PAE40 exist]
Commit c70c473396 ("ARCv2: SLC: Make sure busy bit is set properly on
SLC flushing") fixes the problem for the entire-SLC operation, where the
problem was initially caught. But given the nature of the issue it is
perfectly possible for the busy bit to be read incorrectly even when a
region operation was started.
So extend the initial fix to the region operation as well.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: stable@vger.kernel.org #4.10
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Region Flush has a weird programming model.
1. Flush or Invalidate is selected by DC_CTRL.RGN_OP
2. Flush-n-Invalidate is done by DC_CTRL.IM
Given the prior code structuring, case #2 above was generating two
separate updates to DC_CTRL, which was pointless.
| 80a342b0 <__dma_cache_wback_inv_l1>:
| 80a342b0: clri r4
| 80a342b4: lr r2,[dc_ctrl]
| 80a342b8: bset_s r2,r2,0x6
| 80a342ba: sr r2,[dc_ctrl] <-- FIRST
|
| 80a342be: bmskn r3,r0,0x5
|
| 80a342c2: lr r2,[dc_ctrl]
| 80a342c6: and r2,r2,0xfffff1ff
| 80a342ce: bset_s r2,r2,0x9
| 80a342d0: sr r2,[dc_ctrl] <-- SECOND
|
| 80a342d4: add_s r1,r1,0x3f
| 80a342d6: bmsk_s r0,r0,0x5
| 80a342d8: add_s r0,r0,r1
| 80a342da: add_s r0,r0,r3
| 80a342dc: sr r0,[78]
| 80a342e0: sr r3,[77]
|...
|...
So move setting of DC_CTRL.RGN_OP into __before_dc_op() and combine with
any other update.
| 80b63324 <__dma_cache_wback_inv_l1>:
| 80b63324: clri r3
| 80b63328: lr r2,[dc_ctrl]
| 80b6332c: and r2,r2,0xfffff1ff
| 80b63334: or r2,r2,576
| 80b63338: sr r2,[dc_ctrl]
|
| 80b6333c: add_s r1,r1,0x3f
| 80b6333e: bmskn r2,r0,0x5
| 80b63342: add_s r0,r0,r1
| 80b63344: sr r0,[78]
| 80b63348: sr r2,[77]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
As reported in STAR 9001165532, an SLC control reg read (for checking
busy state) right after an SLC invalidate command may incorrectly return
NOT busy, causing software to NOT spin-wait while the operation is
underway (and for some reason this only happens if the L1 cache is also
disabled, as required by the IOC programming model).
The suggested workaround is to do an additional Control Reg read, which
ensures the 2nd read gets the right status.
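In the busy-wait path the workaround plausibly looks like this (a
sketch, not the verbatim patch):

	/* dummy read; the next SLC_CTRL read then reports BUSY correctly */
	read_aux_reg(ARC_REG_SLC_CTRL);

	while (read_aux_reg(ARC_REG_SLC_CTRL) & SLC_CTRL_BUSY)
		;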
Cc: stable@vger.kernel.org #4.10
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
[vgupta: rewrote changelog a bit]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The programming model has been fixed by the previous patches, so
re-enable it by default.
This reverts commit 23cb1f6440.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
arc_cache_init() is called for each core, so it can't be tagged __init.
However the bulk of it is only executed by the master core and is thus a
candidate for __init reaping.
So split it up to allow that.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
vs. the fixed 512M before.
But this still assumes that all of memory is under IOC, which may not be
true for the SoC. Improve that later, when this becomes a real issue, by
specifying the range from DT.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
On AXS103 release bitfiles, DMA data corruption was seen because the IOC
setup was not following the recommended way in the documentation.
Flipping IOC on while caches are enabled or coherency transactions are
in flight might cause some memory operations to not observe coherency as
expected.
So strictly follow the programming model recommendations as documented
in the comment header above arc_ioc_setup().
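A hypothetical sketch of the ordering that comment describes (helper
and register names are assumptions, the aperture math is elided):

static void arc_ioc_setup(void)
{
	/* IOC may only be flipped while caches are quiesced */
	__dc_disable();
	__slc_entire_op(OP_FLUSH_N_INV);

	/* program the aperture, only then enable */
	write_aux_reg(ARC_REG_IO_COH_AP0_BASE, ap_base);
	write_aux_reg(ARC_REG_IO_COH_AP0_SIZE, ap_size);
	write_aux_reg(ARC_REG_IO_COH_PARTIAL, 1);
	write_aux_reg(ARC_REG_IO_COH_ENABLE, 1);

	__dc_enable();
}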
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
An ARC700 customer reported Linux boot crashes when upgrading to a
bigger L1 dcache (64K from 32K). Turns out they had an aliasing VIPT
config, and the current code only assumed 2 colours while theirs had 4.
So default to 4 colours and complain if there are fewer. Ideally this
needs to be a Kconfig option, but heck, that's too much of a hassle for
a single user.
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Historically, MMU revisions have been paired with Cache revision
updates, which are captured in the MMU and Cache Build Configuration
Registers respectively.
This was used in boot code to check for configuration mismatches,
especially in simulations (such as running with non-existent caches,
mismatched MMU and Cache versions etc). This can instead be inferred
from other cache params such as line size. So remove @ver from the
post-processed @cpuinfo, which could be used later to save some other
interesting info.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Previously we would not print the case when IOC existed but was not
enabled.
And while at it, shave one line off the boot printing by consolidating
the Peripheral address space and IO-Coherency prints, as the latter in a
way applies to the former.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
If the user disables IOC from the debugger at startup (by clearing
@ioc_enable), @ioc_exists is cleared too. This means the boot prints
don't capture the fact that IOC was present but disabled, which could be
misleading.
So invert how we use @ioc_enable and @ioc_exists and make it more
canonical: @ioc_exists represents whether the hardware is present or not
and stays the same whether enabled or not; @ioc_enable is still user
driven, but will be auto-disabled if IOC hardware is not present, i.e.
if @ioc_exists=0. This is the opposite of what we were doing before, but
much clearer.
This means @ioc_enable is now the "exported" toggle in the rest of the
code, such as the DMA mapping API.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
HS release 3.0 provides for even more flexibility in specifying the
volatile address space for mapping peripherals.
With HS 2.1, @start was made flexible / programmable; with HS 3.0, @end
can be set up as well (vs. fixed to 0xFFFF_FFFF before).
So add code to reflect that, and while at it remove an unused struct
definition.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
For resources shared by all cores, such as SLC and IOC, only the master
core needs to do any setup / enabling / disabling etc.
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
ago with the promise that one day it would be possible to implement the
page cache with bigger chunks than PAGE_SIZE.
This promise never materialized, and unlikely ever will.
We have many places where PAGE_CACHE_SIZE is assumed to be equal to
PAGE_SIZE, and it's a constant source of confusion on whether a
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
the script below. For some reason, coccinelle doesn't patch header
files; I've called spatch for them manually.
The only adjustment after coccinelle is a revert of the changes to the
PAGE_CACHE_ALIGN definition: we are going to drop it later.
There are a few places in the code where coccinelle didn't reach; I'll
fix them manually in a separate patch. Comments and documentation will
also be addressed in a separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
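For reference, a semantic patch like this is typically applied along
these lines (file and directory names here are illustrative):

  spatch --sp-file page_cache.cocci --in-place --dir .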
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The peripheral address space is an architectural address window which is
uncached and typically used to wire up peripherals.
For ARC700 cores (ARCompact ISA based) this was fixed to the 1GB region
0xC000_0000 - 0xFFFF_FFFF.
For ARCv2-based HS38 cores the start address is flexible and can be
0xC000_0000, 0xD000_0000, 0xE000_0000 or 0xF000_0000, by programming the
AUX_NON_VOLATILE_LIMIT reg (typically done in the bootloader).
Further, in case of PAE the physical address can extend beyond 4GB, so
we need to confine this check; otherwise all pages beyond 4GB would be
treated as uncached.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Let's define page_mapped() to be true for compound pages if any
sub-page of the compound page is mapped (with PMD or PTE).
On the other hand, page_mapcount() returns the mapcount of that
particular small page.
This will make cases like page_get_anon_vma() behave correctly once we
allow huge pages to be mapped with PTE.
Most users outside core-mm should use page_mapcount() instead of
page_mapped().
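A sketch of the semantics described above (close to the shape this
change took, but treat the details as approximate):

static inline bool page_mapped(struct page *page)
{
	int i;

	if (likely(!PageCompound(page)))
		return atomic_read(&page->_mapcount) >= 0;
	page = compound_head(page);
	if (atomic_read(compound_mapcount_ptr(page)) >= 0)
		return true;
	/* any PTE-mapped sub-page makes the compound page "mapped" */
	for (i = 0; i < hpage_nr_pages(page); i++) {
		if (atomic_read(&page[i]._mapcount) >= 0)
			return true;
	}
	return false;
}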
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the first working implementation of 40-bit physical address
extension on ARCv2.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
That way a single flip of phys_addr_t to 64 bits ensures all places
dealing with physical addresses get correct data.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Before we plug in highmem support, some of the code needs to be ready
for it:
- copy_user_highpage() needs to be using the kmap_atomic API (see the
sketch after this list)
- mk_pte() can't assume page_address()
- do_page_fault() can't assume VMALLOC_END is the end of the kernel
vaddr space
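A minimal sketch of the kmap_atomic pattern for the first item (the
real ARC version also has to deal with cache aliasing, which is omitted
here):

void copy_user_highpage(struct page *to, struct page *from,
			unsigned long u_vaddr, struct vm_area_struct *vma)
{
	void *kfrom = kmap_atomic(from);
	void *kto = kmap_atomic(to);

	copy_page(kto, kfrom);	/* no page_address(): pages may live in highmem */

	kunmap_atomic(kto);
	kunmap_atomic(kfrom);
}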
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
In case of an ARCv2 CPU there could be the following configurations
that affect cache handling for data exchanged with peripherals
via DMA:
[1] Only L1 cache exists
[2] Both L1 and L2 exist, but no IO coherency unit
[3] L1, L2 caches and IO coherency unit exist
The current implementation takes care of [1] and [2]; moreover support
for [2] is implemented with a run-time check for SLC existence, which is
not super optimal.
This patch introduces support of [3] and a rework of DMA ops usage.
Instead of doing a run-time check every time a particular DMA op is
executed, we'll have 3 different implementations of DMA ops and select
the appropriate one during init (see the sketch below).
As for IOC support, we need to:
[a] Implement empty DMA ops, because IOC takes care of cache coherency
with DMAed data
[b] Route dma_alloc_coherent() via dma_alloc_noncoherent(). This is
required to make IOC work in the first place, and also serves as an
optimization, as LD/ST to coherent buffers can be serviced from caches
w/o going all the way to memory.
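A hypothetical rendering of the init-time selection (the ops names are
made up just to illustrate the shape):

	if (ioc_exists)
		dma_ops = &arc_dma_ops_ioc;	/* empty ops: IOC snoops DMA */
	else if (l2_line_sz)
		dma_ops = &arc_dma_ops_slc;	/* maintain L1 and SLC */
	else
		dma_ops = &arc_dma_ops_l1;	/* maintain L1 only */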
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
[vgupta:
-Added some comments about IOC gains
-Marked dma ops as static,
-Massaged changelog a bit]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
SLC maintenance ops need to be serialized by software, as there is no
inherent buffering / queuing of aux commands: the hardware can silently
ignore a new aux operation if the previous one is still ongoing
(SLC_CTRL_BUSY).
So guard the SLC op using a spin lock (a sketch follows the numbers
below).
The spin lock doesn't seem to be contended even in heavy workloads such
as iperf. On FPGA @ 75 MHz:
[1] Before this change:
============================================================
# iperf -c 10.42.0.1
------------------------------------------------------------
Client connecting to 10.42.0.1, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[ 3] local 10.42.0.110 port 38935 connected with 10.42.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 48.4 MBytes 40.6 Mbits/sec
============================================================
[2] After this change:
============================================================
# iperf -c 10.42.0.1
------------------------------------------------------------
Client connecting to 10.42.0.1, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[ 3] local 10.42.0.243 port 60248 connected with 10.42.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 47.5 MBytes 39.8 Mbits/sec
# iperf -c 10.42.0.1
------------------------------------------------------------
Client connecting to 10.42.0.1, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[ 3] local 10.42.0.243 port 60249 connected with 10.42.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 54.9 MBytes 46.0 Mbits/sec
============================================================
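A sketch of the serialization described above (the region programming
is elided):

static DEFINE_SPINLOCK(lock);

void slc_op(phys_addr_t paddr, unsigned long sz, const int op)
{
	unsigned long flags;

	spin_lock_irqsave(&lock, flags);

	/* ... program region bounds and issue the flush/inv command ... */

	while (read_aux_reg(ARC_REG_SLC_CTRL) & SLC_CTRL_BUSY)
		;

	spin_unlock_irqrestore(&lock, flags);
}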
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: arc-linux-dev@synopsys.com
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The L2 cache on ARCHS processors is called SLC (System Level Cache).
For working DMA (in the absence of hardware-assisted IO Coherency) we
need to manage the SLC explicitly when buffers transition between the
CPU and controllers; a sketch follows.
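For instance, the noncoherent DMA writeback path presumably ends up
doing both L1 and SLC maintenance, along these lines (__dc_wback() is a
hypothetical L1 helper):

static void __dma_cache_wback_slc(phys_addr_t start, unsigned long sz)
{
	__dc_wback(start, sz);		/* write back L1 lines */
	slc_op(start, sz, OP_FLUSH);	/* then write back the SLC */
}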
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>