linux/drivers/tty/serial/amba-pl011.c

2252 lines
57 KiB
C
Raw Normal View History

/*
* Driver for AMBA serial ports
*
* Based on drivers/char/serial.c, by Linus Torvalds, Theodore Ts'o.
*
* Copyright 1999 ARM Limited
* Copyright (C) 2000 Deep Blue Solutions Ltd.
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
* Copyright (C) 2010 ST-Ericsson SA
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*
* This is a generic driver for ARM AMBA-type serial ports. They
* have a lot of 16550-like features, but are not register compatible.
* Note that although they do have CTS, DCD and DSR inputs, they do
* not have an RI input, nor do they have DTR or RTS outputs. If
* required, these have to be supplied via some other means (eg, GPIO)
* and hooked into this driver.
*/
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
#if defined(CONFIG_SERIAL_AMBA_PL011_CONSOLE) && defined(CONFIG_MAGIC_SYSRQ)
#define SUPPORT_SYSRQ
#endif
#include <linux/module.h>
#include <linux/ioport.h>
#include <linux/init.h>
#include <linux/console.h>
#include <linux/sysrq.h>
#include <linux/device.h>
#include <linux/tty.h>
#include <linux/tty_flip.h>
#include <linux/serial_core.h>
#include <linux/serial.h>
#include <linux/amba/bus.h>
#include <linux/amba/serial.h>
#include <linux/clk.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 16:04:11 +08:00
#include <linux/slab.h>
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
#include <linux/dmaengine.h>
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>
#include <linux/delay.h>
#include <linux/types.h>
#include <linux/of.h>
#include <linux/of_device.h>
#include <linux/pinctrl/consumer.h>
#include <linux/sizes.h>
#include <linux/io.h>
#define UART_NR 14
#define SERIAL_AMBA_MAJOR 204
#define SERIAL_AMBA_MINOR 64
#define SERIAL_AMBA_NR UART_NR
#define AMBA_ISR_PASS_LIMIT 256
#define UART_DR_ERROR (UART011_DR_OE|UART011_DR_BE|UART011_DR_PE|UART011_DR_FE)
#define UART_DUMMY_DR_RX (1 << 16)
/* There is by now at least one vendor with differing details, so handle it */
struct vendor_data {
unsigned int ifls;
unsigned int lcrh_tx;
unsigned int lcrh_rx;
bool oversampling;
bool dma_threshold;
serial: pl011: implement workaround for CTS clear event issue Problem Observed: - interrupt status is set by rising or falling edge on CTS line - interrupt status is cleared on a .0. to .1. transition of the interrupt-clear register bit 1. - interrupt-clear register is reset by hardware once the interrupt status is .0.. Remark: It seems not possible to read this register back by the CPU though, but internally this register exists. - when simultaneous set and reset event on the interrupt status happens, then the set-event has priority and the status remains .1.. As a result the interrupt-clear register is not reset to .0., and no new .0. to .1. transition can be detected on it when writing a .1. to it. This implies race condition, the clear must be performed at least one UARTCLK the riding edge of CTS RIS interrupt. Fix: Instead of resetting UART as done in commit c16d51a32bbb61ac8fd96f78b5ce2fccfe0fb4c3 "amba pl011: workaround for uart registers lockup" do the following: write .0. and then .1. to the interrupt-clear register to make sure that this transition is detected. According to the datasheet writing a .0. does not have any effect, but actually it allows to reset the internal interrupt-clear register. Take into account: The .0. needs to last at least for one clk_uart clock period (~ 38 MHz, 26.08ns) This way we can do away with the tasklet and keep only a tiny fix triggered by the variant flag introduced in this patch. Signed-off-by: Guillaume Jaunet <guillaume.jaunet@stericsson.com> Signed-off-by: Christophe Arnal <christophe.arnal@stericsson.com> Signed-off-by: Matthias Locher <Matthias.Locher@stericsson.com> Signed-off-by: Rajanikanth H.V <rajanikanth.hv@stericsson.com> Reviewed-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-26 17:17:02 +08:00
bool cts_event_workaround;
unsigned int (*get_fifosize)(struct amba_device *dev);
};
static unsigned int get_fifosize_arm(struct amba_device *dev)
{
return amba_rev(dev) < 3 ? 16 : 32;
}
static struct vendor_data vendor_arm = {
.ifls = UART011_IFLS_RX4_8|UART011_IFLS_TX4_8,
.lcrh_tx = UART011_LCRH,
.lcrh_rx = UART011_LCRH,
.oversampling = false,
.dma_threshold = false,
serial: pl011: implement workaround for CTS clear event issue Problem Observed: - interrupt status is set by rising or falling edge on CTS line - interrupt status is cleared on a .0. to .1. transition of the interrupt-clear register bit 1. - interrupt-clear register is reset by hardware once the interrupt status is .0.. Remark: It seems not possible to read this register back by the CPU though, but internally this register exists. - when simultaneous set and reset event on the interrupt status happens, then the set-event has priority and the status remains .1.. As a result the interrupt-clear register is not reset to .0., and no new .0. to .1. transition can be detected on it when writing a .1. to it. This implies race condition, the clear must be performed at least one UARTCLK the riding edge of CTS RIS interrupt. Fix: Instead of resetting UART as done in commit c16d51a32bbb61ac8fd96f78b5ce2fccfe0fb4c3 "amba pl011: workaround for uart registers lockup" do the following: write .0. and then .1. to the interrupt-clear register to make sure that this transition is detected. According to the datasheet writing a .0. does not have any effect, but actually it allows to reset the internal interrupt-clear register. Take into account: The .0. needs to last at least for one clk_uart clock period (~ 38 MHz, 26.08ns) This way we can do away with the tasklet and keep only a tiny fix triggered by the variant flag introduced in this patch. Signed-off-by: Guillaume Jaunet <guillaume.jaunet@stericsson.com> Signed-off-by: Christophe Arnal <christophe.arnal@stericsson.com> Signed-off-by: Matthias Locher <Matthias.Locher@stericsson.com> Signed-off-by: Rajanikanth H.V <rajanikanth.hv@stericsson.com> Reviewed-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-26 17:17:02 +08:00
.cts_event_workaround = false,
.get_fifosize = get_fifosize_arm,
};
static unsigned int get_fifosize_st(struct amba_device *dev)
{
return 64;
}
static struct vendor_data vendor_st = {
.ifls = UART011_IFLS_RX_HALF|UART011_IFLS_TX_HALF,
.lcrh_tx = ST_UART011_LCRH_TX,
.lcrh_rx = ST_UART011_LCRH_RX,
.oversampling = true,
.dma_threshold = true,
serial: pl011: implement workaround for CTS clear event issue Problem Observed: - interrupt status is set by rising or falling edge on CTS line - interrupt status is cleared on a .0. to .1. transition of the interrupt-clear register bit 1. - interrupt-clear register is reset by hardware once the interrupt status is .0.. Remark: It seems not possible to read this register back by the CPU though, but internally this register exists. - when simultaneous set and reset event on the interrupt status happens, then the set-event has priority and the status remains .1.. As a result the interrupt-clear register is not reset to .0., and no new .0. to .1. transition can be detected on it when writing a .1. to it. This implies race condition, the clear must be performed at least one UARTCLK the riding edge of CTS RIS interrupt. Fix: Instead of resetting UART as done in commit c16d51a32bbb61ac8fd96f78b5ce2fccfe0fb4c3 "amba pl011: workaround for uart registers lockup" do the following: write .0. and then .1. to the interrupt-clear register to make sure that this transition is detected. According to the datasheet writing a .0. does not have any effect, but actually it allows to reset the internal interrupt-clear register. Take into account: The .0. needs to last at least for one clk_uart clock period (~ 38 MHz, 26.08ns) This way we can do away with the tasklet and keep only a tiny fix triggered by the variant flag introduced in this patch. Signed-off-by: Guillaume Jaunet <guillaume.jaunet@stericsson.com> Signed-off-by: Christophe Arnal <christophe.arnal@stericsson.com> Signed-off-by: Matthias Locher <Matthias.Locher@stericsson.com> Signed-off-by: Rajanikanth H.V <rajanikanth.hv@stericsson.com> Reviewed-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-26 17:17:02 +08:00
.cts_event_workaround = true,
.get_fifosize = get_fifosize_st,
};
static struct uart_amba_port *amba_ports[UART_NR];
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
/* Deals with DMA transactions */
struct pl011_sgbuf {
struct scatterlist sg;
char *buf;
};
struct pl011_dmarx_data {
struct dma_chan *chan;
struct completion complete;
bool use_buf_b;
struct pl011_sgbuf sgbuf_a;
struct pl011_sgbuf sgbuf_b;
dma_cookie_t cookie;
bool running;
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
struct timer_list timer;
unsigned int last_residue;
unsigned long last_jiffies;
bool auto_poll_rate;
unsigned int poll_rate;
unsigned int poll_timeout;
};
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
struct pl011_dmatx_data {
struct dma_chan *chan;
struct scatterlist sg;
char *buf;
bool queued;
};
/*
* We wrap our port structure around the generic uart_port.
*/
struct uart_amba_port {
struct uart_port port;
struct clk *clk;
const struct vendor_data *vendor;
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
unsigned int dmacr; /* dma control reg */
unsigned int im; /* interrupt mask */
unsigned int old_status;
unsigned int fifosize; /* vendor-specific */
unsigned int lcrh_tx; /* vendor-specific */
unsigned int lcrh_rx; /* vendor-specific */
unsigned int old_cr; /* state during shutdown */
bool autorts;
char type[12];
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
#ifdef CONFIG_DMA_ENGINE
/* DMA stuff */
bool using_tx_dma;
bool using_rx_dma;
struct pl011_dmarx_data dmarx;
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
struct pl011_dmatx_data dmatx;
#endif
};
/*
* Reads up to 256 characters from the FIFO or until it's empty and
* inserts them into the TTY layer. Returns the number of characters
* read from the FIFO.
*/
static int pl011_fifo_to_tty(struct uart_amba_port *uap)
{
u16 status, ch;
unsigned int flag, max_count = 256;
int fifotaken = 0;
while (max_count--) {
status = readw(uap->port.membase + UART01x_FR);
if (status & UART01x_FR_RXFE)
break;
/* Take chars from the FIFO and update status */
ch = readw(uap->port.membase + UART01x_DR) |
UART_DUMMY_DR_RX;
flag = TTY_NORMAL;
uap->port.icount.rx++;
fifotaken++;
if (unlikely(ch & UART_DR_ERROR)) {
if (ch & UART011_DR_BE) {
ch &= ~(UART011_DR_FE | UART011_DR_PE);
uap->port.icount.brk++;
if (uart_handle_break(&uap->port))
continue;
} else if (ch & UART011_DR_PE)
uap->port.icount.parity++;
else if (ch & UART011_DR_FE)
uap->port.icount.frame++;
if (ch & UART011_DR_OE)
uap->port.icount.overrun++;
ch &= uap->port.read_status_mask;
if (ch & UART011_DR_BE)
flag = TTY_BREAK;
else if (ch & UART011_DR_PE)
flag = TTY_PARITY;
else if (ch & UART011_DR_FE)
flag = TTY_FRAME;
}
if (uart_handle_sysrq_char(&uap->port, ch & 255))
continue;
uart_insert_char(&uap->port, ch, UART011_DR_OE, ch, flag);
}
return fifotaken;
}
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
/*
* All the DMA operation mode stuff goes inside this ifdef.
* This assumes that you have a generic DMA device interface,
* no custom DMA interfaces are supported.
*/
#ifdef CONFIG_DMA_ENGINE
#define PL011_DMA_BUFFER_SIZE PAGE_SIZE
static int pl011_sgbuf_init(struct dma_chan *chan, struct pl011_sgbuf *sg,
enum dma_data_direction dir)
{
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
dma_addr_t dma_addr;
sg->buf = dma_alloc_coherent(chan->device->dev,
PL011_DMA_BUFFER_SIZE, &dma_addr, GFP_KERNEL);
if (!sg->buf)
return -ENOMEM;
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
sg_init_table(&sg->sg, 1);
sg_set_page(&sg->sg, phys_to_page(dma_addr),
PL011_DMA_BUFFER_SIZE, offset_in_page(dma_addr));
sg_dma_address(&sg->sg) = dma_addr;
return 0;
}
static void pl011_sgbuf_free(struct dma_chan *chan, struct pl011_sgbuf *sg,
enum dma_data_direction dir)
{
if (sg->buf) {
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
dma_free_coherent(chan->device->dev,
PL011_DMA_BUFFER_SIZE, sg->buf,
sg_dma_address(&sg->sg));
}
}
static void pl011_dma_probe_initcall(struct device *dev, struct uart_amba_port *uap)
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
{
/* DMA is the sole user of the platform data right now */
struct amba_pl011_data *plat = uap->port.dev->platform_data;
struct dma_slave_config tx_conf = {
.dst_addr = uap->port.mapbase + UART01x_DR,
.dst_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE,
.direction = DMA_MEM_TO_DEV,
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
.dst_maxburst = uap->fifosize >> 1,
.device_fc = false,
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
};
struct dma_chan *chan;
dma_cap_mask_t mask;
chan = dma_request_slave_channel(dev, "tx");
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
if (!chan) {
/* We need platform data */
if (!plat || !plat->dma_filter) {
dev_info(uap->port.dev, "no DMA platform data\n");
return;
}
/* Try to acquire a generic DMA engine slave TX channel */
dma_cap_zero(mask);
dma_cap_set(DMA_SLAVE, mask);
chan = dma_request_channel(mask, plat->dma_filter,
plat->dma_tx_param);
if (!chan) {
dev_err(uap->port.dev, "no TX DMA channel!\n");
return;
}
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
}
dmaengine_slave_config(chan, &tx_conf);
uap->dmatx.chan = chan;
dev_info(uap->port.dev, "DMA channel TX %s\n",
dma_chan_name(uap->dmatx.chan));
/* Optionally make use of an RX channel as well */
chan = dma_request_slave_channel(dev, "rx");
if (!chan && plat->dma_rx_param) {
chan = dma_request_channel(mask, plat->dma_filter, plat->dma_rx_param);
if (!chan) {
dev_err(uap->port.dev, "no RX DMA channel!\n");
return;
}
}
if (chan) {
struct dma_slave_config rx_conf = {
.src_addr = uap->port.mapbase + UART01x_DR,
.src_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE,
.direction = DMA_DEV_TO_MEM,
.src_maxburst = uap->fifosize >> 1,
.device_fc = false,
};
dmaengine_slave_config(chan, &rx_conf);
uap->dmarx.chan = chan;
if (plat && plat->dma_rx_poll_enable) {
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
/* Set poll rate if specified. */
if (plat->dma_rx_poll_rate) {
uap->dmarx.auto_poll_rate = false;
uap->dmarx.poll_rate = plat->dma_rx_poll_rate;
} else {
/*
* 100 ms defaults to poll rate if not
* specified. This will be adjusted with
* the baud rate at set_termios.
*/
uap->dmarx.auto_poll_rate = true;
uap->dmarx.poll_rate = 100;
}
/* 3 secs defaults poll_timeout if not specified. */
if (plat->dma_rx_poll_timeout)
uap->dmarx.poll_timeout =
plat->dma_rx_poll_timeout;
else
uap->dmarx.poll_timeout = 3000;
} else
uap->dmarx.auto_poll_rate = false;
dev_info(uap->port.dev, "DMA channel RX %s\n",
dma_chan_name(uap->dmarx.chan));
}
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
}
#ifndef MODULE
/*
* Stack up the UARTs and let the above initcall be done at device
* initcall time, because the serial driver is called as an arch
* initcall, and at this time the DMA subsystem is not yet registered.
* At this point the driver will switch over to using DMA where desired.
*/
struct dma_uap {
struct list_head node;
struct uart_amba_port *uap;
struct device *dev;
};
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
static LIST_HEAD(pl011_dma_uarts);
static int __init pl011_dma_initcall(void)
{
struct list_head *node, *tmp;
list_for_each_safe(node, tmp, &pl011_dma_uarts) {
struct dma_uap *dmau = list_entry(node, struct dma_uap, node);
pl011_dma_probe_initcall(dmau->dev, dmau->uap);
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
list_del(node);
kfree(dmau);
}
return 0;
}
device_initcall(pl011_dma_initcall);
static void pl011_dma_probe(struct device *dev, struct uart_amba_port *uap)
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
{
struct dma_uap *dmau = kzalloc(sizeof(struct dma_uap), GFP_KERNEL);
if (dmau) {
dmau->uap = uap;
dmau->dev = dev;
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
list_add_tail(&dmau->node, &pl011_dma_uarts);
}
}
#else
static void pl011_dma_probe(struct device *dev, struct uart_amba_port *uap)
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
{
pl011_dma_probe_initcall(dev, uap);
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
}
#endif
static void pl011_dma_remove(struct uart_amba_port *uap)
{
/* TODO: remove the initcall if it has not yet executed */
if (uap->dmatx.chan)
dma_release_channel(uap->dmatx.chan);
if (uap->dmarx.chan)
dma_release_channel(uap->dmarx.chan);
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
}
/* Forward declare this for the refill routine */
static int pl011_dma_tx_refill(struct uart_amba_port *uap);
/*
* The current DMA TX buffer has been sent.
* Try to queue up another DMA buffer.
*/
static void pl011_dma_tx_callback(void *data)
{
struct uart_amba_port *uap = data;
struct pl011_dmatx_data *dmatx = &uap->dmatx;
unsigned long flags;
u16 dmacr;
spin_lock_irqsave(&uap->port.lock, flags);
if (uap->dmatx.queued)
dma_unmap_sg(dmatx->chan->device->dev, &dmatx->sg, 1,
DMA_TO_DEVICE);
dmacr = uap->dmacr;
uap->dmacr = dmacr & ~UART011_TXDMAE;
writew(uap->dmacr, uap->port.membase + UART011_DMACR);
/*
* If TX DMA was disabled, it means that we've stopped the DMA for
* some reason (eg, XOFF received, or we want to send an X-char.)
*
* Note: we need to be careful here of a potential race between DMA
* and the rest of the driver - if the driver disables TX DMA while
* a TX buffer completing, we must update the tx queued status to
* get further refills (hence we check dmacr).
*/
if (!(dmacr & UART011_TXDMAE) || uart_tx_stopped(&uap->port) ||
uart_circ_empty(&uap->port.state->xmit)) {
uap->dmatx.queued = false;
spin_unlock_irqrestore(&uap->port.lock, flags);
return;
}
if (pl011_dma_tx_refill(uap) <= 0) {
/*
* We didn't queue a DMA buffer for some reason, but we
* have data pending to be sent. Re-enable the TX IRQ.
*/
uap->im |= UART011_TXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
}
spin_unlock_irqrestore(&uap->port.lock, flags);
}
/*
* Try to refill the TX DMA buffer.
* Locking: called with port lock held and IRQs disabled.
* Returns:
* 1 if we queued up a TX DMA buffer.
* 0 if we didn't want to handle this by DMA
* <0 on error
*/
static int pl011_dma_tx_refill(struct uart_amba_port *uap)
{
struct pl011_dmatx_data *dmatx = &uap->dmatx;
struct dma_chan *chan = dmatx->chan;
struct dma_device *dma_dev = chan->device;
struct dma_async_tx_descriptor *desc;
struct circ_buf *xmit = &uap->port.state->xmit;
unsigned int count;
/*
* Try to avoid the overhead involved in using DMA if the
* transaction fits in the first half of the FIFO, by using
* the standard interrupt handling. This ensures that we
* issue a uart_write_wakeup() at the appropriate time.
*/
count = uart_circ_chars_pending(xmit);
if (count < (uap->fifosize >> 1)) {
uap->dmatx.queued = false;
return 0;
}
/*
* Bodge: don't send the last character by DMA, as this
* will prevent XON from notifying us to restart DMA.
*/
count -= 1;
/* Else proceed to copy the TX chars to the DMA buffer and fire DMA */
if (count > PL011_DMA_BUFFER_SIZE)
count = PL011_DMA_BUFFER_SIZE;
if (xmit->tail < xmit->head)
memcpy(&dmatx->buf[0], &xmit->buf[xmit->tail], count);
else {
size_t first = UART_XMIT_SIZE - xmit->tail;
size_t second = xmit->head;
memcpy(&dmatx->buf[0], &xmit->buf[xmit->tail], first);
if (second)
memcpy(&dmatx->buf[first], &xmit->buf[0], second);
}
dmatx->sg.length = count;
if (dma_map_sg(dma_dev->dev, &dmatx->sg, 1, DMA_TO_DEVICE) != 1) {
uap->dmatx.queued = false;
dev_dbg(uap->port.dev, "unable to map TX DMA\n");
return -EBUSY;
}
desc = dmaengine_prep_slave_sg(chan, &dmatx->sg, 1, DMA_MEM_TO_DEV,
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!desc) {
dma_unmap_sg(dma_dev->dev, &dmatx->sg, 1, DMA_TO_DEVICE);
uap->dmatx.queued = false;
/*
* If DMA cannot be used right now, we complete this
* transaction via IRQ and let the TTY layer retry.
*/
dev_dbg(uap->port.dev, "TX DMA busy\n");
return -EBUSY;
}
/* Some data to go along to the callback */
desc->callback = pl011_dma_tx_callback;
desc->callback_param = uap;
/* All errors should happen at prepare time */
dmaengine_submit(desc);
/* Fire the DMA transaction */
dma_dev->device_issue_pending(chan);
uap->dmacr |= UART011_TXDMAE;
writew(uap->dmacr, uap->port.membase + UART011_DMACR);
uap->dmatx.queued = true;
/*
* Now we know that DMA will fire, so advance the ring buffer
* with the stuff we just dispatched.
*/
xmit->tail = (xmit->tail + count) & (UART_XMIT_SIZE - 1);
uap->port.icount.tx += count;
if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS)
uart_write_wakeup(&uap->port);
return 1;
}
/*
* We received a transmit interrupt without a pending X-char but with
* pending characters.
* Locking: called with port lock held and IRQs disabled.
* Returns:
* false if we want to use PIO to transmit
* true if we queued a DMA buffer
*/
static bool pl011_dma_tx_irq(struct uart_amba_port *uap)
{
if (!uap->using_tx_dma)
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
return false;
/*
* If we already have a TX buffer queued, but received a
* TX interrupt, it will be because we've just sent an X-char.
* Ensure the TX DMA is enabled and the TX IRQ is disabled.
*/
if (uap->dmatx.queued) {
uap->dmacr |= UART011_TXDMAE;
writew(uap->dmacr, uap->port.membase + UART011_DMACR);
uap->im &= ~UART011_TXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
return true;
}
/*
* We don't have a TX buffer queued, so try to queue one.
* If we successfully queued a buffer, mask the TX IRQ.
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
*/
if (pl011_dma_tx_refill(uap) > 0) {
uap->im &= ~UART011_TXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
return true;
}
return false;
}
/*
* Stop the DMA transmit (eg, due to received XOFF).
* Locking: called with port lock held and IRQs disabled.
*/
static inline void pl011_dma_tx_stop(struct uart_amba_port *uap)
{
if (uap->dmatx.queued) {
uap->dmacr &= ~UART011_TXDMAE;
writew(uap->dmacr, uap->port.membase + UART011_DMACR);
}
}
/*
* Try to start a DMA transmit, or in the case of an XON/OFF
* character queued for send, try to get that character out ASAP.
* Locking: called with port lock held and IRQs disabled.
* Returns:
* false if we want the TX IRQ to be enabled
* true if we have a buffer queued
*/
static inline bool pl011_dma_tx_start(struct uart_amba_port *uap)
{
u16 dmacr;
if (!uap->using_tx_dma)
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
return false;
if (!uap->port.x_char) {
/* no X-char, try to push chars out in DMA mode */
bool ret = true;
if (!uap->dmatx.queued) {
if (pl011_dma_tx_refill(uap) > 0) {
uap->im &= ~UART011_TXIM;
ret = true;
} else {
uap->im |= UART011_TXIM;
ret = false;
}
writew(uap->im, uap->port.membase + UART011_IMSC);
} else if (!(uap->dmacr & UART011_TXDMAE)) {
uap->dmacr |= UART011_TXDMAE;
writew(uap->dmacr,
uap->port.membase + UART011_DMACR);
}
return ret;
}
/*
* We have an X-char to send. Disable DMA to prevent it loading
* the TX fifo, and then see if we can stuff it into the FIFO.
*/
dmacr = uap->dmacr;
uap->dmacr &= ~UART011_TXDMAE;
writew(uap->dmacr, uap->port.membase + UART011_DMACR);
if (readw(uap->port.membase + UART01x_FR) & UART01x_FR_TXFF) {
/*
* No space in the FIFO, so enable the transmit interrupt
* so we know when there is space. Note that once we've
* loaded the character, we should just re-enable DMA.
*/
return false;
}
writew(uap->port.x_char, uap->port.membase + UART01x_DR);
uap->port.icount.tx++;
uap->port.x_char = 0;
/* Success - restore the DMA state */
uap->dmacr = dmacr;
writew(dmacr, uap->port.membase + UART011_DMACR);
return true;
}
/*
* Flush the transmit buffer.
* Locking: called with port lock held and IRQs disabled.
*/
static void pl011_dma_flush_buffer(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
if (!uap->using_tx_dma)
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
return;
/* Avoid deadlock with the DMA engine callback */
spin_unlock(&uap->port.lock);
dmaengine_terminate_all(uap->dmatx.chan);
spin_lock(&uap->port.lock);
if (uap->dmatx.queued) {
dma_unmap_sg(uap->dmatx.chan->device->dev, &uap->dmatx.sg, 1,
DMA_TO_DEVICE);
uap->dmatx.queued = false;
uap->dmacr &= ~UART011_TXDMAE;
writew(uap->dmacr, uap->port.membase + UART011_DMACR);
}
}
static void pl011_dma_rx_callback(void *data);
static int pl011_dma_rx_trigger_dma(struct uart_amba_port *uap)
{
struct dma_chan *rxchan = uap->dmarx.chan;
struct pl011_dmarx_data *dmarx = &uap->dmarx;
struct dma_async_tx_descriptor *desc;
struct pl011_sgbuf *sgbuf;
if (!rxchan)
return -EIO;
/* Start the RX DMA job */
sgbuf = uap->dmarx.use_buf_b ?
&uap->dmarx.sgbuf_b : &uap->dmarx.sgbuf_a;
desc = dmaengine_prep_slave_sg(rxchan, &sgbuf->sg, 1,
DMA_DEV_TO_MEM,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
/*
* If the DMA engine is busy and cannot prepare a
* channel, no big deal, the driver will fall back
* to interrupt mode as a result of this error code.
*/
if (!desc) {
uap->dmarx.running = false;
dmaengine_terminate_all(rxchan);
return -EBUSY;
}
/* Some data to go along to the callback */
desc->callback = pl011_dma_rx_callback;
desc->callback_param = uap;
dmarx->cookie = dmaengine_submit(desc);
dma_async_issue_pending(rxchan);
uap->dmacr |= UART011_RXDMAE;
writew(uap->dmacr, uap->port.membase + UART011_DMACR);
uap->dmarx.running = true;
uap->im &= ~UART011_RXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
return 0;
}
/*
* This is called when either the DMA job is complete, or
* the FIFO timeout interrupt occurred. This must be called
* with the port spinlock uap->port.lock held.
*/
static void pl011_dma_rx_chars(struct uart_amba_port *uap,
u32 pending, bool use_buf_b,
bool readfifo)
{
struct tty_port *port = &uap->port.state->port;
struct pl011_sgbuf *sgbuf = use_buf_b ?
&uap->dmarx.sgbuf_b : &uap->dmarx.sgbuf_a;
int dma_count = 0;
u32 fifotaken = 0; /* only used for vdbg() */
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
struct pl011_dmarx_data *dmarx = &uap->dmarx;
int dmataken = 0;
if (uap->dmarx.poll_rate) {
/* The data can be taken by polling */
dmataken = sgbuf->sg.length - dmarx->last_residue;
/* Recalculate the pending size */
if (pending >= dmataken)
pending -= dmataken;
}
/* Pick the remain data from the DMA */
if (pending) {
/*
* First take all chars in the DMA pipe, then look in the FIFO.
* Note that tty_insert_flip_buf() tries to take as many chars
* as it can.
*/
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
dma_count = tty_insert_flip_string(port, sgbuf->buf + dmataken,
pending);
uap->port.icount.rx += dma_count;
if (dma_count < pending)
dev_warn(uap->port.dev,
"couldn't insert all characters (TTY is full?)\n");
}
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
/* Reset the last_residue for Rx DMA poll */
if (uap->dmarx.poll_rate)
dmarx->last_residue = sgbuf->sg.length;
/*
* Only continue with trying to read the FIFO if all DMA chars have
* been taken first.
*/
if (dma_count == pending && readfifo) {
/* Clear any error flags */
writew(UART011_OEIS | UART011_BEIS | UART011_PEIS | UART011_FEIS,
uap->port.membase + UART011_ICR);
/*
* If we read all the DMA'd characters, and we had an
* incomplete buffer, that could be due to an rx error, or
* maybe we just timed out. Read any pending chars and check
* the error status.
*
* Error conditions will only occur in the FIFO, these will
* trigger an immediate interrupt and stop the DMA job, so we
* will always find the error in the FIFO, never in the DMA
* buffer.
*/
fifotaken = pl011_fifo_to_tty(uap);
}
spin_unlock(&uap->port.lock);
dev_vdbg(uap->port.dev,
"Took %d chars from DMA buffer and %d chars from the FIFO\n",
dma_count, fifotaken);
tty_flip_buffer_push(port);
spin_lock(&uap->port.lock);
}
static void pl011_dma_rx_irq(struct uart_amba_port *uap)
{
struct pl011_dmarx_data *dmarx = &uap->dmarx;
struct dma_chan *rxchan = dmarx->chan;
struct pl011_sgbuf *sgbuf = dmarx->use_buf_b ?
&dmarx->sgbuf_b : &dmarx->sgbuf_a;
size_t pending;
struct dma_tx_state state;
enum dma_status dmastat;
/*
* Pause the transfer so we can trust the current counter,
* do this before we pause the PL011 block, else we may
* overflow the FIFO.
*/
if (dmaengine_pause(rxchan))
dev_err(uap->port.dev, "unable to pause DMA transfer\n");
dmastat = rxchan->device->device_tx_status(rxchan,
dmarx->cookie, &state);
if (dmastat != DMA_PAUSED)
dev_err(uap->port.dev, "unable to pause DMA transfer\n");
/* Disable RX DMA - incoming data will wait in the FIFO */
uap->dmacr &= ~UART011_RXDMAE;
writew(uap->dmacr, uap->port.membase + UART011_DMACR);
uap->dmarx.running = false;
pending = sgbuf->sg.length - state.residue;
BUG_ON(pending > PL011_DMA_BUFFER_SIZE);
/* Then we terminate the transfer - we now know our residue */
dmaengine_terminate_all(rxchan);
/*
* This will take the chars we have so far and insert
* into the framework.
*/
pl011_dma_rx_chars(uap, pending, dmarx->use_buf_b, true);
/* Switch buffer & re-trigger DMA job */
dmarx->use_buf_b = !dmarx->use_buf_b;
if (pl011_dma_rx_trigger_dma(uap)) {
dev_dbg(uap->port.dev, "could not retrigger RX DMA job "
"fall back to interrupt mode\n");
uap->im |= UART011_RXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
}
}
static void pl011_dma_rx_callback(void *data)
{
struct uart_amba_port *uap = data;
struct pl011_dmarx_data *dmarx = &uap->dmarx;
struct dma_chan *rxchan = dmarx->chan;
bool lastbuf = dmarx->use_buf_b;
struct pl011_sgbuf *sgbuf = dmarx->use_buf_b ?
&dmarx->sgbuf_b : &dmarx->sgbuf_a;
size_t pending;
struct dma_tx_state state;
int ret;
/*
* This completion interrupt occurs typically when the
* RX buffer is totally stuffed but no timeout has yet
* occurred. When that happens, we just want the RX
* routine to flush out the secondary DMA buffer while
* we immediately trigger the next DMA job.
*/
spin_lock_irq(&uap->port.lock);
/*
* Rx data can be taken by the UART interrupts during
* the DMA irq handler. So we check the residue here.
*/
rxchan->device->device_tx_status(rxchan, dmarx->cookie, &state);
pending = sgbuf->sg.length - state.residue;
BUG_ON(pending > PL011_DMA_BUFFER_SIZE);
/* Then we terminate the transfer - we now know our residue */
dmaengine_terminate_all(rxchan);
uap->dmarx.running = false;
dmarx->use_buf_b = !lastbuf;
ret = pl011_dma_rx_trigger_dma(uap);
pl011_dma_rx_chars(uap, pending, lastbuf, false);
spin_unlock_irq(&uap->port.lock);
/*
* Do this check after we picked the DMA chars so we don't
* get some IRQ immediately from RX.
*/
if (ret) {
dev_dbg(uap->port.dev, "could not retrigger RX DMA job "
"fall back to interrupt mode\n");
uap->im |= UART011_RXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
}
}
/*
* Stop accepting received characters, when we're shutting down or
* suspending this port.
* Locking: called with port lock held and IRQs disabled.
*/
static inline void pl011_dma_rx_stop(struct uart_amba_port *uap)
{
/* FIXME. Just disable the DMA enable */
uap->dmacr &= ~UART011_RXDMAE;
writew(uap->dmacr, uap->port.membase + UART011_DMACR);
}
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
/*
* Timer handler for Rx DMA polling.
* Every polling, It checks the residue in the dma buffer and transfer
* data to the tty. Also, last_residue is updated for the next polling.
*/
static void pl011_dma_rx_poll(unsigned long args)
{
struct uart_amba_port *uap = (struct uart_amba_port *)args;
struct tty_port *port = &uap->port.state->port;
struct pl011_dmarx_data *dmarx = &uap->dmarx;
struct dma_chan *rxchan = uap->dmarx.chan;
unsigned long flags = 0;
unsigned int dmataken = 0;
unsigned int size = 0;
struct pl011_sgbuf *sgbuf;
int dma_count;
struct dma_tx_state state;
sgbuf = dmarx->use_buf_b ? &uap->dmarx.sgbuf_b : &uap->dmarx.sgbuf_a;
rxchan->device->device_tx_status(rxchan, dmarx->cookie, &state);
if (likely(state.residue < dmarx->last_residue)) {
dmataken = sgbuf->sg.length - dmarx->last_residue;
size = dmarx->last_residue - state.residue;
dma_count = tty_insert_flip_string(port, sgbuf->buf + dmataken,
size);
if (dma_count == size)
dmarx->last_residue = state.residue;
dmarx->last_jiffies = jiffies;
}
tty_flip_buffer_push(port);
/*
* If no data is received in poll_timeout, the driver will fall back
* to interrupt mode. We will retrigger DMA at the first interrupt.
*/
if (jiffies_to_msecs(jiffies - dmarx->last_jiffies)
> uap->dmarx.poll_timeout) {
spin_lock_irqsave(&uap->port.lock, flags);
pl011_dma_rx_stop(uap);
spin_unlock_irqrestore(&uap->port.lock, flags);
uap->dmarx.running = false;
dmaengine_terminate_all(rxchan);
del_timer(&uap->dmarx.timer);
} else {
mod_timer(&uap->dmarx.timer,
jiffies + msecs_to_jiffies(uap->dmarx.poll_rate));
}
}
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
static void pl011_dma_startup(struct uart_amba_port *uap)
{
int ret;
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
if (!uap->dmatx.chan)
return;
uap->dmatx.buf = kmalloc(PL011_DMA_BUFFER_SIZE, GFP_KERNEL);
if (!uap->dmatx.buf) {
dev_err(uap->port.dev, "no memory for DMA TX buffer\n");
uap->port.fifosize = uap->fifosize;
return;
}
sg_init_one(&uap->dmatx.sg, uap->dmatx.buf, PL011_DMA_BUFFER_SIZE);
/* The DMA buffer is now the FIFO the TTY subsystem can use */
uap->port.fifosize = PL011_DMA_BUFFER_SIZE;
uap->using_tx_dma = true;
if (!uap->dmarx.chan)
goto skip_rx;
/* Allocate and map DMA RX buffers */
ret = pl011_sgbuf_init(uap->dmarx.chan, &uap->dmarx.sgbuf_a,
DMA_FROM_DEVICE);
if (ret) {
dev_err(uap->port.dev, "failed to init DMA %s: %d\n",
"RX buffer A", ret);
goto skip_rx;
}
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
ret = pl011_sgbuf_init(uap->dmarx.chan, &uap->dmarx.sgbuf_b,
DMA_FROM_DEVICE);
if (ret) {
dev_err(uap->port.dev, "failed to init DMA %s: %d\n",
"RX buffer B", ret);
pl011_sgbuf_free(uap->dmarx.chan, &uap->dmarx.sgbuf_a,
DMA_FROM_DEVICE);
goto skip_rx;
}
uap->using_rx_dma = true;
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
skip_rx:
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
/* Turn on DMA error (RX/TX will be enabled on demand) */
uap->dmacr |= UART011_DMAONERR;
writew(uap->dmacr, uap->port.membase + UART011_DMACR);
/*
* ST Micro variants has some specific dma burst threshold
* compensation. Set this to 16 bytes, so burst will only
* be issued above/below 16 bytes.
*/
if (uap->vendor->dma_threshold)
writew(ST_UART011_DMAWM_RX_16 | ST_UART011_DMAWM_TX_16,
uap->port.membase + ST_UART011_DMAWM);
if (uap->using_rx_dma) {
if (pl011_dma_rx_trigger_dma(uap))
dev_dbg(uap->port.dev, "could not trigger initial "
"RX DMA job, fall back to interrupt mode\n");
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
if (uap->dmarx.poll_rate) {
init_timer(&(uap->dmarx.timer));
uap->dmarx.timer.function = pl011_dma_rx_poll;
uap->dmarx.timer.data = (unsigned long)uap;
mod_timer(&uap->dmarx.timer,
jiffies +
msecs_to_jiffies(uap->dmarx.poll_rate));
uap->dmarx.last_residue = PL011_DMA_BUFFER_SIZE;
uap->dmarx.last_jiffies = jiffies;
}
}
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
}
static void pl011_dma_shutdown(struct uart_amba_port *uap)
{
if (!(uap->using_tx_dma || uap->using_rx_dma))
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
return;
/* Disable RX and TX DMA */
while (readw(uap->port.membase + UART01x_FR) & UART01x_FR_BUSY)
barrier();
spin_lock_irq(&uap->port.lock);
uap->dmacr &= ~(UART011_DMAONERR | UART011_RXDMAE | UART011_TXDMAE);
writew(uap->dmacr, uap->port.membase + UART011_DMACR);
spin_unlock_irq(&uap->port.lock);
if (uap->using_tx_dma) {
/* In theory, this should already be done by pl011_dma_flush_buffer */
dmaengine_terminate_all(uap->dmatx.chan);
if (uap->dmatx.queued) {
dma_unmap_sg(uap->dmatx.chan->device->dev, &uap->dmatx.sg, 1,
DMA_TO_DEVICE);
uap->dmatx.queued = false;
}
kfree(uap->dmatx.buf);
uap->using_tx_dma = false;
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
}
if (uap->using_rx_dma) {
dmaengine_terminate_all(uap->dmarx.chan);
/* Clean up the RX DMA */
pl011_sgbuf_free(uap->dmarx.chan, &uap->dmarx.sgbuf_a, DMA_FROM_DEVICE);
pl011_sgbuf_free(uap->dmarx.chan, &uap->dmarx.sgbuf_b, DMA_FROM_DEVICE);
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
if (uap->dmarx.poll_rate)
del_timer_sync(&uap->dmarx.timer);
uap->using_rx_dma = false;
}
}
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
static inline bool pl011_dma_rx_available(struct uart_amba_port *uap)
{
return uap->using_rx_dma;
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
}
static inline bool pl011_dma_rx_running(struct uart_amba_port *uap)
{
return uap->using_rx_dma && uap->dmarx.running;
}
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
#else
/* Blank functions if the DMA engine is not available */
static inline void pl011_dma_probe(struct device *dev, struct uart_amba_port *uap)
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
{
}
static inline void pl011_dma_remove(struct uart_amba_port *uap)
{
}
static inline void pl011_dma_startup(struct uart_amba_port *uap)
{
}
static inline void pl011_dma_shutdown(struct uart_amba_port *uap)
{
}
static inline bool pl011_dma_tx_irq(struct uart_amba_port *uap)
{
return false;
}
static inline void pl011_dma_tx_stop(struct uart_amba_port *uap)
{
}
static inline bool pl011_dma_tx_start(struct uart_amba_port *uap)
{
return false;
}
static inline void pl011_dma_rx_irq(struct uart_amba_port *uap)
{
}
static inline void pl011_dma_rx_stop(struct uart_amba_port *uap)
{
}
static inline int pl011_dma_rx_trigger_dma(struct uart_amba_port *uap)
{
return -EIO;
}
static inline bool pl011_dma_rx_available(struct uart_amba_port *uap)
{
return false;
}
static inline bool pl011_dma_rx_running(struct uart_amba_port *uap)
{
return false;
}
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
#define pl011_dma_flush_buffer NULL
#endif
static void pl011_stop_tx(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
uap->im &= ~UART011_TXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
pl011_dma_tx_stop(uap);
}
static void pl011_start_tx(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
if (!pl011_dma_tx_start(uap)) {
uap->im |= UART011_TXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
}
}
static void pl011_stop_rx(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
uap->im &= ~(UART011_RXIM|UART011_RTIM|UART011_FEIM|
UART011_PEIM|UART011_BEIM|UART011_OEIM);
writew(uap->im, uap->port.membase + UART011_IMSC);
pl011_dma_rx_stop(uap);
}
static void pl011_enable_ms(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
uap->im |= UART011_RIMIM|UART011_CTSMIM|UART011_DCDMIM|UART011_DSRMIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
}
IRQ: Maintain regs pointer globally rather than passing to IRQ handlers Maintain a per-CPU global "struct pt_regs *" variable which can be used instead of passing regs around manually through all ~1800 interrupt handlers in the Linux kernel. The regs pointer is used in few places, but it potentially costs both stack space and code to pass it around. On the FRV arch, removing the regs parameter from all the genirq function results in a 20% speed up of the IRQ exit path (ie: from leaving timer_interrupt() to leaving do_IRQ()). Where appropriate, an arch may override the generic storage facility and do something different with the variable. On FRV, for instance, the address is maintained in GR28 at all times inside the kernel as part of general exception handling. Having looked over the code, it appears that the parameter may be handed down through up to twenty or so layers of functions. Consider a USB character device attached to a USB hub, attached to a USB controller that posts its interrupts through a cascaded auxiliary interrupt controller. A character device driver may want to pass regs to the sysrq handler through the input layer which adds another few layers of parameter passing. I've build this code with allyesconfig for x86_64 and i386. I've runtested the main part of the code on FRV and i386, though I can't test most of the drivers. I've also done partial conversion for powerpc and MIPS - these at least compile with minimal configurations. This will affect all archs. Mostly the changes should be relatively easy. Take do_IRQ(), store the regs pointer at the beginning, saving the old one: struct pt_regs *old_regs = set_irq_regs(regs); And put the old one back at the end: set_irq_regs(old_regs); Don't pass regs through to generic_handle_irq() or __do_IRQ(). In timer_interrupt(), this sort of change will be necessary: - update_process_times(user_mode(regs)); - profile_tick(CPU_PROFILING, regs); + update_process_times(user_mode(get_irq_regs())); + profile_tick(CPU_PROFILING); I'd like to move update_process_times()'s use of get_irq_regs() into itself, except that i386, alone of the archs, uses something other than user_mode(). Some notes on the interrupt handling in the drivers: (*) input_dev() is now gone entirely. The regs pointer is no longer stored in the input_dev struct. (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does something different depending on whether it's been supplied with a regs pointer or not. (*) Various IRQ handler function pointers have been moved to type irq_handler_t. Signed-Off-By: David Howells <dhowells@redhat.com> (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
2006-10-05 21:55:46 +08:00
static void pl011_rx_chars(struct uart_amba_port *uap)
{
pl011_fifo_to_tty(uap);
spin_unlock(&uap->port.lock);
tty_flip_buffer_push(&uap->port.state->port);
/*
* If we were temporarily out of DMA mode for a while,
* attempt to switch back to DMA mode again.
*/
if (pl011_dma_rx_available(uap)) {
if (pl011_dma_rx_trigger_dma(uap)) {
dev_dbg(uap->port.dev, "could not trigger RX DMA job "
"fall back to interrupt mode again\n");
uap->im |= UART011_RXIM;
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
} else {
uap->im &= ~UART011_RXIM;
#ifdef CONFIG_DMA_ENGINE
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
/* Start Rx DMA poll */
if (uap->dmarx.poll_rate) {
uap->dmarx.last_jiffies = jiffies;
uap->dmarx.last_residue = PL011_DMA_BUFFER_SIZE;
mod_timer(&uap->dmarx.timer,
jiffies +
msecs_to_jiffies(uap->dmarx.poll_rate));
}
#endif
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
}
writew(uap->im, uap->port.membase + UART011_IMSC);
}
spin_lock(&uap->port.lock);
}
static void pl011_tx_chars(struct uart_amba_port *uap)
{
struct circ_buf *xmit = &uap->port.state->xmit;
int count;
if (uap->port.x_char) {
writew(uap->port.x_char, uap->port.membase + UART01x_DR);
uap->port.icount.tx++;
uap->port.x_char = 0;
return;
}
if (uart_circ_empty(xmit) || uart_tx_stopped(&uap->port)) {
pl011_stop_tx(&uap->port);
return;
}
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
/* If we are using DMA mode, try to send some characters. */
if (pl011_dma_tx_irq(uap))
return;
count = uap->fifosize >> 1;
do {
writew(xmit->buf[xmit->tail], uap->port.membase + UART01x_DR);
xmit->tail = (xmit->tail + 1) & (UART_XMIT_SIZE - 1);
uap->port.icount.tx++;
if (uart_circ_empty(xmit))
break;
} while (--count > 0);
if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS)
uart_write_wakeup(&uap->port);
if (uart_circ_empty(xmit))
pl011_stop_tx(&uap->port);
}
static void pl011_modem_status(struct uart_amba_port *uap)
{
unsigned int status, delta;
status = readw(uap->port.membase + UART01x_FR) & UART01x_FR_MODEM_ANY;
delta = status ^ uap->old_status;
uap->old_status = status;
if (!delta)
return;
if (delta & UART01x_FR_DCD)
uart_handle_dcd_change(&uap->port, status & UART01x_FR_DCD);
if (delta & UART01x_FR_DSR)
uap->port.icount.dsr++;
if (delta & UART01x_FR_CTS)
uart_handle_cts_change(&uap->port, status & UART01x_FR_CTS);
wake_up_interruptible(&uap->port.state->port.delta_msr_wait);
}
IRQ: Maintain regs pointer globally rather than passing to IRQ handlers Maintain a per-CPU global "struct pt_regs *" variable which can be used instead of passing regs around manually through all ~1800 interrupt handlers in the Linux kernel. The regs pointer is used in few places, but it potentially costs both stack space and code to pass it around. On the FRV arch, removing the regs parameter from all the genirq function results in a 20% speed up of the IRQ exit path (ie: from leaving timer_interrupt() to leaving do_IRQ()). Where appropriate, an arch may override the generic storage facility and do something different with the variable. On FRV, for instance, the address is maintained in GR28 at all times inside the kernel as part of general exception handling. Having looked over the code, it appears that the parameter may be handed down through up to twenty or so layers of functions. Consider a USB character device attached to a USB hub, attached to a USB controller that posts its interrupts through a cascaded auxiliary interrupt controller. A character device driver may want to pass regs to the sysrq handler through the input layer which adds another few layers of parameter passing. I've build this code with allyesconfig for x86_64 and i386. I've runtested the main part of the code on FRV and i386, though I can't test most of the drivers. I've also done partial conversion for powerpc and MIPS - these at least compile with minimal configurations. This will affect all archs. Mostly the changes should be relatively easy. Take do_IRQ(), store the regs pointer at the beginning, saving the old one: struct pt_regs *old_regs = set_irq_regs(regs); And put the old one back at the end: set_irq_regs(old_regs); Don't pass regs through to generic_handle_irq() or __do_IRQ(). In timer_interrupt(), this sort of change will be necessary: - update_process_times(user_mode(regs)); - profile_tick(CPU_PROFILING, regs); + update_process_times(user_mode(get_irq_regs())); + profile_tick(CPU_PROFILING); I'd like to move update_process_times()'s use of get_irq_regs() into itself, except that i386, alone of the archs, uses something other than user_mode(). Some notes on the interrupt handling in the drivers: (*) input_dev() is now gone entirely. The regs pointer is no longer stored in the input_dev struct. (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does something different depending on whether it's been supplied with a regs pointer or not. (*) Various IRQ handler function pointers have been moved to type irq_handler_t. Signed-Off-By: David Howells <dhowells@redhat.com> (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
2006-10-05 21:55:46 +08:00
static irqreturn_t pl011_int(int irq, void *dev_id)
{
struct uart_amba_port *uap = dev_id;
unsigned long flags;
unsigned int status, pass_counter = AMBA_ISR_PASS_LIMIT;
int handled = 0;
serial: pl011: implement workaround for CTS clear event issue Problem Observed: - interrupt status is set by rising or falling edge on CTS line - interrupt status is cleared on a .0. to .1. transition of the interrupt-clear register bit 1. - interrupt-clear register is reset by hardware once the interrupt status is .0.. Remark: It seems not possible to read this register back by the CPU though, but internally this register exists. - when simultaneous set and reset event on the interrupt status happens, then the set-event has priority and the status remains .1.. As a result the interrupt-clear register is not reset to .0., and no new .0. to .1. transition can be detected on it when writing a .1. to it. This implies race condition, the clear must be performed at least one UARTCLK the riding edge of CTS RIS interrupt. Fix: Instead of resetting UART as done in commit c16d51a32bbb61ac8fd96f78b5ce2fccfe0fb4c3 "amba pl011: workaround for uart registers lockup" do the following: write .0. and then .1. to the interrupt-clear register to make sure that this transition is detected. According to the datasheet writing a .0. does not have any effect, but actually it allows to reset the internal interrupt-clear register. Take into account: The .0. needs to last at least for one clk_uart clock period (~ 38 MHz, 26.08ns) This way we can do away with the tasklet and keep only a tiny fix triggered by the variant flag introduced in this patch. Signed-off-by: Guillaume Jaunet <guillaume.jaunet@stericsson.com> Signed-off-by: Christophe Arnal <christophe.arnal@stericsson.com> Signed-off-by: Matthias Locher <Matthias.Locher@stericsson.com> Signed-off-by: Rajanikanth H.V <rajanikanth.hv@stericsson.com> Reviewed-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-26 17:17:02 +08:00
unsigned int dummy_read;
spin_lock_irqsave(&uap->port.lock, flags);
status = readw(uap->port.membase + UART011_MIS);
if (status) {
do {
serial: pl011: implement workaround for CTS clear event issue Problem Observed: - interrupt status is set by rising or falling edge on CTS line - interrupt status is cleared on a .0. to .1. transition of the interrupt-clear register bit 1. - interrupt-clear register is reset by hardware once the interrupt status is .0.. Remark: It seems not possible to read this register back by the CPU though, but internally this register exists. - when simultaneous set and reset event on the interrupt status happens, then the set-event has priority and the status remains .1.. As a result the interrupt-clear register is not reset to .0., and no new .0. to .1. transition can be detected on it when writing a .1. to it. This implies race condition, the clear must be performed at least one UARTCLK the riding edge of CTS RIS interrupt. Fix: Instead of resetting UART as done in commit c16d51a32bbb61ac8fd96f78b5ce2fccfe0fb4c3 "amba pl011: workaround for uart registers lockup" do the following: write .0. and then .1. to the interrupt-clear register to make sure that this transition is detected. According to the datasheet writing a .0. does not have any effect, but actually it allows to reset the internal interrupt-clear register. Take into account: The .0. needs to last at least for one clk_uart clock period (~ 38 MHz, 26.08ns) This way we can do away with the tasklet and keep only a tiny fix triggered by the variant flag introduced in this patch. Signed-off-by: Guillaume Jaunet <guillaume.jaunet@stericsson.com> Signed-off-by: Christophe Arnal <christophe.arnal@stericsson.com> Signed-off-by: Matthias Locher <Matthias.Locher@stericsson.com> Signed-off-by: Rajanikanth H.V <rajanikanth.hv@stericsson.com> Reviewed-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-26 17:17:02 +08:00
if (uap->vendor->cts_event_workaround) {
/* workaround to make sure that all bits are unlocked.. */
writew(0x00, uap->port.membase + UART011_ICR);
/*
* WA: introduce 26ns(1 uart clk) delay before W1C;
* single apb access will incur 2 pclk(133.12Mhz) delay,
* so add 2 dummy reads
*/
dummy_read = readw(uap->port.membase + UART011_ICR);
dummy_read = readw(uap->port.membase + UART011_ICR);
}
writew(status & ~(UART011_TXIS|UART011_RTIS|
UART011_RXIS),
uap->port.membase + UART011_ICR);
if (status & (UART011_RTIS|UART011_RXIS)) {
if (pl011_dma_rx_running(uap))
pl011_dma_rx_irq(uap);
else
pl011_rx_chars(uap);
}
if (status & (UART011_DSRMIS|UART011_DCDMIS|
UART011_CTSMIS|UART011_RIMIS))
pl011_modem_status(uap);
if (status & UART011_TXIS)
pl011_tx_chars(uap);
serial: pl011: implement workaround for CTS clear event issue Problem Observed: - interrupt status is set by rising or falling edge on CTS line - interrupt status is cleared on a .0. to .1. transition of the interrupt-clear register bit 1. - interrupt-clear register is reset by hardware once the interrupt status is .0.. Remark: It seems not possible to read this register back by the CPU though, but internally this register exists. - when simultaneous set and reset event on the interrupt status happens, then the set-event has priority and the status remains .1.. As a result the interrupt-clear register is not reset to .0., and no new .0. to .1. transition can be detected on it when writing a .1. to it. This implies race condition, the clear must be performed at least one UARTCLK the riding edge of CTS RIS interrupt. Fix: Instead of resetting UART as done in commit c16d51a32bbb61ac8fd96f78b5ce2fccfe0fb4c3 "amba pl011: workaround for uart registers lockup" do the following: write .0. and then .1. to the interrupt-clear register to make sure that this transition is detected. According to the datasheet writing a .0. does not have any effect, but actually it allows to reset the internal interrupt-clear register. Take into account: The .0. needs to last at least for one clk_uart clock period (~ 38 MHz, 26.08ns) This way we can do away with the tasklet and keep only a tiny fix triggered by the variant flag introduced in this patch. Signed-off-by: Guillaume Jaunet <guillaume.jaunet@stericsson.com> Signed-off-by: Christophe Arnal <christophe.arnal@stericsson.com> Signed-off-by: Matthias Locher <Matthias.Locher@stericsson.com> Signed-off-by: Rajanikanth H.V <rajanikanth.hv@stericsson.com> Reviewed-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-26 17:17:02 +08:00
if (pass_counter-- == 0)
break;
status = readw(uap->port.membase + UART011_MIS);
} while (status != 0);
handled = 1;
}
spin_unlock_irqrestore(&uap->port.lock, flags);
return IRQ_RETVAL(handled);
}
static unsigned int pl011_tx_empty(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
unsigned int status = readw(uap->port.membase + UART01x_FR);
return status & (UART01x_FR_BUSY|UART01x_FR_TXFF) ? 0 : TIOCSER_TEMT;
}
static unsigned int pl011_get_mctrl(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
unsigned int result = 0;
unsigned int status = readw(uap->port.membase + UART01x_FR);
#define TIOCMBIT(uartbit, tiocmbit) \
if (status & uartbit) \
result |= tiocmbit
TIOCMBIT(UART01x_FR_DCD, TIOCM_CAR);
TIOCMBIT(UART01x_FR_DSR, TIOCM_DSR);
TIOCMBIT(UART01x_FR_CTS, TIOCM_CTS);
TIOCMBIT(UART011_FR_RI, TIOCM_RNG);
#undef TIOCMBIT
return result;
}
static void pl011_set_mctrl(struct uart_port *port, unsigned int mctrl)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
unsigned int cr;
cr = readw(uap->port.membase + UART011_CR);
#define TIOCMBIT(tiocmbit, uartbit) \
if (mctrl & tiocmbit) \
cr |= uartbit; \
else \
cr &= ~uartbit
TIOCMBIT(TIOCM_RTS, UART011_CR_RTS);
TIOCMBIT(TIOCM_DTR, UART011_CR_DTR);
TIOCMBIT(TIOCM_OUT1, UART011_CR_OUT1);
TIOCMBIT(TIOCM_OUT2, UART011_CR_OUT2);
TIOCMBIT(TIOCM_LOOP, UART011_CR_LBE);
if (uap->autorts) {
/* We need to disable auto-RTS if we want to turn RTS off */
TIOCMBIT(TIOCM_RTS, UART011_CR_RTSEN);
}
#undef TIOCMBIT
writew(cr, uap->port.membase + UART011_CR);
}
static void pl011_break_ctl(struct uart_port *port, int break_state)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
unsigned long flags;
unsigned int lcr_h;
spin_lock_irqsave(&uap->port.lock, flags);
lcr_h = readw(uap->port.membase + uap->lcrh_tx);
if (break_state == -1)
lcr_h |= UART01x_LCRH_BRK;
else
lcr_h &= ~UART01x_LCRH_BRK;
writew(lcr_h, uap->port.membase + uap->lcrh_tx);
spin_unlock_irqrestore(&uap->port.lock, flags);
}
#ifdef CONFIG_CONSOLE_POLL
static void pl011_quiesce_irqs(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
unsigned char __iomem *regs = uap->port.membase;
writew(readw(regs + UART011_MIS), regs + UART011_ICR);
/*
* There is no way to clear TXIM as this is "ready to transmit IRQ", so
* we simply mask it. start_tx() will unmask it.
*
* Note we can race with start_tx(), and if the race happens, the
* polling user might get another interrupt just after we clear it.
* But it should be OK and can happen even w/o the race, e.g.
* controller immediately got some new data and raised the IRQ.
*
* And whoever uses polling routines assumes that it manages the device
* (including tx queue), so we're also fine with start_tx()'s caller
* side.
*/
writew(readw(regs + UART011_IMSC) & ~UART011_TXIM, regs + UART011_IMSC);
}
static int pl011_get_poll_char(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
unsigned int status;
/*
* The caller might need IRQs lowered, e.g. if used with KDB NMI
* debugger.
*/
pl011_quiesce_irqs(port);
status = readw(uap->port.membase + UART01x_FR);
if (status & UART01x_FR_RXFE)
return NO_POLL_CHAR;
return readw(uap->port.membase + UART01x_DR);
}
static void pl011_put_poll_char(struct uart_port *port,
unsigned char ch)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
while (readw(uap->port.membase + UART01x_FR) & UART01x_FR_TXFF)
barrier();
writew(ch, uap->port.membase + UART01x_DR);
}
#endif /* CONFIG_CONSOLE_POLL */
tty/serial/amba-pl011: Implement poll_init callback The callback is used to initialize the hardware, nothing else should be done, i.e. we should not request interrupts (but we can and do unmask some of them, as they might be useful for NMI entry). As a side-effect, the patch also fixes a division by zero[1] when booting with kgdboc options specified (e.g. kgdboc=ttyAMA0,115200n8). The issue happens because serial core calls set_termios callback, but the driver doesn't know clock frequency, and thus cannot calculate proper baud rate values. [1] WARNING: at drivers/tty/serial/serial_core.c:400 uart_get_baud_rate+0xe8/0x14c() Modules linked in: [<c0018e50>] (unwind_backtrace+0x0/0xf0) from [<c0020ae8>] (warn_slowpath_common+0x4c/0x64) [<c0020ae8>] (warn_slowpath_common+0x4c/0x64) from [<c0020b1c>] (warn_slowpath_null+0x1c/0x24) [<c0020b1c>] (warn_slowpath_null+0x1c/0x24) from [<c0185ed8>] (uart_get_baud_rate+0xe8/0x14c) [<c0185ed8>] (uart_get_baud_rate+0xe8/0x14c) from [<c0187078>] (pl011_set_termios+0x48/0x278) [<c0187078>] (pl011_set_termios+0x48/0x278) from [<c01850b0>] (uart_set_options+0xe8/0x114) [<c01850b0>] (uart_set_options+0xe8/0x114) from [<c0185de4>] (uart_poll_init+0xd4/0xe0) [<c0185de4>] (uart_poll_init+0xd4/0xe0) from [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) from [<c0188538>] (configure_kgdboc+0xc8/0x1b8) [<c0188538>] (configure_kgdboc+0xc8/0x1b8) from [<c00088a4>] (do_one_initcall+0x30/0x168) [<c00088a4>] (do_one_initcall+0x30/0x168) from [<c033784c>] (do_basic_setup+0x94/0xc8) [<c033784c>] (do_basic_setup+0x94/0xc8) from [<c03378e0>] (kernel_init+0x60/0xf4) [<c03378e0>] (kernel_init+0x60/0xf4) from [<c00144a0>] (kernel_thread_exit+0x0/0x8) ---[ end trace 7d41c9186f342c40 ]--- Division by zero in kernel. [<c0018e50>] (unwind_backtrace+0x0/0xf0) from [<c014546c>] (Ldiv0+0x8/0x10) [<c014546c>] (Ldiv0+0x8/0x10) from [<c0187098>] (pl011_set_termios+0x68/0x278) [<c0187098>] (pl011_set_termios+0x68/0x278) from [<c01850b0>] (uart_set_options+0xe8/0x114) [<c01850b0>] (uart_set_options+0xe8/0x114) from [<c0185de4>] (uart_poll_init+0xd4/0xe0) [<c0185de4>] (uart_poll_init+0xd4/0xe0) from [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) from [<c0188538>] (configure_kgdboc+0xc8/0x1b8) [<c0188538>] (configure_kgdboc+0xc8/0x1b8) from [<c00088a4>] (do_one_initcall+0x30/0x168) [<c00088a4>] (do_one_initcall+0x30/0x168) from [<c033784c>] (do_basic_setup+0x94/0xc8) [<c033784c>] (do_basic_setup+0x94/0xc8) from [<c03378e0>] (kernel_init+0x60/0xf4) [<c03378e0>] (kernel_init+0x60/0xf4) from [<c00144a0>] (kernel_thread_exit+0x0/0x8) Division by zero in kernel. [<c0018e50>] (unwind_backtrace+0x0/0xf0) from [<c014546c>] (Ldiv0+0x8/0x10) [<c014546c>] (Ldiv0+0x8/0x10) from [<c0183a98>] (uart_update_timeout+0x4c/0x5c) [<c0183a98>] (uart_update_timeout+0x4c/0x5c) from [<c01870f8>] (pl011_set_termios+0xc8/0x278) [<c01870f8>] (pl011_set_termios+0xc8/0x278) from [<c01850b0>] (uart_set_options+0xe8/0x114) [<c01850b0>] (uart_set_options+0xe8/0x114) from [<c0185de4>] (uart_poll_init+0xd4/0xe0) [<c0185de4>] (uart_poll_init+0xd4/0xe0) from [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) from [<c0188538>] (configure_kgdboc+0xc8/0x1b8) [<c0188538>] (configure_kgdboc+0xc8/0x1b8) from [<c00088a4>] (do_one_initcall+0x30/0x168) [<c00088a4>] (do_one_initcall+0x30/0x168) from [<c033784c>] (do_basic_setup+0x94/0xc8) [<c033784c>] (do_basic_setup+0x94/0xc8) from [<c03378e0>] (kernel_init+0x60/0xf4) [<c03378e0>] (kernel_init+0x60/0xf4) from [<c00144a0>] (kernel_thread_exit+0x0/0x8) Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-25 05:27:54 +08:00
static int pl011_hwinit(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
int retval;
/* Optionaly enable pins to be muxed in and configured */
pinctrl_pm_select_default_state(port->dev);
/*
* Try to enable the clock producer.
*/
retval = clk_prepare_enable(uap->clk);
if (retval)
goto out;
uap->port.uartclk = clk_get_rate(uap->clk);
/* Clear pending error and receive interrupts */
writew(UART011_OEIS | UART011_BEIS | UART011_PEIS | UART011_FEIS |
UART011_RTIS | UART011_RXIS, uap->port.membase + UART011_ICR);
tty/serial/amba-pl011: Implement poll_init callback The callback is used to initialize the hardware, nothing else should be done, i.e. we should not request interrupts (but we can and do unmask some of them, as they might be useful for NMI entry). As a side-effect, the patch also fixes a division by zero[1] when booting with kgdboc options specified (e.g. kgdboc=ttyAMA0,115200n8). The issue happens because serial core calls set_termios callback, but the driver doesn't know clock frequency, and thus cannot calculate proper baud rate values. [1] WARNING: at drivers/tty/serial/serial_core.c:400 uart_get_baud_rate+0xe8/0x14c() Modules linked in: [<c0018e50>] (unwind_backtrace+0x0/0xf0) from [<c0020ae8>] (warn_slowpath_common+0x4c/0x64) [<c0020ae8>] (warn_slowpath_common+0x4c/0x64) from [<c0020b1c>] (warn_slowpath_null+0x1c/0x24) [<c0020b1c>] (warn_slowpath_null+0x1c/0x24) from [<c0185ed8>] (uart_get_baud_rate+0xe8/0x14c) [<c0185ed8>] (uart_get_baud_rate+0xe8/0x14c) from [<c0187078>] (pl011_set_termios+0x48/0x278) [<c0187078>] (pl011_set_termios+0x48/0x278) from [<c01850b0>] (uart_set_options+0xe8/0x114) [<c01850b0>] (uart_set_options+0xe8/0x114) from [<c0185de4>] (uart_poll_init+0xd4/0xe0) [<c0185de4>] (uart_poll_init+0xd4/0xe0) from [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) from [<c0188538>] (configure_kgdboc+0xc8/0x1b8) [<c0188538>] (configure_kgdboc+0xc8/0x1b8) from [<c00088a4>] (do_one_initcall+0x30/0x168) [<c00088a4>] (do_one_initcall+0x30/0x168) from [<c033784c>] (do_basic_setup+0x94/0xc8) [<c033784c>] (do_basic_setup+0x94/0xc8) from [<c03378e0>] (kernel_init+0x60/0xf4) [<c03378e0>] (kernel_init+0x60/0xf4) from [<c00144a0>] (kernel_thread_exit+0x0/0x8) ---[ end trace 7d41c9186f342c40 ]--- Division by zero in kernel. [<c0018e50>] (unwind_backtrace+0x0/0xf0) from [<c014546c>] (Ldiv0+0x8/0x10) [<c014546c>] (Ldiv0+0x8/0x10) from [<c0187098>] (pl011_set_termios+0x68/0x278) [<c0187098>] (pl011_set_termios+0x68/0x278) from [<c01850b0>] (uart_set_options+0xe8/0x114) [<c01850b0>] (uart_set_options+0xe8/0x114) from [<c0185de4>] (uart_poll_init+0xd4/0xe0) [<c0185de4>] (uart_poll_init+0xd4/0xe0) from [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) from [<c0188538>] (configure_kgdboc+0xc8/0x1b8) [<c0188538>] (configure_kgdboc+0xc8/0x1b8) from [<c00088a4>] (do_one_initcall+0x30/0x168) [<c00088a4>] (do_one_initcall+0x30/0x168) from [<c033784c>] (do_basic_setup+0x94/0xc8) [<c033784c>] (do_basic_setup+0x94/0xc8) from [<c03378e0>] (kernel_init+0x60/0xf4) [<c03378e0>] (kernel_init+0x60/0xf4) from [<c00144a0>] (kernel_thread_exit+0x0/0x8) Division by zero in kernel. [<c0018e50>] (unwind_backtrace+0x0/0xf0) from [<c014546c>] (Ldiv0+0x8/0x10) [<c014546c>] (Ldiv0+0x8/0x10) from [<c0183a98>] (uart_update_timeout+0x4c/0x5c) [<c0183a98>] (uart_update_timeout+0x4c/0x5c) from [<c01870f8>] (pl011_set_termios+0xc8/0x278) [<c01870f8>] (pl011_set_termios+0xc8/0x278) from [<c01850b0>] (uart_set_options+0xe8/0x114) [<c01850b0>] (uart_set_options+0xe8/0x114) from [<c0185de4>] (uart_poll_init+0xd4/0xe0) [<c0185de4>] (uart_poll_init+0xd4/0xe0) from [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) from [<c0188538>] (configure_kgdboc+0xc8/0x1b8) [<c0188538>] (configure_kgdboc+0xc8/0x1b8) from [<c00088a4>] (do_one_initcall+0x30/0x168) [<c00088a4>] (do_one_initcall+0x30/0x168) from [<c033784c>] (do_basic_setup+0x94/0xc8) [<c033784c>] (do_basic_setup+0x94/0xc8) from [<c03378e0>] (kernel_init+0x60/0xf4) [<c03378e0>] (kernel_init+0x60/0xf4) from [<c00144a0>] (kernel_thread_exit+0x0/0x8) Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-25 05:27:54 +08:00
/*
* Save interrupts enable mask, and enable RX interrupts in case if
* the interrupt is used for NMI entry.
*/
uap->im = readw(uap->port.membase + UART011_IMSC);
writew(UART011_RTIM | UART011_RXIM, uap->port.membase + UART011_IMSC);
if (uap->port.dev->platform_data) {
struct amba_pl011_data *plat;
plat = uap->port.dev->platform_data;
if (plat->init)
plat->init();
}
return 0;
out:
return retval;
}
static int pl011_startup(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
unsigned int cr;
int retval;
retval = pl011_hwinit(port);
if (retval)
goto clk_dis;
writew(uap->im, uap->port.membase + UART011_IMSC);
/*
* Allocate the IRQ
*/
retval = request_irq(uap->port.irq, pl011_int, 0, "uart-pl011", uap);
if (retval)
goto clk_dis;
writew(uap->vendor->ifls, uap->port.membase + UART011_IFLS);
/*
* Provoke TX FIFO interrupt into asserting.
*/
cr = UART01x_CR_UARTEN | UART011_CR_TXE | UART011_CR_LBE;
writew(cr, uap->port.membase + UART011_CR);
writew(0, uap->port.membase + UART011_FBRD);
writew(1, uap->port.membase + UART011_IBRD);
writew(0, uap->port.membase + uap->lcrh_rx);
if (uap->lcrh_tx != uap->lcrh_rx) {
int i;
/*
* Wait 10 PCLKs before writing LCRH_TX register,
* to get this delay write read only register 10 times
*/
for (i = 0; i < 10; ++i)
writew(0xff, uap->port.membase + UART011_MIS);
writew(0, uap->port.membase + uap->lcrh_tx);
}
writew(0, uap->port.membase + UART01x_DR);
while (readw(uap->port.membase + UART01x_FR) & UART01x_FR_BUSY)
barrier();
/* restore RTS and DTR */
cr = uap->old_cr & (UART011_CR_RTS | UART011_CR_DTR);
cr |= UART01x_CR_UARTEN | UART011_CR_RXE | UART011_CR_TXE;
writew(cr, uap->port.membase + UART011_CR);
/*
* initialise the old status of the modem signals
*/
uap->old_status = readw(uap->port.membase + UART01x_FR) & UART01x_FR_MODEM_ANY;
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
/* Startup DMA */
pl011_dma_startup(uap);
/*
* Finally, enable interrupts, only timeouts when using DMA
* if initial RX DMA job failed, start in interrupt mode
* as well.
*/
spin_lock_irq(&uap->port.lock);
/* Clear out any spuriously appearing RX interrupts */
writew(UART011_RTIS | UART011_RXIS,
uap->port.membase + UART011_ICR);
uap->im = UART011_RTIM;
if (!pl011_dma_rx_running(uap))
uap->im |= UART011_RXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
spin_unlock_irq(&uap->port.lock);
return 0;
clk_dis:
clk_disable_unprepare(uap->clk);
return retval;
}
static void pl011_shutdown_channel(struct uart_amba_port *uap,
unsigned int lcrh)
{
unsigned long val;
val = readw(uap->port.membase + lcrh);
val &= ~(UART01x_LCRH_BRK | UART01x_LCRH_FEN);
writew(val, uap->port.membase + lcrh);
}
static void pl011_shutdown(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
unsigned int cr;
/*
* disable all interrupts
*/
spin_lock_irq(&uap->port.lock);
uap->im = 0;
writew(uap->im, uap->port.membase + UART011_IMSC);
writew(0xffff, uap->port.membase + UART011_ICR);
spin_unlock_irq(&uap->port.lock);
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
pl011_dma_shutdown(uap);
/*
* Free the interrupt
*/
free_irq(uap->port.irq, uap);
/*
* disable the port
* disable the port. It should not disable RTS and DTR.
* Also RTS and DTR state should be preserved to restore
* it during startup().
*/
uap->autorts = false;
cr = readw(uap->port.membase + UART011_CR);
uap->old_cr = cr;
cr &= UART011_CR_RTS | UART011_CR_DTR;
cr |= UART01x_CR_UARTEN | UART011_CR_TXE;
writew(cr, uap->port.membase + UART011_CR);
/*
* disable break condition and fifos
*/
pl011_shutdown_channel(uap, uap->lcrh_rx);
if (uap->lcrh_rx != uap->lcrh_tx)
pl011_shutdown_channel(uap, uap->lcrh_tx);
/*
* Shut down the clock producer
*/
clk_disable_unprepare(uap->clk);
/* Optionally let pins go into sleep states */
pinctrl_pm_select_sleep_state(port->dev);
if (uap->port.dev->platform_data) {
struct amba_pl011_data *plat;
plat = uap->port.dev->platform_data;
if (plat->exit)
plat->exit();
}
}
static void
pl011_set_termios(struct uart_port *port, struct ktermios *termios,
struct ktermios *old)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
unsigned int lcr_h, old_cr;
unsigned long flags;
unsigned int baud, quot, clkdiv;
if (uap->vendor->oversampling)
clkdiv = 8;
else
clkdiv = 16;
/*
* Ask the core to calculate the divisor for us.
*/
baud = uart_get_baud_rate(port, termios, old, 0,
port->uartclk / clkdiv);
#ifdef CONFIG_DMA_ENGINE
ARM: PL011: Add support for Rx DMA buffer polling. In DMA support, The received data is not pushed to tty until the DMA buffer is filled. But some megabyte rate chips such as BT expect fast response and data should be pushed immediately. In order to fix this issue, We suggest the use of the timer for polling DMA buffer. In our test, no data loss occurred at high-baudrate as compared with interrupt- driven (We tested with 3Mbps). We changes: - We add timer for polling. If we set poll_timer to 10, every 10ms, timer handler checks the residue in the dma buffer and transfer data to the tty. Also, last_residue is updated for the next polling. - poll_timeout is used to prevent the timer's system cost. If poll_timeout is set to 3000 and no data is received in 3 seconds, we inactivate poll timer and driver falls back to interrupt-driven. When data is received again in FIFO and UART irq is occurred, we switch back to DMA mode and start polling. - We use consistent DMA mappings to avoid from the frequent cache operation of the timer function for default. - pl011_dma_rx_chars is modified. the pending size is recalculated because data can be taken by polling. - the polling time is adjusted if dma rx poll is enabled but no rate is specified. Ideal polling interval to push 1 character at every interval is the reciprocal of 'baud rate / 10 line bits per character / 1000 ms per sec'. But It is very aggressive to system. Experimentally, '10000000 / baud' is suitable to receive dozens of characters. the poll rate can be specified statically by dma_rx_poll_rate of the platform data as well. Changes compared to v1: - Use of consistent DMA mappings. - Added dma_rx_poll_rate in platform data to specify the polling interval. - Added dma_rx_poll_timeout in platform data to specify the polling timeout. Changes compared to v2: - Use of consistent DMA mappings for default. - Added dma_rx_poll_enable in platform data to adjust the polling time according to the baud rate. - remove unnecessary lock from the polling function. Signed-off-by: Chanho Min <chanho.min@lge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-27 17:38:11 +08:00
/*
* Adjust RX DMA polling rate with baud rate if not specified.
*/
if (uap->dmarx.auto_poll_rate)
uap->dmarx.poll_rate = DIV_ROUND_UP(10000000, baud);
#endif
if (baud > port->uartclk/16)
quot = DIV_ROUND_CLOSEST(port->uartclk * 8, baud);
else
quot = DIV_ROUND_CLOSEST(port->uartclk * 4, baud);
switch (termios->c_cflag & CSIZE) {
case CS5:
lcr_h = UART01x_LCRH_WLEN_5;
break;
case CS6:
lcr_h = UART01x_LCRH_WLEN_6;
break;
case CS7:
lcr_h = UART01x_LCRH_WLEN_7;
break;
default: // CS8
lcr_h = UART01x_LCRH_WLEN_8;
break;
}
if (termios->c_cflag & CSTOPB)
lcr_h |= UART01x_LCRH_STP2;
if (termios->c_cflag & PARENB) {
lcr_h |= UART01x_LCRH_PEN;
if (!(termios->c_cflag & PARODD))
lcr_h |= UART01x_LCRH_EPS;
}
if (uap->fifosize > 1)
lcr_h |= UART01x_LCRH_FEN;
spin_lock_irqsave(&port->lock, flags);
/*
* Update the per-port timeout.
*/
uart_update_timeout(port, termios->c_cflag, baud);
port->read_status_mask = UART011_DR_OE | 255;
if (termios->c_iflag & INPCK)
port->read_status_mask |= UART011_DR_FE | UART011_DR_PE;
if (termios->c_iflag & (BRKINT | PARMRK))
port->read_status_mask |= UART011_DR_BE;
/*
* Characters to ignore
*/
port->ignore_status_mask = 0;
if (termios->c_iflag & IGNPAR)
port->ignore_status_mask |= UART011_DR_FE | UART011_DR_PE;
if (termios->c_iflag & IGNBRK) {
port->ignore_status_mask |= UART011_DR_BE;
/*
* If we're ignoring parity and break indicators,
* ignore overruns too (for real raw support).
*/
if (termios->c_iflag & IGNPAR)
port->ignore_status_mask |= UART011_DR_OE;
}
/*
* Ignore all characters if CREAD is not set.
*/
if ((termios->c_cflag & CREAD) == 0)
port->ignore_status_mask |= UART_DUMMY_DR_RX;
if (UART_ENABLE_MS(port, termios->c_cflag))
pl011_enable_ms(port);
/* first, disable everything */
old_cr = readw(port->membase + UART011_CR);
writew(0, port->membase + UART011_CR);
if (termios->c_cflag & CRTSCTS) {
if (old_cr & UART011_CR_RTS)
old_cr |= UART011_CR_RTSEN;
old_cr |= UART011_CR_CTSEN;
uap->autorts = true;
} else {
old_cr &= ~(UART011_CR_CTSEN | UART011_CR_RTSEN);
uap->autorts = false;
}
if (uap->vendor->oversampling) {
if (baud > port->uartclk / 16)
old_cr |= ST_UART011_CR_OVSFACT;
else
old_cr &= ~ST_UART011_CR_OVSFACT;
}
serial: pl011: handle corruption at high clock speeds This works around a few glitches in the ST version of the PL011 serial driver when using very high baud rates, as we do in the Ux500: 3, 3.25, 4 and 4.05 Mbps. Problem Observed/rootcause: When using high baud-rates, and the baudrate*8 is getting close to the provided clock frequency (so a division factor close to 1), when using bursts of characters (so they are abutted), then it seems as if there is not enough time to detect the beginning of the start-bit which is a timing reference for the entire character, and thus the sampling moment of character bits is moving towards the end of each bit, instead of the middle. Fix: Increase slightly the RX baud rate of the UART above the theoretical baudrate by 5%. This will definitely give more margin time to the UART_RX to correctly sample the data at the middle of the bit period. Also fix the ages old copy-paste error in the very stressed comment, it's referencing the registers used in the PL010 driver rather than the PL011 ones. Signed-off-by: Guillaume Jaunet <guillaume.jaunet@stericsson.com> Signed-off-by: Christophe Arnal <christophe.arnal@stericsson.com> Signed-off-by: Matthias Locher <matthias.locher@stericsson.com> Signed-off-by: Rajanikanth HV <rajanikanth.hv@stericsson.com> Cc: stable <stable@vger.kernel.org> Cc: Bibek Basu <bibek.basu@stericsson.com> Cc: Par-Gunnar Hjalmdahl <par-gunnar.hjalmdahl@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-26 23:21:36 +08:00
/*
* Workaround for the ST Micro oversampling variants to
* increase the bitrate slightly, by lowering the divisor,
* to avoid delayed sampling of start bit at high speeds,
* else we see data corruption.
*/
if (uap->vendor->oversampling) {
if ((baud >= 3000000) && (baud < 3250000) && (quot > 1))
quot -= 1;
else if ((baud > 3250000) && (quot > 2))
quot -= 2;
}
/* Set baud rate */
writew(quot & 0x3f, port->membase + UART011_FBRD);
writew(quot >> 6, port->membase + UART011_IBRD);
/*
* ----------v----------v----------v----------v-----
serial: pl011: handle corruption at high clock speeds This works around a few glitches in the ST version of the PL011 serial driver when using very high baud rates, as we do in the Ux500: 3, 3.25, 4 and 4.05 Mbps. Problem Observed/rootcause: When using high baud-rates, and the baudrate*8 is getting close to the provided clock frequency (so a division factor close to 1), when using bursts of characters (so they are abutted), then it seems as if there is not enough time to detect the beginning of the start-bit which is a timing reference for the entire character, and thus the sampling moment of character bits is moving towards the end of each bit, instead of the middle. Fix: Increase slightly the RX baud rate of the UART above the theoretical baudrate by 5%. This will definitely give more margin time to the UART_RX to correctly sample the data at the middle of the bit period. Also fix the ages old copy-paste error in the very stressed comment, it's referencing the registers used in the PL010 driver rather than the PL011 ones. Signed-off-by: Guillaume Jaunet <guillaume.jaunet@stericsson.com> Signed-off-by: Christophe Arnal <christophe.arnal@stericsson.com> Signed-off-by: Matthias Locher <matthias.locher@stericsson.com> Signed-off-by: Rajanikanth HV <rajanikanth.hv@stericsson.com> Cc: stable <stable@vger.kernel.org> Cc: Bibek Basu <bibek.basu@stericsson.com> Cc: Par-Gunnar Hjalmdahl <par-gunnar.hjalmdahl@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-26 23:21:36 +08:00
* NOTE: lcrh_tx and lcrh_rx MUST BE WRITTEN AFTER
* UART011_FBRD & UART011_IBRD.
* ----------^----------^----------^----------^-----
*/
writew(lcr_h, port->membase + uap->lcrh_rx);
if (uap->lcrh_rx != uap->lcrh_tx) {
int i;
/*
* Wait 10 PCLKs before writing LCRH_TX register,
* to get this delay write read only register 10 times
*/
for (i = 0; i < 10; ++i)
writew(0xff, uap->port.membase + UART011_MIS);
writew(lcr_h, port->membase + uap->lcrh_tx);
}
writew(old_cr, port->membase + UART011_CR);
spin_unlock_irqrestore(&port->lock, flags);
}
static const char *pl011_type(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
return uap->port.type == PORT_AMBA ? uap->type : NULL;
}
/*
* Release the memory region(s) being used by 'port'
*/
static void pl011_release_port(struct uart_port *port)
{
release_mem_region(port->mapbase, SZ_4K);
}
/*
* Request the memory region(s) being used by 'port'
*/
static int pl011_request_port(struct uart_port *port)
{
return request_mem_region(port->mapbase, SZ_4K, "uart-pl011")
!= NULL ? 0 : -EBUSY;
}
/*
* Configure/autoconfigure the port.
*/
static void pl011_config_port(struct uart_port *port, int flags)
{
if (flags & UART_CONFIG_TYPE) {
port->type = PORT_AMBA;
pl011_request_port(port);
}
}
/*
* verify the new serial_struct (for TIOCSSERIAL).
*/
static int pl011_verify_port(struct uart_port *port, struct serial_struct *ser)
{
int ret = 0;
if (ser->type != PORT_UNKNOWN && ser->type != PORT_AMBA)
ret = -EINVAL;
if (ser->irq < 0 || ser->irq >= nr_irqs)
ret = -EINVAL;
if (ser->baud_base < 9600)
ret = -EINVAL;
return ret;
}
static struct uart_ops amba_pl011_pops = {
.tx_empty = pl011_tx_empty,
.set_mctrl = pl011_set_mctrl,
.get_mctrl = pl011_get_mctrl,
.stop_tx = pl011_stop_tx,
.start_tx = pl011_start_tx,
.stop_rx = pl011_stop_rx,
.enable_ms = pl011_enable_ms,
.break_ctl = pl011_break_ctl,
.startup = pl011_startup,
.shutdown = pl011_shutdown,
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
.flush_buffer = pl011_dma_flush_buffer,
.set_termios = pl011_set_termios,
.type = pl011_type,
.release_port = pl011_release_port,
.request_port = pl011_request_port,
.config_port = pl011_config_port,
.verify_port = pl011_verify_port,
#ifdef CONFIG_CONSOLE_POLL
tty/serial/amba-pl011: Implement poll_init callback The callback is used to initialize the hardware, nothing else should be done, i.e. we should not request interrupts (but we can and do unmask some of them, as they might be useful for NMI entry). As a side-effect, the patch also fixes a division by zero[1] when booting with kgdboc options specified (e.g. kgdboc=ttyAMA0,115200n8). The issue happens because serial core calls set_termios callback, but the driver doesn't know clock frequency, and thus cannot calculate proper baud rate values. [1] WARNING: at drivers/tty/serial/serial_core.c:400 uart_get_baud_rate+0xe8/0x14c() Modules linked in: [<c0018e50>] (unwind_backtrace+0x0/0xf0) from [<c0020ae8>] (warn_slowpath_common+0x4c/0x64) [<c0020ae8>] (warn_slowpath_common+0x4c/0x64) from [<c0020b1c>] (warn_slowpath_null+0x1c/0x24) [<c0020b1c>] (warn_slowpath_null+0x1c/0x24) from [<c0185ed8>] (uart_get_baud_rate+0xe8/0x14c) [<c0185ed8>] (uart_get_baud_rate+0xe8/0x14c) from [<c0187078>] (pl011_set_termios+0x48/0x278) [<c0187078>] (pl011_set_termios+0x48/0x278) from [<c01850b0>] (uart_set_options+0xe8/0x114) [<c01850b0>] (uart_set_options+0xe8/0x114) from [<c0185de4>] (uart_poll_init+0xd4/0xe0) [<c0185de4>] (uart_poll_init+0xd4/0xe0) from [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) from [<c0188538>] (configure_kgdboc+0xc8/0x1b8) [<c0188538>] (configure_kgdboc+0xc8/0x1b8) from [<c00088a4>] (do_one_initcall+0x30/0x168) [<c00088a4>] (do_one_initcall+0x30/0x168) from [<c033784c>] (do_basic_setup+0x94/0xc8) [<c033784c>] (do_basic_setup+0x94/0xc8) from [<c03378e0>] (kernel_init+0x60/0xf4) [<c03378e0>] (kernel_init+0x60/0xf4) from [<c00144a0>] (kernel_thread_exit+0x0/0x8) ---[ end trace 7d41c9186f342c40 ]--- Division by zero in kernel. [<c0018e50>] (unwind_backtrace+0x0/0xf0) from [<c014546c>] (Ldiv0+0x8/0x10) [<c014546c>] (Ldiv0+0x8/0x10) from [<c0187098>] (pl011_set_termios+0x68/0x278) [<c0187098>] (pl011_set_termios+0x68/0x278) from [<c01850b0>] (uart_set_options+0xe8/0x114) [<c01850b0>] (uart_set_options+0xe8/0x114) from [<c0185de4>] (uart_poll_init+0xd4/0xe0) [<c0185de4>] (uart_poll_init+0xd4/0xe0) from [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) from [<c0188538>] (configure_kgdboc+0xc8/0x1b8) [<c0188538>] (configure_kgdboc+0xc8/0x1b8) from [<c00088a4>] (do_one_initcall+0x30/0x168) [<c00088a4>] (do_one_initcall+0x30/0x168) from [<c033784c>] (do_basic_setup+0x94/0xc8) [<c033784c>] (do_basic_setup+0x94/0xc8) from [<c03378e0>] (kernel_init+0x60/0xf4) [<c03378e0>] (kernel_init+0x60/0xf4) from [<c00144a0>] (kernel_thread_exit+0x0/0x8) Division by zero in kernel. [<c0018e50>] (unwind_backtrace+0x0/0xf0) from [<c014546c>] (Ldiv0+0x8/0x10) [<c014546c>] (Ldiv0+0x8/0x10) from [<c0183a98>] (uart_update_timeout+0x4c/0x5c) [<c0183a98>] (uart_update_timeout+0x4c/0x5c) from [<c01870f8>] (pl011_set_termios+0xc8/0x278) [<c01870f8>] (pl011_set_termios+0xc8/0x278) from [<c01850b0>] (uart_set_options+0xe8/0x114) [<c01850b0>] (uart_set_options+0xe8/0x114) from [<c0185de4>] (uart_poll_init+0xd4/0xe0) [<c0185de4>] (uart_poll_init+0xd4/0xe0) from [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) [<c016da8c>] (tty_find_polling_driver+0x100/0x17c) from [<c0188538>] (configure_kgdboc+0xc8/0x1b8) [<c0188538>] (configure_kgdboc+0xc8/0x1b8) from [<c00088a4>] (do_one_initcall+0x30/0x168) [<c00088a4>] (do_one_initcall+0x30/0x168) from [<c033784c>] (do_basic_setup+0x94/0xc8) [<c033784c>] (do_basic_setup+0x94/0xc8) from [<c03378e0>] (kernel_init+0x60/0xf4) [<c03378e0>] (kernel_init+0x60/0xf4) from [<c00144a0>] (kernel_thread_exit+0x0/0x8) Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-25 05:27:54 +08:00
.poll_init = pl011_hwinit,
.poll_get_char = pl011_get_poll_char,
.poll_put_char = pl011_put_poll_char,
#endif
};
static struct uart_amba_port *amba_ports[UART_NR];
#ifdef CONFIG_SERIAL_AMBA_PL011_CONSOLE
static void pl011_console_putchar(struct uart_port *port, int ch)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
while (readw(uap->port.membase + UART01x_FR) & UART01x_FR_TXFF)
barrier();
writew(ch, uap->port.membase + UART01x_DR);
}
static void
pl011_console_write(struct console *co, const char *s, unsigned int count)
{
struct uart_amba_port *uap = amba_ports[co->index];
unsigned int status, old_cr, new_cr;
unsigned long flags;
int locked = 1;
clk_enable(uap->clk);
local_irq_save(flags);
if (uap->port.sysrq)
locked = 0;
else if (oops_in_progress)
locked = spin_trylock(&uap->port.lock);
else
spin_lock(&uap->port.lock);
/*
* First save the CR then disable the interrupts
*/
old_cr = readw(uap->port.membase + UART011_CR);
new_cr = old_cr & ~UART011_CR_CTSEN;
new_cr |= UART01x_CR_UARTEN | UART011_CR_TXE;
writew(new_cr, uap->port.membase + UART011_CR);
uart_console_write(&uap->port, s, count, pl011_console_putchar);
/*
* Finally, wait for transmitter to become empty
* and restore the TCR
*/
do {
status = readw(uap->port.membase + UART01x_FR);
} while (status & UART01x_FR_BUSY);
writew(old_cr, uap->port.membase + UART011_CR);
if (locked)
spin_unlock(&uap->port.lock);
local_irq_restore(flags);
clk_disable(uap->clk);
}
static void __init
pl011_console_get_options(struct uart_amba_port *uap, int *baud,
int *parity, int *bits)
{
if (readw(uap->port.membase + UART011_CR) & UART01x_CR_UARTEN) {
unsigned int lcr_h, ibrd, fbrd;
lcr_h = readw(uap->port.membase + uap->lcrh_tx);
*parity = 'n';
if (lcr_h & UART01x_LCRH_PEN) {
if (lcr_h & UART01x_LCRH_EPS)
*parity = 'e';
else
*parity = 'o';
}
if ((lcr_h & 0x60) == UART01x_LCRH_WLEN_7)
*bits = 7;
else
*bits = 8;
ibrd = readw(uap->port.membase + UART011_IBRD);
fbrd = readw(uap->port.membase + UART011_FBRD);
*baud = uap->port.uartclk * 4 / (64 * ibrd + fbrd);
if (uap->vendor->oversampling) {
if (readw(uap->port.membase + UART011_CR)
& ST_UART011_CR_OVSFACT)
*baud *= 2;
}
}
}
static int __init pl011_console_setup(struct console *co, char *options)
{
struct uart_amba_port *uap;
int baud = 38400;
int bits = 8;
int parity = 'n';
int flow = 'n';
int ret;
/*
* Check whether an invalid uart number has been specified, and
* if so, search for the first available port that does have
* console support.
*/
if (co->index >= UART_NR)
co->index = 0;
uap = amba_ports[co->index];
if (!uap)
return -ENODEV;
/* Allow pins to be muxed in and configured */
pinctrl_pm_select_default_state(uap->port.dev);
ret = clk_prepare(uap->clk);
if (ret)
return ret;
if (uap->port.dev->platform_data) {
struct amba_pl011_data *plat;
plat = uap->port.dev->platform_data;
if (plat->init)
plat->init();
}
uap->port.uartclk = clk_get_rate(uap->clk);
if (options)
uart_parse_options(options, &baud, &parity, &bits, &flow);
else
pl011_console_get_options(uap, &baud, &parity, &bits);
return uart_set_options(&uap->port, co, baud, parity, bits, flow);
}
static struct uart_driver amba_reg;
static struct console amba_console = {
.name = "ttyAMA",
.write = pl011_console_write,
.device = uart_console_device,
.setup = pl011_console_setup,
.flags = CON_PRINTBUFFER,
.index = -1,
.data = &amba_reg,
};
#define AMBA_CONSOLE (&amba_console)
#else
#define AMBA_CONSOLE NULL
#endif
static struct uart_driver amba_reg = {
.owner = THIS_MODULE,
.driver_name = "ttyAMA",
.dev_name = "ttyAMA",
.major = SERIAL_AMBA_MAJOR,
.minor = SERIAL_AMBA_MINOR,
.nr = UART_NR,
.cons = AMBA_CONSOLE,
};
static int pl011_probe_dt_alias(int index, struct device *dev)
{
struct device_node *np;
static bool seen_dev_with_alias = false;
static bool seen_dev_without_alias = false;
int ret = index;
if (!IS_ENABLED(CONFIG_OF))
return ret;
np = dev->of_node;
if (!np)
return ret;
ret = of_alias_get_id(np, "serial");
if (IS_ERR_VALUE(ret)) {
seen_dev_without_alias = true;
ret = index;
} else {
seen_dev_with_alias = true;
if (ret >= ARRAY_SIZE(amba_ports) || amba_ports[ret] != NULL) {
dev_warn(dev, "requested serial port %d not available.\n", ret);
ret = index;
}
}
if (seen_dev_with_alias && seen_dev_without_alias)
dev_warn(dev, "aliased and non-aliased serial devices found in device tree. Serial port enumeration may be unpredictable.\n");
return ret;
}
static int pl011_probe(struct amba_device *dev, const struct amba_id *id)
{
struct uart_amba_port *uap;
struct vendor_data *vendor = id->data;
void __iomem *base;
int i, ret;
for (i = 0; i < ARRAY_SIZE(amba_ports); i++)
if (amba_ports[i] == NULL)
break;
if (i == ARRAY_SIZE(amba_ports)) {
ret = -EBUSY;
goto out;
}
uap = devm_kzalloc(&dev->dev, sizeof(struct uart_amba_port),
GFP_KERNEL);
if (uap == NULL) {
ret = -ENOMEM;
goto out;
}
i = pl011_probe_dt_alias(i, &dev->dev);
base = devm_ioremap(&dev->dev, dev->res.start,
resource_size(&dev->res));
if (!base) {
ret = -ENOMEM;
goto out;
}
uap->clk = devm_clk_get(&dev->dev, NULL);
if (IS_ERR(uap->clk)) {
ret = PTR_ERR(uap->clk);
goto out;
}
uap->vendor = vendor;
uap->lcrh_rx = vendor->lcrh_rx;
uap->lcrh_tx = vendor->lcrh_tx;
uap->old_cr = 0;
uap->fifosize = vendor->get_fifosize(dev);
uap->port.dev = &dev->dev;
uap->port.mapbase = dev->res.start;
uap->port.membase = base;
uap->port.iotype = UPIO_MEM;
uap->port.irq = dev->irq[0];
uap->port.fifosize = uap->fifosize;
uap->port.ops = &amba_pl011_pops;
uap->port.flags = UPF_BOOT_AUTOCONF;
uap->port.line = i;
pl011_dma_probe(&dev->dev, uap);
/* Ensure interrupts from this UART are masked and cleared */
writew(0, uap->port.membase + UART011_IMSC);
writew(0xffff, uap->port.membase + UART011_ICR);
snprintf(uap->type, sizeof(uap->type), "PL011 rev%u", amba_rev(dev));
amba_ports[i] = uap;
amba_set_drvdata(dev, uap);
ret = uart_add_one_port(&amba_reg, &uap->port);
if (ret) {
amba_set_drvdata(dev, NULL);
amba_ports[i] = NULL;
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
pl011_dma_remove(uap);
}
out:
return ret;
}
static int pl011_remove(struct amba_device *dev)
{
struct uart_amba_port *uap = amba_get_drvdata(dev);
int i;
amba_set_drvdata(dev, NULL);
uart_remove_one_port(&amba_reg, &uap->port);
for (i = 0; i < ARRAY_SIZE(amba_ports); i++)
if (amba_ports[i] == uap)
amba_ports[i] = NULL;
ARM: PL011: Add support for transmit DMA Add DMA engine support for transmit to the PL011 driver. Based on a patch from Linus Walliej, with the following changes: - remove RX DMA support. As PL011 doesn't give us receive timeout interrupts, we only get notified of received data when the RX DMA has completed. This rather sucks for interactive use of the TTY. - remove abuse of completions. Completions are supposed to be for events, not to tell what condition buffers are in. Replace it with a simple 'queued' bool. - fix locking - it is only safe to access the circular buffer with the port lock held. - only map the DMA buffer when required - if we're ever behind an IOMMU this helps keep IOMMU usage down, and also ensures that we're legal when we change the scatterlist entry length. - fix XON/XOFF sending - we must send XON/XOFF characters out as soon as possible - waiting for up to 4095 characters in the DMA buffer to be sent first is not acceptable. - fix XON/XOFF receive handling - we need to stop DMA when instructed to by the TTY layer, and restart it again when instructed to. There is a subtle problem here: we must not completely empty the circular buffer with DMA, otherwise we will not be notified of XON. - change the 'enable_dma' flag into a 'using DMA' flag, and track whether we can use TX DMA by whether the channel pointer is non-NULL. This gives us more control over whether we use DMA in the driver. - we don't need to have the TX DMA buffer continually allocated for each port - instead, allocate it when the port starts up, and free it when it's shut down. Update the 'using DMA' flag if we get the buffer, and adjust the TTY FIFO size appropriately. - if we're going to use PIO to send characters, use the existing IRQ based functionality rather than reimplementing it. This also ensures we call uart_write_wakeup() at the appropriate time, otherwise we'll stall. - use DMA engine helper functions for type safety. - fix init when built as a module - we can't have to initcall functions, so we must settle on one. This means we can eliminate the deferred DMA initialization. - there is no need to terminate transfers on a failed prep_slave_sg() call - nothing has been setup, so nothing needs to be terminated. This avoids a potential deadlock in the DMA engine code (tasklet->callback->failed prepare->terminate->tasklet_disable which then ends up waiting for the tasklet to finish running.) - Dan says that the submission callback should not return an error: | dma_submit_error() is something I should have removed after commit | a0587bcf "ioat1: move descriptor allocation from submit to prep" all | errors should be notified by prep failing to return a descriptor | handle. Negative dma_cookie_t values are only returned by the | dma_async_memcpy* calls which translate a prep failure into -ENOMEM. So remove the error handling at that point. This also solves the potential deadlock mentioned in the previous comment. Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-23 01:24:39 +08:00
pl011_dma_remove(uap);
return 0;
}
#ifdef CONFIG_PM
static int pl011_suspend(struct amba_device *dev, pm_message_t state)
{
struct uart_amba_port *uap = amba_get_drvdata(dev);
if (!uap)
return -EINVAL;
return uart_suspend_port(&amba_reg, &uap->port);
}
static int pl011_resume(struct amba_device *dev)
{
struct uart_amba_port *uap = amba_get_drvdata(dev);
if (!uap)
return -EINVAL;
return uart_resume_port(&amba_reg, &uap->port);
}
#endif
static struct amba_id pl011_ids[] = {
{
.id = 0x00041011,
.mask = 0x000fffff,
.data = &vendor_arm,
},
{
.id = 0x00380802,
.mask = 0x00ffffff,
.data = &vendor_st,
},
{ 0, 0 },
};
MODULE_DEVICE_TABLE(amba, pl011_ids);
static struct amba_driver pl011_driver = {
.drv = {
.name = "uart-pl011",
},
.id_table = pl011_ids,
.probe = pl011_probe,
.remove = pl011_remove,
#ifdef CONFIG_PM
.suspend = pl011_suspend,
.resume = pl011_resume,
#endif
};
static int __init pl011_init(void)
{
int ret;
printk(KERN_INFO "Serial: AMBA PL011 UART driver\n");
ret = uart_register_driver(&amba_reg);
if (ret == 0) {
ret = amba_driver_register(&pl011_driver);
if (ret)
uart_unregister_driver(&amba_reg);
}
return ret;
}
static void __exit pl011_exit(void)
{
amba_driver_unregister(&pl011_driver);
uart_unregister_driver(&amba_reg);
}
/*
* While this can be a module, if builtin it's most likely the console
* So let's leave module_exit but move module_init to an earlier place
*/
arch_initcall(pl011_init);
module_exit(pl011_exit);
MODULE_AUTHOR("ARM Ltd/Deep Blue Solutions Ltd");
MODULE_DESCRIPTION("ARM AMBA serial port driver");
MODULE_LICENSE("GPL");