2005-08-05 10:30:31 +08:00
/*
 * iSCSI Initiator TCP Transport
 * Copyright (C) 2004 Dmitry Yusupov
 * Copyright (C) 2004 Alex Aizman
2006-04-07 10:26:46 +08:00
 * Copyright (C) 2005 - 2006 Mike Christie
 * Copyright (C) 2006 Red Hat, Inc. All rights reserved.
2005-08-05 10:30:31 +08:00
 * maintained by open-iscsi@googlegroups.com
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published
 * by the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * General Public License for more details.
 *
 * See the file COPYING included with this distribution for more details.
 */

#ifndef ISCSI_TCP_H
#define ISCSI_TCP_H
2006-04-07 10:26:46 +08:00
#include <scsi/libiscsi.h>
2005-08-05 10:30:31 +08:00
/* Socket's Receive state machine */
#define IN_PROGRESS_WAIT_HEADER		0x0
#define IN_PROGRESS_HEADER_GATHER	0x1
#define IN_PROGRESS_DATA_RECV		0x2
#define IN_PROGRESS_DDIGEST_RECV	0x3
2007-05-31 01:57:20 +08:00
#define IN_PROGRESS_PAD_RECV		0x4
2005-08-05 10:30:31 +08:00
/* xmit state machine */
[SCSI] iscsi_tcp: fix potential lockup with write commands

There is a race condition in iscsi_tcp.c that may cause it to forget
that it received a R2T from the target. This race may cause a data-out
command (such as a write) to lock up. The race occurs here:

static int
iscsi_send_unsol_pdu(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
{
	struct iscsi_tcp_cmd_task *tcp_ctask = ctask->dd_data;
	int rc;

	if (tcp_ctask->xmstate & XMSTATE_UNS_HDR) {
		BUG_ON(!ctask->unsol_count);
		tcp_ctask->xmstate &= ~XMSTATE_UNS_HDR; <---- RACE
...

static int
iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
{
...
	tcp_ctask->xmstate |= XMSTATE_SOL_HDR_INIT; <---- RACE
...

While iscsi_xmitworker() (called from scsi_queue_work()) is preparing to
send unsolicited data, iscsi_tcp_data_recv() (called from
tcp_read_sock()) interrupts it upon receipt of a R2T from the target.
Both contexts do read-modify-write of tcp_ctask->xmstate. Usually, gcc
on x86 will make &= and |= atomic on UP (not guaranteed of course), but
in this case iscsi_send_unsol_pdu() reads the value of xmstate before
clearing the bit, which causes gcc to read xmstate into a CPU register,
test it, clear the bit, and then store it back to memory. If the recv
interrupt happens during this sequence, then the XMSTATE_SOL_HDR_INIT
bit set by the recv interrupt will be lost, and the R2T will be
forgotten.

The patch below (against 2.6.24-rc1) converts accesses of xmstate to use
set_bit, clear_bit, and test_bit instead of |= and &=. I have tested
this patch and verified that it fixes the problem. Another possible
approach would be to hold a lock during most of the rx/tx setup and
post-processing, and drop the lock only for the actual rx/tx.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2007-11-15 04:38:42 +08:00
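The lost-update race described in the commit message above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver's code: it assumes C11 `<stdatomic.h>` as a stand-in for the kernel's `set_bit`, `clear_bit`, and `test_bit`, and the `xm_*` helpers and the two `xmit_`/`recv_` functions are hypothetical names introduced only for illustration. The two bit numbers are the ones defined in this header.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers as defined in this header. */
#define XMSTATE_BIT_UNS_HDR		5
#define XMSTATE_BIT_SOL_HDR_INIT	13

/* Hypothetical helper: build the mask for one bit of the state word. */
static inline unsigned long xm_mask(int nr)
{
	assert(nr >= 0 && nr < 32);	/* unsigned long has at least 32 bits */
	return 1UL << nr;
}

/*
 * Hypothetical userspace stand-ins for the kernel's set_bit(),
 * clear_bit() and test_bit(). Each modification is one atomic
 * read-modify-write, so a concurrent update to a different bit
 * of the same word cannot be overwritten.
 */
static inline void xm_set_bit(int nr, atomic_ulong *word)
{
	atomic_fetch_or(word, xm_mask(nr));
}

static inline void xm_clear_bit(int nr, atomic_ulong *word)
{
	atomic_fetch_and(word, ~xm_mask(nr));
}

static inline bool xm_test_bit(int nr, atomic_ulong *word)
{
	return (atomic_load(word) & xm_mask(nr)) != 0;
}

/* Transmit side (hypothetical): consume a pending UNS_HDR request. */
static bool xmit_consume_uns_hdr(atomic_ulong *xmstate)
{
	if (!xm_test_bit(XMSTATE_BIT_UNS_HDR, xmstate))
		return false;
	/* Only this one bit is touched; other bits are left as they are. */
	xm_clear_bit(XMSTATE_BIT_UNS_HDR, xmstate);
	return true;
}

/* Receive side (hypothetical): remember that an R2T arrived. */
static void recv_note_r2t(atomic_ulong *xmstate)
{
	xm_set_bit(XMSTATE_BIT_SOL_HDR_INIT, xmstate);
}
```

With a plain `xmstate &= ~mask`, the compiler may load the word, modify it in a register, and store it back, so a bit set by the receive path in between is lost. In this sketch each update is a single atomic operation, so calling `recv_note_r2t()` at any point around `xmit_consume_uns_hdr()` should leave the `SOL_HDR_INIT` bit set.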
#define XMSTATE_VALUE_IDLE			0
#define XMSTATE_BIT_CMD_HDR_INIT		0
#define XMSTATE_BIT_CMD_HDR_XMIT		1
#define XMSTATE_BIT_IMM_HDR			2
#define XMSTATE_BIT_IMM_DATA			3
#define XMSTATE_BIT_UNS_INIT			4
#define XMSTATE_BIT_UNS_HDR			5
#define XMSTATE_BIT_UNS_DATA			6
#define XMSTATE_BIT_SOL_HDR			7
#define XMSTATE_BIT_SOL_DATA			8
#define XMSTATE_BIT_W_PAD			9
#define XMSTATE_BIT_W_RESEND_PAD		10
#define XMSTATE_BIT_W_RESEND_DATA_DIGEST	11
#define XMSTATE_BIT_IMM_HDR_INIT		12
#define XMSTATE_BIT_SOL_HDR_INIT		13
[SCSI] iscsi_tcp: fix padding, data digests, and IO at weird offsets

iscsi_tcp calculates padding by using the expected transfer length. This
has the problem where if we have immediate data = no and initial R2T =
yes, and the transfer length ended up needing padding then we send:

1. header
2. padding which should have gone after data
3. data

Besides this bug, we also assume the target will always ask for nice
transfer lengths and the first burst length will always be a nice value.
As far as I can tell from the RFC this is not a requirement. It would be
silly to do this, but if someone did it we will end up doing bad things.

Finally the last bug in that bit of code is in our handling of the
recalculation of data digests when we do not send a whole iscsi_buf in
one try. The bug here is that we call crypto_digest_final on an
iscsi_sendpage error, then when we send the rest of the iscsi_buf, we
do iscsi_data_digest_init and this causes the previous data digest to be
lost.

And to make matters worse, some of these bugs are replicated over and
over and over again for immediate data, solicited data and unsolicited
data. So the attached patch made over the iscsi git tree (see
kernel.org/git for details) which I updated today to include the patches
I said I merged, consolidates the sending of data, padding and digests
and calculation of data digests and fixes the above bugs.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-09-01 06:09:27 +08:00
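The commit message above argues that padding has to follow the data actually sent in a PDU, rather than being derived once from the expected transfer length. A minimal sketch of the per-segment calculation follows, assuming the 4-byte alignment given by `ISCSI_PAD_LEN` in this header. The helper name `iscsi_pad_bytes` is hypothetical and is not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define ISCSI_PAD_LEN	4	/* same value as in this header */

/* The mask trick below only works for power-of-two alignments. */
static_assert((ISCSI_PAD_LEN & (ISCSI_PAD_LEN - 1)) == 0,
	      "ISCSI_PAD_LEN must be a power of two");

/*
 * Hypothetical helper: number of pad bytes that must follow a data
 * segment of data_len bytes so that the next field starts on an
 * ISCSI_PAD_LEN boundary. Returns 0 when data_len is already aligned.
 */
static inline uint32_t iscsi_pad_bytes(uint32_t data_len)
{
	uint32_t mask = ISCSI_PAD_LEN - 1;

	return (ISCSI_PAD_LEN - (data_len & mask)) & mask;
}
```

For example, a 5-byte data segment would need 3 pad bytes and a 6-byte segment 2, while a 4-byte or empty segment needs none. Computing this from the length of each segment as it is sent keeps the order header, data, padding, which is the ordering the commit says was violated.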
2005-08-05 10:30:31 +08:00
#define ISCSI_PAD_LEN			4
#define ISCSI_SG_TABLESIZE		SG_ALL
#define ISCSI_TCP_MAX_CMD_LEN		16
2006-08-24 16:45:50 +08:00
struct crypto_hash;
2006-04-07 10:26:46 +08:00
struct socket;
2005-08-05 10:30:31 +08:00
/* Socket connection receive helper */
struct iscsi_tcp_recv {
	struct iscsi_hdr	*hdr;
	struct sk_buff		*skb;
	int			offset;
	int			len;
	int			hdr_offset;
	int			copy;
	int			copied;
	int			padding;
	struct iscsi_cmd_task	*ctask;		/* current cmd in progress */

	/* copied and flipped values */
	int			datalen;
	int			datadgst;
2006-04-07 10:26:46 +08:00
	char			zero_copy_hdr;
2005-08-05 10:30:31 +08:00
};
2006-04-07 10:26:46 +08:00
struct iscsi_tcp_conn {
	struct iscsi_conn	*iscsi_conn;
	struct socket		*sock;
2005-08-05 10:30:31 +08:00
	struct iscsi_hdr	hdr;		/* header placeholder */
	char			hdrext[4*sizeof(__u16) +
				    sizeof(__u32)];
	int			data_copied;
	int			stop_stage;	/* conn_stop() flag: *
						 * stop to recover,  *
						 * stop to terminate */
	/* iSCSI connection-wide sequencing */
	int			hdr_size;	/* PDU header size */

	/* control data */
	struct iscsi_tcp_recv	in;		/* TCP receive context */
	int			in_progress;	/* connection state machine */

	/* old values for socket callbacks */
	void			(*old_data_ready)(struct sock *, int);
	void			(*old_state_change)(struct sock *);
	void			(*old_write_space)(struct sock *);
2006-09-01 06:09:28 +08:00
	/* data and header digests */
2006-09-24 04:33:43 +08:00
	struct hash_desc	tx_hash;	/* CRC32C (Tx) */
	struct hash_desc	rx_hash;	/* CRC32C (Rx) */
2005-08-05 10:30:31 +08:00
2006-04-07 10:26:46 +08:00
	/* MIB custom statistics */
2005-08-05 10:30:31 +08:00
	uint32_t		sendpage_failures_cnt;
	uint32_t		discontiguous_hdr_cnt;
2006-01-14 08:05:44 +08:00
	ssize_t (*sendpage)(struct socket *, struct page *, int, size_t, int);
2005-08-05 10:30:31 +08:00
};

struct iscsi_buf {
	struct scatterlist	sg;
	unsigned int		sent;
2006-01-14 08:05:47 +08:00
	char			use_sendmsg;
2005-08-05 10:30:31 +08:00
};

struct iscsi_data_task {
	struct iscsi_data	hdr;			/* PDU */
	char			hdrext[sizeof(__u32)];	/* Header-Digest */
	struct iscsi_buf	digestbuf;		/* digest buffer */
	uint32_t		digest;			/* data digest */
};
2006-04-07 10:26:46 +08:00
struct iscsi_tcp_mgmt_task {
	struct iscsi_hdr	hdr;
	char			hdrext[sizeof(__u32)];	/* Header-Digest */
2007-11-15 04:38:42 +08:00
	unsigned long		xmstate;	/* mgmt xmit progress */
2005-08-05 10:30:31 +08:00
	struct iscsi_buf	headbuf;	/* header buffer */
	struct iscsi_buf	sendbuf;	/* in progress buffer */
	int			sent;
};

struct iscsi_r2t_info {
	__be32			ttt;		/* copied from R2T */
	__be32			exp_statsn;	/* copied from R2T */
	uint32_t		data_length;	/* copied from R2T */
	uint32_t		data_offset;	/* copied from R2T */
	struct iscsi_buf	headbuf;	/* Data-Out Header Buffer */
	struct iscsi_buf	sendbuf;	/* Data-Out in progress buffer */
	int			sent;		/* R2T sequence progress */
	int			data_count;	/* DATA-Out payload progress */
	struct scatterlist	*sg;		/* per-R2T SG list */
	int			solicit_datasn;
2006-05-19 09:31:36 +08:00
	struct iscsi_data_task	dtask;		/* which data task */
2005-08-05 10:30:31 +08:00
};
2006-04-07 10:26:46 +08:00
struct iscsi_tcp_cmd_task {
	struct iscsi_cmd	hdr;
2005-08-05 10:30:31 +08:00
	char			hdrext[4*sizeof(__u16)+	/* AHS */
				    sizeof(__u32)];	/* HeaderDigest */
	char			pad[ISCSI_PAD_LEN];
2006-04-07 10:26:46 +08:00
	int			pad_count;	/* padded bytes */
2005-08-05 10:30:31 +08:00
	struct iscsi_buf	headbuf;	/* header buf (xmit) */
	struct iscsi_buf	sendbuf;	/* in progress buffer */
2007-11-15 04:38:42 +08:00
	unsigned long		xmstate;	/* xmit state machine */
2005-08-05 10:30:31 +08:00
	int			sent;
	struct scatterlist	*sg;		/* per-cmd SG list */
	struct scatterlist	*bad_sg;	/* assert statement */
	int			sg_count;	/* SG's to process */
2007-05-31 01:57:14 +08:00
	uint32_t		exp_datasn;	/* expected target's R2TSN/DataSN */
2005-08-05 10:30:31 +08:00
	int			data_offset;
	struct iscsi_r2t_info	*r2t;		/* in progress R2T */
	struct iscsi_queue	r2tpool;
	struct kfifo		*r2tqueue;
	struct iscsi_r2t_info	**r2ts;
	int			digest_count;
	uint32_t		immdigest;	/* for imm data */
	struct iscsi_buf	immbuf;		/* for imm data digest */
2006-05-19 09:31:36 +08:00
	struct iscsi_data_task	unsol_dtask;	/* unsol data task */
2005-08-05 10:30:31 +08:00
};

#endif /* ISCSI_TCP_H */