xfs: always verify the log tail during recovery

Log tail verification currently only occurs when torn writes are
detected at the head of the log. This was introduced because a
change in the head block due to torn writes can lead to a change in
the tail block (each log record header references the current tail)
and the tail block should be verified before log recovery proceeds.

Tail corruption is possible outside of torn write scenarios,
however. For example, partial log writes can be detected and cleared
during the initial head/tail block discovery process. If the partial
write coincides with a tail overwrite, the log tail is corrupted and
recovery fails.

To facilitate correct handling of log tail overwrites, update log
recovery to always perform tail verification. This is necessary to
detect potential tail overwrite conditions when torn writes may not
have occurred. This changes normal (i.e., no torn writes) recovery
behavior slightly to detect and return CRC-related errors near the
tail before actual recovery starts.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster 2017-08-08 18:21:51 -07:00 committed by Darrick J. Wong
parent 284f1c2c9b
commit 5297ac1f6d
1 changed files with 3 additions and 23 deletions

@@ -1183,31 +1183,11 @@ xlog_verify_head(
 			ASSERT(0);
 			return 0;
 		}
-
-		/*
-		 * Now verify the tail based on the updated head. This is
-		 * required because the torn writes trimmed from the head could
-		 * have been written over the tail of a previous record. Return
-		 * any errors since recovery cannot proceed if the tail is
-		 * corrupt.
-		 *
-		 * XXX: This leaves a gap in truly robust protection from torn
-		 * writes in the log. If the head is behind the tail, the tail
-		 * pushes forward to create some space and then a crash occurs
-		 * causing the writes into the previous record's tail region to
-		 * tear, log recovery isn't able to recover.
-		 *
-		 * How likely is this to occur? If possible, can we do something
-		 * more intelligent here? Is it safe to push the tail forward if
-		 * we can determine that the tail is within the range of the
-		 * torn write (e.g., the kernel can only overwrite the tail if
-		 * it has actually been pushed forward)? Alternatively, could we
-		 * somehow prevent this condition at runtime?
-		 */
-		error = xlog_verify_tail(log, *head_blk, *tail_blk);
 	}
+	if (error)
+		return error;

-	return error;
+	return xlog_verify_tail(log, *head_blk, *tail_blk);
 }

 /*
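
For readers who want to see the behavioral difference in isolation, the control-flow change in this hunk can be modeled roughly as below. This is a minimal userspace sketch, not the kernel code: `struct mock_log`, `mock_verify_tail()`, `verify_head_old()`, `verify_head_new()`, and `run_sketch()` are hypothetical stand-ins, and `-EBADMSG` is used on the assumption that it represents the CRC failure that tail verification would report.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the log state; not a kernel structure. */
struct mock_log {
	int torn_head;    /* nonzero: head verification found torn writes */
	int tail_corrupt; /* nonzero: the tail block would fail its CRC check */
	int tail_checks;  /* how many times the tail was verified */
};

/* Stand-in for tail verification: counts the call, reports a CRC error. */
static int mock_verify_tail(struct mock_log *log)
{
	log->tail_checks++;
	return log->tail_corrupt ? -EBADMSG : 0;
}

/* Old flow: the tail is verified only inside the torn-write branch. */
static int verify_head_old(struct mock_log *log)
{
	int error = 0;

	if (log->torn_head) {
		/* head trimmed back past the torn writes (elided) */
		error = mock_verify_tail(log);
	}

	return error;
}

/* New flow: the tail is verified on every successful head verification. */
static int verify_head_new(struct mock_log *log)
{
	int error = 0;

	if (log->torn_head) {
		/* head trimmed back past the torn writes (elided) */
	}
	if (error)
		return error;

	return mock_verify_tail(log);
}

/* Compare both flows on a log with a corrupt tail and no torn head. */
static void run_sketch(void)
{
	struct mock_log a = { 0, 1, 0 };
	struct mock_log b = { 0, 1, 0 };

	assert(verify_head_old(&a) == 0 && a.tail_checks == 0);
	assert(verify_head_new(&b) == -EBADMSG && b.tail_checks == 1);
}
```

In this model the old flow misses a corrupt tail whenever the head was not torn, while the new flow returns the error before recovery would start, which matches the behavior change described in the commit message.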