229 lines
7.7 KiB
TeX
229 lines
7.7 KiB
TeX
% Copyright 2003,2005,2007 Alain Knaff.
|
|
|
|
% This documentation is for Mtools which is a collection of tools to
|
|
% allow Unix systems to manipulate MS-DOS files.
|
|
|
|
% Permission is granted to copy, distribute and/or modify this document
|
|
% under the terms of the GNU Free Documentation License, Version 1.3 or
|
|
% any later version published by the Free Software Foundation; with no
|
|
% Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
|
|
%Texts. A copy of the license is included in the section entitled
|
|
% ``GNU Free Documentation License''.
|
|
|
|
\documentclass[a4paper,12pt]{article}
|
|
|
|
\usepackage[T1]{fontenc}
|
|
\usepackage[latin1]{inputenc}
|
|
\usepackage{pslatex}
|
|
\usepackage[pdfpagemode=None,colorlinks]{hyperref}
|
|
|
|
\author{Alain Knaff}
|
|
\title{How mformat-3.9.10 and above calculates needed FAT size}
|
|
|
|
\begin{document}
|
|
|
|
\maketitle
|
|
|
|
This small document explains the formula used by {\tt mformat.c} to
|
|
figure out fat size and number of clusters. Due to the way that
|
|
filesystem parameters are stored in the boot sector, we can afford to
|
|
have a FAT that is larger than need to be to accommodate the clusters
|
|
present on disk, but under no circumstances can we have one that is
|
|
too small.
|
|
|
|
In this discussion, we use the following variable names:
|
|
|
|
\begin{tabular}{|r|p{12cm}|}
|
|
|
|
\hline
|
|
|
|
$fatNybls$&
|
|
Number of nubbles (4 bit unit per FAT). This is 3 for FAT12, 4 for
|
|
FAT16, and 8 for FAT16\\
|
|
|
|
\hline
|
|
|
|
$numClus$&
|
|
Number of clusters on the disk\\
|
|
|
|
\hline
|
|
|
|
$clusSiz$&
|
|
Size of a cluster, expressed in sectors\\
|
|
|
|
\hline
|
|
|
|
$secSiz$&
|
|
Size of a sector, in bytes. Size of a sector in nybbles is $secSiz$ * 2\\
|
|
|
|
\hline
|
|
|
|
$nfats$&
|
|
Number of FAT copies, usually two\\
|
|
|
|
\hline
|
|
|
|
$remSects$&
|
|
``Remaining sectors'', after number of boot sectors and root directory
|
|
have been accounted for\\
|
|
|
|
\hline
|
|
|
|
$fatLen$&
|
|
Length of the FAT, in sectors\\
|
|
|
|
\hline
|
|
|
|
|
|
\end{tabular}
|
|
|
|
\ \\
|
|
|
|
Taking into account both data and fat, each cluster takes up the
|
|
following number of nybbles (units of 4 bytes):
|
|
|
|
|
|
$$clusSiz * (2*secSiz) + nfats * fatNybls$$
|
|
|
|
This accounts for the data of the cluster ($clusSiz * secSiz$),
|
|
as well as for the space taken up by its descriptor.
|
|
|
|
The space taken up by all clusters together, plus the space taken by
|
|
descriptors for clusters 0 and 1 ($2*nfats*fatNybls$) should be less
|
|
than what is available.
|
|
|
|
Additional sectors may be lost due to slack (you have to use a full
|
|
FAT sector, you also have to use a full cluster in the data
|
|
area). Thus, an {\em upper bound} for the number of clusters is as
|
|
follows:
|
|
|
|
$$
|
|
numClus \le {2*remSect*secSiz - 2*nfats*fatNybls \over
|
|
2*clusSiz*secSiz + nfats*fatNybls}
|
|
$$
|
|
|
|
|
|
$$
|
|
numClus \le {(remSect+2*clusSiz)*2*secSiz \over
|
|
2*clusSiz*secSiz + nfats*fatNybls} - 2
|
|
$$
|
|
|
|
|
|
These will take up at most the following space in one copy of the FAT
|
|
(we have to round up, because even a half-full fat sector must be
|
|
taken in its entirety)
|
|
|
|
$$
|
|
fatLen \le \left\lceil { (numClus+2)*fatNybls \over secSiz * 2 } \right\rceil
|
|
$$
|
|
|
|
|
|
$$
|
|
fatLen \le \left\lceil {
|
|
\left( { 2*(remSect+2*clusSiz)*secSiz \over
|
|
2*clusSiz*secSiz + nfats*fatNybls} \right) * fatNybls \over
|
|
2*secSiz
|
|
} \right\rceil
|
|
$$
|
|
|
|
|
|
$$
|
|
fatLen \le \left\lceil {
|
|
(remSect+2*clusSiz)* fatNybls \over
|
|
2*clusSiz*secSiz + nfats*fatNybls
|
|
} \right\rceil
|
|
$$
|
|
|
|
The space left after FAT sector has been deduced is now {\em less than
|
|
or equal} to what would be needed for the data area of the clusters
|
|
(including fractional clusters), which is good, as we may have under
|
|
no circumstances {\em more} clusters in the data area than in the FAT.
|
|
An important point in this calculation is that we based the needed FAT
|
|
size on a {\em fractional} number of clusters, rather than a rounded
|
|
down amount of clusters. Indeed, using a rounded down number could
|
|
have exposed us to a situation where we had an {\em almost enough}
|
|
space for one more cluster (i.e. not enough space for data + FAT, but
|
|
enough for data alone). This situation, combined with a situation
|
|
where the last FAT sector was flush full could have lead to a
|
|
situation where $numClus$ would become too large for the FAT to
|
|
accommodate. I think this is clearer with an example:
|
|
\begin{itemize}
|
|
\item $remSect=4127$, $clusSiz=1$, $nfats=1$
|
|
\item (Non rounded down) $numClus={(4127+2)*(1024) \over 1032} -
|
|
2 =4094.992$
|
|
\item Rounded down: 4094 clusters
|
|
\item These fit into 16 FAT sectors, exactly
|
|
\item ... leaving us 4095 clusters, which is one to many (because
|
|
these 4095 clusters would now need 17 FAT clusters)
|
|
\end{itemize}
|
|
|
|
Keeping the fractional part (0.992) allows us to round {\em up} the
|
|
needed number of FAT sectors to 17, nicely solving this problem.
|
|
|
|
The downside of counting the fractional part however is that we quite
|
|
often waste a sector in the really common situation where both $nfats$
|
|
and $clusSiz$ are even, while $remSect$ is odd. An easy way to address
|
|
this is to subtract one from $remSect$ before application of the
|
|
formula, if this case is detected. Such operation carries no risk, as
|
|
the odd final sector cannot be used to make a full cluster.
|
|
|
|
There is still a case however, where fatLen must be grown manually
|
|
after application of the formula: If numClus exceeds the maximum
|
|
number of clusters allowable for FAT12 or FAT16), we need to shrink
|
|
$numClus$ after application of the formula, and manually make the FAT
|
|
larger in order to take up any excess space.
|
|
|
|
Also note that as we used upper bounds, we may waste a certain number
|
|
of sectors, which an exact calculation may not have wasted. However,
|
|
normally, we should not lose more than one sector per FAT copy.
|
|
|
|
N.B. In its document at \url{http://www.microsoft.com/hwdev/download/hardware/fatgen103.pdf},
|
|
Microsoft proposes a much simpler formula. However, this formula is
|
|
both wrong (i.e. occasionally it produces a smaller FAT than is
|
|
needed for the clusters on disk), less generic (only works for sector
|
|
sizes equal to 512), and less efficient (in case of FAT32, it may
|
|
waste up to 8 sectors!)
|
|
|
|
The formula is the following (for FAT16):
|
|
$$
|
|
fatLen \le \left\lceil { remSect \over 256 * clusSiz + nfats} \right\rceil
|
|
$$
|
|
|
|
Note that it doesn't account for the dummy clusters 0 and 1. Thus, if
|
|
we have 258 sectors remaining, with a cluster size of 1, and two FAT
|
|
copies, the Microsoft formula mistakenly assumes fatLen = 1. This
|
|
leaves 258 - 2 = 256 sectors for clusters, which yields 256 clusters.
|
|
However, those clusters do not fit into the FAT, as two clusters are
|
|
lost (0 and 1). However, to Micro\$ofts' credit, this is not actually
|
|
the formula they're using (tested with $remsect=160056$ and
|
|
$clusSize=4$), so this seems to be a documentation issue rather than a
|
|
genuine bug.
|
|
|
|
In case of FAT32, Microsoft just halves the denominator. This is
|
|
wasteful, as only the $256*clusSiz$ part would need to be halved.
|
|
|
|
If we assume 16777000, and a cluster size of 8, our formula would give
|
|
us:
|
|
$$fatLen = \left\lceil (16777000 + 16) * 8 \over 2 * 8 * 512 + 16
|
|
\right\rceil = 16352$$
|
|
This would leave $16777000-2*16352=16744296$ sectors for the clusters,
|
|
i.e. 2093037 clusters. The FAT descriptors for those 2093037 clusters
|
|
do indeed fit into our 16352 fat sectors.
|
|
|
|
Microsoft, on the other hand, would get: $$fatLen = \left\lceil{
|
|
16777000 \over (256 * 8 + 2)/2} \right\rceil = 16368$$ This would only
|
|
leave $16777000-2*16368=16744264$, i.e. 2093033 clusters, thus wasting
|
|
4 clusters. The FAT would be 28 sectors too big, i.e. more than the
|
|
mere 8 sectors announced by Microsoft! Unfortunately, I currently
|
|
don't have access to any sufficiently recent Windows to check out
|
|
whether this really happens in the code, or whether it is only a
|
|
documentation issue as well.
|
|
|
|
Oh, and before somebody points it out: the formula in this document
|
|
occassionnally wastes sectors too, although not as much (Example: With
|
|
$remsect=685$, $clusSiz=1$ and $nfats=2$ our formula gives $fatLen=3$,
|
|
which leaves 679 clusters. However, we could use $fatLen=2$, leaving
|
|
681 clusters.
|
|
|
|
\end{document}
|