2020-04-25 21:19:08 +08:00
|
|
|
/* SPDX-License-Identifier: GPL-2.0 */
|
2012-11-29 12:28:09 +08:00
|
|
|
/*
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
* fs/f2fs/gc.h
|
|
|
|
*
|
|
|
|
* Copyright (c) 2012 Samsung Electronics Co., Ltd.
|
|
|
|
* http://www.samsung.com/
|
|
|
|
*/
|
|
|
|
#define GC_THREAD_MIN_WB_PAGES 1 /*
|
|
|
|
* a threshold to determine
|
|
|
|
* whether IO subsystem is idle
|
|
|
|
* or not
|
|
|
|
*/
|
2017-08-07 13:09:00 +08:00
|
|
|
#define DEF_GC_THREAD_URGENT_SLEEP_TIME 500 /* 500 ms */
|
2013-08-04 22:09:40 +08:00
|
|
|
#define DEF_GC_THREAD_MIN_SLEEP_TIME 30000 /* milliseconds */
|
|
|
|
#define DEF_GC_THREAD_MAX_SLEEP_TIME 60000
|
|
|
|
#define DEF_GC_THREAD_NOGC_SLEEP_TIME 300000 /* wait 5 min */
|
f2fs: support age threshold based garbage collection
There are several issues in current background GC algorithm:
- valid blocks is one of key factors during cost overhead calculation,
so if segment has less valid block, however even its age is young or
it locates hot segment, CB algorithm will still choose the segment as
victim, it's not appropriate.
- GCed data/node will go to existing logs, no matter in-there datas'
update frequency is the same or not, it may mix hot and cold data
again.
- GC alloctor mainly use LFS type segment, it will cost free segment
more quickly.
This patch introduces a new algorithm named age threshold based
garbage collection to solve above issues, there are three steps
mainly:
1. select a source victim:
- set an age threshold, and select candidates beased threshold:
e.g.
0 means youngest, 100 means oldest, if we set age threshold to 80
then select dirty segments which has age in range of [80, 100] as
candiddates;
- set candidate_ratio threshold, and select candidates based the
ratio, so that we can shrink candidates to those oldest segments;
- select target segment with fewest valid blocks in order to
migrate blocks with minimum cost;
2. select a target victim:
- select candidates beased age threshold;
- set candidate_radius threshold, search candidates whose age is
around source victims, searching radius should less than the
radius threshold.
- select target segment with most valid blocks in order to avoid
migrating current target segment.
3. merge valid blocks from source victim into target victim with
SSR alloctor.
Test steps:
- create 160 dirty segments:
* half of them have 128 valid blocks per segment
* left of them have 384 valid blocks per segment
- run background GC
Benefit: GC count and block movement count both decrease obviously:
- Before:
- Valid: 86
- Dirty: 1
- Prefree: 11
- Free: 6001 (6001)
GC calls: 162 (BG: 220)
- data segments : 160 (160)
- node segments : 2 (2)
Try to move 41454 blocks (BG: 41454)
- data blocks : 40960 (40960)
- node blocks : 494 (494)
IPU: 0 blocks
SSR: 0 blocks in 0 segments
LFS: 41364 blocks in 81 segments
- After:
- Valid: 87
- Dirty: 0
- Prefree: 4
- Free: 6008 (6008)
GC calls: 75 (BG: 76)
- data segments : 74 (74)
- node segments : 1 (1)
Try to move 12813 blocks (BG: 12813)
- data blocks : 12544 (12544)
- node blocks : 269 (269)
IPU: 0 blocks
SSR: 12032 blocks in 77 segments
LFS: 855 blocks in 2 segments
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix a bug along with pinfile in-mem segment & clean up]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-08-04 21:14:49 +08:00
|
|
|
|
|
|
|
/* choose candidates from sections which has age of more than 7 days */
|
|
|
|
#define DEF_GC_THREAD_AGE_THRESHOLD (60 * 60 * 24 * 7)
|
|
|
|
#define DEF_GC_THREAD_CANDIDATE_RATIO 20 /* select 20% oldest sections as candidates */
|
|
|
|
#define DEF_GC_THREAD_MAX_CANDIDATE_COUNT 10 /* select at most 10 sections as candidates */
|
|
|
|
#define DEF_GC_THREAD_AGE_WEIGHT 60 /* age weight */
|
|
|
|
#define DEFAULT_ACCURACY_CLASS 10000 /* accuracy class */
|
|
|
|
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
#define LIMIT_INVALID_BLOCK 40 /* percentage over total user space */
|
|
|
|
#define LIMIT_FREE_BLOCK 40 /* percentage over invalid + free space */
|
|
|
|
|
2017-12-08 08:25:39 +08:00
|
|
|
#define DEF_GC_FAILED_PINNED_FILES 2048
|
|
|
|
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
/* Search max. number of dirty segments to select a victim segment */
|
2014-01-08 12:45:08 +08:00
|
|
|
#define DEF_MAX_VICTIM_SEARCH 4096 /* covers 8GB */
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
|
|
|
|
struct f2fs_gc_kthread {
|
|
|
|
struct task_struct *f2fs_gc_task;
|
|
|
|
wait_queue_head_t gc_wait_queue_head;
|
2013-08-04 22:09:40 +08:00
|
|
|
|
|
|
|
/* for gc sleep time */
|
2017-08-07 13:09:00 +08:00
|
|
|
unsigned int urgent_sleep_time;
|
2013-08-04 22:09:40 +08:00
|
|
|
unsigned int min_sleep_time;
|
|
|
|
unsigned int max_sleep_time;
|
|
|
|
unsigned int no_gc_sleep_time;
|
2013-08-04 22:10:15 +08:00
|
|
|
|
|
|
|
/* for changing gc mode */
|
2017-08-07 13:09:00 +08:00
|
|
|
unsigned int gc_wake;
|
2021-03-27 17:57:06 +08:00
|
|
|
|
|
|
|
/* for GC_MERGE mount option */
|
|
|
|
wait_queue_head_t fggc_wq; /*
|
|
|
|
* caller of f2fs_balance_fs()
|
|
|
|
* will wait on this wait queue.
|
|
|
|
*/
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
};
|
|
|
|
|
2014-11-28 23:49:40 +08:00
|
|
|
struct gc_inode_list {
|
|
|
|
struct list_head ilist;
|
|
|
|
struct radix_tree_root iroot;
|
|
|
|
};
|
|
|
|
|
f2fs: support age threshold based garbage collection
There are several issues in current background GC algorithm:
- valid blocks is one of key factors during cost overhead calculation,
so if segment has less valid block, however even its age is young or
it locates hot segment, CB algorithm will still choose the segment as
victim, it's not appropriate.
- GCed data/node will go to existing logs, no matter in-there datas'
update frequency is the same or not, it may mix hot and cold data
again.
- GC alloctor mainly use LFS type segment, it will cost free segment
more quickly.
This patch introduces a new algorithm named age threshold based
garbage collection to solve above issues, there are three steps
mainly:
1. select a source victim:
- set an age threshold, and select candidates beased threshold:
e.g.
0 means youngest, 100 means oldest, if we set age threshold to 80
then select dirty segments which has age in range of [80, 100] as
candiddates;
- set candidate_ratio threshold, and select candidates based the
ratio, so that we can shrink candidates to those oldest segments;
- select target segment with fewest valid blocks in order to
migrate blocks with minimum cost;
2. select a target victim:
- select candidates beased age threshold;
- set candidate_radius threshold, search candidates whose age is
around source victims, searching radius should less than the
radius threshold.
- select target segment with most valid blocks in order to avoid
migrating current target segment.
3. merge valid blocks from source victim into target victim with
SSR alloctor.
Test steps:
- create 160 dirty segments:
* half of them have 128 valid blocks per segment
* left of them have 384 valid blocks per segment
- run background GC
Benefit: GC count and block movement count both decrease obviously:
- Before:
- Valid: 86
- Dirty: 1
- Prefree: 11
- Free: 6001 (6001)
GC calls: 162 (BG: 220)
- data segments : 160 (160)
- node segments : 2 (2)
Try to move 41454 blocks (BG: 41454)
- data blocks : 40960 (40960)
- node blocks : 494 (494)
IPU: 0 blocks
SSR: 0 blocks in 0 segments
LFS: 41364 blocks in 81 segments
- After:
- Valid: 87
- Dirty: 0
- Prefree: 4
- Free: 6008 (6008)
GC calls: 75 (BG: 76)
- data segments : 74 (74)
- node segments : 1 (1)
Try to move 12813 blocks (BG: 12813)
- data blocks : 12544 (12544)
- node blocks : 269 (269)
IPU: 0 blocks
SSR: 12032 blocks in 77 segments
LFS: 855 blocks in 2 segments
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix a bug along with pinfile in-mem segment & clean up]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-08-04 21:14:49 +08:00
|
|
|
struct victim_info {
|
|
|
|
unsigned long long mtime; /* mtime of section */
|
|
|
|
unsigned int segno; /* section No. */
|
|
|
|
};
|
|
|
|
|
|
|
|
struct victim_entry {
|
|
|
|
struct rb_node rb_node; /* rb node located in rb-tree */
|
|
|
|
union {
|
|
|
|
struct {
|
|
|
|
unsigned long long mtime; /* mtime of section */
|
|
|
|
unsigned int segno; /* segment No. */
|
|
|
|
};
|
|
|
|
struct victim_info vi; /* victim info */
|
|
|
|
};
|
|
|
|
struct list_head list;
|
|
|
|
};
|
|
|
|
|
2012-11-29 12:28:09 +08:00
|
|
|
/*
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
* inline functions
|
|
|
|
*/
|
f2fs: support zone capacity less than zone size
NVMe Zoned Namespace devices can have zone-capacity less than zone-size.
Zone-capacity indicates the maximum number of sectors that are usable in
a zone beginning from the first sector of the zone. This makes the sectors
sectors after the zone-capacity till zone-size to be unusable.
This patch set tracks zone-size and zone-capacity in zoned devices and
calculate the usable blocks per segment and usable segments per section.
If zone-capacity is less than zone-size mark only those segments which
start before zone-capacity as free segments. All segments at and beyond
zone-capacity are treated as permanently used segments. In cases where
zone-capacity does not align with segment size the last segment will start
before zone-capacity and end beyond the zone-capacity of the zone. For
such spanning segments only sectors within the zone-capacity are used.
During writes and GC manage the usable segments in a section and usable
blocks per segment. Segments which are beyond zone-capacity are never
allocated, and do not need to be garbage collected, only the segments
which are before zone-capacity needs to garbage collected.
For spanning segments based on the number of usable blocks in that
segment, write to blocks only up to zone-capacity.
Zone-capacity is device specific and cannot be configured by the user.
Since NVMe ZNS device zones are sequentially write only, a block device
with conventional zones or any normal block device is needed along with
the ZNS device for the metadata operations of F2fs.
A typical nvme-cli output of a zoned device shows zone start and capacity
and write pointer as below:
SLBA: 0x0 WP: 0x0 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
SLBA: 0x20000 WP: 0x20000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
SLBA: 0x40000 WP: 0x40000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
Here zone size is 64MB, capacity is 49MB, WP is at zone start as the zones
are in EMPTY state. For each zone, only zone start + 49MB is usable area,
any lba/sector after 49MB cannot be read or written to, the drive will fail
any attempts to read/write. So, the second zone starts at 64MB and is
usable till 113MB (64 + 49) and the range between 113 and 128MB is
again unusable. The next zone starts at 128MB, and so on.
Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-16 20:56:56 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* On a Zoned device zone-capacity can be less than zone-size and if
|
|
|
|
* zone-capacity is not aligned to f2fs segment size(2MB), then the segment
|
|
|
|
* starting just before zone-capacity has some blocks spanning across the
|
|
|
|
* zone-capacity, these blocks are not usable.
|
|
|
|
* Such spanning segments can be in free list so calculate the sum of usable
|
|
|
|
* blocks in currently free segments including normal and spanning segments.
|
|
|
|
*/
|
|
|
|
static inline block_t free_segs_blk_count_zoned(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
block_t free_seg_blks = 0;
|
|
|
|
struct free_segmap_info *free_i = FREE_I(sbi);
|
|
|
|
int j;
|
|
|
|
|
|
|
|
spin_lock(&free_i->segmap_lock);
|
|
|
|
for (j = 0; j < MAIN_SEGS(sbi); j++)
|
|
|
|
if (!test_bit(j, free_i->free_segmap))
|
|
|
|
free_seg_blks += f2fs_usable_blks_in_seg(sbi, j);
|
|
|
|
spin_unlock(&free_i->segmap_lock);
|
|
|
|
|
|
|
|
return free_seg_blks;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline block_t free_segs_blk_count(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
if (f2fs_sb_has_blkzoned(sbi))
|
|
|
|
return free_segs_blk_count_zoned(sbi);
|
|
|
|
|
|
|
|
return free_segments(sbi) << sbi->log_blocks_per_seg;
|
|
|
|
}
|
|
|
|
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
static inline block_t free_user_blocks(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
f2fs: support zone capacity less than zone size
NVMe Zoned Namespace devices can have zone-capacity less than zone-size.
Zone-capacity indicates the maximum number of sectors that are usable in
a zone beginning from the first sector of the zone. This makes the sectors
sectors after the zone-capacity till zone-size to be unusable.
This patch set tracks zone-size and zone-capacity in zoned devices and
calculate the usable blocks per segment and usable segments per section.
If zone-capacity is less than zone-size mark only those segments which
start before zone-capacity as free segments. All segments at and beyond
zone-capacity are treated as permanently used segments. In cases where
zone-capacity does not align with segment size the last segment will start
before zone-capacity and end beyond the zone-capacity of the zone. For
such spanning segments only sectors within the zone-capacity are used.
During writes and GC manage the usable segments in a section and usable
blocks per segment. Segments which are beyond zone-capacity are never
allocated, and do not need to be garbage collected, only the segments
which are before zone-capacity needs to garbage collected.
For spanning segments based on the number of usable blocks in that
segment, write to blocks only up to zone-capacity.
Zone-capacity is device specific and cannot be configured by the user.
Since NVMe ZNS device zones are sequentially write only, a block device
with conventional zones or any normal block device is needed along with
the ZNS device for the metadata operations of F2fs.
A typical nvme-cli output of a zoned device shows zone start and capacity
and write pointer as below:
SLBA: 0x0 WP: 0x0 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
SLBA: 0x20000 WP: 0x20000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
SLBA: 0x40000 WP: 0x40000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
Here zone size is 64MB, capacity is 49MB, WP is at zone start as the zones
are in EMPTY state. For each zone, only zone start + 49MB is usable area,
any lba/sector after 49MB cannot be read or written to, the drive will fail
any attempts to read/write. So, the second zone starts at 64MB and is
usable till 113MB (64 + 49) and the range between 113 and 128MB is
again unusable. The next zone starts at 128MB, and so on.
Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-16 20:56:56 +08:00
|
|
|
block_t free_blks, ovp_blks;
|
|
|
|
|
|
|
|
free_blks = free_segs_blk_count(sbi);
|
|
|
|
ovp_blks = overprovision_segments(sbi) << sbi->log_blocks_per_seg;
|
|
|
|
|
|
|
|
if (free_blks < ovp_blks)
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
return 0;
|
f2fs: support zone capacity less than zone size
NVMe Zoned Namespace devices can have zone-capacity less than zone-size.
Zone-capacity indicates the maximum number of sectors that are usable in
a zone beginning from the first sector of the zone. This makes the sectors
sectors after the zone-capacity till zone-size to be unusable.
This patch set tracks zone-size and zone-capacity in zoned devices and
calculate the usable blocks per segment and usable segments per section.
If zone-capacity is less than zone-size mark only those segments which
start before zone-capacity as free segments. All segments at and beyond
zone-capacity are treated as permanently used segments. In cases where
zone-capacity does not align with segment size the last segment will start
before zone-capacity and end beyond the zone-capacity of the zone. For
such spanning segments only sectors within the zone-capacity are used.
During writes and GC manage the usable segments in a section and usable
blocks per segment. Segments which are beyond zone-capacity are never
allocated, and do not need to be garbage collected, only the segments
which are before zone-capacity needs to garbage collected.
For spanning segments based on the number of usable blocks in that
segment, write to blocks only up to zone-capacity.
Zone-capacity is device specific and cannot be configured by the user.
Since NVMe ZNS device zones are sequentially write only, a block device
with conventional zones or any normal block device is needed along with
the ZNS device for the metadata operations of F2fs.
A typical nvme-cli output of a zoned device shows zone start and capacity
and write pointer as below:
SLBA: 0x0 WP: 0x0 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
SLBA: 0x20000 WP: 0x20000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
SLBA: 0x40000 WP: 0x40000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
Here zone size is 64MB, capacity is 49MB, WP is at zone start as the zones
are in EMPTY state. For each zone, only zone start + 49MB is usable area,
any lba/sector after 49MB cannot be read or written to, the drive will fail
any attempts to read/write. So, the second zone starts at 64MB and is
usable till 113MB (64 + 49) and the range between 113 and 128MB is
again unusable. The next zone starts at 128MB, and so on.
Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-16 20:56:56 +08:00
|
|
|
|
|
|
|
return free_blks - ovp_blks;
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline block_t limit_invalid_user_blocks(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
return (long)(sbi->user_block_count * LIMIT_INVALID_BLOCK) / 100;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline block_t limit_free_user_blocks(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
block_t reclaimable_user_blocks = sbi->user_block_count -
|
|
|
|
written_block_count(sbi);
|
|
|
|
return (long)(reclaimable_user_blocks * LIMIT_FREE_BLOCK) / 100;
|
|
|
|
}
|
|
|
|
|
2015-01-26 20:24:21 +08:00
|
|
|
static inline void increase_sleep_time(struct f2fs_gc_kthread *gc_th,
|
2017-08-07 23:12:46 +08:00
|
|
|
unsigned int *wait)
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
{
|
2017-08-07 23:12:46 +08:00
|
|
|
unsigned int min_time = gc_th->min_sleep_time;
|
|
|
|
unsigned int max_time = gc_th->max_sleep_time;
|
|
|
|
|
2015-01-26 20:24:21 +08:00
|
|
|
if (*wait == gc_th->no_gc_sleep_time)
|
|
|
|
return;
|
2013-04-24 12:00:14 +08:00
|
|
|
|
2017-08-07 23:12:46 +08:00
|
|
|
if ((long long)*wait + (long long)min_time > (long long)max_time)
|
|
|
|
*wait = max_time;
|
|
|
|
else
|
|
|
|
*wait += min_time;
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
}
|
|
|
|
|
2015-01-26 20:24:21 +08:00
|
|
|
static inline void decrease_sleep_time(struct f2fs_gc_kthread *gc_th,
|
2017-08-07 23:12:46 +08:00
|
|
|
unsigned int *wait)
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
{
|
2017-08-07 23:12:46 +08:00
|
|
|
unsigned int min_time = gc_th->min_sleep_time;
|
|
|
|
|
2015-01-26 20:24:21 +08:00
|
|
|
if (*wait == gc_th->no_gc_sleep_time)
|
|
|
|
*wait = gc_th->max_sleep_time;
|
2013-04-24 12:00:14 +08:00
|
|
|
|
2017-08-07 23:12:46 +08:00
|
|
|
if ((long long)*wait - (long long)min_time < (long long)min_time)
|
|
|
|
*wait = min_time;
|
|
|
|
else
|
|
|
|
*wait -= min_time;
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline bool has_enough_invalid_blocks(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
block_t invalid_user_blocks = sbi->user_block_count -
|
|
|
|
written_block_count(sbi);
|
|
|
|
/*
|
2014-08-06 22:22:50 +08:00
|
|
|
* Background GC is triggered with the following conditions.
|
f2fs: add garbage collection functions
This adds on-demand and background cleaning functions.
- The basic background cleaning policy is trying to do cleaning jobs as much as
possible whenever the system is idle. Once the background cleaning is done,
the cleaner sleeps an amount of time not to interfere with VFS calls. The time
is dynamically adjusted according to the status of whole segments, which is
decreased when the following conditions are satisfied.
. GC is not conducted currently, and
. IO subsystem is idle by checking the number of requets in bdev's request
list, and
. There are enough dirty segments.
Otherwise, the time is increased incrementally until to the maximum time.
Note that, min and max times are 10 secs and 30 secs by default.
- F2FS adopts a default victim selection policy where background cleaning uses
a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm.
- The method of moving data during the cleaning is slightly different between
background and on-demand cleaning schemes. In the case of background cleaning,
F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data
will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should
move the data right away.
- In order to identify valid blocks in a victim segment, F2FS scans the bitmap
of the segment managed as an SIT entry.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 16:13:01 +08:00
|
|
|
* 1. There are a number of invalid blocks.
|
|
|
|
* 2. There is not enough free space.
|
|
|
|
*/
|
|
|
|
if (invalid_user_blocks > limit_invalid_user_blocks(sbi) &&
|
|
|
|
free_user_blocks(sbi) < limit_free_user_blocks(sbi))
|
|
|
|
return true;
|
|
|
|
return false;
|
|
|
|
}
|