Add docs for reindex data stream REST endpoints (#120653)

Add documentation for new REST endpoints related to data stream upgrade. 
Endpoints:
- /_migration/reindex
- /_migration/reindex/{index}/_status
- /_migration/reindex/{index}/_cancel
- /_create_from/{source}/{dest}
Parker Timmins 2025-01-28 19:44:56 -06:00 committed by GitHub
parent 3df200384f
commit 635a4c21de
9 changed files with 733 additions and 4 deletions

@ -0,0 +1,142 @@
[[indices-create-index-from-source]]
=== Create index from source API
++++
<titleabbrev>Create index from source</titleabbrev>
++++
.New API reference
[sidebar]
--
For the most up-to-date API details, refer to {api-es}/group/endpoint-indices[Index APIs].
--
[[indices-create-index-from-source-api-request]]
==== {api-request-title}
`PUT /_create_from/<source>/<dest>`
`POST /_create_from/<source>/<dest>`
[[indices-create-index-from-source-api-prereqs]]
==== {api-prereq-title}
* If the {es} {security-features} are enabled, you must have the `manage`
<<privileges-list-indices,index privilege>> for the index.
[[indices-create-index-from-source-api-desc]]
==== {api-description-title}
This API lets you add a new index to an {es} cluster, using an existing source index as a basis for the new index.
The settings and mappings from the source index are copied over to the destination index. You can also provide
override settings and mappings, which are combined with the source settings and mappings when creating the
destination index.
[[indices-create-index-from-source-api-path-params]]
==== {api-path-parms-title}
`<source>`::
(Required, string) Name of the existing source index which will be used as a basis.
`<dest>`::
(Required, string) Name of the destination index which will be created.
[role="child_attributes"]
[[indices-create-index-from-source-api-request-body]]
==== {api-request-body-title}
`settings_override`::
(Optional, <<index-modules-settings,index setting object>>) Settings which override the source settings.
`mappings_override`::
(Optional, <<mapping,mapping object>>) Mappings which override the source mappings.
`remove_index_blocks`::
(Optional, boolean) Filter out any index blocks from the source index when creating the destination index.
Defaults to `true`.
[[indices-create-index-from-source-api-example]]
==== {api-examples-title}
Start by creating a source index that we'll copy using this API.
[source,console]
--------------------------------------------------
PUT /my-index
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "blocks.write": true
    }
  },
  "mappings": {
    "properties": {
      "field1": { "type": "text" }
    }
  }
}
--------------------------------------------------
// TESTSETUP
Now we create a destination index from the source index. This new index will have the same mappings and settings
as the source index.
[source,console]
--------------------------------------------------
POST _create_from/my-index/my-new-index
--------------------------------------------------
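To confirm what was copied, one option is to fetch the destination index's settings and mappings with the standard get settings and get mapping APIs. This is only an illustrative check and is not part of the create index from source API itself. Index blocks from the source may be absent from the result, for the reason discussed later in this section.

[source,console]
----
GET /my-new-index/_settings

GET /my-new-index/_mapping
----
// TEST[skip:illustrative check only]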
Alternatively, we could override some of the source's settings and mappings. This will use the source settings
and mappings as a basis and combine these with the overrides to create the destination settings and mappings.
[source,console]
--------------------------------------------------
POST _create_from/my-index/my-new-index
{
  "settings_override": {
    "index": {
      "number_of_shards": 5
    }
  },
  "mappings_override": {
    "properties": {
      "field2": { "type": "boolean" }
    }
  }
}
--------------------------------------------------
Since the destination index is empty, we very likely will want to write into the index after creation.
This would not be possible if the source index contains an <<index-block-settings,index write block>> which is copied over to the destination index.
One way to handle this is to remove the index write block using a settings override. For example, the following
settings override removes all index blocks.
[source,console]
--------------------------------------------------
POST _create_from/my-index/my-new-index
{
  "settings_override": {
    "index": {
      "blocks.write": null,
      "blocks.read": null,
      "blocks.read_only": null,
      "blocks.read_only_allow_delete": null,
      "blocks.metadata": null
    }
  }
}
--------------------------------------------------
Since this is a common scenario, index blocks are actually removed by default. This is controlled with the parameter
`remove_index_blocks`, which defaults to `true`. If we want the destination index to contain the index blocks from
the source index, we can do the following:
[source,console]
--------------------------------------------------
POST _create_from/my-index/my-new-index
{
  "remove_index_blocks": false
}
--------------------------------------------------

@ -0,0 +1,64 @@
[role="xpack"]
[[data-stream-reindex-cancel-api]]
=== Reindex data stream cancel API
++++
<titleabbrev>Reindex data stream cancel</titleabbrev>
++++
.New API reference
[sidebar]
--
For the most up-to-date API details, refer to {api-es}/group/endpoint-migration[Migration APIs].
--
include::{es-ref-dir}/migration/apis/shared-migration-apis-tip.asciidoc[]
Cancels a running data stream reindex task which was started by the <<data-stream-reindex-api, data stream reindex API>>.
Any backing indices that have already been reindexed and swapped into the data stream will remain in the data stream.
Only backing indices which are currently being reindexed, or pending backing indices which are still waiting to be reindexed, will be cancelled.
Once a data stream reindex task is cancelled, it will no longer be accessible through the
<<data-stream-reindex-status-api,status API>>. If a reindex task is not currently running,
this API will return `resource_not_found_exception`.
///////////////////////////////////////////////////////////
[source,console]
------------------------------------------------------
POST _migration/reindex
{
  "source": {
    "index": "my-data-stream"
  },
  "mode": "upgrade"
}
------------------------------------------------------
// TESTSETUP
// TEST[setup:my_data_stream]
///////////////////////////////////////////////////////////
[source,console]
----
POST _migration/reindex/my-data-stream/_cancel
----
// TEST[teardown:data_stream_cleanup]
[[data-stream-reindex-cancel-request]]
==== {api-request-title}
`POST /_migration/reindex/<data-stream>/_cancel`
[[data-stream-reindex-cancel-prereqs]]
==== {api-prereq-title}
* If the {es} {security-features} are enabled, you must have the `manage`
<<privileges-list-indices,index privilege>> for the data stream.
[[data-stream-reindex-cancel-path-params]]
==== {api-path-parms-title}
`<data-stream>`::
(Required, string)
Name of the data stream whose reindex task should be cancelled.

@ -0,0 +1,157 @@
[role="xpack"]
[[data-stream-reindex-status-api]]
=== Reindex data stream status API
++++
<titleabbrev>Reindex data stream status</titleabbrev>
++++
.New API reference
[sidebar]
--
For the most up-to-date API details, refer to {api-es}/group/endpoint-migration[Migration APIs].
--
include::{es-ref-dir}/migration/apis/shared-migration-apis-tip.asciidoc[]
Obtains the current status of a reindex task for the requested data stream. This status is
available while the reindex task is running and for 24 hours after completion of the task,
whether it succeeds or fails. If the task is cancelled, the status is no longer available.
If the task fails, the exception will be listed within the status.
///////////////////////////////////////////////////////////
[source,console]
------------------------------------------------------
POST _migration/reindex
{
  "source": {
    "index": "my-data-stream"
  },
  "mode": "upgrade"
}
------------------------------------------------------
// TESTSETUP
// TEST[setup:my_data_stream]
[source,console]
------------------------------------------------------
POST /_migration/reindex/my-data-stream/_cancel
DELETE _data_stream/my-data-stream
DELETE _index_template/my-data-stream-template
------------------------------------------------------
// TEARDOWN
///////////////////////////////////////////////////////////
[[data-stream-reindex-status-api-request]]
==== {api-request-title}
`GET /_migration/reindex/<data-stream>/_status`
[[data-stream-reindex-status-prereqs]]
==== {api-prereq-title}
* If the {es} {security-features} are enabled, you must have the `manage`
<<privileges-list-indices,index privilege>> for the data stream.
[[data-stream-reindex-status-path-params]]
==== {api-path-parms-title}
`<data-stream>`::
(Required, string)
Name of the data stream to get the status for. The reindex task for the
data stream should be currently running or have been completed in the last 24 hours.
[role="child_attributes"]
[[data-stream-reindex-status-response-body]]
==== {api-response-body-title}
`start_time`::
(Optional, <<time-units,time value>>) The time when the reindex task started.
`start_time_millis`::
(integer) The time when the reindex task started, in milliseconds since the epoch.
`complete`::
(boolean) `false` if the reindex task is still running, and `true` if the task has completed with success or failure.
`total_indices_in_data_stream`::
(integer) The total number of backing indices in the data stream, including the write index.
`total_indices_requiring_upgrade`::
(integer) The number of backing indices that need to be upgraded. These will consist of the indices which have an
older version and are not read-only.
`successes`::
(integer) The number of backing indices which have already been successfully upgraded.
`in_progress`::
(array of objects) Information on the backing indices which are currently being reindexed.
+
.Properties of objects in `in_progress`
[%collapsible%open]
=====
`index`::
(string) The name of the source backing index.
`total_doc_count`::
(integer) The number of documents in the source backing index.
`reindexed_doc_count`::
(integer) The number of documents which have already been added to the destination backing index.
=====
`pending`::
(integer) The number of backing indices which still need to be upgraded and have not yet been started.
`errors`::
(array of objects) Information on any errors which have occurred.
+
.Properties of objects in `errors`
[%collapsible%open]
=====
`index`::
(string) The name of a backing index which has had an error during reindex.
`message`::
(string) Description of the error.
=====
`exceptions`::
(Optional, string)
Exception message for a reindex failure if the failure could not be tied to a particular index.
[[data-stream-reindex-status-example]]
==== {api-examples-title}
[source,console]
----
GET _migration/reindex/my-data-stream/_status
----
The following is a typical response:
[source,console-result]
----
{
  "start_time_millis": 1737676174349,
  "complete": false,
  "total_indices_in_data_stream": 4,
  "total_indices_requiring_upgrade": 3,
  "successes": 1,
  "in_progress": [
    {
      "index": ".ds-my-data-stream-2025.01.23-000002",
      "total_doc_count": 10000000,
      "reindexed_doc_count": 1000
    }
  ],
  "pending": 1,
  "errors": []
}
----
// TEST[skip:cannot easily clean up reindex task between tests]
For a more in-depth example showing the usage of this API along with the <<data-stream-reindex-api,reindex>> and <<data-stream-reindex-cancel-api,cancel>> APIs,
see this <<reindex-data-stream-api-example,example>>.

@ -0,0 +1,358 @@
[role="xpack"]
[[data-stream-reindex-api]]
=== Reindex data stream API
++++
<titleabbrev>Reindex data stream</titleabbrev>
++++
.New API reference
[sidebar]
--
For the most up-to-date API details, refer to {api-es}/group/endpoint-migration[Migration APIs].
--
include::{es-ref-dir}/migration/apis/shared-migration-apis-tip.asciidoc[]
The reindex data stream API is used to upgrade the backing indices of a data stream to the most
recent major version. It works by reindexing each backing index into a new index, then replacing the original
backing index with its replacement and deleting the original backing index. The settings and mappings
from the original backing indices are copied to the resulting backing indices.
This API runs in the background because reindexing all indices in a large data stream
is expected to take a large amount of time and resources. The endpoint will return immediately and a persistent
task will be created to run in the background. The current status of the task can be checked with
the <<data-stream-reindex-status-api,reindex status API>>. This status will be available for 24 hours after the task completes, whether
it finished successfully or failed. If the status is still available for a task, the task must be cancelled before it can be re-run.
A running or recently completed data stream reindex task can be cancelled using the <<data-stream-reindex-cancel-api,reindex cancel API>>.
///////////////////////////////////////////////////////////
[source,console]
------------------------------------------------------
POST /_migration/reindex/my-data-stream/_cancel
DELETE _data_stream/my-data-stream
DELETE _index_template/my-data-stream-template
------------------------------------------------------
// TEARDOWN
///////////////////////////////////////////////////////////
[[data-stream-reindex-api-request]]
==== {api-request-title}
`POST /_migration/reindex`
[[data-stream-reindex-api-prereqs]]
==== {api-prereq-title}
* If the {es} {security-features} are enabled, you must have the `manage`
<<privileges-list-indices,index privilege>> for the data stream.
[[data-stream-reindex-body]]
==== {api-request-body-title}
`source`::
`index`:::
(Required, string) The name of the data stream to upgrade.
`mode`::
(Required, enum) Set to `upgrade` to upgrade the data stream in-place, using the same source and destination
data stream. Each out-of-date backing index will be reindexed. Then the new backing index is swapped into the data stream and the old index is deleted.
Currently, the only allowed value for this parameter is `upgrade`.
[[reindex-data-stream-api-settings]]
==== Settings
You can use the following settings to control the behavior of the reindex data stream API:
[[migrate_max_concurrent_indices_reindexed_per_data_stream-setting]]
// tag::migrate_max_concurrent_indices_reindexed_per_data_stream-setting-tag[]
`migrate.max_concurrent_indices_reindexed_per_data_stream`
(<<dynamic-cluster-setting,Dynamic>>)
The number of backing indices within a given data stream which will be reindexed concurrently. Defaults to `1`.
// end::migrate_max_concurrent_indices_reindexed_per_data_stream-setting-tag[]
[[migrate_data_stream_reindex_max_request_per_second-setting]]
// tag::migrate_data_stream_reindex_max_request_per_second-setting-tag[]
`migrate.data_stream_reindex_max_request_per_second`
(<<dynamic-cluster-setting,Dynamic>>)
The average maximum number of documents within a given backing index to reindex per second.
Defaults to `1000`, but can be set to any decimal number greater than `0`.
To remove throttling, set to `-1`.
This setting can be used to throttle the reindex process and manage resource usage.
Consult the <<docs-reindex-throttle,reindex throttle docs>> for more information.
// end::migrate_data_stream_reindex_max_request_per_second-setting-tag[]
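For instance, a minimal sketch of removing the throttle through the <<cluster-update-settings,update cluster settings API>> might look like the following. Using `persistent` rather than `transient` is an assumption made here for illustration.

[source,console]
----
PUT /_cluster/settings
{
  "persistent": {
    "migrate.data_stream_reindex_max_request_per_second": -1
  }
}
----
// TEST[skip:illustrative sketch only]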
[[reindex-data-stream-api-example]]
==== {api-examples-title}
Assume we have a data stream `my-data-stream` with the following backing indices, all of which have index major version 7.x.
* .ds-my-data-stream-2025.01.23-000001
* .ds-my-data-stream-2025.01.23-000002
* .ds-my-data-stream-2025.01.23-000003
Let's also assume that `.ds-my-data-stream-2025.01.23-000003` is the write index.
If {es} is version 8.x and we wish to upgrade to major version 9.x, the version 7.x indices must be upgraded in preparation.
We can use this API to reindex a data stream with version 7.x backing indices and make them version 8.x backing indices.
Start by calling the API:
[[reindex-data-stream-start]]
[source,console]
----
POST _migration/reindex
{
  "source": {
    "index": "my-data-stream"
  },
  "mode": "upgrade"
}
----
// TEST[setup:my_data_stream]
Because this task runs in the background, the API returns immediately.
The task does the following.
First, the data stream is rolled over. So that no documents are lost during the reindex, we add <<index-block-settings,write blocks>>
to the existing backing indices before reindexing them. Since a data stream's write index cannot have a write block,
the data stream must be rolled over. This produces a new write index, `.ds-my-data-stream-2025.01.23-000004`, which
has an 8.x version and thus does not need to be upgraded.
Once the data stream has a write index with an 8.x version we can proceed with reindexing the old indices.
For each of the version 7.x indices, we now do the following:
* Add a write block to the source index to guarantee that no writes are lost.
* Open the source index if it is closed.
* Delete the destination index if one exists. This is done in case we are retrying after a failure, so that we start with a fresh index.
* Create the destination index using the <<indices-create-index-from-source, create from source API>>.
This copies the settings and mappings from the old backing index to the new backing index.
* Use the <<docs-reindex, reindex API>> to copy the contents of the old backing index to the new backing index.
* Close the destination index if the source index was originally closed.
* Replace the old index in the data stream with the new index, using the <<modify-data-streams-api,modify data streams API>>.
* Finally, the old backing index is deleted.
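As a rough sketch, the steps above for the first backing index correspond approximately to the following manual requests. The task performs these steps internally, so its actual implementation may differ. The sketch assumes the source index is open and that no destination index is left over from an earlier attempt, and it uses the `.migrated-` destination name shown later in this example.

[source,console]
----
PUT /.ds-my-data-stream-2025.01.23-000001/_block/write

POST /_create_from/.ds-my-data-stream-2025.01.23-000001/.migrated-ds-my-data-stream-2025.01.23-000001

POST /_reindex
{
  "source": { "index": ".ds-my-data-stream-2025.01.23-000001" },
  "dest": { "index": ".migrated-ds-my-data-stream-2025.01.23-000001" }
}

POST /_data_stream/_modify
{
  "actions": [
    {
      "remove_backing_index": {
        "data_stream": "my-data-stream",
        "index": ".ds-my-data-stream-2025.01.23-000001"
      }
    },
    {
      "add_backing_index": {
        "data_stream": "my-data-stream",
        "index": ".migrated-ds-my-data-stream-2025.01.23-000001"
      }
    }
  ]
}

DELETE /.ds-my-data-stream-2025.01.23-000001
----
// TEST[skip:illustrative sketch only]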
By default only one backing index will be processed at a time.
This can be modified using the <<migrate_max_concurrent_indices_reindexed_per_data_stream-setting,`migrate.max_concurrent_indices_reindexed_per_data_stream` setting>>.
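As a minimal sketch, the number of concurrently reindexed backing indices could be raised through the <<cluster-update-settings,update cluster settings API>>, since this is a dynamic cluster setting. The value `2` is only an illustrative assumption. A suitable value depends on how much spare capacity the cluster has.

[source,console]
----
PUT /_cluster/settings
{
  "persistent": {
    "migrate.max_concurrent_indices_reindexed_per_data_stream": 2
  }
}
----
// TEST[skip:illustrative sketch only]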
While the reindex data stream task is running, we can inspect the current status using the <<data-stream-reindex-status-api,reindex status API>>:
[source,console]
----
GET /_migration/reindex/my-data-stream/_status
----
// TEST[continued]
For the above example, the following would be a possible status:
[source,console-result]
----
{
  "start_time_millis": 1737676174349,
  "complete": false,
  "total_indices_in_data_stream": 4,
  "total_indices_requiring_upgrade": 3,
  "successes": 0,
  "in_progress": [
    {
      "index": ".ds-my-data-stream-2025.01.23-000001",
      "total_doc_count": 10000000,
      "reindexed_doc_count": 999999
    }
  ],
  "pending": 2,
  "errors": []
}
----
// TEST[skip:specific value is part of explanation]
This output means that the first backing index, `.ds-my-data-stream-2025.01.23-000001`, is currently being processed,
and none of the backing indices have yet completed. Notice that `total_indices_in_data_stream` has a value of `4`,
because after the rollover, there are 4 indices in the data stream. But the new write index has an 8.x version, and
thus doesn't need to be reindexed, so `total_indices_requiring_upgrade` is only 3.
[[reindex-data-stream-cancel-restart]]
===== Cancelling and Restarting
The <<reindex-data-stream-api-settings, reindex data stream settings>> provide a few ways to control the performance and
resource usage of a reindex task. This example shows how we can stop a running reindex task, modify the settings,
and restart the task.
Continuing with the above example, assume the reindexing task has not yet completed, and the <<data-stream-reindex-status-api,reindex status API>>
returns the following:
[source,console-result]
----
{
  "start_time_millis": 1737676174349,
  "complete": false,
  "total_indices_in_data_stream": 4,
  "total_indices_requiring_upgrade": 3,
  "successes": 1,
  "in_progress": [
    {
      "index": ".ds-my-data-stream-2025.01.23-000002",
      "total_doc_count": 10000000,
      "reindexed_doc_count": 1000
    }
  ],
  "pending": 1,
  "errors": []
}
----
// TEST[skip:specific value is part of explanation]
Let's assume the task has been running for a long time. By default, we throttle how many requests the reindex operation
can execute per second. This keeps the reindex process from consuming too many resources.
But the default value of `1000` requests per second will not suit all use cases.
The <<migrate_data_stream_reindex_max_request_per_second-setting,`migrate.data_stream_reindex_max_request_per_second` setting>>
can be used to increase or decrease the number of requests per second, or to remove the throttle entirely.
Changing this setting won't have an effect on the backing index that is currently being reindexed.
For example, changing the setting won't have an effect on `.ds-my-data-stream-2025.01.23-000002`, but would have an
effect on the next backing index.
But in the above status, `.ds-my-data-stream-2025.01.23-000002` has values of 1000 and 10M for the
`reindexed_doc_count` and `total_doc_count`, respectively. This means it has only reindexed 0.01% of the documents in the index.
It might be a good time to cancel the run and optimize some settings without losing much work.
So we call the <<data-stream-reindex-cancel-api,cancel API>>:
[source,console]
----
POST /_migration/reindex/my-data-stream/_cancel
----
// TEST[skip:task will not be present]
Now we can use the <<cluster-update-settings, update cluster settings API>> to increase the throttle:
[source,console]
--------------------------------------------------
PUT /_cluster/settings
{
  "persistent" : {
    "migrate.data_stream_reindex_max_request_per_second" : 10000
  }
}
--------------------------------------------------
// TEST[continued]
The <<reindex-data-stream-start,original reindex command>> can now be used to restart reindexing.
Because the first backing index, `.ds-my-data-stream-2025.01.23-000001`, has already been reindexed and thus is already version 8.x,
it will be skipped. The task will start by reindexing `.ds-my-data-stream-2025.01.23-000002` again from the beginning.
Later, once all the backing indices have finished, the <<data-stream-reindex-status-api,reindex status API>> will return something like the following:
[source,console-result]
----
{
  "start_time_millis": 1737676174349,
  "complete": true,
  "total_indices_in_data_stream": 4,
  "total_indices_requiring_upgrade": 2,
  "successes": 2,
  "in_progress": [],
  "pending": 0,
  "errors": []
}
----
// TEST[skip:specific value is part of explanation]
Notice that the value of `total_indices_requiring_upgrade` is `2`, unlike the previous status, which had a value of `3`.
This is because `.ds-my-data-stream-2025.01.23-000001` was upgraded before the task cancellation.
After the restart, the API sees that it does not need to be upgraded and thus does not include it in `total_indices_requiring_upgrade` or `successes`,
despite the fact that it was upgraded successfully.
The completed status will be accessible from the status API for 24 hours after completion of the task.
We can now check the data stream to verify that indices were upgraded:
[source,console]
----
GET _data_stream/my-data-stream?filter_path=data_streams.indices.index_name
----
// TEST[continued]
which returns:
[source,console-result]
----
{
  "data_streams": [
    {
      "indices": [
        {
          "index_name": ".migrated-ds-my-data-stream-2025.01.23-000003"
        },
        {
          "index_name": ".migrated-ds-my-data-stream-2025.01.23-000002"
        },
        {
          "index_name": ".migrated-ds-my-data-stream-2025.01.23-000001"
        },
        {
          "index_name": ".ds-my-data-stream-2025.01.23-000004"
        }
      ]
    }
  ]
}
----
// TEST[skip:did not actually run reindex]
Index `.ds-my-data-stream-2025.01.23-000004` is the write index and didn't need to be upgraded because it was created with version 8.x.
The other three backing indices are now prefixed with `.migrated` because they have been upgraded.
We can now check the indices and verify that they have version 8.x:
[source,console]
----
GET .migrated-ds-my-data-stream-2025.01.23-000001?human&filter_path=*.settings.index.version.created_string
----
// TEST[skip:migrated index does not exist]
which returns:
[source,console-result]
----
{
  ".migrated-ds-my-data-stream-2025.01.23-000001": {
    "settings": {
      "index": {
        "version": {
          "created_string": "8.18.0"
        }
      }
    }
  }
}
----
// TEST[skip:migrated index does not exist]
[[reindex-data-stream-handling-failure]]
===== Handling Failures
Since the reindex data stream API runs in the background, failure information can be obtained through the <<data-stream-reindex-status-api,reindex status API>>.
For example, if the backing index `.ds-my-data-stream-2025.01.23-000002` was accidentally deleted by a user, we would see a status like the following:
[source,console-result]
----
{
  "start_time_millis": 1737676174349,
  "complete": false,
  "total_indices_in_data_stream": 4,
  "total_indices_requiring_upgrade": 3,
  "successes": 1,
  "in_progress": [],
  "pending": 1,
  "errors": [
    {
      "index": ".ds-my-data-stream-2025.01.23-000002",
      "message": "index [.ds-my-data-stream-2025.01.23-000002] does not exist"
    }
  ]
}
----
// TEST[skip:result just part of explanation]
Once the issue has been fixed, the failed reindex task can be re-run. First, the failed run's status must be cleared
using the <<data-stream-reindex-cancel-api,reindex cancel API>>. Then the
<<reindex-data-stream-start,original reindex command>> can be called to pick up where it left off.
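Put together, the recovery sequence described above amounts to the following two requests, both of which appear earlier in this example. This is a sketch of the order of operations only. It assumes the underlying problem has already been resolved.

[source,console]
----
POST /_migration/reindex/my-data-stream/_cancel

POST /_migration/reindex
{
  "source": {
    "index": "my-data-stream"
  },
  "mode": "upgrade"
}
----
// TEST[skip:illustrative sketch only]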

@ -14,6 +14,14 @@ include::apis/shared-migration-apis-tip.asciidoc[]
* <<migration-api-deprecation>>
* <<feature-migration-api>>
* <<data-stream-reindex-api>>
* <<data-stream-reindex-status-api>>
* <<data-stream-reindex-cancel-api>>
* <<indices-create-index-from-source>>
include::apis/deprecation.asciidoc[]
include::apis/feature-migration.asciidoc[]
include::apis/data-stream-reindex.asciidoc[]
include::apis/data-stream-reindex-status.asciidoc[]
include::apis/data-stream-reindex-cancel.asciidoc[]
include::apis/create-index-from-source.asciidoc[]

@ -1,7 +1,7 @@
{
"indices.cancel_migrate_reindex":{
"documentation":{
"url":"https://www.elastic.co/guide/en/elasticsearch/reference/master/data-stream-reindex.html",
"url":"https://www.elastic.co/guide/en/elasticsearch/reference/master/data-stream-reindex-cancel-api.html",
"description":"This API returns the status of a migration reindex attempt for a data stream or index"
},
"stability":"experimental",

@ -1,7 +1,7 @@
{
"indices.create_from":{
"documentation":{
"url":"https://www.elastic.co/guide/en/elasticsearch/reference/master/data-stream-reindex.html",
"url":"https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-create-index-from-source.html",
"description":"This API creates a destination from a source index. It copies the mappings and settings from the source index while allowing request settings and mappings to override the source values."
},
"stability":"experimental",

@ -1,7 +1,7 @@
{
"indices.get_migrate_reindex_status":{
"documentation":{
"url":"https://www.elastic.co/guide/en/elasticsearch/reference/master/data-stream-reindex.html",
"url":"https://www.elastic.co/guide/en/elasticsearch/reference/master/data-stream-reindex-status-api.html",
"description":"This API returns the status of a migration reindex attempt for a data stream or index"
},
"stability":"experimental",

@ -1,7 +1,7 @@
{
"indices.migrate_reindex":{
"documentation":{
"url":"https://www.elastic.co/guide/en/elasticsearch/reference/master/data-stream-reindex.html",
"url":"https://www.elastic.co/guide/en/elasticsearch/reference/master/data-stream-reindex-api.html",
"description":"This API reindexes all legacy backing indices for a data stream. It does this in a persistent task. The persistent task id is returned immediately, and the reindexing work is completed in that task"
},
"stability":"experimental",