From 489a38895e9f00c47eaf4bf46f1c0fe83227356c Mon Sep 17 00:00:00 2001
From: Kathleen DeRusso
Date: Fri, 11 Apr 2025 08:55:47 -0400
Subject: [PATCH] Update chunking_settings docs for semantic_text (#126634)

* Update chunking_settings docs for semantic_text

* Remove redundancy
---
 .../mapping-reference/semantic-text.md        | 28 ++++++++++++++++---
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/docs/reference/elasticsearch/mapping-reference/semantic-text.md b/docs/reference/elasticsearch/mapping-reference/semantic-text.md
index 22a19c84e3ce..f8e11c6269c9 100644
--- a/docs/reference/elasticsearch/mapping-reference/semantic-text.md
+++ b/docs/reference/elasticsearch/mapping-reference/semantic-text.md
@@ -109,10 +109,30 @@ to create the endpoint. If not specified, the {{infer}} endpoint defined by
 `inference_id` will be used at both index and query time.
 
 `chunking_settings`
-: (Optional, object) Sets chunking settings that will override the settings
-configured by the `inference_id` endpoint.
-See [chunking settings attributes](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-put)
-in the {{infer}} API documentation for a complete list of available options.
+: (Optional, object) Settings for chunking text into smaller passages.
+If specified, these will override the chunking settings set in the {{infer-cap}}
+endpoint associated with `inference_id`.
+If chunking settings are updated, they will not be applied to existing documents
+until they are reindexed.
+
+::::{dropdown} Valid values for `chunking_settings`
+`type`
+: Indicates the type of chunking strategy to use. Valid values are `word` or
+`sentence`. Required.
+
+`max_chunk_size`
+: The maximum number of words in a chunk. Required.
+
+`overlap`
+: The number of overlapping words allowed in chunks. This cannot be defined as
+more than half of the `max_chunk_size`. Required for `word` type chunking
+settings.
+
+`sentence_overlap`
+: The number of overlapping sentences allowed in chunks. Valid values are `0`
+or `1`. Required for `sentence` type chunking settings.
+
+::::
 
 ## {{infer-cap}} endpoint validation [infer-endpoint-validation]
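
Reviewer note: the rules this patch documents for `chunking_settings` can be sketched as a small client-side check. This is a minimal illustration under stated assumptions, not Elasticsearch's own validation: `validate_chunking_settings` is a hypothetical helper, the field name `content` and endpoint id `my-endpoint` are placeholders, and any additional bounds the server may enforce (for example a minimum `max_chunk_size`) are not modeled.

```python
def validate_chunking_settings(settings: dict) -> list[str]:
    """Check a chunking_settings object against the rules documented in the patch.

    Hypothetical helper for illustration only. Returns a list of problems;
    an empty list means the documented rules are satisfied.
    """
    errors = []

    # `type` is required and must be `word` or `sentence`.
    strategy = settings.get("type")
    if strategy not in ("word", "sentence"):
        errors.append("type is required and must be 'word' or 'sentence'")

    # `max_chunk_size` is required. Treating it as an integer is an assumption.
    max_chunk_size = settings.get("max_chunk_size")
    if type(max_chunk_size) is not int:
        errors.append("max_chunk_size is required and must be an integer")
        return errors  # the overlap rule below depends on max_chunk_size

    if strategy == "word":
        # `overlap` is required for word chunking and cannot be more than
        # half of max_chunk_size.
        overlap = settings.get("overlap")
        if type(overlap) is not int:
            errors.append("overlap is required for 'word' chunking")
        elif overlap * 2 > max_chunk_size:
            errors.append("overlap cannot be more than half of max_chunk_size")
    elif strategy == "sentence":
        # `sentence_overlap` is required for sentence chunking; only 0 or 1 is valid.
        sentence_overlap = settings.get("sentence_overlap")
        if type(sentence_overlap) is not int or sentence_overlap not in (0, 1):
            errors.append("sentence_overlap is required for 'sentence' chunking and must be 0 or 1")

    return errors


# A semantic_text mapping body using sentence chunking.
# The field name and inference_id are placeholders, not values from the patch.
mapping = {
    "mappings": {
        "properties": {
            "content": {
                "type": "semantic_text",
                "inference_id": "my-endpoint",
                "chunking_settings": {
                    "type": "sentence",
                    "max_chunk_size": 250,
                    "sentence_overlap": 1,
                },
            }
        }
    }
}

field = mapping["mappings"]["properties"]["content"]
problems = validate_chunking_settings(field["chunking_settings"])
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once. As the patch notes, changed settings only take effect for existing documents after they are reindexed, so a check like this is most useful before the mapping is created.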