[DOCS] Documents that ELSER is the default service for `semantic_text` (#115769)

István Zoltán Szabó 2024-11-25 14:07:30 +01:00 committed by GitHub
parent e319875d7e
commit 339e431081
GPG Key ID: B5690EEEBB952194
2 changed files with 31 additions and 52 deletions

@ -13,25 +13,45 @@ Long passages are <<auto-text-chunking, automatically chunked>> to smaller secti
The `semantic_text` field type specifies an inference endpoint identifier that will be used to generate embeddings.
You can create the inference endpoint by using the <<put-inference-api>>.
This field type and the <<query-dsl-semantic-query,`semantic` query>> type make it simpler to perform semantic search on your data.
If you don't specify an inference endpoint, the <<infer-service-elser,ELSER service>> is used by default.
Using `semantic_text`, you won't need to specify how to generate embeddings for your data, or how to index it.
The {infer} endpoint automatically determines the embedding generation, indexing, and query to use.
If you use the ELSER service, you can set up `semantic_text` with the following API request:
[source,console]
------------------------------------------------------------
PUT my-index-000001
{
"mappings": {
"properties": {
"inference_field": {
"type": "semantic_text"
}
}
}
}
------------------------------------------------------------
If you use a service other than ELSER, you must create an {infer} endpoint using the <<put-inference-api>> and reference it when setting up `semantic_text` as the following example demonstrates:
[source,console]
------------------------------------------------------------
PUT my-index-000002
{
"mappings": {
"properties": {
"inference_field": {
"type": "semantic_text",
"inference_id": "my-elser-endpoint"
"inference_id": "my-openai-endpoint" <1>
}
}
}
}
------------------------------------------------------------
// TEST[skip:Requires inference endpoint]
<1> The `inference_id` of the {infer} endpoint to use to generate embeddings.
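The two mappings above differ only in whether `inference_id` is present. As a minimal sketch of that difference, the following Python snippet builds both request bodies. The helper name `semantic_text_mapping` is hypothetical and not part of any Elastic client, and sending the body to the cluster is left out on purpose.

```python
import json
from typing import Optional


def semantic_text_mapping(field: str, inference_id: Optional[str] = None) -> dict:
    """Build a create-index body with one `semantic_text` field.

    Hypothetical helper for illustration only. When `inference_id` is
    omitted, the mapping relies on the default ELSER service.
    """
    field_mapping = {"type": "semantic_text"}
    if inference_id is not None:
        # Only reference an endpoint when one was created separately.
        field_mapping["inference_id"] = inference_id
    return {"mappings": {"properties": {field: field_mapping}}}


# Body for the default (ELSER) case, as in the first example.
default_body = semantic_text_mapping("inference_field")
# Body that references a separately created endpoint, as in the second example.
custom_body = semantic_text_mapping("inference_field", "my-openai-endpoint")

print(json.dumps(default_body, indent=2))
print(json.dumps(custom_body, indent=2))
```

Keeping `inference_id` optional mirrors the documented behavior: leave it out to rely on the default, set it only for an endpoint you created yourself.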
The recommended way to use `semantic_text` is to have dedicated {infer} endpoints for ingestion and search.
@ -40,7 +60,7 @@ After creating dedicated {infer} endpoints for both, you can reference them usin
[source,console]
------------------------------------------------------------
PUT my-index-000002
PUT my-index-000003
{
"mappings": {
"properties": {

@ -21,45 +21,9 @@ This tutorial uses the <<inference-example-elser,`elser` service>> for demonstra
[[semantic-text-requirements]]
==== Requirements
To use the `semantic_text` field type, you must have an {infer} endpoint deployed in
your cluster using the <<put-inference-api>>.
This tutorial uses the <<infer-service-elser,ELSER service>> for demonstration, which is created automatically as needed.
To use the `semantic_text` field type with an {infer} service other than ELSER, you must create an inference endpoint using the <<put-inference-api>>.
[discrete]
[[semantic-text-infer-endpoint]]
==== Create the {infer} endpoint
Create an inference endpoint by using the <<put-inference-api>>:
[source,console]
------------------------------------------------------------
PUT _inference/sparse_embedding/my-elser-endpoint <1>
{
"service": "elser", <2>
"service_settings": {
"adaptive_allocations": { <3>
"enabled": true,
"min_number_of_allocations": 3,
"max_number_of_allocations": 10
},
"num_threads": 1
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `sparse_embedding` in the path as the `elser` service will
be used and ELSER creates sparse vectors. The `inference_id` is
`my-elser-endpoint`.
<2> The `elser` service is used in this example.
<3> This setting enables and configures {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations].
Adaptive allocations make it possible for ELSER to automatically scale up or down resources based on the current load on the process.
[NOTE]
====
You might see a 502 bad gateway error in the response when using the {kib} Console.
This error usually just reflects a timeout, while the model downloads in the background.
You can check the download progress in the {ml-app} UI.
If using the Python client, you can set the `timeout` parameter to a higher value.
====
[discrete]
[[semantic-text-index-mapping]]
@ -75,8 +39,7 @@ PUT semantic-embeddings
"mappings": {
"properties": {
"content": { <1>
"type": "semantic_text", <2>
"inference_id": "my-elser-endpoint" <3>
"type": "semantic_text" <2>
}
}
}
@ -85,18 +48,14 @@ PUT semantic-embeddings
// TEST[skip:TBD]
<1> The name of the field to contain the generated embeddings.
<2> The field to contain the embeddings is a `semantic_text` field.
<3> The `inference_id` is the inference endpoint you created in the previous step.
It will be used to generate the embeddings based on the input text.
Every time you ingest data into the related `semantic_text` field, this endpoint will be used for creating the vector representation of the text.
Since no `inference_id` is provided, the <<infer-service-elser,ELSER service>> is used by default.
To use a different {infer} service, you must create an {infer} endpoint first using the <<put-inference-api>> and then specify it in the `semantic_text` field mapping using the `inference_id` parameter.
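The ordering matters here: the endpoint must exist before a mapping can reference it. The following Python snippet is a rough sketch of that two-step sequence. The helper `custom_endpoint_requests` is hypothetical, it only assembles the requests and does not send them, and the endpoint values are placeholders copied from the ELSER endpoint example, so substitute the task type and settings of the service you actually use.

```python
from typing import Any, Dict, List, Tuple

Request = Tuple[str, str, Dict[str, Any]]  # (HTTP method, path, JSON body)


def custom_endpoint_requests(
    index: str,
    field: str,
    task_type: str,
    inference_id: str,
    endpoint_body: Dict[str, Any],
) -> List[Request]:
    """Return the requests in the order they should be sent.

    Hypothetical helper: first create the inference endpoint, then create
    the index whose `semantic_text` field references it.
    """
    create_endpoint: Request = (
        "PUT",
        f"_inference/{task_type}/{inference_id}",
        endpoint_body,
    )
    create_index: Request = (
        "PUT",
        index,
        {
            "mappings": {
                "properties": {
                    field: {"type": "semantic_text", "inference_id": inference_id}
                }
            }
        },
    )
    return [create_endpoint, create_index]


# Placeholder values taken from the ELSER endpoint example.
requests = custom_endpoint_requests(
    index="semantic-embeddings",
    field="content",
    task_type="sparse_embedding",
    inference_id="my-elser-endpoint",
    endpoint_body={
        "service": "elser",
        "service_settings": {
            "adaptive_allocations": {
                "enabled": True,
                "min_number_of_allocations": 3,
                "max_number_of_allocations": 10,
            },
            "num_threads": 1,
        },
    },
)

for method, path, body in requests:
    print(method, path, body)
```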
[NOTE]
====
If you're using web crawlers or connectors to generate indices, you have to
<<indices-put-mapping,update the index mappings>> for these indices to
include the `semantic_text` field. Once the mapping is updated, you'll need to run
a full web crawl or a full connector sync. This ensures that all existing
documents are reprocessed and updated with the new semantic embeddings,
enabling semantic search on the updated data.
If you're using web crawlers or connectors to generate indices, you have to <<indices-put-mapping,update the index mappings>> for these indices to include the `semantic_text` field.
Once the mapping is updated, you'll need to run a full web crawl or a full connector sync.
This ensures that all existing documents are reprocessed and updated with the new semantic embeddings, enabling semantic search on the updated data.
====
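For an index that already exists, such as one generated by a crawler or connector, the field is added through the update mapping API rather than at index creation. A minimal sketch of that request in Python follows. The helper `add_semantic_text_field` and the index name `my-crawler-index` are hypothetical, and the snippet only builds the request without sending it.

```python
from typing import Any, Dict, Optional, Tuple


def add_semantic_text_field(
    index: str, field: str, inference_id: Optional[str] = None
) -> Tuple[str, str, Dict[str, Any]]:
    """Build an update-mapping request that adds a `semantic_text` field.

    Hypothetical helper for illustration. Without `inference_id`, the
    field relies on the default ELSER service.
    """
    field_mapping: Dict[str, Any] = {"type": "semantic_text"}
    if inference_id is not None:
        field_mapping["inference_id"] = inference_id
    # The update mapping API takes `properties` at the top level of the body.
    return ("PUT", f"{index}/_mapping", {"properties": {field: field_mapping}})


method, path, body = add_semantic_text_field("my-crawler-index", "content")
print(method, path, body)
```

After the mapping update is applied, run the full web crawl or connector sync described in the note so that existing documents are reprocessed.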
@ -288,4 +247,4 @@ query from the `semantic-embedding` index:
* If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
* For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
* To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.
* To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.