[DOCS] Documents that ELSER is the default service for `semantic_text` (#115769)
This commit is contained in: parent e319875d7e, commit 339e431081
@ -13,25 +13,45 @@ Long passages are <<auto-text-chunking, automatically chunked>> to smaller sections

The `semantic_text` field type specifies an inference endpoint identifier that will be used to generate embeddings.
You can create the inference endpoint by using the <<put-inference-api>>.
This field type and the <<query-dsl-semantic-query,`semantic` query>> type make it simpler to perform semantic search on your data.
If you don't specify an inference endpoint, the <<infer-service-elser,ELSER service>> is used by default.

Using `semantic_text`, you won't need to specify how to generate embeddings for your data, or how to index it.
The {infer} endpoint automatically determines the embedding generation, indexing, and query to use.

If you use the ELSER service, you can set up `semantic_text` with the following API request:

[source,console]
------------------------------------------------------------
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text"
      }
    }
  }
}
------------------------------------------------------------
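With this mapping in place, you can index text and search it semantically without any further embedding setup. The following is a minimal sketch, not part of the commit itself, of ingesting into `inference_field` and searching it with the <<query-dsl-semantic-query,`semantic` query>>; the document text and query string are illustrative:

[source,console]
------------------------------------------------------------
PUT my-index-000001/_doc/1
{
  "inference_field": "Golden retrievers are energetic family dogs."
}

GET my-index-000001/_search
{
  "query": {
    "semantic": {
      "field": "inference_field",
      "query": "good pets for children"
    }
  }
}
------------------------------------------------------------
// TEST[skip:Requires inference endpoint]

The index request triggers embedding generation through the field's {infer} endpoint, and the `semantic` query reuses the same endpoint at search time, so no query-side embedding configuration is needed.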

If you use a service other than ELSER, you must create an {infer} endpoint using the <<put-inference-api>> and reference it when setting up `semantic_text` as the following example demonstrates:

[source,console]
------------------------------------------------------------
PUT my-index-000002
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-openai-endpoint" <1>
      }
    }
  }
}
------------------------------------------------------------
// TEST[skip:Requires inference endpoint]
<1> The `inference_id` of the {infer} endpoint to use to generate embeddings.
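As a hedged sketch of what creating such an endpoint might look like: the endpoint name below mirrors the mapping example above, and the `model_id` shown is an illustrative assumption, not a requirement:

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/my-openai-endpoint
{
  "service": "openai",
  "service_settings": {
    "api_key": "<api_key>",
    "model_id": "text-embedding-3-small"
  }
}
------------------------------------------------------------
// TEST[skip:Requires OpenAI credentials]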

The recommended way to use `semantic_text` is by having dedicated {infer} endpoints for ingestion and search.

@ -40,7 +60,7 @@ After creating dedicated {infer} endpoints for both, you can reference them using

[source,console]
------------------------------------------------------------
PUT my-index-000003
{
  "mappings": {
    "properties": {
@ -21,45 +21,9 @@ This tutorial uses the <<inference-example-elser,`elser` service>> for demonstration

[[semantic-text-requirements]]
==== Requirements

This tutorial uses the <<infer-service-elser,ELSER service>> for demonstration, which is created automatically as needed.
To use the `semantic_text` field type with an {infer} service other than ELSER, you must create an inference endpoint using the <<put-inference-api>>.

[discrete]
[[semantic-text-index-mapping]]
|
@ -75,8 +39,7 @@ PUT semantic-embeddings
|
|||
"mappings": {
|
||||
"properties": {
|
||||
"content": { <1>
|
||||
"type": "semantic_text", <2>
|
||||
"inference_id": "my-elser-endpoint" <3>
|
||||
"type": "semantic_text" <2>
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -85,18 +48,14 @@ PUT semantic-embeddings

// TEST[skip:TBD]
<1> The name of the field to contain the generated embeddings.
<2> The field to contain the embeddings is a `semantic_text` field.
Since no `inference_id` is provided, the <<infer-service-elser,ELSER service>> is used by default.
To use a different {infer} service, you must create an {infer} endpoint first using the <<put-inference-api>> and then specify it in the `semantic_text` field mapping using the `inference_id` parameter.

[NOTE]
====
If you're using web crawlers or connectors to generate indices, you have to <<indices-put-mapping,update the index mappings>> for these indices to include the `semantic_text` field.
Once the mapping is updated, you'll need to run a full web crawl or a full connector sync.
This ensures that all existing documents are reprocessed and updated with the new semantic embeddings, enabling semantic search on the updated data.
====
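Such a mapping update might look like the following sketch, where the index and field names are hypothetical and chosen only for illustration:

[source,console]
------------------------------------------------------------
PUT my-connector-index/_mapping
{
  "properties": {
    "content": {
      "type": "semantic_text"
    }
  }
}
------------------------------------------------------------
// TEST[skip:Requires an existing index]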
@ -288,4 +247,4 @@ query from the `semantic-embedding` index:

* If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
* For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
* To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.