Commit Graph

86804 Commits

Author SHA1 Message Date
Bogdan Pintea 0a8091605b
ESQL: Pushdown constructs doing case-insensitive regexes (#128393)
This introduces an optimization to pushdown to Lucense those language constructs that aim at case-insensitive regular expression matching, used with `LIKE` and `RLIKE` operators, such as:
* `| WHERE TO_LOWER(field) LIKE "abc*"`
* `| WHERE TO_UPPER(field) RLIKE "ABC.*"` 
 
These are now pushed as case-insensitive `wildcard` and `regexp` respectively queries down to Lucene.

Closes #127479
2025-05-30 10:55:00 +02:00
Mridula cc461afa0a
Fix: Allow non-score sorts in pinned retriever sub-retrievers (#128323)
* Fix scoring and sort handling in pinned retriever

* Remove books.es from version control and add to .gitignore

* Remove books.es from version control and add to .gitignore

* Remove books.es entries from .gitignore

* fixed the mess

* Update x-pack/plugin/search-business-rules/src/test/java/org/elasticsearch/xpack/searchbusinessrules/retriever/PinnedRetrieverBuilderTests.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update x-pack/plugin/search-business-rules/src/main/java/org/elasticsearch/xpack/searchbusinessrules/retriever/PinnedRetrieverBuilder.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-30 10:01:01 +03:00
elasticsearchmachine e453b53ab6 Mute org.elasticsearch.test.apmintegration.TracesApmIT testApmIntegration #128651 2025-05-30 15:55:28 +10:00
Nhat Nguyen 1ab2e6ca6c
Combine small pages in Limit (#128531)
Currently, the Limit operator does not combine small pages into larger 
ones; it simply passes them along, except for chunking pages larger than
the limit. This change integrates EstimatesRowSize into Limit and
adjusts it to emit larger pages. As a result, pages up to twice the
pageSize may be emitted, which is preferable to emitting undersized
pages. This should reduce the number of transport requests and responses
between clusters or coordinator-data nodes for queries without TopN or
STATS when target shards produce small pages due to their size or highly
selective filters.
2025-05-29 16:04:49 -07:00
elasticsearchmachine 7bf2864e79 Mute org.elasticsearch.packaging.test.DockerTests test085EnvironmentVariablesAreRespectedUnderDockerExec #128115 2025-05-30 08:59:21 +10:00
Jonathan Buttner 9db18373ba
Adding configurable inference service (#127939)
* Inference changes

* Custom service fixes

* Update docs/changelog/127939.yaml

* Cleaning up from failed merge

* Fixing changelog

* [CI] Auto commit changes from spotless

* Fixing test

* Adding feature flag

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-05-30 00:12:19 +03:00
Patrick Doyle 77595cbccd
[Entitlements] Add test entitlement bootstrap and initialization classes (#128625)
* Initialization class as argument to EntitlementAgent

* visibility changes

* WIP: test entitlement bootstrap and initialization classes

* Simplify

* Moving packages to reduce visibility

* adjust visibility

* add plugins descriptor + policy parsing

* PR comments

* update visibility, uncomment TestBuildInfoParser usage

* [CI] Auto commit changes from spotless

* Factor out createPolicyManager to help merge

* TestEntitlementInitialization is not yet implemented

* Respond to PR comments

---------

Co-authored-by: Lorenzo Dematte <lorenzo.dematte@elastic.co>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-05-29 22:36:39 +03:00
elasticsearchmachine e241efabb6 Prune changelogs after 8.18.2 release 2025-05-29 17:47:46 +00:00
elasticsearchmachine 1994232847 Bump versions after 8.18.2 release 2025-05-29 17:46:21 +00:00
Leonardo Hoet 107daf3321
Implemented ChatCompletion task for Google VertexAI with Gemini Models (#128105)
* Implemented ChatCompletion task for Google VertexAI with Gemini Models

* changelog

* System Instruction bugfix

* Mapping role assistant -> model in vertex ai chat completion request for compatibility

* GoogleVertexAI chat completion using SSE events. Removed JsonArrayEventParser

* Removed buffer from GoogleVertexAiUnifiedStreamingProcessor

* Casting inference inputs with `castoTo`

* Registered GoogleVertexAiChatCompletionServiceSettings in InferenceNamedWriteablesProvider. Added InferenceSettingsTests

* Changed transport version to 8_19 for vertexai chatcompletion

* Fix to transport version. Moved ML_INFERENCE_VERTEXAI_CHATCOMPLETION_ADDED to the right location

* VertexAI Chat completion request entity jsonStringToMap using `ensureExpectedToken`

* Fixed TransportVersions. Left vertexAi chat completion 8_19 and added new one for ML_INFERENCE_VERTEXAI_CHATCOMPLETION_ADDDED

* Refactor switch statements by if-else for older java compatibility. Improved indentation via `{}`

* Removed GoogleVertexAiChatCompletionResponseEntity and refactored code around it.

* Removed redundant test `testUnifiedCompletionInfer_WithGoogleVertexAiModel`

* Returning whole body when fail to parse response from VertexAI

* Refactor use GenericRequestManager instead of GoogleVertexAiCompletionRequestManager

* Refactored to constructorArg for mandatory args in GoogleVertexAiUnifiedStreamingProcessor

* Changed transport version in GoogleVertexAiChatCompletionServiceSettings

* Bugfix in tool calling with role tool

* GoogleVertexAiModel added documentation info on rateLimitGroupingHash

* [CI] Auto commit changes from spotless

* Fix: using Locale.ROOT when calling toLowerCase

* Fix: Renamed test class to match convention & modified use of forbidden api

* Fix: Failing test in InferenceServicesIT

---------

Co-authored-by: lhoet <lhoet@google.com>
Co-authored-by: Jonathan Buttner <56361221+jonathan-buttner@users.noreply.github.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-05-29 13:35:24 -04:00
Lorenzo Dematté 554b96aec9
[Entitlements] Add missing NIO async network instrumentation (#128582)
This PR adds some additional instrumentation to ensure we capture more cases in which we use async network usage via channels and `select`
2025-05-29 19:52:10 +03:00
Chris Hegarty 7f05ab9cf6
Fix and test off-heap stats when using direct IO for accessing the raw vectors (#128615)
Fix and test off-heap stats when using direct IO for accessing the raw vectors. The direct IO reader is not using off-heap, so it returns an empty map to indicate that there is no off-heap requirements. I added some overloaded of tests with different directories to verify this.

Note: For 9.1 we're still using reflection to access the internals of non-ES readers, but DirectIO is an ES reader so we can use our internal OffHeapStats interface (rather than reflection). This is all replaced when we eventually get Lucene 10.3.
2025-05-29 17:43:07 +01:00
Nik Everett 1b151eda4b
ESQL: Compute engine support for tagged queries (#128521)
Begins adding support for running "tagged queries" to the compute
engine. Here, it's just the `LuceneSourceOperator` because that's
useful and contained.

Example time! Say you are running:
```
FROM foo
| STATS MAX(v) BY ROUND_TO(g, 0, 100, 1000, 100000)
```

It's *often* faster to run this as four queries:
* The docs that round to `0`
* The docs that round to `100`
* The docs that round to `1000`
* The docs that round to `100000`

This creates an ESQL operator that can run these queries, one after the
other and attach those tags.

Aggs uses this trick and it's *way* faster when it can push down count
queries, but it's still faster when it pushes doc loading things. This
implementation in `LuceneSourceOperator` is quite similar to the doc
loading version in _search.

I don't have performance measurements yet because I haven't plugged this
into the language. In _search we call this `filter-by-filter` and enable
it when each group averages to more than 5000 documents and when there
isn't an `_doc_count` field. It's faster in those cases not to push. I
expect we'll be pretty similar.
2025-05-29 12:41:58 -04:00
Pat Whelan bf0dc6e7f2
[ML] Write Chat Completion JSON (#128592)
Most providers write the UnifiedCompletionRequest JSON as we received
it, with some exception:
- the modelId can be null and/or overwritten from various locations
- `max_completion_tokens` repalced `max_tokens`, but some providers
  still use the deprecated field name
We will handle the variations using Params, otherwise all of the
XContent building code has moved into UnifiedCompletionRequest so it can
be reused across providers.
2025-05-29 19:39:34 +03:00
Graeme Mjehovich 57d4e15b62
Bugfix: Prevent invalid privileges in manage roles privilege (#128532)
This PR addresses the bug reported in
[#127496](https://github.com/elastic/elasticsearch/issues/127496)

**Changes:** - Added validation logic in `ConfigurableClusterPrivileges`
to ensure privileges defined for a global cluster manage role privilege
are valid  - Added unit test to `ManageRolePrivilegesTest` to ensure
invalid privilege is caught during role creation - Updated
`BulkPutRoleRestIT` to assert that an error is thrown and that the role
is not created.

Both existing and new unit/integration tests passed locally.
2025-05-30 02:14:54 +10:00
Chris Hegarty 68d5cc6973
Use all bbq index types in DirectIOIT (#128620)
This commit is a small change to use all bbq index types in DirectIOIT. While not strictly necessary, it is good add the additional coverage.
2025-05-29 16:13:12 +01:00
Lisa Cawley 3b54afd2b7
[DOCS] Edit dynamic and static setting links (#128537) 2025-05-29 08:00:11 -07:00
Andrei Dan 2375e89a5f
Make writerWithOffset fully delegate to the writer it wraps (#126937)
writerWithOffset uses a lambda to create a RangeMissingHandler however,
the RangeMissingHandler interface has a default implementation for `sharedInputStreamFactory`.

This makes `writerWithOffset` delegate to the received writer only for the `fillCacheRange`
method where the writer itself perhaps didn't have the `sharedInputStream` method invoked
(always invoking `sharedInputStream` before `fillCacheRange` is part of the contract of the
RangeMissingHandler interface)

This PR makes `writerWithOffset` delegate the `sharedInputStream` to the underlying writer.
2025-05-29 15:58:16 +01:00
Keith Massey dc2fbe19a6
Removing the data stream settings feature flag (#128594) 2025-05-29 09:50:14 -05:00
elasticsearchmachine 56e75ce8fe Mute org.elasticsearch.reservedstate.service.FileSettingsServiceIT testSymlinkUpdateTriggerReload #128619 2025-05-30 00:02:33 +10:00
Joshua Adams 1c1907fec9
Fix typo in DistributedArchitectureGuide (#128373)
Fixes two typos in the `DistributedArchitectureGuide` where `node` was
incorrectly called `NodeB`.
2025-05-29 23:30:08 +10:00
Rene Groeschke df43f8396b
Fix serialization issue with RunTask using Configuration Cache (#128596) 2025-05-29 09:52:00 +02:00
elasticsearchmachine f65949d899 Mute org.elasticsearch.packaging.test.DockerTests test124CanRestartContainerWithStackLoggingConfig #128121 2025-05-29 17:08:38 +10:00
Nik Everett 33fc85fdff
ESQL: Move some mappers into their expressions (#128342)
This moves half of the remaining centralized expression mapper logic
into the individual expressions. This is how all but 3 of the remaining
expressions work. Let's try and be fully consistent.
2025-05-28 20:50:17 -04:00
Nick Tindall 83d7b7dd70
Add REST & Transport layers section to GeneralArchitectureGuide.md (#126377)
Closes: ES-7885
2025-05-29 03:42:45 +03:00
Stanislav Malyshev 5c482957af
Re-enable and fix the EsqlRestValidationIT test (#128542)
* Re-enable and fix the EsqlRestValidationIT test
2025-05-28 14:57:29 -06:00
Pat Whelan 86cef7f88e
[ML] InferenceService support aliases (#128584)
"elser" is an alias for "elasticsearch", and "sagemaker" is an alias for
"amazon_sagemaker".

Users can continue to create and use providers by their alias.
Elasticsearch will continue to support the alias when it reads the
configuration from the internal index.
2025-05-28 23:36:21 +03:00
Nik Everett 8c89037786
ESQL: Speed up semantic_text tests (#128591)
Speeds up the semantic_text tests by using a test inference plugin. That
skips downloading the normal inference setup.

Closes #128511 Closes #128513 Closes #128571 Closes #128572 Closes
#128573 Closes #128574
2025-05-29 06:25:25 +10:00
Brian Seeders c2ad34b97f
[release-notes] Update automation to use new markdown format (#124161) 2025-05-28 14:53:02 -04:00
Ioana Tagirta f275b71766
ES|QL: Improve field resolution for FORK (#128501) 2025-05-28 20:01:29 +02:00
Patrick Doyle ba50798f62
Split PolicyChecker from PolicyManager (#128004)
* Split PolicyChecker from PolicyManager

* Restore EntitlementCheckerUtils

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-05-28 12:48:14 -04:00
Keith Massey 83a13b9cc4
Making the data stream settngs rest-api-spec consistent with the elasticsearch-specification repository (#128535) 2025-05-28 09:28:11 -05:00
Dan Rubinstein 53668f7565
Improve exception for trained model deployment scale up timeout (#128218)
* Improve exception for trained model deployment scale up timeout

* Update docs/changelog/128218.yaml

* Rename exception and update exception message

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2025-05-28 10:17:30 -04:00
kosabogi ac08e9c8fb
Adds ml-cpp release notes for 9.0 (#128567) 2025-05-28 14:44:45 +02:00
Chris Hegarty 4a831d0d72
[9.x] Add an integration test to verify DirectIO is used for BBQ rescoring (#128465)
Port DirectIOIT from lucene_snapshot to main.

relates #128370
2025-05-28 11:32:38 +01:00
Gal Lalouche 0aae7f6e0d
ESQL: Add INLINESTATS capability requirement to tests (#128556)
It looks like there are a bunch of tests that failed due to missing
INLINESTATS capability.

Resolves #128512.
2025-05-28 13:26:59 +03:00
Tanguy Leroux e5cdc581cf
Add integration test for concurrent multipart uploads on Azure (#128503)
Enhances existing integration test to account for #128449.

Relates ES-11815
2025-05-28 10:59:35 +02:00
Yang Wang 6bc1452b43
Make repositories project aware (#128285)
Pass project-id explicitly to repository factory and make it part of the
repository interface.

Relates: ES-11839
2025-05-28 17:29:39 +10:00
elasticsearchmachine 790be1ea28 Mute org.elasticsearch.xpack.esql.action.CrossClusterQueryWithPartialResultsIT testFailToStartRequestOnRemoteCluster #128545 2025-05-28 17:28:47 +10:00
Ioana Tagirta 392777d842
Return unsupported attributes in FORK output (#128508) 2025-05-28 08:39:59 +02:00
elasticsearchmachine df82c0629d Mute org.elasticsearch.xpack.ccr.index.engine.FollowingEngineTests testProcessOnceOnPrimary #128541 2025-05-28 14:08:48 +10:00
Nhat Nguyen 890c4e282f Mute EsqlRestValidationIT
Tracked at #128543
2025-05-27 20:54:47 -07:00
Nhat Nguyen 7f2e55fd80
Skip indexing points for seq_no in tsdb and logsdb (#128139)
This change skips indexing points for the seq_no field in tsdb and 
logsdb indices to reduce storage requirements and improve indexing
throughput. Although this optimization could be applied to all new
indices, it is limited to tsdb and logsdb, where seq_no usage is
expected to be limited and storage requirements are more critical.

Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
2025-05-27 15:59:44 -07:00
Keith Massey 41f186dca0
Adding prefer_ilm as a whitelisted data stream setting (#128375) 2025-05-27 15:42:08 -05:00
Stanislav Malyshev 8484b71126
ES|QL: Make skip_unavailable catch all errors (#128163)
* Make skip_unavailable catch all errors
2025-05-27 13:49:10 -06:00
Nik Everett 7532ad5e97
ESQL: Raise timeout on test suite (#128525)
It uses `semantic_text` which can be quite slow.

Closes #128513 Closes #128511
2025-05-28 04:53:12 +10:00
Pat Whelan 28307688f7
[ML] Integrate OpenAi Chat Completion in SageMaker (#127767)
SageMaker now supports Completion and Chat Completion using the OpenAI
interfaces.

Additionally:
- Fixed bug related to timeouts being nullable, default to 30s timeout
- Exposed existing OpenAi request/response parsing logic for reuse
2025-05-27 21:50:10 +03:00
Carlos Delgado 13f3864ed9
Fix KQL usage in in STATS .. BY (#128371) 2025-05-27 20:57:12 +03:00
Patrick Doyle a78f1f04b7
Refactor TestBuildInfoPluginFuncTest for clarity (#128469)
* Refactor TestBuildInfoPluginFuncTest for clarity

* Further simplify TestBuildInfoPluginFuncTest
2025-05-27 13:50:26 -04:00
Jonathan Buttner 0404c077de
Adjusting the google vertex ai batch size to match documentation (#128518) 2025-05-27 13:42:39 -04:00