* Grant the `kibana_system` reserved role "all" privileges on the `.adhoc.alerts*` and `.internal.adhoc.alerts*` indices
* Update docs/changelog/127321.yaml
* [CI] Auto commit changes from spotless
* Replace `"all"` with the specific privileges for the `kibana_system` role
* Fix tests
* Fix CI
* Updated privileges
* Updated privileges
Add `"maintenance"` to allow the `refresh=true` option on bulk API calls.
* Remove redundant code
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
The auth code injects the pattern `["*", "-*"]` to specify that it's okay to return an empty response because the user's patterns did not match any remote clusters. However, we fail to recognise this specific pattern, and `groupIndices()` eventually associates it with the local cluster. This causes Elasticsearch to unknowingly fall back to the local cluster and return its details to the user, even though the user hasn't requested any info about the local cluster.
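A hedged Python sketch of the fix's intent (the names here are mine, not the actual Java): recognise the "match nothing" marker up front so it cannot be mistaken for a local-cluster index pattern.

```python
# The auth layer's marker meaning "the user's patterns matched no remote
# clusters"; it must short-circuit grouping rather than fall through.
NONE_EXPRESSION = ("*", "-*")

def group_indices(index_patterns):
    """Split index patterns into per-cluster groups. Patterns without a
    `cluster:` prefix belong to the local cluster. Recognising
    NONE_EXPRESSION first prevents the silent local-cluster fallback."""
    if tuple(index_patterns) == NONE_EXPRESSION:
        return {}  # match nothing: no clusters, no local fallback
    groups = {}
    for pattern in index_patterns:
        cluster, _, index = pattern.partition(":")
        if not index:  # no remote prefix -> local cluster
            cluster, index = "(local)", pattern
        groups.setdefault(cluster, []).append(index)
    return groups
```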
Added support for the three primary scalar grid functions:
* `ST_GEOHASH(geom, precision)`
* `ST_GEOTILE(geom, precision)`
* `ST_GEOHEX(geom, precision)`
As well as versions of these three that take an optional `geo_shape` boundary (which must be a `BBOX`, i.e. a `Rectangle`).
And also supporting conversion functions that convert the grid-id from long to string and back.
This work represents the core of the feature to support geo-grid aggregations in ES|QL.
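For context, a minimal Python sketch of the classic geohash encoding that `ST_GEOHASH` is based on. This is illustrative only: the real implementation uses Lucene's geo-grid utilities and returns a long grid-id (with separate conversion functions for the string form), whereas this sketch produces the string directly.

```python
# Standard geohash base-32 alphabet (no a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat: float, lon: float, precision: int) -> str:
    """Interleave longitude/latitude bisection bits, 5 bits per base-32 char."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits = []
    even = True  # geohash starts with a longitude bit
    while len(bits) < precision * 5:
        if even:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits.append(1)
                lon_lo = mid
            else:
                bits.append(0)
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits.append(1)
                lat_lo = mid
            else:
                bits.append(0)
                lat_hi = mid
        even = not even
    return "".join(
        BASE32[int("".join(map(str, bits[i : i + 5])), 2)]
        for i in range(0, precision * 5, 5)
    )
```

Note the prefix property: a precision-5 geohash is a prefix of the precision-11 geohash of the same point, which is what makes geohashes useful as grid-aggregation keys.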
* Implement SAML custom attributes support for Identity Provider
This commit adds support for custom attributes in SAML single sign-on requests
in the Elasticsearch X-Pack Identity Provider plugin. This feature allows
passage of custom key-value attributes in SAML requests and responses.
Key components:
- Added SamlInitiateSingleSignOnAttributes class for holding attributes
- Added validation for null and empty attribute keys
- Updated request and response objects to handle attributes
- Modified authentication flow to process attributes
- Added test coverage to validate attributes functionality
The implementation follows Elasticsearch patterns with robust validation
and serialization mechanisms, while maintaining backward compatibility.
* Add test for SAML custom attributes in authentication response
This commit adds a comprehensive test that verifies SAML custom attributes
are correctly handled in the authentication response builder. The test ensures:
1. Custom attributes with single and multiple values are properly included
2. The response with custom attributes is still correctly signed
3. The XML schema validation still passes with custom attributes
4. We can locate and verify individual attribute values in the response
This provides critical test coverage for the SAML custom attributes
feature implementation.
* Add backward compatibility overload for SuccessfulAuthenticationResponseMessageBuilder.build
This commit adds an overloaded build method that accepts only two parameters
(user and authenticationState) and forwards the call to the three-parameter
version with null for the customAttributes parameter. This maintains backward
compatibility with existing code that doesn't use custom attributes.
This fixes a compilation error in ServerlessSsoIT.java which was still using
the two-parameter method signature.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Add validation for duplicate SAML attribute keys
This commit enhances the SAML attributes implementation by adding validation
for duplicate attribute keys. When the same attribute key appears multiple
times in a request, the validation will now fail with a clear error message.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Refactor SAML attributes validation to follow standard patterns
This commit improves the SAML attributes validation by:
1. Adding a dedicated validate() method to SamlInitiateSingleSignOnAttributes
that centralizes validation logic in one place
2. Moving validation from constructor to dedicated method for better error reporting
3. Checking both for null/empty keys and duplicate keys in the validate() method
4. Updating SamlInitiateSingleSignOnRequest to use the new validation method
5. Adding comprehensive tests for the new validation approach
These changes follow standard Elasticsearch validation patterns, making the
code more maintainable and consistent with the rest of the codebase.
* Update docs/changelog/128176.yaml
* Improve SAML response validation in identity provider tests
Enhanced the testCustomAttributesInIdpInitiatedSso test to properly validate
both SAML response structure and custom attributes using DOM parsing and XPath.
Key improvements:
- Validate SAML Response/Assertion elements exist
- Precisely validate custom attributes (department, region) and their values
- Use namespace-aware XML parsing for resilience to format changes
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Simplify SAML attributes representation using JSON object/Map structure
Also, replace internal Attribute class list with a simpler Map<String, List<String>>
structure
This change:
- Removes the redundant Attribute class and replaces it with a direct Map
implementation for storing attribute key-value pairs
- Eliminates the duplicate "attributes" nesting in the JSON structure
- Simplifies attribute validation without needing duplicate key checking
- Updates all related tests and integration points to work with the new structure
Before:
```js
{
// others
"attributes": {
"attributes": [
{
"key": "department",
"values": ["engineering", "product"]
}
]
}
}
```
After:
```js
{
// other
"attributes": {
"department": ["engineering", "product"]
}
}
```
(Verified by spitting out JSON entity in IdentityProviderAuthenticationIT.generateSamlResponseWithAttributes
... saw `{"entity_id":"ec:123456:abcdefg","acs":"https://sp1.test.es.elasticsearch.org/saml/acs","attributes":{"department":["engineering","product"],"region":["APJ"]}}`)
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Fix up toString dangling quote.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Remove attributes from Response object.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Remove friendly name.
* Make attributes map final in SamlInitiateSingleSignOnAttributes
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Cleanup serdes by using existing utils in the ES codebase
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Touchup comment
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Update x-pack/plugin/identity-provider/src/test/java/org/elasticsearch/xpack/idp/action/SamlInitiateSingleSignOnRequestTests.java
Co-authored-by: Tim Vernum <tim@adjective.org>
* Add transport-version checks
---------
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
* New l2 normalizer added
* L2 score normaliser is registered
* test case added to the yaml
* Documentation added
* Resolved checkstyle issues
* Update docs/changelog/128504.yaml
* Update docs/reference/elasticsearch/rest-apis/retrievers.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Score 0 test case added to check for corner cases
* Edited the markdown doc description
* Pruned the comment
* Renamed the variable
* Added comment to the class
* Unit tests added
* Spotless and checkstyle fixed
* Fixed build failure
* Fixed the forbidden test
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
The tests sometimes pass but then fail to clean up the indices they
created indirectly, so they will now directly delete the indices they
created for the test.
Fix #128577
If an index is deleted after a snapshot has written its shardGenerations
file but before the snapshot is finalized, we exclude this index from the
snapshot because its indexMetadata is no longer available. However,
the shardGenerations file is still valid in that it is the latest copy with all
necessary information despite it containing an extra snapshot entry.
This is OK. Instead of dropping this shardGenerations file, this PR
carries it forward by updating RepositoryData and the relevant
in-progress snapshots so that the next finalization builds on top of this one.
Co-authored-by: David Turner <david.turner@elastic.co>
* Count and Count distinct functions for tsds
* issues with test
* fixup of tests
* fixup of output types
* fix naming and helper fns
* fixup
* fix other test
* ugh mistake
This change extracts the field extraction logic previously run inside
`TimeSeriesSourceOperator` into a separate operator, executed in a
separate driver. We should consider consolidating this operator with
`ValuesSourceReaderOperator`. I tried to extend
`ValuesSourceReaderOperator` to support this, but it may take some time
to complete. Our current goal is to move quickly with experimental
support for querying time-series data in ES|QL, so I am proceeding with
a separate operator. I will spend more time on combining these two
operators later.
With #128419 and this PR, the query time for the example below decreased
from 41ms to 27ms.
```
POST /_query
{
"profile": true,
"query": "TS metrics-hostmetricsreceiver.otel-default
| WHERE @timestamp >= \"2025-05-08T18:00:08.001Z\"
| STATS cpu = avg(rate(`metrics.process.cpu.time`)) BY host.name, BUCKET(@timestamp, 5 minute)"
}
```
This introduces an optimization to push down to Lucene those language constructs that aim at case-insensitive regular expression matching, used with the `LIKE` and `RLIKE` operators, such as:
* `| WHERE TO_LOWER(field) LIKE "abc*"`
* `| WHERE TO_UPPER(field) RLIKE "ABC.*"`
These are now pushed down to Lucene as case-insensitive `wildcard` and `regexp` queries, respectively.
Closes #127479
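To make the equivalence concrete, here is a hedged Python sketch (mine, not the optimizer's code) of why the rewrite is safe: matching `TO_LOWER(field) LIKE "abc*"` row by row gives the same answers as a single case-insensitive wildcard match, which is what Lucene's `wildcard` query with `case_insensitive: true` provides. `fnmatch`'s `*`/`?` are used here to approximate LIKE wildcards.

```python
import fnmatch
import re

def like_to_case_insensitive_regex(like_pattern: str):
    """Compile a LIKE-style wildcard pattern case-insensitively.
    Equivalent to lower-casing every value and matching a lower-case
    pattern, but expressible as one index-level query."""
    return re.compile(fnmatch.translate(like_pattern), re.IGNORECASE)
```

The same argument applies to `TO_UPPER(field) RLIKE "ABC.*"` and a case-insensitive `regexp` query.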
* Fix scoring and sort handling in pinned retriever
* Remove books.es from version control and add to .gitignore
* Remove books.es from version control and add to .gitignore
* Remove books.es entries from .gitignore
* fixed the mess
* Update x-pack/plugin/search-business-rules/src/test/java/org/elasticsearch/xpack/searchbusinessrules/retriever/PinnedRetrieverBuilderTests.java
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update x-pack/plugin/search-business-rules/src/main/java/org/elasticsearch/xpack/searchbusinessrules/retriever/PinnedRetrieverBuilder.java
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Currently, the Limit operator does not combine small pages into larger
ones; it simply passes them along, except for chunking pages larger than
the limit. This change integrates EstimatesRowSize into Limit and
adjusts it to emit larger pages. As a result, pages up to twice the
pageSize may be emitted, which is preferable to emitting undersized
pages. This should reduce the number of transport requests and responses
between clusters or coordinator-data nodes for queries without TopN or
STATS when target shards produce small pages due to their size or highly
selective filters.
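The buffering idea can be sketched in a few lines of Python (a sketch of the approach, not the Limit operator itself): buffer incoming pages until the target size is reached, so a combined page can reach at most just under `page_size` buffered rows plus one more incoming page, i.e. up to roughly twice `page_size`.

```python
def combine_pages(pages, page_size):
    """Greedily buffer small pages and emit combined ones. Emitting pages
    of up to ~2 * page_size rows beats sending many undersized pages
    between clusters or coordinator and data nodes."""
    out, buffer, buffered = [], [], 0
    for page in pages:
        buffer.append(page)
        buffered += len(page)
        if buffered >= page_size:  # flush once the target size is reached
            out.append([row for p in buffer for row in p])
            buffer, buffered = [], 0
    if buffer:  # flush the remainder at end of input
        out.append([row for p in buffer for row in p])
    return out
```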
* Inference changes
* Custom service fixes
* Update docs/changelog/127939.yaml
* Cleaning up from failed merge
* Fixing changelog
* [CI] Auto commit changes from spotless
* Fixing test
* Adding feature flag
* [CI] Auto commit changes from spotless
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
* Implemented ChatCompletion task for Google VertexAI with Gemini Models
* changelog
* System Instruction bugfix
* Mapping role assistant -> model in vertex ai chat completion request for compatibility
* GoogleVertexAI chat completion using SSE events. Removed JsonArrayEventParser
* Removed buffer from GoogleVertexAiUnifiedStreamingProcessor
* Casting inference inputs with `castTo`
* Registered GoogleVertexAiChatCompletionServiceSettings in InferenceNamedWriteablesProvider. Added InferenceSettingsTests
* Changed transport version to 8_19 for vertexai chatcompletion
* Fix to transport version. Moved ML_INFERENCE_VERTEXAI_CHATCOMPLETION_ADDED to the right location
* VertexAI Chat completion request entity jsonStringToMap using `ensureExpectedToken`
* Fixed TransportVersions. Left vertexAi chat completion 8_19 and added new one for ML_INFERENCE_VERTEXAI_CHATCOMPLETION_ADDDED
* Refactor switch statements by if-else for older java compatibility. Improved indentation via `{}`
* Removed GoogleVertexAiChatCompletionResponseEntity and refactored code around it.
* Removed redundant test `testUnifiedCompletionInfer_WithGoogleVertexAiModel`
* Returning whole body when fail to parse response from VertexAI
* Refactor use GenericRequestManager instead of GoogleVertexAiCompletionRequestManager
* Refactored to constructorArg for mandatory args in GoogleVertexAiUnifiedStreamingProcessor
* Changed transport version in GoogleVertexAiChatCompletionServiceSettings
* Bugfix in tool calling with role tool
* GoogleVertexAiModel added documentation info on rateLimitGroupingHash
* [CI] Auto commit changes from spotless
* Fix: using Locale.ROOT when calling toLowerCase
* Fix: Renamed test class to match convention & modified use of forbidden api
* Fix: Failing test in InferenceServicesIT
---------
Co-authored-by: lhoet <lhoet@google.com>
Co-authored-by: Jonathan Buttner <56361221+jonathan-buttner@users.noreply.github.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Begins adding support for running "tagged queries" to the compute
engine. Here, it's just the `LuceneSourceOperator` because that's
useful and contained.
Example time! Say you are running:
```
FROM foo
| STATS MAX(v) BY ROUND_TO(g, 0, 100, 1000, 100000)
```
It's *often* faster to run this as four queries:
* The docs that round to `0`
* The docs that round to `100`
* The docs that round to `1000`
* The docs that round to `100000`
This creates an ESQL operator that can run these queries, one after the
other and attach those tags.
Aggs uses this trick and it's *way* faster when it can push down count
queries, but it's still faster when it pushes doc loading things. This
implementation in `LuceneSourceOperator` is quite similar to the doc
loading version in _search.
I don't have performance measurements yet because I haven't plugged this
into the language. In _search we call this `filter-by-filter` and enable
it when each group averages more than 5000 documents and when there
isn't a `_doc_count` field; otherwise it's faster not to push. I
expect we'll be pretty similar.
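A hedged Python sketch of the trick (names and the below-smallest-point behaviour are my assumptions, not the compute-engine code): instead of computing `ROUND_TO` per row in one pass, run one filtered pass per bucket and attach the bucket value as the tag.

```python
def round_to(value, points):
    """Round down to the largest listed point <= value; values below the
    smallest point fall into the first bucket (an assumption here).
    Assumes `points` is sorted ascending."""
    best = points[0]
    for p in points:
        if p <= value:
            best = p
    return best

def run_as_tagged_queries(values, points):
    """Filter-by-filter: one range 'query' per bucket, each tagged with
    its bucket value, mimicking four tagged Lucene queries."""
    tagged = {}
    for i, p in enumerate(points):
        lo = None if i == 0 else p  # first bucket also catches values below it
        hi = points[i + 1] if i + 1 < len(points) else None
        tagged[p] = [v for v in values
                     if (lo is None or v >= lo) and (hi is None or v < hi)]
    return tagged
```

Both formulations assign every value to the same bucket; the per-bucket form is what lets Lucene answer each bucket with a cheap range query.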
Most providers write the UnifiedCompletionRequest JSON as we received
it, with some exceptions:
- the modelId can be null and/or overwritten from various locations
- `max_completion_tokens` replaced `max_tokens`, but some providers
still use the deprecated field name
We will handle the variations using Params, otherwise all of the
XContent building code has moved into UnifiedCompletionRequest so it can
be reused across providers.
This PR addresses the bug reported in
[#127496](https://github.com/elastic/elasticsearch/issues/127496)
**Changes:**
- Added validation logic in `ConfigurableClusterPrivileges` to ensure privileges defined for a global cluster manage role privilege are valid
- Added unit test to `ManageRolePrivilegesTest` to ensure an invalid privilege is caught during role creation
- Updated `BulkPutRoleRestIT` to assert that an error is thrown and that the role is not created.
Both existing and new unit/integration tests passed locally.
`writerWithOffset` uses a lambda to create a RangeMissingHandler; however,
the RangeMissingHandler interface has a default implementation for `sharedInputStreamFactory`.
This makes `writerWithOffset` delegate to the received writer only for the `fillCacheRange`
method, where the writer itself perhaps didn't have the `sharedInputStream` method invoked
(always invoking `sharedInputStream` before `fillCacheRange` is part of the contract of the
RangeMissingHandler interface).
This PR makes `writerWithOffset` delegate `sharedInputStream` to the underlying writer as well.
This moves half of the remaining centralized expression mapper logic
into the individual expressions. This is how all but 3 of the remaining
expressions work. Let's try and be fully consistent.
"elser" is an alias for "elasticsearch", and "sagemaker" is an alias for
"amazon_sagemaker".
Users can continue to create and use providers by their alias.
Elasticsearch will continue to support the alias when it reads the
configuration from the internal index.
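The alias behaviour described above can be sketched as a simple lookup (a hypothetical sketch; the real registry lives in the inference plugin's service registration code):

```python
# Aliases named in this change; canonical service names on the right.
SERVICE_ALIASES = {
    "elser": "elasticsearch",
    "sagemaker": "amazon_sagemaker",
}

def resolve_service(name: str) -> str:
    """Accept either a canonical service name or one of its aliases, both
    at request time and when reading stored configuration from the
    internal index."""
    return SERVICE_ALIASES.get(name, name)
```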
Speeds up the semantic_text tests by using a test inference plugin. That
skips downloading the normal inference setup.
Closes #128511 Closes #128513 Closes #128571 Closes #128572
Closes #128573 Closes #128574
This change skips indexing points for the seq_no field in tsdb and
logsdb indices to reduce storage requirements and improve indexing
throughput. Although this optimization could be applied to all new
indices, it is limited to tsdb and logsdb, where seq_no usage is
expected to be limited and storage requirements are more critical.
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
SageMaker now supports Completion and Chat Completion using the OpenAI
interfaces.
Additionally:
- Fixed bug related to timeouts being nullable, default to 30s timeout
- Exposed existing OpenAi request/response parsing logic for reuse
This adds a new logical optimization rule to purge a Join in case the
merge key(s) are null. The null detection is based on recognizing a tree
pattern where the join sits atop a project and/or eval (possibly a few
nodes deep) which contains a reference to a `null`, reference which
matches the join key.
It works at the coordinator planning level, but it's most useful locally,
after insertion of `null`s into the plan on detecting missing fields.
The Join is substituted with a projection with the same attributes as
the join, atop an eval with all join's right fields aliased to null.
Closes #125577.