* Grant the `kibana_system` reserved role "all" privileges on the `.adhoc.alerts*` and `.internal.adhoc.alerts*` indices
* Update docs/changelog/127321.yaml
* [CI] Auto commit changes from spotless
* Replace `"all"` with the specific privileges for the `kibana_system` role
* Fix tests
* Fix CI
* Updated privileges
* Updated privileges
Add `"maintenance"` to allow the `refresh=true` option on bulk API calls.
* Remove redundant code
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
The auth code injects the pattern `["*", "-*"]` to specify that it's okay to return an empty response because the user's patterns did not match any remote clusters. However, we fail to recognise this specific pattern, and `groupIndices()` eventually associates it with the local cluster. This causes Elasticsearch to unknowingly fall back to the local cluster and return its details to the user, even though the user hasn't requested any info about the local cluster.
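A hedged Python sketch of the fix's intent (the names here are mine, not the actual Java): recognise the "match nothing" marker up front so it cannot be mistaken for a local-cluster index pattern.

```python
# The auth layer's marker meaning "the user's patterns matched no remote
# clusters"; it must short-circuit grouping rather than fall through.
NONE_EXPRESSION = ("*", "-*")

def group_indices(index_patterns):
    """Split index patterns into per-cluster groups. Patterns without a
    `cluster:` prefix belong to the local cluster. Recognising
    NONE_EXPRESSION first prevents the silent local-cluster fallback."""
    if tuple(index_patterns) == NONE_EXPRESSION:
        return {}  # match nothing: no clusters, no local fallback
    groups = {}
    for pattern in index_patterns:
        cluster, _, index = pattern.partition(":")
        if not index:  # no remote prefix -> local cluster
            cluster, index = "(local)", pattern
        groups.setdefault(cluster, []).append(index)
    return groups
```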
Added support for the three primary scalar grid functions:
* `ST_GEOHASH(geom, precision)`
* `ST_GEOTILE(geom, precision)`
* `ST_GEOHEX(geom, precision)`
As well as versions of these three that take an optional `geo_shape` boundary (which must be a `BBOX`, i.e. a `Rectangle`).
And also supporting conversion functions that convert the grid-id from long to string and back.
This work represents the core of the feature to support geo-grid aggregations in ES|QL.
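For context, a minimal Python sketch of the classic geohash encoding that `ST_GEOHASH` is based on. This is illustrative only: the real implementation uses Lucene's geo-grid utilities and returns a long grid-id (with separate conversion functions for the string form), whereas this sketch produces the string directly.

```python
# Standard geohash base-32 alphabet (no a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat: float, lon: float, precision: int) -> str:
    """Interleave longitude/latitude bisection bits, 5 bits per base-32 char."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits = []
    even = True  # geohash starts with a longitude bit
    while len(bits) < precision * 5:
        if even:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits.append(1)
                lon_lo = mid
            else:
                bits.append(0)
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits.append(1)
                lat_lo = mid
            else:
                bits.append(0)
                lat_hi = mid
        even = not even
    return "".join(
        BASE32[int("".join(map(str, bits[i : i + 5])), 2)]
        for i in range(0, precision * 5, 5)
    )
```

Note the prefix property: a precision-5 geohash is a prefix of the precision-11 geohash of the same point, which is what makes geohashes useful as grid-aggregation keys.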
* Implement SAML custom attributes support for Identity Provider
This commit adds support for custom attributes in SAML single sign-on requests
in the Elasticsearch X-Pack Identity Provider plugin. This feature allows
passage of custom key-value attributes in SAML requests and responses.
Key components:
- Added SamlInitiateSingleSignOnAttributes class for holding attributes
- Added validation for null and empty attribute keys
- Updated request and response objects to handle attributes
- Modified authentication flow to process attributes
- Added test coverage to validate attributes functionality
The implementation follows Elasticsearch patterns with robust validation
and serialization mechanisms, while maintaining backward compatibility.
* Add test for SAML custom attributes in authentication response
This commit adds a comprehensive test that verifies SAML custom attributes
are correctly handled in the authentication response builder. The test ensures:
1. Custom attributes with single and multiple values are properly included
2. The response with custom attributes is still correctly signed
3. The XML schema validation still passes with custom attributes
4. We can locate and verify individual attribute values in the response
This provides critical test coverage for the SAML custom attributes
feature implementation.
* Add backward compatibility overload for SuccessfulAuthenticationResponseMessageBuilder.build
This commit adds an overloaded build method that accepts only two parameters
(user and authenticationState) and forwards the call to the three-parameter
version with null for the customAttributes parameter. This maintains backward
compatibility with existing code that doesn't use custom attributes.
This fixes a compilation error in ServerlessSsoIT.java which was still using
the two-parameter method signature.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Add validation for duplicate SAML attribute keys
This commit enhances the SAML attributes implementation by adding validation
for duplicate attribute keys. When the same attribute key appears multiple
times in a request, the validation will now fail with a clear error message.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Refactor SAML attributes validation to follow standard patterns
This commit improves the SAML attributes validation by:
1. Adding a dedicated validate() method to SamlInitiateSingleSignOnAttributes
that centralizes validation logic in one place
2. Moving validation from constructor to dedicated method for better error reporting
3. Checking both for null/empty keys and duplicate keys in the validate() method
4. Updating SamlInitiateSingleSignOnRequest to use the new validation method
5. Adding comprehensive tests for the new validation approach
These changes follow standard Elasticsearch validation patterns, making the
code more maintainable and consistent with the rest of the codebase.
* Update docs/changelog/128176.yaml
* Improve SAML response validation in identity provider tests
Enhanced the testCustomAttributesInIdpInitiatedSso test to properly validate
both SAML response structure and custom attributes using DOM parsing and XPath.
Key improvements:
- Validate SAML Response/Assertion elements exist
- Precisely validate custom attributes (department, region) and their values
- Use namespace-aware XML parsing for resilience to format changes
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Simplify SAML attributes representation using JSON object/Map structure
Also, replace internal Attribute class list with a simpler Map<String, List<String>>
structure
This change:
- Removes the redundant Attribute class and replaces it with a direct Map
implementation for storing attribute key-value pairs
- Eliminates the duplicate "attributes" nesting in the JSON structure
- Simplifies attribute validation without needing duplicate key checking
- Updates all related tests and integration points to work with the new structure
Before:
```js
{
// others
"attributes": {
"attributes": [
{
"key": "department",
"values": ["engineering", "product"]
}
]
}
}
```
After:
```js
{
// other
"attributes": {
"department": ["engineering", "product"]
}
}
```
(Verified by spitting out JSON entity in IdentityProviderAuthenticationIT.generateSamlResponseWithAttributes
... saw `{"entity_id":"ec:123456:abcdefg","acs":"https://sp1.test.es.elasticsearch.org/saml/acs","attributes":{"department":["engineering","product"],"region":["APJ"]}}`)
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Fix up toString dangling quote.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Remove attributes from Response object.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Remove friendly name.
* Make attributes map final in SamlInitiateSingleSignOnAttributes
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Cleanup serdes by using existing utils in the ES codebase
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Touchup comment
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Update x-pack/plugin/identity-provider/src/test/java/org/elasticsearch/xpack/idp/action/SamlInitiateSingleSignOnRequestTests.java
Co-authored-by: Tim Vernum <tim@adjective.org>
* Add transport-version checks
---------
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
* New l2 normalizer added
* L2 score normaliser is registered
* test case added to the yaml
* Documentation added
* Resolved checkstyle issues
* Update docs/changelog/128504.yaml
* Update docs/reference/elasticsearch/rest-apis/retrievers.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Score 0 test case added to check for corner cases
* Edited the markdown doc description
* Pruned the comment
* Renamed the variable
* Added comment to the class
* Unit tests added
* Spotless and checkstyle fixed
* Fixed build failure
* Fixed the forbidden test
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
The tests sometimes pass but then fail to clean up the indices they
created indirectly, so they will now directly delete the indices they
created for the test.
Fix #128577
If an index is deleted after a snapshot has written its shardGenerations
file but before the snapshot is finalized, we exclude this index from the
snapshot because its indexMetadata is no longer available. However,
the shardGenerations file is still valid in that it is the latest copy with all
necessary information despite it containing an extra snapshot entry.
This is OK. Instead of dropping this shardGenerations file, this PR
carries it forward by updating RepositoryData and the relevant
in-progress snapshots so that the next finalization builds on top of this one.
Co-authored-by: David Turner <david.turner@elastic.co>
* Count and Count distinct functions for tsds
* issues with test
* fixup of tests
* fixup of output types
* fix naming and helper fns
* fixup
* fix other test
* ugh mistake
This change extracts the field extraction logic previously run inside
`TimeSeriesSourceOperator` into a separate operator, executed in a
separate driver. We should consider consolidating this operator with
`ValuesSourceReaderOperator`. I tried to extend
`ValuesSourceReaderOperator` to support this, but it may take some time
to complete. Our current goal is to move quickly with experimental
support for querying time-series data in ES|QL, so I am proceeding with
a separate operator. I will spend more time on combining these two
operators later.
With #128419 and this PR, the query time for the example below decreased
from 41ms to 27ms.
```
POST /_query
{
"profile": true,
"query": "TS metrics-hostmetricsreceiver.otel-default
| WHERE @timestamp >= \"2025-05-08T18:00:08.001Z\"
| STATS cpu = avg(rate(`metrics.process.cpu.time`)) BY host.name, BUCKET(@timestamp, 5 minute)"
}
```
This introduces an optimization to push down to Lucene those language constructs that aim at case-insensitive regular expression matching, used with the `LIKE` and `RLIKE` operators, such as:
* `| WHERE TO_LOWER(field) LIKE "abc*"`
* `| WHERE TO_UPPER(field) RLIKE "ABC.*"`
These are now pushed down to Lucene as case-insensitive `wildcard` and `regexp` queries, respectively.
Closes #127479
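To make the equivalence concrete, here is a hedged Python sketch (mine, not the optimizer's code) of why the rewrite is safe: matching `TO_LOWER(field) LIKE "abc*"` row by row gives the same answers as a single case-insensitive wildcard match, which is what Lucene's `wildcard` query with `case_insensitive: true` provides. `fnmatch`'s `*`/`?` are used here to approximate LIKE wildcards.

```python
import fnmatch
import re

def like_to_case_insensitive_regex(like_pattern: str):
    """Compile a LIKE-style wildcard pattern case-insensitively.
    Equivalent to lower-casing every value and matching a lower-case
    pattern, but expressible as one index-level query."""
    return re.compile(fnmatch.translate(like_pattern), re.IGNORECASE)
```

The same argument applies to `TO_UPPER(field) RLIKE "ABC.*"` and a case-insensitive `regexp` query.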
* Fix scoring and sort handling in pinned retriever
* Remove books.es from version control and add to .gitignore
* Remove books.es from version control and add to .gitignore
* Remove books.es entries from .gitignore
* fixed the mess
* Update x-pack/plugin/search-business-rules/src/test/java/org/elasticsearch/xpack/searchbusinessrules/retriever/PinnedRetrieverBuilderTests.java
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update x-pack/plugin/search-business-rules/src/main/java/org/elasticsearch/xpack/searchbusinessrules/retriever/PinnedRetrieverBuilder.java
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Currently, the Limit operator does not combine small pages into larger
ones; it simply passes them along, except for chunking pages larger than
the limit. This change integrates EstimatesRowSize into Limit and
adjusts it to emit larger pages. As a result, pages up to twice the
pageSize may be emitted, which is preferable to emitting undersized
pages. This should reduce the number of transport requests and responses
between clusters or coordinator-data nodes for queries without TopN or
STATS when target shards produce small pages due to their size or highly
selective filters.
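The buffering idea can be sketched in a few lines of Python (a sketch of the approach, not the Limit operator itself): buffer incoming pages until the target size is reached, so a combined page can reach at most just under `page_size` buffered rows plus one more incoming page, i.e. up to roughly twice `page_size`.

```python
def combine_pages(pages, page_size):
    """Greedily buffer small pages and emit combined ones. Emitting pages
    of up to ~2 * page_size rows beats sending many undersized pages
    between clusters or coordinator and data nodes."""
    out, buffer, buffered = [], [], 0
    for page in pages:
        buffer.append(page)
        buffered += len(page)
        if buffered >= page_size:  # flush once the target size is reached
            out.append([row for p in buffer for row in p])
            buffer, buffered = [], 0
    if buffer:  # flush the remainder at end of input
        out.append([row for p in buffer for row in p])
    return out
```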
* Inference changes
* Custom service fixes
* Update docs/changelog/127939.yaml
* Cleaning up from failed merge
* Fixing changelog
* [CI] Auto commit changes from spotless
* Fixing test
* Adding feature flag
* [CI] Auto commit changes from spotless
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
* Implemented ChatCompletion task for Google VertexAI with Gemini Models
* changelog
* System Instruction bugfix
* Mapping role assistant -> model in vertex ai chat completion request for compatibility
* GoogleVertexAI chat completion using SSE events. Removed JsonArrayEventParser
* Removed buffer from GoogleVertexAiUnifiedStreamingProcessor
* Casting inference inputs with `castTo`
* Registered GoogleVertexAiChatCompletionServiceSettings in InferenceNamedWriteablesProvider. Added InferenceSettingsTests
* Changed transport version to 8_19 for vertexai chatcompletion
* Fix to transport version. Moved ML_INFERENCE_VERTEXAI_CHATCOMPLETION_ADDED to the right location
* VertexAI Chat completion request entity jsonStringToMap using `ensureExpectedToken`
* Fixed TransportVersions. Left vertexAi chat completion 8_19 and added new one for ML_INFERENCE_VERTEXAI_CHATCOMPLETION_ADDDED
* Refactor switch statements by if-else for older java compatibility. Improved indentation via `{}`
* Removed GoogleVertexAiChatCompletionResponseEntity and refactored code around it.
* Removed redundant test `testUnifiedCompletionInfer_WithGoogleVertexAiModel`
* Returning whole body when fail to parse response from VertexAI
* Refactor use GenericRequestManager instead of GoogleVertexAiCompletionRequestManager
* Refactored to constructorArg for mandatory args in GoogleVertexAiUnifiedStreamingProcessor
* Changed transport version in GoogleVertexAiChatCompletionServiceSettings
* Bugfix in tool calling with role tool
* GoogleVertexAiModel added documentation info on rateLimitGroupingHash
* [CI] Auto commit changes from spotless
* Fix: using Locale.ROOT when calling toLowerCase
* Fix: Renamed test class to match convention & modified use of forbidden api
* Fix: Failing test in InferenceServicesIT
---------
Co-authored-by: lhoet <lhoet@google.com>
Co-authored-by: Jonathan Buttner <56361221+jonathan-buttner@users.noreply.github.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Begins adding support for running "tagged queries" to the compute
engine. Here, it's just the `LuceneSourceOperator` because that's
useful and contained.
Example time! Say you are running:
```
FROM foo
| STATS MAX(v) BY ROUND_TO(g, 0, 100, 1000, 100000)
```
It's *often* faster to run this as four queries:
* The docs that round to `0`
* The docs that round to `100`
* The docs that round to `1000`
* The docs that round to `100000`
This creates an ESQL operator that can run these queries, one after the
other and attach those tags.
Aggs uses this trick and it's *way* faster when it can push down count
queries, but it's still faster when it pushes doc loading things. This
implementation in `LuceneSourceOperator` is quite similar to the doc
loading version in _search.
I don't have performance measurements yet because I haven't plugged this
into the language. In _search we call this `filter-by-filter` and enable
it when each group averages more than 5000 documents and when there
isn't a `_doc_count` field; otherwise it's faster not to push. I
expect we'll be pretty similar.
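A hedged Python sketch of the trick (names and the below-smallest-point behaviour are my assumptions, not the compute-engine code): instead of computing `ROUND_TO` per row in one pass, run one filtered pass per bucket and attach the bucket value as the tag.

```python
def round_to(value, points):
    """Round down to the largest listed point <= value; values below the
    smallest point fall into the first bucket (an assumption here).
    Assumes `points` is sorted ascending."""
    best = points[0]
    for p in points:
        if p <= value:
            best = p
    return best

def run_as_tagged_queries(values, points):
    """Filter-by-filter: one range 'query' per bucket, each tagged with
    its bucket value, mimicking four tagged Lucene queries."""
    tagged = {}
    for i, p in enumerate(points):
        lo = None if i == 0 else p  # first bucket also catches values below it
        hi = points[i + 1] if i + 1 < len(points) else None
        tagged[p] = [v for v in values
                     if (lo is None or v >= lo) and (hi is None or v < hi)]
    return tagged
```

Both formulations assign every value to the same bucket; the per-bucket form is what lets Lucene answer each bucket with a cheap range query.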
Most providers write the UnifiedCompletionRequest JSON as we received
it, with some exceptions:
- the modelId can be null and/or overwritten from various locations
- `max_completion_tokens` replaced `max_tokens`, but some providers
still use the deprecated field name
We will handle the variations using Params, otherwise all of the
XContent building code has moved into UnifiedCompletionRequest so it can
be reused across providers.
This PR addresses the bug reported in
[#127496](https://github.com/elastic/elasticsearch/issues/127496)
**Changes:**
- Added validation logic in `ConfigurableClusterPrivileges` to ensure privileges defined for a global cluster manage role privilege are valid
- Added unit test to `ManageRolePrivilegesTest` to ensure an invalid privilege is caught during role creation
- Updated `BulkPutRoleRestIT` to assert that an error is thrown and that the role is not created.
Both existing and new unit/integration tests passed locally.
`writerWithOffset` uses a lambda to create a RangeMissingHandler; however,
the RangeMissingHandler interface has a default implementation for `sharedInputStreamFactory`.
This makes `writerWithOffset` delegate to the received writer only for the `fillCacheRange`
method, where the writer itself perhaps didn't have the `sharedInputStream` method invoked
(always invoking `sharedInputStream` before `fillCacheRange` is part of the contract of the
RangeMissingHandler interface).
This PR makes `writerWithOffset` delegate `sharedInputStream` to the underlying writer as well.
This moves half of the remaining centralized expression mapper logic
into the individual expressions. This is how all but 3 of the remaining
expressions work. Let's try and be fully consistent.
"elser" is an alias for "elasticsearch", and "sagemaker" is an alias for
"amazon_sagemaker".
Users can continue to create and use providers by their alias.
Elasticsearch will continue to support the alias when it reads the
configuration from the internal index.
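The alias behaviour described above can be sketched as a simple lookup (a hypothetical sketch; the real registry lives in the inference plugin's service registration code):

```python
# Aliases named in this change; canonical service names on the right.
SERVICE_ALIASES = {
    "elser": "elasticsearch",
    "sagemaker": "amazon_sagemaker",
}

def resolve_service(name: str) -> str:
    """Accept either a canonical service name or one of its aliases, both
    at request time and when reading stored configuration from the
    internal index."""
    return SERVICE_ALIASES.get(name, name)
```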
Speeds up the semantic_text tests by using a test inference plugin. That
skips downloading the normal inference setup.
Closes #128511 Closes #128513 Closes #128571 Closes #128572
Closes #128573 Closes #128574
This change skips indexing points for the seq_no field in tsdb and
logsdb indices to reduce storage requirements and improve indexing
throughput. Although this optimization could be applied to all new
indices, it is limited to tsdb and logsdb, where seq_no usage is
expected to be limited and storage requirements are more critical.
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
SageMaker now supports Completion and Chat Completion using the OpenAI
interfaces.
Additionally:
- Fixed bug related to timeouts being nullable, default to 30s timeout
- Exposed existing OpenAi request/response parsing logic for reuse
This adds a new logical optimization rule to purge a Join in case the
merge key(s) are null. The null detection is based on recognizing a tree
pattern where the join sits atop a project and/or eval (possibly a few
nodes deep) which contains a reference to a `null`, reference which
matches the join key.
It works at the coordinator planning level, but it's most useful locally,
after insertion of `null`s into the plan on detecting missing fields.
The Join is substituted with a projection with the same attributes as
the join, atop an eval with all join's right fields aliased to null.
Closes #125577.