Commit Graph

5219 Commits

Author SHA1 Message Date
Ryan Ernst e5d5c17c99
Use directory name as project name for libs (#115720)
The libs projects are configured to all begin with `elasticsearch-`.
While this is desirable for the artifacts to contain this consistent
prefix, it means the project names don't match up with their
directories. Additionally, it creates complexities for subproject naming
that must be manually adjusted.

This commit adjusts the project names for those under libs to be their
directory names. The resulting artifacts for these libs are kept the
same, all beginning with `elasticsearch-`.
2024-10-29 13:02:28 -07:00
Artem Prigoda 7feb4d5159
Revert "[test] Dynamically pick up the upper bound snapshot index version (#1…" (#115827)
This reverts commit 32dee6aaae.
2024-10-29 12:03:06 +01:00
Chris Hegarty c9a989407e
Add unit test for negative values in ByteBufferStreamInput::readVLong (#115749) 2024-10-29 10:15:40 +00:00
Lorenzo Dematté 2522c9877a
Fix testApmIntegration histogram assertions (#115578) 2024-10-29 09:51:39 +01:00
Kostas Krikellas b97b6637a6
[TEST] Run tsdb tests with both base and trial licenses (#115653)
* Run tsdb tests with both base and trial licenses.

* ignore license error in serverless

* update

* update

* update
2024-10-29 09:57:03 +02:00
Kathleen DeRusso 690ad1ea60
Query rules retriever (#114855) 2024-10-28 22:32:29 +01:00
Keith Massey 8787f0de09
Removing code that rarely adds a legacy global template to yaml rest tests (#115799) 2024-10-28 15:07:38 -05:00
Moritz Mack 68316f7d17
Remove metering from ingest service to occur afterwards when parsing the final document (#114895) 2024-10-25 11:52:06 -07:00
Christoph Büscher 6cec96cc1e
Fix TimeSeriesRateAggregatorTests file leak (#115278)
With Lucene 10, IndexWriter requires a parent document field in order to
use index sorting with document blocks. This led to different IAE and file
leaks in this test which are fixed by adapting the corresponding location in
the test setup.
2024-10-25 15:44:59 +02:00
Salvatore Campagna e7897bdeff
Return `_ignored_source` as a top level array field (#115328)
This PR introduces a fix for the `fields` and `stored_fields` APIs and the way `_ignored_source` field is handled:

1. **Return `_ignored_source` as a top-level array metadata field**:
   - The `_ignored_source` field is now returned as a top-level array in the metadata as done with other metadata fields.

2. **Return `_ignored_source` as an array of values**:
   - Even when there is only a single ignored field, `_ignored_source` will now be returned as an array of values. This is done to be consistent with how the `_ignored` field is returned.

Without this fix, we would return the `_ignored_source` field twice, as a top-level field and as part of the `fields` array. Also, without this fix, we would only return a single value instead of all ignored field values.
2024-10-25 09:57:12 +02:00
Nhat Nguyen f444c86f85
Add lookup index mode (#115143)
This change introduces a new index mode, lookup, for indices intended 
for lookup operations in ES|QL. Lookup indices must have a single shard
and be replicated to all data nodes by default. Aside from these
requirements, they function as standard indices. Documentation will be
added later when the lookup operator in ES|QL is implemented.
2024-10-24 13:47:20 -07:00
Artem Prigoda b4edc3ddab
Remove loading on-disk cluster metadata from the manifest file (#114698)
Since metadata storage was moved to Lucene in #50907 (7.16.0), we shouldn't encounter any on-disk global metadata files, so we can remove support for loading them.
2024-10-24 21:26:23 +02:00
Andrei Dan 327f23254a
Allow for queries on _tier to skip shards during coordinator rewrite (#114990)
The `_tier` metadata field was not used on the coordinator when
rewriting queries in order to exclude shards that don't match. This led
queries of the following form to continue to report failures even
though the only unavailable shards were in the tier that was excluded
from search (frozen tier in this example):

```
POST testing/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "term": {
            "_tier": "data_frozen"
          }
        }
      ]
    }
  }
}
```

This PR addresses this by having the queries that can execute on `_tier`
(term, match, query string, simple query string, prefix, wildcard)
execute a coordinator rewrite to exclude the indices that don't match
the `_tier` query **before** attempting to reach the shards (shards
that might not be available and would raise errors).
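
The idea of dropping non-matching indices before any shard is contacted can be sketched as follows. This is only an illustration under stated assumptions: `TierPrefilter`, `indicesToSearch`, and the index-to-tier map are hypothetical names, not the classes touched by this PR, and only the `must_not` term case from the example above is modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Elasticsearch code: decide on the coordinator
// which indices still need to be searched once a tier has been excluded.
final class TierPrefilter {

    // indexToTier maps an index name to its tier, e.g. "data_hot" or "data_frozen".
    static List<String> indicesToSearch(Map<String, String> indexToTier, String excludedTier) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : indexToTier.entrySet()) {
            // Indices in the excluded tier are skipped before any shard request is sent,
            // so unavailable shards in that tier cannot produce failures.
            if (!excludedTier.equals(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

With a map containing one `data_hot` index and one `data_frozen` index, excluding `data_frozen` leaves only the hot index to be searched.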

Fixes #114910
2024-10-24 21:41:47 +11:00
Nick Tindall 7599d4cf43
Use Azure blob batch API to delete blobs in batches (#114566)
Closes ES-9777
2024-10-24 19:51:52 +11:00
Ryan Ernst 06a3e19021
Remove LongGCDisruption scheme (#115046)
Long GC disruption relies on Thread.resume, which is removed in JDK 23.
Tests that use it predate more modern disruption tests. This commit
removes gc disruption and the master disruption tests. Note that tests
relying on this scheme have not been running since JDK 20 first
deprecated Thread.resume.
2024-10-23 10:28:17 -07:00
Martijn van Groningen 387062eb80
Sometimes delegate to SourceLoader in ValueSourceReaderOperator for required stored fields (#115114)
If source is required by a block loader then the StoredFieldsSpec that gets populated should be enhanced by SourceLoader#requiredStoredFields(...) in ValuesSourceReaderOperator. Otherwise, in case of synthetic source, many stored fields aren't loaded, which causes only a subset of _source to be synthesized. For example, unmapped fields or field values that exceed the configured `ignore_above` will not appear in _source.

This happens when field types fallback to a block loader implementation that uses _source. The required field values are then extracted from the source once loaded.

This change also reverts the production code changes introduced via #114903. That change only ensured that the _ignored_source field was added to the required list of stored fields. In reality more fields could be required. This change is a better fix, since it also handles other cases and the SourceLoader implementation indicates which stored fields are needed.

Closes #115076
2024-10-23 10:20:42 +02:00
Artem Prigoda 530d15029e
Remove direct cloning of BytesTransportRequests (#114808)
All request handlers should be able to read `BytesTransportRequest` to
a class that can be copied by re-serializing. Direct copying was only
necessary for the legacy `JOIN_VALIDATE_ACTION_NAME` request handler.

See #89926
2024-10-23 09:30:16 +02:00
Artem Prigoda 32dee6aaae
[test] Dynamically pick up the upper bound snapshot index version (#114703)
Pick an index version between the minimum compatible and latest known
version for snapshot testing.
2024-10-23 09:30:02 +02:00
Tim Vernum f6c0a245fd
[Test] Add client param indexExists (#115180)
This allows us to use the admin client to easily check whether an
index exists (that may not be visible to the standard client)
2024-10-22 11:20:22 +11:00
David Kyle 1cf8d496c8
[ML] Do not create the .inference index as a side effect of calling usage (#115023)
The Inference usage API calls GET _inference/_all and because the default 
configs are persisted on read it causes the creation of the .inference index.
This action is undesirable and causes test failures by leaking the system index
out of the test clean up code.
2024-10-21 16:09:20 +02:00
Luca Cavanna 8efd08b019
Upgrade to Lucene 10 (#114741)
The most relevant ES changes that upgrading to Lucene 10 requires are:

- use the appropriate IOContext
- Scorer / ScorerSupplier breaking changes
- Regex automata are no longer determinized by default
- minimize moved to test classes
- introduce Elasticsearch900Codec
- adjust slicing code according to the added support for intra-segment concurrency
- disable intra-segment concurrency in tests
- adjust accessor methods for many Lucene classes that became a record
- adapt to breaking changes in the analysis area

Co-authored-by: Christoph Büscher <christophbuescher@posteo.de>
Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>
Co-authored-by: ChrisHegarty <chegar999@gmail.com>
Co-authored-by: Brian Seeders <brian.seeders@elastic.co>
Co-authored-by: Armin Braun <me@obrown.io>
Co-authored-by: Panagiotis Bailis <pmpailis@gmail.com>
Co-authored-by: Benjamin Trent <4357155+benwtrent@users.noreply.github.com>
2024-10-21 13:38:23 +02:00
Pooya Salehi 671458a999
Always flush response body in AbstractBlobContainerRetriesTestCase#sendIncompleteContent with JDK23 (#115197)
Resolves https://github.com/elastic/elasticsearch/issues/115172
2024-10-21 22:01:58 +11:00
Martijn van Groningen c62a96c8ab
Include ignored source as part of loading field values in ValueSourceReaderOperator via BlockSourceReader. (#114903)
Currently, in compute engine when loading source if source mode is synthetic, the synthetic source loader is already used. But the ignored_source field isn't always marked as a required source field, causing the source to potentially miss a lot of fields.

This change includes the _ignored_source field as a required stored field and allows keyword fields without doc values or stored fields to be used in case of synthetic source.

Relying on synthetic source to get the values (because a field doesn't have stored fields / doc values) is slow. In case of synthetic source we already keep ignored field/values in a special place, named ignored source. Long term, in case of synthetic source we should only load ignored source when a field has no doc values or stored field, as is being explored in #114886, thereby avoiding synthesizing the complete _source in order to get only one field.
2024-10-18 07:49:00 +02:00
Pat Whelan 5cace0f047
[ML] Temporarily ignore inference index (#114928)
Until we can figure out where in the tests the index is being created,
temporarily ignore deleting it along with the other system indices.

Relates #114748
2024-10-17 09:43:05 +01:00
Simon Cooper 32ddbb3449
Squash transport versions into 8.15 (#114827) 2024-10-17 08:36:50 +01:00
Panagiotis Bailis e79127ba2a
Adding deprecation warnings for rank and sub_searches (#114854) 2024-10-16 21:56:13 +03:00
Salvatore Campagna 9bf6e3b0ba
Inject the `host.name` field mapping only if required for `logsdb` index mode (#114573)
Here we check for the existence of a `host.name` field in index sort settings
when the index mode is `logsdb` and decide to inject the field in the mapping
depending on whether it exists or not. By default `host.name` is required for
sorting in LogsDB. This reduces the chances for errors at mapping or template
composition time as a result of injecting the `host.name` field only if strictly
required. A user who wants to override index sort settings without including
a `host.name` field would be able to do so without finding an additional
`host.name` field in the mappings (injected automatically). If users override the
sort settings and a `host.name` field is not included we don't need
to inject such field since sorting does not require it anymore.

As a result of this change we have the following:
* the user does not provide any index sorting configuration: we are responsible for injecting the default sort fields and their mapping (for `logsdb`)
* the user explicitly provides non-empty index sorting configuration: the user is also responsible for providing correct mappings and we do not modify index sorting or mappings

Note also that all sort settings `index.sort.*` are `final` which means doing this
check once, when mappings are merged at template composition time, is enough.
2024-10-16 16:12:56 +02:00
David Turner 15c1051fb6
Inline `MockTransportService#getLocalDiscoNode()` (#114883)
This method just delegates to `getLocalNode()`, we may as well call the
more widely-used method with the shorter name directly.
2024-10-16 11:35:22 +02:00
Joe Gallo 69054ac83b
Download IPinfo ip location databases (#114847) 2024-10-15 17:59:38 -05:00
Oleksandr Kolomiiets c401a71426
Make mapping a distinct concept in logsdb data generation (#114370) 2024-10-16 00:24:39 +02:00
Luke Whiting 37f03dc40d
#111893 Add Warnings For Missing Index Templates (#114589)
* Add data stream template validation

to snapshot restore

* Add data stream template validation

to data stream promotion endpoint

* Add new assertion for response headers

Add a new assertion to synchronously execute a request and check the
response contains a specific warning header

* Test for warning header on snapshot restore

When missing templates

* Test for promotion warnings

* Add documentation for the potential error states

* PR changes

* Spotless reformatting

* Add logic to look in snapshot global metadata

This checks if the snapshot contains a matching template for the DS

* Comment on test cleanup to explain it was copied

* Removed cluster service field
2024-10-15 15:00:53 +01:00
David Kyle bd6eecac4b
[ML] Wait for allocation on scale up from 0 (#114719) 2024-10-15 13:38:45 +02:00
Yang Wang 5f3595bba9
Add a callback for onConnectionClosed to MockTransportService (#114564)
The callback is added to allow inserting additional behaviour such as
delay when handling closed connection.
2024-10-14 15:18:25 +11:00
Martijn van Groningen e833e7b6c4
Add feature flag for subobjects auto (#114616) 2024-10-12 18:55:27 +02:00
Nik Everett 4c48aa346a
ESQL: Retry test on 403 (#114450)
Retry the async test when you get a 403 - that could be because security
has not yet booted. We should have permission to fetch everything.
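
A minimal sketch of this kind of retry, assuming a request is modelled as a function returning an HTTP status code. `RetryOn403`, `fetchWithRetry`, and its parameters are hypothetical names for illustration and are not taken from the test changed here.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch, not the actual ESQL test code: repeat a request
// while it answers 403, up to a fixed number of attempts.
final class RetryOn403 {

    static int fetchWithRetry(IntSupplier request, int maxAttempts, long delayMillis) {
        int status = request.getAsInt();
        for (int attempt = 1; attempt < maxAttempts && status == 403; attempt++) {
            try {
                // Give security a moment to finish booting before asking again.
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting to retry", e);
            }
            status = request.getAsInt();
        }
        // Either a non-403 status, or 403 after the attempts were exhausted.
        return status;
    }
}
```

Bounding the number of attempts keeps a genuine permission problem visible instead of looping forever.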
2024-10-11 20:55:01 +02:00
Craig Taverner 227a193483
Enable pushing Sort/Filter by ReferenceAttribute down to Lucene, and thereby optimize Sort by ST_DISTANCE (#112938)
The ST_DISTANCE function added in #108764 was optimized for Lucene pushdown in a series of follow-up PRs, but this did not include sorting by distance. This is now resolved for two key scenarios, both known to be valued by users:

* Sorting by distance:
    `FROM index | EVAL distance=ST_DISTANCE(field, literal) | SORT distance`
* Sorting and filtering by distance:
    `FROM index | EVAL distance=ST_DISTANCE(field, literal) | WHERE distance < literal | SORT distance`

The key changes required to make this work:
* Add to the EsQueryExec the appropriate sort->_geo_distance sort type
* Enhance PushTopNToSource to understand how to push down the sort even when there is an EVAL in between the FROM and the SORT (between the TopNExec and the EsQueryExec in the physical plan).
* Enhance PushFiltersToSource to understand how to push down the filter even when there is an EVAL in between the FROM and the WHERE (between the Filter and the EsQueryExec in the physical plan).

A useful bonus feature of this additional EVAL intelligence is that other, non-spatial cases are now also pushed down. In particular EVALs that are simple aliases are considered and pushed down, for both filtering and sorting.

Local benchmark results are very approximate, but show massive improvements for distanceSort and distanceFilterSort, which relate to the two cases listed above.

| Benchmark | Query DSL | ESQL before this PR | ESQL after this PR | Comments |
| --- | --- | --- | --- | --- |
| distanceFilter | 10 | 5 | 5 | Optimized in #109972 |
| distanceEvalFilter | 10 | 10000 | 1500 | Still slow due to unnecessary EVAL |
| distanceSort | 150 | 12000 | 160 | |
| distanceFilterSort | 20 | 10000 | 24 | |

NOTE: This enables pushing down sorting by any ReferenceAttribute that either refers to a sortable FieldAttribute, or to an StDistance function that itself refers to a suitable FieldAttribute of geo_point type.

---------

Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
2024-10-11 14:13:36 +02:00
David Turner 1e2b20065a
Ensure clean thread context in `MasterService` (#114512)
`ThreadContext#stashContext` doesn't guarantee to give a clean thread
context, but it's important we don't allow the callers' thread contexts
to leak into the cluster state update. This commit captures the desired
thread context at startup rather than using `stashContext` when forking
the processor.
2024-10-11 08:04:26 +01:00
Nik Everett db8a2d245d
ESQL: Delay construction of warnings (#114368)
Delay construction of `Warnings` until they are needed to save memory
when evaluating many many many expressions. Most expressions won't use
warnings at all and there isn't any need to make registering warnings
super duper fast. So let's make the construction lazy to save a little
memory. It's like 200 bytes per expression which isn't much, but it's
possible to have thousands of expressions in a single query. Abusive,
but possible.

This also consolidates all `Warnings` usages to a single `Warnings`
class. We had two. We don't need two.
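
The lazy construction described above can be sketched as follows. `LazyWarnings` and its methods are hypothetical names, a plain list stands in for the real `Warnings` class, and the sketch is not thread-safe.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch, not the ESQL Warnings class: the backing collection
// is only created when the first warning is registered.
final class LazyWarnings {
    private final Supplier<List<String>> factory;
    private List<String> warnings; // stays null until a warning is registered

    LazyWarnings(Supplier<List<String>> factory) {
        this.factory = factory;
    }

    void register(String message) {
        if (warnings == null) {
            // Pay the allocation cost only on the rare path that actually warns.
            warnings = factory.get();
        }
        warnings.add(message);
    }

    boolean isAllocated() {
        return warnings != null;
    }

    List<String> messages() {
        return warnings == null ? List.of() : List.copyOf(warnings);
    }
}
```

An expression that never warns keeps only the reference to the factory, which is the memory saving the commit is after.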
2024-10-10 00:02:26 +02:00
Kostas Krikellas f6bf506584
Avoid noisy errors in testSyntheticSourceKeepArrays (#114391)
* Minimize storing array source

* restrict to fields

* revert changes for `addIgnoredFieldFromContext`

* fix test

* spotless

* count nulls

* Avoid noisy errors in testSyntheticSourceKeepArrays

* update

* update

* update

* update
2024-10-09 14:57:09 +03:00
Kostas Krikellas f79705d9b6
Skip storing ignored source for single-element leaf arrays (#113937)
* Minimize storing array source

* restrict to fields

* revert changes for `addIgnoredFieldFromContext`

* fix test

* spotless

* count nulls
2024-10-09 10:22:39 +03:00
David Turner 276f3b8836
Avoid leaking blackholed register ops in tests (#114287)
Today when we reboot a node in a test case derived from
`AbstractCoordinatorTestCase` we lose the contents of
`blackholedRegisterOperations`, but it's important that these operations
_eventually_ run. With this commit we copy these operations over into
the new node.
2024-10-09 06:59:38 +01:00
Dianna Hohensee 84fe9cf3a3
Track shard snapshot progress during node shutdown (#112567)
Track shard snapshot progress during shutdown to identify any
bottlenecks that cause slowness that can ultimately block shard
re-allocation.

Relates ES-9086
2024-10-08 16:59:37 -04:00
Nik Everett 216d2de877
ESQL: Weaken test assertion (#114336)
Weaken the assertion when testing breakers: it's ok to break while
building a block in addition to topn.
2024-10-09 05:38:15 +11:00
Oleksandr Kolomiiets 965265a1a4
Don't generate invalid combination of subobjects parameter in logsdb tests (#114265) 2024-10-08 18:14:03 +02:00
David Kyle 3a83fcdef9
[ML] Remove scale to zero feature flag (#114323) 2024-10-08 17:37:04 +02:00
Nik Everett bafdd81d3d
ESQL: Reenable part of heap attack test (#114252)
This reenables a test and adds more debugging to another one. We'll use
this to collect more information the next time it fails.
2024-10-08 04:02:14 +11:00
Kostas Krikellas 9cfe679173
Avoid using `dynamic:strict` with `subobjects:false` at root (#114247) 2024-10-07 18:06:23 +02:00
Parker Timmins fe36a4543d
Make randomInstantBetween return in range [minInstant, maxInstant] (#114177)
randomInstantBetween can produce a result which is not within the [minInstant, maxInstant] range. This occurs when the epoch second picked matches the min bound and the nanos are below the min nanos, or the second picked matches the max bound seconds and nanos are above the max bound nanos. This change fixes the function by setting a bound on which nano values can be picked if the min or max epoch second value is picked.
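
One way to bound the nanos as described is sketched below. This is an assumption-laden illustration rather than the actual test utility: `RandomInstants` and the `inclusive` helper are hypothetical, and a `java.util.Random` is passed in explicitly.

```java
import java.time.Instant;
import java.util.Random;

// Hypothetical sketch, not the real test helper: pick an Instant in the
// closed range [min, max] by bounding the nanos on the edge seconds.
final class RandomInstants {
    private static final int MAX_NANO = 999_999_999;

    static Instant randomInstantBetween(Random random, Instant min, Instant max) {
        if (min.isAfter(max)) {
            throw new IllegalArgumentException("min must not be after max");
        }
        long second = inclusive(random, min.getEpochSecond(), max.getEpochSecond());
        // Only restrict the nanos when the chosen second sits on a bound.
        long minNano = second == min.getEpochSecond() ? min.getNano() : 0;
        long maxNano = second == max.getEpochSecond() ? max.getNano() : MAX_NANO;
        return Instant.ofEpochSecond(second, inclusive(random, minNano, maxNano));
    }

    // Random long in [low, high]; assumes high + 1 does not overflow.
    private static long inclusive(Random random, long low, long high) {
        return random.longs(1, low, high + 1).findFirst().getAsLong();
    }
}
```

Because the second is chosen first and the nanos afterwards, the result is always in range but not uniformly distributed over all instants, which is usually acceptable for randomized tests.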
2024-10-07 08:34:05 -05:00
Ignacio Vera c1580639d4
Add getAndSet to Objectarray (#114200)
This commit adds a getAndSet implementation to the ObjectArray API and changes the set method to return void.
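
The semantics can be illustrated with a small stand-in. `SimpleObjectArray` is a hypothetical class backed by a plain Java array, not the Elasticsearch `ObjectArray` interface.

```java
// Hypothetical sketch, not the Elasticsearch ObjectArray: set returns
// nothing, while getAndSet returns the value that was previously stored.
final class SimpleObjectArray<T> {
    private final Object[] values;

    SimpleObjectArray(int size) {
        this.values = new Object[size];
    }

    @SuppressWarnings("unchecked")
    T get(int index) {
        return (T) values[index];
    }

    void set(int index, T value) {
        values[index] = value;
    }

    @SuppressWarnings("unchecked")
    T getAndSet(int index, T value) {
        T previous = (T) values[index];
        values[index] = value;
        return previous;
    }
}
```

Callers that need the old value use `getAndSet`; callers that only write use `set` and no longer receive a return value they would ignore.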
2024-10-07 14:48:00 +02:00
Luca Cavanna 6b2cc599ce
Replace some test usages of search(Query, Collector) (#113818)
The leftover usages of the deprecated search(Query, TotalHitCountCollector)
have been replaced with search(Query, TotalHitCountCollectorManager).
2024-10-07 12:07:49 +02:00