* Add multi-project support for more stats APIs
This affects the following APIs:
- `GET _nodes/stats`:
- For `indices`, it now prefixes the index name with the project ID (for non-default projects). Previously, it didn't tell you which project an index was in, and it failed if two projects had the same index name.
- For `ingest`, it now gets the pipeline and processor stats for all projects, and prefixes the pipeline ID with the project ID. Previously, it only got them for the default project.
- `GET /_cluster/stats`:
- For `ingest`, it now aggregates the pipeline and processor stats for all projects. Previously, it only got them for the default project.
- `GET /_info`:
- For `ingest`, same as for `GET /_nodes/stats`.
This is done by making `IndicesService.stats()` and `IngestService.stats()` include project IDs in the `NodeIndicesStats` and `IngestStats` objects they return, and making those stats objects incorporate the project IDs when converting to XContent.
The transitive callers of these two methods are rather extensive (including all callers to `NodeService.stats()`, all callers of `TransportNodesStatsAction`, and so on). To ensure the change is safe, the callers were all checked out, and they fall into the following cases:
- The behaviour change is one of the desired enhancements described above.
- There is no behaviour change because it was getting node stats but neither `indices` nor `ingest` stats were requested.
- There is no behaviour change because it was getting `indices` and/or `ingest` stats but only using aggregate values.
- In `MachineLearningUsageTransportAction` and `TransportGetTrainedModelsStatsAction`, the `IngestStats` returned will return stats from all projects instead of just the default with this change, but they have been changed to filter the non-default project stats out, so this change is a noop there. (These actions are not MP-ready yet.)
- `MonitoringService` will be affected, but this is the legacy monitoring module which is not in use anywhere that MP is going to be enabled. (If anything, the behaviour is probably improved by this change, as it will now include project IDs, rather than producing ambiguous unqualified results and failing in the case of duplicates.)
* Update test/external-modules/multi-project/build.gradle
Change suggested by Niels.
Co-authored-by: Niels Bauman <33722607+nielsbauman@users.noreply.github.com>
* Respond to review comments
* fix merge weirdness
* [CI] Auto commit changes from spotless
* Fix test compilation following upstream change to base class
* Update x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/datatiers/DataTierUsageFixtures.java
Co-authored-by: Niels Bauman <33722607+nielsbauman@users.noreply.github.com>
* Make projects-by-index map nullable and omit in single-project; always include project prefix in XContent in multip-project, even if default; also incorporate one other review comment
* Add a TODO
* update IT to reflect changed behaviour
* Switch to using XContent.Params to indicate whether it is multi-project or not
* Refactor NodesStatsMultiProjectIT to common up repeated assertions
* Defer use of ProjectIdResolver in REST handlers to keep tests happy
* Include index UUID in "unknown project" case
* Make the index-to-project map empty rather than null in the BWC deserialization case.
This works out fine, for the reasons given in the comment. As it happens, I'd already forgotten to do the null check in the one place it's actively used.
* remove a TODO that is done, and add a comment
* fix typo
* Get REST YAML tests working with project ID prefix TODO finish this
* As a drive-by, fix and un-suppress one of the health REST tests
* [CI] Auto commit changes from spotless
* TODO ugh
* Experiment with different stashing behaviour
* [CI] Auto commit changes from spotless
* Try a more sensible stash behaviour for assertions
* clarify comment
* Make checkstyle happy
* Make the way `Assertion` works more consistent, and simplify implementation
* [CI] Auto commit changes from spotless
* In RestNodesStatsAction, make the XContent params to channel.request(), which is the value it would have had before this change
---------
Co-authored-by: Niels Bauman <33722607+nielsbauman@users.noreply.github.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
In order to remove ActionType, ActionRequest will become strongly typed,
referring to the ActionResponse type. As a precursor to that, this
commit adds a LegacyActionRequest which all existing ActionRequest
implementations now inherit from. This will allow adding the
ActionResponse type to ActionRequest in a future commit without
modifying every implementation at once.
Transport actions have associated request and response classes. However,
the base type restrictions are not necessary to duplicate when creating
a map of transport actions. Relatedly, the ActionHandler class doesn't
actually need strongly typed action type and classes since they are lost
when shoved into the node client map. This commit removes these type
restrictions and generic parameters.
This makes using usesDefaultDistribution in our test setup for explicit by requiring a reason why it's needed.
This is helpful as part of revisiting the need for all those usages in our code base.
This field is only used (by security) for requests, having it in responses is redundant.
Also, we have a couple of responses that are singletons/quasi-enums where setting the value
needlessly might introduce some strange contention even though it's a plain store.
This isn't just a cosmetic change. It makes it clear at compile time that each response instance
is exclusively defined by the bytes that it is read from. This makes it easier to reason about the
validity of suggested optimizations like https://github.com/elastic/elasticsearch/pull/120010
Makes the execution and use of enrich policies project-aware.
Note: this does not make the enrich cache project-aware. That is to be
handled in a follow-up PR.
* Allow setting the `type` in the reroute processor
This allows configuring the `type` from within the ingest `reroute` processor. Similar to `dataset`
and `namespace`, the type defaults to the value extracted from the index name. This means that
documents sent to `logs-mysql.access.default` will have a default value of `logs` for the type.
Resolves#121553
* Update docs/changelog/122409.yaml
* Exhaustive testParseFractionalNumber
* Refactor: encapsulate ByteSizeUnit constructor
* Refactor: store size in bytes
* Support up to 2 decimals in parsed ByteSizeValue
* Fix test for rounding up with no warnings
* ByteSizeUnit transport changes
* Update docs/changelog/120142.yaml
* Changelog details and impact
* Fix change log breaking.area
* Address PR comments
This updates the gradle wrapper to 8.12
We addressed deprecation warnings due to the update that includes:
- Fix change in TestOutputEvent api
- Fix deprecation in groovy syntax
- Use latest ospackage plugin containing our fix
- Remove project usages at execution time
- Fix deprecated project references in repository-old-versions
If datastream rollover on write flag is set in cluster state, resolve pipelines from templates rather than from metadata. This fixes the following bug: when a pipeline reroutes every document to another index, and rollover is called with lazy=true (setting the rollover on write flag), changes to the pipeline do not go into effect, because the lack of writes means the data stream never rolls over and pipelines in metadata are not updated. The fix is to resolve pipelines from templates if the lazy rollover flag is set. To improve efficiency we only resolve pipelines once per index in the bulk request, caching the value, and reusing for other requests to the same index.
Fixes: #112781
The libs projects are configured to all begin with `elasticsearch-`.
While this is desireable for the artifacts to contain this consistent
prefix, it means the project names don't match up with their
directories. Additionally, it creates complexities for subproject naming
that must be manually adjusted.
This commit adjusts the project names for those under libs to be their
directory names. The resulting artifacts for these libs are kept the
same, all beginning with `elasticsearch-`.
It is English in the docs, so this fixes the code to match the docs. Note that this really impacts Elasticsearch when run on JDK 23 with the CLDR locale database, as in the COMPAT database pre-23, root and en are essentially the same.
Regardless of JDK version, ES should always use CLDR locale database from 9.0.0.
This also removes IsoCalendarDataProvider used to override week-date calculations for the root locale only.
Replaces the somewhat-awkward API on `ClusterAdminClient` for
manipulating ingest pipelines with some test-specific utilities that are
easier to use.
Relates #107984 in that this change massively reduces the noise that
would otherwise result from removing the trappy timeouts in these APIs.
This PR starts tracking raw ingest and storage size separately for updates by document.
This is done capturing the ingest size when initially parsing the update, and storage size when
parsing the final, merged document.
Additionally this renames DocumentSizeObserver to XContentParserDecorator / XContentMeteringParserDecorator
for better reasoning about the code. More renaming will have to follow.
---------
Co-authored-by: Przemyslaw Gomulka <przemyslaw.gomulka@elastic.co>
Deprecates for removal the following methods from `ClusterAdminClient`:
- `prepareSearchShards`
- `preparePutStoredScript`
- `prepareDeleteStoredScript`
- `prepareGetStoredScript`
Also replaces all usages of these methods with more suitable test
utilities. This will permit their removal, and the removal of the
corresponding `RequestBuilder` objects, in a followup.
Relates #107984
Reconstruct indices set in BulkRequest constructor so that the correct thread pool can be used for forwarded bulk requests. Before this fix, forwarded bulk requests were always using the system_write thread pool because the indices set was empty.
Fixes issue https://github.com/elastic/elasticsearch/issues/102792
Addresses the performance issue in the date ingest processor where Mustache template evaluation is unnecessarily applied inside a loop. The timezone and locale templates are now evaluated once before the loop, improving efficiency.
closes#110191
---------
Co-authored-by: Joe Gallo <joegallo@gmail.com>
this commit refactors the metering for billing api so that we can hide the implementation details of DocumentSizeObserver creation and adds additional field `originatesFromScript` on IndexRequest
There will no longer need to have a code checking if the request was already parsed in ingest service or updatehelper. This logic will be hidden in the implementation.
in order to decided what logic in to apply when reporting a document size we need to know if an index is a time_series mode. This information is in indexSettings.mode.
Change the ingest byte stats to always be returned
whether or not they have a value of 0. Add human readable
form of byte stats. Update docs to reflect changes.
Add test confirming that pipelines are run after a reroute.
Fix test of two stage reroute. Delete pipelines during teardown
so as to not break other tests using name pipeline name.
Co-authored-by: Joe Gallo <joegallo@gmail.com>
Current ingest byte stat fields could easily be confused.
Add more descriptive name to make it clear that they do not
count all docs processed by the pipeline.
Add ingested_in_bytes and produced_in_bytes stats to pipeline ingest stats.
These track how many bytes are ingested and produced by a given pipeline.
For efficiency, these stats are recorded for the first pipeline to process a
document. Thus, if a pipeline is called as a final pipeline after a default pipeline,
as a pipeline processor, and after a reroute request, a document will not
contribute to the stats for that pipeline. If a given pipeline has 0 bytes recorded
for both of these stats, due to not being the first pipeline to run any doc, these
stats will not appear in the pipeline's entry in ingest stats.