518 Commits

Author SHA1 Message Date
Pete Gillin 1fe3b77a2a
ES-10063 Add multi-project support for more stats APIs (#127650)
* Add multi-project support for more stats APIs

This affects the following APIs:
 - `GET _nodes/stats`:
   - For `indices`, it now prefixes the index name with the project ID (for non-default projects). Previously, it didn't tell you which project an index was in, and it failed if two projects had the same index name.
   - For `ingest`, it now gets the pipeline and processor stats for all projects, and prefixes the pipeline ID with the project ID. Previously, it only got them for the default project.
 - `GET /_cluster/stats`:
   - For `ingest`, it now aggregates the pipeline and processor stats for all projects. Previously, it only got them for the default project.
 - `GET /_info`:
   - For `ingest`, same as for `GET /_nodes/stats`.

This is done by making `IndicesService.stats()` and `IngestService.stats()` include project IDs in the `NodeIndicesStats` and `IngestStats` objects they return, and making those stats objects incorporate the project IDs when converting to XContent.
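
As a rough illustration of the qualification described above (not the actual `NodeIndicesStats` code), the following Python sketch keys stats by a project-qualified index name. The `/` separator, the `<unknown>` placeholder, and all function and parameter names are assumptions made for the example; the commit message does not specify them.

```python
from typing import Dict, Mapping, Optional, Tuple

# An index is identified by (name, uuid), so two projects may reuse a name.
IndexKey = Tuple[str, str]

# Hypothetical separator: the commit message does not say how the prefix is joined.
SEPARATOR = "/"


def qualify_index_names(
    stats_by_index: Mapping[IndexKey, int],
    project_by_index: Optional[Mapping[IndexKey, str]],
) -> Dict[str, int]:
    """Return stats keyed by index name, prefixed with the project ID if known.

    `project_by_index` is None in this sketch for a single-project cluster,
    in which case names are left unqualified.
    """
    if project_by_index is None:
        return {name: value for (name, _uuid), value in stats_by_index.items()}
    qualified: Dict[str, int] = {}
    for key, value in stats_by_index.items():
        name, _uuid = key
        # "<unknown>" is a made-up placeholder for an index with no known project.
        project = project_by_index.get(key, "<unknown>")
        qualified[f"{project}{SEPARATOR}{name}"] = value
    return qualified
```

With this shape, two projects that both contain an index called `logs` produce two distinct keys instead of a collision.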

The transitive callers of these two methods are rather extensive (including all callers to `NodeService.stats()`, all callers of `TransportNodesStatsAction`, and so on). To ensure the change is safe, the callers were all checked out, and they fall into the following cases:
 - The behaviour change is one of the desired enhancements described above.
 - There is no behaviour change because it was getting node stats but neither `indices` nor `ingest` stats were requested.
 - There is no behaviour change because it was getting `indices` and/or `ingest` stats but only using aggregate values.
 - In `MachineLearningUsageTransportAction` and `TransportGetTrainedModelsStatsAction`, the `IngestStats` returned will contain stats from all projects instead of just the default project with this change, but these actions have been changed to filter out the non-default project stats, so this change is a no-op there. (These actions are not MP-ready yet.)
 - `MonitoringService` will be affected, but this is the legacy monitoring module which is not in use anywhere that MP is going to be enabled. (If anything, the behaviour is probably improved by this change, as it will now include project IDs, rather than producing ambiguous unqualified results and failing in the case of duplicates.)

* Update test/external-modules/multi-project/build.gradle

Change suggested by Niels.

Co-authored-by: Niels Bauman <33722607+nielsbauman@users.noreply.github.com>

* Respond to review comments

* fix merge weirdness

* [CI] Auto commit changes from spotless

* Fix test compilation following upstream change to base class

* Update x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/datatiers/DataTierUsageFixtures.java

Co-authored-by: Niels Bauman <33722607+nielsbauman@users.noreply.github.com>

* Make projects-by-index map nullable and omit in single-project; always include project prefix in XContent in multi-project, even if default; also incorporate one other review comment

* Add a TODO

* update IT to reflect changed behaviour

* Switch to using XContent.Params to indicate whether it is multi-project or not

* Refactor NodesStatsMultiProjectIT to common up repeated assertions

* Defer use of ProjectIdResolver in REST handlers to keep tests happy

* Include index UUID in "unknown project" case

* Make the index-to-project map empty rather than null in the BWC deserialization case.

This works out fine, for the reasons given in the comment. As it happens, I'd already forgotten to do the null check in the one place it's actively used.

* remove a TODO that is done, and add a comment

* fix typo

* Get REST YAML tests working with project ID prefix TODO finish this

* As a drive-by, fix and un-suppress one of the health REST tests

* [CI] Auto commit changes from spotless

* TODO ugh

* Experiment with different stashing behaviour

* [CI] Auto commit changes from spotless

* Try a more sensible stash behaviour for assertions

* clarify comment

* Make checkstyle happy

* Make the way `Assertion` works more consistent, and simplify implementation

* [CI] Auto commit changes from spotless

* In RestNodesStatsAction, make the XContent params delegate to channel.request(), which is the value it would have had before this change

---------

Co-authored-by: Niels Bauman <33722607+nielsbauman@users.noreply.github.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-05-21 19:04:22 +01:00
Ryan Ernst a2b4a6f246
Add temporary LegacyActionRequest (#128107)
In order to remove ActionType, ActionRequest will become strongly typed,
referring to the ActionResponse type. As a precursor to that, this
commit adds a LegacyActionRequest which all existing ActionRequest
implementations now inherit from. This will allow adding the
ActionResponse type to ActionRequest in a future commit without
modifying every implementation at once.
2025-05-20 07:09:27 -07:00
Joe Gallo b46bee4e47
Correctly handle non-integers in nested paths in the remove processor (#127006) 2025-04-18 11:46:54 -04:00
Joe Gallo 450516d675
Fix a RemoveProcessor test that never ran (#126464) 2025-04-08 11:21:04 -04:00
Ryan Ernst 991e80d56e
Remove unnecessary generic params from action classes (#126364)
Transport actions have associated request and response classes. However,
the base type restrictions are not necessary to duplicate when creating
a map of transport actions. Relatedly, the ActionHandler class doesn't
actually need strongly typed action type and classes since they are lost
when shoved into the node client map. This commit removes these type
restrictions and generic parameters.
2025-04-07 16:22:56 -07:00
Joe Gallo bead858ccd
Correctly handle nulls in nested paths in the remove processor (#126417) 2025-04-07 16:54:07 -04:00
Joe Gallo 950456d38b
Cleanup community_id processor (#126247) 2025-04-03 19:59:09 -04:00
Armin Braun 50437e79d3
Cleanup missing use of StandardCharsets (#125424)
Random annoyance that I figured I'd just fix globally:
We can do a bit of a cleaner job when doing byte <-> string conversion here and there.
2025-03-21 20:10:15 +01:00
Joe Gallo e210ea87d6
Add an ignoreMissing parameter to IngestDocument's removeField method (#125232) 2025-03-19 16:55:13 -04:00
Rene Groeschke ae569def9c
[Build] Require reason for usesDefaultDistribution (#124707)
This makes using usesDefaultDistribution in our test setup more explicit by requiring a reason why it's needed.
This is helpful as part of revisiting the need for all those usages in our code base.
2025-03-17 08:25:39 +01:00
Armin Braun 4c1c51e870
Remove remoteAddress field from TransportResponse (#120016)
This field is only used (by security) for requests, having it in responses is redundant.
Also, we have a couple of responses that are singletons/quasi-enums where setting the value
needlessly might introduce some strange contention even though it's a plain store.

This isn't just a cosmetic change. It makes it clear at compile time that each response instance
is exclusively defined by the bytes that it is read from. This makes it easier to reason about the
validity of suggested optimizations like https://github.com/elastic/elasticsearch/pull/120010
2025-03-16 19:54:29 +01:00
Joe Gallo 0e87e8454d
DateProcessor refactoring (#124349) 2025-03-08 09:03:50 -05:00
Niels Bauman 20e186a252
Make enrich project-aware (#124099)
Makes the execution and use of enrich policies project-aware.
Note: this does not make the enrich cache project-aware. That is to be
handled in a follow-up PR.
2025-03-06 19:20:46 +01:00
Joe Gallo 54c826532c
Refactor RegisteredDomainProcessorTests (#124175) 2025-03-06 11:01:12 -05:00
Joe Gallo d7b8b728e9
Cleanup RegisteredDomainProcessor (#124123) 2025-03-05 17:05:07 -05:00
Joe Gallo 65a8e778e3
Cleanup RegisteredDomainProcessorTests (#124118) 2025-03-05 13:19:12 -05:00
Tim Vernum 838d8389de Merge main into multi-project 2025-02-19 16:40:34 +11:00
Lee Hinman 2ae80c799d
Allow setting the `type` in the reroute processor (#122409)
* Allow setting the `type` in the reroute processor

This allows configuring the `type` from within the ingest `reroute` processor. Similar to `dataset`
and `namespace`, the type defaults to the value extracted from the index name. This means that
documents sent to `logs-mysql.access-default` will have a default value of `logs` for the type.
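
The fallback can be sketched as follows, assuming the usual `<type>-<dataset>-<namespace>` data stream naming scheme (for example `logs-mysql.access-default`). This is an illustrative Python sketch, not the processor's implementation; `parse_data_stream_name` and `resolve_type` are hypothetical names.

```python
from typing import NamedTuple, Optional


class DataStreamName(NamedTuple):
    type: str
    dataset: str
    namespace: str


def parse_data_stream_name(index_name: str) -> Optional[DataStreamName]:
    """Split `<type>-<dataset>-<namespace>`; return None if the name does not fit."""
    parts = index_name.split("-", 2)
    if len(parts) != 3 or not all(parts):
        return None
    return DataStreamName(*parts)


def resolve_type(index_name: str, configured_type: Optional[str] = None) -> Optional[str]:
    """Prefer an explicitly configured type, else fall back to the index name."""
    if configured_type is not None:
        return configured_type
    parsed = parse_data_stream_name(index_name)
    return parsed.type if parsed else None
```

Here `resolve_type("logs-mysql.access-default")` falls back to the type embedded in the name, while passing a `configured_type` overrides it.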

Resolves #121553

* Update docs/changelog/122409.yaml
2025-02-18 12:38:00 -07:00
Niels Bauman 621a18d947 Merge main into multi-project 2025-01-30 17:26:28 +10:00
Joe Gallo 022b841a45
Optimize IngestCtxMap construction (#120833) 2025-01-27 11:03:06 -05:00
Joe Gallo 5e662c507e
Optimize IngestDocMetadata isAvailable (#120753) 2025-01-24 09:22:21 -05:00
Tim Vernum 552cec7ff0 Merge revision 34059c9dbd into multi-project 2025-01-17 16:32:15 +11:00
Patrick Doyle 34059c9dbd
Limit ByteSizeUnit to 2 decimals (#120142)
* Exhaustive testParseFractionalNumber

* Refactor: encapsulate ByteSizeUnit constructor

* Refactor: store size in bytes

* Support up to 2 decimals in parsed ByteSizeValue

* Fix test for rounding up with no warnings

* ByteSizeUnit transport changes

* Update docs/changelog/120142.yaml

* Changelog details and impact

* Fix change log breaking.area

* Address PR comments
2025-01-16 19:30:23 +00:00
Tim Vernum 4ff691f066 Merge revision 7fb6ca447a into multi-project 2024-12-31 15:41:02 +11:00
Rene Groeschke ba61f8c7f7
Update Gradle wrapper to 8.12 (#118683)
This updates the gradle wrapper to 8.12

We addressed deprecation warnings due to the update that includes:

- Fix change in TestOutputEvent api
- Fix deprecation in groovy syntax
- Use latest ospackage plugin containing our fix
- Remove project usages at execution time
- Fix deprecated project references in repository-old-versions
2024-12-30 15:34:24 +01:00
Niels Bauman bd678373d8 Execute pipelines during indexing in multiple projects (MP-1866)
Allows executing pipelines during indexing and pipeline simulation in
multiple projects.
2024-12-18 12:34:20 +00:00
Parker Timmins 6db39d1765
Resolve pipelines from template if lazy rollover write (#116031)
If datastream rollover on write flag is set in cluster state, resolve pipelines from templates rather than from metadata. This fixes the following bug: when a pipeline reroutes every document to another index, and rollover is called with lazy=true (setting the rollover on write flag), changes to the pipeline do not go into effect, because the lack of writes means the data stream never rolls over and pipelines in metadata are not updated. The fix is to resolve pipelines from templates if the lazy rollover flag is set. To improve efficiency we only resolve pipelines once per index in the bulk request, caching the value, and reusing for other requests to the same index.
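
The once-per-index caching mentioned above might look roughly like this. It is a minimal Python sketch under stated assumptions: `resolve_for_bulk`, the `resolve` callback, and the `(default_pipeline, final_pipeline)` tuple shape are all hypothetical and stand in for whatever the real resolution logic returns.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical shape of a resolution result: (default_pipeline, final_pipeline).
Pipelines = Tuple[str, str]


def resolve_for_bulk(
    target_indices: Iterable[str],
    resolve: Callable[[str], Pipelines],
) -> List[Pipelines]:
    """Resolve pipelines for every request, calling `resolve` once per distinct index."""
    cache: Dict[str, Pipelines] = {}
    resolved: List[Pipelines] = []
    for index in target_indices:
        if index not in cache:
            # The potentially expensive template lookup happens only here.
            cache[index] = resolve(index)
        resolved.append(cache[index])
    return resolved
```

The cache lives only for the duration of one call, which matches the idea of reusing the value within a single bulk request rather than across requests.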

Fixes: #112781
2024-11-01 22:54:55 -05:00
Ryan Ernst e5d5c17c99
Use directory name as project name for libs (#115720)
The libs projects are configured to all begin with `elasticsearch-`.
While this is desirable for the artifacts to contain this consistent
prefix, it means the project names don't match up with their
directories. Additionally, it creates complexities for subproject naming
that must be manually adjusted.

This commit adjusts the project names for those under libs to be their
directory names. The resulting artifacts for these libs are kept the
same, all beginning with `elasticsearch-`.
2024-10-29 13:02:28 -07:00
Moritz Mack 68316f7d17
Remove metering from ingest service to occur afterwards when parsing the final document (#114895) 2024-10-25 11:52:06 -07:00
Pete Gillin 43e5258b3c
Add a `terminate` ingest processor (#114157)
This processor simply causes any remaining processors in the pipeline
to be skipped. It will normally be executed conditionally using the
`if` option. (If this pipeline is being called from another pipeline,
the calling pipeline is *not* terminated.)

For example, this:

```
POST /_ingest/pipeline/_simulate
{
  "pipeline":
  {
    "description": "Appends just 'before' to the steps field if the error field is present, or both 'before' and 'after' if not",
    "processors": [
      {
        "append": {
          "field": "steps",
          "value": "before"
        }
      },
      {
        "terminate": {
          "if": "ctx.error != null"
        }
      },
      {
        "append": {
          "field": "steps",
          "value": "after"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "doc1",
      "_source": {
        "name": "okay",
        "steps": []
      }
    },
    {
      "_index": "index",
      "_id": "doc2",
      "_source": {
        "name": "bad",
        "error": "oh no",
        "steps": []
      }
    }
  ]
}
```

returns something like this:

```
{
  "docs": [
    {
      "doc": {
        "_index": "index",
        "_version": "-3",
        "_id": "doc1",
        "_source": {
          "name": "okay",
          "steps": [
            "before",
            "after"
          ]
        },
        "_ingest": {
          "timestamp": "2024-10-04T16:25:20.448881Z"
        }
      }
    },
    {
      "doc": {
        "_index": "index",
        "_version": "-3",
        "_id": "doc2",
        "_source": {
          "name": "bad",
          "error": "oh no",
          "steps": [
            "before"
          ]
        },
        "_ingest": {
          "timestamp": "2024-10-04T16:25:20.448932Z"
        }
      }
    }
  ]
}
```
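
The control flow in the example above can be modelled in a few lines. The following Python sketch is not the processor's implementation; it only mimics the behaviour described here, and the helper names (`append`, `terminate`, `run_pipeline`) are made up for the illustration.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]
# A processor mutates the document and returns False to stop the pipeline.
Processor = Callable[[Doc], bool]


def append(field: str, value: Any) -> Processor:
    def run(doc: Doc) -> bool:
        doc.setdefault(field, []).append(value)
        return True
    return run


def terminate(condition: Callable[[Doc], bool]) -> Processor:
    def run(doc: Doc) -> bool:
        # Stop only when the condition holds, mirroring the `if` option.
        return not condition(doc)
    return run


def run_pipeline(processors: List[Processor], doc: Doc) -> Doc:
    for processor in processors:
        if not processor(doc):
            break  # skip all remaining processors in this pipeline
    return doc


pipeline = [
    append("steps", "before"),
    terminate(lambda doc: doc.get("error") is not None),
    append("steps", "after"),
]
```

Running `run_pipeline(pipeline, {"name": "bad", "error": "oh no", "steps": []})` leaves `steps` as `["before"]`, while a document without `error` ends up with both `before` and `after`.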
2024-10-08 17:39:53 +01:00
Simon Cooper 31d50eed0f
Update 9.0 with various locale changes from 8.x (#113787) (#113870)
Forward-port changes from #113787, and update the docs with similar information to #113587
2024-10-02 11:41:33 +01:00
Simon Cooper d582db22b7
Change default locale of date processors to ENGLISH (#112796)
It is English in the docs, so this fixes the code to match the docs. Note that this really impacts Elasticsearch when run on JDK 23 with the CLDR locale database, as in the COMPAT database pre-23, root and en are essentially the same.
2024-09-24 09:54:39 +01:00
Simon Cooper f9aa6f40cd
Always use CLDR locale on ES v9 (#113184)
Regardless of JDK version, ES should always use CLDR locale database from 9.0.0.
This also removes IsoCalendarDataProvider used to override week-date calculations for the root locale only.
2024-09-23 11:05:08 +01:00
Mark Vieira a59c182f9f
Add AGPLv3 as a supported license 2024-09-13 15:29:46 -07:00
David Turner 8607d40679
Introduce test utils for ingest pipelines (#112733)
Replaces the somewhat-awkward API on `ClusterAdminClient` for
manipulating ingest pipelines with some test-specific utilities that are
easier to use.

Relates #107984 in that this change massively reduces the noise that
would otherwise result from removing the trappy timeouts in these APIs.
2024-09-12 08:22:50 +01:00
Mark Vieira 4ce661cc48
Bump Elasticsearch version to 9.0.0 (#112570) 2024-09-11 09:40:11 -07:00
Mark Vieira 24f33e95e8
Ensure rest compatibility tests are run when appropriate (#112526) 2024-09-05 08:22:48 -07:00
Panos Koutsovasilis 29453cb2ce
fix: support all allowed protocol numbers (#111528)
* fix(CommunityIdProcessor): support all allowed protocol numbers

* fix(CommunityIdProcessor): update documentation
2024-08-26 08:37:40 +03:00
Patrick Doyle 35a375329a
Move Guice to org.elasticsearch.injection.guice (#111723)
* Move files and fix imports & module exports
* Other consequences of moving Guice
2024-08-12 10:47:46 -04:00
Moritz Mack 6ca3ac253a
Track raw ingest and storage size separately to support updates by doc (#111179)
This PR starts tracking raw ingest and storage size separately for updates by document.
This is done by capturing the ingest size when initially parsing the update, and storage size when
parsing the final, merged document.

Additionally this renames DocumentSizeObserver to XContentParserDecorator / XContentMeteringParserDecorator
for better reasoning about the code. More renaming will have to follow.
---------

Co-authored-by: Przemyslaw Gomulka <przemyslaw.gomulka@elastic.co>
2024-08-02 09:26:37 +02:00
David Turner b8af2a066e
Remove usages of more test-only request builders (#111400)
Deprecates for removal the following methods from `ClusterAdminClient`:

- `prepareSearchShards`
- `preparePutStoredScript`
- `prepareDeleteStoredScript`
- `prepareGetStoredScript`

Also replaces all usages of these methods with more suitable test
utilities. This will permit their removal, and the removal of the
corresponding `RequestBuilder` objects, in a followup.

Relates #107984
2024-07-30 07:33:19 +01:00
Ankita Kumar 5761c4afb5
Reconstruct set of indices in BulkRequest (#110672)
Reconstruct indices set in BulkRequest constructor so that the correct thread pool can be used for forwarded bulk requests. Before this fix, forwarded bulk requests were always using the system_write thread pool because the indices set was empty.

Fixes issue https://github.com/elastic/elasticsearch/issues/102792
2024-07-25 20:30:55 -04:00
kanoshiou 9fbdfcf650
Fix unnecessary mustache template evaluation (#110986)
Addresses the performance issue in the date ingest processor where Mustache template evaluation is unnecessarily applied inside a loop. The timezone and locale templates are now evaluated once before the loop, improving efficiency.
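
The shape of the fix can be sketched as below. This is an illustrative Python sketch, not the date processor's code: `parse_with_formats` is a hypothetical name, and a plain callable stands in for a compiled Mustache template.

```python
from datetime import datetime
from typing import Callable, Dict, List, Tuple

# Stand-in for a compiled Mustache template evaluated against a document.
Template = Callable[[Dict[str, str]], str]


def parse_with_formats(
    value: str,
    formats: List[str],
    timezone_template: Template,
    doc: Dict[str, str],
) -> Tuple[datetime, str]:
    """Try each format in turn, evaluating the template once up front."""
    # Evaluated once here, not once per loop iteration.
    timezone = timezone_template(doc)
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt), timezone
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError(f"unable to parse date [{value}]")
```

Because the template result does not depend on which format is being tried, hoisting it out of the loop changes only the amount of work done, not the result.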

closes #110191
---------
Co-authored-by: Joe Gallo <joegallo@gmail.com>
2024-07-22 15:42:58 -05:00
Przemyslaw Gomulka cf03c66c1f
Infrastructure to meter updates by script for ra-s nontimeseries (#108910)
This commit refactors the metering for the billing API so that we can hide the implementation details of DocumentSizeObserver creation, and adds an additional field `originatesFromScript` on IndexRequest.
There will no longer be a need for code that checks whether the request was already parsed in the ingest service or UpdateHelper. This logic will be hidden in the implementation.
2024-07-11 10:49:32 +02:00
Przemyslaw Gomulka b80b739993
Provide document size reporter with MapperService (#109794)
Instead of indexMode, a mapper service is necessary to reliably determine if an index is a time series data stream.
2024-06-18 11:40:56 +02:00
Przemyslaw Gomulka 44ae540fd7
Provide the DocumentSizeReporter with index mode (#108947)
In order to decide what logic to apply when reporting a document size, we need to know whether an index is in time_series mode. This information is in indexSettings.mode.
2024-06-10 11:48:22 +02:00
Parker Timmins 3662d12c9f
Return ingest byte stats even when 0-valued (#108796)
Change the ingest byte stats to always be returned
whether or not they have a value of 0. Add human readable
form of byte stats. Update docs to reflect changes.
2024-05-20 10:52:16 -05:00
Parker Timmins c5a3342449
Test pipeline run after reroute (#108693)
Add test confirming that pipelines are run after a reroute.
Fix test of two stage reroute. Delete pipelines during teardown
so as to not break other tests using the same pipeline name.

Co-authored-by: Joe Gallo <joegallo@gmail.com>
2024-05-20 10:02:04 -05:00
Parker Timmins 298c6492a5
Make ingest byte stat names more descriptive (#108786)
Current ingest byte stat fields could easily be confused.
Add more descriptive name to make it clear that they do not
count all docs processed by the pipeline.
2024-05-17 12:03:42 -05:00
Larisa Motova a01baa3d79
Include doc size info in ingest stats (#107240)
Add ingested_in_bytes and produced_in_bytes stats to pipeline ingest stats.
These track how many bytes are ingested and produced by a given pipeline.
For efficiency, these stats are recorded for the first pipeline to process a
document. Thus, if a pipeline is called as a final pipeline after a default pipeline,
as a pipeline processor, or after a reroute request, a document will not
contribute to the stats for that pipeline. If a given pipeline has 0 bytes recorded
for both of these stats, due to not being the first pipeline to run any doc, these
stats will not appear in the pipeline's entry in ingest stats.
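
The attribution rule can be sketched as follows. This is a minimal Python sketch, not the ingest service's implementation; the `IngestByteStats` class and its methods are hypothetical, and only the two stat names come from this commit message.

```python
from collections import defaultdict
from typing import Dict, List


class IngestByteStats:
    """Attributes each document's bytes to the first pipeline that processed it."""

    def __init__(self) -> None:
        self._ingested: Dict[str, int] = defaultdict(int)
        self._produced: Dict[str, int] = defaultdict(int)

    def record(self, pipelines_run: List[str], bytes_in: int, bytes_out: int) -> None:
        if not pipelines_run:
            return
        # Only the first pipeline is credited; later ones record nothing.
        first = pipelines_run[0]
        self._ingested[first] += bytes_in
        self._produced[first] += bytes_out

    def report(self, pipeline: str) -> Dict[str, int]:
        ingested = self._ingested.get(pipeline, 0)
        produced = self._produced.get(pipeline, 0)
        if ingested == 0 and produced == 0:
            return {}  # both stats are omitted when nothing was recorded
        return {"ingested_in_bytes": ingested, "produced_in_bytes": produced}
```

For a document that runs through `default` and then `final`, only `default` reports byte stats; `final` reports none for that document.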
2024-05-17 08:53:24 -05:00