The migrate to data tiers routing API required ILM to be stopped. This
is fine for "live" runs, but for dry runs this isn't a requirement. This
change allows dry runs of the API to proceed irrespective of the ILM
status.
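For illustration, a dry run can now be issued while ILM is running, e.g. (a sketch, not taken from the PR; credentials and the `node_attribute` value are placeholders):
```
# dry_run reports the changes the migration would make without applying them;
# with this change it no longer requires POST /_ilm/stop first
curl -s -XPOST -HContent-Type:application/json -uelastic:password \
  'localhost:9200/_ilm/migrate_to_data_tiers?dry_run=true' -d'{
  "node_attribute": "data"
}'
```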
- Updates checkstyle to 8.45.1
- This update was triggered by us running into checkstyle/checkstyle#9897. We do not update to the latest 9.x release yet as we see a performance decrease when running precommit with it, likely related to checkstyle/checkstyle#10934.
Since 8.11, Lucene performs index checking concurrently,
using a disposable fixed thread pool executor by default.
Set the thread count to 1 to keep the prior behavior
(the check is executed in the single caller thread).
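For reference, a minimal sketch of the equivalent single-threaded invocation of Lucene's standalone `CheckIndex` tool (the jar name and index path are placeholders, and the `-threadCount` flag is my understanding of the option added alongside concurrent checking):
```
# force a single checking thread, matching the pre-8.11 behavior;
# omitting -threadCount lets Lucene choose a concurrent default
java -cp lucene-core-8.11.0.jar org.apache.lucene.index.CheckIndex \
  /path/to/index -threadCount 1
```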
This fixes the migrate to data tiers routing API to take into account
the scenario where the node attribute configuration for an index is more
accurate than the existing `_tier_preference` configuration.
Previously we would simply remove the node attributes routing if there
was a `_tier_preference` configured for the index.
With this commit, we'll check whether either the `require.data` or
`include.data` custom routings are colder than the existing `_tier_preference`
configuration (i.e. `cold` vs `data_warm,data_hot`) and update the tier
routing accordingly.
e.g.
{
  "index.routing.allocation.require.data": "warm",
  "index.routing.allocation.include.data": "cold",
  "index.routing.allocation.include._tier_preference": "data_hot"
}
will be migrated to:
{
  "index.routing.allocation.include._tier_preference": "data_cold,data_warm,data_hot"
}
This also removes the existing invariant that had the `require.data`
configuration take precedence over a possible `include.data`
configuration, and will now migrate the coldest configuration to the
corresponding `_tier_preference`.
e.g.
{
  "index.routing.allocation.require.data": "warm",
  "index.routing.allocation.include.data": "cold"
}
will be migrated to:
{
  "index.routing.allocation.include._tier_preference": "data_cold,data_warm,data_hot"
}
Previously, within tests, the file "roles.yml" (that is used to define
security roles in a cluster) would need to be configured using
`extraConfigFile`. This is effective, but means that there can only be
a single source of security roles for the testcluster.
This change introduces an explicit "securityRoles" setting in
testclusters that will concatenate the provided files into a single
"roles.yml" in the config directory. This makes it possible for
testclusters itself to define standard roles as well as having each
test define additional roles it may need.
Relates: #81400
All three template types (legacy templates, composable index templates
and component templates) are stored in cluster state metadata (in fields
"templates", "index_template" and "component_template"). This cluster
state is readable (via GET /_cluster/state) for users who have the
monitor privilege at the cluster level. However, calling the explicit
read endpoints for these templates required the manage_index_templates
privilege. This change grants access to the template specific retrieval
APIs for all users (or API Keys) with the cluster monitor privilege so
that they can make use of these fit-for-purpose APIs instead of parsing
data directly from cluster metadata.
Relates: https://github.com/elastic/beats/issues/29554
Relates: #78832
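For example, all three read endpoints should now succeed for a user holding only the cluster monitor privilege (a sketch; `monitor_user:password` is a placeholder):
```
# legacy, composable, and component template retrieval APIs respectively
curl -s -umonitor_user:password 'localhost:9200/_template?pretty'
curl -s -umonitor_user:password 'localhost:9200/_index_template?pretty'
curl -s -umonitor_user:password 'localhost:9200/_component_template?pretty'
```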
* Script: track this pointer capture from blocks within lambdas
If a lambda contains a block that calls into a user function,
the painless compiler was not tracking that the lambda needs
to capture the `this` pointer.
Scripts that attempted to use a lambda that calls a user
function from inside a block would trigger an
`illegal_state_exception`:
> no 'this' pointer within static method
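A minimal sketch of a script with that shape, runnable via the Painless execute API (not the original reproducer):
```
# the lambda passed to removeIf contains a block that calls the
# user function isEven, which previously lost the `this` capture
curl -s -XPOST -HContent-Type:application/json 'localhost:9200/_scripts/painless/_execute' -d'{
  "script": {
    "source": "boolean isEven(int i) { return i % 2 == 0; } List l = [1, 2, 3, 4]; l.removeIf(x -> { return isEven(x); }); return l.toString();"
  }
}'
```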
`BlockScope` now forwards `setUsesInstanceMethod` calls to
its parent, which may be a `LambdaScope`.
`LambdaScope` will also forward `setUsesInstanceMethod` to
its parent after setting its own `usesInstanceMethod` flag,
propagating `this` pointer capture in the case of nested lambdas.
Fixes: #82224
As outlined in elastic/elasticsearch#81604, including the `searchable_snapshot` action in both the hot and cold phases can result in indices not automatically migrating to the cold tier during the cold phase.
This adds a related warning.
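For reference, the kind of policy the warning is aimed at looks like this (a sketch; policy and repository names are placeholders):
```
# searchable_snapshot in both hot and cold: the index can be mounted during
# the hot phase and then fail to migrate to the cold tier automatically
curl -s -XPUT -HContent-Type:application/json 'localhost:9200/_ilm/policy/my-policy' -d'{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_primary_shard_size": "50gb" },
          "searchable_snapshot": { "snapshot_repository": "my-repo" }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "searchable_snapshot": { "snapshot_repository": "my-repo" }
        }
      }
    }
  }
}'
```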
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
This change adds infrastructure for GeoShape, making it accessible via the new scripting fields API.
It does not add any methods beyond get at this point in time, since what makes
sense needs additional thought/discussion, similar to GeoPoints. Note that because GeoShape does
not support XContent, this is just a skeleton that currently supports getScriptDocValues.
The PR re-enables a test that has been muted for a while.
A few improvements have been made that are suspected
to allow this test to pass now. Also, the timeout has been
bumped up to help with the stability of this test.
Closes #54096
This commit attempts to fix a set of Watcher tests that can
fail due to unexpected execution of Watches after Watcher has been stopped.
The theory here is that a Watch can be queued but not fully executed
when Watcher is shut down; the test does some cleanup, then the
queued Watch finishes execution and causes some additional cleanup
to fail.
The change here ensures that when Watcher is stopped from AbstractWatcherIntegrationTestCase
that it will also wait until there are no more current Watches executing.
Closes #66495
The error that caused this test to be muted is
java.lang.AssertionError: Count is 2 hits but 1 was expected.
from: assertHitCount(searchResponse, 1L);
when searching for watch records.
It is believed this is due to the cron scheduler executing a
second watch. The change here is to use a long interval
schedule instead of the cron scheduler.
Fixes #67908
Changes:
* Notes that the query string query's `default_field` and `fields` parameters support wildcards, as in the example below.
* Adds an xref from the `default_field` parameter to the `index.query.default_field` docs.
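A sketch of the wildcard usage in `fields` (index and field names are made up):
```
# "user.*" expands to all field names matching the pattern
curl -s -XGET -HContent-Type:application/json 'localhost:9200/my-index/_search?pretty' -d'{
  "query": {
    "query_string": {
      "fields": [ "user.*" ],
      "query": "kimchy"
    }
  }
}'
```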
The allocate action could specify only `number_of_replicas` (without
any routing configuration) but failed if it attempted to only specify
`total_shards_per_node`.
This fixes the action to allow specifying only `total_shards_per_node`.
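With the fix, a policy like the following sketch is accepted (the policy name and values are illustrative):
```
# allocate specifies only total_shards_per_node, with no number_of_replicas
# and no include/exclude/require routing configuration
curl -s -XPUT -HContent-Type:application/json 'localhost:9200/_ilm/policy/shards-per-node' -d'{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": { "total_shards_per_node": 1 }
        }
      }
    }
  }
}'
```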
We only have a single entry in the map nowadays, so the map is redundant
and has become a performance issue now that we do mapping deduplication
when building `Metadata`. Calls to this getter make up 5%+ of the
master thread runtime in the many-shards benchmarks because of the expensive
iterator setup.
Unfortunately, it wasn't quite as easy to adjust the wire format here,
since we still need to communicate with older-version nodes that will
send a map diff, so I left that as a TODO; it's low-impact/priority
and this change fixes the performance issue for now.
Today we allocate a contiguous chunk of memory for the global metadata
each time we write it to disk. The size of this chunk is unbounded and
in practice it can be pretty large. This commit splits the metadata
document up into pages (1MB by default) that are streamed to disk at
write time, bounding the memory usage of cluster state persistence.
Since the memory usage is now bounded we can allocate a single buffer up
front and re-use it for every write.
As part of the effort of making the JDBC driver self-sufficient, remove the
ES lib geo dependencies without any replacement.
Currently, the JDBC driver takes the WKT text and instantiates a geo
object based on the ES lib geo.
Moving forward, the driver will return the WKT string representation
without any conversion, letting the user pick the desired geo library.
That can be the ES lib geo, JTS, Spatial4j, or others.
Note this is a breaking change.
Relates #80277
Adding the mapper extras plugin to avoid endless warnings about a
missing scaled_float mapper in these tests.
Also, removed some unrelated dead code that wasn't really worth a separate PR.
For string lists there is no need to parse an existing list after serializing
it. Also, if there are no fallback settings and/or validation and just a constant
default, we can pass through that default outright instead of serializing and
deserializing it to make a copy.
This is motivated by finding lots of duplicate strings on heap for the default
query fields list on data nodes holding many indices with Beats mappings which
I tracked down to the fact that we are not passing through the list containing
deduplicated strings from the `Settings` instance.
As part of the effort of making the JDBC driver self-sufficient, remove the
ES lib core dependencies by lightly cloning the couple of utility classes
needed (and their dependencies).
In addition, provide a shim layer that adapts the sql-proto Protocol
constants class by introducing a sql-action Protocol class, which is ES
aware.
To avoid a name collision, the sql-proto class has been renamed to CoreProtocol,
which is used at the driver level, while Protocol is used within ES.
Relates #82077
Use a string intern cache when constructing discovery node instances.
These started showing up in profiling here and there, likely
because we're now interning more strings via settings string
interning. Using the deduplicator makes them disappear from profiling again
for the most part.
Also, fix the fact that we would read a mutable `attributes` map (which could
escape via a getter) if there are any attributes, and intern the attribute keys and values
from the wire as well, because it's pretty much free to do so now anyway.
In all reported cases it appears that the found count is > expected
count. It is suspected that there are some Watches from previous tests
still pending in the .triggered_watches index causing extra Watches to
be executed. The change here is to ensure that .triggered_watches is
removed prior to the test run, ensuring a clean slate. Closes #67729
The assertion that is moved here was added in #77674.
The problem with it is that when running tests that inject
random IO exceptions on the `channel.position()` call,
the assertion actually has a side effect.
If the exception it triggers as a side effect is not caught and
the translog fails as a result, tests around the safety of handling
IOExceptions in this code fail. Unfortunately, the assertion still
has a side effect after this change, but its side effect is handled
like an expected IOException.
In the index permission block of a role descriptor, the "field_security"
field is an object with this format: "field_security": {
"grant" : [ "field-1", "field-2", "more-fields-*" ], "except" : [
"more-field-secret-*" ] } The docs incorrectly stated that
"field_security" was a list, and if you provided a list the parser would
fail with a message that incorrectly stated that START_ARRAY was an
acceptable token. These have both been fixed. While reviewing the test
cases for RoleDescriptor, I also introduced more randomisation to
increase the overall coverage of features and scenarios.
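For reference, a role using the corrected object format can be created like this (a sketch; role, index, and credential values are placeholders, and the except pattern is narrowed so it stays a subset of the granted fields):
```
# field_security is an object with grant/except arrays, not a list itself;
# except entries must be covered by the grant patterns
curl -s -XPUT -HContent-Type:application/json -uelastic:password 'localhost:9200/_security/role/my_role' -d'{
  "indices": [
    {
      "names": [ "events-*" ],
      "privileges": [ "read" ],
      "field_security": {
        "grant": [ "field-1", "field-2", "more-fields-*" ],
        "except": [ "more-fields-secret-*" ]
      }
    }
  ]
}'
```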
* Remove sql-action dependency from sql-cli
The CLI only uses the basic formatter in sql-action, without touching
the serialization or response items from it.
Extract just the formatting bits into sql-proto (used already) and keep
the serialization bits inside sql-action.
This should make the CLI jar significantly smaller since all the server
dependencies are removed.
Fix #82076
Async search reads aggregation results on many threads. It's quite
possible for it to concurrently serialize an aggregation and render
its XContent. But the cardinality agg's results were not thread safe!
They reuse non-thread safe shard constructs used to collect the agg.
It's fine for the collection side not to be thread safe. But not
results.
Anyway! This would sometimes cause the async search index to contain
invalid results which would fail to deserialize. This seems to happen
frequently enough to some folks that it makes cardinality totally
unusable with async search. So far as I can tell you have to create a
race on the iterator to make that happen. This is enough:
```
curl -XDELETE -uelastic:password localhost:9200/test
echo
for i in {1..100}; do
rm -f /tmp/bulk
printf "%03d: " $i
for j in {1..10000}; do
echo '{"index": {}}' >> /tmp/bulk
echo '{"i": '$i', "j": '$j'}' >> /tmp/bulk
done
curl -s -XPOST -HContent-Type:application/json -uelastic:password localhost:9200/test/_bulk?pretty --data-binary @/tmp/bulk | grep error
done
while true; do
id=$(curl -s -XPOST -HContent-Type:application/json -uelastic:password 'localhost:9200/test/_async_search?pretty&request_cache=false&wait_for_completion_timeout=10ms' -d'{
"size": 0,
"aggs": {
"i": {
"terms": {
"field": "i"
},
"aggs": {
"j": {
"cardinality": {
"field": "j"
}
}
}
}
}
}' | jq -r .id)
while curl -s -HContent-Type:application/json -uelastic:password localhost:9200/_async_search/$id?pretty | tee out | grep '"is_running" : true'; do
cat out
done
cat out
sleep 1
curl --fail-with-body -s -HContent-Type:application/json -uelastic:password localhost:9200/_async_search/$id?pretty || break
done
```
Run that without this PR and it'll break with a message about being
unable to deserialize stuff from the index. It'll give a 400 error too
which is totally bogus. On my laptop it takes less than ten iterations
of the loop.
So this PR fixes it! It removes the non-thread safe stuff from the
cardinality results. It also adds a half dozen extra unit tests that'll
be run for hundreds of objects which should catch similar sorts of
possible errors.
Introduces a new series of node settings
`xpack.security.authc.domains.<domain_name>.realms: <realm_name_list>`.
The setting sets the `domain` property on `Realm` and `RealmConfig`
instances. The domain property ought to be subsequently used in order to
determine whether two identical `usernames` represent the same persona, and hence
can share ownership (of profiles, keys, tokens, scrolls), even though
they authenticated via different realms (but which are associated under
the same domain).
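As a sketch, a domain grouping two realms could be configured like this (the domain and realm names are hypothetical, and the realms must already be defined):
```
# identical usernames authenticating through ldap1 or saml1 would then be
# treated as the same persona
cat <<'EOF' >> config/elasticsearch.yml
xpack.security.authc.domains.my_domain.realms: [ "ldap1", "saml1" ]
EOF
```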
We (mostly I) were initially advocating for the auto-generated files to
use unique names (the name containing a timestamp particle), in order to
avoid subsequent invocations of the config step conflicting with
each other. Moreover, I was hoping that these files would not have to be
handled directly by admins (that the enrollment process was to be used).
However, experience proved us otherwise: admins have to manipulate these
files, and unique configuration names are hard to deal with in scripts
and docs, so this PR is all about using a fixed name for all the
generated files. _Labeling as a bug fix because the feedback is that it
very negatively impacts usability._ Closes
https://github.com/elastic/elasticsearch/issues/81057