This moves half of the remaining centralized expression mapper logic
into the individual expressions. This is how all but 3 of the remaining
expressions work. Let's try to be fully consistent.
"elser" is an alias for "elasticsearch", and "sagemaker" is an alias for
"amazon_sagemaker".
Users can continue to create and use providers by their alias.
Elasticsearch will continue to support the alias when it reads the
configuration from the internal index.
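The alias behavior described above can be pictured as a small lookup from alias to canonical service name. This is only a sketch: `ServiceAliases` and `canonicalName` are hypothetical names chosen for illustration and are not taken from the Elasticsearch code base.

```java
import java.util.Map;

// Hypothetical sketch: class and method names are illustrative, not Elasticsearch APIs.
final class ServiceAliases {
    // Alias -> canonical service name, as listed in the description above.
    private static final Map<String, String> ALIASES = Map.of(
        "elser", "elasticsearch",
        "sagemaker", "amazon_sagemaker"
    );

    private ServiceAliases() {}

    // Returns the canonical name for an alias, or the input unchanged if it is not an alias.
    static String canonicalName(String service) {
        return ALIASES.getOrDefault(service, service);
    }

    public static void main(String[] args) {
        System.out.println(canonicalName("elser"));
        System.out.println(canonicalName("amazon_sagemaker"));
    }
}
```

Resolving at read time like this is one way to keep existing configurations that still carry the alias working without rewriting them.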
Speeds up the semantic_text tests by using a test inference plugin. That
skips downloading the normal inference setup.
Closes #128511
Closes #128513
Closes #128571
Closes #128572
Closes #128573
Closes #128574
This change skips indexing points for the seq_no field in tsdb and
logsdb indices to reduce storage requirements and improve indexing
throughput. Although this optimization could be applied to all new
indices, it is limited to tsdb and logsdb, where seq_no usage is
expected to be limited and storage requirements are more critical.
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
SageMaker now supports Completion and Chat Completion using the OpenAI
interfaces.
Additionally:
- Fixed a bug related to nullable timeouts; the timeout now defaults to 30s
- Exposed existing OpenAI request/response parsing logic for reuse
Spell out that multiple completions of a `SubscribableListener` race to
be chosen as the winner that is passed to subscribed listeners, such
that the subscribed listeners all fire at most once.
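The "first completion wins" contract can be illustrated with a deliberately simplified stand-in. `FirstWinsListener` below is hypothetical and is not the real `SubscribableListener`: it ignores failures, executors and timeouts, and only shows that later completions are dropped and each subscriber is notified exactly once with the winning value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Simplified, hypothetical stand-in for illustration only (not SubscribableListener).
final class FirstWinsListener<T> {
    private final Object lock = new Object();
    private boolean done;
    private T winner;
    private List<Consumer<T>> pending = new ArrayList<>();

    // Returns true only for the completion that wins the race; later ones are ignored.
    boolean complete(T value) {
        List<Consumer<T>> toNotify;
        synchronized (lock) {
            if (done) {
                return false;
            }
            done = true;
            winner = value;
            toNotify = pending;
            pending = null; // no further subscribers are queued after completion
        }
        // Notify outside the lock so subscribers cannot block other callers.
        for (Consumer<T> subscriber : toNotify) {
            subscriber.accept(value);
        }
        return true;
    }

    // Queues the subscriber if not yet complete, otherwise notifies it immediately.
    void subscribe(Consumer<T> subscriber) {
        synchronized (lock) {
            if (!done) {
                pending.add(subscriber);
                return;
            }
        }
        subscriber.accept(winner);
    }
}
```

In this sketch a subscriber added before completion and one added after both observe the same winning value, and neither is called a second time when a losing completion arrives.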
This adds a new logical optimization rule to purge a Join in case the
merge key(s) are null. The null detection is based on recognizing a tree
pattern where the join sits atop a project and/or eval (possibly a few
nodes deep) that contains a reference to a `null`, and that reference
matches the join key.
It works at the coordinator planning level, but it's most useful locally,
after `null`s are inserted into the plan when missing fields are detected.
The Join is substituted with a projection with the same attributes as
the join, atop an eval with all of the join's right fields aliased to null.
Closes #125577.
Change LuceneSyntheticSourceChangesSnapshot to force sequential stored field reading when `index.codec` is `best_compression`.
In CCR benchmarks I see that we relatively often spend a lot of time decompressing the same stored field block over and over again when the doc ids are not dense. When a seqno range is requested, the corresponding doc id list is likely to contain gaps. However, most docids are monotonically increasing, so not reading sequentially harms performance. The reason we currently don't load sequentially is the logic in `StoredFieldLoader#hasSequentialDocs(...)`, which requires all requested docids to be monotonically increasing with no gaps. In the case of `LuceneSyntheticSourceChangesSnapshot` with stored field best compression, that is too conservative. In practice, we end up decompressing a stored field block for each docid we need to synthesize source for recovery.
I think it makes sense to do sequential reading in this case, given that the requested doc ids are very likely to contain many monotonically increasing ranges. Note that the requested docids are always sorted in ascending order (this happens in `LuceneSyntheticSourceChangesSnapshot#transformScoreDocsToRecords(...)`).
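The difference between the strict check and the relaxed decision can be sketched as follows. This is not the actual `StoredFieldLoader` or `LuceneSyntheticSourceChangesSnapshot` code: `SequentialReadSketch`, `isDense` and `useSequentialReader` are hypothetical names, and the sketch assumes the doc ids are distinct and already sorted in ascending order.

```java
// Hypothetical sketch; names are illustrative and not taken from Elasticsearch.
final class SequentialReadSketch {
    private SequentialReadSketch() {}

    // Strict check: distinct, ascending doc ids form a single run with no gaps.
    static boolean isDense(int[] sortedDocIds) {
        if (sortedDocIds.length == 0) {
            return false;
        }
        int first = sortedDocIds[0];
        int last = sortedDocIds[sortedDocIds.length - 1];
        return last - first == sortedDocIds.length - 1;
    }

    // Relaxed decision: with best_compression, read sequentially even when gaps exist,
    // so a compressed block is not decompressed again for each doc id inside it.
    static boolean useSequentialReader(int[] sortedDocIds, boolean bestCompression) {
        return bestCompression || isDense(sortedDocIds);
    }

    public static void main(String[] args) {
        int[] withGap = {3, 5, 6};
        System.out.println(useSequentialReader(withGap, false));
        System.out.println(useSequentialReader(withGap, true));
    }
}
```

Under these assumptions a doc id list with a single gap fails the strict check, but still gets a sequential reader when best compression is in use.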
This commit is a minor refactoring of several tests to use TestUtil.alwaysKnnVectorsFormat where possible, rather than extending Lucene101Codec. The alwaysKnnVectorsFormat static method returns an AssertingCodec that delegates to the default codec for everything except KnnVectorsFormat.
The primary motivation for this change is to avoid extending a version-specific Lucene codec that must be updated whenever Lucene updates its codec (as can be seen in the lucene_snapshot branch, from Lucene101Codec to Lucene103Codec).
Today we determine whether bootstrap checks are enforced (in production builds) based on whether a non-loopback address is configured.
This PR expands that to also not enforce bootstrap checks when a snapshot build is used, so that enforceLimits = non-loopback && non-snapshot
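The rule stated above amounts to a single boolean condition. A minimal sketch, where `BootstrapEnforcementSketch` and its parameter names are hypothetical and not the actual Elasticsearch bootstrap code:

```java
// Hypothetical sketch of the rule "enforceLimits = non-loopback && non-snapshot".
final class BootstrapEnforcementSketch {
    private BootstrapEnforcementSketch() {}

    // Enforce bootstrap checks only when bound to a non-loopback address
    // and the build is not a snapshot build.
    static boolean enforceLimits(boolean boundToNonLoopback, boolean snapshotBuild) {
        return boundToNonLoopback && snapshotBuild == false;
    }

    public static void main(String[] args) {
        System.out.println(enforceLimits(true, false));
        System.out.println(enforceLimits(true, true));
    }
}
```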
Fixes #118328
* Revert "Fix the Text class package change in example plugins (#128316)"
This reverts commit cc486480e3.
* Revert "Update Text class to use native java ByteBuffer (#127666)"
This reverts commit db0c3c7a28.
Co-authored-by: Lorenzo Dematté <lorenzo.dematte@elastic.co>
Since our unit and integration tests run as unnamed modules, we should add `--enable-native-access=ALL-UNNAMED` to suppress the pesky warnings that arise from calls to `System::load`.
Earlier versions of MinIO had a bug which can cause repository analysis
failures. This commit upgrades the MinIO test container version to pick
up the bug fix, and reverts the workaround implemented in #127166.
Relates https://github.com/minio/minio/issues/21189