Commit Graph

1110 Commits

Author SHA1 Message Date
Ignacio Vera 4ca96c199f
Introduce a vectorized soarDistance function (#129744)
This commit replaces the method #soarResidual with a method #soarDistance, which performs better when computing soar distances.
2025-06-20 16:23:50 +02:00
Benjamin Trent 0a9f3a9630
Address when scores can be very large in osq score test (#129592)
Using a static `diff` or epsilon just doesn't work for this test as the
scores can be very large, but relatively close. 

Maybe there is a simpler way, but my mind wasn't wanting to "math" very
much. 

For example, the seed that this previously failed on had scores like
`1.726524E9` and `1.7265239E9`, which, given their size, are really
close together (within 128). But a static epsilon wouldn't capture that.
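
One way to express a size-aware tolerance is to compare in units in the last place (ULPs) instead of using a fixed epsilon. The sketch below is only an illustration and is not the code used by the test; `ScoreTolerance`, `closeInUlps`, and the `maxUlps` parameter are hypothetical names.

```java
final class ScoreTolerance {
    private ScoreTolerance() {}

    /**
     * Returns true when a and b differ by at most maxUlps units in the last place,
     * measured at the larger of the two magnitudes.
     */
    static boolean closeInUlps(float a, float b, int maxUlps) {
        if (Float.isNaN(a) || Float.isNaN(b)) {
            return false;
        }
        if (a == b) {
            return true; // also covers equal infinities
        }
        if (Float.isInfinite(a) || Float.isInfinite(b)) {
            return false;
        }
        // Math.ulp grows with the magnitude of its argument, so the tolerance scales with the scores.
        float ulp = Math.max(Math.ulp(a), Math.ulp(b));
        return Math.abs(a - b) <= maxUlps * ulp;
    }
}
```

For the two scores quoted above, `closeInUlps(1.726524E9f, 1.7265239E9f, 1)` holds because adjacent floats near 1.7e9 are 128 apart, whereas a fixed epsilon such as 1e-3 would reject the pair.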

closes: https://github.com/elastic/elasticsearch/issues/128485
2025-06-19 01:33:29 +10:00
Benjamin Trent 4e926ae41a
Minor ivf cleanups and fixing quantization performance (#129566)
We were accidentally using the non-vectorized quantizer when building
ivf indices. Switching to the vectorized one gives a 3-5x speedup in
quantization on my mac.

This fixes that and includes some minor cleanups (removing unused code,
etc.).

Here is a small benchmark result. Time spent quantizing goes down
significantly.

<img width="652" alt="image"
src="https://github.com/user-attachments/assets/9f46398c-c587-4e74-bc91-f2e07a63b406"
/>

vs.

<img width="673" alt="image"
src="https://github.com/user-attachments/assets/c4f4679f-d7a7-4486-841f-7dd3e75a11cb"
/>
2025-06-18 05:49:38 +10:00
Simon Cooper 98c1708adb
Add javadocs for BBQ dot product method (#129419)
2025-06-16 10:18:32 +01:00
Jordan Powers 96300a9d80
Optimized text for full unicode and some escape sequences (#129169)
Follow-up to #126492 to apply the json parsing optimization to strings
containing unicode characters and some backslash-escaped characters.

Supporting backslash-escaped strings is tricky as it requires modifying the
string. There are two types of modification: some just remove the backslash
(e.g. \", \\), and some replace the whole escape sequence with a new
character (e.g. \n, \r, \u00e5). In this implementation, the optimization
only supports the first case--removing the backslash. This is done by
making a copy of the data, skipping the backslash. It should still be more
optimized than full String decoding, but it won't be as fast as 
non-backslashed strings where we can directly reference the input bytes.
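
The "remove the backslash" case described above can be sketched as a byte-level copy. This is a minimal illustration under assumptions, not the parser's actual code: `SimpleUnescape` and `dropBackslashes` are hypothetical names, only `\"` and `\\` are handled, and a `null` return stands in for "fall back to full String decoding".

```java
import java.util.Arrays;

final class SimpleUnescape {
    private SimpleUnescape() {}

    /**
     * Copies the UTF-8 bytes of a JSON string body, dropping the backslash of \" and \\ escapes.
     * Returns null when any other escape (for example \n or a \\u sequence) is present,
     * because those need a replacement character rather than a plain copy.
     */
    static byte[] dropBackslashes(byte[] utf8, int offset, int length) {
        byte[] out = new byte[length];
        int written = 0;
        int end = offset + length;
        for (int i = offset; i < end; i++) {
            byte b = utf8[i];
            if (b == '\\') {
                if (i + 1 >= end) {
                    return null; // dangling backslash, let the slow path report it
                }
                byte next = utf8[i + 1];
                if (next != '"' && next != '\\') {
                    return null; // unsupported escape
                }
                out[written++] = next;
                i++; // skip the escaped character we just copied
            } else {
                out[written++] = b;
            }
        }
        return Arrays.copyOf(out, written);
    }
}
```

Scanning bytes instead of chars is safe here because every byte of a multi-byte UTF-8 sequence has its high bit set, so it can never be mistaken for the ASCII backslash.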

Relates to #129072.
2025-06-12 09:55:07 -07:00
Patrick Doyle 7ec8fccf94
Refactor before entitlements for testing (#129099)
* Support multiple plugin source paths

* Refactor: remove unnecessary PathLookup method.

It's only called in one place, and there's no need to override it for testing.
Removing it just makes things simpler.

* Refactor: local var for pathLookup

* Fix bugs in test build info parsing

* Fix representative_class in test

* Move BridgeUtilTests.

Tests in org.elasticsearch.entitlement.bridge are going to be uniquely hard to
test once we patch the bridge into java.base, due to Java's prohibition on
split packages.

Let's just move this guy to another package.

* Upcast (?!) Java23EntitlementChecker to EntitlementChecker

* Empty TestPathLookup

* Create PolicyManager during bootstrap, allowing us to share initialization

* Use empty component path list instead of null

* Downcast to the class of the check method.

In our unit test, we have a mock checker that doesn't extend
EntitlementChecker, so downcasting to that would require us to needlessly
rework the unit test.

* Fix javadoc typos
2025-06-09 18:56:07 +02:00
Rene Groeschke 342083100b
[Build] Add support for publishing to maven central (#128659)
This ensures we package an aggregation zip with all artifacts we want to publish to maven central as part of a release.
Running zipAggregation will produce a zip file in the build/nmcp/zip folder. The content of this zip is meant to match the maven artifacts we have currently declared as dra maven artifacts.
2025-06-06 17:35:44 +02:00
Mike Pellegrini 5ee6dfadfe
Update AbstractXContentParser to support parsers that don't provide text characters (#129005)
2025-06-06 09:17:41 -04:00
Jordan Powers 496fb2d5a4
Skip UTF8 to UTF16 conversion during document indexing (#126492)
When parsing documents, we receive the document as UTF-8 encoded data,
parse it, and convert the fields to Java-native UTF-16 encoded Strings.
We then convert these Strings back to UTF-8 for storage in Lucene.

This patch skips the redundant conversion, instead passing lucene a
direct reference to the received UTF-8 bytes when possible.
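
The idea of handing over a reference to the original bytes can be sketched with a value holder that decodes lazily. This is an illustration under assumptions, not the classes added by the patch: `LazyUtf8Text` and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class LazyUtf8Text {
    private final byte[] utf8;  // shared with the input buffer, never copied
    private final int offset;
    private final int length;
    private String decoded;     // populated only if a caller asks for a String

    LazyUtf8Text(byte[] utf8, int offset, int length) {
        this.utf8 = utf8;
        this.offset = offset;
        this.length = length;
    }

    /** The backing array, for consumers that can work on UTF-8 directly. */
    byte[] array() {
        return utf8;
    }

    int offset() {
        return offset;
    }

    int length() {
        return length;
    }

    /** Decodes to UTF-16 on first use; callers that only need bytes never pay for this. */
    String string() {
        if (decoded == null) {
            decoded = new String(utf8, offset, length, StandardCharsets.UTF_8);
        }
        return decoded;
    }

    boolean isDecoded() {
        return decoded != null;
    }
}
```

A consumer that stores UTF-8 can read `array()`, `offset()`, and `length()` and skip both conversions, while code that still needs a `String` calls `string()`.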
2025-06-05 19:50:09 -07:00
Jordan Powers de40ac45d1
Move Text class to libs/xcontent (#128780)
This PR is a precursor to #126492.

It does three things:
1. Move org.elasticsearch.common.text.Text from :server to
   org.elasticsearch.xcontent.Text in :libs:x-content.
2. Refactor the Text class to use a new EncodedBytes record instead of
   the elasticsearch BytesReference.
3. Add the XContentString interface, with the Text class implementing
   that interface.

These changes were originally implemented in #127666 and #128316,
however they were reverted in #128484 due to problems caused by the
mutable nature of java ByteBuffers. This is resolved by instead using a
new immutable EncodedBytes record.
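
To illustrate one way the mutability of a `ByteBuffer` can cause trouble, the sketch below contrasts a relative read, which moves the buffer's position, with an immutable (array, offset, length) view. It is an assumption-laden illustration: `ByteSlice` is a hypothetical stand-in and is not the `EncodedBytes` record itself, and the actual failures behind the revert may have been different.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class MutableVsImmutableView {
    private MutableVsImmutableView() {}

    /**
     * Hypothetical immutable view. The record's fields cannot change after construction,
     * although the contents of the shared array still could.
     */
    record ByteSlice(byte[] bytes, int offset, int length) {
        String utf8() {
            return new String(bytes, offset, length, StandardCharsets.UTF_8);
        }
    }

    /** Decodes the remaining bytes; the relative get advances the buffer's position as a side effect. */
    static String drain(ByteBuffer buffer) {
        byte[] copy = new byte[buffer.remaining()];
        buffer.get(copy);
        return new String(copy, StandardCharsets.UTF_8);
    }
}
```

Calling `drain` twice on the same buffer yields the text and then an empty string, because the first call consumed it. `ByteSlice.utf8()` returns the same value on every call.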
2025-06-04 11:22:03 -07:00
Niels Bauman f988611691
React more promptly to task cancellation while waiting for the cluster to unblock (#128737)
Instead of waiting for the next run of the `ClusterStateObserver` (which
might be arbitrarily far in the future, but bounded by the timeout if one
is set), we notify the listener immediately that the task has been
cancelled. While doing so, we ensure we invoke the listener only once.
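
A common way to guarantee a single invocation when two paths (cancellation and the observer) can race is an atomic guard around the listener. The sketch below shows that general pattern only. `NotifyOnce` is a hypothetical name, and a plain `Consumer` is used instead of the real listener type.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

final class NotifyOnce<T> implements Consumer<T> {
    private final AtomicBoolean done = new AtomicBoolean();
    private final Consumer<T> delegate;

    NotifyOnce(Consumer<T> delegate) {
        this.delegate = delegate;
    }

    /** Forwards only the first call; later calls from any thread are dropped. */
    @Override
    public void accept(T value) {
        if (done.compareAndSet(false, true)) {
            delegate.accept(value);
        }
    }
}
```

Both the cancellation callback and the observer callback can then share one `NotifyOnce` instance, and whichever fires first wins.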

Fixes #117971
2025-06-03 11:00:20 +03:00
Patrick Doyle c633345a4d
Initial TestPolicyManager implementation (#128700)
* Initial TestPolicyManager implementation

* The forbidden APIs check is not messing around
2025-06-02 13:08:17 -04:00
Ryan Ernst 2be74a47e1
Fully initialize policy checker before instrumenting (#128703)
Entitlement instrumentation works by reflectively calling back into the
entitlements lib to grab the checker. It must be fully in place before
any classes are instrumented. This commit fixes a bug that was
introduced by refactoring which caused the checker to not be set until
after all classes were instrumented. In some situations this could lead
to the checker being null when it is grabbed (and statically cached) by
the entitlement bridge.
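
Why the ordering matters can be shown with a static field that is read once at class initialization. This is a simplified sketch under assumptions: `CheckerRegistry` and `BridgeHandle` are hypothetical names, and a direct field read replaces the reflective lookup described above.

```java
final class CheckerRegistry {
    /** Set by initialization code; null until then. */
    static volatile Object checker;

    private CheckerRegistry() {}
}

final class BridgeHandle {
    /** Read exactly once, when this class is first initialized, and cached from then on. */
    static final Object CHECKER = CheckerRegistry.checker;

    private BridgeHandle() {}
}
```

If anything touches `BridgeHandle` before `CheckerRegistry.checker` is assigned, `CHECKER` stays `null` for the lifetime of the class, which is why the checker has to be in place before any instrumented code can run.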
2025-05-31 02:10:20 +03:00
Patrick Doyle 9e40dc4e3b
Encapsulate entitlements (#128637)
* Rename and encapsulate InitializeArgs

* Move ElasticsearchEntitlementChecker out of api package.

It's an implementation detail that doesn't need to be exposed to the rest of
the system.

* Stub TestPathLookup (not yet implemented)
2025-05-30 17:05:56 -04:00
Patrick Doyle 77595cbccd
[Entitlements] Add test entitlement bootstrap and initialization classes (#128625)
* Initialization class as argument to EntitlementAgent

* visibility changes

* WIP: test entitlement bootstrap and initialization classes

* Simplify

* Moving packages to reduce visibility

* adjust visibility

* add plugins descriptor + policy parsing

* PR comments

* update visibility, uncomment TestBuildInfoParser usage

* [CI] Auto commit changes from spotless

* Factor out createPolicyManager to help merge

* TestEntitlementInitialization is not yet implemented

* Respond to PR comments

---------

Co-authored-by: Lorenzo Dematte <lorenzo.dematte@elastic.co>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-05-29 22:36:39 +03:00
Lorenzo Dematté 554b96aec9
[Entitlements] Add missing NIO async network instrumentation (#128582)
This PR adds some additional instrumentation to ensure we capture more cases of async network usage via channels and `select`.
2025-05-29 19:52:10 +03:00
Patrick Doyle ba50798f62
Split PolicyChecker from PolicyManager (#128004)
* Split PolicyChecker from PolicyManager

* Restore EntitlementCheckerUtils

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-05-28 12:48:14 -04:00
Patrick Doyle 7690f4667e
Revert changes to Text class (#128483) (#128484)
* Revert "Fix the Text class package change in example plugins (#128316)"

This reverts commit cc486480e3.

* Revert "Update Text class to use native java ByteBuffer (#127666)"

This reverts commit db0c3c7a28.

Co-authored-by: Lorenzo Dematté <lorenzo.dematte@elastic.co>
2025-05-27 18:37:43 +10:00
Patrick Doyle 8d79de51f5
Use package to suppress warning for entitlement self-test (#128223)
* Use package to suppress warning for entitlement self-test

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-05-21 08:59:02 -04:00
Patrick Doyle 43841a5ac3
Fail fast on invalid entitlement patches (#128071)
* Fail fast on invalid entitlement patches

* Don't peel off `PolicyParserException`

* Just catch Exception
2025-05-20 13:29:09 -04:00
Benjamin Trent 1324ee0115
Reapply "Adds new unexposed and experimental IVF format (#127528)" (#128005) (#128051)
This reverts commit 8a17a5ed5f.

reapplying ivf format, but with a fix.
2025-05-14 08:47:59 +10:00
Craig Taverner cb1391368b
Fix #123425 numerical floating point edge case (#127982)
2025-05-10 16:37:29 +02:00
John Wagster 8a17a5ed5f
Revert "Adds new unexposed and experimental IVF format (#127528)" (#128005)
This reverts commit ebe8ea6136.
2025-05-09 17:10:11 -05:00
Ryan Ernst ab690ba23f
Check hidden frames in entitlements (#127877)
Entitlements do a stack walk to find the calling class. When method
references are used in a lambda, the frame ends up hidden in the stack
walk. In the case of using a method reference with
AccessController.doPrivileged, the call looks like it is the jdk itself,
so the call is trivially allowed. This commit adds hidden frames to the
stack walk so that the lambda frame created for the method reference is
included. Several internal packages are then necessary to filter out of
the stack.
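
A stack walk that includes hidden frames can be sketched with the JDK's `StackWalker` and its `SHOW_HIDDEN_FRAMES` option. This is a minimal illustration, not the entitlement code: `CallerFinder`, `findCaller`, and the choice of which packages to skip are assumptions.

```java
import java.lang.StackWalker.Option;
import java.lang.StackWalker.StackFrame;
import java.util.Optional;
import java.util.Set;

final class CallerFinder {
    // SHOW_HIDDEN_FRAMES keeps the hidden frames generated for lambdas and method references;
    // RETAIN_CLASS_REFERENCE is required for StackFrame.getDeclaringClass().
    private static final StackWalker WALKER = StackWalker.getInstance(
        Set.of(Option.RETAIN_CLASS_REFERENCE, Option.SHOW_HIDDEN_FRAMES)
    );

    private CallerFinder() {}

    /** Returns the class of the nearest frame that is neither this class nor in a skipped package. */
    static Optional<Class<?>> findCaller(Set<String> skippedPackages) {
        return WALKER.walk(
            frames -> frames.map(StackFrame::getDeclaringClass)
                .filter(c -> c != CallerFinder.class)
                .filter(c -> skippedPackages.contains(c.getPackageName()) == false)
                .findFirst()
        );
    }
}
```

Because hidden frames now show up in the walk, infrastructure packages have to be filtered explicitly, for example `CallerFinder.findCaller(Set.of("java.lang.invoke"))`. The package named here is only an example.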
2025-05-08 16:59:03 -07:00
Jordan Powers db0c3c7a28
Update Text class to use native java ByteBuffer (#127666)
This PR is a precursor to #126492.

It does three things:
- Move org.elasticsearch.common.text.Text from :server to
  org.elasticsearch.xcontent.Text in :libs:x-content.
- Refactor the Text class to use a java-native ByteBuffer instead
  of the elasticsearch BytesReference. 
- Add the XContentString interface, with the Text class implementing
  that interface.
2025-05-08 08:19:38 -07:00
Lorenzo Dematté 2d9fc30f62
Initialization class as argument to EntitlementAgent (#127815)
Preliminary step for test entitlement initialization, extracted from #127814
2025-05-08 10:22:02 +02:00
Benjamin Trent ebe8ea6136
Adds new unexposed and experimental IVF format (#127528)
2025-05-07 14:59:57 -04:00
Ryan Ernst 9537388897
Remove doPrivileged uses from server (#127781)
Now that SecurityManager is no longer used, doPrivileged is no longer
necessary. This commit removes uses of it from core and server
2025-05-07 07:24:53 -07:00
Lorenzo Dematté 8bda02dafa
Uniform main and backport code (#127766)
While backporting entitlement initialization refactorings, I realized there is a mismatch in getVersionSpecificCheckerClass signature, and also that this function in the backports is used in more places (DynamicInstrumentation), making it "strange" to have this in EntitlementInitialization. This PR extracts the function to a separate static class (package-private) and makes the signature uniform with backports.
This will need to be backported manually to the 8.x branches, and will make the backported version of DynamicInstrumentation cleaner.
2025-05-07 09:25:15 +02:00
Ryan Ernst 60ad8ba744
Remove custom SecurityManager (#127778)
Since SecurityManager is no longer used, the custom subclass of
SecurityManager, SecureSM, is no longer needed.
2025-05-06 16:16:46 -07:00
Ryan Ernst b78ac7c94c
Remove PrivilegedOperations (#127726)
With the SecurityManager gone, the PrivilegedOperations class is no
longer needed, these operations can be called directly.
2025-05-06 10:50:49 -07:00
Lorenzo Dematté 79ee234721
Extract hardcoded entitlements creation to a separate class (#127698)
Moving creation of hardcoded entitlements (server policy + APM agent) to a separate class
2025-05-05 19:43:41 +02:00
Lorenzo Dematté f90b01597c
Move FilesEntitlements validation to a separate class (#127703)
Moves FilesEntitlements validation to a separate class. This is the final PR to make EntitlementsInitialization a simpler "orchestrator" of the various steps in the initialization phase.
2025-05-05 17:41:22 +02:00
Lorenzo Dematté 23ab059252
[Entitlements] Extract instrumentation initialization to a separate class (#127702)
2025-05-05 16:08:18 +02:00
Ankit Sethi 94854b3a3f
Remove dangling spaces wherever found. (#127475)
* Remove dangling spaces wherever found.

This PR addresses #117067 , a report about unexpected spaces breaking message parsers built by customers. I used the regex `(\. \")(?![A-Z(a-z_0-9-;<%\/\.+ \t\n]+)` to detect such instances and clean up. In one case, a minor code improvement helps add optional spaces as necessary for a multi-sentence error message.

* fix test

* Update docs/changelog/127475.yaml

* correct logic

* fix test

* fix tests

* fix tests

* fix tests

* Update docs/changelog/127475.yaml

* Update x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/action/TransportGetInferenceModelAction.java

Co-authored-by: Slobodan Adamović <slobodanadamovic@users.noreply.github.com>

* Update libs/x-content/src/main/java/org/elasticsearch/xcontent/ObjectParser.java

Co-authored-by: Slobodan Adamović <slobodanadamovic@users.noreply.github.com>

* correctly reference issue

* Update docs/changelog/127475.yaml

---------

Co-authored-by: Slobodan Adamović <slobodanadamovic@users.noreply.github.com>
2025-05-01 10:33:54 -05:00
Benjamin Trent 74faf47121
New bulk scorer for binary quantized vectors via optimized scalar quantization (#127189)
* New bulk scorer for binary quantized vectors via optimized scalar quantization

* fixing headers

* fixing tests
2025-04-29 07:42:08 -04:00
Lorenzo Dematté e9bedf1184
[Entitlements] Small docs fixes (#127323)
2025-04-24 18:11:18 +02:00
Simon Cooper c5ada66410
Copy Lucene99FlatVectorsReader allowing direct IO to be specified directly (#125921)
We want to use DirectIO to access raw vector data randomly so it doesn't load everything into the page cache
2025-04-24 11:00:30 +01:00
Lorenzo Dematté 002fef75ff
[Entitlements] Fix: consider case sensitiveness differences (#126990)
Our path comparison for file access is string based, due to the fact that we need to support Paths created for different file systems/platforms.
However, Windows files and paths are (sort of) case insensitive.
This PR fixes the problem by abstracting String comparison operations and making them case sensitive or not based on the host OS.
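
Abstracting the string operations behind a case-sensitivity switch might look like the sketch below. It is an illustration under assumptions: `PathStrings` and its methods are hypothetical names, and detecting Windows through the `os.name` system property is one possible approach, not necessarily the one used in the PR.

```java
import java.util.Comparator;
import java.util.Locale;

final class PathStrings {
    private final boolean caseSensitive;

    PathStrings(boolean caseSensitive) {
        this.caseSensitive = caseSensitive;
    }

    /** Picks case-insensitive comparison on Windows hosts and case-sensitive everywhere else. */
    static PathStrings forHost() {
        String os = System.getProperty("os.name", "").toLowerCase(Locale.ROOT);
        return new PathStrings(os.startsWith("windows") == false);
    }

    boolean samePath(String a, String b) {
        return caseSensitive ? a.equals(b) : a.equalsIgnoreCase(b);
    }

    /** Prefix test that honors the configured case sensitivity. */
    boolean startsWith(String path, String prefix) {
        return path.regionMatches(caseSensitive == false, 0, prefix, 0, prefix.length());
    }

    Comparator<String> comparator() {
        return caseSensitive ? Comparator.<String>naturalOrder() : String.CASE_INSENSITIVE_ORDER;
    }
}
```

Code that compares path strings then goes through one `PathStrings` instance instead of calling `equals` or `startsWith` on the strings directly.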
2025-04-23 20:23:45 +02:00
Benjamin Trent 059f91c90c
Panama vector accelerated optimized scalar quantization (#127118)
* Accelerates optimized scalar quantization with vectorized functions

* Adding benchmark

* Update docs/changelog/127118.yaml

* adjusting benchmark and delta
2025-04-23 12:51:04 -04:00
Patrick Doyle 4d929ca986
Clean up PolicyManager and ScopeResolver tests (#127115)
* Simplify PolicyManagerTests

* Clean and simplify ScopeResolverTests
2025-04-23 08:57:57 -04:00
Ryan Ernst b5e92db171
Remove security manager from tests (#127087)
Now that entitlements are always used, there is no need to run tests
with security manager (a future enhancement will run tests with
entitlements). This commit removes setting up security manager from
tests.
2025-04-22 18:08:09 +02:00
Lorenzo Dematté 73d31533c6
[Entitlements] Improve FileAccessTree logging (#127050)
We already had logging in FileAccessTree as result of debugging the \\pipe\ failures a while ago; this PR slightly improves the logs to provide more information.
2025-04-22 16:39:36 +02:00
Lorenzo Dematté 02493f35f3
Add package-info.java and javadocs to document Entitlements design and implementation (#127023)
Design and implementation of Entitlement with this level of detail needs to stay close to the code, and take advantage of javadoc features like linking and class-references to help us with refactorings and future code changes.

The bulk of the information went into the package-info file for the main library, but I split up some parts and referenced them from the main doc, where I thought it made sense (mainly: the bridge sub-project for some implementation details, PolicyManager, EntitlementInitialization and FileAccessTree); this way they still can be reached from the "overview" while being closer to where the information really belongs.

Relates to ES-11284
2025-04-22 10:46:20 +02:00
Patrick Doyle 15c2c467e7
Refactor: ScopeResolver (#126921)
* Fix: use getScopeName consistently

* Rename PolicyManagerTests method

* Refactor: simplify PluginsResolver.create

* Change PluginsResolver to ScopeResolver

* Move boot layer test to ScopeResolverTests

* [CI] Auto commit changes from spotless

* Rename PolicyScope

* Add ComponentKind enum

* Package private componentName field

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-04-21 17:35:10 +02:00
Jack Conradson 1234f97031
Refactor file path resolution for entitlements (#127040)
This change refactors the known directory resolution such as modules, 
plugins, lib, etc. into a PathLookup. This is one of the steps towards 
allowing unit tests to provide their own PathLookup for resolution so 
we can enable entitlements there.

ES-11584
2025-04-21 06:53:56 -07:00
Ryan Ernst 0d2bc75301
Make sure SM isn't running alongside entitlements tests (#127082)
closes #127077
2025-04-19 02:37:13 +02:00
Lorenzo Dematté 69f6520b0c
[Entitlements] Validation checks on paths (#126852)
With this PR we restrict the paths we allow access to, forbidding plugins to specify/request entitlements for reading or writing to specific protected directories.

I added this validation to EntitlementInitialization, as I wanted to fail fast and this is the earliest occurrence where we have all we need: PathLookup to resolve relative paths, policies (for plugins, server, agents) and the Paths for the specific directories we want to protect.
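
A fail-fast check of this kind can be sketched as follows. The names (`ProtectedPathValidator`, `validate`) and the use of `java.nio.file.Path` are assumptions made for the illustration; the real validation may resolve and compare paths differently.

```java
import java.nio.file.Path;
import java.util.List;

final class ProtectedPathValidator {
    private ProtectedPathValidator() {}

    /**
     * Resolves each requested path against baseDir and rejects it if it ends up
     * inside one of the protected directories.
     */
    static void validate(Path baseDir, List<String> requestedPaths, List<Path> protectedDirs) {
        for (String requested : requestedPaths) {
            // normalize() removes ".." segments so a relative path cannot sneak into a protected directory
            Path resolved = baseDir.resolve(requested).toAbsolutePath().normalize();
            for (Path forbidden : protectedDirs) {
                if (resolved.startsWith(forbidden.toAbsolutePath().normalize())) {
                    throw new IllegalArgumentException(
                        "path [" + resolved + "] is inside protected directory [" + forbidden + "]"
                    );
                }
            }
        }
    }
}
```

Running this once during initialization means a bad policy stops startup immediately instead of surfacing later as a denied or, worse, permitted file access.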

Relates to ES-10918
2025-04-18 15:36:07 +02:00
Lorenzo Dematté 115062c643
Fix vec_caps to test for OS support too (on x64) (#126911)
On x64, we are testing if we support vector capabilities (1 = "basic" = AVX2, 2 = "advanced" = AVX-512) in order to enable and choose a native implementation for some vector functions, using CPUID.

However, under some circumstances, this is not sufficient: the OS on which we are running also needs to support AVX/AVX2 etc.; basically, it needs to acknowledge that it knows about the additional registers and that it is able to handle them, e.g. in context switches. To do that we need to a) test if the CPU has the xsave feature and b) use the xgetbv instruction to test if the OS set it (declaring it supports AVX/AVX2/etc).

In most cases this is not needed, as all modern OSes do that, but in some virtualized situations (hypervisors, emulators, etc.) all the components along the chain must support it, and in some cases this is not a given.

This PR introduces a change to the x64 version of vec_caps to check for OS support too, and a warning on the Java side in case the CPU supports vector capabilities but those are not enabled at OS level.

Tested by passing noxsave to my linux box kernel boot options, and ensuring that the avx flags "disappear" from /proc/cpuinfo, and we fall back to the "no native vector" case.
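
The decision logic described above can be sketched in Java by taking the raw register values as inputs. The real check is native code that executes CPUID and xgetbv; this sketch only models the bit tests. `VecCapsSketch` and `vecCaps` are hypothetical names, and gating on the OSXSAVE bit and on AVX-512F for level 2 are assumptions about what the native code checks.

```java
final class VecCapsSketch {
    private static final int CPUID1_ECX_OSXSAVE = 1 << 27;  // CPUID leaf 1, ECX: OS has enabled xgetbv
    private static final int CPUID7_EBX_AVX2 = 1 << 5;      // CPUID leaf 7 subleaf 0, EBX: AVX2
    private static final int CPUID7_EBX_AVX512F = 1 << 16;  // CPUID leaf 7 subleaf 0, EBX: AVX-512 Foundation
    private static final long XCR0_AVX_STATE = 0x6L;        // XCR0 bits 1-2: XMM and YMM state
    private static final long XCR0_AVX512_STATE = 0xE6L;    // plus bits 5-7: opmask and ZMM state

    private VecCapsSketch() {}

    /** Returns 0 (no native vectors), 1 (AVX2) or 2 (AVX-512), requiring both CPU and OS support. */
    static int vecCaps(int cpuid1Ecx, int cpuid7Ebx, long xcr0) {
        boolean osxsave = (cpuid1Ecx & CPUID1_ECX_OSXSAVE) != 0;
        // Without OSXSAVE, xgetbv is unavailable, so treat every state component as disabled.
        long enabled = osxsave ? xcr0 : 0L;
        boolean avx2 = (cpuid7Ebx & CPUID7_EBX_AVX2) != 0 && (enabled & XCR0_AVX_STATE) == XCR0_AVX_STATE;
        boolean avx512 = avx2
            && (cpuid7Ebx & CPUID7_EBX_AVX512F) != 0
            && (enabled & XCR0_AVX512_STATE) == XCR0_AVX512_STATE;
        return avx512 ? 2 : avx2 ? 1 : 0;
    }
}
```

The noxsave experiment corresponds to calling `vecCaps` with the OSXSAVE bit cleared: the CPU feature bits are still set, but the result drops to 0.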

Fixes #126809
2025-04-16 16:06:46 +02:00
Ryan Ernst 6174acdc39
Workaround max name limit imposed by Jackson 2.17 (#126806)
In Jackson 2.15 a maximum string length of 50k characters was
introduced. We worked around that by overriding the length to max int on
all parsers created by xcontent. Jackson 2.17 introduced a similar limit
on field names. This commit mimics the workaround for string length by
overriding the max name length to be unlimited.

relates #58952
2025-04-15 11:40:27 -07:00