vector
The vector:* stages retrieve multidimensional vectors based on various criteria. You can use these vectors to search for semantically similar labels and documents.
You can use the following vector stages in your analysis requests:
- vector:composite
  Computes a composite of the vectors you provide.
- vector:documentEmbedding
  Returns a vector with weights corresponding to the composition of embedding vectors for the document set you provide.
- vector:labelEmbedding
  Returns a vector with weights corresponding to the composition of embedding vectors for the set of labels you provide.
- vector:reference
  References the results of another vector:* stage defined in the request.
The JSON output of each vector:* stage has the following structure:
{
"values": [
// a list of N floating point numbers: vector weights.
]
}
vector:​composite
Creates a composite vector from the vectors you provide by summing up their individual components.
{
"type": "vector:composite",
"vectors": []
}
vectors
The input vectors to compose. All input vectors must have the same size.
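The component-wise sum described above can be sketched in Python as follows. This is only an illustration of the arithmetic, not the stage's actual implementation; the function name `compose` and the error raised for empty input are assumptions made for this example.

```python
from typing import List, Sequence


def compose(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Sum equally sized vectors component by component."""
    if not vectors:
        # Assumption for this sketch: an empty input is treated as an error.
        raise ValueError("at least one input vector is required")
    size = len(vectors[0])
    if any(len(v) != size for v in vectors):
        raise ValueError("all input vectors must have the same size")
    # zip(*vectors) groups the i-th components of every input vector.
    return [sum(components) for components in zip(*vectors)]


print(compose([[1.0, 2.0, 3.0], [0.5, -2.0, 1.0]]))  # [1.5, 0.0, 4.0]
```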
vector:​document​Embedding
Returns a vector with weights corresponding to the composition of embedding vectors for the document set you provide.
{
"type": "vector:documentEmbedding",
"documents": {
"type": "documents:reference",
"auto": true
},
"failIfEmbeddingsNotAvailable": true
}
The composite vector is a document score-weighted sum of individual embedding vectors. Unless failIfEmbeddingsNotAvailable is set to false, this stage will require document embeddings to be present in the index.
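As a rough illustration of a score-weighted sum, the sketch below multiplies each embedding by its document's score and adds the results. The function name `weighted_sum` and the sample scores and vectors are hypothetical. Whether the stage normalizes the weights or the resulting vector is not stated here, so the sketch applies no normalization.

```python
from typing import List, Sequence, Tuple


def weighted_sum(items: Sequence[Tuple[float, Sequence[float]]]) -> List[float]:
    """Combine (weight, embedding) pairs into a single vector."""
    if not items:
        return []
    size = len(items[0][1])
    result = [0.0] * size
    for weight, embedding in items:
        if len(embedding) != size:
            raise ValueError("all embeddings must have the same size")
        # Add this embedding's contribution, scaled by its weight.
        for i, component in enumerate(embedding):
            result[i] += weight * component
    return result


# Two hypothetical documents with scores 2.0 and 0.5.
print(weighted_sum([(2.0, [1.0, 0.0]), (0.5, [4.0, 2.0])]))  # [4.0, 1.0]
```

The same arithmetic would apply to any set of weighted embeddings, with each item's weight taking the place of the document score.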
Consider the following request, which returns the embedding vector for the first document matching the photon query and then finds the three documents most similar to that embedding vector.
{
"stages": {
"documents": {
"type": "documents:byQuery",
"query": {
"type": "query:string",
"query": "photon"
},
"limit": 1
},
"documentEmbedding": {
"type": "vector:documentEmbedding",
"documents": {
"type": "documents:reference",
"use": "documents"
},
"failIfEmbeddingsNotAvailable": true
},
"similarDocuments": {
"type": "documents:embeddingNearestNeighbors",
"vector": {
"type": "vector:reference",
"use": "documentEmbedding"
},
"limit": 3
}
}
}
Retrieving a document embedding vector and using it to find similar documents.
Shown below is the embedding vector part of the response:
"documentEmbedding": {
"values": [
-0.99103373,
-0.61993295,
0.4042985,
-1.1614705,
-0.52302164,
0.6941693,
0.27720472,
-0.89477026,
-0.054978706,
-0.27918375,
-0.21151417,
0.6429526,
0.69916975,
-0.2415477,
0.40307912,
-0.78268766,
0.17965193,
-0.4961045,
-0.31955218,
0.52410066,
1.3476398,
-0.5696117,
-0.1011139,
0.36646006,
-0.0368462,
0.42020366,
0.22188598,
0.49885812,
0.38663238,
-0.9042661,
-1.0002325,
0.3590454,
0.37262416,
-0.16460687,
0.050458044,
0.6337493,
0.0632156,
0.29189587,
-0.86135393,
-0.0827778,
-1.9721417,
0.0946002,
0.83832526,
1.4163167,
-0.6701371,
0.9395537,
-1.0429033,
-0.6764844,
-0.11551779,
1.035179,
0.4344454,
0.5452566,
0.47005108,
-0.73416495,
0.44311583,
0.014774237,
1.319659,
-0.69617754,
-0.33849755,
-0.017449401,
-0.0040018084,
0.9872341,
1.2851647,
-0.13578914,
-0.43303326,
-0.69944334,
0.98159957,
0.19036204,
-0.6602206,
-0.9635043,
-0.14981964,
-1.1410289,
0.29195544,
-0.31277227,
-0.09985383,
0.6923386,
-1.1024474,
0.46135065,
0.37878135,
1.5208758,
1.1232415,
0.79433787,
-0.12940468,
0.17998278,
0.108538546,
-0.23473981,
-1.07629,
-0.15014234,
0.4499113,
1.218994,
-0.2030959,
-0.885567,
-0.041270923,
0.70257646,
0.30352822,
0.9245882
]
}
And this is the list of similar documents retrieved for the vector above:
"similarDocuments": {
"documents": [
{
"id": 184486,
"weight": 0.9531418
},
{
"id": 314007,
"weight": 0.95071065
},
{
"id": 245260,
"weight": 0.94943094
}
]
}
documents
One or more input documents for which the embedding vector should be returned.
fail​If​Embeddings​Not​Available
Determines the behavior of this stage if the index does not contain document embeddings.
If the index does not contain document embeddings and failIfEmbeddingsNotAvailable is:
- true: this stage fails and logs an error.
- false: this stage returns an empty set of document embeddings.
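When failIfEmbeddingsNotAvailable is false, client code may want to check for an empty result before using the vector. The sketch below assumes that an empty result appears as an empty or missing `values` list in the stage output; the exact shape of that response is not documented here. The helper name `embedding_values` is hypothetical.

```python
from typing import List, Optional


def embedding_values(stage_result: dict) -> Optional[List[float]]:
    """Return the vector weights, or None when no embedding came back."""
    # Assumption: "no embeddings" shows up as an empty or missing "values" list.
    values = stage_result.get("values") or []
    return list(values) if values else None


response = {"documentEmbedding": {"values": []}}  # hypothetical empty result
vector = embedding_values(response["documentEmbedding"])
if vector is None:
    print("No embedding available; skipping the similarity search.")
```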
vector:​label​Embedding
Returns a vector with weights corresponding to the composition of embedding vectors for the label set you provide.
{
"type": "vector:labelEmbedding",
"failIfEmbeddingsNotAvailable": true,
"labels": {
"type": "labels:reference",
"auto": true
}
}
The composite vector is a weighted sum of individual label embedding vectors (weighted by each label's weight). Unless failIfEmbeddingsNotAvailable is set to false, this stage will require label embeddings to be present in the index.
Consider the following request, which returns the embedding vector for the specific label oil and then finds the five labels most similar to oil's embedding vector.
{
"stages": {
"labels": {
"type": "labels:direct",
"labels": [
{
"label": "oil",
"weight": 1
}
]
},
"labelEmbedding": {
"type": "vector:labelEmbedding",
"labels": {
"type": "labels:reference",
"use": "labels"
},
"failIfEmbeddingsNotAvailable": true
},
"similarLabels": {
"type": "labels:embeddingNearestNeighbors",
"vector": {
"type": "vector:reference",
"use": "labelEmbedding"
},
"limit": 5
}
}
}
Retrieving a label embedding vector and using it to find similar labels.
Shown below is the embedding vector part of the response:
"labelEmbedding": {
"values": [
-0.056218036,
-0.016219355,
-0.097506955,
0.03419229,
-0.14090592,
0.012211461,
-0.06729116,
0.13687253,
-0.12202141,
0.05840475,
0.10308239,
-0.039139103,
-0.066286676,
-0.04229747,
0.062129702,
-0.0709934,
-0.1390585,
0.074060366,
0.11323412,
-0.053741228,
-0.09797654,
-0.00074452395,
0.12603402,
-0.11076641,
-0.01178393,
-0.1688074,
0.030520564,
-0.07905724,
0.004189321,
0.0035251991,
-0.096744716,
-0.070698775,
-0.035969727,
-0.053766362,
0.06178642,
0.21652527,
-0.18458892,
0.080813296,
0.082455836,
-0.07065346,
0.085473984,
-0.036417063,
0.08940274,
-0.11169317,
-0.11553197,
-0.014741239,
0.028184947,
0.027740661,
-0.20679004,
-0.063633166,
-0.13676593,
-0.077558,
0.15755467,
-0.045371544,
0.047832448,
-0.02211245,
-0.020387521,
-0.029218199,
-0.0649806,
-0.058419973,
0.08937007,
-0.084519245,
-0.1271999,
0.031995956,
-0.08992219,
-0.038587146,
0.20926896,
-0.122392155,
-0.015908016,
-0.18183255,
-0.013469522,
0.1941475,
-0.09968846,
0.104315154,
0.115674205,
0.022810362,
0.14583807,
0.2034118,
-0.11636415,
-0.3147494,
0.04710036,
-0.020315854,
0.05568123,
0.010184015,
-0.18691273,
-0.14869832,
0.083398715,
-0.12028383,
-0.06248831,
0.07073338,
-0.095944166,
0.11422471,
-0.021139272,
-0.04514896,
0.07884139,
-0.031363823
]
}
And this is the list of similar labels retrieved for the vector above:
"similarLabels": {
"labels": [
{
"label": "oil",
"weight": 1
},
{
"label": "moisture",
"weight": 0.789073
},
{
"label": "saline",
"weight": 0.77412647
},
{
"label": "geothermal",
"weight": 0.77246994
},
{
"label": "relative humidity",
"weight": 0.76045656
}
]
}
fail​If​Embeddings​Not​Available
Determines the behavior of this stage if the index does not contain label embeddings.
If the index does not contain label embeddings and failIfEmbeddingsNotAvailable is:
- true: this stage fails and logs an error.
- false: this stage returns an empty set of label embeddings.
labels
One or more input labels for which the embedding vector should be returned.
Consumers of vector:*
The following stages and components take vector:* as input:
Stage or component | Property |
---|---|
documents:​embedding​Nearest​Neighbors | vector |
labels:​embedding​Nearest​Neighbors | vector |
vector:​composite | vectors |
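To show how the consumers listed above fit together, here is a sketch that builds a request in Python. The request sums the embeddings of two documents with vector:composite and passes the result to documents:embeddingNearestNeighbors. The stage types and properties come from the examples on this page. The helper `build_request`, the stage names (`docsA`, `embeddingA`, `combined` and so on) and the second query `laser` are hypothetical, and the combination has not been verified against a running server.

```python
import json


def build_request(first_query: str, second_query: str, limit: int = 3) -> dict:
    """Build a request that searches by the sum of two document embeddings."""

    def first_match(query: str) -> dict:
        # Select the single top document for a query.
        return {
            "type": "documents:byQuery",
            "query": {"type": "query:string", "query": query},
            "limit": 1,
        }

    def embedding_of(stage_name: str) -> dict:
        # Embedding vector of the documents produced by another stage.
        return {
            "type": "vector:documentEmbedding",
            "documents": {"type": "documents:reference", "use": stage_name},
        }

    def ref(stage_name: str) -> dict:
        # Reference to the result of another vector:* stage.
        return {"type": "vector:reference", "use": stage_name}

    return {
        "stages": {
            "docsA": first_match(first_query),
            "docsB": first_match(second_query),
            "embeddingA": embedding_of("docsA"),
            "embeddingB": embedding_of("docsB"),
            "combined": {
                "type": "vector:composite",
                "vectors": [ref("embeddingA"), ref("embeddingB")],
            },
            "similarDocuments": {
                "type": "documents:embeddingNearestNeighbors",
                "vector": ref("combined"),
                "limit": limit,
            },
        }
    }


print(json.dumps(build_request("photon", "laser"), indent=2))
```

Building the request as a plain dictionary keeps the stage names in one place, so each vector:reference is easy to check against the stage it points to.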