Query parsers

The queryParsers section of the project descriptor declares parsers that Lingo4G uses to prepare Lucene queries from text.

Properties in this object must be of the following types:

enhanced

Implements an enhanced syntax of Lucene's (flexible) StandardQueryParser. The extensions include support for interval queries.

A typical query parsers section of the project descriptor looks like this:

{
  "queryParsers": {
    "enhanced": {
      "type": "enhanced",
      "defaultFields": [
        "title",
        "abstract"
      ],
      "defaultOperator": "OR"
    }
  }
}

The queryParsers property of the project descriptor must be an object with keys corresponding to query parser identifiers. Use that identifier to choose the query parser at analysis time, for example in the queryParser property of the query:string component.

Property values of the queryParsers object represent configurations of the specific query parser. Each configuration object must contain the type property defining the kind of query parser to use.

Important

You must declare at least one query parser configuration in your project descriptor.

You can use the following query parser types in your project descriptors:

enhanced
A parser with syntax based on Lucene's standard query parser.

enhanced

This query parser implements an enhanced version of the syntax of Lucene's (flexible) StandardQueryParser. The extensions include support for interval queries.

{
  "type": "enhanced",
  "defaultFields": [],
  "defaultOperator": "AND",
  "sanitizeSpaces": "(?U)\\p{Blank}+",
  "validateFields": true
}

Query syntax

The text query must contain one or more clauses, optionally combined with Boolean operators AND or OR. If you don't provide any explicit operators in the query, Lingo4G uses the defaultOperator to combine clauses.

The following sections review all types of clauses and their modifiers.

Term queries

A simple term query selects documents that contain matching terms in any of the default search fields. The following list shows a few examples of different term queries.

  • test - selects documents containing the word test.

  • "test equipment" - phrase search; selects documents containing adjacent terms test equipment.

  • "test failure"~4 - proximity search; selects documents containing the words test and failure within 4 words (positions) from each other. The provided "proximity" is technically translated into "edit distance" (maximum number of atomic word-moving operations required to transform the document's phrase into the query phrase). For a more intuitive notion of proximity, use the ordered interval searches with a maximum position range constraint.

  • tes* - prefix wildcard matching; selects documents containing words starting with tes, such as: test, testing or testable.

  • /.est(s|ing)/ - documents containing words matching the regular expression you provide. Here documents containing resting or nests would both match, along with other terms ending in ests or esting.

  • nest~2 - fuzzy term matching; documents containing words within 2-edits distance (2 additions, removals or replacements of a letter) from nest, such as test, net or rests.
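The prefix, regular expression and fuzzy examples above can be modeled with a short Python sketch. The term list is hypothetical, and the edit distance is a plain Levenshtein distance chosen to mirror the "additions, removals or replacements" wording above; it is not Lingo4G's or Lucene's actual implementation.

```python
import re

def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute ca with cb
        prev = cur
    return prev[-1]

# Hypothetical set of indexed terms, for illustration only.
terms = ["test", "testing", "nests", "resting", "net", "rests", "quick"]

# tes*            : terms starting with "tes"
prefix_hits = [t for t in terms if t.startswith("tes")]
# /.est(s|ing)/   : the whole term must match the expression
regex_hits = [t for t in terms if re.fullmatch(r".est(s|ing)", t)]
# nest~2          : terms at most 2 edits away from "nest"
fuzzy_hits = [t for t in terms if edit_distance("nest", t) <= 2]

print(prefix_hits)  # ['test', 'testing']
print(regex_hits)   # ['testing', 'nests', 'resting', 'rests']
print(fuzzy_hits)   # ['test', 'nests', 'net', 'rests']
```

Note that test does not match the regular expression because the pattern requires a trailing s or ing after est.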

Fields

An unqualified term query is applied to all default search fields specified in your project descriptor. To search for terms in a specific field, prefix the term with the field name followed by a colon, for example:

  • title:test - documents containing test in the title field.

It is also possible to group several clauses and apply them to a single field using parentheses:

  • title:(dandelion OR daisy) - documents containing dandelion or daisy in the title field.

Boolean operators

You can combine simple terms and other clauses using the AND, OR and NOT Boolean operators, for example:

  • test AND results - selects documents containing both the word test and the word results in any of the default search fields.

  • test OR suite OR results - selects documents with at least one of test, suite or results in any of the default search fields.

  • title:test AND NOT title:complete - selects documents containing test and not containing complete in the title field.

  • title:test AND (pass* OR fail*) - grouping; use parentheses to specify the precedence of terms in a Boolean clause. The query matches documents containing test in the title field and a word starting with pass or fail in the default search fields.

  • title:(pass fail skip) - uses the default operator to combine three term queries. If the default operator is OR, the query selects documents containing at least one of pass, fail or skip in the title field. If the default operator is AND, the query selects documents containing all of those terms in the title field.

  • title:(+test +"result unknown") - shorthand AND notation; documents containing both test and result unknown in the title field.

Note the operators must be written in all-capital letters.

Range operators

To search for ranges of textual or numeric values, use square or curly brackets, for example:

  • name:[Jones TO Smith] - inclusive range; selects documents whose name field has any value between Jones and Smith, including boundaries.

  • score:{2.5 TO 7.3} - exclusive range; selects documents whose score field is between 2.5 and 7.3, excluding boundaries.

  • score:{2.5 TO *] - one-sided range; selects documents whose score field is larger than 2.5.

Term boosting

You can attach a floating point boost value to terms, quoted terms, term range expressions and grouped clauses to increase their score relative to other clauses. For example:

  • jones^2 OR smith^0.5 - prioritizes documents with the jones term over matches on the smith term.

  • field:(a OR b NOT c)^2.5 OR field:d - applies the boost to a sub-query.

Special character escaping

You can put most search terms in double quotes to make special-character escaping unnecessary. If a search term contains the quote character (or cannot be quoted for some reason), use backslash to escape any character. For example:

  • \:\(quoted\+term\)\: - a single search term :(quoted+term): with escape sequences. An alternative quoted form would be simpler: ":(quoted+term):".
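If queries are built programmatically, escaping can be done with a small helper. The sketch below is hypothetical: the set of special characters is an assumption based on Lucene's classic query syntax and may not match the enhanced parser exactly, and the quoting helper assumes that only the quote and backslash characters need escaping inside quotes.

```python
# Assumed set of special characters (based on Lucene's classic query syntax).
SPECIAL = set('+-!(){}[]^"~*?:\\/&|')

def escape_term(term: str) -> str:
    """Prefix every special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in term)

def quote_term(term: str) -> str:
    """Wrap the term in double quotes, escaping backslashes and quotes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'

print(escape_term(":(quoted+term):"))  # \:\(quoted\+term\)\:
print(quote_term(":(quoted+term):"))   # ":(quoted+term):"
```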

Another case when quoting may be required is to escape leading forward slashes, which are parsed as regular expressions. For example, this query will not parse correctly without quotes:

  • title:"/daisy" - the quotes are needed here to prevent the leading forward slash character from being recognized as an (invalid) regular expression term query.

Handling of quoted expressions

The conversion from a quoted expression to a Lucene query depends on the analyzer specified for the field the quoted expression applies to. Term queries are parsed and divided into a stream of individual tokens using the same analyzer used to index the field's content. The result is a phrase query for a stream of tokens or a simple term query for a single token.

Minimum-should-match constraint

You can apply the minimum-should-match operator to a disjunction Boolean query (a query with only "OR"-subclauses), forcing the query to match documents containing at least the provided number of subclauses. For example:

  • (blue OR crab OR fish)@2 - matches all documents with at least two terms from the [blue, crab, fish] set (in any order).

  • ((yellow AND blue) OR crab OR fish)@2 - sub-clauses of the top-level disjunction query can themselves be complex queries; here the min-should-match selects documents that match at least two of: yellow AND blue, crab, fish.
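The counting behind the minimum-should-match constraint can be illustrated with a toy model in which each sub-clause is a predicate over a document's set of terms. All names here are hypothetical; this is a sketch of the idea, not of how Lucene evaluates the query.

```python
def term(word):
    """Sub-clause matching documents that contain the word."""
    return lambda doc_terms: word in doc_terms

def all_of(*clauses):
    """Sub-clause modeling an AND of other sub-clauses."""
    return lambda doc_terms: all(c(doc_terms) for c in clauses)

def min_should_match(subclauses, minimum, doc_terms):
    """True if at least `minimum` sub-clauses match the document."""
    matched = sum(1 for clause in subclauses if clause(doc_terms))
    return matched >= minimum

doc = {"blue", "fish", "tank"}

# (blue OR crab OR fish)@2
q1 = [term("blue"), term("crab"), term("fish")]
# ((yellow AND blue) OR crab OR fish)@2
q2 = [all_of(term("yellow"), term("blue")), term("crab"), term("fish")]

print(min_should_match(q1, 2, doc))  # True: blue and fish match
print(min_should_match(q2, 2, doc))  # False: only fish matches
```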

Interval queries and functions

Interval functions are a very powerful mechanism for selecting documents based on the presence and proximity of specific regions of text. Before we explain how interval functions work, we need to show how Lingo4G and Lucene index text data. When indexing, Lucene splits the text of each field in each document into tokens. Each token has an associated position in the token stream. For example, the following sentence:

The quick brown fox jumps over the lazy dog

could be transformed into the following token stream. The number in brackets is the token's position. Note that some token positions are "blank" (marked [-]): these positions reflect tokens omitted from the index, typically stop words.

The[-] quick[2] brown[3] fox[4] jumps[5] over[6] the[-] lazy[8] dog[9]

Intervals are contiguous spans between two token positions in a document. For example, consider this interval query for intervals between an ordered sequence of terms brown and dog: fn:ordered(brown dog). The query covers the following interval:

The quick brown fox jumps over the lazy dog

The result of this function is the entire span of terms between brown and dog. This type of function can be called an interval selector. The second class of interval functions works on top of other intervals and provides filters (interval restrictions).

In the above example, the matching interval can be of any length: if the word brown occurred at the beginning of the document and the word dog at the very end, the interval would be very long, covering the entire document. You can restrict the matching intervals to, for example, only those with at most 3 positions between the search terms: fn:maxgaps(3 fn:ordered(brown dog)). There are five tokens between the terms brown and dog (and therefore five "gaps" between the matching interval's positions), so the above query no longer matches the input document at all.
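The gap arithmetic can be checked with a toy position index in Python. The sketch assumes a single-word stop list and that omitted stop words still occupy a position; the helper names are hypothetical and only the shortest interval is returned, which is simpler than what Lucene does.

```python
STOP_WORDS = {"the"}  # assumed stop-word list for this illustration

def positions(text):
    """Map each indexed token to its 1-based positions. Stop words keep
    their position in the stream but are left out of the map."""
    index = {}
    for pos, token in enumerate(text.lower().split(), start=1):
        if token not in STOP_WORDS:
            index.setdefault(token, []).append(pos)
    return index

def ordered_pair(index, first, second):
    """Shortest (start, end) interval where `first` precedes `second`."""
    candidates = [(a, b)
                  for a in index.get(first, [])
                  for b in index.get(second, [])
                  if a < b]
    return min(candidates, key=lambda iv: iv[1] - iv[0], default=None)

def gaps(interval):
    """Number of positions strictly between the two matched terms."""
    start, end = interval
    return end - start - 1

index = positions("The quick brown fox jumps over the lazy dog")
interval = ordered_pair(index, "brown", "dog")

print(interval, gaps(interval))  # (3, 9) 5
print(gaps(interval) <= 3)       # False: fn:maxgaps(3 ...) rejects the interval
```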

Interval filtering functions allow expressing a variety of conditions ordinary Lucene queries can't easily cover. For example, consider this interval query that searches for words lazy or quick but only if they are in the neighborhood of 1 position from the words dog or fox:

fn:within(fn:or(lazy quick) 1 fn:or(dog fox))

The result of this query is shown below: only the word lazy matches. The word quick is 2 positions away from fox, so it fails the filter. The words dog and fox are not part of the match because they only form the interval's filtering condition.

The quick brown fox jumps over the lazy dog
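A minimal Python model of this filter treats every interval as a (start, end) pair and keeps the source intervals that fit inside a reference interval widened by the given number of positions. The positions are assumed (1-based, with stop words keeping their slot), and the widening rule is an assumption about how the filter behaves, not a statement about Lingo4G internals.

```python
# Assumed 1-based token positions; stop words keep their position.
POS = {"quick": 2, "brown": 3, "fox": 4, "jumps": 5,
       "over": 6, "lazy": 8, "dog": 9}

def term(*words):
    """Single-position intervals for each word, modeling fn:or(...)."""
    return [(POS[w], POS[w]) for w in words]

def within(source, max_positions, reference):
    """Source intervals inside a reference interval widened by
    `max_positions` on both sides."""
    return [
        (s_start, s_end)
        for (s_start, s_end) in source
        if any(r_start - max_positions <= s_start
               and s_end <= r_end + max_positions
               for (r_start, r_end) in reference)
    ]

# fn:within(fn:or(lazy quick) 1 fn:or(dog fox))
print(within(term("lazy", "quick"), 1, term("dog", "fox")))  # [(8, 8)]
# Widening to 2 positions also admits quick.
print(within(term("lazy", "quick"), 2, term("dog", "fox")))  # [(8, 8), (2, 2)]
```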

Interval functions

The enhanced query parser supports the following interval functions, grouped by similar functionality:

term queries
alternatives
length restrictions
context filtering
ordering
containment

Examples

All examples in the following description of interval functions assume a single document with the following content (tokens):

The quick brown fox jumps over the lazy dog

term literals

Quoted or unquoted character sequences are converted into an interval expression based on the sequence (or graph) of tokens returned by the field's analyzer. In most cases, the interval expression will be a contiguous sequence of tokens equivalent to that returned by the field's analysis chain.

Another way to express a contiguous sequence of terms is to use the fn:phrase function.

Examples
  • fn:or(quick "fox")

    The quick brown fox jumps over the lazy dog

  • "quick fox" (The document does not match because no adjacent terms quick fox exist.)

    The quick brown fox jumps over the lazy dog

  • fn:phrase(quick brown fox)

    The quick brown fox jumps over the lazy dog

fn:wildcard

Matches the disjunction of all terms that match a wildcard glob.

Heads up, clause count limit.

The expanded wildcard can cover a lot of terms. By default, Lingo4G limits the maximum number of such "expansions" to 128. You can override the default limit, but this can lead to excessive memory use or slow query execution.

Arguments

fn:wildcard(glob maxExpansions)

glob
term glob to expand based on the contents of the index.
maxExpansions
maximum acceptable number of term expansions before the function fails. This is an optional parameter.

Examples
  • fn:wildcard(jump*)

    The quick brown fox jumps over the lazy dog

  • fn:wildcard(br*n)

    The quick brown fox jumps over the lazy dog
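The expansion step and its limit can be pictured with Python's standard fnmatch module. The helper name and the error type are hypothetical; only the default limit of 128 comes from the description above.

```python
from fnmatch import fnmatchcase

def expand_wildcard(glob, index_terms, max_expansions=128):
    """Return all index terms matching the glob, failing if there are
    more matches than `max_expansions`."""
    matches = sorted(t for t in index_terms if fnmatchcase(t, glob))
    if len(matches) > max_expansions:
        raise ValueError(
            f"{glob!r} expands to {len(matches)} terms, "
            f"limit is {max_expansions}")
    return matches

# Indexed terms of the example document (stop words omitted).
index_terms = {"quick", "brown", "fox", "jumps", "over", "lazy", "dog"}

print(expand_wildcard("jump*", index_terms))  # ['jumps']
print(expand_wildcard("br*n", index_terms))   # ['brown']
```

A broad glob such as *o* matches four terms here (brown, dog, fox, over), so calling the helper with max_expansions=2 would raise an error instead of returning a partial result.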

fn:fuzzyTerm

Matches the disjunction of all terms that are within the given edit distance from the provided base.

Heads up, clause count limit.

The expanded set of terms can cover a lot of terms. By default, Lingo4G limits the maximum number of such "expansions" to 128. You can override the default limit, but this can lead to excessive memory use or slow query execution.

Arguments

fn:fuzzyTerm(glob maxEdits maxExpansions)

glob
the baseline term.
maxEdits
maximum number of edit operations for the transformed term to be considered equal (1 or 2).
maxExpansions
maximum acceptable number of term expansions before the function fails. This is an optional parameter.

Examples
  • fn:fuzzyTerm(box)

    The quick brown fox jumps over the lazy dog

fn:or

Matches the disjunction of nested intervals.

Arguments

fn:or(sources...)

sources
sub-intervals (terms or other functions)

Examples
  • fn:or(dog fox)

    The quick brown fox jumps over the lazy dog

fn:atLeast

Matches documents that contain at least the provided number of source intervals.

Arguments

fn:atLeast(min sources...)

min
an integer specifying the minimum number of sub-interval arguments that must match.
sources
sub-intervals (terms or other functions)

Examples
  • fn:atLeast(2 quick fox "furry dog")

    The quick brown fox jumps over the lazy dog

  • fn:atLeast(2 fn:unordered(furry dog) fn:unordered(brown dog) lazy quick) (This query results in multiple overlapping intervals.)

    The quick brown fox jumps over the lazy dog
    The quick brown fox jumps over the lazy dog
    The quick brown fox jumps over the lazy dog

fn:maxgaps

Accepts the source interval if it has at most the given number of position gaps.

Arguments

fn:maxgaps(gaps source)

gaps
an integer specifying the maximum number of the source's position gaps.
source
source sub-interval.

Examples
  • fn:maxgaps(0 fn:ordered(fn:or(quick lazy) fn:or(fox dog)))

    The quick brown fox jumps over the lazy dog

  • fn:maxgaps(1 fn:ordered(fn:or(quick lazy) fn:or(fox dog)))

    The quick brown fox jumps over the lazy dog

fn:maxwidth

Accepts the source interval if it has at most the given width (position span).

Arguments

fn:maxwidth(max source)

max
an integer specifying the maximum width of the source's position span.
source
source sub-interval.

Examples
  • fn:maxwidth(2 fn:ordered(fn:or(quick lazy) fn:or(fox dog)))

    The quick brown fox jumps over the lazy dog

  • fn:maxwidth(3 fn:ordered(fn:or(quick lazy) fn:or(fox dog)))

    The quick brown fox jumps over the lazy dog
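The difference between the two length restrictions (fn:maxgaps counts unmatched positions, fn:maxwidth counts all positions) can be sketched in Python. The positions are assumed (1-based, stop words keeping their slot), and the ordered helper is a simplification for single-position terms, not Lucene's interval algorithm.

```python
def ordered(firsts, seconds):
    """Minimal intervals where a position from `firsts` is followed by
    the nearest later position from `seconds`."""
    found = []
    for a in firsts:
        later = [b for b in seconds if b > a]
        if later:
            found.append((a, min(later)))
    # Keep only intervals that do not contain another candidate.
    return [iv for iv in found
            if not any(o != iv and iv[0] <= o[0] and o[1] <= iv[1]
                       for o in found)]

def width(interval):
    return interval[1] - interval[0] + 1

def gaps(interval):
    return width(interval) - 2  # two matched positions, the rest are gaps

# Assumed 1-based positions in the example document.
POS = {"quick": 2, "fox": 4, "lazy": 8, "dog": 9}

# fn:ordered(fn:or(quick lazy) fn:or(fox dog))
source = ordered([POS["quick"], POS["lazy"]], [POS["fox"], POS["dog"]])

def maxgaps(limit):
    return [iv for iv in source if gaps(iv) <= limit]

def maxwidth(limit):
    return [iv for iv in source if width(iv) <= limit]

print(source)       # [(2, 4), (8, 9)]
print(maxgaps(0))   # [(8, 9)]          lazy dog
print(maxgaps(1))   # [(2, 4), (8, 9)]  quick brown fox, lazy dog
print(maxwidth(2))  # [(8, 9)]
print(maxwidth(3))  # [(2, 4), (8, 9)]
```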

fn:phrase

Matches an ordered, gapless sequence of source intervals.

Arguments

fn:phrase(sources...)

sources
sub-intervals (terms or other functions)

Examples
  • fn:phrase(quick brown fox)

    The quick brown fox jumps over the lazy dog

  • fn:phrase(fn:ordered(quick fox) jumps)

    The quick brown fox jumps over the lazy dog

fn:ordered

Matches an ordered span containing all source intervals, possibly with gaps in between their respective source interval positions. Source intervals must not overlap.

Arguments

fn:ordered(sources...)

sources
sub-intervals (terms or other functions)

Examples
  • fn:ordered(quick jumps dog)

    The quick brown fox jumps over the lazy dog

  • fn:ordered(quick fn:or(fox dog)) (Note only the shorter match out of the two alternatives is included in the result; the algorithm is not required to return or highlight all matching interval alternatives).

    The quick brown fox jumps over the lazy dog

  • fn:ordered(quick jumps fn:or(fox dog))

    The quick brown fox jumps over the lazy dog

  • fn:ordered(fn:phrase(brown fox) fn:phrase(fox jumps)) (Sources overlap, no matches.)

    The quick brown fox jumps over the lazy dog

fn:unordered

Matches an unordered span containing all source intervals, possibly with gaps in between their respective source interval positions. Source intervals may overlap.

Arguments

fn:unordered(sources...)

sources
sub-intervals (terms or other functions)

Examples
  • fn:unordered(dog jumps quick)

    The quick brown fox jumps over the lazy dog

  • fn:unordered(fn:or(fox dog) quick) (Note only the shorter match out of the two alternatives is included in the result; the algorithm is not required to return or highlight all matching interval alternatives).

    The quick brown fox jumps over the lazy dog

  • fn:unordered(fn:phrase(brown fox) fn:phrase(fox jumps))

    The quick brown fox jumps over the lazy dog

fn:unorderedNoOverlaps

Matches an unordered span containing two source intervals, possibly with gaps in between their respective source interval positions. Source intervals must not overlap.

Note that, unlike fn:unordered, this function takes a fixed number of arguments (two).

Arguments

fn:unorderedNoOverlaps(source1 source2)

source1
sub-interval (term or other function)
source2
sub-interval (term or other function)

Examples
  • fn:unorderedNoOverlaps(fn:phrase(fox jumps) brown)

    The quick brown fox jumps over the lazy dog

  • fn:unorderedNoOverlaps(fn:phrase(brown fox) fn:phrase(fox jumps)) (Sources overlap, no matches.)

    The quick brown fox jumps over the lazy dog

fn:before

Matches intervals from the source that appear before intervals from the reference.

This is a filtering function: reference intervals will not be part of the match.

Arguments

fn:before(source reference)

source
source sub-interval (term or other function)
reference
reference sub-interval (term or other function)

Examples
  • fn:before(fn:or(brown lazy) fox)

    The quick brown fox jumps over the lazy dog

  • fn:before(fn:or(brown lazy) fn:or(dog fox))

    The quick brown fox jumps over the lazy dog

fn:after

Matches intervals from the source that appear after intervals from the reference.

This is a filtering function: reference intervals will not be part of the match.

Arguments

fn:after(source reference)

source
source sub-interval (term or other function)
reference
reference sub-interval (term or other function)

Examples
  • fn:after(fn:or(brown lazy) fox)

    The quick brown fox jumps over the lazy dog

  • fn:after(fn:or(brown lazy) fn:or(dog fox))

    The quick brown fox jumps over the lazy dog
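The fn:before and fn:after examples can be traced with a small Python model in which intervals are (start, end) pairs. The positions are assumed (1-based, stop words keeping their slot) and the comparison rules are a simplified reading of the descriptions above, not Lingo4G's implementation.

```python
# Assumed 1-based positions in the example document.
POS = {"brown": 3, "fox": 4, "lazy": 8, "dog": 9}

def term(*words):
    """Single-position intervals for each word, modeling fn:or(...)."""
    return [(POS[w], POS[w]) for w in words]

def before(source, reference):
    """Source intervals that end before some reference interval starts."""
    return [s for s in source if any(s[1] < r[0] for r in reference)]

def after(source, reference):
    """Source intervals that start after some reference interval ends."""
    return [s for s in source if any(s[0] > r[1] for r in reference)]

print(before(term("brown", "lazy"), term("fox")))         # [(3, 3)]          brown
print(before(term("brown", "lazy"), term("dog", "fox")))  # [(3, 3), (8, 8)]  brown, lazy
print(after(term("brown", "lazy"), term("fox")))          # [(8, 8)]          lazy
print(after(term("brown", "lazy"), term("dog", "fox")))   # [(8, 8)]          lazy
```

Only source intervals appear in the results; the reference terms fox and dog act purely as a filter.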

fn:extend

Matches an interval around another source, extending its span by a number of positions before and after.

This is an advanced function that allows extending the left and right "context" of another interval.

Arguments

fn:extend(source before after)

source
source sub-interval (term or other function)
before
an integer number of positions to extend to the left of the source
after
an integer number of positions to extend to the right of the source

Examples
  • fn:extend(fox 1 2)

    The quick brown fox jumps over the lazy dog

  • fn:extend(fn:or(dog fox) 2 0)

    The quick brown fox jumps over the lazy dog
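Extension is easiest to see as arithmetic on (start, end) pairs. The Python sketch below uses assumed 1-based positions and hypothetical helper names; it also shows how an extended interval can serve as the reference of a containment filter. Clamping at position 1 is an assumption of this model, and the end of the document is not clamped.

```python
# Assumed 1-based positions in the example document.
POS = {"fox": 4, "lazy": 8, "dog": 9}

def term(*words):
    """Single-position intervals for each word, modeling fn:or(...)."""
    return [(POS[w], POS[w]) for w in words]

def extend(source, before, after):
    """Widen each interval by `before` positions to the left and
    `after` positions to the right."""
    return [(max(1, start - before), end + after) for (start, end) in source]

def contained_by(source, reference):
    """Source intervals lying fully inside some reference interval."""
    return [s for s in source
            if any(r[0] <= s[0] and s[1] <= r[1] for r in reference)]

print(extend(term("fox"), 1, 2))         # [(3, 6)]          brown .. over
print(extend(term("dog", "fox"), 2, 0))  # [(7, 9), (2, 4)]  the lazy dog, quick brown fox

# Containment check against an extended reference: lazy widened by 3
# positions on each side covers positions 5..11, which includes dog only.
print(contained_by(term("fox", "dog"), extend(term("lazy"), 3, 3)))  # [(9, 9)]
```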

fn:within

Matches intervals of the source that appear within the provided number of positions from the intervals of the reference.

Arguments

fn:within(source positions reference)

source
source sub-interval (term or other function)
positions
an integer number of maximum positions between source and reference
reference
reference sub-interval (term or other function)

Examples
  • fn:within(fn:or(fox dog) 1 fn:or(quick lazy))

    The quick brown fox jumps over the lazy dog

  • fn:within(fn:or(fox dog) 2 fn:or(quick lazy))

    The quick brown fox jumps over the lazy dog

fn:notWithin

Matches intervals of the source that do not appear within the provided number of positions from the intervals of the reference.

Arguments

fn:notWithin(source positions reference)

source
source sub-interval (term or other function)
positions
an integer number of maximum positions between source and reference
reference
reference sub-interval (term or other function)

Examples
  • fn:notWithin(fn:or(fox dog) 1 fn:or(quick lazy))

    The quick brown fox jumps over the lazy dog

fn:containedBy

Matches intervals of the source that are contained by intervals of the reference.

Arguments

fn:containedBy(source reference)

source
source sub-interval (term or other function)
reference
reference sub-interval (term or other function)

Examples
  • fn:containedBy(fn:or(fox dog) fn:ordered(quick lazy))

    The quick brown fox jumps over the lazy dog

  • fn:containedBy(fn:or(fox dog) fn:extend(lazy 3 3))

    The quick brown fox jumps over the lazy dog

fn:notContainedBy

Matches intervals of the source that are not contained by intervals of the reference.

Arguments

fn:notContainedBy(source reference)

source
source sub-interval (term or other function)
reference
reference sub-interval (term or other function)

Examples
  • fn:notContainedBy(fn:or(fox dog) fn:ordered(quick lazy))

    The quick brown fox jumps over the lazy dog

  • fn:notContainedBy(fn:or(fox dog) fn:extend(lazy 3 3))

    The quick brown fox jumps over the lazy dog

fn:containing

Matches intervals of the source that contain at least one interval of the reference.

Arguments

fn:containing(source reference)

source
source sub-interval (term or other function)
reference
reference sub-interval (term or other function)

Examples
  • fn:containing(fn:extend(fn:or(lazy brown) 1 1) fn:or(fox dog))

    The quick brown fox jumps over the lazy dog

  • fn:containing(fn:atLeast(2 quick fox dog) jumps)

    The quick brown fox jumps over the lazy dog

fn:notContaining

Matches intervals of the source that do not contain any intervals of the reference.

Arguments

fn:notContaining(source reference)

source
source sub-interval (term or other function)
reference
reference sub-interval (term or other function)

Examples
  • fn:notContaining(fn:extend(fn:or(fox dog) 1 0) fn:or(brown yellow))

    The quick brown fox jumps over the lazy dog

  • fn:notContaining(fn:ordered(fn:or(the The) fn:or(fox dog)) brown)

    The quick brown fox jumps over the lazy dog

fn:overlapping

Matches intervals of the source that overlap with at least one interval of the reference.

Arguments

fn:overlapping(source reference)

source
source sub-interval (term or other function)
reference
reference sub-interval (term or other function)

Examples
  • fn:overlapping(fn:phrase(brown fox) fn:phrase(fox jumps))

    The quick brown fox jumps over the lazy dog

  • fn:overlapping(fn:or(fox dog) fn:extend(lazy 2 2))

    The quick brown fox jumps over the lazy dog

fn:nonOverlapping

Matches intervals of the source that do not overlap with any intervals of the reference.

Arguments

fn:nonOverlapping(source reference)

source
source sub-interval (term or other function)
reference
reference sub-interval (term or other function)

Examples
  • fn:nonOverlapping(fn:phrase(brown fox) fn:phrase(lazy dog))

    The quick brown fox jumps over the lazy dog

  • fn:nonOverlapping(fn:or(fox dog) fn:extend(lazy 2 2))

    The quick brown fox jumps over the lazy dog

defaultFields

Type
array of string
Default
[]
Required
no

An array of field names to search for query terms without an explicit field name qualifier. For example, the data title:mining query contains one unqualified term: data, and one with an explicit field qualifier: title:mining. If defaultFields was equal to ["title", "abstract"], Lingo4G would rewrite the query to (title:data OR abstract:data) title:mining.

If you do not provide defaultFields or set it to an empty array, Lingo4G will raise errors for queries containing terms without explicit field qualifiers.
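The rewrite can be pictured with a small Python helper. It is a hypothetical sketch that only handles whitespace-separated single-term clauses and does not reflect how the parser itself is implemented.

```python
def expand_default_fields(query, default_fields):
    """Qualify bare terms with every default field; leave field-qualified
    clauses unchanged."""
    clauses = []
    for clause in query.split():
        if ":" in clause:
            clauses.append(clause)  # already field-qualified
        elif not default_fields:
            raise ValueError(
                f"no default fields for unqualified term {clause!r}")
        else:
            alternatives = " OR ".join(f"{f}:{clause}" for f in default_fields)
            clauses.append(f"({alternatives})")
    return " ".join(clauses)

print(expand_default_fields("data title:mining", ["title", "abstract"]))
# (title:data OR abstract:data) title:mining
```

With an empty defaultFields list, the same call raises an error for the unqualified term data, mirroring the behavior described above.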

defaultOperator

Type
string
Default
"AND"
Constraints
one of [AND, OR]
Required
no

The default Boolean operator Lingo4G applies to each clause of the query, unless you explicitly provide the operator to use.

For example, with the defaultOperator equal to AND, Lingo4G rewrites the data mining query to data AND mining.

The defaultOperator property supports the following values:

AND

Conjunction operator.

OR

Disjunction operator.

sanitizeSpaces

Type
string
Default
"(?U)\\p{Blank}+"
Required
no

Before parsing the query, Lingo4G replaces each occurrence of the regular expression pattern you provide in the sanitizeSpaces property with a single space character. The default pattern normalizes any run of Unicode blank characters, such as spaces and tabs, into one plain space. To disable the replacement, set sanitizeSpaces to an empty string.
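The effect of the default pattern can be approximated in Python. Python's re module does not support the (?U) flag or \p{Blank}, so the character class below is a hand-written stand-in (Unicode space separators plus tab) and should be read as an assumption about what the Java pattern matches.

```python
import re

# Stand-in for Java's (?U)\p{Blank}+ : Unicode space separators plus tab.
BLANK_RUN = re.compile(
    "[\t\u0020\u00a0\u1680\u2000-\u200a\u202f\u205f\u3000]+")

def sanitize_spaces(query, pattern=BLANK_RUN):
    """Replace each run matched by `pattern` with one plain space.
    Passing None models an empty sanitizeSpaces value (no replacement)."""
    if pattern is None:
        return query
    return pattern.sub(" ", query)

# No-break space, em space and tabs between the words.
raw = "data\u00a0\u2003mining\t \tquery"

print(sanitize_spaces(raw))                # data mining query
print(sanitize_spaces(raw, None) == raw)   # True
```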

validateFields

Type
boolean
Default
true
Required
no

If true, Lingo4G raises an error if the query contains a field name qualifier referring to a field that does not exist in the index. Field name validation ensures that accidental typos in field names result in errors rather than empty search results.
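As a rough illustration of the check, the Python sketch below extracts field qualifiers with a regular expression and compares them with a set of known fields. The helper, the regular expression and the field names are hypothetical; the regular expression ignores the fn: function prefix and would misread colons inside quoted phrases.

```python
import re

# A word followed by a colon, not preceded by a word character or backslash.
FIELD_QUALIFIER = re.compile(r"(?<![\w\\])([A-Za-z_]\w*):")

def validate_fields(query, index_fields):
    """Raise ValueError if the query refers to a field that is not indexed."""
    unknown = sorted({f for f in FIELD_QUALIFIER.findall(query)
                      if f != "fn" and f not in index_fields})
    if unknown:
        raise ValueError(f"unknown field(s) in query: {', '.join(unknown)}")

index_fields = {"title", "abstract"}  # assumed fields of the index

# Passes silently: both qualifiers exist.
validate_fields("title:test AND abstract:(pass* OR fail*)", index_fields)

# A typo in the field name is reported instead of returning no results.
try:
    validate_fields("titel:test", index_fields)
except ValueError as error:
    print(error)  # unknown field(s) in query: titel
```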