2.0.x release notes
Release notes for Lingo4G 2.0.x.
Version 2.0.5
The 2.0.5 release squashes a few minor bugs.
Compatibility
Lingo4G 2.0.5 is backward-compatible with previous 2.0.x releases and works with indices created by any 2.0.x release.
Bug fixes
- Slow parsing of requests
-
API V2 requests containing large arrays of primitive types were slow to parse and process.
Other changes
Version 2.0.4
The 2.0.4 release squashes a few minor bugs.
Compatibility
Lingo4G 2.0.4 is backward-compatible with previous 2.0.x releases and works with indices created by any 2.0.x release.
Bug fixes
- --output option fixed in run-request
-
Previous versions of Lingo4G would ignore the --output option of the run-request command. Version 2.0.4 fixes the issue.
- Incorrect rounding of variable values in Explorer
-
Previous versions of the Lingo4G Explorer app could incorrectly round the values of certain variables in the request variables editor. Version 2.0.4 fixes the issue.
Improvements
- Truncated JSON arrays warning
-
Lingo4G Explorer now displays a warning if the JSON response view truncates large arrays for better display performance.
Version 2.0.3
The 2.0.3 release brings minor improvements to example data sources and fixes for display problems in the Explorer (map view).
Compatibility
Lingo4G 2.0.3 is backward-compatible with previous 2.0.x releases and works with indices created by the 2.0.0, 2.0.1 and 2.0.2 releases.
Bug fixes
- Potential assertion error in duplicate detection
-
Duplicate detection could throw an internal assertion error stating that internal iterators are not properly sorted.
- Document map settings in Explorer v1 bug fix
-
Explorer v1 included in Lingo4G 2.0.2 would reset certain document map settings to their default on each new analysis. Version 2.0.3 fixes the issue.
- Document map display fixes on fractional device pixel ratios
-
The document map could display truncated labels on devices with fractional device pixel ratios. This release adds a workaround for the problem.
Other changes
- Zstandard compression support
-
The JSON records document source now supports reading zstd-compressed files.
- PubMed example shows up-to-date URLs automatically
-
The PubMed example will try to fetch and parse bulk data file URLs for your convenience.
Version 2.0.2
The 2.0.2 release fixes reporting of multi-value field count in the documentContent stage and updates dotAtlas to fix jittery zooming of the document map.
Compatibility
Lingo4G 2.0.2 is backward-compatible with previous 2.0.x releases and works with indices created by the 2.0.0 and 2.0.1 releases.
Bug fixes
- valueCount parameter was ignored
-
Lingo4G ignored the valueCount property and never emitted the field value count. Version 2.0.2 fixes the issue and also corrects the documentation of the property.
- Jittery zooming of the document map
-
Zooming of the document map view was jittery in both the legacy and the current version of Lingo4G Explorer. Version 2.0.2 fixes the issue.
Version 2.0.1
Version 2.0.1 improves the initial loading time of the Lingo4G documentation. It does not make any changes to the Lingo4G software.
Version 2.0.0
Lingo4G 2.0.0 adds analysis API v2: a new flexible API for building and running diverse text processing pipelines. It also comes with a modernized Lingo4G Explorer v2 application.
See the Version 1.x vs 2.x article for an overview of what's changed and what remained the same.
Compatibility
- Project descriptor
-
Update recommended. Lingo4G 2.0.0 maintains compatibility with Lingo4G 1.x project descriptors. We recommend applying one change to your existing project descriptors to make the analysis API v2 easier to use.
- Reindexing
-
Required. Lingo4G 2.0.0 does not work with indices created with earlier versions of Lingo4G. You need to perform full indexing to open your existing projects with Lingo4G 2.0.0.
- REST API v1
-
Available, but enters maintenance mode. Lingo4G 2.0.0 preserves the REST API available in the 1.x line. All software you created against Lingo4G 1.x will also work with Lingo4G 2.0.0.
As of version 2.0.0, the REST API v1 enters maintenance mode: it will only receive critical bug fixes. All new analysis features, such as document embeddings introduced in version 2.0.0, will be exposed only in the analysis API v2.
New features
- Analysis API v2
-
Version 2.0.0 adds a new, flexible way of executing analyses. You can use analysis API v2 to build requests of varying complexity, ranging from simple query-based document search, through clustering of documents or labels, to generating a time series of 2D document maps and finding near-duplicate documents.
For more information, see:
- Lingo4G Explorer v2
-
Version 2.0.0 comes with a modernized Lingo4G Explorer v2. Currently, Lingo4G Explorer v2 offers the JSON Sandbox app for authoring, executing and debugging analysis API v2 requests.
Lingo4G Explorer running the JSON Sandbox app.
- Duplicate detection
-
You can use Lingo4G 2.0.0 to identify pairs of documents with overlapping content. The degree of overlap can range from entire documents (exact duplicates), through almost all the content (near duplicates), to partial overlap (sentences, paragraphs). Lingo4G can also highlight the overlapping areas of documents for easier inspection of the results.
- Document embeddings
-
Lingo4G 2.0.0 can learn multidimensional embeddings for documents. You can use analysis API v2 to compute embedding-based similarities between documents.
See the learning embeddings article for more information and limitations of the current implementation.
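To give a rough idea of what an embedding-based similarity is, the sketch below computes cosine similarity between two document vectors. This is a generic illustration only: these notes do not state which similarity measure Lingo4G uses, and the function name and the vectors are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Similarity with an all-zero vector is undefined; report 0.0 here.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional embeddings of two documents.
doc_a = [0.2, 0.7, 0.1, 0.4]
doc_b = [0.1, 0.8, 0.0, 0.5]
print(cosine_similarity(doc_a, doc_b))
```

Values close to 1 indicate vectors pointing in nearly the same direction, which is usually read as similar content.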
API changes
- analysis_v2 project descriptor section
-
Version 2.0.0 adds the optional analysis_v2 section to the project descriptor to specify defaults, such as feature field names, for the analysis API v2.
We recommend adding to your existing project descriptor an analysis_v2 section similar to the following:
"analysis_v2": {
  "components": {
    "featureFields": {
      "type": "featureFields:simple",
      "fields": [ "title$phrases", "abstract$phrases" ]
    },
    "contentFields": {
      "type": "contentFields:simple",
      "fields": {
        "id": {},
        "title": {},
        "abstract": {},
        "category": {}
      }
    },
    "labelFilter": {
      "type": "labelFilter:autoStopLabels"
    }
  }
}
Adapt the featureFields and contentFields blocks to the feature and content fields available in your project:
- In the fields array of the featureFields component, provide names of the feature fields from which Lingo4G should extract labels. You can use the same feature fields as in the analysis.source.labels.fields section that should already exist in your descriptor. See the featureFields:simple reference for more details.
- In the fields object of the contentFields component, provide names of content fields which you would like Lingo4G to retrieve when displaying the contents of documents. Objects inside the fields object configure the retrieval details, such as maximum length and label highlighting, for each field. Our example uses empty objects, sticking to default retrieval settings for each field. See the contentFields:simple reference for a complete example.
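If you maintain several project descriptors, you can generate the section instead of writing it by hand. The following is a minimal sketch, not an official tool. It assumes the descriptor is already parsed into a Python dict and that entries of analysis.source.labels.fields are either plain strings or objects with a "name" key; check that assumption against your own descriptor. The helper names feature_field_names and build_analysis_v2 are hypothetical.

```python
import json


def feature_field_names(descriptor):
    """Collect feature field names from analysis.source.labels.fields.

    Assumption: each entry is either a plain string or an object with a
    "name" key; entries of any other shape are skipped.
    """
    labels = descriptor.get("analysis", {}).get("source", {}).get("labels", {})
    names = []
    for entry in labels.get("fields", []):
        if isinstance(entry, str):
            names.append(entry)
        elif isinstance(entry, dict) and "name" in entry:
            names.append(entry["name"])
    return names


def build_analysis_v2(feature_fields, content_fields):
    """Build an analysis_v2 section shaped like the example above."""
    return {
        "components": {
            "featureFields": {
                "type": "featureFields:simple",
                "fields": list(feature_fields),
            },
            "contentFields": {
                "type": "contentFields:simple",
                # Empty objects keep the default retrieval settings.
                "fields": {name: {} for name in content_fields},
            },
            "labelFilter": {"type": "labelFilter:autoStopLabels"},
        }
    }


# Hypothetical, already-parsed 1.x descriptor used only for illustration.
descriptor = {
    "analysis": {
        "source": {
            "labels": {
                "fields": [{"name": "title$phrases"}, {"name": "abstract$phrases"}]
            }
        }
    }
}

section = build_analysis_v2(
    feature_field_names(descriptor),
    ["id", "title", "abstract", "category"],
)
print(json.dumps({"analysis_v2": section}, indent=2))
```

The printed fragment can then be merged into the descriptor by hand, which keeps the rest of the file untouched.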
Previous releases
For Lingo4G 1.x release notes, see the v1 documentation.