Clustering Engine Manual for version 2.0.0-beta2
This manual provides detailed information about the Carrot Search Lingo3G document clustering engine. It includes a general overview of Lingo3G and describes its Java API, HTTP/REST service API, tuning attributes, and configuration files.
What is Lingo3G?
Lingo3G is a text clustering engine that organizes small to medium collections of documents into clearly labeled thematic groups called clusters. Clustering is performed in real time and fully automatically, and is based only on the text fields provided for each document. Lingo3G's algorithm is designed to deliver high-quality, semantically coherent clusters quickly.
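To make the input and output shapes concrete, here is a minimal, self-contained sketch of what "labeled thematic groups" means. It is not Lingo3G's algorithm and does not use the Lingo3G API. The class `ClusterShapeSketch`, the `Doc` type, and the `groupByLabel` method are hypothetical names made up for this illustration. The candidate labels are supplied by hand, whereas Lingo3G discovers labels automatically from the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ClusterShapeSketch {

    /** A document reduced to the text fields a clustering engine would see. */
    static final class Doc {
        final String title;
        final String snippet;

        Doc(String title, String snippet) {
            this.title = title;
            this.snippet = snippet;
        }
    }

    /**
     * Toy grouping (an assumption for illustration, NOT Lingo3G's method):
     * a document joins every cluster whose label occurs in its title or
     * snippet, ignoring case. Labels that match no document are dropped.
     */
    static Map<String, List<Doc>> groupByLabel(List<Doc> docs, List<String> candidateLabels) {
        Map<String, List<Doc>> clusters = new LinkedHashMap<>();
        for (String label : candidateLabels) {
            String needle = label.toLowerCase(Locale.ROOT);
            List<Doc> members = new ArrayList<>();
            for (Doc doc : docs) {
                String text = (doc.title + " " + doc.snippet).toLowerCase(Locale.ROOT);
                if (text.contains(needle)) {
                    members.add(doc);
                }
            }
            if (!members.isEmpty()) {
                clusters.put(label, members);
            }
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Hypothetical search results for the ambiguous query "jaguar".
        List<Doc> docs = Arrays.asList(
            new Doc("Jaguar Cars official site", "Luxury cars and SUVs"),
            new Doc("Jaguar (animal)", "The jaguar is a large cat native to the Americas"),
            new Doc("Used cars for sale", "Find a used Jaguar near you"),
            new Doc("Big cat conservation", "Protecting the jaguar and other large cat species"));

        Map<String, List<Doc>> clusters = groupByLabel(docs, Arrays.asList("cars", "large cat"));

        // Print each cluster label followed by the titles of its documents.
        for (Map.Entry<String, List<Doc>> cluster : clusters.entrySet()) {
            System.out.println(cluster.getKey() + " (" + cluster.getValue().size() + ")");
            for (Doc doc : cluster.getValue()) {
                System.out.println("  - " + doc.title);
            }
        }
    }
}
```

The point of the sketch is the data shape: a flat list of documents with text fields goes in, and a set of labeled groups comes out. A document may belong to more than one group in this toy version.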
Lingo3G can turn, for example, search result titles and snippets into groups like these:
Lingo3G is a programming component (Java library). Programming knowledge is required to use it. The distribution package comes with examples and demonstration applications.
What's in the box?
The Lingo3G distribution comes with:
- the Java API that can be used independently or as an automatically loaded algorithm within the Carrot2 project,
- the REST service for mash-ups or integration with languages other than Java,
- the Search Results Clustering demo application,
- the Clustering Workbench application for more advanced users,
- code snippets and examples for free reuse in your code.
Additionally, several downstream projects provide integration between Carrot2 and popular document retrieval services: