Heart of Gold - Download of Middleware and NLP Components
The Heart of Gold Middleware (Java) source code is licensed under the GNU Lesser General Public License (LGPL).
'Components' below are not part of the Middleware, but independent, individual natural language processing software.
Most components of the Heart of Gold available for download from this web page can be
obtained free of charge for scientific research and teaching (e.g.,
SProUT, TnT, Chunkie).
NLP components developed by external authors, institutions or companies (e.g., RASP, LingPipe, ChaSen)
are published under different licenses.
Please check the respective web sites mentioned below.
For a minimal system with HPSG parsing (German or English), you need the 'Middleware', the TnT tagger, and PET with a grammar.
The table shows which components and linguistic resources have been integrated so far.
Component | Default Depth | Language resources (ISO 639-1 language codes) |
JTok | 10 | de, en, it |
ChaSen | 10 | ja |
TnT | 20 | de, en |
FreeLing | 20 | ca, en, es, gl, it |
TreeTagger | 20 | de, en, es, fr, it |
Chunkie | 30 | de, en |
ChunkieRMRS | 35 | de, en |
LingPipe | 40 | en |
SProUT | 40 | de, el, en, ja |
LoPar/Whiteboard Topoparser | 50 | de |
RASP | 50 | en |
Sleepy | 50 | de |
PET | 100 | de, el, en, ja |
RMRSMerge | 110 | independent |
SDL | | independent |
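To illustrate how the table can be read, the following Python sketch transcribes a subset of its rows and lists the components available for a given language, ordered by default depth. It assumes that a lower depth means shallower processing that runs earlier in a pipeline. The names COMPONENTS and components_for are hypothetical and not part of the Heart of Gold API.

```python
# A subset of the component table above: name -> (default depth, languages).
# Language-independent components are modelled with None instead of a set.
COMPONENTS = {
    "JTok": (10, {"de", "en", "it"}),
    "ChaSen": (10, {"ja"}),
    "TnT": (20, {"de", "en"}),
    "Chunkie": (30, {"de", "en"}),
    "SProUT": (40, {"de", "el", "en", "ja"}),
    "PET": (100, {"de", "el", "en", "ja"}),
    "RMRSMerge": (110, None),
}


def components_for(lang, max_depth=None):
    """Return the names of components usable for `lang`, shallowest first.

    `max_depth` optionally restricts the result to components whose
    default depth does not exceed the given value.
    """
    rows = []
    for name, (depth, langs) in COMPONENTS.items():
        if langs is not None and lang not in langs:
            continue  # component has no resources for this language
        if max_depth is not None and depth > max_depth:
            continue  # deeper than requested
        rows.append((depth, name))
    return [name for _, name in sorted(rows)]


print(components_for("de"))
# ['JTok', 'TnT', 'Chunkie', 'SProUT', 'PET', 'RMRSMerge']
```

Calling components_for("en", max_depth=40) would restrict the list to the shallow components in this subset.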
All archives below include the root directory './', except external systems like RASP and LingPipe which have to be unpacked or compiled to the directory specified under 'Location:'.
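Because the archives share the root directory './', they can all be unpacked into the same installation directory. The following Python sketch shows one way to do this with the standard tarfile module. The function name unpack_into_root and the installation directory are assumptions made for illustration; the install script mentioned under 'Middleware' remains the documented route.

```python
import tarfile
from pathlib import Path


def unpack_into_root(archives, root):
    """Extract each .tar.gz archive into the common directory `root`.

    Returns the relative paths of all files found under `root` afterwards.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    # Newer Python versions offer extraction filters; use the safe
    # "data" filter when it is available.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    for archive in archives:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(root, **kwargs)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# Example call (archive names as listed on this page; the target
# directory "hog" is an arbitrary choice):
# unpack_into_root(
#     ["hog-1.5-bin.tar.gz", "components-pet-binlib.tar.gz",
#      "components-pet-erg.tar.gz"],
#     "hog",
# )
```

External systems such as RASP and LingPipe are not covered by this sketch, since they have to be unpacked or compiled to the directory given under 'Location:'.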
For installation instructions (including component-specific configuration) and for hardware and software requirements, see the User and Developer Documentation.
Currently, Linux x86 is the only fully supported platform. The sole reason is that some components are available for Linux only, or that their module adapter has been implemented for Linux only. The core middleware itself, together with the components implemented in Java (e.g., JTok, LingPipe, SProUT), should run wherever JDK 1.5 is supported, e.g., Windows, Mac OS, or Solaris. Contributions to ports and increased portability are welcome, as are bug reports, bug fixes, comments, and contributions of new components or resources.
If you publish on systems, software, applications, experiments or results that you have achieved with the help of the Heart of Gold middleware, we kindly ask you to cite an appropriate reference mentioned under Publications and/or Components.
Contributors (Heart of Gold middleware):
Concept: Ulrich Callmeier, Andreas Eisele, Ulrich Schäfer, Melanie Siegel
Implementation: Robert Barbey, Özgür Demir, Ulrich Schäfer
JTok and Module configuration and Launcher: Jörg Steffen
PET extensions (XML input chart): Bernd Kiefer
SDL: Hans-Ulrich Krieger
RMRS construction from chunks: Anette Frank and Kathrin Spreyer
RMRS merging stylesheets: Anette Frank
rmrs2html.xsl: Thomas Klöcker and Ulrich Schäfer, with ideas and styles
borrowed from Stephan Oepen's JavaScript code for MRS (lkb.js)
Web demo: Özgür Demir
LoParModule, Port of the Whiteboard topoparser XSLT pipeline: Daniel Contag
RASP2Module, deployment: Torsten Marek
Downloads:
Middleware
Description: | Sources of the Java middleware including component adapters ('modules'), Python demo clients, stylesheets and configuration files |
Institution: | DFKI Language Technology Lab |
License: | LGPL, parts are Apache Software License (ant, log4j, xml2html.xsl) |
Requirements: | Java JDK 1.5 (for the GUI client application: Python >= 2.2 with Python Tk and Mozilla >= 1.3 on X11) |
Location: | ./{conf,java,python,xsl,lib,components/jtok} |
Download: | hog-1.5-src.tar.gz, hog-1.5-bin.tar.gz |
| Alternative for src package (subversion): |
| svn checkout https://heartofgold.opendfki.de/repos/trunk hog-1.5 |
| Installation script: install (instructions inside) |
Publications: | Publications |
JTok
Description: | Configurable Tokenizer implemented in Java |
Ling. resources: | en, de, it (additional languages can be added via XML configuration files) |
Institution: | DFKI Language Technology Lab |
License: | LGPL |
Requirements: | Java 1.5 |
Location: | ./components/jtok |
Download: | (included in the middleware bin archive) |
TnT
Chunkie
ChunkieRMRS
SProUT
Description: | General-purpose linguistic processor that combines finite-state and unification-based approaches, e.g., for named entity recognition, information extraction, tokenization, morphological analysis, compound segmentation, sentence boundary recognition, and coreference resolution |
Ling. resources: | de, el, en, ja |
Institution: | DFKI Language Technology Lab |
License: | LGPL |
Requirements: | Java 1.5 |
Location: | ./components/sprout |
Download: | runtime for Heart of Gold including language resources for de, el, en, ja: components-sprout.tar.gz |
| Integrated Development Environment: http://sprout.dfki.de |
Publications: | Publications/SProUT |
LoPar (external)/Whiteboard Topoparser
SDL
Description: | Description language for NLP subarchitectures |
Institution: | DFKI Language Technology Lab |
License: | LGPL |
Requirements: | Java |
Location: | ./java, ./xsl/sdl |
Download: | (included in the middleware src/bin archives) |
Publications: | Publications/SDL |
PET
Description: | HPSG parser |
Ling. resources: | separate grammar dumps (grammar development with LKB), see 'Download' |
Institution: | Saarland University, Computational Linguistics Department, DFKI Language Technology Lab |
License: | LGPL, parts are Apache Software License and others |
Requirements: | Linux |
Location: | ./components/pet |
Download: | - binary of the HPSG parser for Heart of Gold: components-pet-binlib.tar.gz, source: notes; pet.opendfki.de |
| Each of the following grammars comes with its own license! |
| - English Resource Grammar (ERG, Stanford): binary for Heart of Gold: components-pet-erg.tar.gz, source: http://lingo.stanford.edu/erg |
| - German Grammar (GG, DFKI): binary for Heart of Gold: components-pet-german.tar.gz; alternatively, generate it from the GG download package: unpack the package, run make hog, and unpack the generated gg4hog.tar.gz to ./components/pet/german/; source: http://gg.opendfki.de |
| - Japanese 'JACY' Grammar (DFKI): binary for Heart of Gold: components-pet-japanese.tar.gz, source: http://jacy.opendfki.de |
| - Modern Greek Grammar (Saarland U.): http://www.delph-in.net/mgrg/ |
| - Spanish Grammar (UPF Barcelona): http://www.delph-in.net/srg/ |
Publications: | Publications/PET |
RMRSMerge
LingPipe (external)
RASP (external)
ChaSen (external)
Sleepy (external)
TreeTagger (external)
FreeLing (external), preliminary integration
Description: | Part-of-speech tagger, morphology, named entity recognition |
Ling. resources: | ca, en, es, gl, it |
Institutions: | Universitat Politècnica de Catalunya, TALP Research Center |
License: | LGPL |
Requirements: | Linux with Berkeley DB (version 4.1.25 or higher), pcre (version 4.3 or higher), libcfg+ (version 0.6.1 or higher) |
Location: | ./components/freeling |
Download: | (external) http://www.lsi.upc.es/~nlp/freeling/ |
Publications: | http://www.lsi.upc.es/~nlp/freeling/, additional LKBwrapper required, in components-freeling-sppp.tar.gz |
Install hints: | see FreeLing section in the Heart of Gold User and Developer Documentation, README in components-freeling-sppp.tar.gz |
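The minimum versions listed under 'Requirements' can be checked before installation. The following Python sketch compares version strings against those minimums. How the installed versions are obtained is left open, and the names FREELING_MINIMUMS, parse_version and unmet_requirements are hypothetical helpers, not part of FreeLing or Heart of Gold.

```python
# Minimum versions as stated in the FreeLing requirements above.
FREELING_MINIMUMS = {
    "Berkeley DB": "4.1.25",
    "pcre": "4.3",
    "libcfg+": "0.6.1",
}


def parse_version(text):
    """Turn a dotted version string such as '4.1.25' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def unmet_requirements(installed):
    """Return the names of libraries that are missing or too old.

    `installed` maps library names to version strings; how these are
    determined on a given system is outside the scope of this sketch.
    """
    unmet = []
    for name, minimum in FREELING_MINIMUMS.items():
        found = installed.get(name)
        if found is None or parse_version(found) < parse_version(minimum):
            unmet.append(name)
    return unmet


print(unmet_requirements({"Berkeley DB": "4.1.25", "pcre": "4.2"}))
# ['pcre', 'libcfg+']
```

The sketch only handles purely numeric dotted versions; version strings with suffixes would need extra parsing.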