A B C D E F G I J L M N O P R S T X

A

ABBREV_RULE - Static variable in class de.dfki.lt.tools.tokenizer.AbbrevDescription
This is the name of the abbreviation rule.
ALL_RULE - Static variable in class de.dfki.lt.tools.tokenizer.PunctDescription
This is the name of the all punctuation rule.
AbbrevDescription - class de.dfki.lt.tools.tokenizer.AbbrevDescription.
AbbrevDescription extends Description.
AbbrevDescription(Document, Set, String) - Constructor for class de.dfki.lt.tools.tokenizer.AbbrevDescription
This creates a new instance of AbbrevDescription for the abbreviation description contained in the DOM Document abbrDescr.
AnnotatedString - interface de.dfki.lt.tools.tokenizer.annotate.AnnotatedString.
AnnotatedString is an interface for annotating strings and working on them.
annotate(String, Object, int, int) - Method in interface de.dfki.lt.tools.tokenizer.annotate.AnnotatedString
Adds an annotation to a subrange of the string.
annotate(String, Object, int, int) - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
Adds an annotation to a subrange of the string.

B

BORDER_ANNO - Static variable in class de.dfki.lt.tools.tokenizer.JTok
This is the annotation key for sentences and paragraph borders.

C

CLASS_ANNO - Static variable in class de.dfki.lt.tools.tokenizer.JTok
This is the annotation key for the token class.
CLITIC_RULE - Static variable in class de.dfki.lt.tools.tokenizer.PunctDescription
This is the name of the clitic punctuation rule.
CLOSE_PUNCT - Static variable in class de.dfki.lt.tools.tokenizer.PunctDescription
This is the class name for closing punctuation.
CliticsDescription - class de.dfki.lt.tools.tokenizer.CliticsDescription.
CliticsDescription extends Description.
CliticsDescription(Document, Set) - Constructor for class de.dfki.lt.tools.tokenizer.CliticsDescription
This creates a new instance of CliticsDescription for the clitics description contained in the DOM Document clitDescr.
charAt(int) - Method in interface de.dfki.lt.tools.tokenizer.annotate.AnnotatedString
Returns the character from the specified position without changing the index.
charAt(int) - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This returns the character from the specified position without changing the index.
clone() - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This creates a copy of this object.
contains(String) - Method in class de.dfki.lt.tools.tokenizer.regexp.JavaRegExp
This checks if the input contains a match for the regular expression.
contains(String) - Method in interface de.dfki.lt.tools.tokenizer.regexp.RegExp
This specifies a method signature that checks if the input contains a match for the regular expression.
copyFile(File, File) - Static method in class de.dfki.lt.tools.tokenizer.FileTools
This simply copies a source file to a target file.
createParagraphs(AnnotatedString) - Static method in class de.dfki.lt.tools.tokenizer.output.ParagraphOutputter
This creates a List of Paragraphs with TextUnit and Token from an annotated input.
createRegExp(String) - Method in class de.dfki.lt.tools.tokenizer.regexp.JavaRegExpFactory
This creates a regular expression object from an input string.
createRegExp(String) - Method in class de.dfki.lt.tools.tokenizer.regexp.RegExpFactory
This specifies a method signature that creates a regular expression object from an input string.
createXMLDocument(AnnotatedString, LanguageResource) - Static method in class de.dfki.lt.tools.tokenizer.output.XMLOutputter
This creates an XML document from an annotated input.
createXMLString(AnnotatedString, LanguageResource) - Static method in class de.dfki.lt.tools.tokenizer.output.XMLOutputter
This creates an XML string from an annotated input.
current() - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This gets the character at the current position (as returned by getIndex()).

D

DEFS - Static variable in class de.dfki.lt.tools.tokenizer.Description
This is the name of the element with the definitions in the description files.
DEF_CLASS - Static variable in class de.dfki.lt.tools.tokenizer.Description
This is the attribute of a definition or list element that contains the class name.
DEF_REGEXP - Static variable in class de.dfki.lt.tools.tokenizer.Description
This is the attribute of a definition element that contains the regular expression.
DIGITS_RULE - Static variable in class de.dfki.lt.tools.tokenizer.NumbersDescription
This is the name of the digits rule.
Description - class de.dfki.lt.tools.tokenizer.Description.
Description is an abstract class that provides common methods to manage the content of description files.
Description() - Constructor for class de.dfki.lt.tools.tokenizer.Description
 
de.dfki.lt.tools.tokenizer - package de.dfki.lt.tools.tokenizer
 
de.dfki.lt.tools.tokenizer.annotate - package de.dfki.lt.tools.tokenizer.annotate
 
de.dfki.lt.tools.tokenizer.exceptions - package de.dfki.lt.tools.tokenizer.exceptions
 
de.dfki.lt.tools.tokenizer.output - package de.dfki.lt.tools.tokenizer.output
 
de.dfki.lt.tools.tokenizer.regexp - package de.dfki.lt.tools.tokenizer.regexp
 
definitionsMap - Variable in class de.dfki.lt.tools.tokenizer.Description
This maps a class to a regular expression that matches all tokens of this class.

E

ENCLITIC_RULE - Static variable in class de.dfki.lt.tools.tokenizer.CliticsDescription
This is the name of the enclitic rule.

F

FACTORY - Static variable in class de.dfki.lt.tools.tokenizer.Description
This is the factory for creating regular expressions.
FastAnnotatedString - class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString.
FastAnnotatedString is a fast implementation of the AnnotatedString interface.
FastAnnotatedString(String) - Constructor for class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This creates a new instance of FastAnnotatedString for the text in inputString.
FileTools - class de.dfki.lt.tools.tokenizer.FileTools.
FileTools provides static methods to work on files and streams.
findNextAnnotation(String) - Method in interface de.dfki.lt.tools.tokenizer.annotate.AnnotatedString
Returns the index of the first character annotated with the given annotation key following the run containing the current character with respect to the given annotation key.
findNextAnnotation(String) - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This returns the index of the first character annotated with the given annotation key following the run containing the current character with respect to the given annotation key.
first() - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This sets the position to getBeginIndex() and returns the character at that position.
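The cursor methods listed for FastAnnotatedString (first(), next(), previous(), last(), current(), getIndex(), setIndex(int)) share their names with java.text.CharacterIterator. This index does not say whether FastAnnotatedString implements that interface, so the following sketch uses the JDK's StringCharacterIterator as a stand-in to show the traversal pattern these names suggest. The class CursorSketch and the method walk are illustrative names only.

```java
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;

public class CursorSketch {

  // Walks the text front to back with first()/next() and records
  // each character together with the index it was read from.
  static String walk(String text) {
    CharacterIterator it = new StringCharacterIterator(text);
    StringBuilder out = new StringBuilder();
    for (char c = it.first(); c != CharacterIterator.DONE; c = it.next()) {
      out.append(it.getIndex()).append(':').append(c).append(' ');
    }
    return out.toString().trim();
  }

  public static void main(String[] args) {
    System.out.println(walk("JTok")); // prints "0:J 1:T 2:o 3:k"
  }
}
```

With the JDK iterator, next() returns CharacterIterator.DONE once the index reaches the end of the text; how FastAnnotatedString signals the end is not stated in this index.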

G

getAllMatches(String) - Method in class de.dfki.lt.tools.tokenizer.regexp.JavaRegExp
This returns an array of all Matches for the regular expression in input.
getAllMatches(String) - Method in interface de.dfki.lt.tools.tokenizer.regexp.RegExp
This specifies a method signature that returns a List with all Matches for the regular expression in input.
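As a rough illustration of what collecting all matches involves, the sketch below uses java.util.regex directly, the package that JavaRegExp is described as wrapping. Span is a hypothetical stand-in for the library's Match class, and allMatches is an illustrative name, not the library method. The end index here is exclusive, as in java.util.regex.Matcher; whether Match.getEndIndex() is exclusive is not stated in this index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AllMatchesSketch {

  // Hypothetical stand-in for the library's Match class.
  static final class Span {
    final int start;
    final int end;
    final String image;

    Span(int start, int end, String image) {
      this.start = start;
      this.end = end;
      this.image = image;
    }

    @Override
    public String toString() {
      return image + "[" + start + "," + end + ")";
    }
  }

  // Collects every non-overlapping match of regex in input, left to right.
  static List<Span> allMatches(String regex, String input) {
    List<Span> result = new ArrayList<>();
    Matcher m = Pattern.compile(regex).matcher(input);
    while (m.find()) {
      result.add(new Span(m.start(), m.end(), m.group()));
    }
    return result;
  }

  public static void main(String[] args) {
    // prints "[1[1,2), 22[3,5), 333[6,9)]"
    System.out.println(allMatches("\\d+", "a1b22c333"));
  }
}
```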
getAnnotation(String) - Method in interface de.dfki.lt.tools.tokenizer.annotate.AnnotatedString
Returns the annotation value of the string at the current index for a given annotation key.
getAnnotation(String) - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This returns the annotation value of the string at the current index for a given key.
getBeginIndex() - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This returns the start index of the text.
getDefinitionsMap() - Method in class de.dfki.lt.tools.tokenizer.Description
This returns the field Description.definitionsMap.
getEndIndex() - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This returns the end index of the text.
getEndIndex() - Method in class de.dfki.lt.tools.tokenizer.output.Paragraph
This returns the end index of the paragraph.
getEndIndex() - Method in class de.dfki.lt.tools.tokenizer.output.TextUnit
This returns the end index of the text unit.
getEndIndex() - Method in class de.dfki.lt.tools.tokenizer.output.Token
This returns the end index of the token.
getEndIndex() - Method in class de.dfki.lt.tools.tokenizer.regexp.Match
This returns the index within the input string where the match in its entirety ends.
getFilesFromDir(String, String) - Static method in class de.dfki.lt.tools.tokenizer.FileTools
This recursively collects all filenames in the directory aDirectory with suffix aSuffix and returns them in a List.
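A recursive, suffix-filtered file listing of this kind can be sketched with java.nio.file.Files.walk. This is not the FileTools implementation, which this index does not show; filesWithSuffix is a hypothetical name, and the sorting step is an addition made only to give a stable order.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FileListSketch {

  // Recursively collects the names of all regular files below dir
  // whose name ends with the given suffix.
  static List<String> filesWithSuffix(Path dir, String suffix) throws IOException {
    try (Stream<Path> paths = Files.walk(dir)) {
      return paths
          .filter(Files::isRegularFile)
          .map(Path::toString)
          .filter(name -> name.endsWith(suffix))
          .sorted()
          .collect(Collectors.toList());
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Paths.get(args.length > 0 ? args[0] : ".");
    for (String name : filesWithSuffix(dir, ".txt")) {
      System.out.println(name);
    }
  }
}
```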
getImage() - Method in class de.dfki.lt.tools.tokenizer.output.Token
This returns the surface image of the token.
getIndex() - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This returns the current index.
getLanguage() - Method in class de.dfki.lt.tools.tokenizer.LanguageResource
This returns the language of this language resource.
getLanguageResource(String) - Method in class de.dfki.lt.tools.tokenizer.JTok
This returns the LanguageResource for the given language if available.
getListsMap() - Method in class de.dfki.lt.tools.tokenizer.Description
This returns the field Description.listsMap.
getRegExpMap() - Method in class de.dfki.lt.tools.tokenizer.Description
This returns the field Description.regExpMap.
getRulesMap() - Method in class de.dfki.lt.tools.tokenizer.Description
This returns the field Description.rulesMap.
getRunLimit(String) - Method in interface de.dfki.lt.tools.tokenizer.annotate.AnnotatedString
Returns the index of the first character following the run with respect to the given annotation key containing the current character.
getRunLimit(String) - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This returns the index of the first character following the run with respect to the given annotation key containing the current character.
getRunStart(String) - Method in interface de.dfki.lt.tools.tokenizer.annotate.AnnotatedString
Returns the index of the first character of the run with respect to the given annotation key containing the current character.
getRunStart(String) - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This returns the index of the first character of the run with respect to the given annotation key containing the current character.
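The notion of a "run" with respect to an annotation key can be made concrete with a small model: a run is a maximal stretch of neighbouring characters that carry the same annotation value for that key. The sketch below is an assumption-laden illustration, not the FastAnnotatedString implementation. It stores values for a single key only, takes the index as a parameter instead of using an internal cursor, and treats annotated ranges as half-open; all names in it are hypothetical.

```java
import java.util.Objects;

public class RunSketch {

  // One annotation value per character for a single key; null means "not annotated".
  private final Object[] values;

  RunSketch(int length) {
    this.values = new Object[length];
  }

  // Annotates the half-open range [begin, end) with the given value.
  void annotate(Object value, int begin, int end) {
    for (int i = begin; i < end; i++) {
      values[i] = value;
    }
  }

  // Index of the first character of the run that contains index.
  int runStart(int index) {
    int i = index;
    while (i > 0 && Objects.equals(values[i - 1], values[index])) {
      i--;
    }
    return i;
  }

  // Index of the first character following the run that contains index.
  int runLimit(int index) {
    int i = index + 1;
    while (i < values.length && Objects.equals(values[i], values[index])) {
      i++;
    }
    return i;
  }

  public static void main(String[] args) {
    RunSketch s = new RunSketch(11);   // models the 11 characters of "Hello world"
    s.annotate("A", 0, 5);             // "Hello"
    s.annotate("B", 6, 11);            // "world"
    System.out.println(s.runStart(2) + ".." + s.runLimit(2)); // prints "0..5"
    System.out.println(s.runStart(5) + ".." + s.runLimit(5)); // prints "5..6"
  }
}
```

In this model two adjacent ranges with equal values merge into one run, so distinct neighbouring units need distinct values or a separate border annotation.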
getStartIndex() - Method in class de.dfki.lt.tools.tokenizer.output.Paragraph
This returns the start index of the paragraph.
getStartIndex() - Method in class de.dfki.lt.tools.tokenizer.output.TextUnit
This returns the start index of the text unit.
getStartIndex() - Method in class de.dfki.lt.tools.tokenizer.output.Token
This returns the start index of the token.
getStartIndex() - Method in class de.dfki.lt.tools.tokenizer.regexp.Match
This returns the index within the input text where the match in its entirety began.
getTagsMap() - Method in class de.dfki.lt.tools.tokenizer.LanguageResource
This returns a HashMap that maps class names to their tags as defined in the class definition file.
getTextUnits() - Method in class de.dfki.lt.tools.tokenizer.output.Paragraph
This returns the list with the text units of the paragraph.
getTokens() - Method in class de.dfki.lt.tools.tokenizer.output.TextUnit
This returns the list with the tokens of the text unit.
getType() - Method in class de.dfki.lt.tools.tokenizer.output.Token
This returns the type of the token.

I

ID_ATT - Static variable in class de.dfki.lt.tools.tokenizer.output.XMLOutputter
This is the name of the XML attribute in XML_TEXT_UNIT that contains the text unit id.
IMAGE_ATT - Static variable in class de.dfki.lt.tools.tokenizer.output.XMLOutputter
This is the name of the XML attribute in XML_TOKEN that contains the token image.
INTERNAL_RULE - Static variable in class de.dfki.lt.tools.tokenizer.PunctDescription
This is the name of the internal punctuation rule.
INTERNAL_TU_RULE - Static variable in class de.dfki.lt.tools.tokenizer.PunctDescription
This is the name of the sentence internal punctuation rule.
InitializationException - exception de.dfki.lt.tools.tokenizer.exceptions.InitializationException.
InitializationException is thrown when the tokenizer can't be initialized.
InitializationException() - Constructor for class de.dfki.lt.tools.tokenizer.exceptions.InitializationException
This creates a new instance of InitializationException.
InitializationException(String) - Constructor for class de.dfki.lt.tools.tokenizer.exceptions.InitializationException
This creates a new instance of InitializationException with an error message aMessage.
isAncestor(String, String, String) - Method in class de.dfki.lt.tools.tokenizer.JTok
This checks if the class of a token with tag tag1 is ancestor in the class hierarchy of the class of a token with tag tag2 or if the token classes are equal in the token class hierarchy for aLanguage.
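An "ancestor or equal" test over a class hierarchy can be sketched as a walk along parent links. The version below works directly on class names, whereas the library method takes tags and a language; the parent map, the class names in it, and the method name isAncestorOrSelf are all made up for illustration, and the hierarchy is assumed to be free of cycles.

```java
import java.util.HashMap;
import java.util.Map;

public class AncestorSketch {

  // Returns true if ancestor equals cls or is reachable from cls by
  // following parent links; parents maps each class to its direct parent.
  static boolean isAncestorOrSelf(Map<String, String> parents, String ancestor, String cls) {
    for (String c = cls; c != null; c = parents.get(c)) {
      if (c.equals(ancestor)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Map<String, String> parents = new HashMap<>();
    parents.put("OPEN_PUNCT", "PUNCT"); // made-up hierarchy, not taken from the library
    parents.put("PUNCT", "TOKEN");
    System.out.println(isAncestorOrSelf(parents, "TOKEN", "OPEN_PUNCT")); // prints "true"
    System.out.println(isAncestorOrSelf(parents, "OPEN_PUNCT", "TOKEN")); // prints "false"
  }
}
```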

J

JTok - class de.dfki.lt.tools.tokenizer.JTok.
JTok is a low level tokenizer tool that recognizes paragraphs, sentences, tokens, punctuation, numbers, abbreviations, etc.
JTok(Properties) - Constructor for class de.dfki.lt.tools.tokenizer.JTok
This creates a new instance of JTok using the properties in configProps.
JTok.OpenClosePunctFlag - class de.dfki.lt.tools.tokenizer.JTok.OpenClosePunctFlag.
This inner class is used as a wrapper for a boolean primitive value to allow call-by-reference with it.
JTok.OpenClosePunctFlag(boolean) - Constructor for class de.dfki.lt.tools.tokenizer.JTok.OpenClosePunctFlag
This creates a new instance of the wrapper and initializes the flag with the given boolean.
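The reason for wrapping a boolean is that Java passes primitives by value, so a method cannot change the caller's boolean directly; it can, however, mutate an object it receives a reference to. A minimal sketch of that pattern follows. BooleanFlag is a hypothetical analogue of JTok.OpenClosePunctFlag, whose actual fields and accessors are not listed in this index.

```java
public class FlagSketch {

  // Mutable holder for a boolean, a hypothetical analogue of the wrapper class.
  static final class BooleanFlag {
    private boolean value;

    BooleanFlag(boolean value) {
      this.value = value;
    }

    boolean get() {
      return value;
    }

    void set(boolean value) {
      this.value = value;
    }
  }

  // The callee receives a copy of the reference and mutates the shared holder,
  // so the change is visible to the caller.
  static void toggle(BooleanFlag flag) {
    flag.set(!flag.get());
  }

  public static void main(String[] args) {
    BooleanFlag open = new BooleanFlag(false);
    toggle(open);
    System.out.println(open.get()); // prints "true"
  }
}
```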
JavaRegExp - class de.dfki.lt.tools.tokenizer.regexp.JavaRegExp.
JavaRegExp implements the RegExp interface for regular expressions of the java.util.regex package.
JavaRegExp(String) - Constructor for class de.dfki.lt.tools.tokenizer.regexp.JavaRegExp
This creates a new instance of JavaRegExp for a String containing a regular expression.
JavaRegExpFactory - class de.dfki.lt.tools.tokenizer.regexp.JavaRegExpFactory.
JavaRegExpFactory extends RegExpFactory for regular expressions of the java.util.regex package.
JavaRegExpFactory() - Constructor for class de.dfki.lt.tools.tokenizer.regexp.JavaRegExpFactory
This creates a new instance of JavaRegExpFactory.

L

LENGTH_ATT - Static variable in class de.dfki.lt.tools.tokenizer.output.XMLOutputter
This is the name of the XML attribute in XML_TOKEN that contains the token length.
LISTS - Static variable in class de.dfki.lt.tools.tokenizer.Description
This is the name of the element with the lists in the description files.
LIST_ENCODING - Static variable in class de.dfki.lt.tools.tokenizer.Description
This is the attribute of a list element that contains the encoding of the list file.
LIST_FILE - Static variable in class de.dfki.lt.tools.tokenizer.Description
This is the attribute of a list element that points to the list file.
LanguageNotSupportedException - exception de.dfki.lt.tools.tokenizer.exceptions.LanguageNotSupportedException.
LanguageNotSupportedException is thrown when the necessary language resources are not available.
LanguageNotSupportedException() - Constructor for class de.dfki.lt.tools.tokenizer.exceptions.LanguageNotSupportedException
This creates a new instance of LanguageNotSupportedException.
LanguageNotSupportedException(String) - Constructor for class de.dfki.lt.tools.tokenizer.exceptions.LanguageNotSupportedException
This creates a new instance of LanguageNotSupportedException with an error message aMessage.
LanguageResource - class de.dfki.lt.tools.tokenizer.LanguageResource.
LanguageResource class manages the language-specific information needed by the tokenizer to process a document of that language.
LanguageResource(String, String) - Constructor for class de.dfki.lt.tools.tokenizer.LanguageResource
This creates a new instance of LanguageResource for aLanguage by using the resource description files in aResourceDir.
last() - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This sets the position to getEndIndex()-1 (getEndIndex() if the text is empty) and returns the character at that position.
listsMap - Variable in class de.dfki.lt.tools.tokenizer.Description
This maps a class to a hash map that contains members of this class.
loadDefinitions(Document, Set) - Method in class de.dfki.lt.tools.tokenizer.Description
This uses the definitions section in a description file to map each token class from the definitions to a regular expression that matches all tokens of that class.
loadLists(Document, Set, String) - Method in class de.dfki.lt.tools.tokenizer.Description
This uses the lists section in a description file to map each token class from the lists to a hashmap that contains all members of that class.
loadRules(Document) - Method in class de.dfki.lt.tools.tokenizer.Description
This maps each rule from the description to a regular expression that matches all tokens from that rule.

M

Match - class de.dfki.lt.tools.tokenizer.regexp.Match.
Match holds the result of matching an input string with a regular expression.
Match(int, int, String) - Constructor for class de.dfki.lt.tools.tokenizer.regexp.Match
This creates a new instance of Match using the given parameters.
main(String[]) - Static method in class de.dfki.lt.tools.tokenizer.JTok
This main method must be used with two or three arguments: a file name for the document to tokenize, the language of the document, and an optional encoding to use (default is ISO-8859-1). Supported languages are: de, en, it.
main(String[]) - Static method in class de.dfki.lt.tools.tokenizer.TestJTok
This main method gets as argument the name of a test description file and creates a new instance of TestJTok with it.
main(String[]) - Static method in class de.dfki.lt.tools.tokenizer.annotate.TestFastAnnotatedString
 
matches(String) - Method in class de.dfki.lt.tools.tokenizer.regexp.JavaRegExp
This checks if the regular expression matches the input in its entirety.
matches(String) - Method in interface de.dfki.lt.tools.tokenizer.regexp.RegExp
This specifies a method signature that checks if the regular expression matches the input in its entirety.
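The difference between contains(String) and matches(String) corresponds to the difference between Matcher.find() and Matcher.matches() in java.util.regex, the package JavaRegExp is described as building on. The sketch below uses the JDK classes directly; it is not the JavaRegExp source, and the two static helpers are illustrative names.

```java
import java.util.regex.Pattern;

public class ContainsVsMatchesSketch {

  // True if some part of input matches the regular expression.
  static boolean contains(String regex, String input) {
    return Pattern.compile(regex).matcher(input).find();
  }

  // True only if the input matches the regular expression in its entirety.
  static boolean matches(String regex, String input) {
    return Pattern.compile(regex).matcher(input).matches();
  }

  public static void main(String[] args) {
    System.out.println(contains("\\d+", "room 42")); // prints "true"
    System.out.println(matches("\\d+", "room 42"));  // prints "false"
    System.out.println(matches("\\d+", "42"));       // prints "true"
  }
}
```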

N

NON_BREAK_LEFT_RULE - Static variable in class de.dfki.lt.tools.tokenizer.PunctDescription
This is the name of the non-breaking left punctuation rule.
NON_BREAK_RIGHT_RULE - Static variable in class de.dfki.lt.tools.tokenizer.PunctDescription
This is the name of the non-breaking right punctuation rule.
NumbersDescription - class de.dfki.lt.tools.tokenizer.NumbersDescription.
NumbersDescription extends Description.
NumbersDescription(Document, Set) - Constructor for class de.dfki.lt.tools.tokenizer.NumbersDescription
This creates a new instance of NumbersDescription for the numbers description contained in the DOM Document numbDescr.
next() - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This increments the index by one and returns the character at the new index.

O

OFFSET_ATT - Static variable in class de.dfki.lt.tools.tokenizer.output.XMLOutputter
This is the name of the XML attribute in XML_TOKEN that contains the token offset.
OPEN_CLOSE_PUNCT - Static variable in class de.dfki.lt.tools.tokenizer.PunctDescription
This is the class name for ambiguous open/close punctuation.
OPEN_PUNCT - Static variable in class de.dfki.lt.tools.tokenizer.PunctDescription
This is the class name for opening punctuation.
ORDINAL_RULE - Static variable in class de.dfki.lt.tools.tokenizer.NumbersDescription
This is the name of the ordinal number rule.

P

PROCLITIC_RULE - Static variable in class de.dfki.lt.tools.tokenizer.CliticsDescription
This is the name of the proclitic rule.
P_BORDER - Static variable in class de.dfki.lt.tools.tokenizer.JTok
This is the annotation value for paragraph borders.
Paragraph - class de.dfki.lt.tools.tokenizer.output.Paragraph.
This represents a paragraph with its text units.
Paragraph() - Constructor for class de.dfki.lt.tools.tokenizer.output.Paragraph
This creates a new instance of Paragraph.
Paragraph(List) - Constructor for class de.dfki.lt.tools.tokenizer.output.Paragraph
This creates a new instance of Paragraph that contains the given text units.
ParagraphOutputter - class de.dfki.lt.tools.tokenizer.output.ParagraphOutputter.
ParagraphOutputter provides static methods that convert an AnnotatedString into a list of nested representations of Paragraphs with TextUnits and Tokens.
ParagraphOutputter() - Constructor for class de.dfki.lt.tools.tokenizer.output.ParagraphOutputter
 
ProcessingException - exception de.dfki.lt.tools.tokenizer.exceptions.ProcessingException.
ProcessingException is thrown when the processing of input data causes an error.
ProcessingException() - Constructor for class de.dfki.lt.tools.tokenizer.exceptions.ProcessingException
This creates a new instance of ProcessingException.
ProcessingException(String) - Constructor for class de.dfki.lt.tools.tokenizer.exceptions.ProcessingException
This creates a new instance of ProcessingException with an error message aMessage.
PunctDescription - class de.dfki.lt.tools.tokenizer.PunctDescription.
PunctDescription extends Description.
PunctDescription(Document, Set) - Constructor for class de.dfki.lt.tools.tokenizer.PunctDescription
This creates a new instance of PunctDescription for the punctuation description contained in the DOM Document punctDescr.
previous() - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This decrements the index by one and returns the character at the new index.

R

RULES - Static variable in class de.dfki.lt.tools.tokenizer.Description
This is the name of the element with the rules in the description files.
RegExp - interface de.dfki.lt.tools.tokenizer.regexp.RegExp.
RegExp defines an interface for regular expression patterns.
RegExpFactory - class de.dfki.lt.tools.tokenizer.regexp.RegExpFactory.
RegExpFactory is an abstract class for creating objects that fit the RegExp interface.
RegExpFactory() - Constructor for class de.dfki.lt.tools.tokenizer.regexp.RegExpFactory
 
readFile(File) - Static method in class de.dfki.lt.tools.tokenizer.FileTools
New NIO-based method to read the contents of a file as a byte array.
readFileAsString(File, String) - Static method in class de.dfki.lt.tools.tokenizer.FileTools
New NIO-based method to read a file as a String with the given charset encoding.
readInputStream(OutputStream, InputStream) - Static method in class de.dfki.lt.tools.tokenizer.FileTools
Reads an input stream and writes it into an output stream.
readInputStream(InputStream) - Static method in class de.dfki.lt.tools.tokenizer.FileTools
Read some input stream and return its content as a string.
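Both readInputStream variants come down to the same buffered copy loop. The sketch below shows one way to write it with plain java.io; it is not the FileTools source. The helper names, the buffer size, and the use of UTF-8 for the string variant are assumptions, since this index does not state which charset the library uses. The (OutputStream, InputStream) parameter order mirrors the signature listed above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class StreamCopySketch {

  // Copies everything from in to out and returns the number of bytes copied.
  static long copy(OutputStream out, InputStream in) throws IOException {
    byte[] buffer = new byte[8192];
    long total = 0;
    int n;
    while ((n = in.read(buffer)) != -1) {
      out.write(buffer, 0, n);
      total += n;
    }
    out.flush();
    return total;
  }

  // Reads the whole stream into a String; UTF-8 is an assumption of this sketch.
  static String readToString(InputStream in) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    copy(out, in);
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    byte[] data = "tokenize me".getBytes(StandardCharsets.UTF_8);
    System.out.println(readToString(new ByteArrayInputStream(data))); // prints "tokenize me"
  }
}
```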
readInputStreamToByteArray(InputStream) - Static method in class de.dfki.lt.tools.tokenizer.FileTools
Read some input stream and return its content as a byte array.
readUrlToByteArray(URL) - Static method in class de.dfki.lt.tools.tokenizer.FileTools
Read the content of a URL into a byte array.
readUrlToString(URL) - Static method in class de.dfki.lt.tools.tokenizer.FileTools
Read the content of a URL into a string.
regExpMap - Variable in class de.dfki.lt.tools.tokenizer.Description
This maps regular expressions of rules to class names of the matched expression.
rulesMap - Variable in class de.dfki.lt.tools.tokenizer.Description
This maps the rule names to regular expressions that match the tokens as described by the rule.

S

SIMPLE_DIGITS_RULE - Static variable in class de.dfki.lt.tools.tokenizer.NumbersDescription
This is the name of the digits probe rule.
saveStream(InputStream, File) - Static method in class de.dfki.lt.tools.tokenizer.FileTools
Write an input stream to a file.
setDefinitionsMap(HashMap) - Method in class de.dfki.lt.tools.tokenizer.Description
This sets the field Description.definitionsMap to aDefinitionsMap.
setEndIndex(int) - Method in class de.dfki.lt.tools.tokenizer.output.Paragraph
This sets the end index of the paragraph to anEndIndex.
setEndIndex(int) - Method in class de.dfki.lt.tools.tokenizer.output.TextUnit
This sets the end index of the text unit to anEndIndex.
setEndIndex(int) - Method in class de.dfki.lt.tools.tokenizer.output.Token
This sets the end index of the token to anEndIndex.
setImage(String) - Method in class de.dfki.lt.tools.tokenizer.output.Token
This sets the surface image of the token to anImage.
setIndex(int) - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This sets the position to the specified position in the text and returns that character.
setListsMap(HashMap) - Method in class de.dfki.lt.tools.tokenizer.Description
This sets the field Description.listsMap to aListsMap.
setRegExpMap(HashMap) - Method in class de.dfki.lt.tools.tokenizer.Description
This sets the field Description.regExpMap to aRegExpMap.
setRulesMap(HashMap) - Method in class de.dfki.lt.tools.tokenizer.Description
This sets the field Description.rulesMap to aRulesMap.
setStartIndex(int) - Method in class de.dfki.lt.tools.tokenizer.output.Paragraph
This sets the start index of the paragraph to aStartIndex.
setStartIndex(int) - Method in class de.dfki.lt.tools.tokenizer.output.TextUnit
This sets the start index of the text unit to aStartIndex.
setStartIndex(int) - Method in class de.dfki.lt.tools.tokenizer.output.Token
This sets the start index of the token to aStartIndex.
setTextUnits(List) - Method in class de.dfki.lt.tools.tokenizer.output.Paragraph
This sets the text units of the paragraph to someTextUnits.
setTokens(List) - Method in class de.dfki.lt.tools.tokenizer.output.TextUnit
This sets the tokens of the text unit to someTokens.
setType(String) - Method in class de.dfki.lt.tools.tokenizer.output.Token
This sets the type of the token to aType.
start() - Method in class de.dfki.lt.tools.tokenizer.TestJTok
This starts the test.
substring(int, int) - Method in interface de.dfki.lt.tools.tokenizer.annotate.AnnotatedString
Returns the substring between the specified indices.
substring(int, int) - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This returns the substring between the specified indices.

T

TOK_TYPE_ATT - Static variable in class de.dfki.lt.tools.tokenizer.output.XMLOutputter
This is the name of the XML attribute in XML_TOKEN that contains the token type.
TU_BORDER - Static variable in class de.dfki.lt.tools.tokenizer.JTok
This is the annotation value for text unit borders.
TestFastAnnotatedString - class de.dfki.lt.tools.tokenizer.annotate.TestFastAnnotatedString.
TestFastAnnotatedString is a test class for FastAnnotatedString.
TestFastAnnotatedString() - Constructor for class de.dfki.lt.tools.tokenizer.annotate.TestFastAnnotatedString
 
TestJTok - class de.dfki.lt.tools.tokenizer.TestJTok.
This is a test class for JTok.
TestJTok(String) - Constructor for class de.dfki.lt.tools.tokenizer.TestJTok
This creates a new instance of TestJTok using the test description in the file configFile.
TextUnit - class de.dfki.lt.tools.tokenizer.output.TextUnit.
This represents a text unit with its tokens.
TextUnit() - Constructor for class de.dfki.lt.tools.tokenizer.output.TextUnit
This creates a new instance of TextUnit.
TextUnit(List) - Constructor for class de.dfki.lt.tools.tokenizer.output.TextUnit
This creates a new instance of TextUnit that contains the given tokens.
Token - class de.dfki.lt.tools.tokenizer.output.Token.
This represents a token with its type and surface image.
Token() - Constructor for class de.dfki.lt.tools.tokenizer.output.Token
This creates a new instance of Token.
Token(int, int, String, String) - Constructor for class de.dfki.lt.tools.tokenizer.output.Token
This creates a new instance of Token with the given start index, end index, type and surface image of the token.
toString(String) - Method in interface de.dfki.lt.tools.tokenizer.annotate.AnnotatedString
Returns a string representation of the annotated string with the annotation for the given attribute key.
toString() - Method in interface de.dfki.lt.tools.tokenizer.annotate.AnnotatedString
Returns the surface string of the annotated string.
toString(String) - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This returns a string representation of the annotated string with the annotation for the given attribute key.
toString() - Method in class de.dfki.lt.tools.tokenizer.annotate.FastAnnotatedString
This returns the surface string of the annotated string.
toString() - Method in class de.dfki.lt.tools.tokenizer.output.Paragraph
This returns a string representation of the paragraph.
toString() - Method in class de.dfki.lt.tools.tokenizer.output.TextUnit
This returns a string representation of the text unit.
toString() - Method in class de.dfki.lt.tools.tokenizer.output.Token
This returns a string representation of the token.
toString() - Method in class de.dfki.lt.tools.tokenizer.regexp.Match
This returns the String matching the regular expression pattern.
tokenize(String, String) - Method in class de.dfki.lt.tools.tokenizer.JTok
This takes a String that contains the text to tokenize and parses it for aLanguage.

X

XMLOutputter - class de.dfki.lt.tools.tokenizer.output.XMLOutputter.
XMLOutputter provides static methods that return an XML representation of an AnnotatedString.
XMLOutputter() - Constructor for class de.dfki.lt.tools.tokenizer.output.XMLOutputter
 
XML_DOCUMENT - Static variable in class de.dfki.lt.tools.tokenizer.output.XMLOutputter
This is the name of XML elements in the result that describe a document.
XML_PARAGRAPH - Static variable in class de.dfki.lt.tools.tokenizer.output.XMLOutputter
This is the name of XML elements in the result that describe a paragraph.
XML_TEXT_UNIT - Static variable in class de.dfki.lt.tools.tokenizer.output.XMLOutputter
This is the name of XML elements in the result that describe a text unit.
XML_TOKEN - Static variable in class de.dfki.lt.tools.tokenizer.output.XMLOutputter
This is the name of XML elements in the result that describe a token.
