Package com.inet.jorthodictionaries
Class BookGenerator
- java.lang.Object
-
- com.inet.jorthodictionaries.BookGenerator
-
- Direct Known Subclasses:
BookGenerator_ar,BookGenerator_de,BookGenerator_en,BookGenerator_es,BookGenerator_fr,BookGenerator_it,BookGenerator_nl,BookGenerator_pl,BookGenerator_pl_Engish,BookGenerator_ru,BookGenerator_ru_templates,BookGenerator_sv
public abstract class BookGenerator extends java.lang.Object
How to use
- Download the latest Wiktionary dump file "pages_articles.xml". It is typically compressed. Its location has changed over time; it was last found at:
- http://dumps.wikimedia.org/arwiktionary/latest/arwiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/dewiktionary/latest/dewiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/eswiktionary/latest/eswiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/frwiktionary/latest/frwiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/itwiktionary/latest/itwiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/nlwiktionary/latest/nlwiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/plwiktionary/latest/plwiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/ruwiktionary/latest/ruwiktionary-latest-pages-articles.xml.bz2
- Start the generator with the following command line:
java -Xmx256m com.inet.spell.wiktionary.BookGenerator de
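The dump URLs listed above all follow one pattern built from the language code. A minimal sketch of that pattern follows; `DumpUrls` and `forLanguage` are hypothetical names that are not part of this package, and the two-to-three lowercase letter check is an assumption made only to keep the example safe.

```java
import java.net.URI;

/** Hypothetical helper, not part of BookGenerator: builds a dump URL from a language code. */
final class DumpUrls {

    private DumpUrls() {
    }

    /** Returns the "latest" pages-articles dump URL for a code such as "de" or "en". */
    static URI forLanguage(String lang) {
        // Assumption: language codes are two or three lowercase ASCII letters.
        if (lang == null || !lang.matches("[a-z]{2,3}")) {
            throw new IllegalArgumentException("unexpected language code: " + lang);
        }
        String wiki = lang + "wiktionary";
        return URI.create("http://dumps.wikimedia.org/" + wiki + "/latest/"
                + wiki + "-latest-pages-articles.xml.bz2");
    }
}
```

Calling `DumpUrls.forLanguage("de")` yields the German URL from the list. Because the location of the dumps has moved before, the pattern may need adjusting.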
-
-
Constructor Summary
Constructors:
- BookGenerator()
- BookGenerator(Book book)
-
Method Summary
Modifier and Type, Method: Description
- private void addFileToZip(java.util.zip.ZipOutputStream out, java.lang.String filename, boolean delete)
- protected void addWord(java.lang.String word): Add a word to the tree.
- private void createPackage(java.lang.String language): Generate the distribution package.
- (package private) Book getBook(): Get the resulting book for the current generator.
- protected int indexOf(java.lang.String string, char[] chars, int fromIndex): Helper function for parsing the Wiktionary formats.
- (package private) abstract boolean isValidLanguage(java.lang.String word, java.lang.String wikiText): Check if a word is a valid word of the current language.
- protected boolean isValidWord(java.lang.String word): Check if the word is a valid word.
- static void main(java.lang.String[] args)
- (package private) void save(java.lang.String language)
- private void saveStatistics(java.io.File dictFile): Create statistics data and save it in statistics.txt.
- (package private) void start(java.io.File file): Start reading the data from the XML stream.
-
-
-
Field Detail
-
book
private final Book book
-
-
Constructor Detail
-
BookGenerator
BookGenerator()
-
BookGenerator
BookGenerator(Book book)
-
-
Method Detail
-
main
public static void main(java.lang.String[] args) throws java.lang.Exception
- Throws:
java.lang.Exception
-
start
void start(java.io.File file) throws java.lang.Exception
Start reading the data from the XML stream.
- Parameters:
file - the data in XML format
- Throws:
java.lang.Exception
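Reading a dump as a stream avoids loading the whole file into memory. The following is a minimal sketch of such a streaming read using the JDK's StAX API. It is not the implementation of start(File). `DumpReaderSketch` and `readPages` are hypothetical names, and the element names `title` and `text` are assumed from the MediaWiki export layout.

```java
import java.io.Reader;
import java.util.function.BiConsumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical sketch: streams (title, wiki text) pairs out of a dump-like XML document. */
final class DumpReaderSketch {

    private DumpReaderSketch() {
    }

    /** Calls onPage once per page, as soon as its text element has been read. */
    static void readPages(Reader xml, BiConsumer<String, String> onPage) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Deliver element text as one piece and ignore DTDs.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(xml);
            try {
                String title = null;
                while (reader.hasNext()) {
                    if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                        continue;
                    }
                    String name = reader.getLocalName();
                    if ("title".equals(name)) {
                        // Assumption: each page states its title before its text.
                        title = reader.getElementText();
                    } else if ("text".equals(name) && title != null) {
                        onPage.accept(title, reader.getElementText());
                        title = null;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Rethrown unchecked only to keep the sketch short.
            throw new IllegalArgumentException("malformed XML", e);
        }
    }
}
```

A caller would pass a reader over the decompressed dump and a callback that decides per page whether the title becomes a dictionary word. Decompressing the .bz2 file is out of scope here, since the JDK has no built-in bzip2 support.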
-
save
final void save(java.lang.String language) throws java.lang.Exception
- Throws:
java.lang.Exception
-
saveStatistics
private final void saveStatistics(java.io.File dictFile) throws java.lang.Exception
Create statistics data and save it in statistics.txt.
- Parameters:
dictFile - the created ortho file
- Throws:
java.lang.Exception - if an error occurs
-
createPackage
private final void createPackage(java.lang.String language) throws java.lang.Exception
Generate the distribution package.
- Throws:
java.lang.Exception
-
addFileToZip
private final void addFileToZip(java.util.zip.ZipOutputStream out, java.lang.String filename, boolean delete) throws java.lang.Exception
- Throws:
java.lang.Exception
-
indexOf
protected final int indexOf(java.lang.String string, char[] chars, int fromIndex)
Helper function for parsing the Wiktionary formats.
- Parameters:
string - the string to search
chars - the characters to search for; must not be empty
fromIndex - the start position of the search; the index starts at 0
- Returns:
- the index of the first occurrence of any of the characters in chars, or -1 if none is found
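The documented contract (first position of any of several characters, or -1) can be sketched as follows. This is an illustration written from the description above, not the class's actual code. `IndexOfSketch` is a hypothetical name, and treating a negative fromIndex as 0 is an assumption.

```java
/** Hypothetical stand-alone version of the documented indexOf contract. */
final class IndexOfSketch {

    private IndexOfSketch() {
    }

    /** Returns the first index at or after fromIndex holding any char from chars, else -1. */
    static int indexOf(String string, char[] chars, int fromIndex) {
        // Assumption: a negative start position is treated as 0.
        for (int i = Math.max(fromIndex, 0); i < string.length(); i++) {
            char current = string.charAt(i);
            for (char candidate : chars) {
                if (current == candidate) {
                    return i;
                }
            }
        }
        return -1;
    }
}
```

A search like this is handy when scanning wiki markup for whichever delimiter comes first, for example `[` or `{`.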
-
isValidWord
protected boolean isValidWord(java.lang.String word)
Check if the word is a valid word. This excludes help pages and some phrases. It should always be called before addWord(String).
- Parameters:
word - the word to check
- Returns:
- true, if the word is valid
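The exact rules behind this check are not documented here. As a rough sketch only, a filter of this kind might reject page titles with a namespace prefix (such as help pages) and titles containing spaces (phrases). Both rules and the name `WordFilterSketch` are assumptions for illustration, not the class's real behavior.

```java
/** Hypothetical word filter; the rules are illustrative assumptions. */
final class WordFilterSketch {

    private WordFilterSketch() {
    }

    static boolean isValidWord(String word) {
        if (word == null || word.isEmpty()) {
            return false;
        }
        // Assumption: a colon marks a namespace page such as "Hilfe:Index".
        if (word.indexOf(':') >= 0) {
            return false;
        }
        // Assumption: a space marks a multi-word phrase.
        return word.indexOf(' ') < 0;
    }
}
```

A caller would test each page title with the filter first and only add titles that pass.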
-
addWord
protected final void addWord(java.lang.String word)
Add a word to the tree.
- Parameters:
word - must not be null
-
getBook
Book getBook()
Get the resulting book for the current generator.
- Returns:
- the book
-
isValidLanguage
abstract boolean isValidLanguage(java.lang.String word, java.lang.String wikiText)
Check if a word is a valid word of the current language. With getBook().addWord() you can add additional inflections of the word. The current word itself does not need to be added.
- Parameters:
word - the word to test
wikiText - the description from Wiktionary
- Returns:
- true if valid
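To make this contract concrete, here is a sketch of how a language-specific subclass might look. `GeneratorSketch` is a stand-in that only mirrors the documented method names, so it is not the real BookGenerator. The `==English==` heading test and the naive plural derived from `{{en-noun}}` are simplifying assumptions about the wiki text, and the per-page flow in `handlePage` is likewise assumed.

```java
import java.util.Set;
import java.util.TreeSet;

/** Stand-in for the generator; mirrors documented method names only. */
abstract class GeneratorSketch {

    private final Set<String> words = new TreeSet<>();

    void addWord(String word) {
        words.add(word);
    }

    abstract boolean isValidLanguage(String word, String wikiText);

    /** Assumed flow: the page title is added only if the language check accepts it. */
    void handlePage(String title, String wikiText) {
        if (isValidLanguage(title, wikiText)) {
            addWord(title);
        }
    }

    Set<String> words() {
        return words;
    }
}

/** Hypothetical English variant. */
final class EnglishSketch extends GeneratorSketch {

    @Override
    boolean isValidLanguage(String word, String wikiText) {
        // Assumption: English entries carry an "==English==" heading.
        if (!wikiText.contains("==English==")) {
            return false;
        }
        // Assumption: a bare {{en-noun}} template implies a regular plural.
        if (wikiText.contains("{{en-noun}}")) {
            addWord(word + "s"); // extra inflection; the word itself is added by the caller
        }
        return true;
    }
}
```

The subclass registers only the additional form, matching the note that the current word itself does not need to be added.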
-
-