A short list.
- MeCab (C++) https://github.com/taku910/mecab
- JUMAN http://nlp.ist.i.kyoto-u.ac.jp/index.php?JUMAN
- JUMAN++ http://nlp.ist.i.kyoto-u.ac.jp/index.php?JUMAN++
- ChaSen http://chasen-legacy.sourceforge.jp/
- CaboCha (Japanese Dependency Analysis using Cascaded Chunking) https://github.com/taku910/cabocha
- cmecab-java (JNI binding for MeCab) http://code.google.com/p/cmecab-java/
- kuromoji (Java) https://github.com/atilika/kuromoji | http://www.atilika.org/
- Sudachi (Java) https://github.com/WorksApplications/Sudachi
- igo (Java) http://igo.sourceforge.jp/
- Rakuten MA (JavaScript) https://github.com/rakuten-nlp/rakutenma
- KyTea http://www.phontron.com/kytea/
Available corpora
- IPADIC (no longer maintained) http://sourceforge.jp/projects/ipadic/
- NAIST Japanese Dictionary (successor to IPADIC) http://sourceforge.jp/projects/naist-jdic/
- UniDic http://unidic.ninjal.ac.jp/ | http://sourceforge.jp/projects/unidic/
- UniDic 近代文語 (based on UniDic) http://www2.ninjal.ac.jp/lrc/index.php?UniDic%2F%B6%E1%C2%E5%CA%B8%B8%ECUniDic