Dictionaries file format
Prefixes
Let's have a look at the prefixes dictionary, gpl/pierrick/brihaye/aramorph/dictionaries/dictPrefixes:
; conjunctions
w wa Pref-Wa and <pos>wa/CONJ+</pos>
f fa Pref-Wa and;so <pos>fa/CONJ+</pos>
We can see that comments are introduced by ; and that significant lines consist of tab-separated fields which are, in order:
- the prefix's consonantal skeleton (in Buckwalter's transliteration system)
- the prefix's vocalization (in the same system)
- the prefix's morphological category
- one or several translations for the prefix, followed by one or several grammatical categories enclosed in <pos> tags. Note the + which indicates that a stem is expected after this prefix.
Some of this information is optional. A good example is the empty prefix:
; The first category is the null prefix (has a null gloss as well):
Pref-0
... where we just have a morphological category.
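To make the field layout concrete, here is a minimal Python sketch that parses one dictionary line. It assumes that fields are separated by single tab characters, as described above, and that the grammatical categories are wrapped in <pos> tags as in the sample. The names Entry and parse_entry are illustrative only and are not part of Aramorph.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    skeleton: str       # consonantal skeleton (Buckwalter transliteration)
    vocalization: str   # vocalized form (same transliteration)
    category: str       # morphological category, e.g. "Pref-Wa"
    gloss: str          # translations, separated by ";"
    pos: Optional[str]  # content of <pos>...</pos>, or None when absent


POS_RE = re.compile(r"<pos>(.*?)</pos>")


def parse_entry(line: str) -> Optional[Entry]:
    """Parse one dictionary line; return None for comments and blank lines."""
    line = line.rstrip("\r\n")
    if not line.strip() or line.startswith(";"):
        return None
    parts = line.split("\t")
    # Pad so that lines with fewer than three fields still unpack.
    parts += [""] * (3 - len(parts))
    skeleton, vocalization, category = parts[:3]
    # Everything after the third field holds the gloss and the optional <pos>.
    rest = " ".join(parts[3:])
    match = POS_RE.search(rest)
    pos = match.group(1) if match else None
    gloss = POS_RE.sub("", rest).strip()
    return Entry(skeleton, vocalization, category, gloss, pos)
```

Under these assumptions, an entry whose first two fields are empty, such as the empty prefix, comes back with only its category filled in.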
Suffixes
Let's now have a look at this snippet taken from the suffixes dictionary, gpl/pierrick/brihaye/aramorph/dictionaries/dictSuffixes:
; perfect verb, null suffix: banA-h, daEA-h
h hu PVSuff-0ah he/it <verb> it/him <pos>+(null)/PVSUFF_SUBJ:3MS+hu/PVSUFF_DO:3MS</pos>
hmA humA PVSuff-0ah he/it <verb> them (both) <pos>+(null)/PVSUFF_SUBJ:3MS+humA/PVSUFF_DO:3D</pos>
hm hum PVSuff-0ah he/it <verb> them <pos>+(null)/PVSUFF_SUBJ:3MS+hum/PVSUFF_DO:3MP</pos>
hA hA PVSuff-0ah he/it <verb> it/them/her <pos>+(null)/PVSUFF_SUBJ:3MS+hA/PVSUFF_DO:3FS</pos>
hn hun~a PVSuff-0ah he/it <verb> them <pos>+(null)/PVSUFF_SUBJ:3MS+hun~a/PVSUFF_DO:3FP</pos>
k ka PVSuff-0ah he/it <verb> you <pos>+(null)/PVSUFF_SUBJ:3MS+ka/PVSUFF_DO:2MS</pos>
k ki PVSuff-0ah he/it <verb> you <pos>+(null)/PVSUFF_SUBJ:3MS+ki/PVSUFF_DO:2FS</pos>
kmA kumA PVSuff-0ah he/it <verb> you (both) <pos>+(null)/PVSUFF_SUBJ:3MS+kumA/PVSUFF_DO:2D</pos>
km kum PVSuff-0ah he/it <verb> you <pos>+(null)/PVSUFF_SUBJ:3MS+kum/PVSUFF_DO:2MP</pos>
kn kun~a PVSuff-0ah he/it <verb> you <pos>+(null)/PVSUFF_SUBJ:3MS+kun~a/PVSUFF_DO:2FP</pos>
ny niy PVSuff-0ah he/it <verb> me <pos>+(null)/PVSUFF_SUBJ:3MS+niy/PVSUFF_DO:1S</pos>
nA nA PVSuff-0ah he/it <verb> us <pos>+(null)/PVSUFF_SUBJ:3MS+nA/PVSUFF_DO:1P</pos>
The principle is exactly the same, although the example is slightly more complex: each entry describes a sequence of two suffixes, the first being the null (Ø) suffix of the third person masculine singular perfective, the second being a pronominal direct object. Note the first +, which marks the junction with the stem, and the following +, which marks the junction with the Ø suffix.
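The way the + signs chain morphemes together can be illustrated with a short Python sketch that splits the content of a <pos> element into (form, tag) pairs. It assumes that + only ever acts as a junction marker and that the first / in each chunk separates the form from its tag, which holds for the samples shown here. The function names are hypothetical.

```python
from typing import List, Tuple


def split_pos(pos: str) -> List[Tuple[str, str]]:
    """Split a <pos> value into (form, tag) pairs, one per morpheme."""
    morphemes = []
    for chunk in pos.split("+"):
        if not chunk:
            # An empty chunk comes from a leading or trailing "+",
            # i.e. the junction with the stem.
            continue
        form, _, tag = chunk.partition("/")
        morphemes.append((form, tag))
    return morphemes


def expects_stem_before(pos: str) -> bool:
    """A leading "+" means a stem is expected before this affix (suffix)."""
    return pos.startswith("+")


def expects_stem_after(pos: str) -> bool:
    """A trailing "+" means a stem is expected after this affix (prefix)."""
    return pos.endswith("+")
```

Applied to the first suffix entry, split_pos yields two pairs: the (null) subject suffix and the hu direct object suffix.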
Stems
Let's have a look at this snippet taken from the stems dictionary, gpl/pierrick/brihaye/aramorph/dictionaries/dictStems:
;
;--- ktb
;; katab-u_1
ktb katab PV write
ktb kotub IV write
ktb kutib PV_Pass be written;be fated;be destined
ktb kotab IV_Pass_yu be written;be fated;be destined
;; kAtab_1
kAtb kAtab PV correspond with
kAtb kAtib IV_yu correspond with
;; >akotab_1
>ktb >akotab PV dictate;make write
Aktb >akotab PV dictate;make write
ktb kotib IV_yu dictate;make write
ktb kotab IV_Pass_yu be dictated
;; takAtab_1
tkAtb takAtab PV correspond
tkAtb takAtab IV correspond
;; {inokatab_1
<nktb {inokatab PV subscribe
Anktb {inokatab PV subscribe
nktb nokatib IV subscribe
;; {ikotatab_1
<kttb {ikotatab PV register;enroll
Akttb {ikotatab PV register;enroll
kttb kotatib IV register;enroll
;; {isotakotab_1
<stktb {isotakotab PV make write;dictate
Astktb {isotakotab PV make write;dictate
stktb sotakotib IV make write;dictate
;; kitAb_1
ktAb kitAb Ndu book
ktb kutub N books
;; kitAboxAnap_1
ktAbxAn kitAboxAn NapAt library;bookstore
ktbxAn kutuboxAn NapAt library;bookstore
;; kutubiy~_1
ktby kutubiy~ Ndu book-related
;; kutubiy~_2
ktby kutubiy~ Ndu bookseller
ktby kutubiy~ Nap booksellers <pos>kutubiy~/NOUN</pos>
;; kut~Ab_1
ktAb kut~Ab N kuttab (village school);Quran school
ktAtyb katAtiyb Ndip kuttab (village schools);Quran schools
;; kutay~ib_1
ktyb kutay~ib NduAt booklet
;; kitAbap_1
ktAb kitAb Nap writing
;; kitAbap_2
ktAb kitAb Napdu essay;piece of writing
ktAb kitAb NAt writings;essays
;; kitAbiy~_1
ktAby kitAbiy~ N-ap writing;written <pos>kitAbiy~/ADJ</pos>
;; katiybap_1
ktyb katiyb Napdu brigade;squadron;corps
ktA}b katA}ib Ndip brigades;squadrons;corps
ktA}b katA}ib Ndip Phalangists
;; katA}ibiy~_1
ktA}by katA}ibiy~ Nall brigade;corps <pos>katA}ibiy~/NOUN</pos>
ktA}by katA}ibiy~ Nall brigade;corps <pos>katA}ibiy~/ADJ</pos>
;; katA}ibiy~_2
ktA}by katA}ibiy~ Nall Phalangist <pos>katA}ibiy~/NOUN</pos>
ktA}by katA}ibiy~ Nall Phalangist <pos>katA}ibiy~/ADJ</pos>
;; makotab_1
mktb makotab Ndu bureau;office;department
mkAtb makAtib Ndip bureaus;offices
;; makotabiy~_1
mktby makotabiy~ N-ap office <pos>makotabiy~/ADJ</pos>
;; makotabap_1
mktb makotab NapAt library;bookstore
mkAtb makAtib Ndip libraries;bookstores
;; mikotAb_1
mktAb mikotAb Ndu printer
;; mukAtabap_1
mkAtb mukAtab NapAt correspondence
;; {ikotitAb_1
<kttAb {ikotitAb N/At enrollment;registration;subscription
AkttAb {ikotitAb N/At enrollment;registration;subscription
;; {isotikotAb_1
<stktAb {isotikotAb N/At dictation
AstktAb {isotikotAb N/At dictation
<stktAby {isotikotAbiy~ N-ap dictation <pos>{isotikotAbiy~/ADJ</pos>
AstktAby {isotikotAbiy~ N-ap dictation <pos>{isotikotAbiy~/ADJ</pos>
;; kAtib_1
kAtb kAtib N/ap writer;author
kAtb kAtib N/ap clerk
ktAb kut~Ab N authors;writers
ktb katab Nap authors;writers
;; kAtib_2
kAtb kAtib Nall writing <pos>kAtib/ADJ</pos>
;; makotuwb_1
mktwb makotuwb N-ap written <pos>makotuwb/ADJ</pos>
;; makotuwb_2
mktwb makotuwb Ndu letter;message
mkAtyb makAtiyb Ndip letters;messages
;; mukAtib_1
mkAtb mukAtib Nall correspondent;reporter
;; mukotatib_1
mkttb mukotatib Nall subscriber
;
The format is slightly different, since lines beginning with ;; provide a lemma identifier for the entries that follow. The rest is similar.
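As an illustration of how the ;; lines structure the file, the following Python sketch groups stem entries under the lemma identifier that precedes them. It assumes tab-separated fields and that every entry belongs to the most recent ;; line. The function name group_by_lemma is hypothetical.

```python
from typing import Dict, Iterable, List


def group_by_lemma(lines: Iterable[str]) -> Dict[str, List[List[str]]]:
    """Map each lemma identifier to the list of entries (as field lists) under it."""
    lemmas: Dict[str, List[List[str]]] = {}
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(";;"):
            # A lemma line: everything after ";;" is the identifier.
            current = line[2:].strip()
            lemmas.setdefault(current, [])
        elif not line.strip() or line.startswith(";"):
            # Blank line or ordinary comment such as ";--- ktb".
            continue
        elif current is not None:
            lemmas[current].append(line.split("\t"))
        # Entries appearing before any lemma line are ignored in this sketch.
    return lemmas
```

With the snippet above as input, the key katab-u_1 would hold the four verbal stems listed under it, and kitAb_1 the singular and plural forms of "book".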
Note that the grammatical category is often omitted, since it can be inferred from the morphological category. In some cases, however, it must be stated explicitly: for example, nisbas, which are morphologically nominal, may be used as adjectives.
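The fallback from morphological to grammatical category could look like the Python sketch below. The mapping table is an assumption made for illustration: the tag names VERB_PERFECT and VERB_IMPERFECT do not appear in the samples above, and the actual rules used by Aramorph may differ. An explicit <pos> value, when present, always wins.

```python
from typing import Optional

# Hypothetical table: category prefix -> default grammatical tag.
# Only NOUN is attested in the samples; the verb tags are assumptions.
DEFAULT_TAGS = [
    ("PV", "VERB_PERFECT"),
    ("IV", "VERB_IMPERFECT"),
    ("N", "NOUN"),
]


def grammatical_category(vocalization: str, category: str,
                         pos: Optional[str] = None) -> str:
    """Return the explicit <pos> value, or infer one from the category."""
    if pos:
        return pos  # explicit value, e.g. "kitAbiy~/ADJ" for a nisba
    for prefix, tag in DEFAULT_TAGS:
        if category.startswith(prefix):
            return f"{vocalization}/{tag}"
    raise ValueError(f"no grammatical category can be inferred from {category!r}")
```

For the nisba kitAbiy~, whose category N-ap would otherwise default to a noun, the explicit adjectival value from the dictionary is returned unchanged.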
This would help in processing the dictionaries directly in Arabic script rather than through Buckwalter's transliteration system, thus taking advantage of Java's native Unicode support.
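For readers unfamiliar with the transliteration, here is a small Python sketch of the conversion from Buckwalter symbols to Arabic Unicode characters. It relies on the standard Buckwalter table, whose symbols follow the order of the Unicode code points U+0621 to U+063A and U+0640 to U+0652, plus two extra symbols. The function name to_arabic is illustrative and is not taken from Aramorph.

```python
# Buckwalter symbols listed in Unicode order:
# U+0621..U+063A (hamza to ghain), then U+0640..U+0652 (tatweel to sukun).
BUCKWALTER = "'|>&<}AbptvjHxd*rzs$SDTZEg" + "_fqklmnhwYyFNKaui~o"
CODE_POINTS = list(range(0x0621, 0x063B)) + list(range(0x0640, 0x0653))
assert len(BUCKWALTER) == len(CODE_POINTS)

MAPPING = dict(zip(BUCKWALTER, map(chr, CODE_POINTS)))
MAPPING["`"] = "\u0670"  # superscript (dagger) alif
MAPPING["{"] = "\u0671"  # alif wasla

TABLE = str.maketrans(MAPPING)


def to_arabic(text: str) -> str:
    """Convert Buckwalter-transliterated text to Arabic script.

    Characters outside the table are passed through unchanged.
    """
    return text.translate(TABLE)
```

For instance, to_arabic("ktb") returns the three letters kaf, teh, beh, the consonantal skeleton shared by the stems listed above.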
Version 2.0 of the Perl version of Aramorph uses XML dictionaries, but is unfortunately not compliant with the GPL.
