I have been learning Esperanto lately (see here). One of the really cool features of the language is affixes. Basically, you can create new words using some simple morphological rules, e.g.:
bona (good) → bonulo (good person)
juna (young) → junulo (young person)
vorto (word) → vortaro (group of words = dictionary/vocabulary)
arbo (tree) → arbaro (group of trees = forest)
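A minimal sketch of that rule in Python, just to make it concrete. This is not taken from my actual code; the derive function and its arguments are names invented for this illustration, and it assumes the word ends in a single-letter grammatical ending (which is true of the examples above).

```python
def derive(word, suffix, ending=None):
    """Attach a suffix between a word's stem and its grammatical ending.

    Hypothetical helper: assumes the last letter of `word` is its ending
    (-o noun, -a adjective, ...). If `ending` is None the old one is kept.
    """
    stem, old_ending = word[:-1], word[-1]
    return stem + suffix + (ending if ending is not None else old_ending)


print(derive("bona", "ul", "o"))  # bonulo
print(derive("juna", "ul", "o"))  # junulo
print(derive("vorto", "ar"))      # vortaro
print(derive("arbo", "ar"))       # arbaro
```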
There are a lot of affixes (at time of writing I have 48 suffixes and 18 prefixes), so I thought it might be useful to write a small program to create new words by randomly attaching these affixes, then quizzing myself on them.
The code is in soup.py. Usage is like this, for example:
> soup(root=u'hundo', n_p=1, n_s=4, cheat=True)
hundo + pseŭdo : false
+ uj : container for objects described by root
+ esk : similar to/in the manner of whatever is described by root
+ eg : augments or strengthens idea shown by root affix(opposite of -et)
+ ec : quality/characteristic defined by root
pseŭdohundujeskegeco
n_p is the number of prefixes and n_s the number of suffixes.
The cheat flag toggles printing the explanation.
So let's interpret pseŭdohundujeskegeco... this is an abstract noun, the
quality/characteristic of being a large thing similar to a container for false
dogs. Or a false quality of being a large thing similar to a container for dogs.
The order of interpretation is clear when there are only suffixes or only
prefixes, but I'm not sure how to resolve it when both are present.
This is obviously a ridiculous word which no normal person would use, but I find generating and interpreting these very entertaining. Another example...
baledejarinegestro: boss of an enormous, somehow female collection of ballet theatres
I could go on all day. To save myself the effort of doing this, I automated it. So now there's a...
vortidplenigilo: tool to make [something] full of word derivatives, from vorto + ido + plena + igi + ilo.
Every hour (or so), it tweets a random root (grabbed from a dictionary)
with a random number of suffixes and prefixes. Code is in the same repo as
before, see vortidplenigilo.py. It chooses how many affixes to use based on
two draws from Poisson distributions, preferring fewer prefixes. Since it's
limited by Twitter's 140-character limit, words with n_s or n_p above 1 tend
not to make it, unfortunately. Future work will shorten the descriptions so I
can squeeze more in. The choice of which affix to attach is not entirely
random, however...
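Before getting to that, here is a rough sketch of the "two Poisson draws" idea. It is not the code from vortidplenigilo.py: the function names and the two mean values are made up for illustration, and the sampler is Knuth's simple multiplication method, written out so the example needs only the standard library.

```python
import math
import random

TWEET_LIMIT = 140  # Twitter's character limit at the time of writing


def poisson(lam, rng=random):
    """Draw one sample from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def draw_affix_counts(mean_prefixes=0.3, mean_suffixes=1.0, rng=random):
    """Return (n_p, n_s). The means are assumptions, chosen only so that
    prefixes come out rarer than suffixes."""
    return poisson(mean_prefixes, rng), poisson(mean_suffixes, rng)


def fits_in_tweet(text, limit=TWEET_LIMIT):
    """True if the word plus its explanations would fit in one tweet."""
    return len(text) <= limit


n_p, n_s = draw_affix_counts()
print(n_p, n_s)
```

With small means like these, most draws give zero or one of each, which suits the length limit: every extra affix brings its explanation along with it.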
Not every affix can go on every type of word. Some take nouns and output nouns,
others take nouns and output adjectives, etc. The page I grabbed the affixes
from thankfully lists which transformations are valid, so I encoded that. See
affixes.py for what is essentially a rendering of the aforementioned page into
Python. The sort of information I recorded is explicit in this class definition:
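class affix(object):
    # Note: the dict defaults below are shared between instances, so treat
    # them as read-only (or pass fresh objects in).
    def __init__(self, name=u'undefined',
                 transformations={'x': 'x'},
                 explanation='undefined',
                 conflicts={},
                 category='undefined'):
        self.name = name                        # the affix itself
        self.transformations = transformations  # input ending -> output ending
        self.explanation = explanation          # human-readable meaning
        self.conflicts = conflicts              # names of affixes it may not join
        self.category = category                # kind of affix (currently unused)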
transformations is a dict of valid word-type maps, based on word-endings
(since Esperanto is so very regular in this regard). In practice these
dictionaries either have one element (e.g. 'a': 'o') or all
('i':'i' for 'i' in valid_word_endings), but in theory one could have an affix
which turns adjectives into verbs and nouns into adjectives, I suppose. Or
something like that. My code is future-proofed against complicated Esperanto
dystopias. The point is that as the compound word is created, I keep track of
its current 'word type' and make sure I only accept affixes which are
compatible with that (and then it gets a new type from its new affix, and so
on). This all takes place in the make_soup function in soup.py.
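To illustrate the bookkeeping, here is a small sketch of type-tracked suffix attachment. It is not the real make_soup: attach_suffixes is a made-up name, only suffixes are handled, and the tiny affix tables in the usage lines are illustrative entries rather than copies of what is in affixes.py.

```python
import random


def attach_suffixes(root, n_suffixes, suffixes, rng=random):
    """Attach up to n_suffixes random suffixes, tracking the word type.

    `suffixes` maps a suffix name to its transformations dict
    (input ending -> output ending). Hypothetical helper, for illustration.
    """
    stem, ending = root[:-1], root[-1]
    for _ in range(n_suffixes):
        # Only suffixes that accept the current word type are candidates.
        candidates = sorted(name for name, t in suffixes.items() if ending in t)
        if not candidates:
            break  # nothing compatible, stop early
        name = rng.choice(candidates)
        stem += name
        ending = suffixes[name][ending]  # the word takes on its new type
    return stem + ending


# Illustrative table: -ul turns adjectives into nouns, -ar maps nouns to nouns.
table = {"ul": {"a": "o"}, "ar": {"o": "o"}}
print(attach_suffixes("bona", 2, table))  # bonularo
print(attach_suffixes("arbo", 1, {"ul": {"a": "o"}}))  # arbo (no match)
```

In the first call only -ul accepts an adjective, and once the word has become a noun only -ar is left, so the result is deterministic despite the random choice.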
explanation is just the string explaining the affix.
conflicts is a list of other affixes (by name) which I forbid to co-exist
in a compound word. The idea is to prevent illogical things like
arbarero: one of a collection of trees... a tree
dormigiĝi: to become made to be asleep... to sleep
I'm not entirely convinced I want this, though. For example,
hundetego (huge small dog) sort of makes sense. The jury is out on this decision.
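For the curious, a conflict check can be as small as the sketch below. The allowed function and the CONFLICTS table are hypothetical stand-ins, not the contents of affixes.py, and checking the relation in both directions is my assumption here, made so that each pair only has to be listed once.

```python
# Hypothetical conflict table, built from the three examples above.
CONFLICTS = {
    "er": ["ar"],  # arbarero
    "iĝ": ["ig"],  # dormigiĝi
    "eg": ["et"],  # hundetego (the debatable one)
}


def allowed(candidate, used, conflicts=CONFLICTS):
    """True if `candidate` conflicts with none of the affixes already used.

    The relation is checked in both directions.
    """
    for name in used:
        if name in conflicts.get(candidate, ()):
            return False
        if candidate in conflicts.get(name, ()):
            return False
    return True


print(allowed("er", ["ar"]))  # False
print(allowed("ul", ["ar"]))  # True
```

Allowing hundetego again would then just be a matter of deleting the "eg" entry.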
category records what type of affix it is, and is currently not used. A future
version could restrict to true affixes or adjective suffixes or something.
Future proof, yo. Maybe.
One of the bot's first tweets was beautifully meta:
morfologiido: offspring of morphology
I would gladly welcome comments/ideas on the GitHub repository,
be it language suggestions or corrections (since I am still a komencisto), code
fixes, ideas for automatically producing 'interpretations' of the generated
words, or anything else. The contents of affixes.py might also be useful
for other people doing things with Esperanto.