Introduction

In every spoken language sentences are linear, but they can usually be parsed into a tree. “I see a man with a telescope.” can be parsed as “I see (a man) (with (a telescope)).” or as “I see (a (man (with (a telescope)))).” Either way we get a tree where some phrases are modified by others. I call the modifying ones “modifiers”. In English, an adjective can modify a noun, a prepositional phrase can modify a verb or a noun etc. Such systems reduce syntactic ambiguity, but don’t eliminate it.

Sapolim eliminates syntactic ambiguity while keeping the grammar simple. It has only one part of speech for content words, so there are no nouns, adjectives etc. The main new idea is to inflect words according to their depth in the parse tree and to use head-initial syntax.

Sapolim was inspired by Toki Pona and Lojban. Toki Pona is simple, but very ambiguous, while Lojban is fairly unambiguous, but very complex. I want to combine the best qualities of both of them.

Simplicity

‘Simple’ is not the same as ‘easy’. A simple grammar, for me, is a grammar that can be described formally using few rules. An easy grammar is a grammar that is easy to learn; this quality is subjective. Sapolim is not meant to be easy for anyone, but it is meant to be speakable.

Main Assumptions

Sapolim is always open to further development, but there are some features I’d like to leave intact:

  • a simple and unambiguous grammar,

  • phonetic spelling,

  • few structural words (conjunctions etc.),

  • content words with well-defined, unambiguous meanings,

  • explicit version indication.

Phonology

There are 11 phonemes: a, e, i, o, u, k, l, m, p, s, t.

Vowels

Vowels are pronounced like in Spanish, Italian, most Slavic languages, Japanese, Esperanto, Lojban, Quenya, Toki Pona and many other languages. More precisely (using IPA):

  • a is an open unrounded vowel: a, ä, ɑ

  • e is a close-mid to open-mid front unrounded vowel: e, e̞, ɛ

  • i is a close front unrounded vowel: i

  • o is a close-mid to open-mid back rounded vowel: o, o̞, ɔ

  • u is a close back rounded vowel: u

Consonants

The consonants (the first sound listed is always the default):

  • k is any velar or uvular stop: k, ɡ, q, ɢ

  • l is any lateral or rhotic consonant: l, r, ɾ etc.

  • m is any nasal consonant: m, n, ŋ, ɴ etc.

  • p is any labial stop or fricative: p, b, ɸ, β, f, v etc.

  • s is any sibilant fricative: s, z, ʃ, ʒ, ʂ, ʐ, ɕ, ʑ etc.

  • t is any dental, alveolar or retroflex stop: t, d, ʈ, ɖ

    • and affricates: t͡s, d͡z, t͡ʃ, d͡ʒ, t͡ɕ, d͡ʑ, ʈ͡ʂ, ɖ͡ʐ etc.

Phonotactics

In native words (not proper names) consonants and vowels always alternate and the last sound is always a consonant. If you have a problem with ending words in a consonant, you can add a short unstressed vowel at the end.

Words starting with a vowel should be pronounced with an initial glottal stop or a short pause. They shouldn’t be glued to the preceding word. E.g. kak sel om lemet (‘I have food’) shouldn’t be pronounced like /kak selom lemet/. A glottal stop is very easy to utter before an initial vowel and you probably do it even if you’re not aware of it.
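The shape rule for native words can be checked mechanically. Here is a minimal Python sketch, assuming words are written with the eleven phoneme letters only; the name is_native_shape is made up for this illustration and is not part of any Sapolim tooling.

```python
VOWELS = set("aeiou")
CONSONANTS = set("klmpst")


def is_native_shape(word: str) -> bool:
    """Check the native-word shape: alternating sounds, final consonant."""
    # Only the eleven phoneme letters are allowed (an assumption of this sketch;
    # proper names are exempt from the rule and are simply rejected here).
    if not word or any(ch not in VOWELS | CONSONANTS for ch in word):
        return False
    # The last sound must be a consonant.
    if word[-1] not in CONSONANTS:
        return False
    # Neighbouring sounds must belong to different classes.
    return all((a in VOWELS) != (b in VOWELS) for a, b in zip(word, word[1:]))
```

Both content words such as pames and structural words such as om pass the check, while a string with a consonant cluster or a final vowel does not.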

Stress

Stress is irrelevant. You can divide speech into words easily (as long as there are no proper names), because all words start with a consonant (or a glottal stop) and end with a consonant, and there are no consonant clusters inside words.
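Because there are no clusters inside words, every place where two consonants meet has to be a word boundary. The following Python sketch illustrates that reasoning. It assumes a transcription in which the glottal stop (or pause) before a vowel-initial word is written as an apostrophe, and it assumes there are no proper names in the stream. The function name and the apostrophe convention are inventions of this example.

```python
from typing import List

VOWELS = set("aeiou")


def split_words(stream: str) -> List[str]:
    """Split an unspaced phoneme stream into words at consonant clusters."""
    words: List[str] = []
    current = ""
    for sound in stream:
        # Two non-vowels in a row (the apostrophe counts as one) mark a boundary.
        if current and sound not in VOWELS and current[-1] not in VOWELS:
            words.append(current)
            current = ""
        # The glottal stop only signals the boundary; it is not spelled.
        if sound != "'":
            current += sound
    if current:
        words.append(current)
    return words
```

For example, split_words("kaksel'omlemet") recovers the four words kak, sel, om, lemet.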

Vocabulary

The current vocabulary list is here.

All structural words start with a vowel, all content words start with a consonant.

No two words should have the same consonants in the same order. If two two-syllable content words differ in only one consonant, the second vowel has to be different.
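The two distinctness conditions can be read as a pairwise test on candidate words. The Python sketch below is one possible reading, not the generator actually used for the vocabulary: it treats a two-syllable content word as a five-letter CVCVC string, and the names skeleton and conflicts are hypothetical.

```python
VOWELS = set("aeiou")


def skeleton(word: str) -> str:
    """Return the consonants of a word in order."""
    return "".join(ch for ch in word if ch not in VOWELS)


def is_two_syllable_content(word: str) -> bool:
    # Assumption of this sketch: such words have the shape CVCVC.
    return len(word) == 5 and word[0] not in VOWELS


def conflicts(a: str, b: str) -> bool:
    """Return True if the two words are too similar to coexist."""
    sa, sb = skeleton(a), skeleton(b)
    # Rule 1: the same consonants in the same order are never allowed.
    if sa == sb:
        return True
    # Rule 2: two-syllable content words that differ in exactly one
    # consonant must also differ in their second vowel (index 3 in CVCVC).
    if is_two_syllable_content(a) and is_two_syllable_content(b):
        differing = sum(x != y for x, y in zip(sa, sb))
        if differing == 1 and a[3] == b[3]:
            return True
    return False
```

With the made-up candidates pamet and pamot, conflicts("pames", "pamet") is true (one consonant differs, same second vowel), while conflicts("pames", "pamot") is false.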

I’ve generated a list of words fulfilling the above conditions and assigned them meanings at random, so there is no connection between the meaning and the shape of the word. The most basic words are monosyllabic, others have two syllables.

There are no compound words, not even “compound word” phrases from Toki Pona. You just have to describe what you have in mind using simple words and usually a broader meaning than you would in a more complex language.

Proper names are mostly left in the original spelling or pronunciation.

Inflection

Sapolim has unusual inflection. Each lexical unit (lexeme) can take five possible forms: predicate (the head) or one of four levels of modifiers, called mod1, mod2, mod3, mod4. The idea is that the predicate is modified by mod1, mod1 is modified by mod2 and so on. Only four levels of modifiers are possible in the current grammar.

When you’re modifying a phrase which already has some modifiers, you’re really adding another modifier to the head of this phrase. So “(pretty little) (girls school)” would become ‘school-mod1 girl-mod2 little-mod2 pretty-mod3’ (assuming all of these words are in the dictionary, which isn’t currently the case).

Words inflect by exchanging the first vowel. Vowels are assigned like this:

  • A: head

  • E: mod1

  • I: mod2

  • O: mod3

  • U: mod4

For example the forms of pames are: pames, pemes, pimes, pomes, pumes.
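The inflection rule is regular enough to write down in a few lines. A minimal Python sketch follows, assuming level 0 stands for the head and levels 1 to 4 for mod1 to mod4. The function names are made up for this example, and refusing vowel-initial (structural) words is my reading of the examples rather than a stated rule.

```python
LEVEL_VOWELS = "aeiou"  # index 0 = head, 1..4 = mod1..mod4


def first_vowel_index(word: str) -> int:
    """Return the position of the first vowel in the word."""
    for i, ch in enumerate(word):
        if ch in LEVEL_VOWELS:
            return i
    raise ValueError(f"no vowel in {word!r}")


def inflect(word: str, level: int) -> str:
    """Return the form of a content word for the given tree depth."""
    if not 0 <= level <= 4:
        raise ValueError("level must be 0 (head) to 4 (mod4)")
    # Assumption: only content words (consonant-initial) are inflected.
    if not word or word[0] in LEVEL_VOWELS:
        raise ValueError("expected a content word starting with a consonant")
    i = first_vowel_index(word)
    return word[:i] + LEVEL_VOWELS[level] + word[i + 1:]


def level_of(word: str) -> int:
    """Read the tree depth back from the first vowel of a content word."""
    return LEVEL_VOWELS.index(word[first_vowel_index(word)])
```

Calling inflect("pames", n) for n from 0 to 4 reproduces the five forms listed above, and level_of reverses the mapping.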

Syntax

* means zero or more, ? means zero or one, | is an alternative.

S   ::= STAG* HP
HP  ::= H (TAG M1P)*
M1P ::= M1 (TAG M2P)*
M2P ::= M2 (TAG M3P)*
M3P ::= M3 (TAG M4)*
TAG ::= MTAG? | CONJ

H  ::= CONV* HWORD  PN* NUM*
M1 ::= CONV* M1WORD PN* NUM*
M2 ::= CONV* M2WORD PN* NUM*
M3 ::= CONV* M3WORD PN* NUM*
M4 ::= CONV* M4WORD PN* NUM*
  • S: sentence

  • HP: head phrase

  • M1P, M2P, M3P, M4P: modifier phrases

  • H: head

  • M1, M2, M3, M4: mod1, mod2, mod3, mod4

  • STAG: sentence tag (yes/no question, imperative, politeness)

  • MTAG: modifier tag

  • CONJ: conjunction

  • CONV: conversion marker

  • *WORD: the actual inflected word

  • PN: proper name

  • NUM: decimal digit or digit separator

STAG, MTAG, CONJ, CONV, NUM are structural words. I try to keep the number of such words reasonably low.

Semantics

HP constitutes the predicate of the sentence. M1Ps have multiple roles: subject, object and modifiers. Their roles are ambiguous unless clarified by a tag. By default, the subject comes before the object (Sapolim is a VSO language).

Conversion Markers

Conversion markers are words that are placed before a content word to transform the relation in some regular way.

Abstraction

Some relations have arguments that can be abstractions (events, facts, properties etc.). The abstraction marker it converts a subsentence to an abstraction: ‘S is an event/fact/property/etc. of [subsentence]’.

pames
S loves O
it pames
S is love
palet
S is red
it palet
S is the color red

The relation after the abstraction marker can be modified in the same way as any other subsentence.

kal it letopuk sil
It’s possible that I fight.

I can fight.

kas it is lemet sil it tel it lilil sol
If I don’t eat, I die.
sapes lek it pemes sil lik
You know that I love you.

Passive Voice

The passive voice marker om inverts transitive relations, switching the subject and the direct object.

lamet
S eats/consumes O; S is an eater/consumer of O
om lamet
S is eaten/consumed by O; S is food/drinks of O

Reciprocity

The reciprocity marker ak makes a predicate symmetric by adding an implicit “and vice versa”.

pames sel lek
I love you.
ak pames sel lek
We love each other.

Reflexivity

The reflexivity marker ut makes a predicate reflexive. The subject of the converted predicate is both the subject and the object of the original predicate.

ut pames op sel
I love myself.

Association

The association marker am transforms a predicate into ‘S is in some relation with X’ where X is bound to the S place of the predicate being transformed. In other words, it’s a contraction of las es. It’s just as vague as las itself.

lam op am pekem
a house of an animal
lam op les es pikem
a house in some relation with an animal

Negation

The negation marker is makes a logical negation of the following sentence or subsentence. It’s equivalent to saying “It’s not true that…”.

pames sel lek
I love you.
is pames sel lek
I don’t love you.
pames sel is lek
I love somebody who is not you.
is pames sel is lek
I don’t love anybody other than you. (but doesn’t imply ‘I love you’)

Modifier Tags

There is a place for an optional tag before each modifier. When the tag is omitted, the role of a modifier is vague and should be deduced from context.

There are no actual noun phrases; each modifier is in fact a subsentence.

Subject, Object

The two most basic tags are op (the subject tag) and es (the object tag). They indicate that the subject (S) of the modifier sentence is linked to the subject (S) or the object (O) of the modified sentence. The subject tag can be used with every phrase, even if its subject is already linked.

pas
S is a part of O
palelol
S is a fruit/vegetable
pas es pelelol
a part of a fruit
talat
S is big/large
palelol op telat
a big fruit
kak
S possesses O
sal
S is me (the speaker/author)
kak op sel es pelelol op tilat
I have a large fruit
In this example ‘large’ is a “subject” of ‘fruit’. This may not be intuitive, but it just means that the S places of ‘large’ and ‘fruit’ are linked, i.e. they refer to the same object (which is also linked to the O place of ‘possess’).

Abstraction Modifier (Adverbial)

The adverbial tag il adds a modifier to the abstraction of the modified phrase. In other words, X il Y adds a statement that Y op it X.

lakom sel lek
I see you.
lakom sel lek il temas
I see you + seeing you is good.

Good to see you (or: I see you well).

The adverbial tag may also be used inside an abstraction, which is the easiest way to modify an abstraction (and not the phrase being abstracted).

tamas it pemes op mim
The love of a great person is good.
tamas it pemes il mim
The great love is good.

Modifier Scope

Each modifier modifies the phrase one level lower directly before it, i.e. the phrase starting at the last occurrence of a form one level lower and ending directly before the modifier. E.g. each M3 modifies a phrase starting with the last M2 and ending just before this M3.
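The scope rule means the parse tree can be rebuilt from the inflection levels alone. The Python sketch below illustrates this under simplifying assumptions: it handles a single sentence, reads the level from the first vowel of each content word, and skips every token that doesn't start with a lowercase consonant (tags, conversion markers, digits, capitalised proper names). The function names and the nested-list tree format are made up for this example.

```python
LEVEL_VOWELS = "aeiou"  # index 0 = head, 1..4 = mod1..mod4
CONSONANTS = set("klmpst")


def level_of(word: str) -> int:
    """Read the tree depth from the first vowel of a content word."""
    for ch in word:
        if ch in LEVEL_VOWELS:
            return LEVEL_VOWELS.index(ch)
    raise ValueError(f"no vowel in {word!r}")


def parse_levels(sentence: str) -> list:
    """Build a tree of [word, [modifiers]] nodes from inflection levels."""
    stack = []  # stack[n] is the most recent open node at level n
    for token in sentence.split():
        if token[0] not in CONSONANTS:
            continue  # not a content word: ignored in this sketch
        level = level_of(token)
        if level > len(stack) or (level == 0 and stack):
            raise ValueError(f"{token!r} has nothing to attach to")
        node = [token, []]
        del stack[level:]  # phrases at this level or deeper are now closed
        if level > 0:
            stack[level - 1][1].append(node)  # attach to the last form one level lower
        stack.append(node)
    if not stack:
        raise ValueError("no head found")
    return stack[0]
```

For sapes lek it pemes sil lik this yields sapes as the head with the modifiers lek and pemes, where pemes in turn carries sil and lik.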

Digits, Proper Names

Digital numbers modifying a word mean quantity:

lam opus usat
‘34 houses’

A digital number always modifies only the last word. If you want to attach quantity to a larger phrase, you have to use mak.

mak
S is one/single
mak opus usat
S is 34 single objects
pal temas mek opus usat
34 good people (literally ‘good people being 34 single objects’)
pal temas opus usat
a person of 34 good things

Proper names are analogous: they can modify a word directly, giving the referred object a name, e.g. pel Seko (a person named Seko), or they can attach to om tat:

om tat
S is named O
om tat Seko sel
I am named Seko.
om tat Seko
someone/something named Seko
pel timas om tit Seko
a good person called Seko
You can also introduce yourself using sal (me) as a head: sal Seko.

Conjunctions

There are four conjunctions. They have almost the same syntax as the modifier tags. One of them is a simple logical non-exclusive or: ol.

There is no logical ‘and’. To state many things about something, you can just use multiple modifiers.

palelol op telat op temas
[It’s] a large and good fruit
lamet op sel op pemes
I eat and love.

There is ep (with) which is like ‘and’, but indicates that the connected phrases should be regarded as a sum.

The last one is al (question or), which turns a sentence into a question about which side of the conjunction makes the sentence true.

Modifiers before a conjunction modify only the first part, while modifiers after it (and on the same level) modify the conjoined result.

lam ep pelos
house and land
lem ep pelos am sel
my [house and land]
lem am sel ep pelos
[my house] and land
lam ep pelos am sil
house and my land

Sentence Tags

The yes/no question marker ek turns a sentence into a question: is this sentence true?

The imperative marker im turns a sentence into a command: make this sentence true!

The politeness indicator ikosel is used to express kind intentions, e.g. added to an imperative, it makes it explicitly a polite request. It can also be used without a sentence in place of any polite phrase, e.g. “Thank you”, “I’m sorry”, “You’re welcome”, “Hello”, “Goodbye” etc.

Odds and Ends

The generic anaphora tam refers vaguely to something mentioned earlier.

tem
he/she/it
pel tim
that person

The next-sentence cataphora mat refers to the next sentence when a relative clause would be used in many natural languages (much like the frequent use of “ni” in Toki Pona). It doesn’t make much sense as a head.

The question word pam turns a sentence into a question: what should be in place of the question word to make the sentence true?

Versioning

I think it’s a good idea to keep open the possibility of breaking changes in the future. To make that as painless as possible, I attach version numbers to successive versions of the language and keep the documentation of old versions. People will be able to specify which version of the language they’re using in their writings by adding an explicit declaration at the beginning.

The syntax for version declaration is VER PN* NUM+, where VER is a special structural word, the version indicator (apulis). Proper names may be used in unofficial forks to identify them. Official versions only use numbers. The current version declaration is apulis akel (version 0), which means that the language is not yet officially released and that the current version won’t be supported in the future. The official release will happen as soon as the textbook is ready.