I’ve received a lot of mail over the years that amounts to “OK, but how do I do it?” This page, adapted from the first chapter of Advanced Language Construction, is an attempt to answer that question, as well as similar questions like “How do I know when I’m done?” and “Is it weird enough?” And don’t miss the section on how to gloss.
Basic outline
Beginning a novel, you have to face the horror of staring at a blank page. It’s easier with a conlang: you can start by writing an outline! Then you can stare at a blank outline instead.
Here’s the overall outline I start with:
Introduction
If you’re not used to outlining, the idea is to state your topics and their order before you actually write anything. You don’t write straight ahead from the first sentence of the introduction all the way to the words starting with Ž. You can work on topics in any order; the outline makes sure they’re in the right place and you don’t forget anything.
When you think of a new topic, add it to the outline; you don’t have to fill it out immediately. Topics can have subtopics, to any level you like. For instance, you could go add subtopics to Phonology right now:
Any modern word processor, like Word, will have useful facilities to work with outlines. E.g. you can move entire topics around (their subtopics and text will come with), or view just the titles of the outlines without the text.
Start adding text to the topics, in any order. You could start with a list of vowels (you can make a nice table later):
You may find it helpful to add a symbol so you know what hasn’t been filled in yet. I use STD or $$$. Then I can jump quickly to the first uncompleted section by searching for this text.
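If you keep the grammar as plain text, the same marker search can be scripted. The following is only a sketch: the sample outline and the function name `unfinished_sections` are made up for illustration, and the markers are simply the two mentioned above.

```python
def unfinished_sections(outline_text, markers=("STD", "$$$")):
    """Return (line_number, line) pairs for lines that still contain a marker."""
    hits = []
    for number, line in enumerate(outline_text.splitlines(), start=1):
        if any(marker in line for marker in markers):
            hits.append((number, line.strip()))
    return hits


# Hypothetical outline with two sections still to be filled in.
grammar = """Phonology
Vowels: i e a o u
Consonants: $$$
Morphology
Nouns: STD
Verbs: conjugated by person and number
"""

todo = unfinished_sections(grammar)
for number, line in todo:
    print(f"line {number}: {line}")
```

Pick a marker that cannot occur in ordinary text or in your conlang's words, or the search will produce false hits.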
Some of the sections may not make sense for a given language, or will logically appear in a different place. E.g. if you have an alphabet, it’s more convenient to treat that under Phonology; while if you have an isolating language, you may have no inflectional morphology at all.
I am simple caveman, not know ‘computer’
You can work on paper if you prefer; that’s how I did Verdurian. Just expect to go through multiple drafts.
If you use a binder and loose pages, you can easily replace just a section of the grammar. Start new sections on a new page, and keep everything about a language together— avoid having your notes in five different piles or notebooks.
You can keep a dictionary in alphabetical order by maintaining two columns and just writing in one. New words get placed in the second column. When it starts to get unreadable, it’s time to make a new edition. Index cards work too, with less rewriting.
Plan of attack
I work on a grammar iteratively, going back and forth between sections. But my overall progress usually looks something like this:
How do I choose?
How do you know which features to add, which way to implement them, what the word for ‘fish’ should be?
Some people struggle with this; I hope it’ll help if I say that there is no right answer. No one can tell you when you need to break out the ergativity machine or drop in some evidentials.
Creating a language is much like drawing a cartoon character, where you just arbitrarily decide whether it’s a male or female, human or dog or turtle, how big to make the nose, whether to add a ponytail or a dashiki. The skill is in the naturalism, detail, and consistency, not in the choice of accoutrements.
Though you can certainly interpret a non-English feature in your own way, it’s always a good idea to look at natural models. If you have the print LCK or Advanced Language Construction, review the appropriate section. If not, check out the grammar of a language that has that feature, or at least look it up on Wikipedia.
Which language is this?
Here’s some good advice you probably won’t take: don’t start with your main language (that of your protagonists or major story setting).
You’ll get better at conlanging as you do more. Your first language is likely to be the least satisfying.
What I recommend is to work first on your protolanguage, the ancestor of your main language. Then use the SCA (Sound Change Applier) to derive the words for its descendant. This will not only give you a more naturalistic vocabulary, it’ll give you an ancestor you can borrow learned words from.
(Is there anything special about creating a protolanguage? No, it’s just a language. It doesn’t have to be like Latin.)
I should note that if you use the SCA from a large wordlist, you will of course start with a large wordlist. That’s great! The gotcha here is assuming that every word means the same as in the parent language. A large number of them should change meaning. And for more naturalism, many words should come from a derived form, as e.g. French soleil ‘sun’ comes from the diminutive of Latin sōl.
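To make the idea of deriving words by sound change concrete, here is a minimal sketch. It is not the SCA itself and does not use its rule format: the rules are plain regular expressions, and both the rules and the proto-words are invented for illustration.

```python
import re

# Hypothetical ordered sound changes, applied from first to last.
RULES = [
    (r"(?<=[aeiou])p(?=[aeiou])", "b"),  # p voices between vowels
    (r"k(?=[ie])", "ch"),                # k palatalizes before front vowels
    (r"u$", "o"),                        # word-final u lowers to o
]


def apply_changes(word, rules=RULES):
    """Run one proto-word through every rule in order."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word


# Hypothetical protolanguage wordlist.
proto_words = ["kapu", "kipa", "taku"]
daughter_words = {word: apply_changes(word) for word in proto_words}
print(daughter_words)
```

Rule order matters, since a later rule sees the output of the earlier ones. The meanings still need adjusting by hand: nothing here shifts the sense of a word or picks a derived form as the source.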
Creating paradigms
I work out the morphology pretty early, because without it I can’t create sample sentences. You can leave gaps, but it’s hard to (say) introduce a whole new dimension of verbal conjugation late in the process.
The key moment in creating a paradigm is not deciding on the affixes, but creating the structure of the table. So if you create a blank table
you’ve already decided that your verbs are conjugated by person and number— and already eliminated interesting alternatives like obviative, dual, gender, and politeness forms!
Similarly you can easily create a present tense paradigm, then past and future, and not even realize that you never considered aspect, modals, or irrealis forms.
So, take a moment before filling out the table to think about whether it has the features you really want. (You can add more dimensions later; but if you do, don’t forget to check your sample sentences in case they need updating.)
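One way to see how much the table structure decides is to generate the blank table from its dimensions. This is just a sketch; the function name `blank_paradigm` and the feature labels are assumptions chosen for the example.

```python
from itertools import product


def blank_paradigm(dimensions):
    """Map every combination of feature values to an empty slot."""
    return {combo: None for combo in product(*dimensions.values())}


# The default table: person and number only.
simple = blank_paradigm({
    "person": ["1", "2", "3"],
    "number": ["sg", "pl"],
})

# The same verb with some of the alternatives mentioned above.
richer = blank_paradigm({
    "person": ["1", "2", "3", "3obv"],
    "number": ["sg", "dual", "pl"],
    "politeness": ["plain", "polite"],
})

print(len(simple), "cells versus", len(richer), "cells")
```

Each added dimension multiplies the number of cells, which is one reason to settle the dimensions before inventing affixes.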
If you look at an actual paradigm, like the present tense of French finir ‘finish’—
you may wonder where all that juicy variation comes from. How do you know how different to make the endings, or how many identical endings speakers will put up with?
This is agglutinative, but with two different pluralizers, -chik and -ku. The former is used when the listener is included, i.e. in the 2p and the inclusive 1p.
I like to keep the Morphology section focused on the paradigms, leaving their usage to the Syntax section. That’s for two reasons:
Placeholders vs. filling out
If you’re aiming at a grammar like mine, it’s apt to be 25+ pages of dry linguistic prose. Don’t be intimidated by the task of generating all that text. Start with placeholders, like this:
Questions: auxiliary verb pol
Assuming you’ve worked out how auxiliaries work, that’s all you need to write questions. In the final Munkhâshi grammar, I expanded this as follows:
It’s not just a matter of writing full sentences; trying to explain the procedure, you’ll find you have to work out minor details. In this case: what if there’s another auxiliary; how is the question answered; what about negative questions (not shown).
It’s work to create sample sentences and glosses, but every sentence you write is another chance to develop the vocabulary and add new points to the language.
Wordcrafting on the go
As you work on the grammar you’ll be inventing words; never create one without adding it to the lexicon, in alphabetical order. Not only does this ensure they don’t get lost, but it keeps you from accidentally creating homophones. Plus, it’s a lot of work to generate a lexicon, so every bit you do gets you closer to the finish!
E.g. the Dhekhnami word for swim was entered into the lexicon like this:
I always use a table format, which looks neater. If there are morphological peculiarities (such as the out-of-control plurals in Xurnese), I indicate these in a column just after the word itself.
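If the lexicon lives in a program rather than a word processor table, the two habits above (alphabetical order, no accidental homophones) can be enforced at insertion time. A minimal sketch, with made-up words and a hypothetical `add_word` helper:

```python
import bisect


def add_word(lexicon, word, pos, glosses):
    """Insert an entry in sorted position; refuse to create a homophone."""
    keys = [entry[0] for entry in lexicon]
    index = bisect.bisect_left(keys, word)
    if index < len(keys) and keys[index] == word:
        raise ValueError(f"{word!r} already exists, glossed {lexicon[index][2]}")
    lexicon.insert(index, (word, pos, glosses))


# Hypothetical entries: (word, part of speech, glosses).
lexicon = []
add_word(lexicon, "mira", "v", ["swim", "float"])
add_word(lexicon, "kado", "n", ["fish"])
add_word(lexicon, "tesu", "n", ["river", "stream"])

try:
    add_word(lexicon, "kado", "v", ["cut"])
except ValueError as error:
    print("rejected:", error)

print([entry[0] for entry in lexicon])
```

The sort here is Python's default string order. If your language has its own alphabetical order, or is sorted by root, you would need a custom sort key.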
(Some languages have a morphology that just spits on alphabetical order— e.g. Old Skourene agaşti ‘beloved’, eguşeta ‘romance’, gşiutta ‘affair’, and iggşet ‘loving’ are all formed from one root. So the lexicon is sorted by roots, and all these words are entered under gaşt- ‘love’.)
It’s a good habit to provide a part of speech column. This provides another place for morphological data (e.g. gender of nouns, conjugation class for verbs), it disambiguates glosses (e.g. ‘a bear’ vs. ‘to bear’), and it allows searches— e.g. you can look for all your prepositions.
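As a small illustration of what the part of speech column buys you, here is a sketch of a lexicon search that filters on it. The entries and the `find` function are invented for the example.

```python
from collections import namedtuple

Entry = namedtuple("Entry", "word pos glosses")

# Hypothetical lexicon; note the two different senses of English 'bear'.
lexicon = [
    Entry("gomu", "n", ["bear"]),
    Entry("teli", "v", ["bear", "carry"]),
    Entry("nu", "prep", ["in", "at"]),
    Entry("so", "prep", ["on", "over"]),
]


def find(lexicon, gloss=None, pos=None):
    """Return entries matching the given gloss and/or part of speech."""
    results = []
    for entry in lexicon:
        if pos is not None and entry.pos != pos:
            continue
        if gloss is not None and gloss not in entry.glosses:
            continue
        results.append(entry)
    return results


prepositions = find(lexicon, pos="prep")
to_bear = find(lexicon, gloss="bear", pos="v")
print([entry.word for entry in prepositions])
print([entry.word for entry in to_bear])
```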
Another good habit is to provide multiple glosses. Fight the tendency to make every word a one-for-one equivalent of one English word. This makes your language more naturalistic, and can save time later when you find you need the other word.
Extra credit if you take the time to work out some quick derivations. E.g. swim could generate words for swim (n), swimmer, swimming hole. Extra extra credit if some of the derivations aren’t also derivations in English. E.g. swim-thing might be the word for fish; swim + diminutive might be bathe.
I hate to create a word without an etymology. Dhekhnami is created mostly from Munkhâshi using the SCA, so to invent math I actually created mat, added it to the Munkhâshi lexicon, and ran it through the SCA. Often I’ll borrow the word instead, or derive it as a compound.
Words usually don’t retain a single meaning for millennia on end— you should often take the opportunity to modify the meaning of an inherited or borrowed word.
How do you look up a word when you need it? Well, you’re doing this on the computer, right? Use the search function. If it’s a common word that also turns up all over the grammar text, you can save time by placing the cursor at the beginning of the lexicon before searching, or by keeping your lexicon in a separate file.
An alternative is to include a separate English-to-Conlang lexicon. That’s not a bad thing to have, but it’s a huge hassle to maintain, and it makes it all too easy to create ciphers of English— e.g. you create a word for can and later when you want to translate ability you create a different word just because ability doesn’t yet have an entry. So it’s best to create such a lexicon when the language is pretty much done.
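If the main lexicon carries multiple glosses per word, the English-to-Conlang side does not have to be maintained by hand; it can be regenerated whenever you want it. A sketch under that assumption, with invented words and a hypothetical `reverse_index` function:

```python
from collections import defaultdict


def reverse_index(lexicon):
    """Build an English-to-conlang index from a conlang-to-English lexicon."""
    index = defaultdict(list)
    for word, glosses in lexicon.items():
        for gloss in glosses:
            index[gloss].append(word)
    # Sort the English headwords and the conlang words under each one.
    return {gloss: sorted(words) for gloss, words in sorted(index.items())}


# Hypothetical lexicon: one word covers both 'can' and 'ability'.
lexicon = {
    "mira": ["swim", "bathe"],
    "kado": ["fish"],
    "podu": ["can", "ability", "power"],
}

english = reverse_index(lexicon)
for gloss, words in english.items():
    print(f"{gloss}: {', '.join(words)}")
```

Because the index is derived, looking up ability finds the existing word for can instead of prompting you to invent a new one.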
Here’s a checklist, not at all exhaustive, of things that you should consider putting in the grammar somewhere.
Is it complicated enough?
You may be trying for a simplified language, or you’re just in a hurry to get done. But a hallmark of natural languages is their almost fractal complexity. There’s always another exception or complication, and linguists can write entire dissertations on a single word.
Complexities may occur to you if you just think hard about a feature. Say you’re thinking about comparatives: you work out how to say bigger than a mammoth. Revolve the concept of comparison around in your head— does your method work on these cases?
superlatives (biggest of all); note that speakers may turn absolutes into intensives (fortissimo = very strong)
You can’t always think of such variations just staring at the computer. Alternatives include looking at other people’s grammars, and waiting till interesting cases come up in sample texts.
Sometimes an idea that didn’t make it into the morphology may pop up elsewhere. E.g. French doesn’t have evidentials as a morphological category, but it can use the conditional as one: il serait allé can be used for “he supposedly went”. English doesn’t have a topic particle, but clefting is a substitute: what I’m looking for is a cheap bicycle.
Another source of complication is to think about variations of dialect or register. Come up with three ways to solve the problem and assign one to the yokels from Nowheresville and another to colloquial speech. If you’ve derived your language from a parent, the newer language may have innovated a new method but kept the parent’s method in formal written language.
Six quirky constructions
Languages are full of minor constructions with their own odd syntax; here’s a sampling. You don’t have to address these in particular; the point is that once you start looking you’ll find more and more.
Is it simple enough?
Maybe you’re making an auxlang, or a pidgin, or an interlanguage for talking to AIs, or something else where simplicity is a virtue. In that case the thing to watch for is borrowing complexities from English (or other natlangs) that you don’t really need.
Check your verb conjugations... do you really need each dimension of inflection? Do you need time and aspect?
Less radically, you can ruthlessly combine categories, in the manner of the Australian avoidance languages. These are languages that were required for all conversation with taboo relatives, such as mothers-in-law. One word in the avoidance language often corresponded with half a dozen in ordinary language; e.g. nyirrindan in Jalnguy stood in for seven Guwal words used for different kinds of spearing or poking. You might have only one word for all sorts of small omnivores, or all older relatives, or all ways to hurt someone. It’s less precise, but it works and it sure cuts down on words.
(Hey, while I’ve got the book open, here’s a cool word from Guwal: banyin means ‘get a stone tomahawk and bring it down on a rotten log so the blade is embedded in the log, then pick up both tomahawk and log by the handle of the tomahawk and bash the log against a tree so that the log splits open and the ripe grubs inside it can be extracted and eaten.’)
But yeah, it’s generally less interesting to just redo English or do a neo-Romance language. How close is your language to the following?
Standard Fantasy Phonology (i.e. English plus kh)
If it’s pretty close (again, it’s no sin), you’re not taking advantage of the breadth and strangeness of natural languages. Review the options given in the Language Construction Kit; even more are in the print books.
I’m generally satisfied if I can point out four or five ‘interesting features’ of a language... these can be unusual features, or just things I want to play with. For instance, for Old Skourene:
If you’re creating an auxlang, you don’t want weirdness per se, but if your idea can be described as “Esperanto done right”, be aware that Esperanto is blandly European and that its creator would have done well to learn a lot more about Amerindian or East Asian languages.
Sample texts
Writing texts in your language is like exercise: it’s work, but it’s good for you. Every sentence you write is an opportunity to develop the lexicon, confront syntactic oddities, and show off the culture.
For the last reason, I don’t advocate translating standard texts (like the Babel story). Instead, showcase something from your culture. Some ideas:
A conversation with a visitor (a chance to work out greetings and other mechanics of conversation)
If your conculture differs spectacularly from modern earthly models, focus on that. E.g. the Lé are female-dominant, so one of my Lé sample texts is a pious letter from a mother instructing her son on how to fit into the matriarchal clan he’s marrying into.
В России все работают на заводе.
Ha, I’m just winding you up. You don’t need all of that, though it’s all useful. In order, the lines are:
You won’t be able to provide the native writing system unless you have a font for it, and if your Phonology section is good enough the phonetic representation is just a convenience. So that leaves us with the transliteration, gloss, and translation. When I was starting out I’d often skip the gloss, but now I think it’s essential. It allows the reader to follow the grammatical descriptions without learning the language. (And it’s a big help even if they are learning it.)
Glosses are chunky to read. You could try expanding them—
in | Russia singular locative | everyone plural nominative | work third person plural present indicative | on | factory singular locative
but that’s not really more readable, is it?
The convention is that - separates morphemes, while . separates words required to explain the morpheme. So work-3p.pres.ind above means that rabotajut is divided into two morphemes:
That is, the dots tell us that 3p.pres.ind describes a single, indivisible morpheme. We can use the same convention for words that require more than one word in the English gloss; e.g. we could gloss French sortir as go.out.
Compare Quechua llamka-n-ku which means the same as rabotajut but whose gloss is work-3-pl. That is, -n-ku can be divided into -n = 3rd person, -ku = plural.
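The hyphen and dot convention is regular enough to be checked mechanically. The sketch below is only an illustration; `parse_gloss` and `align` are hypothetical helper names. It splits a gloss into morphemes and labels, and checks that a segmented word and its gloss have the same number of morphemes.

```python
def parse_gloss(gloss):
    """Split on '-' into morphemes, then on '.' into the labels for each."""
    return [morpheme.split(".") for morpheme in gloss.split("-")]


def align(segmented_word, gloss):
    """Pair each morpheme of a hyphen-segmented word with its gloss labels."""
    morphs = segmented_word.split("-")
    labels = parse_gloss(gloss)
    if len(morphs) != len(labels):
        raise ValueError(
            f"{segmented_word!r} has {len(morphs)} morphemes "
            f"but {gloss!r} has {len(labels)}"
        )
    return list(zip(morphs, labels))


print(parse_gloss("work-3p.pres.ind"))   # two morphemes, the second with three labels
print(parse_gloss("go.out"))             # one morpheme needing two English words
print(align("llamka-n-ku", "work-3-pl")) # three morphemes, one label each
```

A mismatch in morpheme counts is a common slip when a text gets long, so a check like `align` is a cheap way to proofread sample sentences.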
Some people like the neatness of a tabular format, though I think it’s overkill and makes the transliteration hard to read:
An alternative is the approach J. Randolph Valentine takes in his Nishnaabemwin Reference Grammar:
Gii-gshkitoon wii-nsaaknang Maanii shkwaandem.
Although this takes a lot of space, it fits the language since (as the glosses suggest) there’s a lot of grammatical information to get across.
The translation should be unforced English, not an attempt to capture the feel of the original— that’s what the glosses are for. For instance, if you’re translating Quechua
Gringuqa hamukunsi kaballupi.
don’t try to use the nuances or syntax of the original:
As for the gringo, he came, they say, by horse.
Rather, supply the sentence as we’d say it:
A gringo was coming along on a horse.
The reader can look at the glosses to see the differences from English. You can force it a bit if you are contrasting two constructions; e.g. if you had a variation with hamukunmi, which uses the direct knowledge evidential -mi rather than the hearsay evidential -si, you can write contrasting glosses:
(I know) a gringo was coming along on a horse.
The print and e-book versions of the Kit— and its sequel, Advanced Language Construction— are full of even more information. To keep the online kit simple, I’ve left out a lot of detail, as well as fascinating natlang examples. If you get very far into creating languages, both volumes are well worth picking up!
And check out the web resources here.