A Primer In
SF XENOLINGUISTICS
- ash nazg durbatulûk -
- borag thungg -
FANTASY EXOTIC TONGUES - An Introduction
If you're here chasing the search-string "+fantasy
+exotic +tongues", then I'm afraid you've probably come
to the wrong place. This is a companion article to my
Guide to SF Chronophysics, dealing with
another random feature of fictional world design. It is
written in the same spirit as the chronophysics guide: I'm willing
to tolerate any level of implausibility in the quasiscience as long
as these flaws are irrelevant to the linguistic issues! (For a
definition of this term "quasiscience", see my
Star Trek Rant.) And I'll try to
avoid jargon, since there's no reason a pedantic distinction
between (e.g.) inflectional and derivational morphology should be
relevant for nonterrestrial languages. So for instance the
Universal Translators section is not
intended as an assessment of how close we are to building
natural-language interpreter machines in real life - it's just a
somewhat idiosyncratic survey of the excuses available for SF
writers who want to avoid dealing with irritating language
barriers.
By the way, when I say "SF" I don't mean to exclude
Fantasy. This is one of the reasons I avoid the word
"scifi": I can claim where convenient that the
abbreviation "SF" stands for something nice and inclusive
like "Speculative Fabulation" or "Secular
Fantasy"!
- klaatu barada nikto -
LET'S SPEAK ALIEN - In Ten Easy Lessons
Ever wondered how all those traditional space-opera and
epic-fantasy races - the pig-faced warriors, the smug bumheads, and
all the rest - came up with their wonderfully clichéd alien
vocabularies? It's not difficult; once you've mastered these
basic rules, you'll be able to produce names and phrases just as
stereotypical as theirs!
- LESSON ONE
- Languages described as "High", like High Martian, Old
High Vulcan or indeed High Draconic, aren't from upland regions
(as is the case for, e.g., High German) - they're ancient and
complicated prestige dialects preserved from the days when the
Empire was much bigger and better and more sophisticated.
Speaking them requires considerable effort, dramatic gestures, and
often a special capital-T Talent.
- LESSON TWO
- Sounds (and sequences of sounds) common in English are still
possible in Alienese, but much less common - there are no exotic
alien worlds called Stritty or Thudgewundle. Sounds (and
sequences of sounds) entirely unused in English are also very rare
in Alienese - no Star Trek character will ever be named
Bwäølh or Ngì! But sounds (and sequences
of sounds) uncommon in English are abundant in Alienese; hence the
alien races known as the Xeelee, Chirpsithtra, and Githyanki.
- LESSON THREE
- Initial K is especially popular (Kazon, Klendathu, Krell,
K'kree). Incidentally, there's a good reason for this (and
one I'll credit to Steve
Mowbray): aliens are obsessed with triangles, a particular
shade of green, the number three, and the letter K because they
learned everything they know from our TV broadcasts. To be
more specific, from a particular episode of "Sesame
Street".
- LESSON FOUR
- Aliens enjoy designing their words to look like Latin or Greek,
or occasionally Hebrew; they make heavy use of classical sounds
spelt in classical ways, such as X, QU, TH, and PH - hence Thranx,
Zarquon, Tholian, Cylon, etcetera. Some, such as the
Romulans, Centauri, and Draconians, take it a step further and
steal entire words out of Latin dictionaries (or Atlantean TV
broadcasts, maybe).
- LESSON FIVE
- What's more, aliens tend to put classical-looking endings on
their names: -ON and -OS are particular favourites for planet
names (Axos, Auron, Gothos, Krypton), and if there's any sign of
females, their names will end in an unstressed -A (Thuvia,
Belanna, Dua, Ardana).
- LESSON SIX
- A civilisation of billions of individuals will have no trouble
allocating each one a unique, pronounceable name a syllable or two
long (e.g. Worf, G'kar, Worsel, Kal-El). They may even manage
to make them all alliterate. Exceptions to this rule usually
have very long names indeed, though there are a few planets where
everyone is called Bruce to save time.
- LESSON SEVEN
- The names of a species, empire, language, homeworld, homestar
and so on will all be self-evidently related; Ogrons come from
Ogros, Arisians come from Arisia, Arcturans come from Arcturus,
and Humans no doubt come from Humus.
- LESSON EIGHT
- When the endings aren't pseudoclassical they usually follow the
Middle-Eastern standard: Pakistan-i, Minbar-i, Tymbrim-i,
Kimdiss-i. Such words often serve both as racial adjective
and collective noun, removing the need for a distinct plural;
where alien plurals do occur they end either in -I (Fyndii) or,
occasionally, in -N (Thrintun).
- LESSON NINE
- A name dominated by guttural consonants and sibilants (Cthulhu,
Troxxt, Chasch) indicates savagery; one with lots of front vowels
and sonorants (Alderaan, Eloi, Emereli) implies a more civilised
nature. Except of course that mysterious gas-giant races
always have names like thunderous farting.
- LESSON TEN
- If they use apostrophes, ignore them - they're not
serious. Some aliens will try to tell you that "'"
stands for an obscure vowel (F'lar, T'pau, Sp'thra), or a silent
consonant (Dra'Azon, Ka'a Orto'o), but in reality it's purely
decorative. It's not clear why they choose to use
apostrophes rather than, say, umlauts (à la
Mötley Crüe) - or peculiar alien squiggles,
come to that. Maybe they just want to keep things convenient
for ASCII.
- cthulhu fhtagn -
THE UNSPEAKABLE - And The Unthinkable
If you're the kind of person who read the Silmarillion just for
the linguistic appendices (No? Oh, well, it's only me then),
you'd probably prefer your SF languages not to be quite like the
ones lampooned above. So what's the alternative? Well,
I suppose you could use the Simpsons Manoeuvre - to quote
Kang: "No, actually I'm speaking Rigellian. By an
astonishing coincidence our two languages are exactly the
same!" But to the best of my knowledge, only Star Trek
has ever had the nerve to offer this excuse with a straight face
(see e.g. "Bread and Circuses")... so if that's out,
you're left having to imagine a real alien language. What
could that be like?
Well, there are plenty of ways in which alien languages could be
extremely unearthly. The most basic variables are those of
FORMAT:
- Phonology: even human languages vary widely not just in
the sounds they use (nasalised clicks, uvular implosives, four
distinct kinds of "L") but in the sequences of sounds
they permit (English uses "h-a-ng" and yet rejects the
equally pronounceable sequence "ng-a-h").
- Modulation: whereas we rely mainly on pitch and quality
to carry meaning, aliens might make equivalent use of volume and
tempo (so that
...n...i...k...t...o... is
"abort program" and NIKTO
is "hurry"), or rely on rhythms and harmonies calibrated
for alien aural equipment.
- Articulation: audible languages can still be unspeakable
for those with terrestrial mouthparts. And never mind
deciding whether you should spell it as "Cthulhu fhtagn"
or "K'thooh looph'dægh-n"; you'll be lucky if you
can identify any of the sounds involved ("Erm, was that
Rroahrgh! or Wroarrgh?").
- Medium: nonaudible (e.g. ultrasonic) or nonauditory
(e.g. visual) languages obviously pose major hardware problems for
any interpreter. More subtly, different media introduce
varying natural protocols for communication; some depend on direct
one-to-one physical contact, some require "listeners" to
keep quiet until the "speaker" stops talking, and others
make utterances accessible from anywhere forever after (by
http!).
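The phonology point above (that "h-a-ng" is fine in English while "ng-a-h" is not) can be sketched as a toy phonotactic checker. This is only an illustration: the ONSETS, VOWELS and CODAS inventories are tiny hand-picked assumptions, nowhere near a full description of English, and the function name is made up for the example.

```python
# Toy phonotactics: a syllable is onset + vowel + coda.
# The inventories below are illustrative assumptions, not a full
# description of English; "ng" is allowed only at the end of a syllable.
ONSETS = {"", "h", "n", "s", "t", "str"}
VOWELS = {"a", "i", "o"}
CODAS = {"", "n", "ng", "t", "st"}


def is_permitted(onset: str, vowel: str, coda: str) -> bool:
    """Return True if the syllable obeys the toy sequencing rules."""
    return onset in ONSETS and vowel in VOWELS and coda in CODAS


print(is_permitted("h", "a", "ng"))   # the syllable "hang"
print(is_permitted("ng", "a", "h"))   # the syllable "ngah"
```

The same sounds appear in both syllables; only their positions differ, which is all the checker looks at. An alien language would simply ship with different sets.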
Understandably, writers (including screenwriters) tend to shy
away from untransliteratable dialogue - but no such problems
arise from variations in GRAMMAR:
- Exoticity: if you only know English, or even if you know
half a dozen European languages, you might imagine that Denebian
would necessarily have equivalents for "preposition" or
"plural ending", but no; that's less alien than
Japanese! Not all languages use the same toolkit of elements
(adverb, adjective, infinitive, etc), or signpost the same things
(number, case, person, etc); see the "Coding" section
below. And even familiar categories like "plural
noun" can be indicated in a dizzying array of ways -
with freestanding "plural-marker" particles; with
additions, changes or even reshuffles at the beginning, middle or
end of the noun; with changes in accompanying articles, or
adjectives, or verbs, or other words that just happen to be
around.
- Conjugations: the upshot of all this is that knowing the
words "me" and "go" won't necessarily help you
recognise the phrase "I went" in ET-speak (any more than
it would in English); if the direction was to windward rather than
leeward it may take a completely different motion-prefix!
- Rearrangements: the rules for arranging words in
sequence are no more variable than all the rest, but scrambled
word-order is disproportionately popular in fiction as a marker of
alienness. Easy to carry over into English it is!
- Architecture: seriously alien grammars may throw out
the whole terrestrial "tree-structure" scheme, replacing
it with some bizarre kind of "stack" or
"hash", but I feel no urge to attempt to describe such
horrors.
And on the traditional third hand, it's commonplace for languages
to make some things more and others less convenient to communicate
about by means of alternative styles of CODING:
- Lexicalisation: you probably already knew that Eskimo
dictionaries contain innumerable words for "snow"... so
it's a pity that factoid's not true! They only have about as
many basic words for it as we do, though Eskimo does have a lot of
specialised seal-hunting vocabulary items (and "iglu" is
their general-purpose word for house).
- Le Mot Juste: people often get very excited about
the idea that (e.g.) "Martians have no word for war",
forgetting that a lexical gap this easy to fill with a paraphrase
or loanword is unlikely to tell us much about their familiarity
with the idea. After all, until the twentieth century
Terrans had no word for genocide. It's unlikely that any
concept that's understandable could ever be totally
inexpressible; nonetheless, an alien language might make an idea
formidably awkward to conceptualise or communicate, and the
closest translation may have any manner of strange built-in
associations. Even if the word "pity" is in
a Dalek's vocabulary banks, it may be listed as a synonym for
"errorcode 7 (failure to exterminate caused by temporary
targeting failure or neural dysfunction)".
- Naming: English creates or borrows new words fairly
readily; extreme creativity or conservatism might make for a
genuinely interesting alien vocabulary. Contrariwise, while
assigning stable "proper names" to things is quite
limited in English, it could be common for (say) Elves: if you
live for centuries, you have time to learn the nicknames of
individual oak trees... but who'd bother naming something as
shortlived as a cat?
- Grammaticalisation: a language's favourite distinctions
may be drawn by means of independent separate coinages
(king/queen, mother/father) or with compounds
(chairman/-woman, prince/princess). But the whole thing can
also be built into the syntax, so that words take gender-marked
adjectives (blond/blonde) or pronouns (he/she). Now, in
place of gender, try imagining the same things being done to mark
duration, proximity, certainty, agency, approbation, or urgency,
and not being done to distinguish singular/plural or
affirmative/negative.
- Inescapables: often, concepts are treated as so
important that they're built into words and sentences
automatically, whether they're relevant or not. Thus it's
unnecessarily difficult to be neutral with regard to social
position in Japanese, to gender in Esperanto (see
Ranto), or to tense in English (see
Chronophysics).
So, is there anything that can't vary? Well,
there's no surplus of hard evidence, but I'd say that all true
languages (see footnote) must have in common the
following characteristic properties, necessary for any
general-purpose communicative mechanism:
- STRUCTURE
- Utterances are constructed and interpreted out of modular
constituents according to systems of combinatorial rules (capable
of creating arbitrarily many novel sentences). Some
proto-sentient species might in principle do this
"primitively" (as some sort of
pidgin), but a "holistic, impressionistic grammar" with
no such structure won't get them far.
- DISCRETENESS
- Meanings depend on systems of distinctions between
finite sets of elements (sounds, words, grammatical forms, and so
on), not on the subtle shadings of their individual
qualities. Continuous variables such as stress may be useful
for expressing generalised attitudes and overtones, but they're no
good for specifics.
- CONVENTIONALITY
- Words are arbitrary labels, not representations of what they
denote. The word "giraffe" isn't particularly
giraffelike, "big" is a small word, and dogs don't
really say "bark". Even in sign-languages and
pictographic writing systems, few meanings are guessable from
their signs - aliens may use sonar onomatopoeia, but it won't
make their grammar any more comprehensible.
- METAPHOR
- The "literal-minded" languages of many space-opera
Ancients are impossible; all linguistic categories and rules are
formed via analogies, explicit or implicit. The definition
of the word "dance" presupposes a resemblance between
waltzes and raves - and pluralising "days" is a
fossilised metaphor too: when did you last see a stack of
them? On the other extreme, the "allusive
language" of "Darmok" (see
Star Trek Rant) is unworkable because
it's all metaphor and no grammar.
- ABSTRACTION
- Phenomena other than those directly apparent to the senses can
be discussed, including spatially or temporally remote events,
hypothetical or generalised situations, counterfactual fantasies,
and lies. The speakers may have trouble with them, but the
language will always provide for "saying the thing that is
not".
- CONTEXTUALITY
- Meanwhile, phenomena that are directly apparent or
previously established can be referred back to by means of special
shortcut forms - pronouns and point-of-view-dependent
expressions as in "you were behind them". Aliens
with no way of expressing the first person (even as "this
person now speaking") are unlikely - they'd need unique
absolute identifiers for every person, place, moment and
event!
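The STRUCTURE property listed above (a finite set of combinatorial rules yielding arbitrarily many novel sentences) can be seen in miniature with a toy grammar. This is a minimal sketch under loud assumptions: the two nouns, two verbs and the single "that" rule are invented for illustration, and no real language is this tidy.

```python
from itertools import product

# A toy grammar: hypothetical vocabulary chosen only for illustration.
NOUNS = ["the Kzin", "the robot"]
VERBS = ["sees", "fears"]


def noun_phrases(depth: int) -> list[str]:
    """All noun phrases with at most `depth` nested relative clauses."""
    if depth == 0:
        return list(NOUNS)
    inner = noun_phrases(depth - 1)
    # One recursive rule: NOUN + "that" + VERB + NOUN-PHRASE.
    nested = [f"{n} that {v} {np}" for n, v, np in product(NOUNS, VERBS, inner)]
    return list(NOUNS) + nested


def sentences(depth: int) -> list[str]:
    """All subject-verb-object sentences built from those noun phrases."""
    nps = noun_phrases(depth)
    return [f"{s} {v} {o}" for s, v, o in product(nps, VERBS, nps)]


for depth in range(3):
    print(depth, len(sentences(depth)))
```

Four words and one recursive rule already give thousands of distinct sentences at depth two, and the count keeps growing with every extra level of nesting. A "holistic" system with no reusable parts would need a separate signal for each of them.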
Then again, while I'm sure about xenolinguistics being a branch
of linguistics, can I entirely rule out the idea of intelligent
lifeforms who simply don't have language? The usual
suggestion is that instead they have some sort of (telepathic or
biochemical) "hive mind". Well, maybe. But
unless they've got some standardised way of encoding concepts for
portability from brain to brain, it's not going to be enough to
distinguish them from animals - and if they have, that's
essentially a language by another name.
- zog -
UNIVERSAL TRANSLATORS - A Buyer's Guide
Okay, so you walk into the spaceport bar and discover that nobody
within a kiloparsec speaks English (or
whatever it's become by the year 3000
AD). You may think you'll be able to tell what that Kzin is
saying just by the tone of his voice; you may think you can signal
your friendly intentions by showing your teeth a lot; you may even
hope he'll speak your favourite interlanguage (cf. Don Harlow's
notes on Esperanto and Science-Fiction). But no, take my advice:
it's time
to invest in a Universal Translator system (henceforth
"UT")! And beware of dodgy characters trying to
flog second-hand 3PO units or fishy-looking implants; here are some
guidelines to help you avoid wasting your credits on something
likely to get you lynched or brainwashed.
For the convenience of any alien readers (especially Gubru,
Ramans and the like), all the points made come in sets of
three.
- "UNIVERSAL"
- There is some room for flexibility here - you needn't
splash out on a Star-Trek-style model that can handle any number
of unknown alien tongues at once, as long as it doesn't specialise
in, say, Basque-to-Tamil. Does it need to be able to cope
with thoroughly alien mindsets, non-auditory languages, and so
forth, or is everyone in the bar tediously humanoid? And
when you encounter a language it doesn't already have on file, how
does it learn new ones? Possibilities include:
- Plug-And-Play - "canned" languages, which
you can upload into the UT or your own brain. Note the social
implications if you can learn Xemahoa on a whim, Gothic as a
fashion statement, or Nonesuch to baffle eavesdroppers. Just
be careful with black-market language tapes; don't buy any that
claim to be "doubleplusgood".
- Exchangers - including "handshaking"
computers that swap lexicons on contact (do you really want to
give away security-risk terms such as "hypnosis" to
unknown aliens?), and psychic "language chamaeleons"
that can reply in any dialect they encounter (be careful not to
use "royal we" back to God-Emperors).
- The Hard Way - language learning by prolonged
interaction with cooperative native speakers. Unless you've
got magical assistance, don't expect mere minutes of eavesdropping
to help. Even if the locals point at a spaniel and say
"That's a dog!" you can't be sure they mean "dog =
Canis familiaris"; it could be "dawg =
barking" or "tsäd = Fido"!
- "TRANSLATOR"
- When it comes to the "three levels of translation"
it's worth being clear about your requirements:
- "Literal" or word-by-word translation is less
useful than monoglots tend to imagine; the intelligibility of the
results is proportional to the relatedness of the source-language
to the target-language. This is hopeless for Syrians, let
alone Sirians - if you pass that first line through
Babelfish a couple
of times, you get: "literal" or the word for the
translation of the word is little use, of that monoglots if
inclines to present itself.
- "Official" or phrase-by-phrase translation is
organised by legal conventions about equivalences, and restricted
to subjects with narrow, codified jargons. If you're an
interplanetary lawyer or civil
engineer, this might be adequate; but don't expect it to
convey subtexts. Actually, unless your UT device is
ridiculously good it's always wise to steer clear of fancy figures
of speech, such as jokes or irony ("do I look
stupid?") - be literal and tolerant of apparent threats,
insults, and the like.
- "Psychological" or concept-by-concept
translation is the ideal, producing exactly the same effect on a
speaker of the target-language as the original would on a speaker
of the source-language. This objective is next to impossible
for anything less expensive than a trained human-equivalent
brain. When you're talking about your family life to a
Sontaran warrior-clone, syntax, idiom, and social background
knowledge blur into one another as things that need to be
translated; so how much are you willing to pay for? A
"shallow" UT confuses "Pat owns an orange
pelt" with "Pat has red hair"; better models can
deal with slang, xenoethnological trivia, and allusions to Oolon
Colluphid.
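To see why "literal" translation disappoints even between closely related languages, here is a minimal sketch of a word-by-word glosser. The five-entry German-to-English LEXICON is a hand-made assumption for this one sentence, and the function name is invented for the example; real systems are (one hopes) less naive.

```python
# A hand-made five-word lexicon (an assumption for illustration only).
LEXICON = {
    "ich": "I",
    "habe": "have",
    "den": "the",
    "hund": "dog",
    "gesehen": "seen",
}


def literal_translate(sentence: str) -> str:
    """Replace each word with its dictionary gloss, keeping source word order."""
    words = sentence.lower().split()
    # Unknown words are flagged rather than guessed at.
    return " ".join(LEXICON.get(word, f"<{word}?>") for word in words)


print(literal_translate("Ich habe den Hund gesehen"))
```

Every word is glossed correctly and the result is still "I have the dog seen" rather than "I have seen the dog". That is with a close cousin of English; the further apart the two languages are, the less the output resembles a sentence at all.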
- MECHANISM
- Never mind the questions "What is it? How does it
work? Where is it?" (to quote Mary Shapiro's rant); all that
matters is that there are three
broad categories of UTs, each of which has its own pros and
cons.
- Polyglottisers - mechanisms by which one or both
participants can come to understand all of the languages involved
(i.e. either a cyberpunk plug-in language-chip or a psionic/magical
Pentecost Effect). The drawbacks of this are that if you
become a monoglot again afterwards, you're left with baffling
memories (if only "Why was that funny?"), and if there's
any truth whatsoever to Whorfian Relativism, exotic languages may
influence the decisions you make thinking in them! The
effects of human tongues such as Hopi are arguable, but UTs
capable of making you take as natural the conversational instincts
of a radar-using methane-shark shaman are another thing
entirely.
- Psi-Dubbing - reads what a speaker is thinking and
provides a voiceover, itself preferably telepathic. Alien
brains may turn out to be unreadable, or all minds may prove to be
readable regardless of native tongue - it all depends on
whether there's a universal nonlinguistic "language of
thought" for the Psi-Dubbing to work in (a very Chomskyan
thing to imagine). Unfortunately, if it works it may
outflank all efforts at diplomacy - at any rate, it sounds
like a monstrous breach of privacy - and it requires
improbable technology such as psionics or neurotelemetry.
Farscape's playful suggestion of "translator microbes"
is about the most plausible version I've heard of!
- Cyberinterpreters - "expert system"
translators, with robot bodies. Audio-only "Pocket
UTs" wouldn't be able to handle situations as simple as a
trip to a Spanish grocer's: the correct rendering of "I'll
have that one!" depends on whether it's nearby and
feminine (¡ésa!), distant and masculine
(¡aquél!) or whatever. The body needn't
be humanoid, but any decent Machine Translator would have to be
such a flexible and intelligent AI that it deserves civil rights
(I pity C-3PO, kept as a slave translator for biochauvinist rebels
in a society where everything understands English anyway).
Nonetheless, you may have to slow down to allow your interpreter
to keep up - a good one may start translating your sentences
"incrementally" before you finish, but anything
approaching a "simultaneous" translation takes an awful
lot of processing power. So if your UTAI starts speaking at
the same moment as you do, shut up and let it do the talking.
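The Spanish grocer problem mentioned under Cyberinterpreters can be sketched as a lookup keyed on facts that are simply absent from the audio. This is a rough illustration, not a translator: the distance labels ("here", "nearby", "distant") and the function name are assumptions of the sketch, and the pronouns use the traditional accented spelling given above.

```python
# Spanish demonstrative pronouns, keyed on (distance, gender).
# The distance labels are assumptions made for this sketch.
DEMONSTRATIVES = {
    ("here", "m"): "éste",
    ("here", "f"): "ésta",
    ("nearby", "m"): "ése",
    ("nearby", "f"): "ésa",
    ("distant", "m"): "aquél",
    ("distant", "f"): "aquélla",
}


def that_one(distance: str, gender: str) -> str:
    """Render "that one!" given where the item is and its noun's gender."""
    return f"¡{DEMONSTRATIVES[(distance, gender)]}!"


print(that_one("nearby", "f"))
print(that_one("distant", "m"))
```

The English input "that one!" supplies neither argument. Something has to look at the shelf and know what the item is called in Spanish, which is why the interpreter needs eyes and not just ears.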
- FEATURES
- Check how good the system is at coping with mismatches between
languages in the following fields:
- Vocabulary - if one language lacks an idiomatic
match for the other's expressions, UTs may paraphrase or neologise
to fill the gaps ("he contracted blue fever from a jabberwock
bite"). More problematic are partial matches: we
say "cousin (parent's sibling's child)", they say
"cousin (relation of the same moiety and age-grade)";
they say "water (dihydrogen monoxide)", we say
"water (salt or fresh, but always liquid)". The
classic example of this problem is the inventory of basic
colour-terms: Russian discriminates between "light blue"
and "dark blue", while Hanunôo uses a single term
to cover the entire green/blue end of the spectrum. And as
for Jovians...
- Agreement - when a Betelgeusian calls you
{addressee-agentive-adult-nondistributed}, the UT throws
away as irrelevant all the surplus grammatical features and
focusses on its function, parallel to English
"you". Then when you in reply address the
Betelgeusian as "you" the UT has to somehow come up with
the extra details the alien agreement system involves. This
kind of routine trimming and padding takes a good deal of creative
fudging - especially if there's a risk the Betelgeusian might
later turn out to have intended you to pay attention to some of
the discarded "trivial" details. The UT could play
safe and insist on conveying every last ambiguity and nuance
explicitly, but this is excruciatingly difficult, not to mention
distracting (exercise: try paraphrasing the precise differences
between "he ate the biscuit" and "she has eaten a
cookie").
- Implicatures - ordinary communication relies on
speakers obeying a set of "conversational maxims".
They shouldn't say things that are either uninformative or
self-evident; that are inaccurate; that are irrelevant; or that
are either otiose or ambiguous. Of course, people
aren't always apposite, reliable, etc, but the interesting
point is that infractions are often themselves communicative: if I
say something blatantly inappropriate, it usually means there's a
subtext to be found. The trouble is, aliens are likely to
have different conventions about such things, and inhuman
intuitions about what needs to be pointed out ("You're very
tall..."), what level of hyperbole is acceptable
("Nobody ever comes this way"), what's relevant
("Are you hungry?" - "It's daytime!"),
and what's concise (Ents and Vorlons never get along).
- INTERFACE
- Ensure the UT's output is appropriately customisable -
assuming it has output; if the UT's skills are seamlessly
integrated into your own mind, you may not have these options.
- Confidence - if it has to guess at a translation,
does it plough on with fingers crossed, flash unintelligible
warning lights, or constantly interrupt with questions?
Backchat can be annoying ("Is that and/or or
either/or?", "You realise that's not answering
the question?"), but it's the best way of dealing with errors
when they do occur. Internalised UT systems don't pose these
problems, but external ones make great scapegoats...
- Anthropomorphism - does it credit you with a lot of
background expertise (talking of "zitidars",
"foreclaws", and "spoo") or recast everything
patronisingly into familiar analogies ("elephants",
"thumbs", "apple pie")? And what style
of analogy does it use - does 144 baph-'l-ghab equate
to "503 km" or "a hundred leagues"? Does
gobl-digûk become "many mooncycles" or
"several of your Earth years"? Or does the UT just
convert everything into Galactic Standard hexadecimal Planck
units?
- Diplomacy - should the UT respect linguistic taboos
and conversational etiquette, or translate insults as
insults? If your UT can cope with multiple stylistic
registers, it can be set to convert between them and filter out
(or enhance) the profanities. Come to that, any UT that can
translate an oratorical welcome can be expected to summarise it
too. There's no need to reply with a formal speech of
gratitude; simply configure the translator to turn your terse
colloquial English into polite long-winded Vilani. Etc,
blah, waffle.
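The choice of analogy style under Anthropomorphism boils down to a unit conversion plus a rounding policy. The sketch below assumes the exchange rate implied by the example (144 baph-'l-ghab to 503 km) and takes a league as three statute miles; the function names and the two style labels are invented for illustration.

```python
import math

KM_PER_BAPH = 503 / 144     # assumed from the "144 baph-'l-ghab" example
KM_PER_LEAGUE = 4.828032    # one league taken as three statute miles


def round_sig(value: float, digits: int = 1) -> float:
    """Round to the given number of significant figures."""
    if value == 0:
        return 0.0
    magnitude = math.floor(math.log10(abs(value)))
    factor = 10.0 ** (magnitude - digits + 1)
    return round(value / factor) * factor


def render(baph: float, style: str) -> str:
    """Express an alien distance precisely or as a round-number idiom."""
    km = baph * KM_PER_BAPH
    if style == "precise":
        return f"{km:.0f} km"
    if style == "idiomatic":
        return f"about {round_sig(km / KM_PER_LEAGUE):g} leagues"
    raise ValueError(f"unknown style: {style}")


print(render(144, "precise"))
print(render(144, "idiomatic"))
```

Both outputs are defensible translations of the same number. The precise one keeps the arithmetic, and the idiomatic one keeps the fact that 144 is a nice round number if you happen to count in dozens.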
- ack ack ack, ack ack ack ack -
CETI FOR BEGINNERS - Little Green Manuals
You may or may not be surprised to learn that some extremely
serious attempts have been made to specify in detail the best ways
of opening up communications with Extra-Terrestrial Intelligence,
either in radio broadcasts or on the White House lawn. To
start with the classic example:
And here are some other TEFLETI projects:
Personally, I say if they can't be bothered to work out in
advance how to say "Take me to your leader" then they
can't be trusted to drive unlicensed starships around in our
atmosphere... Blast them out of the sky before somebody gets
hurt!