Word-forming apostrophes and glottal stops

Apostrophes

Some words, such as the Sena word kang'ombe, use apostrophes as word-forming characters. However, not every apostrophe character can be used this way. Language Explorer (FLEx) uses the Unicode properties of characters to decide which apostrophes are simply punctuation. This is crucial for allowing FLEx to find wordforms correctly when it processes a text. Specifically, an incorrect apostrophe entered in the Baseline tab causes a single word to appear as two words in the Gloss and Analyze tabs, such as kang and 'ombe from kang'ombe. When FLEx finds the wordforms correctly, the parsers see complete wordforms rather than partial ones.
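The effect described above can be illustrated with a small Python sketch. It is not FLEx's actual algorithm: the function names and the rule "letters, marks and digits form words" are assumptions made for illustration, and unlike FLEx the sketch simply drops the punctuation character instead of attaching it to a fragment.

```python
import unicodedata

def is_word_forming(ch, extra=""):
    # Assumption: letters (L*), combining marks (M*) and digits (N*) build words.
    # `extra` lists characters a user explicitly declares to be word-forming.
    return unicodedata.category(ch)[0] in "LMN" or ch in extra

def split_wordforms(text, extra=""):
    # Collect runs of word-forming characters; anything else ends the current word.
    words, current = [], []
    for ch in text:
        if is_word_forming(ch, extra):
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words

# U+2019 is punctuation, so the word breaks in two.
print(split_wordforms("kang\u2019ombe"))
# U+02BC is a modifier letter, so the word stays whole.
print(split_wordforms("kang\u02bcombe"))
# U+0027 stays inside the word only when it is declared word-forming here.
print(split_wordforms("kang'ombe", extra="'"))
```

In this sketch U+0027 must be listed explicitly because Unicode classifies it as punctuation; how FLEx itself handles that character is not modelled.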

Use U+0027 APOSTROPHE or 02BC MODIFIER LETTER APOSTROPHE for the word-forming apostrophe rather than 2019 RIGHT SINGLE QUOTATION MARK.
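If existing text already contains 2019 RIGHT SINGLE QUOTATION MARK inside words, a cleanup pass before import might look like the following Python sketch. The function name is hypothetical, and the rule "replace only when there is a word character on both sides" is an assumption. It leaves closing quotes at the end of a word alone, but it also misses word-initial or word-final apostrophes and would change contractions in any embedded English text, so the result should be reviewed by hand.

```python
import re

RIGHT_SINGLE_QUOTE = "\u2019"   # punctuation
MODIFIER_APOSTROPHE = "\u02bc"  # word-forming modifier letter

# Match U+2019 only when a word character stands on both sides of it.
_WORD_INTERNAL = re.compile(rf"(?<=\w){RIGHT_SINGLE_QUOTE}(?=\w)")

def to_modifier_apostrophe(text):
    # Hypothetical helper: swap word-internal U+2019 for U+02BC.
    return _WORD_INTERNAL.sub(MODIFIER_APOSTROPHE, text)

sample = "\u2018kang\u2019ombe\u2019"  # the word wrapped in curly single quotes
fixed = to_modifier_apostrophe(sample)
print([f"U+{ord(ch):04X}" for ch in fixed if not ch.isascii()])
```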

02BC MODIFIER LETTER APOSTROPHE is defined as word-forming in the Unicode Standard, so its use would be appropriate, but it has disadvantages: it is harder to type and may not be available in every font.

If you already use a special keyboard, adding 02BC to it is probably not too hard. Otherwise, use 0027 APOSTROPHE. It does not have the rounded appearance, but it is available in all fonts and requires nothing special to enter.


Glottal Stops

(Adapted from: https://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=EncodingFAQ)

If you want something that looks like a curly quote, use U+02BC MODIFIER LETTER APOSTROPHE. You could use 2019 RIGHT SINGLE QUOTATION MARK, but there are at least two issues with that. First, it is considered punctuation, with different properties from an orthographic character. Second, if you also use quotation marks in the text, there is nothing to distinguish the glottal stop from a closing quote. (Our Roman fonts (Doulos SIL, Charis SIL, Andika Basic and Gentium Basic) all have an alternate glyph for 02BC MODIFIER LETTER APOSTROPHE which is a bit larger than normal to help distinguish the glyph from 2019 RIGHT SINGLE QUOTATION MARK.)
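Because these characters can look identical on screen, it may help to list which code points a text actually contains. The following Python sketch uses the standard unicodedata module; the function name and the set of characters checked are choices made for illustration.

```python
import unicodedata

# Apostrophe-like characters that are easy to confuse visually.
LOOKALIKES = {"\u0027", "\u02bc", "\u2019"}

def report_apostrophes(text):
    # Return (position, code point, Unicode name) for each lookalike found.
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in LOOKALIKES
    ]

for entry in report_apostrophes("kang\u02bcombe kang\u2019ombe"):
    print(entry)
```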

Many orthographies have used something that looks like the straight quote. There were so many problems with using 0027 APOSTROPHE for this character that we requested the addition of a dedicated character to Unicode. Use A78C LATIN SMALL LETTER SALTILLO (one language even "cases" this, with A78B used for the uppercase). (Our Roman fonts (Doulos SIL, Charis SIL, Andika Basic and Gentium Basic) all have alternate glyphs for A78C LATIN SMALL LETTER SALTILLO and A78B LATIN CAPITAL LETTER SALTILLO which are a bit larger than normal to help distinguish them from 0027 APOSTROPHE.)

02BE MODIFIER LETTER RIGHT HALF RING is sometimes used for transliterating Arabic hamza (glottal stop). This looks different from both A78C LATIN SMALL LETTER SALTILLO and 02BC MODIFIER LETTER APOSTROPHE and might be a good option for traditions which recognize the transliterated hamza.

Some Saskatchewan orthographies use an upper and lowercase glottal stop. Those are 0241 LATIN CAPITAL LETTER GLOTTAL STOP and 0242 LATIN SMALL LETTER GLOTTAL STOP.

Of course, the IPA representation is 0294 LATIN LETTER GLOTTAL STOP and some languages also use this in their orthographies (where casing is not required).
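The casing differences among these glottal stop characters can be checked against the Unicode data that ships with Python. This is only an illustrative sketch: the describe helper is hypothetical, and the results reflect the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata

# Glottal stop candidates mentioned in this topic (lowercase or caseless forms).
CANDIDATES = ["\u02bc", "\ua78c", "\u02be", "\u0242", "\u0294"]

def describe(ch):
    # Report the code point, the Unicode name, and the uppercase partner if any.
    upper = ch.upper()
    return {
        "code": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "uppercase": f"U+{ord(upper):04X}" if upper != ch else None,
    }

for ch in CANDIDATES:
    print(describe(ch))
```

Only the saltillo (A78C) and the small glottal stop (0242) report an uppercase partner; the other characters have no case distinction.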

Related Topics

Citation Form field

Insert overview

Lexeme Form field

Technical Support

Text Edit overview

Treat punctuation as word-forming characters

Valid Characters dialog box