Thagojian is (or was) an Indo-European language, used up until the tenth or eleventh century along the Mediterranean coast between approximately Port Said and Tel Aviv. The language is known from very few extant documents, and research is currently at a very early stage.
It is written in a script derived from uncial Greek (much like Coptic), and indeed shares a few additional glyphs with Coptic (specifically shai, hori and qima). There are also glyphs borrowed from Hebrew (specifically yod, ayin and tzadi). One further letter stands in the position of digamma and seems to have the same value (that is, /u/), but the origin of the symbol itself (which looks like a Roman-alphabet ‘s’) is unclear at this time.
Here is the alphabet in its Latin-1 transliteration; an apostrophe following a consonant stands in for an acute accent on that consonant.
a b g d e u ts é th i ë k l lh m n n' ks o p r s t ï ph kh ps ó s' h q
There are a number of phonetic readings of these letters; what follows seems to be the best average:
/A b g d e u ts) E T i @ k l K m n N ks) o p r\ s t i\ f x ps) O S h ?/
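The two lists above correspond position by position (31 letters, 31 readings), so they can be paired mechanically. The following is only a minimal sketch of that pairing; the names LETTERS, READINGS and LETTER_TO_READING are illustrative and not part of any established tool, and the readings are copied as given, in what appears to be an X-SAMPA-style ASCII notation.

```python
# Letters in alphabetical order, copied from the Latin-1 transliteration above.
LETTERS = "a b g d e u ts é th i ë k l lh m n n' ks o p r s t ï ph kh ps ó s' h q".split()

# The "best average" readings, in the same order.
# A raw string is used so that the backslashes in r\ and i\ are kept literally.
READINGS = r"A b g d e u ts) E T i @ k l K m n N ks) o p r\ s t i\ f x ps) O S h ?".split()

# Sanity check: the two lists must line up one to one.
assert len(LETTERS) == len(READINGS) == 31

# Hypothetical lookup table: transliterated letter -> phonetic reading.
LETTER_TO_READING = dict(zip(LETTERS, READINGS))
```

With this table, LETTER_TO_READING["lh"] gives "K" and LETTER_TO_READING["q"] gives "?", matching the fourteenth and last positions of the lists.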
The greatest variation in pronunciation is in the realisation of q, which is variously thought to be any one of a number of uvular, pharyngeal, epiglottal or glottal sounds. What is certain is that it is the reflex of the PIE laryngeals *h2 and *h3.
There is another transcription scheme, known as the “Germanic” scheme, which makes the following graphic and phonetic substitutions:
é -> ä, ë -> ö /2/, ï -> ü /y/, and ó -> å
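As a rough illustration, the graphic half of the Germanic scheme amounts to a four-letter substitution. The sketch below assumes the input is already in the transliteration given earlier; the function name to_germanic is made up for this example, and the phonetic reinterpretations (/2/ and /y/) are not modelled.

```python
import unicodedata

# Default-scheme vowel letter -> "Germanic"-scheme letter, per the list above.
GERMANIC = str.maketrans({"é": "ä", "ë": "ö", "ï": "ü", "ó": "å"})

def to_germanic(text: str) -> str:
    """Rewrite a transliterated string using the Germanic-scheme vowel letters."""
    # Normalise first so that "e" + combining acute is treated the same as "é".
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(GERMANIC)
```

For example, to_germanic("éëïó") returns "äöüå", and letters outside the four substitutions pass through unchanged.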
The value of r as /r\/ is somewhat disputed, but the main evidence for it is that Semitic ‘tapped’ resh is borrowed as d and not as r.
Since s and h are both valid single letters, plain text distinguishes two consecutive single letters from a digraph by placing a period between them (so t.h is t followed by h, whereas th is the single letter th). In typeset text, the intervening period becomes an underdot on the second letter.
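The period convention makes plain-text transliteration unambiguous to split into letters: take the longest letter that matches at each position, and treat a period purely as a separator. The sketch below is one possible reading of that rule, not an established tool; split_letters is a hypothetical name, and it assumes the input is a single word, so that any period is a separator rather than sentence punctuation.

```python
# Letters of the Latin-1 transliteration, including digraphs and apostrophe forms.
LETTERS = "a b g d e u ts é th i ë k l lh m n n' ks o p r s t ï ph kh ps ó s' h q".split()

# Try two-character letters before one-character ones (greedy longest match).
BY_LENGTH = sorted(LETTERS, key=len, reverse=True)

def split_letters(word: str) -> list[str]:
    """Split one plain-text transliterated word into its letters."""
    letters = []
    i = 0
    while i < len(word):
        if word[i] == ".":
            # A period only separates two single letters; it is not a letter itself.
            i += 1
            continue
        for letter in BY_LENGTH:
            if word.startswith(letter, i):
                letters.append(letter)
                i += len(letter)
                break
        else:
            # No letter of the alphabet matches at this position.
            raise ValueError(f"unexpected character {word[i]!r} at position {i}")
    return letters
```

On arbitrary letter sequences (not attested words), split_letters("tha") returns ["th", "a"], while split_letters("t.ha") returns ["t", "h", "a"].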
The close vowels i, ï and u can apparently also be used as consonants, almost certainly representing the equivalent approximants (the most common consensus is /j/, /M\/ and /w/). Some authors transcribe these cases as y, ÿ and w, but this is not done consistently, and it has led to some minor controversy in cases where the syllabicity of a close vowel is unclear.
The most that can be said about the genealogy of the language at this stage (beyond its Indo-European heritage) is that it belongs more specifically to the satem branch, that it retains the PIE laryngeal phonemes in some positions, and that the sound changes from PIE seem to be very simple and straightforward, perhaps indicating that the phonological development of the language was somehow slowed down or frozen in place. There is speculation that the use of the language as a liturgical language is responsible in some way for this degree of conservatism, although the fraction of extant texts that are indeed liturgical is not especially high. Some doubt is also cast upon this theory by the degree of grammatical change which has occurred, which is very significant. Thagojian is quite inflectional, much more so than other Indo-European languages, and much more so than either Arabic or Hebrew. Some would say it is almost polysynthetic, although that is probably an overstatement.