Kura is built around the idea of linguistic data. Linguistic data are stored in a database and relations between the data are then created, either by the linguist or automatically.
There are four core types of linguistic data: texts, recordings, scans and lexicon. Of these, texts and lexicon are analyzed by Kura. Recordings of fieldwork sessions or scans of manuscripts are related to texts.
In contrast to other systems, Kura stores texts in the database in parsed and tagged form. This makes it easier to create relations, but more difficult to re-assemble complete texts. This is one of the reasons for the relative slowness of the interlinear editor.
Kura stores facts about many languages and can be used by many users. Kura also records which user entered which linguistic fact, so it is always possible to account for the data in a scholarly way.
After data have been entered, it is possible to publish them. Kura not only has a range of export formats for texts (and in the next version for lexica), but can present data directly on the web. Using hyperlinks, users of the data can trace their own path through a language. For instance, when reading a certain text, a user might want to look up a word in the lexicon. The user clicks on the word in the text, and Kura shows the lexeme and all the sentences where that lexeme occurs in the corpus.
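The lookup described above, from a lexeme to every sentence in which it occurs, can be illustrated with a small concordance. This is only a sketch and not Kura's own code: the function name `build_concordance`, the representation of a sentence as a list of (word form, lexeme) pairs, and the sample sentences are all assumptions made for the example.

```python
from collections import defaultdict

def build_concordance(sentences):
    """Map each lexeme to the numbers of the sentences it occurs in.

    `sentences` is a list of sentences; each sentence is a list of
    (word_form, lexeme) pairs. This layout is hypothetical.
    """
    index = defaultdict(list)
    for number, sentence in enumerate(sentences):
        # Use a set so a lexeme occurring twice in one sentence
        # is listed only once for that sentence.
        for lexeme in {lexeme for _form, lexeme in sentence}:
            index[lexeme].append(number)
    return index

# Made-up sample corpus, for illustration only.
corpus = [
    [("dogs", "dog"), ("bark", "bark")],
    [("the", "the"), ("dog", "dog"), ("barked", "bark")],
    [("the", "the"), ("cat", "cat"), ("sleeps", "sleep")],
]

concordance = build_concordance(corpus)
occurrences = concordance["dog"]  # sentence numbers containing the lexeme "dog"
```

Because words are linked to lexemes rather than matched by spelling, inflected forms such as "dogs" and "dog" end up under the same entry.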
Kura is extensible: every important language fact can be annotated with user-defined tags. For instance, a text can be tagged with one or more references, and a word in a text with a gloss, a translation, a syntactic function or another bibliographic reference. Kura leaves you free in the creation of these tags.
text | A text from a certain language; a connected narrative.
stream | Sentences in a text, or phrases, at the discretion of the linguist.
element | Words in a text; can be subdivided into subelements, like morphemes or phonemes.
tag | A bit of information that's 'tagged' onto a text, a stream, an element or a lexeme. Tags can be either a short free-format text, a longer free-format text, an entry out of a predefined list or an entry out of the list of references.
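The relation between these four terms can be pictured as a small nested data model. The sketch below is an illustration only and does not reflect Kura's actual database schema: the class and field names, the `reassemble` helper and the sample forms are all hypothetical. It also shows why a text stored in parsed form has to be put back together from its parts before it can be displayed as a whole.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Tag:
    kind: str   # hypothetical tag type, e.g. "gloss" or "translation"
    value: str  # the tagged information itself

@dataclass
class Element:
    form: str                                            # a word as it appears in the text
    tags: List[Tag] = field(default_factory=list)
    subelements: List["Element"] = field(default_factory=list)  # e.g. morphemes

@dataclass
class Stream:
    elements: List[Element] = field(default_factory=list)  # a sentence or phrase
    tags: List[Tag] = field(default_factory=list)

@dataclass
class Text:
    title: str
    streams: List[Stream] = field(default_factory=list)
    tags: List[Tag] = field(default_factory=list)

def reassemble(text: Text) -> str:
    """Rebuild the running text: one line per stream, words joined by spaces."""
    return "\n".join(
        " ".join(element.form for element in stream.elements)
        for stream in text.streams
    )

# Made-up placeholder forms, not data from a real language.
sample = Text(
    title="Sample text",
    streams=[
        Stream(elements=[
            Element("ka", tags=[Tag("gloss", "placeholder gloss")]),
            Element("ma"),
        ]),
        Stream(elements=[Element("ti")]),
    ],
)

running_text = reassemble(sample)
```

Every level can carry its own tags, which is what makes relations easy to create; the price is that the complete text only exists after walking all streams and elements.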
Kura uses a relational database to store the linguistic data. I've not yet succeeded in making this completely transparent to users, so I'd like to appeal to their intelligence and ask them to keep the following in mind:
Kura supports Unicode throughout. You can enter Unicode text with the qcharmap utility. Just press the button on the toolbar to make the Unicode editor pop up. You can continue using the character map even if you open another Kura window. Simply choose a script, click on the character, and cut and paste the result where you want.
Five minutes of work on the web with a search engine will find you many Unicode fonts. I am particularly satisfied with Microsoft's Arial Unicode and the GNU Unicode font. Both cover a large range of characters, but Arial Unicode displays combining characters (for instance a vowel and a diacritic mark) better.
When using combining characters, type the base character first, and then the diacritic.
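The ordering rule can be checked with Python's standard `unicodedata` module. This is a small illustration and not part of Kura; the variable names are chosen for the example. It builds an "e" with an acute accent by putting the base letter first and the combining mark second, and shows that Unicode normalization (NFC) then yields the single precomposed character.

```python
import unicodedata

base = "e"
acute = "\u0301"  # COMBINING ACUTE ACCENT

# Base character first, then the diacritic.
combined = base + acute

# A non-zero combining class marks a character as a combining mark.
is_mark = unicodedata.combining(acute) != 0

# NFC normalization composes the pair into one precomposed character.
precomposed = unicodedata.normalize("NFC", combined)

# In the wrong order the accent has no preceding base letter to attach to,
# so normalization cannot compose the pair.
wrong_order = unicodedata.normalize("NFC", acute + base)
```

Here `combined` consists of two code points and `precomposed` of one (U+00E9), while `wrong_order` stays two separate code points.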