The institute's main task is to carry out teaching, research, and development work and to provide services that society needs in the fields of Estonian, Finno-Ugric languages, and general linguistics.

The Institute of Estonian and General Linguistics conducts in-depth teaching and world-class research on Estonian and related languages in comparison with other world languages.

111 to 120 of 303 Results
ZIP Archive - 38.6 KB - MD5: dbc981c6fe482b43f8dc36852609957e
Praat & R scripts used for collecting data from corpora
Comma Separated Values - 1.1 MB - MD5: 60304eb2631a86333ca5b4da41c723f9
lemma bigrams
Comma Separated Values - 431.6 KB - MD5: d5f65620d2f5a6fe94262c835f589f0d
lemma frequency
Comma Separated Values - 1.5 MB - MD5: 7a58847e323eba35884d1bf356f67f90
lemma trigrams
Comma Separated Values - 2.8 MB - MD5: d98bd25d56707e973a21f6a44d6ca322
wordform bigrams
Comma Separated Values - 3.6 MB - MD5: 07f805ac6b40ce4c44cccf8b1dc43507
wordform frequency
Comma Separated Values - 2.9 MB - MD5: df229d1024daad3a1950c0bedceec17d
wordform trigrams
Feb 22, 2024 - Eesti ja üldkeeleteaduse andmed
Vihman, Virve-Anneli; Pilvik, Maarja-Liisa; Mandel, Aive; Kängsepp, Annika; Aigro, Mari; Koreinik, Kadri; Praakli, Kristiina; Lindström, Liina, 2024, "Estonian Teen Language Corpus", https://doi.org/10.23673/RE-455, DATADOI, V1
Estonian Teen Language Corpus (Eesti teismeliste keele korpus) is a corpus of spoken and written language data collected from Estonian teenagers (ages 9-18) between 2019 and 2023. The corpus consists of four types of files. Spoken language data is represented by .eaf and .tsv files (spoken_eaf.zip, spoken_tsv.zip), and contain transcriptions...
ZIP Archive - 363.4 KB - MD5: 8d01fd1ef57043886016a536e915d9c8
Chat corpus in HTML form
ZIP Archive - 26.0 MB - MD5: 6e3d007398977ff9ed42619c6a13aa43
Chat corpus pictures