#######################################################
#                                                     #
#                       EKSKFK                        #
#  TÜ eesti keele spontaanse kõne foneetiline korpus  #
#   (Phonetic Corpus of Estonian Spontaneous Speech,  #
#                 University of Tartu)                #
#                                                     #
#######################################################

This package was compiled Wed Sep 8 2021.
This package corresponds to v.1.2 of the corpus.

The dataset includes personal data and is therefore not fully open. The data
can be accessed after signing a non-disclosure agreement with the University
of Tartu Institute of Estonian and General Linguistics. The corpus is meant to
be used for linguistic research and for training NLP models. Some files are
restricted to academic use only. Applicants must provide their research plan.
To apply for access to this corpus, please contact partel.lippus@ut.ee.

The repository contains the following folders:

EKSKFK_doc - metadata: speakers, recordings, labelling tiers
SKK0_TG
SKK1_TG
SKK2_TG
SKK3_TG
SKK0_WAV
SKK1_WAV
SKK2_WAV
SKK3_WAV
SKK0_keypoints
SKK3_keypoints
SKK3_resp_TG
SKK3_resp_WAV

WAV - the sound recordings
TG - TextGrid annotations (see the metadata folder for tier info)
keypoints - OpenPose data (JSON by frame; see the frame resolution in the metadata)
resp_WAV & resp_TG - respiratory data

(mp4 files are not included in the repository. If you need them, please
contact partel.lippus@ut.ee.)

A plain text version of the word-level annotation is also available in the
repository: EKSKFK_words_by_IPU_full_corpus.txt

If you are using R, check out library(textgRid) for reading TextGrid files.
library(rPraat) and library(phonTools) may also come in handy.

See the recordings metadata file for more information about the tiers and
whether they were created by a script or hand-labelled.

In filenames: recordingID-speakerID_gender.
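
The filename pattern recordingID-speakerID_gender can be split
programmatically. The following is only a minimal Python sketch, not part of
the corpus tooling. It assumes that the gender code follows the last
underscore and that the speaker ID follows the last hyphen before it. The
function name parse_name and the example file name are hypothetical; check
the metadata files for the actual IDs.

```python
from pathlib import PurePath
from typing import NamedTuple, Optional


class RecordingName(NamedTuple):
    recording_id: str
    speaker_id: str
    gender: str


def parse_name(filename: str) -> Optional[RecordingName]:
    """Split 'recordingID-speakerID_gender' into parts; None if it does not fit."""
    stem = PurePath(filename).stem  # drop any directory and the file extension
    # Assumption: gender follows the LAST underscore in the name.
    rest, underscore, gender = stem.rpartition("_")
    # Assumption: speaker ID follows the LAST hyphen before the gender part.
    recording_id, hyphen, speaker_id = rest.rpartition("-")
    if not (underscore and hyphen and recording_id and speaker_id and gender):
        return None
    return RecordingName(recording_id, speaker_id, gender)


# Hypothetical file name, for illustration only.
print(parse_name("REC001-SPK01_F.TextGrid"))
```

Names that do not contain both separators (for example a folder name such as
SKK0_TG) yield None instead of raising an error, so the function can be run
over a mixed directory listing.
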
For more info, see the html documents in this repository and visit
https://foneetikakorpus.ut.ee/

You can generate a citation from the DataDOI link to refer to the corpus in
your publications.

Please use the corpus only for the research purposes that you stated in your
application. Do not redistribute the files, and keep your copies safe. Do not
publish personal data.

If you have questions: partel.lippus@ut.ee
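
A note on the keypoints folders: they hold OpenPose data as one JSON file per
frame. The sketch below shows one way such a frame might be read in Python. It
assumes the files follow the usual OpenPose output layout (a "people" list
whose entries carry a flat "pose_keypoints_2d" array of x, y, confidence
values). This layout has not been checked against the corpus files, the
function name read_pose_keypoints is hypothetical, and the sample frame is
made up.

```python
import json
from typing import List, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, confidence)


def read_pose_keypoints(json_text: str) -> List[List[Keypoint]]:
    """Return one list of (x, y, confidence) triples per detected person."""
    frame = json.loads(json_text)
    people = []
    # Assumed keys: "people" and "pose_keypoints_2d" (standard OpenPose output).
    for person in frame.get("people", []):
        flat = person.get("pose_keypoints_2d", [])
        # Regroup the flat array into consecutive triples.
        triples = [tuple(flat[i:i + 3]) for i in range(0, len(flat) - 2, 3)]
        people.append(triples)
    return people


# Made-up frame with one person and two keypoints, for illustration only.
sample = '{"people": [{"pose_keypoints_2d": [10.0, 20.0, 0.9, 30.0, 40.0, 0.8]}]}'
print(read_pose_keypoints(sample))
```

For a file on disk, the text can be read with open(path, encoding="utf-8")
and passed to the function. Pixel coordinates should be interpreted together
with the frame resolution given in the metadata.
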