# The timing and quality of diphthong components in spontaneous Estonian (data)

CC BY, Pärtel Lippus 2019

This is the data and code for the paper:

Lippus, P., & Asu, E. L. (2019). The timing and quality of diphthong components in spontaneous Estonian. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019 (pp. 1139–1143). Australasian Speech Science and Technology Association Inc. https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2019/papers/ICPhS_1188.pdf


## The repository contains the following files:

  1. leia_diftongide_formandid_fonkorpusest_ver2.praat -- A Praat script for extracting the data from the Phonetic Corpus of Estonian Spontaneous Speech (see https://foneetikakorpus.ut.ee). The script is lightly commented in Estonian.
  2. lippus_asu_icphs2019_diphthongs_2018-11-21.txt -- The dataset: a table of tab-separated values in txt format, covering monophthongs and diphthongs in the stressed, open first syllable of disyllabic words, with F1, F2, and F3 values in Hz from 30 equidistant points. The data returned by the Praat script has been semi-manually cleaned of noise; these steps are not documented here. For more information, see the paper.
  3. lippus_asu_icphs2019_diphthongs_code.R -- The R script for running the analysis and creating the figures for the ICPhS 2019 paper. The script was revisited on 17.10.2025 for a very light cleanup and commenting.
  4. ICPhS_1188.pdf -- the Lippus & Asu 2019 paper.
  5. icphs_dift_fig1.png, icphs_dift_fig2.png, icphs_dift_fig3.png -- figures created from the data with the R code for the paper.
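
For readers who want to inspect the dataset outside R, the following is a minimal Python sketch for reading a tab-separated table with a header row and collecting one formant track from a row. It is not part of the repository. The helper names `read_diphthong_table` and `formant_track` are hypothetical, the UTF-8 encoding in the usage comment is an assumption, and the sketch does not handle missing values because their encoding in the file is not documented here. The sample row contains made-up values and only three points per formant to keep it short.

```python
import csv
import io


def read_diphthong_table(source):
    """Read a tab-separated table with a header row into a list of dicts."""
    reader = csv.DictReader(source, delimiter="\t")
    return list(reader)


def formant_track(row, formant, n_points=30):
    """Collect e.g. f1_1 .. f1_30 from one row as a list of floats (Hz)."""
    return [float(row[f"{formant}_{i}"]) for i in range(1, n_points + 1)]


# Made-up sample standing in for the real file (illustrative values only).
sample = "file\tvok1\tf1_1\tf1_2\tf1_3\nX\ta\t500\t510\t520\n"
rows = read_diphthong_table(io.StringIO(sample))
print(formant_track(rows[0], "f1", n_points=3))

# With the real file (encoding is an assumption):
# with open("lippus_asu_icphs2019_diphthongs_2018-11-21.txt",
#           encoding="utf-8", newline="") as f:
#     rows = read_diphthong_table(f)
```

All values are read as strings, so numeric columns need an explicit conversion, as `formant_track` does for the formant columns.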


## The variables in the data file lippus_asu_icphs2019_diphthongs_2018-11-21.txt

  - file -- file ID (anonymised)
  - phon -- the word transcription in SAMPA 
  - morph -- morphological analysis (from Vabamorf)
  - ftong -- whether the vowel is a monophthong or a diphthong (mono/dift)
  - quantity -- the quantity of the word (Q2/Q3)
  - vok1 -- phoneme label at the beginning of the vocalic sequence
  - vok2 -- phoneme label at the end of the sequence (for monophthongs, vok2 == vok1)
  - vok_start -- start time of the vocalic sequence
  - vok_stop -- end time of the vocalic sequence
  - vok_boundary -- boundary between vok1 and vok2 (if ftong == dift)
  - formant_ceiling -- the optimal ceiling for the formant analysis
  - f1_1 .. f1_30 -- F1 values from 30 equidistant points
  - f2_1 .. f2_30 -- F2 values
  - f3_1 .. f3_30 -- F3 values
  - vowel -- vowel label for the whole sequence
  - disp -- ?
  - SP -- speaker ID (anonymised)
  - gender -- gender (F/M)
  - vok1.1 -- first character of vok1 label (the phoneme without diacritics) 
  - vok2.1 -- first char of vok2
  - dift -- vok1.1 + vok2.1
  - dur -- duration 
  - dur2 -- ?
  - dift_piir
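
As an illustration of how some of these variables relate to each other, here is a small Python sketch. It is not taken from the repository's scripts, and all function names are hypothetical. It assumes that vok1.1 and vok2.1 are simply the first character of the respective label, and that the 30 measurement points are spread evenly from vok_start to vok_stop with both endpoints included; the Praat script may place the points differently. The relative boundary position is one possible way to express vok_boundary as a proportion of the vocalic sequence.

```python
def first_char(label):
    """First character of a phoneme label (the vok1.1 / vok2.1 idea)."""
    return label[:1]


def diphthong_label(vok1, vok2):
    """Concatenate the first characters of vok1 and vok2 (the dift idea)."""
    return first_char(vok1) + first_char(vok2)


def equidistant_times(start, stop, n_points=30):
    """Evenly spaced time points from start to stop, endpoints included.

    Assumption: the actual point placement in the Praat script may differ.
    """
    step = (stop - start) / (n_points - 1)
    return [start + i * step for i in range(n_points)]


def relative_boundary(start, stop, boundary):
    """Boundary position as a proportion (0..1) of the vocalic sequence."""
    return (boundary - start) / (stop - start)


# Made-up labels and times, for illustration only.
print(diphthong_label("a:", "i"))
print(equidistant_times(0.0, 1.0, n_points=5))
print(relative_boundary(1.0, 2.0, 1.5))
```

The relative boundary is only meaningful for rows where ftong == dift, since vok_boundary is defined only for diphthongs.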

