Author of the README file: Piia Taremaa
Last updated: 24.08.2021

-------------------
GENERAL INFORMATION
-------------------

Title of the dataset: Data and R code for "Verbs of horizontal and vertical motion: a corpus study in Estonian"
DOI: ...
URI: ...
Description: Data and statistical code used in the paper "Verbs of horizontal and vertical motion: a corpus study in Estonian" (accepted by the Finnish Journal of Linguistics 2021); Piia Taremaa
Licence: CC-BY
Acknowledgements: This work has been supported by the Estonian Research Council grant (PSG671) and by the European Union through the European Regional Development Fund (Centre of Excellence in Estonian Studies)
Aim of the study: To investigate possible differences between the use of verbs of horizontal motion and verbs of vertical motion in Estonian
Contact: Piia Taremaa, University of Tartu (Jakobi 2-446, Tartu), piia.taremaa@ut.ee

--------------------
DATA & FILE OVERVIEW
--------------------

The dataset consists of the following files:
- Documentation: '00_README_HorVertVerbs.txt'. This file (= the current document) contains the documentation of the dataset.
- Data: 'hv_2021_Taremaa.txt' and 'hvgr_2021_Taremaa.txt'.
- Statistical code: 'R code for the paper 'Verbs of horizontal and vertical motion' (Taremaa 2021).txt'.

---------------------------------------------------
DATA-SPECIFIC INFORMATION FOR 'hv_2021_Taremaa.txt'
---------------------------------------------------

File: 'hv_2021_Taremaa.txt'.
This file contains annotated corpus data. The corpus clauses (i.e., the raw data; not included in the file) originate from the Estonian fiction corpus (https://www.cl.ut.ee/korpused/segakorpus/eesti_ilukirjandus_1990/) and from the Estonian newspapers’ corpora (accessed via Keeleveeb: http://www.keeleveeb.ee/). The clauses were manually coded by the author of the paper for the following variables. The coded data is analysed in R (see statistical code). The definitions of the variables can be found in the paper.
List of the variables (levels):
* SentenceNr (Sentence0101, Sentence0102, etc.)
* HorVert (HorVerb, VertVerb)
* Verb (jalutama 'walk, stroll', kukkuma 'fall', etc.)
* Frequency (numeric values)
* Genre (Fiction, Journal)
* SpatExprPresent
* Source (yes, no)
* FromDirection (yes, no)
* Location (yes, no)
* Trajectory (yes, no)
* Direction (yes, no)
* Goal (yes, no)
* Distance (yes, no)
* MannerInstr (yes, no)
* Result (yes, no)
* Cause (yes, no)
* Purpose (yes, no)
* CoMover (yes, no)
* Time (yes, no)
* Tense (Present, Past)
* Aspect (Unspecified, Perfective, Progressive)
* Polarity (Aff, Neg)
* Mood (Indicative, Conditional, Imperative, Jussive, Quotative)
* Voice (Pers, Impers)
* Person (1st, 2nd, 3rd, Unclear)
* Number (SG, PL, Unclear)

-----------------------------------------------------
DATA-SPECIFIC INFORMATION FOR 'hvgr_2021_Taremaa.txt'
-----------------------------------------------------

File: 'hvgr_2021_Taremaa.txt'.
This file contains corpus data in which the grammatical verb-related variables are recoded as binary variables.

List of the variables (levels):
* SentenceNr (Sentence0101, Sentence0102, etc.)
* HorVert (HorVerb, VertVerb)
* Verb (jalutama 'walk, stroll', kukkuma 'fall', etc.)
* Polarity_Aff (yes, no)
* Polarity_Neg (yes, no)
* Mood_Indicative (yes, no)
* Mood_Conditional (yes, no)
* Mood_Imperative (yes, no)
* Mood_Jussive (yes, no)
* Mood_Quotative (yes, no)
* Voice_Pers (yes, no)
* Voice_Impers (yes, no)
* Aspect_Unspecified (yes, no)
* Aspect_Perfective (yes, no)
* Aspect_Progressive (yes, no)
* Tense_Present (yes, no)
* Tense_Past (yes, no)
* Person_1 (yes, no)
* Person_2 (yes, no)
* Person_3 (yes, no)
* Person_Unclear (yes, no)
* Num_SG (yes, no)
* Num_Pl (yes, no)
* Num_Unclear (yes, no)
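
Illustration (not part of the dataset): the relation between the two data files can be sketched as follows. This is a minimal Python sketch of how the categorical grammatical variables of 'hv_2021_Taremaa.txt' could map onto the yes/no variables of 'hvgr_2021_Taremaa.txt'. It is not the author's R code. The function name recode_row and the example row are hypothetical, and the values in the example row are invented for illustration only; the variable names and levels are taken from the lists above.

```python
# Sketch only: recode categorical grammatical variables into yes/no columns.
# Variable names and levels follow the lists in this README; everything else
# (function name, example row) is a hypothetical illustration.

ID_COLUMNS = ["SentenceNr", "HorVert", "Verb"]

# For each categorical variable: level -> name of the binary column.
BINARY_COLUMNS = {
    "Polarity": {"Aff": "Polarity_Aff", "Neg": "Polarity_Neg"},
    "Mood": {
        "Indicative": "Mood_Indicative",
        "Conditional": "Mood_Conditional",
        "Imperative": "Mood_Imperative",
        "Jussive": "Mood_Jussive",
        "Quotative": "Mood_Quotative",
    },
    "Voice": {"Pers": "Voice_Pers", "Impers": "Voice_Impers"},
    "Aspect": {
        "Unspecified": "Aspect_Unspecified",
        "Perfective": "Aspect_Perfective",
        "Progressive": "Aspect_Progressive",
    },
    "Tense": {"Present": "Tense_Present", "Past": "Tense_Past"},
    "Person": {
        "1st": "Person_1",
        "2nd": "Person_2",
        "3rd": "Person_3",
        "Unclear": "Person_Unclear",
    },
    "Number": {"SG": "Num_SG", "PL": "Num_Pl", "Unclear": "Num_Unclear"},
}


def recode_row(row):
    """Return a new dict with the identifying columns plus yes/no columns."""
    recoded = {name: row[name] for name in ID_COLUMNS}
    for variable, level_to_column in BINARY_COLUMNS.items():
        for level, column in level_to_column.items():
            recoded[column] = "yes" if row[variable] == level else "no"
    return recoded


# Invented example row (values are NOT taken from the dataset).
example_row = {
    "SentenceNr": "Sentence0101",
    "HorVert": "HorVerb",
    "Verb": "jalutama",
    "Polarity": "Aff",
    "Mood": "Indicative",
    "Voice": "Pers",
    "Aspect": "Unspecified",
    "Tense": "Past",
    "Person": "3rd",
    "Number": "SG",
}

recoded_example = recode_row(example_row)
for column, value in recoded_example.items():
    print(column, value)
```

Each row keeps its three identifying variables and gains one yes/no variable per level, which gives the 24 variables listed for 'hvgr_2021_Taremaa.txt'. Reading the actual files is left out on purpose, because this README does not state their column separator or encoding.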