Author of the README file: Piia Taremaa 
Last updated: 08.12.2022

-------------------
GENERAL INFORMATION
-------------------

Title of the dataset: Data and R code for "Speed and space: semantic asymmetries in motion descriptions in Estonian"
DOI: doi.org/10.1515/cog-2021-0132
URI: https://www.degruyter.com/document/doi/10.1515/cog-2021-0132/html
Description: Data and statistical code used in the paper "Speed and space: semantic asymmetries in motion descriptions in Estonian" (Cognitive Linguistics, Ahead of Print, published online December 8, 2022). 
Authors of the paper: Piia Taremaa, Anetta Kopecka
Licence: CC-BY
Acknowledgements: This study has been supported by the research fund of Kadri, Nikolai and Gerda Rõuk, by the Estonian Research Council grant (PSG671) and by the European Union through the European Regional Development Fund (Centre of Excellence in Estonian Studies).
Aims of the study: (i) to establish any possible speed effects in clausal structures of Estonian motion descriptions and (ii) to examine the relationship between the goal-over-source and fast-over-slow biases.

Contact: Piia Taremaa, University of Tartu (Jakobi 2-446, Tartu), piia.taremaa@ut.ee

--------------------
DATA & FILE OVERVIEW
--------------------

The dataset consists of the following files:

- Documentation: '00_README_Taremaa&Kopecka_Speed and space.txt'. This file (= the current document) contains the documentation of the dataset.
- Coded data: 'Taremaa&Kopecka_data_Speed and space_CL10.11.2022.txt' (full data is available for researchers upon request)
- Statistical code: 'R code for 'Speed and space' (Taremaa&Kopecka).txt'.


--------------------------------------------------------------------------------
DATA-SPECIFIC INFORMATION FOR 'Taremaa&Kopecka_Speed and space_CL10.11.2022.txt'
--------------------------------------------------------------------------------

File: 'Taremaa&Kopecka_Speed and space_CL10.11.2022.txt'.

This file contains the annotated corpus material (12,300 clauses). The corpus material is taken from the Estonian National Corpus 2019.

Each clause is coded for the variables listed below. The coded data is analysed in R (see the statistical code).
The definitions and levels of the variables are given in the paper.

List of the variables:
 [1] "ID"                 
 [2] "ID_long"            
 [3] "Verb"               
 [4] "VerbType"           
 [5] "VerbSpeed"          
 [6] "VerbCat"            
 [7] "Source"             
 [8] "SourceLength"       
 [9] "SourceForm"         
[10] "Location"           
[11] "LocationLength"     
[12] "LocationForm"       
[13] "Trajectory"         
[14] "TrajectoryLength"   
[15] "TrajectoryForm"     
[16] "Direction"          
[17] "DirectionLength"    
[18] "DirectionForm"      
[19] "Goal"               
[20] "GoalLength"         
[21] "GoalForm"           
[22] "Time"               
[23] "TimeLength"         
[24] "TimeForm"  
[25] "Purpose"            
[26] "PurposeLength"      
[27] "PurposeForm"        
[28] "Result"             
[29] "ResultLength"       
[30] "ResultForm"         
[31] "Distance"           
[32] "DistanceLength"     
[33] "DistanceForm"       
[34] "MoverAnimacy"       
[35] "MoverAnimacyForm"   
[36] "Manner"             
[37] "MannerLength"       
[38] "MannerForm"         
[39] "SlowOrFast"         
[40] "SpatialExprPresence"
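
The variable list above can be used to check that a copy of the data file has the expected columns. The following is a minimal Python sketch, not part of the published R code. It assumes that the data file is a tab-delimited text file whose first row holds the column names; the delimiter is not documented in this README, and the names EXPECTED_COLUMNS and check_header are hypothetical helpers introduced only for illustration.

```python
import csv
import io

# Column names copied from the variable list above.
EXPECTED_COLUMNS = [
    "ID", "ID_long", "Verb", "VerbType", "VerbSpeed", "VerbCat",
    "Source", "SourceLength", "SourceForm",
    "Location", "LocationLength", "LocationForm",
    "Trajectory", "TrajectoryLength", "TrajectoryForm",
    "Direction", "DirectionLength", "DirectionForm",
    "Goal", "GoalLength", "GoalForm",
    "Time", "TimeLength", "TimeForm",
    "Purpose", "PurposeLength", "PurposeForm",
    "Result", "ResultLength", "ResultForm",
    "Distance", "DistanceLength", "DistanceForm",
    "MoverAnimacy", "MoverAnimacyForm",
    "Manner", "MannerLength", "MannerForm",
    "SlowOrFast", "SpatialExprPresence",
]


def check_header(handle, delimiter="\t"):
    """Compare the first row of a delimited text file with EXPECTED_COLUMNS.

    Returns (missing, unexpected) as two lists of column names.
    The tab delimiter is an assumption; it is not documented in this README.
    """
    header = next(csv.reader(handle, delimiter=delimiter))
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    unexpected = [c for c in header if c not in EXPECTED_COLUMNS]
    return missing, unexpected


# Demonstration on an in-memory stand-in for the data file (header row only).
sample = io.StringIO("\t".join(EXPECTED_COLUMNS) + "\n")
missing, unexpected = check_header(sample)
print(len(EXPECTED_COLUMNS), missing, unexpected)
```

To run the check on the actual data file, pass an open file handle instead of the in-memory sample, for example one obtained with open(path, encoding="utf-8", newline=""). The UTF-8 encoding is likewise an assumption and may need adjusting.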