3 Motion data cleaning

3.1 Setup

Before starting, create a new folder cleaned_data to store the cleaned motion data.

+-- motion_data
|   +-- MDim1_s1_b1_0000.exp
|   +-- MDim1_s1_b1_0001.exp
|   +-- ...
+-- cleaned_data
+-- task_data
|   +-- MDim1 raw.txt
|   +-- trial_data.csv
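
The folder can also be created from within R. A minimal sketch using base R's dir.create(), assuming the working directory is the project root shown above:

```r
# Create cleaned_data/ next to motion_data/ and task_data/ if it is missing.
# showWarnings = FALSE keeps the call quiet if the folder already exists.
out.dir <- 'cleaned_data'
if (!dir.exists(out.dir)) {
    dir.create(out.dir, showWarnings = FALSE)
}
```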

The following packages should be installed and loaded

library(seathree)
library(motionImport)
library(seacurve)

3.2 Guide

First, load the trial data

trial.data <- read.csv('task_data/trial_data.csv', header = T)

and build the list of files to be cleaned

filepaths  <- paste('motion_data/', trial.data$FileName, '.exp', sep = '')
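
Before importing anything, it can help to confirm that every path points at a file that actually exists. The sketch below uses a small stand-in data frame (demo.trials, a hypothetical name) in place of trial.data, and base R's file.exists():

```r
# Stand-in for trial.data; only the FileName column matters here
demo.trials <- data.frame(FileName = c('MDim1_s1_b1_0000', 'MDim1_s1_b1_0001'),
                          stringsAsFactors = FALSE)
demo.paths  <- paste0('motion_data/', demo.trials$FileName, '.exp')

# TRUE where the .exp file is present on disk, FALSE otherwise
found <- file.exists(demo.paths)

# Names of the trials whose motion file is missing
missing.trials <- demo.trials$FileName[!found]
```

Trials listed in missing.trials would fail at import, so it is cheaper to find them here than partway through a long cleaning run.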

3.2.1 Single trial workflow

To see what the cleaning looks like, we can import the first trial using ImportMotionData(). The function also needs the time window of the recording (in this case, from 0.5 seconds before the response to 1 second after) and the symbol that the .exp file uses to denote missing values (in this case, 0)

timestamps <- c(-.5,1)
na.symbol  <- 0
data <- ImportMotionData(filepaths[1], timestamps, na.symbol)

The result is a list with eight elements, each corresponding to a sensor

names(data)
## [1] "RtmbS2"       "RindexS3"     "RmiddleS4"    "RbackhandS5" 
## [5] "LtmbS7"       "LindexS8"     "LmiddleS9"    "LbackhandS10"

Since this is a right-handed trial, we’ll use the right thumb sensor as an example

sensor <- data$RtmbS2

Each sensor contains a vector of time stamps

head(sensor$Timestamps)
## [1] -0.5000000 -0.4873950 -0.4747899 -0.4621849 -0.4495798 -0.4369748

a matrix of position data (in Cartesian coordinates)

head(sensor$Position)
##       RtmbS2x RtmbS2y  RtmbS2z
## [1,] 0.019186 0.09589 0.041145
## [2,] 0.019242 0.09589 0.041145
## [3,]       NA      NA       NA
## [4,] 0.019242 0.09589 0.041145
## [5,]       NA      NA       NA
## [6,] 0.019242 0.09589 0.041145

a matrix of orientation data (unit quaternions)

head(sensor$Rotation)
##      RtmbS2q0 RtmbS2q1  RtmbS2q2 RtmbS2q3
## [1,] 0.665255 0.442971 -0.388956 0.458177
## [2,] 0.665111 0.443103 -0.388934 0.458277
## [3,]       NA       NA        NA       NA
## [4,] 0.665100 0.443097 -0.388998 0.458244
## [5,]       NA       NA        NA       NA
## [6,] 0.665100 0.443097 -0.388998 0.458244
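
A unit quaternion has Euclidean norm 1, which gives a quick sanity check on imported rotation data. The sketch below retypes the first two rows printed above into a matrix (rather than calling ImportMotionData()), adds one missing sample, and checks the norm of each row:

```r
# Rows 1 and 2 of sensor$Rotation as printed above, plus one missing sample
rot <- rbind(c(0.665255, 0.442971, -0.388956, 0.458177),
             c(0.665111, 0.443103, -0.388934, 0.458277),
             c(NA, NA, NA, NA))

# Euclidean norm of each row; rows with missing values stay NA
q.norm <- sqrt(rowSums(rot^2))

# TRUE for rows that are unit length up to rounding of the printed values
is.unit <- abs(q.norm - 1) < 1e-4
```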

We can represent these data compactly by converting the sensor to a single dual quaternion trajectory using sc.posrot2dq()

sc  <- sc.posrot2dq(v = sensor$Position,
                    q = sensor$Rotation,
                    timestamps = sensor$Timestamps)
sc[1:6,]
##            Time        q0        q1         q2        q3          q4
## [1,] -0.5000000 0.6652548 0.4429709 -0.3889559 0.4581769 0.004973227
## [2,] -0.4873950 0.6651109 0.4431029 -0.3889339 0.4582769 0.004956442
## [3,] -0.4747899        NA        NA         NA        NA          NA
## [4,] -0.4621849 0.6651000 0.4430970 -0.3889980 0.4582440 0.004960248
## [5,] -0.4495798        NA        NA         NA        NA          NA
## [6,] -0.4369748 0.6651000 0.4430970 -0.3889980 0.4582440 0.004960248
##              q5         q6          q7
## [1,] 0.03635088 0.03661337 -0.01128354
## [2,] 0.03637246 0.03659539 -0.01130351
## [3,]         NA         NA          NA
## [4,] 0.03637210 0.03659507 -0.01130407
## [5,]         NA         NA          NA
## [6,] 0.03637210 0.03659507 -0.01130407
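
For readers unfamiliar with the representation: a dual quaternion stores the rotation quaternion in its first four components and, in the last four, a dual part equal to one half of the quaternion product of the translation (written as a quaternion with zero scalar part) with the rotation. The base R sketch below implements this standard construction for a single sample. It is an illustration only; that sc.posrot2dq() works this way internally is an assumption, and the helper names quat.mult and posrot2dq.row are hypothetical.

```r
# Hamilton product of two quaternions stored as c(w, x, y, z)
quat.mult <- function(a, b) {
    c(a[1]*b[1] - a[2]*b[2] - a[3]*b[3] - a[4]*b[4],
      a[1]*b[2] + a[2]*b[1] + a[3]*b[4] - a[4]*b[3],
      a[1]*b[3] - a[2]*b[4] + a[3]*b[1] + a[4]*b[2],
      a[1]*b[4] + a[2]*b[3] - a[3]*b[2] + a[4]*b[1])
}

# Combine one position (x, y, z) and one rotation quaternion into the
# eight components q0..q7 of a dual quaternion
posrot2dq.row <- function(v, q) {
    q    <- q / sqrt(sum(q^2))            # renormalise the rotation
    dual <- 0.5 * quat.mult(c(0, v), q)   # dual part: (1/2) * t * q
    c(q, dual)
}

# First sample of the right thumb sensor, as printed above
v1  <- c(0.019186, 0.09589, 0.041145)
q1  <- c(0.665255, 0.442971, -0.388956, 0.458177)
dq1 <- posrot2dq.row(v1, q1)
round(dq1, 6)
```

With these inputs, dq1 agrees with the first row of sc[1:6,] up to rounding of the printed values.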

Both the position and rotation data can be plotted using sc.plot():

sc.plot(sc, 'Position')

sc.plot(sc, 'Rotation')

Sparing the details (which can be found on the project webpage), minimal cleaning (including reorientation of the dual quaternions, interpolation, and outlier removal) can be done with the sc.clean() function. This function accepts the sensor data directly, so there is no need to convert the sensor to a Seacurve object first.

# Number of windows for local regression outlier removal
win <- 5 

# Cook's distance threshold for outlier removal
thresh <- .025

# Clean
sc <- sc.clean(sensor, win, thresh)
sc.plot(sc, 'Position')

sc.plot(sc, 'Rotation')
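
To give a feel for two of the steps named above, the sketch below fills missing samples by linear interpolation and then flags outliers by Cook's distance, using only base R (approx(), lm(), cooks.distance()). It is a simplified stand-in, not the sc.clean() implementation: it fits one global straight line to a made-up one-dimensional signal, whereas sc.clean() is described as using local regression over several windows. All variable names here are hypothetical.

```r
# Made-up 1-D signal: a straight line with one spike and two dropped samples
time.idx <- 1:20
y        <- as.numeric(time.idx)
y[10]    <- y[10] + 5      # artificial outlier
y[c(4, 15)] <- NA          # artificial missing samples

# Step 1: linear interpolation of the missing samples
ok       <- !is.na(y)
y.filled <- approx(time.idx[ok], y[ok], xout = time.idx)$y

# Step 2: flag samples whose Cook's distance exceeds the threshold
thresh.demo <- .025
fit         <- lm(y.filled ~ time.idx)
outlier     <- cooks.distance(fit) > thresh.demo

which(outlier)
```

Here only the spiked sample at index 10 is flagged. In a real pipeline the flagged samples would then be removed and re-interpolated.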

3.2.2 Single trial wrapper

We want a function that accepts a path to a .exp file and returns a list with cleaned data for each sensor. In the event of an error (say, the motion monitor system returns a .exp file with no data), we want to keep a record of the error in trial.data. Our function will return a list with two elements:

- A logical value valid denoting whether the data were cleaned successfully
- A field data containing the cleaned trial data (if any)

We’ll then store the value of valid in a column of trial.data for our records.

cleanTrial <- function(filepath, timestamps, na.symbol, win, thresh) {
    tryCatch({
        # Import every sensor in the trial, then clean each one
        data    <- ImportMotionData(filepath, timestamps, na.symbol)
        pr.data <- lapply(data, sc.clean, win, thresh)
        list(valid = TRUE, data = pr.data)
    }, error = function(e) {
        # Import or cleaning failed: flag the trial and return no data
        list(valid = FALSE, data = NULL)
    })
}
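
The error-handling pattern can be tried without any motion files. In the sketch below, fake.import is a hypothetical stand-in for ImportMotionData() that fails on an empty path, and safeClean (also hypothetical) wraps it in the same way that cleanTrial() wraps the real import:

```r
# Hypothetical importer: errors when given an empty path, otherwise
# returns a one-sensor list
fake.import <- function(filepath) {
    if (!nzchar(filepath)) stop('no data in file')
    list(RtmbS2 = c(1, 2, 3))
}

# Same structure as cleanTrial(): a valid flag plus data (NULL on failure)
safeClean <- function(filepath) {
    tryCatch({
        list(valid = TRUE, data = fake.import(filepath))
    }, error = function(e) {
        list(valid = FALSE, data = NULL)
    })
}

good <- safeClean('motion_data/MDim1_s1_b1_0000.exp')
bad  <- safeClean('')
c(good$valid, bad$valid)
```

Because the error is caught inside the wrapper, a single bad file cannot abort a loop over many trials.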

3.2.3 Cleaning all trials

Now we’ll loop over all trials in trial.data and save the results. The value of valid will be saved to a column Has.Motion.Data, and the cleaned data will be saved to cleaned_data/. As there are almost 18,000 trials, this can be expected to take over an hour.

n <- nrow(trial.data)
trial.data$Has.Motion.Data <- NA

for (i in seq_len(n)) {
    
    # Process trial 
    pr.data <- cleanTrial(filepaths[i], timestamps, na.symbol, win, thresh)
    trial.data$Has.Motion.Data[i] <- pr.data$valid
    
    # If trial has valid motion data, save it to cleaned_data/
    if (pr.data$valid) {
        fname <- paste('cleaned_data/', trial.data$FileName[i], '.rds', sep = '')
        saveRDS(pr.data$data, file = fname)
    }
    
}
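
Each .rds file holds one trial's list of cleaned sensors and can be read back later with readRDS(). A minimal round trip, using a temporary file and a made-up list (demo.data, a hypothetical name) in place of real cleaned data:

```r
# Made-up stand-in for pr.data$data: one small matrix per sensor
demo.data <- list(RtmbS2 = matrix(1:6,  nrow = 2),
                  LtmbS7 = matrix(7:12, nrow = 2))

# Write to a temporary .rds file, then read it back
tmp.rds <- tempfile(fileext = '.rds')
saveRDS(demo.data, file = tmp.rds)
restored <- readRDS(tmp.rds)

# The restored object has the same names and contents as what was saved
identical(restored, demo.data)
```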

Finally, we save the new data frame

write.table(trial.data, file = 'task_data/cleaned_trial_data.csv',
            sep = ',', row.names = F)
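
The Has.Motion.Data column also makes it easy to see how many trials failed once the file is read back. A sketch on a hypothetical three-trial data frame, written to a temporary file instead of task_data/:

```r
# Hypothetical stand-in for the cleaned trial.data
demo <- data.frame(FileName        = c('trial_a', 'trial_b', 'trial_c'),
                   Has.Motion.Data = c(TRUE, FALSE, TRUE),
                   stringsAsFactors = FALSE)

# Save with the same write.table() call as above, then reload
tmp.csv <- tempfile(fileext = '.csv')
write.table(demo, file = tmp.csv, sep = ',', row.names = FALSE)
back <- read.csv(tmp.csv, header = TRUE)

# Number of trials without usable motion data, and which ones they are
n.failed     <- sum(!back$Has.Motion.Data)
failed.files <- as.character(back$FileName[!back$Has.Motion.Data])
```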

The directory structure should now resemble

+-- motion_data
|   +-- MDim1_s1_b1_0000.exp
|   +-- MDim1_s1_b1_0001.exp
|   +-- ...
+-- cleaned_data
|   +-- MDim1_s1_b1_0000.rds
|   +-- MDim1_s1_b1_0001.rds
|   +-- ...
+-- task_data
|   +-- MDim1 raw.txt
|   +-- trial_data.csv
|   +-- cleaned_trial_data.csv