3 Motion data cleaning
3.1 Setup
Before starting, create a new folder cleaned_data to store the cleaned motion data.
+-- motion_data
|   +-- MDim1_s1_b1_0000.exp
|   +-- MDim1_s1_b1_0001.exp
|   +-- ...
+-- cleaned_data
+-- task_data
|   +-- MDim1 raw.txt
|   +-- trial_data.csv
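The folder can also be created from within R. This is a minimal sketch using base R only, and it assumes the working directory is the project root that contains motion_data and task_data:

```r
# Create the output folder if it does not already exist
out.dir <- 'cleaned_data'
if (!dir.exists(out.dir)) {
  dir.create(out.dir)
}
```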
The following packages should be installed and loaded
library(seathree)
library(motionImport)
library(seacurve)
3.2 Guide
First, load the trial data
trial.data <- read.csv('task_data/trial_data.csv', header = T)
and get a list of files to be cleaned
filepaths <- paste('motion_data/', trial.data$FileName, '.exp', sep = '')
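Before cleaning, it may be worth checking that every path points at a file that exists. The sketch below uses a toy stand-in for trial.data (the names toy.trials, toy.paths and missing.paths are illustrative, and the two file names are taken from the directory listing in the setup section):

```r
# Toy stand-in for trial.data with two file names from the directory listing
toy.trials <- data.frame(FileName = c('MDim1_s1_b1_0000', 'MDim1_s1_b1_0001'))
toy.paths <- paste('motion_data/', toy.trials$FileName, '.exp', sep = '')

# Paths that are not present on disk (empty if all files were found)
missing.paths <- toy.paths[!file.exists(toy.paths)]
length(missing.paths)
```

Applying the same check to filepaths would list any trials whose recordings are absent before the long cleaning loop is started.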
3.2.1 Single trial workflow
To see what the cleaning looks like, we can import the first trial using ImportMotionData(). The function also needs the time window of the recording (in this case, from 0.5 seconds before the response to 1 second after) and the symbol that the .exp file uses to denote missing values (in this case, 0)
timestamps <- c(-.5,1)
na.symbol <- 0
data <- ImportMotionData(filepaths[1], timestamps, na.symbol)
The result is a list with eight elements, each corresponding to a sensor
names(data)
## [1] "RtmbS2" "RindexS3" "RmiddleS4" "RbackhandS5"
## [5] "LtmbS7" "LindexS8" "LmiddleS9" "LbackhandS10"
Since this is a right-handed trial, we’ll use the right thumb sensor as an example
sensor <- data$RtmbS2
Each sensor contains a vector of time stamps
head(sensor$Timestamps)
## [1] -0.5000000 -0.4873950 -0.4747899 -0.4621849 -0.4495798 -0.4369748
a matrix of position data (in Cartesian coordinates)
head(sensor$Position)
##       RtmbS2x RtmbS2y  RtmbS2z
## [1,] 0.019186 0.09589 0.041145
## [2,] 0.019242 0.09589 0.041145
## [3,]       NA      NA       NA
## [4,] 0.019242 0.09589 0.041145
## [5,]       NA      NA       NA
## [6,] 0.019242 0.09589 0.041145
a matrix of orientation data (unit quaternions)
head(sensor$Rotation)
##      RtmbS2q0 RtmbS2q1  RtmbS2q2 RtmbS2q3
## [1,] 0.665255 0.442971 -0.388956 0.458177
## [2,] 0.665111 0.443103 -0.388934 0.458277
## [3,]       NA       NA        NA       NA
## [4,] 0.665100 0.443097 -0.388998 0.458244
## [5,]       NA       NA        NA       NA
## [6,] 0.665100 0.443097 -0.388998 0.458244
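A few quick checks can confirm that a sensor looks as described. The sketch below is self-contained: it copies the six printed rotation rows into a plain matrix rather than reading them from sensor. The assumption of 120 evenly spaced samples is an inference from the spacing of the printed time stamps, not something the import function is known to guarantee.

```r
# First six rows of the rotation matrix, copied from the printed output
rot <- rbind(
  c(0.665255, 0.442971, -0.388956, 0.458177),
  c(0.665111, 0.443103, -0.388934, 0.458277),
  c(NA, NA, NA, NA),
  c(0.665100, 0.443097, -0.388998, 0.458244),
  c(NA, NA, NA, NA),
  c(0.665100, 0.443097, -0.388998, 0.458244)
)

# Rows where the sensor dropped a sample
dropped <- which(rowSums(is.na(rot)) > 0)

# Norm of each quaternion; close to 1 for every observed row
norms <- sqrt(rowSums(rot^2))

# Assumption: 120 evenly spaced samples from -0.5 s to 1 s
ts <- seq(-0.5, 1, length.out = 120)
head(ts)
```

On the real data, the same checks would be run on sensor$Rotation and sensor$Timestamps.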
We can represent these data compactly by converting the sensor to a single dual quaternion trajectory using sc.posrot2dq()
sc <- sc.posrot2dq(v = sensor$Position,
                   q = sensor$Rotation,
                   timestamps = sensor$Timestamps)
sc[1:6,]
##            Time        q0        q1         q2        q3          q4
## [1,] -0.5000000 0.6652548 0.4429709 -0.3889559 0.4581769 0.004973227
## [2,] -0.4873950 0.6651109 0.4431029 -0.3889339 0.4582769 0.004956442
## [3,] -0.4747899        NA        NA         NA        NA          NA
## [4,] -0.4621849 0.6651000 0.4430970 -0.3889980 0.4582440 0.004960248
## [5,] -0.4495798        NA        NA         NA        NA          NA
## [6,] -0.4369748 0.6651000 0.4430970 -0.3889980 0.4582440 0.004960248
##              q5         q6          q7
## [1,] 0.03635088 0.03661337 -0.01128354
## [2,] 0.03637246 0.03659539 -0.01130351
## [3,]         NA         NA          NA
## [4,] 0.03637210 0.03659507 -0.01130407
## [5,]         NA         NA          NA
## [6,] 0.03637210 0.03659507 -0.01130407
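To make the eight columns less opaque, the sketch below rebuilds the first row by hand in base R. It uses the common convention in which the real part (q0 to q3) is the rotation quaternion and the dual part (q4 to q7) is half the Hamilton product of the pure quaternion (0, position) with the rotation. This convention reproduces the printed values for this row, but it is not taken from the sc.posrot2dq() source, and quat.mult is a helper written only for this illustration.

```r
# Hamilton product of two quaternions given as c(w, x, y, z)
quat.mult <- function(a, b) {
  c(a[1] * b[1] - a[2] * b[2] - a[3] * b[3] - a[4] * b[4],
    a[1] * b[2] + a[2] * b[1] + a[3] * b[4] - a[4] * b[3],
    a[1] * b[3] - a[2] * b[4] + a[3] * b[1] + a[4] * b[2],
    a[1] * b[4] + a[2] * b[3] - a[3] * b[2] + a[4] * b[1])
}

# First sample of the right thumb sensor, copied from the printed output
pos <- c(0.019186, 0.09589, 0.041145)
rot <- c(0.665255, 0.442971, -0.388956, 0.458177)

# Dual part: half the product of (0, position) with the rotation
dual <- 0.5 * quat.mult(c(0, pos), rot)

# Full dual quaternion: real part followed by dual part
dq <- c(rot, dual)
round(dq, 6)
```

The last four entries of dq agree with columns q4 to q7 of the first printed row to within rounding of the inputs.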
Both the position and rotation data can be plotted using sc.plot():
sc.plot(sc, 'Position')
sc.plot(sc, 'Rotation')
Sparing the details (which can be found on the project webpage), minimal cleaning (including reorienting the dual quaternions, interpolation, and outlier removal) can be done with the sc.clean() function. This function accepts the sensor data directly, so there is no need to convert the sensor to a Seacurve object first.
# Number of windows for local regression outlier removal
win <- 5
# Cook's distance threshold for outlier removal
thresh <- .025
# Clean
sc <- sc.clean(sensor, win, thresh)
sc.plot(sc, 'Position')
sc.plot(sc, 'Rotation')
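For intuition about what win and thresh control, here is a rough base R sketch on one synthetic coordinate. It is one plausible reading of the two parameters (fill gaps by linear interpolation, split the series into win windows, fit a regression in each, and flag points whose Cook's distance exceeds thresh). The quadratic fit and the windowing scheme are assumptions made for illustration; the procedure actually used by sc.clean() is described on the project webpage and may differ.

```r
# Synthetic coordinate: smooth signal with two dropped samples and one spike
tt <- seq(-0.5, 1, length.out = 120)
x <- sin(2 * pi * tt)
x[c(3, 5)] <- NA
x[60] <- x[60] + 1

# 1. Fill gaps by linear interpolation over the observed samples
ok <- !is.na(x)
x.filled <- approx(tt[ok], x[ok], xout = tt)$y

# 2. Fit a quadratic in each window and flag influential points
win <- 5
thresh <- .025
window.id <- cut(seq_along(tt), breaks = win, labels = FALSE)
outlier <- logical(length(tt))
for (w in seq_len(win)) {
  idx <- which(window.id == w)
  d <- data.frame(tt = tt[idx], x = x.filled[idx])
  fit <- lm(x ~ poly(tt, 2), data = d)
  outlier[idx] <- cooks.distance(fit) > thresh
}

# The artificial spike at sample 60 is among the flagged points
which(outlier)
```

With a threshold this low, some ordinary points near the spike or at window edges may be flagged as well, so flagged samples would typically be re-interpolated rather than dropped.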
3.2.2 Single trial wrapper
We want a function that accepts a path to a .exp file and returns a list with cleaned data for each sensor. In the event of an error (say, the motion monitor system returns a .exp file with no data), we want to keep a record of the error in trial.data. Our function will return a list with two elements:
- A logical value valid denoting whether the data were cleaned successfully
- A field data containing the cleaned trial data (if any)
We’ll then store the value of valid in a column of trial.data for our records.
cleanTrial <- function(filepath, timestamps, na.symbol, win, thresh) {
  tryCatch({
    # Import the raw trial, then clean each sensor as in the single trial workflow
    data <- ImportMotionData(filepath, timestamps, na.symbol)
    pr.data <- lapply(data, sc.clean, win, thresh)
    list(valid = TRUE, data = pr.data)
  }, error = function(e) {
    # Import or cleaning failed: flag the trial and return no data
    list(valid = FALSE, data = NULL)
  })
}
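The error-handling pattern can be tried without any motion files. In the sketch below, toyImport and toyClean are hypothetical stand-ins written for this illustration; toyImport simply fails on an empty path, which plays the role of a .exp file with no data.

```r
# Hypothetical stand-in for the import step: fails on an empty path
toyImport <- function(filepath) {
  if (!nzchar(filepath)) stop('no data in file')
  list(RtmbS2 = 1:3)
}

# Same pattern as cleanTrial(): report success or failure instead of stopping
toyClean <- function(filepath) {
  tryCatch({
    list(valid = TRUE, data = toyImport(filepath))
  }, error = function(e) {
    list(valid = FALSE, data = NULL)
  })
}

good <- toyClean('some_file.exp')
bad <- toyClean('')
c(good$valid, bad$valid)
```

Because the error is caught inside the function, a loop that calls it keeps running after a bad file and only the valid flag records the failure.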
3.2.3 Cleaning all trials
Now we’ll loop over all trials in trial.data and save the results. The output of valid will be saved to a column Has.Motion.Data. The cleaned data is saved to cleaned_data/. As there are almost 18,000 trials, this can be expected to take over an hour.
n <- nrow(trial.data)
trial.data$Has.Motion.Data <- NA
for (i in seq_len(n)) {
  # Process trial
  pr.data <- cleanTrial(filepaths[i], timestamps, na.symbol, win, thresh)
  trial.data$Has.Motion.Data[i] <- pr.data$valid
  # If trial has valid motion data, save it to cleaned_data/
  if (pr.data$valid) {
    fname <- paste('cleaned_data/', trial.data$FileName[i], '.rds', sep = '')
    saveRDS(pr.data$data, file = fname)
  }
}
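To get a feel for the total run time before committing to the full loop, one option is to time a handful of trials and extrapolate. The sketch below uses a hypothetical stand-in, toyTrial, that just sleeps briefly; on real data it would be replaced by a call to cleanTrial() on a few file paths.

```r
# Hypothetical stand-in for cleaning one trial
toyTrial <- function(i) {
  Sys.sleep(0.01)
  i
}

# Time a small sample of trials
n.sample <- 5
elapsed <- system.time(
  for (i in seq_len(n.sample)) toyTrial(i)
)[['elapsed']]

# Extrapolate to roughly 18,000 trials, expressed in hours
est.hours <- elapsed / n.sample * 18000 / 3600
est.hours
```

For reference, a total of one hour over 18,000 trials corresponds to 0.2 seconds per trial.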
Finally, we save the new data frame
write.table(trial.data, file = 'task_data/cleaned_trial_data.csv',
sep = ',', row.names = F)
The directory structure should now resemble
+-- motion_data
|   +-- MDim1_s1_b1_0000.exp
|   +-- MDim1_s1_b1_0001.exp
|   +-- ...
+-- cleaned_data
|   +-- MDim1_s1_b1_0000.rds
|   +-- MDim1_s1_b1_0001.rds
|   +-- ...
+-- task_data
|   +-- MDim1 raw.txt
|   +-- trial_data.csv
|   +-- cleaned_trial_data.csv
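A cleaned trial can later be loaded back with readRDS(). The sketch below shows the round trip on a toy object written to a temporary file, together with a quick tally of a validity column. The names toy.clean and toy.valid and the 2 x 9 matrix shape are illustrative assumptions, not the actual structure of the saved files.

```r
# Toy stand-in for one cleaned trial, written to a temporary file
toy.clean <- list(RtmbS2 = matrix(0, nrow = 2, ncol = 9))
tmp <- tempfile(fileext = '.rds')
saveRDS(toy.clean, file = tmp)

# readRDS() restores the object as it was saved
restored <- readRDS(tmp)
identical(restored, toy.clean)

# Toy validity column: count trials that were cleaned successfully
toy.valid <- c(TRUE, TRUE, FALSE)
table(toy.valid)
```

On the real output, the same tally on the Has.Motion.Data column of cleaned_trial_data.csv shows how many trials failed to clean.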