Time Normalization of Force-Time Curves

Time Normalization

Dylan Hicks emailed me recently about “time normalization” of vertical jump data. If you’re unfamiliar, time normalization (see here for an example), better known as interpolation, involves re-sampling a data set of known length to a new length.

As in the paper I linked above, a common reason for performing time normalization is to standardize a data set’s length prior to performing comparisons. For example, say we want to compare an athlete’s squat jump (SJ) and countermovement jump (CMJ) force-time characteristics at multiple external loads (0 kg, 10 kg, and 20 kg). Before even looking at the data, we know the trial lengths will differ between jump types and that duration will increase with increasing load. In the linked data, the trials range from 280 ms to 955 ms. Apples-to-apples curve comparisons aren’t really possible with the raw data since the trials are different lengths, but we can interpolate new standard-length curves (e.g., 101 data points to represent 0% to 100% of the jump) to overcome the length discrepancy.
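To make the idea concrete, here is a minimal sketch using two made-up curves of different lengths. The signals and the names short_trial, long_trial, and normalize_length() are hypothetical and are not the linked data; approx() is the base R function covered in the next section.

```r
# Two hypothetical "trials" with the same shape but different lengths
short_trial <- sin(seq(0, pi, length.out = 280))  # 280 samples
long_trial  <- sin(seq(0, pi, length.out = 955))  # 955 samples

# Hypothetical helper: resample a vector to n evenly spaced points
normalize_length <- function(y, n = 101) {
  approx(seq_along(y), y, n = n)$y
}

short_norm <- normalize_length(short_trial)
long_norm  <- normalize_length(long_trial)

# Both curves now have 101 points, so they can be compared point by point
length(short_norm)
length(long_norm)
max(abs(short_norm - long_norm))
```

Because both toy curves trace the same shape, the point-by-point difference after normalization should be close to zero; with real trials, this is where the interesting differences would show up.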

Interpolation in R

The great thing about R is that there’s a function for everything (or you can write your own, but that’s beside the point). In the case of interpolation, we’re going to rely on the approx() function. approx() performs linear interpolation of the data (i.e. it draws a straight line between each pair of adjacent data points and estimates the value at the new location along that line), although you can implement other interpolation methods (e.g. cubic spline or piecewise cubic Hermite) by calling their respective functions (spline() and pracma::pchip(), respectively).

I’m going to assume you’re sampling at a high enough frequency that linear vs. polynomial interpolation isn’t a huge factor. By huge factor, I mean the interpolated force values aren’t statistically or practically different from one another. I’m unaware of any papers that have empirically investigated this (or the sampling frequency at which the two methods do produce different values), but the data I’ve included here (sampled at 1000 Hz) are virtually identical for both approx() and spline(). Maybe the enterprising among you can publish a paper on it and list me in the acknowledgements. :)
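If you want to check the linear vs. spline claim on your own data, a rough sketch like the following may help. The signal here is synthetic (a smooth sine wave standing in for a force trace at an assumed 1000 Hz), so treat the result as illustrative only; real force data with sharper transients could behave differently.

```r
# Hypothetical smooth signal: 870 samples at an assumed 1000 Hz (not real force data)
fs    <- 1000
time  <- (0:869) / fs
force <- 700 + 600 * sin(2 * pi * 2 * time)

# Interpolate to 101 points with linear and cubic spline methods.
# Both functions place the n points evenly between min(time) and max(time).
linear_fit <- approx(time, force, n = 101)$y
spline_fit <- spline(time, force, n = 101)$y

# Largest absolute disagreement between the two methods, in the signal's units
max(abs(linear_fit - spline_fit))
```

For this toy signal the disagreement is a small fraction of a unit, which is in line with the "virtually identical" observation above, though it doesn't replace checking your own trials.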

Anyway, the approx function is pretty straightforward:

args(approx)
## function (x, y = NULL, xout, method = "linear", n = 50, yleft, 
##     yright, rule = 1, f = 0, ties = mean) 
## NULL

We need to provide approx() with values for the arguments x, y, and n. x and y are same-length vectors (e.g. time and force or index value and force), while n is the number of points we want to interpolate our data to. Using the data I linked above, let’s walk through the process. First, we need to import our data.
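As a quick illustration of what n does, the sketch below uses a tiny made-up vector (the force values are hypothetical). Passing n asks approx() for that many evenly spaced locations between the first and last x; spelling out the same locations via the xout argument gives the same answer.

```r
# Hypothetical force values at 11 evenly spaced samples
index <- 1:11
force <- c(650, 640, 620, 600, 640, 720, 850, 1000, 1150, 900, 300)

# Ask for 5 evenly spaced points with n ...
by_n <- approx(index, force, n = 5)

# ... or give the same locations explicitly with xout
by_xout <- approx(index, force,
                  xout = seq(min(index), max(index), length.out = 5))

by_n$x                            # 1.0 3.5 6.0 8.5 11.0
by_n$y                            # 650 610 720 1075 300
all.equal(by_n$y, by_xout$y)      # TRUE
```

The value at x = 3.5, for example, is halfway between the samples at index 3 (620) and index 4 (600), hence 610.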

# fread is from the data.table package and is much faster than read.csv when reading large amounts of data
library(data.table)

jump_data <- fread("jump_trials.csv")

# Data aren't displayed due to the size of the data frame

It’s worth noting the example data are organized in a pretty peculiar manner. I created this data set probably five years ago when I was still an R newbie, so don’t judge me too harshly. Let’s start off by putting things in a saner format.

Edit: It’s worth pointing out this step probably isn’t necessary for your data. The data in this example are in wide format, meaning each row represents a trial. In most software that spits out force-time data, trials will be arranged by columns instead. Sorry for any confusion!

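# Again, transpose() comes from data.table
jump_data <- transpose(jump_data)

# Alternatively, using base R
# (run this instead of, not after, the line above; otherwise you'll
# transpose the data straight back to wide format)
alt_transpose_1 <- data.frame(t(jump_data))

# Or piping via the tidyverse (%>% comes from magrittr)
library(magrittr)

alt_transpose_2 <- jump_data %>%
  t() %>%
  data.frame()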

With our shiny new long data in hand, let’s interpolate some new values. Remember, we need x (the locations of y, e.g. time or index location), y (the data), and n (the new length). Let’s start off by interpolating trial 1 (V1 or X1 depending on whether you used transpose() or t() above) to a length of 101 points (0% - 100% of the trial).

approx(1:length(jump_data$V1), jump_data$V1, n = 101)
## $x
##   [1]   1.00   9.69  18.38  27.07  35.76  44.45  53.14  61.83  70.52  79.21
##  [11]  87.90  96.59 105.28 113.97 122.66 131.35 140.04 148.73 157.42 166.11
##  [21] 174.80 183.49 192.18 200.87 209.56 218.25 226.94 235.63 244.32 253.01
##  [31] 261.70 270.39 279.08 287.77 296.46 305.15 313.84 322.53 331.22 339.91
##  [41] 348.60 357.29 365.98 374.67 383.36 392.05 400.74 409.43 418.12 426.81
##  [51] 435.50 444.19 452.88 461.57 470.26 478.95 487.64 496.33 505.02 513.71
##  [61] 522.40 531.09 539.78 548.47 557.16 565.85 574.54 583.23 591.92 600.61
##  [71] 609.30 617.99 626.68 635.37 644.06 652.75 661.44 670.13 678.82 687.51
##  [81] 696.20 704.89 713.58 722.27 730.96 739.65 748.34 757.03 765.72 774.41
##  [91] 783.10 791.79 800.48 809.17 817.86 826.55 835.24 843.93 852.62 861.31
## [101] 870.00
## 
## $y
##   [1]  658.515994  649.379220  640.331911  631.879330  623.584359  614.324304
##   [7]  603.097639  589.529908  573.922460  557.029491  539.891934  523.690095
##  [13]  509.438574  496.952157  484.949169  472.286676  458.389636  443.856387
##  [19]  430.012987  418.109340  408.728879  402.061446  399.570826  404.211114
##  [25]  416.259345  432.474051  448.140441  460.439238  470.200577  480.204110
##  [31]  492.336535  506.817073  522.627300  538.451562  553.828468  568.714075
##  [37]  583.211994  597.729222  613.742874  633.031957  656.844017  685.375517
##  [43]  717.312283  750.107935  781.341543  809.573299  834.244476  855.778391
##  [49]  875.476081  895.252876  916.483554  940.830454  969.165875 1000.883505
##  [55] 1034.586934 1068.650115 1101.899997 1133.434751 1162.049659 1187.555766
##  [61] 1210.867641 1232.865991 1253.954299 1274.421249 1293.447920 1309.479565
##  [67] 1321.744179 1330.096045 1334.880012 1336.699071 1336.489713 1335.620329
##  [73] 1335.907529 1338.357643 1342.899723 1349.396180 1357.966187 1369.044265
##  [79] 1382.943453 1399.448286 1417.106000 1435.371568 1453.604835 1470.312076
##  [85] 1484.890367 1496.711817 1504.299495 1505.483694 1496.409366 1473.150133
##  [91] 1431.732874 1366.362132 1269.803685 1136.728087  966.787323  767.363009
##  [97]  553.793176  346.721613  182.028978   67.890906    2.957444

You’ll notice I didn’t add a time column to the data prior to using approx(). Instead, I used the index locations of the points via 1:length(jump_data$V1). Using either index location or a user-defined time column is perfectly fine and won’t affect the interpolated y values, provided the samples are evenly spaced. You’ll also notice approx() returns interpolated values for both x and y. We’re only concerned with y, however, so you should adjust the above function call slightly:

approx(1:length(jump_data$V1), jump_data$V1, n = 101)$y
##   [1]  658.515994  649.379220  640.331911  631.879330  623.584359  614.324304
##   [7]  603.097639  589.529908  573.922460  557.029491  539.891934  523.690095
##  [13]  509.438574  496.952157  484.949169  472.286676  458.389636  443.856387
##  [19]  430.012987  418.109340  408.728879  402.061446  399.570826  404.211114
##  [25]  416.259345  432.474051  448.140441  460.439238  470.200577  480.204110
##  [31]  492.336535  506.817073  522.627300  538.451562  553.828468  568.714075
##  [37]  583.211994  597.729222  613.742874  633.031957  656.844017  685.375517
##  [43]  717.312283  750.107935  781.341543  809.573299  834.244476  855.778391
##  [49]  875.476081  895.252876  916.483554  940.830454  969.165875 1000.883505
##  [55] 1034.586934 1068.650115 1101.899997 1133.434751 1162.049659 1187.555766
##  [61] 1210.867641 1232.865991 1253.954299 1274.421249 1293.447920 1309.479565
##  [67] 1321.744179 1330.096045 1334.880012 1336.699071 1336.489713 1335.620329
##  [73] 1335.907529 1338.357643 1342.899723 1349.396180 1357.966187 1369.044265
##  [79] 1382.943453 1399.448286 1417.106000 1435.371568 1453.604835 1470.312076
##  [85] 1484.890367 1496.711817 1504.299495 1505.483694 1496.409366 1473.150133
##  [91] 1431.732874 1366.362132 1269.803685 1136.728087  966.787323  767.363009
##  [97]  553.793176  346.721613  182.028978   67.890906    2.957444
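The point about index vs. time columns can be checked with a small sketch. The force values below are made up and the 1000 Hz sampling rate is an assumption for illustration; only the vector length (870) mirrors trial 1.

```r
# Hypothetical trial: 870 samples at an assumed 1000 Hz (made-up values)
fs    <- 1000
force <- 700 + 300 * sin(seq(0, 2 * pi, length.out = 870))
index <- seq_along(force)      # 1, 2, ..., 870
time  <- (index - 1) / fs      # 0, 0.001, ..., 0.869 s

by_index <- approx(index, force, n = 101)
by_time  <- approx(time,  force, n = 101)

# The x locations differ (samples vs. seconds), but the interpolated y values match
all.equal(by_index$y, by_time$y)

# Spacing between interpolated points, in samples: (870 - 1) / 100 = 8.69
diff(by_index$x)[1]
```

The 8.69 spacing is the same step you can see in the $x output above (1.00, 9.69, 18.38, and so on).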

Let’s plot the interpolated data against the raw data.

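library(plotly)

interpolated_data <- approx(1:length(jump_data$V1), 
                            jump_data$V1, 
                            n = 101)$y

# Note: the raw trace is plotted against its sample index, while the
# interpolated trace is plotted against its own index (1 to 101), so the
# two curves share a shape but not an x-axis scale.
plot_ly() %>%
  add_lines(data = jump_data,
            x = ~1:length(V1),
            y = ~V1,
            name = "Raw") %>%
  add_lines(x = 1:length(interpolated_data),
            y = interpolated_data,
            name = "Interpolated")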
[Figure: raw vs. interpolated force-time curves for trial 1. Traces: Raw, Interpolated; x-axis: 1:length(V1); y-axis: V1]

Typically, we want to time normalize multiple trials. Thankfully, R makes this a cakewalk with lapply().

lapply_interpolation <- data.frame(lapply(jump_data,
                                          function(x) approx(1:length(x),
                                                             x,
                                                             n = 101)$y))

Or if you’re a data.table user…

data_table_interpolate <- jump_data[, lapply(.SD,
                                             function(x) approx(1:length(x),
                                                                x,
                                                                n = 101)$y)]

In either case, enjoy your shiny new time normalized data!

[Figure: time-normalized force-time curves for Trial 1 through Trial 15, each resampled to 101 points]
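Once every trial has the same length, it’s straightforward to label the rows as percentages of the movement and average across trials. The sketch below uses a made-up two-trial data frame, and it assumes shorter trials are padded with NA at the end (which is what you’d typically get after transposing ragged wide data); your own data may be laid out differently.

```r
# Hypothetical trials of unequal length; the shorter one is padded with NA
trials <- data.frame(
  V1 = c(sin(seq(0, pi, length.out = 8)), NA, NA),
  V2 = sin(seq(0, pi, length.out = 10))
)

# Time normalize each column to 101 points.
# approx() drops NA pairs by default, so the padding is ignored.
normalized <- data.frame(lapply(trials,
                                function(y) approx(seq_along(y), y, n = 101)$y))

# Label each row with the percentage of the movement it represents
normalized$percent <- 0:100

# Average across trials at each percentage point
normalized$mean_force <- rowMeans(normalized[, c("V1", "V2")])

head(normalized)
```

Whether a simple mean curve is meaningful depends on your question, so consider this a starting point rather than a recommended analysis.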

Wrapping Up

This post is a bit more off the cuff than usual and assumes some basic to intermediate proficiency in R, but hopefully it’s helpful to some of you in the sports science Twitterverse. Feel free to message me via email or Twitter if you run into problems!

Matt Sams
Analyst - Performance Science

My interests focus on maximizing athletes’ performance while managing their fatigue and injury risk.
