Clock-aligned results

February 9, 2025

I’ve had a few discussions in the last month about using the model to score data recorded during the daytime, both for a follow-up scientific study and for personal research.

Paper’s dataset

As I detailed in the Methods of the paper, the data came from 5 different sleep datasets, with the only “time” requirement being that the recording was from 5 to 15 hours long. As a consequence, almost all of the data was recorded during the typical nighttime period.

This bias for the nighttime recordings (for the training/validation/testing sets) can more easily be visualized by looking at the clock-aligned stage-time summary (figure below, top-right panel). The gray dashed line shows the percent of recordings for any given period of the night. There are a handful of recordings that extend beyond the window shown, but the stage ratios fluctuate too wildly with only a few recordings remaining in the sample.

Additionally, in the paper’s Supplementary Information, I showed the performance when stratifying the epochs across time, aligned to the beginning of every recording (figure below, lower-left panel). When the recordings are clock-aligned instead (lower-right panel), we can see that performance is nearly identical. Of note, the lower performance for REM at the beginning of the night and for N3 at the end of the night is primarily a function of the lower prevalence of those stages during those periods (see top-right panel). Otherwise, the performance is relatively consistent across time.

stage summary and results

The panels on the left are from the paper (Fig. 1d, and Supplementary Fig. S5a). The panels on the right are the same data, but clock-aligned. The clock-aligned data stops once there are fewer than 5 recordings in a period.
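
As a rough illustration of the cutoff described in the caption, the sketch below counts how many recordings cover each clock bin and keeps only the bins with at least 5 recordings. The one-hour bin size, the name `clock_bin_coverage`, and the example recordings are all assumptions made for illustration. This is not the code used to produce the figure.

```python
from collections import Counter

MIN_RECORDINGS = 5  # cutoff mentioned in the figure caption


def clock_bin_coverage(recordings, bin_minutes=60):
    """Count how many recordings cover each clock bin of the day.

    `recordings` is a list of (start_minute_of_day, duration_minutes) pairs.
    Assumes each recording is shorter than 24 hours, so no bin is counted
    twice for the same recording.
    """
    bins_per_day = (24 * 60) // bin_minutes
    counts = Counter()
    for start, duration in recordings:
        first_bin = start // bin_minutes
        last_bin = (start + duration - 1) // bin_minutes
        for b in range(first_bin, last_bin + 1):
            counts[b % bins_per_day] += 1  # wrap past midnight
    return counts


# Hypothetical data: five 8-hour recordings starting at 22:00,
# plus one 5-hour recording starting at 21:00.
recordings = [(22 * 60, 8 * 60)] * 5 + [(21 * 60, 5 * 60)]
counts = clock_bin_coverage(recordings)
kept_bins = sorted(b for b, n in counts.items() if n >= MIN_RECORDINGS)
```

With these made-up recordings, the 21:00 bin is covered by only one recording and would be dropped from a clock-aligned plot, while the bins from 22:00 to 05:00 are kept.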

Future improvements

If the model is going to be used outside the normal nighttime window, then it will need to be trained and tested on data from those periods. To that end, I plan to make some changes in the future.

Find more ECG sleep data to “fill out” the clock.
- I won’t be able to change the current dataset, but this could be a second revision to the dataset. The point here would be to have every hour in a 24-hr day covered with some data, even if the majority of the data is still during the nighttime.
Adjust the clock input for the model.
- The current input for the clock time is a single number representing the number of days +/- the nearest midnight (i.e., the “midnight offset”). This would typically mean the values range from -0.5 (noon before) to 0.5 (noon after).
- To prevent tricky discontinuities, I’m going to change this to two numbers which represent the circular encoded time. This would look like (cos(2*pi*t), sin(2*pi*t)), where t would still be the same “midnight offset” value used previously.
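
The two clock inputs described above can be sketched as follows. This is a minimal illustration rather than the model's actual preprocessing: the function names `midnight_offset` and `circular_encode` are hypothetical, and assigning exactly noon to -0.5 (rather than 0.5) is an arbitrary choice made here.

```python
import math
from datetime import datetime


def midnight_offset(ts: datetime) -> float:
    """Days relative to the nearest midnight, in the range [-0.5, 0.5)."""
    day_fraction = (ts.hour * 3600 + ts.minute * 60 + ts.second) / 86400.0
    # From noon onward the upcoming midnight is nearer, so the offset is negative.
    # Exactly noon is mapped to -0.5 here (an arbitrary tie-break).
    return day_fraction - 1.0 if day_fraction >= 0.5 else day_fraction


def circular_encode(t: float) -> tuple[float, float]:
    """Map a midnight offset t (in days) to (cos(2*pi*t), sin(2*pi*t))."""
    angle = 2.0 * math.pi * t
    return (math.cos(angle), math.sin(angle))


late_evening = midnight_offset(datetime(2025, 2, 9, 22, 0))   # two hours before midnight
early_morning = midnight_offset(datetime(2025, 2, 10, 2, 0))  # two hours after midnight
noon_before = circular_encode(-0.5)
noon_after = circular_encode(0.5)
```

The single-number offset jumps from 0.5 to -0.5 at noon, whereas `noon_before` and `noon_after` land on the same point of the unit circle (up to floating-point rounding), which is the discontinuity the circular encoding is meant to remove.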

Time manipulation

To drive home the point about the need for a better encoding for the clock time, I shifted the testing set recording start times. In the figure below, you can see that the performance (and predicted sleep stage ratios) is relatively constant in the expected normal midnight offset range [-0.5, 0.5]. However, beyond that, it changes drastically. Note that at every whole-day shift, forward or backward, the performance should be the same, since the clock time is again the same.

time shift results

The midnight offset for each of the recordings in the testing set was shifted either forward or backward in time. On the left is the performance. On the right is the predicted sleep stage ratio.
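
To see why a whole-day shift is a problem for the single-number input but not for a circular encoding, here is a small sketch under stated assumptions. The offsets are made-up values, and `shift_offsets` and `circular_encode` are hypothetical helpers. It only demonstrates a property of the inputs and says nothing about how the model itself would perform.

```python
import math


def circular_encode(t):
    """Map a midnight offset t (in days) to (cos(2*pi*t), sin(2*pi*t))."""
    angle = 2.0 * math.pi * t
    return (math.cos(angle), math.sin(angle))


def shift_offsets(offsets, shift_days):
    """Shift every midnight offset forward (+) or backward (-) by shift_days."""
    return [t + shift_days for t in offsets]


# Hypothetical midnight offsets for three recording start times.
offsets = [-0.10, -0.05, 0.02]
shifted = shift_offsets(offsets, 1.0)  # exactly one day later, same clock time

# The raw single-number inputs now sit outside the usual [-0.5, 0.5] range...
raw_out_of_range = all(abs(t) > 0.5 for t in shifted)

# ...while the circular encodings match the unshifted ones up to rounding.
encodings_match = all(
    math.isclose(a, b, abs_tol=1e-9)
    for t0, t1 in zip(offsets, shifted)
    for a, b in zip(circular_encode(t0), circular_encode(t1))
)
```

A non-integer shift (say, 0.25 days) changes both representations, as it should, since the clock time really is different in that case.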
