Currently, given a (B, T, C, D) input, we compress the signal via a sequence of layers.
Are these the best approaches?
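A minimal sketch of what this compression could look like. The actual layers are unspecified in these notes, so each "layer" here is just a strided temporal average-pool that halves T; this is an assumed stand-in, not the real stack.

```python
import numpy as np

def compress(x: np.ndarray, n_layers: int = 3) -> np.ndarray:
    """Halve the time axis n_layers times via average pooling.

    Shapes: B=batch, T=time, C=EEG channels, D=feature dim.
    """
    for _ in range(n_layers):
        B, T, C, D = x.shape
        T2 = T // 2
        # Drop a trailing odd sample, then pool pairs of time steps.
        x = x[:, : T2 * 2].reshape(B, T2, 2, C, D).mean(axis=2)
    return x

x = np.random.randn(4, 1024, 22, 8)
y = compress(x)  # time axis halved three times: 1024 -> 128
```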
So far, with end-to-end training, Wav2vec2 + ShallowNet achieves ~65% on a single subject/single session but drops to ~45% under proper evaluation.
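"Proper evaluation" presumably means no subject overlap between train and test. A minimal sketch of a subject-wise split, with a hypothetical recording-to-subject mapping for illustration:

```python
import random

# Hypothetical metadata: recording id -> subject id (illustrative only).
recordings = {f"rec{i}": f"subj{i % 5}" for i in range(20)}

def subject_wise_split(recordings, test_frac=0.2, seed=0):
    """Hold out whole subjects so no subject leaks across splits."""
    subjects = sorted(set(recordings.values()))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, int(len(subjects) * test_frac))
    test_subjects = set(subjects[:n_test])
    train = [r for r, s in recordings.items() if s not in test_subjects]
    test = [r for r, s in recordings.items() if s in test_subjects]
    return train, test

train, test = subject_wise_split(recordings)
# No subject appears in both splits.
assert not {recordings[r] for r in train} & {recordings[r] for r in test}
```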
- Uses Temple University Hospital (TUH) EEG corpus
- Selected only 22 channels
- Processed 20,000 EEG recordings (19,000 train / 1,000 val; ~5,656 hrs)
- Split each EEG signal into N chunks, each of dimension C × T.
- Each chunk is treated as an independent sample.
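The chunking step above can be sketched as follows; the non-overlapping split and the drop-the-remainder policy are assumptions, as the notes don't specify them:

```python
import numpy as np

def chunk_recording(signal: np.ndarray, T: int) -> np.ndarray:
    """Split a (C, total_T) recording into N non-overlapping (C, T) chunks.

    Any trailing samples that don't fill a full chunk are dropped.
    Returns an array of shape (N, C, T), one independent sample per chunk.
    """
    C, total_T = signal.shape
    N = total_T // T
    return signal[:, : N * T].reshape(C, N, T).transpose(1, 0, 2)

rec = np.random.randn(22, 2500)       # 22 channels, e.g. 10 s at 250 Hz
chunks = chunk_recording(rec, T=500)  # -> (5, 22, 500)
```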
- Introduces AdaCT: adapters that convert time-series data into spatio-temporal 2D pseudo-images or text.
- Then uses pretrained image/text models.
- No direct EEG pretraining is involved.
- Looks like questionable work (I didn't even understand how they convert it to text)
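The paper's exact conversion is unclear from these notes; one plausible reading of "spatio-temporal 2D pseudo-image" is stacking per-channel magnitude spectrograms. This sketch is an assumption, not AdaCT's actual method:

```python
import numpy as np

def pseudo_image(signal: np.ndarray, win: int = 64, hop: int = 32) -> np.ndarray:
    """Turn a (C, T) signal into a (C, win//2+1, n_frames) 'pseudo-image'.

    Each channel becomes a magnitude spectrogram via framed rFFT; channels
    stack like image planes. Purely illustrative guess at the conversion.
    """
    C, T = signal.shape
    n_frames = 1 + (T - win) // hop
    frames = np.stack(
        [signal[:, i * hop : i * hop + win] for i in range(n_frames)], axis=-1
    )                                           # (C, win, n_frames)
    return np.abs(np.fft.rfft(frames, axis=1))  # (C, win//2+1, n_frames)

img = pseudo_image(np.random.randn(22, 1024))   # -> (22, 33, 31)
```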