Stephen M
Abstraction:
Correlation between {Original DSM} and {Hidden Layer DSM}
Learning Speed:
Number of epochs to learn
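The two measures above can be sketched in a few lines. This is only an illustration, not the original analysis: the dissimilarity measure (1 minus Pearson correlation between item patterns), the use of the upper triangle of each DSM, the learning criterion, and all function and variable names are assumptions introduced here.

```python
import numpy as np

def dsm(patterns):
    """Dissimilarity matrix over items (rows): 1 - Pearson correlation (assumed metric)."""
    return 1.0 - np.corrcoef(patterns)

def abstraction_score(original_patterns, hidden_patterns):
    """Correlate the off-diagonal entries of the original DSM and the hidden-layer DSM."""
    d_orig = dsm(original_patterns)
    d_hid = dsm(hidden_patterns)
    upper = np.triu_indices_from(d_orig, k=1)  # each item pair counted once
    return float(np.corrcoef(d_orig[upper], d_hid[upper])[0, 1])

def epochs_to_learn(error_per_epoch, criterion):
    """First epoch (1-based) at which error falls to the criterion, or None if never."""
    below = np.flatnonzero(np.asarray(error_per_epoch) <= criterion)
    return int(below[0]) + 1 if below.size else None

rng = np.random.default_rng(0)
items = rng.normal(size=(8, 36))      # 8 hypothetical items x 36 features
hidden = rng.normal(size=(8, 60))     # hypothetical hidden-layer activations

identical_score = abstraction_score(items, items)   # a layer that copies the input
random_score = abstraction_score(items, hidden)     # an unrelated layer
speed = epochs_to_learn([0.9, 0.5, 0.2, 0.1], criterion=0.25)
```

A hidden layer whose similarity structure matches the reference structure scores close to 1; an unrelated layer scores near 0.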
All architectures used a single framework, consisting of 12 pairs of input and output units per modality (connected on a one-to-one basis with a frozen weight of 6 and a fixed bias of −3 for the output units), 60 hidden units and 3,132 bidirectional connections with learnable weights.
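To make the frozen input-to-output link concrete, here is a minimal sketch of what a weight of 6 and a bias of −3 do to a binary input. It assumes a logistic activation function and ignores any other input the output units receive; both are simplifications made here, not details taken from the slide.

```python
import numpy as np

N_PER_MODALITY = 12    # input/output pairs per modality (from the slide)
FROZEN_WEIGHT = 6.0    # one-to-one input -> output weight (from the slide)
OUTPUT_BIAS = -3.0     # fixed bias on each output unit (from the slide)

def logistic(x):
    # Assumed activation function; the slide does not name one.
    return 1.0 / (1.0 + np.exp(-x))

def output_from_input(input_units):
    """Output activation driven only by the paired input unit."""
    x = np.asarray(input_units, dtype=float)
    return logistic(FROZEN_WEIGHT * x + OUTPUT_BIAS)

pattern = np.zeros(N_PER_MODALITY)
pattern[:5] = 1.0      # hypothetical pattern with five active features
out = output_from_input(pattern)
```

Under these assumptions an active input gives a net input of 3 (output about 0.95) and an inactive input gives a net input of −3 (output about 0.05), so the frozen link pushes each output unit toward copying its paired input.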
24 units
60 units
Ultimate Winner
"The model environment included four orthogonal structures: one distinct unimodal structure (based on five perfectly correlated or anti-correlated features within a single modality) per modality (unimodal M1, unimodal M2 and unimodal M3) and a multimodal structure"
Hopfield Networks (Associative Memory)
"This Control Layer sent trainable unidirectional connections to all units, providing a simple way of implementing control"
"This resulted in three Hidden Layer 1 regions in the Spokes-Only, Shallow Multimodal Hub, ..."
...
Empirical Phenomena
Martin, A., Haxby, J. V., Lalonde, F. M., Wiggs, C. L. & Ungerleider, L. G. Discrete cortical regions associated with knowledge of color and knowledge of action. Science 270, 102–105 (1995).
Acorn for squirrel
has stripes for zebra
horse for zebra
animal for squirrel
Both produce errors of omission, but differ in the other kinds of errors they make
Robson, H., Sage, K. & Ralph, M. A. L. Wernicke’s aphasia reflects a combination of acoustic-phonological and semantic control deficits: A case-series comparison of Wernicke’s aphasia, semantic dementia and semantic aphasia. Neuropsychologia 50, 266–275 (2012).
Omission: inactivation of correct feature
Context appropriate: activation of incorrect task-relevant feature
Intrusion: activation of task-irrelevant feature
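The three error types can be expressed as a small classifier over output units. This is a sketch under assumptions: the 0.5 activation threshold, the boolean mask marking task-relevant features, and the function name are all hypothetical.

```python
import numpy as np

def classify_errors(target, output, task_relevant, threshold=0.5):
    """Return the indices of output units falling under each error type."""
    target = np.asarray(target, dtype=bool)
    active = np.asarray(output, dtype=float) > threshold   # assumed threshold
    relevant = np.asarray(task_relevant, dtype=bool)
    wrong_active = active & ~target
    return {
        # correct feature not activated
        "omission": np.flatnonzero(target & ~active).tolist(),
        # incorrect feature activated, but within the task-relevant set
        "context_appropriate": np.flatnonzero(wrong_active & relevant).tolist(),
        # feature activated outside the task-relevant set
        "intrusion": np.flatnonzero(wrong_active & ~relevant).tolist(),
    }

errors = classify_errors(
    target=[1, 1, 0, 0, 0, 0],
    output=[0.9, 0.1, 0.8, 0.1, 0.7, 0.0],
    task_relevant=[1, 1, 1, 1, 0, 0],
)
```

In this toy example unit 1 is an omission, unit 2 is a context-appropriate error, and unit 4 is an intrusion.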
"Recent evidence suggests that functional connectivity between the ATL hub and modality-specific regions changes depending on the information required for a task"
FFA, ATL, IPL, PCC
"activity at the final time point of each trial in context 1 (M1 ‘face’ input and M2 output, or ‘status’) and context 2 (M1 ‘face’ input and M3 output, or ‘trait’) was concatenated in a different random order per model run to create a time series for each voxel, per context. Each run of the model is treated as a different participant. To collapse across units within a region, a principal component analysis was performed per region for each context in each run, analogous to extracting a region-of-interest (ROI) time course for a psychophysiological interaction analysis as in Wang et al.15. The correlation between the time course in Hidden Layer 2 and each spoke region was calculated"
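The analysis quoted above can be sketched roughly as follows: collapse each region to its first principal component across trials, then correlate the resulting time courses. The simulated data, region sizes, noise level, and function names are assumptions for illustration only; the original analysis may differ in its details.

```python
import numpy as np

def first_component_timecourse(activity):
    """activity: (n_trials, n_units). Returns scores on the first principal component."""
    centered = activity - activity.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]

def connectivity(hub_activity, spoke_activity):
    """Correlation between the hub and spoke ROI time courses."""
    hub = first_component_timecourse(hub_activity)
    spoke = first_component_timecourse(spoke_activity)
    return float(np.corrcoef(hub, spoke)[0, 1])

rng = np.random.default_rng(1)
n_trials = 40
order = rng.permutation(n_trials)        # random trial order for this "participant"
shared = rng.normal(size=n_trials)       # hypothetical signal shared by hub and spoke

def simulate_region(n_units):
    # Each unit carries the shared signal with its own gain, plus a little noise.
    gains = rng.uniform(0.5, 1.5, size=n_units)
    return np.outer(shared, gains) + 0.05 * rng.normal(size=(n_trials, n_units))

hub_region = simulate_region(20)[order]      # stands in for Hidden Layer 2
spoke_region = simulate_region(12)[order]    # stands in for one spoke region
r = connectivity(hub_region, spoke_region)
```

The sign of a principal component is arbitrary, so only the magnitude of `r` is meaningful in this sketch unless the components are sign-aligned first. Repeating the calculation per context and per model run would give the per-participant values that are then compared across contexts.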
The optimal network had 4 characteristics
This model accounted for:
Why multimodal hub?
Why depth?
Why shortcut connections?
Why separate representation and control systems?
Nitpicky: