Reverse Engineering Cortical Architecture

Stephen M

"We employ a reverse-engineering approach to illuminate the neurocomputational building blocks that combine to support controlled semantic cognition"

Outline

  • Basic Premise
  • Architectures and inputs
  • Results
  • Discussion
  • Thoughts

Premise:

"By systematically varying the structure within a computer simulation and assessing the functional consequences, once can establish the architectural elements critical to the targeted functions"

Conceptual system must do 2 tasks:

1) Abstract over episodes

2) Adapt representations to suit immediate demands

Conceptual system must do 2 tasks:

1) Abstract over episodes

2) Adapt representations to suit immediate demands

  • In one view, concepts reflect clusters in the high-order covariance structure of experience
    • Critically, this information must be extracted over episodes
    • The features relevant to moving a piano conflict with the features relevant to playing it

Main Idea

  • Take neural networks of different architectures
  • Train networks to "complete the pattern"
    • Two cases: without control (complete entire pattern), and with control (output specific modality)
  • Test for ability to abstract
  • Look at learning speed
  • Lesion model
  • Show other neuro-psych phenomenon

Abstraction:

     Correlation between {Original DSM} and       {Hidden Layer DSM}

 

Learning Speed:

     # Epochs to learn

Architectures and Inputs

All architectures used a single framework, consisting of 12 pairs of input and output units per modality (connected on a one-to-one basis with a frozen weight of 6 and a fixed bias of −3 for the output units), 60 hidden units and 3,132 bidirectional connections with learnable weights.

24 units

60 units

Ultimate Winner

Concepts

  • The architectures are trained on 16 concepts ...
  • Method motivates design of concepts

 

"The model environment included four orthogonal structures: one distinct unimodal structure (based on five perfectly correlated or anti-correlated features within a single modality) per modality (unimodal M1, unimodal M2 and unimodal M3) and a multimodal structure"

C \approx \frac{n}{2 \cdot log_2 n}

Hopfield Networks (Associative Memory)

= 256

Adding Control

"This Control Layer sent trainable unidirectional connections to all units, providing a simple way of implementing control"

Results

Without Control

With Control

With Control

But ...

"This resulted in three Hidden Layer 1 regions in the Spokes-Only, Shallow Multimodal Hub, ..."

...

Of course deep hubs performed better!

*

*

Learning Times

Without Control

Without Control

Location of Control

  • Supports that control should be on spokes layer

Comment

  • Maybe look at what the values are for control. If M3 "neuron" is "off", then we know there should be no output for M3 spoke so just have very high negative weights

Phase 3

Empirical Phenomena

Anatomy

  • Unimodal perceptual representations progress to multimodal conceptual representations in a graded fashion 
  • Sparse long range short-cut connections
  • Suggests systems for semantic control should connect with more posterior regions, distal to ATL
    • control demands act on spoke representations

Martin, A., Haxby, J. V., Lalonde, F. M., Wiggs, C. L. & Ungerleider, L. G. Discrete cortical regions associated with knowledge of color and knowledge of action. Science 270, 102–105 (1995).

Functional Imaging

  • Given input in M1, need to activate M2 or M3
  • Corresponding hidden layer was activated more

Neuropsychological Syndromes

  • Semantic Aphasia
    • Context inappropriate intrusions
  • Semantic Dementia
    • Context appropriate, semantically incorrect

Acorn for squirrel

has stripes for zebra

horse for zebra

animal for squirrel

Both produce errors of omission, but differ in mistakes

Robson, H., Sage, K. & Ralph, M. A. L. Wernicke’s aphasia reflects a combination of acoustic-phonological and semantic control deficits: A case-series comparison of Wernicke’s aphasia, semantic dementia and semantic aphasia. Neuropsychologia 50, 266–275 (2012).

 

  • SA -- Added noise to control units
  • SD -- Removed proportion of connections within the multimodal hub
  • Show you can lesion to get the right proportion of errors
  • Semantic Aphasia
    • Context inappropriate intrusions
  • Semantic Dementia
    • Context appropriate, semantically incorrect

Omission: inactivation of correct feature

Context appropriate: activation of incorrect task-relevant feature

Intrusion: activation of task-irrelevent feature

Functional Connectivity

"Recent evidence suggests that functional connectivity between the ATL hub and modality-specific regions changes depending on the information required for a task"

FFA

ATL

IPL

PCC

Judge:

Social Status

FFA

ATL

IPL

PCC

Judge:

Trait Processing

Reverse Engineered Model

  • Stimuli presented in M1, but output was either M2 or M3
  • M1 to hub connectivity was the same, but connectivity between two outupt spokes varied significantly

"activity at the final time point of each trial in context 1 (M1 ‘face’ input and M2 output, or ‘status’) and context 2 (M1 ‘face’ input and M3 output, or ‘trait’) was concatenated in a different random order per model run to create a time series for each voxel, per context. Each run of the model is treated as a different participant. To collapse across units within a region, a principal component analysis was performed per region for each context in each run, analogous to extracting a region-of-interest (ROI) time course for a psychophysiological interaction analysis as in Wang et al.15. The correlation between the time course in Hidden Layer 2 and each spoke region was calculated"

Discussion

Main Points

The optimal network had 4 characteristics

  • Multimodal hub
  • Deep architecture
  • Sparse shortcut connections
  • Control operating on shallow rather than deep components

 

This model accounted for:

  • "coarse anatomy of temporal cortex"
  • qualitative differences in error patterns of SD verses SA
  • differential functional connectivity

Brief Summary

Why multimodal hub?

  • helps to acquire internal representation across modalities and episodes
  • If direct pathways exist, little pressure to use connections across all modalities

Why depth?

  • Note, only made difference with control
  • hub can be sufficiently insulated from contextual information to acquire context invariant representations

Why shortcut connections?

  • vanishing gradients

Why separate representation and control systems?

  • "evolution’s way of promoting the acquisition of deep conceptual representations while preserving the flexibility required to think and act as the situation demands"

Thoughts

Thoughts

  • I think that the training time is pretty meaningless, and the abstraction ability is predicated on how abstraction is defined
  • Good overall idea

Nitpicky:

  • Code is a bit rough
  • Weird supplemental ...

Future Directions

  • Instead of abstraction, test generalization (or usage in new contexts)
  • Use RSA to look for representations in each layer in the brain
  • Use Hebbian update rule

Paper Pres: Reverse Engineering Cortical Architecture

By smazurchuk

Paper Pres: Reverse Engineering Cortical Architecture

This is a paper I presented for lab meeting going over a paper that created different computational models. This paper was originally seen as a poster at SNL

  • 73