Multimodal Neurons in Artificial Neural Networks

Distill

Authors and Affiliations

Gabriel Goh (OpenAI)
Nick Cammarata † (OpenAI)
Chelsea Voss † (OpenAI)
Shan Carter (Observable)
Michael Petrov (OpenAI)
Ludwig Schubert
Alec Radford (OpenAI)
Chris Olah

Published

March 4, 2021

DOI

10.23915/distill.00030

Acknowledgments

We are deeply grateful to Sandhini Agarwal, Daniela Amodei, Dario Amodei, Tom Brown, Jeff Clune, Steve Dowling, Gretchen Krueger, Brice Menard, Reiichiro Nakano, Aditya Ramesh, Pranav Shyam, Ilya Sutskever and Martin Wattenberg.

Author Contributions

Gabriel Goh: Research lead. Gabriel Goh first discovered multimodal neurons, sketched out the project direction and paper outline, and did much of the conceptual and engineering work that allowed the team to investigate the models in a scalable way. This included developing tools for understanding how concepts were built up and decomposed (which were applied to emotion neurons), developing zero-shot neuron search (which made neurons easy to discover), and working with Michael Petrov on porting CLIP to Microscope. Subsequently developed faceted feature visualization and text feature visualization.

Chris Olah: Worked with Gabe on the overall framing of the article, actively mentored each member of the team through their work, providing both high- and low-level contributions to their sections, and contributed to the text of much of the article, setting the stylistic tone. He worked with Gabe on better understanding the relevant neuroscience literature. Additionally, he wrote the sections on region neurons and developed diversity feature visualization, which Gabe used to create faceted feature visualization.

Alec Radford: Developed CLIP. First observed that CLIP was learning to read. Advised Gabriel Goh on project direction on a weekly basis. Upon the discovery that CLIP was using text to classify images, proposed typographical adversarial attacks as a promising research direction.

Shan Carter: Worked on the initial investigation of CLIP with Gabriel Goh. Built multimodal activation atlases to understand the space of multimodal representations and geometry, and neuron atlases, which potentially helped the arrangement and display of neurons. Provided much useful advice on the visual presentation of ideas, and helped with many aspects of visual design.

Michael Petrov: Worked on the initial investigation of multimodal neurons by implementing and scaling dataset examples. Discovered, with Gabriel Goh, the original “Spider-Man” multimodal neuron in the dataset examples, and many more multimodal neurons. Contributed substantially to the engineering of Microscope, both early on and at the end, including helping Gabriel Goh with the difficult technical challenges of porting Microscope to a different backend.

Chelsea Voss†: Investigated the typographical attack phenomenon, both via linear probes and zero-shot, confirming that the attacks were indeed real and state of the art. Proposed and successfully found “in-the-wild” attacks in the zero-shot classifier. Subsequently wrote the section “typographical attacks”. Upon completion of this part of the project, investigated the responses of neurons to rendered text of dictionary words. Also assisted with the organization of neurons into neuron cards.

Nick Cammarata†: Drew the connection between multimodal neurons in neural networks and multimodal neurons in the brain, which became the overall framing of the article. Created the conditional probability plots (regional, Trump, mental health), labeling more than 1500 images; discovered that negative pre-ReLU activations are often interpretable; and discovered that neurons sometimes contain a distinct regime change between medium and strong activations. Wrote the identity section and the emotion sections, building off Gabriel’s discovery of emotion neurons and discovering that “complex” emotions can be broken down into simpler ones. Edited the overall text of the article and built infrastructure allowing the team to collaborate in Markdown with embeddable components.

Ludwig Schubert: Helped with general infrastructure.

† equal contributors

Discussion and Review

Review 1 - Anonymous

Review 2 - Anonymous

Review 3 - Anonymous
