Semantic reconstruction of continuous language from non-invasive brain recordings

References

  1. Anumanchipalli, G. K., Chartier, J. & Chang, E. F. Speech synthesis from neural decoding of spoken sentences. Nature 568, 493–498 (2019).

  2. Pasley, B. N. et al. Reconstructing speech from human auditory cortex. PLoS Biol. 10, e1001251 (2012).

  3. Willett, F. R., Avansino, D. T., Hochberg, L. R., Henderson, J. M. & Shenoy, K. V. High-performance brain-to-text communication via handwriting. Nature 593, 249–254 (2021).

  4. Moses, D. A. et al. Neuroprosthesis for decoding speech in a paralyzed person with anarthria. N. Engl. J. Med. 385, 217–227 (2021).

  5. Huth, A. G., de Heer, W. A., Griffiths, T. L., Theunissen, F. E. & Gallant, J. L. Natural speech reveals the semantic maps that tile human cerebral cortex. Nature 532, 453–458 (2016).

  6. de Heer, W. A., Huth, A. G., Griffiths, T. L., Gallant, J. L. & Theunissen, F. E. The hierarchical cortical organization of human speech processing. J. Neurosci. 37, 6539–6557 (2017).

  7. Broderick, M. P., Anderson, A. J., Di Liberto, G. M., Crosse, M. J. & Lalor, E. C. Electrophysiological correlates of semantic dissimilarity reflect the comprehension of natural, narrative speech. Curr. Biol. 28, 803–809 (2018).

  8. Caucheteux, C. & King, J.-R. Brains and algorithms partially converge in natural language processing. Commun. Biol. 5, 134 (2022).

  9. Farwell, L. A. & Donchin, E. Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials. Electroencephalogr. Clin. Neurophysiol. 70, 510–523 (1988).

  10. Mitchell, T. M. et al. Predicting human brain activity associated with the meanings of nouns. Science 320, 1191–1195 (2008).

  11. Pereira, F. et al. Toward a universal decoder of linguistic meaning from brain activation. Nat. Commun. 9, 963 (2018).

  12. Dash, D., Ferrari, P. & Wang, J. Decoding imagined and spoken phrases from non-invasive neural (MEG) signals. Front. Neurosci. 14, 290 (2020).

  13. Logothetis, N. K. The underpinnings of the BOLD functional magnetic resonance imaging signal. J. Neurosci. 23, 3963–3971 (2003).

  14. Jain, S. & Huth, A. G. Incorporating context into language encoding models for fMRI. In Advances in Neural Information Processing Systems 31 6629–6638 (NeurIPS, 2018).

  15. Toneva, M. & Wehbe, L. Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain). In Advances in Neural Information Processing Systems 32 14928–14938 (NeurIPS, 2019).

  16. Schrimpf, M. et al. The neural architecture of language: integrative modeling converges on predictive processing. Proc. Natl Acad. Sci. USA 118, e2105646118 (2021).

  17. LeBel, A., Jain, S. & Huth, A. G. Voxelwise encoding models show that cerebellar language representations are highly conceptual. J. Neurosci. 41, 10341–10355 (2021).

  18. Naselaris, T., Prenger, R. J., Kay, K. N., Oliver, M. & Gallant, J. L. Bayesian reconstruction of natural images from human brain activity. Neuron 63, 902–915 (2009).

  19. Nishimoto, S. et al. Reconstructing visual experiences from brain activity evoked by natural movies. Curr. Biol. 21, 1641–1646 (2011).

  20. Radford, A., Narasimhan, K., Salimans, T. & Sutskever, I. Improving language understanding by generative pre-training. Preprint at OpenAI https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf (2018).

  21. Tillmann, C. & Ney, H. Word reordering and a dynamic programming beam search algorithm for statistical machine translation. Comput. Linguist. 29, 97–133 (2003).

  22. Lerner, Y., Honey, C. J., Silbert, L. J. & Hasson, U. Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. J. Neurosci. 31, 2906–2915 (2011).

  23. Binder, J. R. & Desai, R. H. The neurobiology of semantic memory. Trends Cogn. Sci. 15, 527–536 (2011).

  24. Deniz, F., Nunez-Elizalde, A. O., Huth, A. G. & Gallant, J. L. The representation of semantic information across human cerebral cortex during listening versus reading is invariant to stimulus modality. J. Neurosci. 39, 7722–7736 (2019).

  25. Gauthier, J. & Ivanova, A. Does the brain represent words? An evaluation of brain decoding studies of language understanding. In 2018 Conference on Cognitive Computational Neuroscience 1–4 (CCN, 2018).

  26. Fedorenko, E. & Thompson-Schill, S. L. Reworking the language network. Trends Cogn. Sci. 18, 120–126 (2014).

  27. Fodor, J. A. The Modularity of Mind (MIT Press, 1983).

  28. Keller, T. A., Carpenter, P. A. & Just, M. A. The neural bases of sentence comprehension: a fMRI examination of syntactic and lexical processing. Cereb. Cortex 11, 223–237 (2001).

  29. Geschwind, N. The organization of language and the brain. Science 170, 940–944 (1970).

  30. Barsalou, L. W. Grounded cognition. Annu. Rev. Psychol. 59, 617–645 (2008).

  31. Bunzeck, N., Wuestenberg, T., Lutz, K., Heinze, H.-J. & Jancke, L. Scanning silence: mental imagery of complex sounds. Neuroimage 26, 1119–1127 (2005).

  32. Martin, S. et al. Decoding spectrotemporal features of overt and covert speech from the human cortex. Front. Neuroeng. 7, 14 (2014).

  33. Naselaris, T., Olman, C. A., Stansbury, D. E., Ugurbil, K. & Gallant, J. L. A voxel-wise encoding model for early visual areas decodes mental images of remembered scenes. Neuroimage 105, 215–228 (2015).

  34. Silbert, L. J., Honey, C. J., Simony, E., Poeppel, D. & Hasson, U. Coupled neural systems underlie the production and comprehension of naturalistic narrative speech. Proc. Natl Acad. Sci. USA 111, E4687–E4696 (2014).

  35. Fairhall, S. L. & Caramazza, A. Brain regions that represent amodal conceptual knowledge. J. Neurosci. 33, 10552–10558 (2013).

  36. Popham, S. F. et al. Visual and linguistic semantic representations are aligned at the border of human visual cortex. Nat. Neurosci. 24, 1628–1636 (2021).

  37. Çukur, T., Nishimoto, S., Huth, A. G. & Gallant, J. L. Attention during natural vision warps semantic representation across the human brain. Nat. Neurosci. 16, 763–770 (2013).

  38. Kiremitçi, I. et al. Attentional modulation of hierarchical speech representations in a multitalker environment. Cereb. Cortex 31, 4986–5005 (2021).

  39. Mesgarani, N. & Chang, E. F. Selective cortical representation of attended speaker in multi-talker speech perception. Nature 485, 233–236 (2012).

  40. Horikawa, T. & Kamitani, Y. Attention modulates neural representation to render reconstructions according to subjective appearance. Commun. Biol. 5, 34 (2022).

  41. Rainey, S., Martin, S., Christen, A., Mégevand, P. & Fourneret, E. Brain recording, mind-reading, and neurotechnology: ethical issues from consumer devices to brain-based speech decoding. Sci. Eng. Ethics 26, 2295–2311 (2020).

  42. Kaplan, J. et al. Scaling laws for neural language models. Preprint at arXiv https://doi.org/10.48550/arXiv.2001.08361 (2020).

  43. White, B. R. & Culver, J. P. Quantitative evaluation of high-density diffuse optical tomography: in vivo resolution and mapping performance. J. Biomed. Opt. 15, 026006 (2010).

  44. Eggebrecht, A. T. et al. A quantitative spatial comparison of high-density diffuse optical tomography and fMRI cortical mapping. Neuroimage 61, 1120–1128 (2012).

  45. Makin, J. G., Moses, D. A. & Chang, E. F. Machine translation of cortical activity to text with an encoder–decoder framework. Nat. Neurosci. 23, 575–582 (2020).

  46. Orsborn, A. L. et al. Closed-loop decoder adaptation shapes neural plasticity for skillful neuroprosthetic control. Neuron 82, 1380–1393 (2014).

  47. Goering, S. et al. Recommendations for responsible development and application of neurotechnologies. Neuroethics 14, 365–386 (2021).

  48. Levy, C. Sintel (Blender Foundation, 2010).

  49. Fedorenko, E., Hsieh, P.-J., Nieto-Castañón, A., Whitfield-Gabrieli, S. & Kanwisher, N. New method for fMRI investigations of language: defining ROIs functionally in individual subjects. J. Neurophysiol. 104, 1177–1194 (2010).

  50. Yuan, J. & Liberman, M. Speaker identification on the SCOTUS corpus. J. Acoust. Soc. Am. 123, 3878 (2008).

  51. Boersma, P. & Weenink, D. Praat: doing phonetics by computer (University of Amsterdam, 2014).

  52. Casarosa, E. La Luna (Walt Disney Pictures; Pixar Animation Studios, 2011).

  53. Sweetland, D. Presto (Walt Disney Pictures; Pixar Animation Studios, 2008).

  54. Sohn, P. Partly Cloudy (Walt Disney Pictures; Pixar Animation Studios, 2009).

  55. Jenkinson, M. & Smith, S. A global optimisation method for robust affine registration of brain images. Med. Image Anal. 5, 143–156 (2001).

  56. Dale, A. M., Fischl, B. & Sereno, M. I. Cortical surface-based analysis. I. Segmentation and surface reconstruction. Neuroimage 9, 179–194 (1999).

  57. Gao, J. S., Huth, A. G., Lescroart, M. D. & Gallant, J. L. Pycortex: an interactive surface visualizer for fMRI. Front. Neuroinform. 9, 23 (2015).

  58. Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).

  59. Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).

  60. Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 8024–8035 (NeurIPS, 2019).

  61. Wolf, T. et al. Transformers: state-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations 38–45 (Association for Computational Linguistics, 2020).

  62. Holtzman, A., Buys, J., Du, L., Forbes, M. & Choi, Y. The curious case of neural text degeneration. In 8th International Conference on Learning Representations 1–16 (ICLR, 2020).

  63. Papineni, K., Roukos, S., Ward, T. & Zhu, W.-J. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics 311–318 (Association for Computational Linguistics, 2002).

  64. Banerjee, S. & Lavie, A. METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization 65–72 (Association for Computational Linguistics, 2005).

  65. Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q. & Artzi, Y. BERTScore: evaluating text generation with BERT. In 8th International Conference on Learning Representations 1–43 (ICLR, 2020).

  66. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 57, 289–300 (1995).

  67. Faul, F., Erdfelder, E., Lang, A.-G. & Buchner, A. G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav. Res. Methods 39, 175–191 (2007).

  68. Pennington, J., Socher, R. & Manning, C. D. GloVe: global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing 1532–1543 (Association for Computational Linguistics, 2014).

  69. Warriner, A. B., Kuperman, V. & Brysbaert, M. Norms of valence, arousal, and dominance for 13,915 English lemmas. Behav. Res. Methods 45, 1191–1207 (2013).

  70. Brysbaert, M., Warriner, A. B. & Kuperman, V. Concreteness ratings for 40 thousand generally known English word lemmas. Behav. Res. Methods 46, 904–911 (2014).

  71. Levy, R. Expectation-based syntactic comprehension. Cognition 106, 1126–1177 (2008).

  72. Fischl, B., Sereno, M. I., Tootell, R. B. H. & Dale, A. M. High-resolution intersubject averaging and a coordinate system for the cortical surface. Hum. Brain Mapp. 8, 272–284 (1999).
