Neuroanatomy is key to understanding the brain, but software to analyse the data is often single-purpose, built for one model species, and suffers from a lack of support following publication. We have established the BrainGlobe Initiative - an international, distributed team of users and developers working towards the goal of creating open-source, interoperable and easy-to-use tools for the analysis of all types of neuroanatomical data. I will introduce the tools we have developed and discuss ways in which we can make data analysis easier and more reproducible.
Artificial intelligence has made tremendous strides through deep learning methods, but computer algorithms still lag behind the capabilities of mammalian brains. Despite being trained on vast amounts of data, AI algorithms lack generalization and produce brittle results, struggling with tasks like transfer learning, causal reasoning, and scene understanding - all hallmarks of natural intelligence. We believe current deep learning architectures are severely under-constrained, lacking key model biases found in the brain that are instantiated by its multitude of cell types, pervasive feedback, innately structured connectivity, specific non-linearities, local learning rules and learning objectives. To bridge this gap, our goal is to uncover the model biases inherent in the brain. Through cutting-edge experiments and data collection across multiple levels of the visual cortex combined with deep learning, we are building foundation models of the brain's function. These models enable essentially unlimited in silico experiments that provide insights and generate hypotheses about brain function, which are then verified in closed-loop experiments. We are committed to open access and to sharing our data and models with the community.
Integrating the readouts of both structure and physiology of a brain region that contains a complete neuronal circuit can be a fruitful approach towards understanding how that circuit works. These maps can be reliably generated through a correlative multimodal imaging (CMI) pipeline that combines in vivo 2-photon, synchrotron X-ray and volume electron microscopy. This experimental workflow presents specific challenges for the handling of data and samples - from working with single datasets multiple terabytes in size, to making informed decisions on the right time to irreversibly destroy (trim) a precious sample for follow-up imaging, to establishing fair policies for the sharing and publishing of data, algorithms and biological findings. I will showcase the CMI pipeline being developed in the lab of Andreas Schaefer at the Francis Crick Institute in the context of the scientific aims of the lab, and discuss future developments that may bring further meaningful improvements.
I will address common versus heterogeneous aspects of the brain vasculature. Graph analysis of reconstructed networks revealed (1) a common network topology across the brain that leads to a shared structural robustness against the rarefaction of vessels, and (2) that the capillary network forms interconnected loops with a topology that is invariant to the position and boundary of cortical columns. Geometrical analysis uncovered a scaling law that links length density, i.e., the length of vessel per volume, with tissue-to-vessel distances and provides a means to connect regional differences in metabolism to differences in length density. These and other issues will be discussed with reference to the importance of complete, large-scale data sets.
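The dimensional argument behind such a scaling law can be illustrated numerically (this is a toy sketch, not the authors' analysis): for vessels of length density rho (length per volume), the typical tissue-to-vessel distance scales as rho to the power -1/2. Here we model a 2D cross-section in which vessels appear as Poisson-distributed points of intensity rho.

```python
# Toy illustration of the length-density scaling law (not the published
# analysis): mean tissue-to-vessel distance should scale as rho**-0.5.
import numpy as np

def mean_tissue_to_vessel_distance(rho, n_tissue=2000, seed=0):
    """Mean distance from random tissue points to the nearest vessel
    cross-section, for vessels modeled as Poisson points of intensity rho
    in the unit square."""
    rng = np.random.default_rng(seed)
    vessels = rng.random((rng.poisson(rho), 2))   # vessel cross-sections
    tissue = rng.random((n_tissue, 2))            # sampled tissue points
    # Toroidal wrapping avoids edge effects at the unit-square boundary.
    d = np.abs(tissue[:, None, :] - vessels[None, :, :])
    d = np.minimum(d, 1.0 - d)
    return np.sqrt((d ** 2).sum(-1)).min(axis=1).mean()

# Quadrupling the length density should roughly halve the mean distance.
d_lo = mean_tissue_to_vessel_distance(rho=100)
d_hi = mean_tissue_to_vessel_distance(rho=400)
```

For a Poisson point process of intensity rho, the expected nearest-neighbour distance is 1/(2*sqrt(rho)), so the ratio d_lo/d_hi should come out near 2.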
The use of semantic technologies has great potential to improve the discoverability and interoperability of ever-increasing volumes of cellular resolution neuroscience data. Virtual Fly Brain has used these technologies to build an integrated knowledge base for Drosophila neuroscience, using standard classifications of cell types curated from the literature to integrate over 150,000 cross-registered 3D images along with connectomics and transcriptomics data. I will present details of VFB and discuss two major challenges - the challenge of community engagement and adoption of ontologies and metadata standards, and the challenge of reconciling data-driven approaches to classification with the simple categorical approaches typical of semantic technologies.
I will present Wellcome's perspectives on open science, why we value this approach and the roles we believe it has to play within our work. I will share our approaches to enabling and supporting the researchers we fund to work openly across our portfolio, before highlighting some new work within neuroscience.
The International Brain Laboratory (IBL) is a collaboration of 22 labs with the shared mission of advancing our understanding of the neural systems and circuits that underlie behavior. The IBL approaches this challenge with a distributed experiment model, in which scientists within each lab contribute to the same shared scientific experiment. Data are collected across 12 labs and preprocessed and stored centrally. We have made the first release of data from the "Brain-wide Map" (BWM) project, in which we have systematically recorded from nearly all major brain areas with Neuropixels probes. We recorded activity from 295,501 units across 194 brain regions over 547 probe insertions. These data constitute a brain-wide activity map at single-spike cellular resolution during a decision-making task. In addition to the map, this data set contains other information gathered during the task - sensory stimuli presented to the mouse; mouse decisions and response times; and mouse behavioral information from analyzed video recordings. Here, I will provide hands-on training using the IBL-BWM data set and share details on the structure and organization of large data sets collected by different labs.
Data sharing works best when it is easy for both producers and consumers of shared data. Well-designed data standards help data consumers by reducing the time required to study documentation and write custom data loaders, but it is also important that standards do not unduly burden data producers, who will likely respond by taking shortcuts that undermine the standardization. This talk will describe the Open Neurophysiology Environment (ONE), a simple data standard used by the International Brain Laboratory. Data producers can use it simply by naming individual files according to a naming convention that identifies their contents, saving in common formats such as npy, and uploading files to a website. Data consumers will then know how to load and interpret the files without reading documentation. The standard also scales to larger projects such as the IBL, which can run a database allowing data to be rapidly searched.
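The filename convention described above can be sketched in a few lines. This is an illustrative helper, not the official ONE API: it assumes ONE-style names of the form `object.attribute.extension` (e.g. `spikes.times.npy`, `spikes.clusters.npy`) and groups a session directory's files by object.

```python
# Minimal sketch of consuming ONE-convention filenames
# ("object.attribute.extension"); the helper is illustrative only,
# not the official ONE API.
from collections import defaultdict
from pathlib import Path

def group_one_files(session_dir):
    """Map each object name to a dict of attribute -> file path."""
    objects = defaultdict(dict)
    for path in Path(session_dir).glob("*.*.*"):
        obj, attr, _ext = path.name.split(".", 2)
        objects[obj][attr] = path
    return dict(objects)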
Brains are not engineered solutions to a well-defined problem but arose through selective pressure acting on random variation. It is therefore unclear how well a model chosen by an experimenter can relate neural activity to experimental conditions. Here we developed “Model-free identification of neural encoding” (MINE), which combines convolutional neural networks with Taylor decomposition approaches to comprehensively characterize the mapping from task features to activity. I will introduce MINE and discuss challenges related to sharing the method and the data it generates, and how we think sharing analysis code may increase data accessibility.
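The core Taylor-decomposition idea can be sketched with a toy stand-in for the trained network (this is a hypothetical illustration, not the published MINE implementation): once a predictor of activity from task features is fitted, its local first-order Taylor terms (gradients) reveal which features actually drive the predicted activity.

```python
# Toy sketch of the Taylor-decomposition idea: probe a trained predictor's
# local gradients to rank task features by how strongly they drive activity.
# The predictor f below is a hypothetical stand-in, not a trained CNN.
import numpy as np

def feature_sensitivity(f, xs, eps=1e-4):
    """Mean absolute central-difference gradient of f per input feature."""
    n_features = xs.shape[1]
    grads = np.zeros((len(xs), n_features))
    for i, x in enumerate(xs):
        for j in range(n_features):
            step = np.zeros(n_features)
            step[j] = eps
            grads[i, j] = (f(x + step) - f(x - step)) / (2 * eps)
    return np.abs(grads).mean(axis=0)

# Stand-in "trained model": activity depends on feature 0 only.
f = lambda x: np.tanh(2.0 * x[0])
rng = np.random.default_rng(0)
xs = rng.normal(size=(200, 2))
sens = feature_sensitivity(f, xs)
```

Here the sensitivity for feature 0 is large while feature 1 is near zero, recovering the encoding without the experimenter pre-specifying a model of it.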
Animals may maintain, update, or alter behaviors as their circuitries undergo developmental and environmental changes. The transition of the C. elegans nervous system from its juvenile to its mature configuration is a good experimental platform to reveal rules that govern the form and plasticity of neural networks. We have combined connectomics, optogenetics, and modeling approaches to address how the relationship between the C. elegans structural and functional circuitry changes during development. These analyses implicate multiple maturation strategies that serve different ethological needs.
The mapping of neuronal connectivity is one of the main challenges in neuroscience. We use 3-dimensional electron microscopy (EM) imaging of nerve tissue at nanometer-scale resolution across substantial volumes, extending to more than one millimeter on a side, followed by AI-based image analysis to obtain dense connectivity maps, or connectomes. With these, we have recently mapped local circuitry in mouse and human cortex, determined learning-related synaptic traces, followed inhibitory axonal development, and discovered an expanded interneuron-to-interneuron network in the human cortex.
How have animals managed to maintain metabolically expensive brains given the volatile and fleeting availability of calories in the natural world? I will present recent results in support of two strategies that involve - 1) an implementation of energy-efficient neural coding, enabling the brain to operate at reduced energy costs, and 2) an efficient use of costly neural resources during food scarcity. These results reveal metabolic state-dependent mechanisms by which the mammalian cortex regulates coding precision to preserve energy in times of food scarcity.
The neurophysiology of cells and tissues is monitored electrophysiologically and optically in diverse experiments and species, ranging from flies to humans. Understanding the brain requires integration of data across this diversity, and thus these data must be findable, accessible, interoperable, and reusable (FAIR). In this presentation we will describe the design and implementation of Neurodata Without Borders (NWB), an open standard for neurophysiology data. The open-source NWB software defines and modularizes the interdependent, yet separable, components of a data language. We will discuss the NWB data standard and software and provide an overview of the broader NWB ecosystem of community visualization and analysis tools.
Modern neuroscience relies on a combination of experimental and computational approaches to understand the brain. We have developed an updated version of the Open Source Brain platform (OSBv2), a browser-based, integrated research environment for both experimental data analysis and theoretical/modelling research. OSBv2 uses NWB as the recommended data-sharing format, integrates the graphical frontend to NetPyNE to facilitate the simulation and analysis of network models, and provides access to fully interactive computing environments with JupyterLab. Providing this single, integrated environment for data analysis and modelling will help close the gap between experimental observations and insights obtained through computational modelling.
Since its founding, the Allen Institute has made open data one of its core principles. Specifically, it has become known for generating and sharing survey datasets within the field of neuroscience. Starting a decade ago, we began planning the first surveys of in vivo physiology in mouse cortex with single-cell resolution - the Allen Brain Observatory. We first used 2-photon calcium imaging and later Neuropixels electrophysiology to record spontaneous activity and evoked responses in visual cortex and thalamus of awake mice that were passively exposed to a wide range of visual stimuli (known as “Visual Coding” experiments). In both cases, the data were shared even before we published our own analyses of them. Since their release, these data have been used to produce new discoveries, to validate computational algorithms, and as a benchmark for comparison with other data, resulting in over 100 publications and preprints to date. The widespread mining of our publicly available resources demonstrates a clear demand for open neurophysiology data and points to a future in which data reuse becomes more commonplace. I will reflect here on the lessons learned concerning the challenges of data sharing and reuse in the neurophysiology space and ways that we are working to address these in our ongoing efforts.
DANDI (dandiarchive.org) is a cellular neurophysiology data archive and collaboration hub built to support multiscale, multispecies, and multimodal neuroscience research. The data and resources related to DANDI support theoretical neuroscience, drive biological applications, and help develop new analytic tools. The DANDI infrastructure is built on open-source technologies and in the cloud, to support dissemination, search, visualization, computation, and coordination in neurophysiology research projects to promote FAIRness and efficiency. Through the lens of DANDI, I will discuss how neuroinformatics infrastructures can be built to support collaborative and open science, and potentially transform knowledge generation.
The Kavli Foundation seeks to catalyze Open Data in Neuroscience by creating award mechanisms to optimize the use of the vast quantities of data generated by neuroscientists and to fuel novel discoveries by mining open data. I will describe our funding approach, highlighting past projects that empowered researchers and recent awards to promote the development of robust data standards, data sharing and data reuse.
In a perfect world, we would have datasets that represent the activity and properties of every functional component of our experimental systems. This is already the case in simulations, where we have access to every parameter and state variable of the model. This overabundance of data presents its own set of design problems for data integration, sharing and visualization, especially for multiscale models that span molecules to networks. I will discuss how we have implemented formats and tools to address this, and suggest that the development of these tools may provide guideposts for managing richer experimental datasets which incorporate multimodal and multiscale data.