2025
Projects:
Advancing Bias-Free Sentiment Analysis: Scaling the SProp GNN to SOTA-Level Performance
AUTHOR: Hubert Plisiecki
AFFILIATION: Stowarzyszenie na rzecz Otwartej Nauki
Modern transformer-based architectures have demonstrated remarkable performance in sentiment analysis but often learn and propagate social biases from their training data. To address this, I previously introduced the Semantic Propagation Graph Neural Network (SProp GNN), a bias-robust alternative that relies exclusively on syntactic structures and word-level emotional cues. While the SProp GNN effectively mitigates biases—such as political or gender bias—and provides greater interpretability, its performance currently falls slightly short of state-of-the-art (SOTA) transformer models. This project aims to advance the SProp GNN by testing and implementing architectural and methodological improvements to elevate its performance to transformer levels and beyond. Proposed enhancements include: (1) developing alternative sentence parsing models and graph setups to better align with the propagation of emotional information through syntactic structures, (2) experimenting with various taxonomies for parts of speech and dependency types, and (3) exploring alternative SProp architectures and conducting extensive hyperparameter optimization. Additional ideas for improvement are also welcome. Achieving SOTA performance while maintaining the model’s ethical and transparent design could establish the SProp GNN as a valuable alternative for sentiment analysis across diverse applications. Results from this hackathon will be shared on the original project’s GitHub repository, with proper attribution for contributions.
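To make the idea concrete, here is a minimal, hypothetical sketch of sentiment propagation over a dependency graph in PyTorch. It is not the authors' SProp GNN; the layer design, names, and dimensions are illustrative assumptions only.

```python
# Minimal sketch (not the original SProp GNN): sentiment propagation over a
# dependency parse, where nodes carry word-level valence and edges follow
# syntactic structure. Names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class SyntacticPropagationLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.update = nn.Linear(2 * dim, dim)

    def forward(self, h, adj):
        # adj: (n, n) normalized adjacency of the dependency graph
        msg = adj @ h                      # aggregate neighbour emotion cues
        return torch.tanh(self.update(torch.cat([h, msg], dim=-1)))

# Toy sentence graph: 4 tokens, valence as 1-d node feature projected to dim=8
n, dim = 4, 8
edges = [(0, 1), (1, 2), (1, 3)]           # dependency arcs (head, child)
adj = torch.zeros(n, n)
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0
adj = adj / adj.sum(dim=1, keepdim=True).clamp(min=1)

valence = torch.tensor([[0.1], [0.9], [-0.3], [0.0]])  # word-level cues
h = nn.Linear(1, dim)(valence)
for layer in [SyntacticPropagationLayer(dim) for _ in range(2)]:
    h = layer(h, adj)
sentence_sentiment = nn.Linear(dim, 1)(h.mean(dim=0))   # graph-level readout
```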
AI Chart Surgeon: Improving Visualizations, One Graph at a Time
AUTHORS: Piotr Migdał, Katarzyna Kańska
AFFILIATION: independent AI consultant at p.migdal.pl
Good charts present data in a way that is easy to understand and interpret.
We will use modern AI tools to improve existing charts, following the best practices of data visualization.
We will construct a tool that is able to:
– extract data from an existing chart
– suggest appropriate chart types for the data
– create code for a new chart
– generate the new, improved chart
Most scientists are not data visualization experts, so we will create a tool that helps them create better charts. It will provide concrete feedback on their choices, not only to get results but also to teach good practices.
Additionally, many charts are not suitable for republishing – both because of their visual quality and because of copyright restrictions.
We plan to test:
– which types of charts work well for data extraction
– which types of data are good candidates for automatic chart generation
– which AI models work best for each task
We plan to use modern Large Language Models (LLMs) and vision models, such as GPT-4o, Gemini, and Claude.
Any partial solution (e.g., only data extraction or only chart generation) would be a valuable achievement on its own.
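As a hedged illustration of the first step (data extraction), the sketch below calls a vision-capable LLM on a chart image. The model choice, prompt, and file name are assumptions for illustration, not the project's final pipeline.

```python
# Hedged sketch of the data-extraction step, assuming an OpenAI-style vision
# API (model name and prompt are illustrative, not the project's final choice).
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("old_chart.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Extract the underlying data from this chart as CSV, "
                     "then suggest a better chart type and say why."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)  # CSV + chart-type suggestion
```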
Automated Motion Tracking for Early Neurological Assessment in Infants Based on the Hammersmith Neonatal and Infant Neurological Examination (HINE)
AUTHOR: Paulina Domek
AFFILIATION: SWPS University
This preliminary research project aims to explore the potential of automated motion tracking systems in supporting the Hammersmith Neonatal and Infant Neurological Examination (HINE). We will attempt to develop a computer vision-based approach to analyze video recordings of infant assessments, with the goal of extracting quantifiable movement features that could assist in clinical evaluation. The proposed system will combine motion tracking technology with machine learning algorithms to potentially provide objective measurements of infant motor performance.
Our methodology will involve collecting video recordings of HINE assessments and applying pose estimation algorithms to track infant movements. We plan to use frameworks such as OpenPose or DeepLabCut for movement analysis, focusing on key features such as spontaneous movement patterns, posture, and reflex responses. The project will explore the feasibility of machine learning models in distinguishing between different movement patterns, while acknowledging the complexity and variability inherent in infant motor development.
If successful, this tool could complement traditional HINE assessments by offering additional data points for clinicians to consider when screening for early signs of neurological disorders. The potential benefits include enhanced objectivity in assessment, improved documentation of infant movement patterns, and the possibility of identifying subtle motor abnormalities that might warrant further clinical investigation.
We recognize the challenges in developing such a system, including the need for extensive validation, the complexity of infant movement patterns, and the importance of maintaining the central role of clinical expertise in assessment. This exploratory project represents an initial step toward combining modern computer vision techniques with established clinical practices in infant neurological assessment, potentially contributing to the broader field of technology-assisted pediatric healthcare.
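As a rough, non-clinical illustration of the feature-extraction step, the sketch below summarizes limb kinematics from pose-estimation output. The array shape, thresholds, and feature names are assumptions; none of this is a validated HINE measure.

```python
# Illustrative feature extraction from pose-estimation output (assumed shape:
# frames x keypoints x 2); not a validated clinical measure of HINE items.
import numpy as np

def movement_features(keypoints, fps=30.0):
    """Summarize infant limb kinematics from tracked 2-D keypoints."""
    velocity = np.diff(keypoints, axis=0) * fps          # px/s per keypoint
    speed = np.linalg.norm(velocity, axis=-1)            # (frames-1, keypoints)
    return {
        "mean_speed": speed.mean(axis=0),                # activity level
        "speed_variability": speed.std(axis=0),          # smoothness proxy
        "stillness_fraction": (speed < 5.0).mean(axis=0) # time spent quiet
    }

# e.g. keypoints = np.load("dlc_output.npy"); feats = movement_features(keypoints)
```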
Eye orbit segmentation and eye movement detection via fMRI
AUTHORS: Cemal Koba, Jan Argasinski
AFFILIATION: Sano Center for Computational Medicine
In our previous research, we demonstrated that the mean fMRI time series from eye regions correlate with spontaneous brain activity in visual and somatomotor regions. We now aim to refine our analyses by characterizing the eye movements directly rather than using the mean signal from the whole eye region. To achieve this, we plan to create an algorithm that processes 4D data (3D spatial data + time) from eye regions. More specifically, we want to automate the following steps:
– Locating and isolating eye orbits in a given 4D fMRI image
– Identifying the initial position of the eye
– Reporting the movement parameters (such as translation and rotation) over time.
– Reporting summary statistics such as relative and absolute motion, coherence between both eyes, and displacement
– Optional: Find the neural correlates of each summary statistic
Although similar algorithms already exist, they are often deep-learning-based and trained on specific populations. Our goal is to develop an algorithm that operates solely on the subject’s available data, without requiring a pre-trained model, and is adaptable to specific clinical populations.
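As a minimal sketch of the translation part of step 3 (rotation would need a full rigid registration), one could estimate frame-to-frame shifts of an already isolated orbit region by phase correlation. The file name and cropping are assumptions.

```python
# Minimal sketch of the translation-only case, assuming a 4-D fMRI file with
# an already isolated eye-orbit region; rotation would need a rigid registrar.
import nibabel as nib
import numpy as np
from skimage.registration import phase_cross_correlation

img = nib.load("orbit_roi.nii.gz")          # hypothetical cropped 4-D image
data = img.get_fdata()                      # x, y, z, t
reference = data[..., 0]                    # initial eye position (step 2)

shifts = []
for t in range(1, data.shape[-1]):
    shift, _, _ = phase_cross_correlation(reference, data[..., t])
    shifts.append(shift)                    # voxel translation per volume

shifts = np.array(shifts)                   # (t-1, 3); summarize as motion
relative = np.linalg.norm(np.diff(shifts, axis=0), axis=1)  # frame-to-frame
absolute = np.linalg.norm(shifts, axis=1)                   # vs. reference
```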
Can mental health be quantified? – preliminary project of a mobile app for patients receiving psychiatric care
AUTHOR: Sylwia Adamus
AFFILIATION: University of Warsaw, Faculty of Physics; Medical University of Warsaw, Faculty of Medicine
The popularity of healthcare mobile apps has been rising constantly, with a variety of them available for each medical specialty. Those dedicated to patients receiving psychiatric care, however, often lack necessary functionalities and focus on tracking one's symptoms through descriptive analysis only.
This project was conceived during the 12th edition of Bravecamp, where it qualified for the final. We will brainstorm an app that would allow for simultaneous tracking of both medications and mental health symptoms, focusing on a quantitative approach inspired by scales used in psychiatry and psychology. The project will include generating test data, programming basic functionalities, and visualizing the output that a potential app user would receive.
Beginners in programming are welcome; what matters most is your creativity!
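As a toy illustration of what "generating test data and visualizing the output" might look like, the sketch below plots a synthetic quantitative symptom scale against medication adherence. The column names and the 0–27 scale are illustrative assumptions, not the app's design.

```python
# Toy sketch of the planned prototype: generate synthetic test data for
# medication intake and a quantitative symptom scale, then plot both.
# Column names and the 0-27 scale (PHQ-9-like) are illustrative assumptions.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
days = pd.date_range("2025-01-01", periods=60, freq="D")
df = pd.DataFrame({
    "date": days,
    "med_taken": rng.random(60) > 0.15,          # adherence: mostly yes
    "symptom_score": np.clip(
        18 - 0.2 * np.arange(60) + rng.normal(0, 2, 60), 0, 27),
})

fig, ax = plt.subplots()
ax.plot(df["date"], df["symptom_score"], label="symptom score (0-27)")
ax.scatter(df.loc[~df["med_taken"], "date"],
           df.loc[~df["med_taken"], "symptom_score"],
           color="red", label="missed dose")
ax.legend(); ax.set_ylabel("score"); fig.autofmt_xdate()
plt.show()
```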
Second Generation Diffusion Models of Brain Dynamics via Flow Matching
AUTHOR: Adam Sobieszek
AFFILIATION: University of Warsaw
This project focuses on new methods for modelling EEG brain dynamics by combining flow matching, a recent generalization of diffusion models, with transformer-based representations of neural signals. While traditional diffusion models have shown promise in generative tasks, flow matching offers a more flexible framework that can be formulated as neural differential equations, enabling a wider range of applications beyond simple generation. We will leverage this flexibility to tackle multiple challenges in neural signal processing and brain-computer interface (BCI) applications. We will work on models that can perform tasks such as source separation in the signal domain, as well as flow matching operating in the representation space of transformer models trained on neural data. Flow matching in representation space could aid controlled signal generation by incorporating additional information about represented brain processes. We will focus on data from language-related BCI tasks, where we will work on detecting language-related brain activity. Our project will explore how flow matching can be used to model the complex trajectories of brain states during cognitive tasks, while maintaining the interpretability advantages of transformer-based representations. If successful, this approach could be developed further after the event into a promising framework for modeling and manipulating neural signals in research settings.
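For readers unfamiliar with flow matching, the core objective is compact enough to sketch: regress a velocity field toward the constant velocity of a straight path between noise and data. The network below is a stand-in, not the project's architecture.

```python
# Hedged sketch of the core flow-matching objective (Lipman et al., 2023),
# not the project's final architecture: regress a velocity field v(x_t, t)
# toward (x1 - x0) along straight interpolation paths between noise and data.
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 256), nn.SiLU(),
                                 nn.Linear(256, dim))

    def forward(self, x, t):
        return self.net(torch.cat([x, t], dim=-1))

def flow_matching_loss(model, x1):
    """x1: batch of real samples (e.g. flattened EEG epochs)."""
    x0 = torch.randn_like(x1)                      # noise endpoint
    t = torch.rand(x1.shape[0], 1)                 # random time in [0, 1]
    xt = (1 - t) * x0 + t * x1                     # point on the path
    target = x1 - x0                               # constant path velocity
    return ((model(xt, t) - target) ** 2).mean()

model = VelocityNet(dim=64)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x1 = torch.randn(32, 64)                           # stand-in for EEG features
loss = flow_matching_loss(model, x1)
loss.backward(); opt.step()
```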
2024
Projects:
Combining Generative Autoencoder and Complex-Valued Neural Network architectures for EEG signal modeling
AUTHOR: Adam Sobieszek
AFFILIATION: University of Warsaw
The project aims to solve problems with EEG signal analysis with Variational Autoencoders (VAEs) by combining them with the lesser-known architecture of Complex-Valued Neural Networks (CVNNs). Our aim is to enhance the signal generation and representation capabilities of VAEs for EEG signals, developing a better architecture for modelling signals in the frequency domain, which is represented with complex numbers. VAEs are deep learning models that learn to encode and decode data, creating a latent space representation. This latent space is a compressed knowledge representation of the training signal data, which VAEs learn to reconstruct. In the context of EEG signals, VAEs are useful for signal generation (e.g. for missing data imputation), and their learnt representations can be used for classification and prediction, as well as for uncovering insights about the training EEG data. Training VAEs on EEG data in the time domain has proven ineffective. Conversely, representing EEG signals in the frequency domain via their Fourier transform is challenging because these representations are complex numbers, which are not handled well by real-valued neural network layers. CVNNs integrate complex numbers directly into the network architecture, using complex numbers in the trainable layer weights. In our project we will program custom CVNN layers that we can use to manipulate the complex-valued Fourier spectra. By combining VAEs with CVNNs, we can learn a complex-valued latent space representation of EEG signals, which can be interpreted in terms of the magnitude and phase of discovered signal components. We will train such networks on EEG data from psychological studies and analyze the learnt representations in supervised and unsupervised settings. If successful, we expect to continue working on the project after the Brainhack in order to present the results in a scientific publication.
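A common CVNN building block of the kind the project proposes is easy to sketch: a linear layer whose weights are complex tensors, applied to the Fourier spectrum of an epoch. This is a generic formulation, not necessarily the authors' final layer design.

```python
# Illustrative complex-valued layer of the kind the project proposes to build
# (a common CVNN formulation, not necessarily the authors' final design).
import torch
import torch.nn as nn

class ComplexLinear(nn.Module):
    """Linear layer with complex weights: y = W x + b over complex tensors."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(out_features, in_features, dtype=torch.cfloat) * 0.05)
        self.bias = nn.Parameter(torch.zeros(out_features, dtype=torch.cfloat))

    def forward(self, x):                  # x: complex (batch, in_features)
        return x @ self.weight.T + self.bias

# Apply to the Fourier spectrum of an EEG epoch (toy example)
signal = torch.randn(8, 256)               # batch of time-domain epochs
spectrum = torch.fft.rfft(signal)           # complex tensor of shape (8, 129)
layer = ComplexLinear(spectrum.shape[-1], 32)
z = layer(spectrum)                         # complex latent features
magnitude, phase = z.abs(), z.angle()       # interpretable decomposition
```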
Subtyping and grading of gliomas using artificial intelligence
AUTHOR: Paulina Domek
AFFILIATION: SWPS
Advances in brain tumor research show promise for improving diagnosis and treatment through the identification of epigenetic molecular targets responsible for the disease (Skouras et al., 2023). It has been shown that gene expression is controlled by epigenetic modifications such as histone methylation and acetylation (Gibney and Nolan, 2010); it is therefore crucial to accurately identify molecular targets responsible for the occurrence of the disease. Recently, various approaches have been taken to combine transcriptomics (gene expression data analysis) and epigenetics to design novel drugs to target malfunctioning proteins, specific for each subtype and grade of brain tumors. The project was initiated at Brainhack 2023, where we successfully developed a data analysis workflow to analyze publicly available epigenetic data of brain tumor patients from various databases, such as The Cancer Genome Atlas and Gene Expression Omnibus. We validated the open-source software MethPed (Ahamed et al., 2016) as a potentially useful tool for clinical application of pediatric brain tumor subtyping. We found that MethPed could serve as a confirmation test for the primary diagnosis, although some results might require additional confirmation. To emphasize the importance of code and data reproducibility, we shared our results in a publicly available GitHub repository: https://github.com/jjjk123/Brain_tumor_subtyping. At Brainhack 2024, we plan to extend the workflow to include a machine learning software for subtyping and grading of gliomas using transcriptomic data (Munquad et al., 2022). The results will be analyzed alongside the results from MethPed, and are expected to provide a deeper insight into the mechanisms of brain tumor subtypes.
Visualization of Dyslexic Reading using Large Language Model
AUTHOR: Karolina Źróbek
AFFILIATION: Akademia Górniczo-Hutnicza
Large language models (LLMs) open up new possibilities, allowing us to explore and understand various perspectives of human experiences. The primary aim of this project is to create a visualization of dyslexic reading patterns predicted by a custom ChatGPT (Generative Pre-trained Transformer). By training a specialized language model on articles related to reading and dyslexia, we intend to develop a tool that can generate insightful analyses of dyslexic reading behaviors, including saccadic movements, fixation points, and the duration of fixations. Furthermore, the custom model will provide qualitative information on how individuals with dyslexia perceive text while reading. The information obtained from the custom language model can be used to create visualizations, such as videos or images, representing dyslexic reading patterns. Apart from enabling the viewer to empathize with dyslexic readers, we see a possible development of the visualization into a tool that offers dyslexic readers enhanced comprehension of any given text. This project proposal is rooted in the exploration of eye movement patterns observed in individuals with dyslexia, coupled with qualitative research focusing on the subjective experience of reading. By delving into the intricate dynamics of eye movements among individuals facing dyslexia, we aim to unravel deeper insights into how this neurological condition shapes the act of reading. To achieve the stated objective, we propose the following implementation plan:
1. Custom GPT Training: Training a custom Language Model (LM) using a diverse dataset of articles specifically focused on reading and dyslexia. The model would be used as a meaning and parts-of-speech identification tool, allowing for a comprehensive understanding of textual content.
2. Text Analysis: The model would later be used to generate detailed text analyses, including information related to saccadic movements, fixation points, and the length of fixations during dyslexic reading. The model would also provide qualitative insights into how dyslexic individuals perceive and interpret text while reading, capturing subjective experiences such as struggles with specific words or patterns.
3. Visualization Creation: The information obtained from the language model would be utilized to create visualizations representing dyslexic reading patterns. Prompt-to-image generative models could be used to enhance the visual representation of the data.
Estimating the amount of recalled information by NLP techniques for free recall psychological research of memory
AUTHOR: Michał Domagała
AFFILIATION: Jagiellonian University
In memory research, there traditionally exist two methods of judging memorized content: explicit paradigms, which task participants with recognising presented stimuli as novel or previously seen, and implicit methods, where subjects are asked to complete a collection of objects with one from memory. Both methods, however, bear little resemblance to how humans memorize material in their daily functioning, as they consist of recognising simple, isolated objects rather than complex audiovisual stimuli. Thus, a novel paradigm of recalling from a continuous stream of information, such as movies or audiobooks, has been proposed as a more ecological alternative. Here, however, the problem of judging the amount of remembered information arises: it is difficult to use objective measurements, as people tend to differ in their ascriptions of importance. We propose to use Natural Language Processing and advanced GPT-based language models to develop an index of how much detail has been memorized. To achieve this, we intend to develop similarity measures between the auditory stream and the recall that are sensitive to context and work regardless of text length. Such a tool would be invaluable in ecological psychological research, as it would allow for fast and highly reproducible assessment of the amount of remembered information. To achieve the stated objective, we propose the following implementation plan:
1. Generate a dataset of stories using language models such as ChatGPT and ask people for detailed and less detailed summaries. Additionally, summaries will also be produced by language models.
2. Fine-tune a GPT model to return a remembrance score – an estimate of recalled information based on text similarity between the original text, the GPT-made summary, and the reference detailed and less detailed summaries. Here our work will center on reshaping the data and establishing text properties connected to better recall (see the sketch after this list).
3. Compare the remembrance score with the judgments of competent judges, and replicate a simple experiment: viewing a standard audiovisual stimulus (the “Sherlock” movie) and then asking participants to recall specific scenes in as much detail as possible.
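One hedged way to start on such a similarity index is sentence embeddings: score how much of the source material a recall covers. The model name and the coverage formula below are illustrative assumptions, not the proposed remembrance score itself.

```python
# Sketch of a context-sensitive similarity index between source text and a
# participant's recall, using sentence embeddings (model choice illustrative).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def remembrance_score(source_sentences, recall_text):
    """Fraction of source content covered by the recall, in [0, 1]."""
    src = model.encode(source_sentences, normalize_embeddings=True)
    rec = model.encode([recall_text], normalize_embeddings=True)[0]
    coverage = src @ rec                    # cosine similarity per sentence
    return float(np.clip(coverage, 0, None).mean())

story = ["The detective entered the flat.", "A violin lay on the desk."]
print(remembrance_score(story, "Someone walked into a flat with a violin."))
```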
Default Mode Network connectivity
AUTHOR: Zofia Borzyszkowska
AFFILIATION: Nicolaus Copernicus University
The Default Mode Network (DMN) is active during the resting state and is especially helpful in identifying differences between healthy people and patients with mental diseases. The relationship between the resting state and visual input has not been widely researched. In 2022, Yi Wang led an EEG study aiming to find differences in connectivity between two states: eyes open and eyes closed [1]. The analysis carried out at Brainhack Warsaw is part of an ongoing Neurotech Students Scientific Club project inspired by Wang’s paper. The EEG data for this project has been collected by members of the club on a 32-electrode cap. The goal of this project is to analyze DMN connectivity between two states: eyes closed and eyes open.
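As one hedged example of an EEG connectivity measure that could contrast the two states, the sketch below computes the phase-locking value in the alpha band. The band and measure are assumptions for illustration; the club's pipeline may differ.

```python
# Minimal sketch of one common EEG connectivity measure (phase-locking value)
# between channel pairs in the alpha band; band and measure are assumptions,
# not necessarily what the club's pipeline uses.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def plv_matrix(eeg, sfreq, band=(8.0, 12.0)):
    """eeg: channels x samples. Returns channels x channels PLV matrix."""
    b, a = butter(4, band, btype="bandpass", fs=sfreq)
    phase = np.angle(hilbert(filtfilt(b, a, eeg, axis=1), axis=1))
    n = eeg.shape[0]
    plv = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            plv[i, j] = plv[j, i] = np.abs(
                np.exp(1j * (phase[i] - phase[j])).mean())
    return plv

eeg = np.random.randn(32, 10 * 250)        # 32-channel cap, 10 s at 250 Hz
conn_open = plv_matrix(eeg, sfreq=250)      # eyes-open recording
conn_closed = plv_matrix(eeg, sfreq=250)    # eyes-closed recording
contrast = conn_closed - conn_open          # state difference per channel pair
```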
LoRACraft: does composing diffusion model LoRAs actually work?
AUTHOR: Paweł Pierzchlewicz
AFFILIATION: University of Tübingen and University of Göttingen
In the realm of high-quality image generation, diffusion models have demonstrated prowess, notably with the infusion of Low-Rank Adaptation (LoRA). This technique excels at injecting novel concepts into diffusion models, enhancing versatility and creativity. Despite its success in fine-tuning for specific tasks, the potential of composing LoRAs for multi-task capabilities remains underexplored. Enter LoRACraft: a project delving into the uncharted territory of composing LoRAs. Our mission is to scrutinize the limitations, crafting different models to evaluate their individual task performance and their prowess in combination. Drawing a link between LoRAs’ composability and the energy-based perspective on diffusion models, we aim to establish a robust theory explaining the efficacy of combining these models. Our hypothesis posits that composing LoRAs is akin to composing energies, allowing for the flexible combination of LoRAs and achieving superior performance in compositional tasks. LoRACraft seeks to unravel the potential synergy, providing insights into the underexplored realm of LoRA composition for enhanced performance in diffusion models.
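The composition idea itself is compact: each LoRA contributes a low-rank update to a frozen weight, and composing LoRAs sums the updates. The sketch below uses toy dimensions and strengths; it illustrates the mechanism, not the project's evaluation setup.

```python
# Illustrative LoRA composition (weights and ranks are toy values): each LoRA
# contributes a low-rank update, and composing them sums the updates, which is
# what makes the energy-based analogy plausible.
import torch

d, r = 512, 8                               # layer width, LoRA rank
W0 = torch.randn(d, d)                      # frozen base weight

loras = [(torch.randn(d, r) * 0.01, torch.randn(r, d) * 0.01)  # (B, A) pairs
         for _ in range(3)]                 # e.g. style, subject, pose LoRAs
alphas = [1.0, 0.7, 0.5]                    # per-concept strengths

W = W0 + sum(a * B @ A for a, (B, A) in zip(alphas, loras))
# The composed layer now applies all three concepts in a single forward pass.
y = torch.randn(1, d) @ W.T
```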
Rhythms of thought: Unraveling human behavior with bayesian models and circadian rhythmicity
AUTHOR: Patrycja Ściślewska
AFFILIATION: Department of Animal Physiology, Faculty of Biology, University of Warsaw; Laboratory of Emotions Neurobiology, Nencki Institute of Experimental Biology PAS
Bayesian models are increasingly used across a broad spectrum of the cognitive sciences. This approach aligns with the current trends in data-driven attempts to understand human behavior (Wilson & Collins, 2019). The aim of the project is to develop a mathematical model to describe human behavior (e.g., reaction time in particular conditions, decision-making processes, learning rate) and to determine the relationship between these aspects of human behavior and the features of the biological clock, sleep quality, and personality traits. Are evening people more willing to take risks? Do morning larks learn faster? How does negative emotionality affect our reaction time? At the Hackathon, we will use experimental data from state-of-the-art neuroscience tasks (such as the Iowa Gambling Task or the Monetary Incentive Delay Task), which measure individual sensitivity to reward and punishment, the tendency to take risks, or the ability to learn from mistakes. We will choose one of the computational models described in the literature (e.g. the Rescorla-Wagner Model (Rescorla & Wagner, 1972), the Outcome-Representation Learning Model (Haines et al., 2018)) and modify it to best fit the behavioral data. Then, we will perform computer simulations to validate the model. The results will provide more insight into the role of circadian rhythmicity and sleep in the proper functioning of cognitive processes in the brains of young people.
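The Rescorla-Wagner update cited above is simple enough to sketch; in the project, the learning rate would presumably be fit per participant and then related to chronotype and sleep measures.

```python
# Sketch of the Rescorla-Wagner update (Rescorla & Wagner, 1972); alpha
# (learning rate) would be fit per participant and later related to
# chronotype, sleep quality, and personality traits.
import numpy as np

def rescorla_wagner(rewards, alpha=0.1, v0=0.0):
    """rewards: observed outcomes per trial. Returns value estimates V_t."""
    v = np.empty(len(rewards) + 1)
    v[0] = v0
    for t, r in enumerate(rewards):
        v[t + 1] = v[t] + alpha * (r - v[t])   # prediction-error update
    return v

rewards = np.random.default_rng(1).choice([0, 1], size=100, p=[0.3, 0.7])
values = rescorla_wagner(rewards, alpha=0.15)  # converges toward p(reward)
```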
Can AI diagnose mental conditions within a single conversation?
AUTHORS: Piotr Migdał PhD, Michał Jaskólski
Modern large language models (LLMs) exhibit immense prowess in interpreting text data, particularly in natural language. They adeptly extract information, not only from explicit statements but also by reading between the lines, utilizing elements from word usage and syntax to overall coherence. For this exploration, we will utilize GPT-4 by OpenAI, the most potent LLM available to date. It has been trained on extensive datasets, including psychological and therapeutic textbooks, articles, and innumerable real conversations. This project investigates whether this generalist model can effectively assess an individual’s psychological traits. Our aim is to predict various psychological traits, such as the Big Five and Dark Triad, along with Attachment Style, as well as neurodivergences like autism and ADHD. We remain open to predicting other traits and conditions, contingent on the available data and participant interest. Traditional questionnaires use a rigid structure of questions and answers. With the latest AI model, we can transcend this limitation through automatic text analysis and AI-enabled conversations, mimicking a mental care professional’s approach.
- What methods are more effective: text analysis or interactive discussion?
- Which psychological traits and conditions are more accurately predictable?
- How can we optimize prompts for GPT-4 to ensure high-quality responses?
- What is the minimal data required for accurate predictions?
- How can we benchmark and validate our findings?
In 2013, simpler machine learning models could predict personality from as few as 10 Facebook likes, more accurately than a work colleague, and with 300 likes, better than a spouse. We seek to discover the extent of advancements possible with far more sophisticated AI and richer data.
Large Language Models vs Human Cerebral Cortex: Similarities, Differences, and their Consequences
AUTHOR: Natalia Bielczyk
AFFILIATION: Ontology of Value
Will Large Language Models (LLMs) take our jobs? This is a delicate and complex subject-matter: AI is designed to automate tasks rather than jobs, while most jobs consist of dozens of tasks.
Similarly to other reinforcement learning models, Large Language Models were originally inspired by the human brain. Indeed, human cortical networks and large language models (LLMs) such as GPT share some similarities in their structure, function, and learning mechanisms.
- Structural Similarities: Both human cortical networks and LLMs are composed of interconnected nodes (neurons in the brain, artificial neurons in LLMs) that process and transmit information. Transformer-based LLMs, in particular, have been found to have structurally similar representations to neural response measurements from brain imaging.
- Functional Similarities: Both human brains and LLMs process language in a predictive manner, predicting the next word based on the context. Studies have shown that the activations of LLMs map accurately to areas distributed over the two hemispheres of the brain, suggesting a functional similarity.
- Learning Similarities: Both human brains and LLMs learn from experience. For LLMs, this experience comes in the form of large amounts of text data that they are trained on.
However, there are also profound differences, especially in terms of consciousness, learning mechanisms, and the ability to handle abstract logic and long-term consistency.
- Structural Differences: The human brain is a highly complex, three-dimensional network of neurons with both local and long-range connections, while LLMs are typically organized in layers with connections primarily within and between adjacent layers. Moreover, the brain’s structure is influenced by physical locality and developmental processes, leading to a modular structure.
- Functional Differences: While LLMs excel at tasks like translation and text completion, they struggle with tasks that require abstract logic or long-term consistency, which are areas where the human brain excels. Additionally, LLMs lack the bidirectional connectivity that is believed to be essential for consciousness in the human brain.
- Learning Differences: The human brain learns from a variety of sensory inputs and experiences over a lifetime, while LLMs are trained on specific datasets and their learning is confined to the patterns present in these datasets. Furthermore, the brain is capable of lifelong learning and adaptation, while LLMs’ parameters remain fixed after training. On the other hand, learning in LLMs is fast and widespread, as the back-propagation mechanism allows for global and fast learning in the network, while Hebbian learning in cortical networks is local and slow.
Stroke lesion segmentation via unsupervised anomaly detection
AUTHOR: Cemal Koba
AFFILIATION: Sano Center
Detection of areas affected by stroke is crucial for recovery. Automating this task has been a popular challenge, but the ground truth for confirming the performance of automatic processes has been lesion masks manually drawn by human experts. However, stroke lesions may not always be visible to the human eye. In this project, we aim at automatic lesion segmentation based on an unsupervised model. The model will be trained on healthy brains, and the aim will be anomaly detection (stroke lesions, in our case) through generative adversarial networks.
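To make the unsupervised idea concrete, the sketch below uses a plain autoencoder instead of the planned GAN: train on healthy data only, then flag high reconstruction error as candidate lesion. Shapes and training details are illustrative assumptions.

```python
# Hedged sketch of the anomaly-detection idea with a plain autoencoder instead
# of the planned GAN: train on healthy slices only, then flag voxels with high
# reconstruction error as candidate lesions. Shapes are illustrative.
import torch
import torch.nn as nn

class SliceAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, 16, 3, stride=2, padding=1),
                                 nn.ReLU(),
                                 nn.Conv2d(16, 32, 3, stride=2, padding=1),
                                 nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1))

    def forward(self, x):
        return self.dec(self.enc(x))

model = SliceAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

healthy = torch.rand(16, 1, 64, 64)          # stand-in for healthy MRI slices
for _ in range(5):                           # toy training loop
    opt.zero_grad()
    loss = ((model(healthy) - healthy) ** 2).mean()
    loss.backward(); opt.step()

patient = torch.rand(1, 1, 64, 64)           # unseen, possibly lesioned slice
anomaly_map = (model(patient) - patient).abs()  # high error = candidate lesion
```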
2023
Projects:
Brain Tumor Subtyping
AUTHOR: Jędrzej Kubica
AFFILIATION: University of Warsaw
Brain tumors can be classified into different subtypes to recommend a patient-specific treatment. Characterization of patients into groups based on molecular features can help clinicians choose proper medications for each individual and improve the outcome of the treatment. Various classification tools for tumors of the central nervous system have been developed. For instance, MethPed [1] is an open-access classifier which uses DNA methylation as an epigenetic marker for subtype classification. The goal of the project is to develop a data analysis workflow to analyze publicly available epigenetic data of brain tumor patients from databases such as The Cancer Genome Atlas. Future work will include subtype-specific drug recommendations. Further research can also extend from brain tumors to other tumors of the nervous system.
Application of an amygdala parcellation pipeline based on Recurrence Quantification Analysis to resting-state fMRI data acquired in a 7T MRI scanner
AUTHOR: Sylwia Adamus
AFFILIATION: Medical University of Warsaw, University of Warsaw Faculty of Physics
According to animal studies, the amygdala consists of several groups of nuclei, which play different roles in emotion-related processes. It has also been shown that this brain structure is important for the development of many psychiatric conditions, such as depression, addictions, and autism spectrum disorder. To date, a number of approaches to the topic of amygdala parcellation have been suggested. One of them, which was recently published, is a pipeline using Recurrence Quantification Analysis (RQA). It enables the division of the human amygdala into two functionally different parts based on brain signal dynamics [1]. The aim of this project is to further develop this pipeline and check whether, with its help, it is possible to divide the amygdala into more than two subdivisions. To achieve this, the pipeline will be applied to resting-state fMRI data acquired in a 7T MRI scanner from a dataset consisting of 184 healthy subjects from the Human Connectome Project. An exploratory approach will be applied using several variations of RQA parameters and clustering algorithms. The pipeline was developed on resting-state fMRI data acquired in a 3T MRI scanner. It has been speculated that its application to data from a 7T MRI scanner could enable obtaining more detailed parcellations. Therefore, the main hypothesis behind this project is that by using this pipeline it will be possible to achieve a parcellation with at least three functionally different subdivisions. It could have the potential to serve as a mask in further studies of human amygdala functional connectivity. All participants will have the opportunity to go through the whole pipeline to fully explore its possibilities. A device that supports Anaconda Navigator is necessary to take part in the project. No advanced neuroscientific knowledge is required, and everyone who knows the basics of Python programming is invited to cooperate. References: [1] Bielski K. et al. (2021), NeuroImage 227:117644
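For orientation, the sketch below computes a recurrence matrix and the recurrence rate, one of the RQA measures such a pipeline varies. It works on a raw 1-D series for simplicity; full RQA typically uses delay-embedded state vectors, and the threshold here is an arbitrary illustration.

```python
# Minimal recurrence-quantification sketch (illustrative parameters): build a
# recurrence matrix from a voxel/ROI time series and compute recurrence rate,
# one of the RQA measures the parcellation pipeline varies.
import numpy as np

def recurrence_matrix(ts, eps=0.5):
    """ts: 1-D time series. R[i, j] = 1 when states i and j are close."""
    dist = np.abs(ts[:, None] - ts[None, :])
    return (dist < eps).astype(int)

def recurrence_rate(R):
    n = R.shape[0]
    off_diag = R.sum() - n                  # ignore the trivial diagonal
    return off_diag / (n * n - n)

ts = np.random.randn(200)                   # stand-in for an fMRI time series
R = recurrence_matrix(ts, eps=0.5)
print(f"recurrence rate: {recurrence_rate(R):.3f}")
```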
Automatic artifact detection in EEG using Telepathy
AUTHOR: Marcin Koculak
AFFILIATION: Consciousness Lab, Institute of Psychology, Jagiellonian University
This will be a follow-up to a project from Brainhack Warsaw 2022, where we built a library for EEG analysis from scratch. That library has since been named "Telepathy" and is being actively developed by the author of this project. In this edition I would like to focus on implementing methods for automatic detection of artefactual signal in EEG recordings, along with methods to deal with them. This includes correcting hardware-related issues, rejecting signals coming from non-brain sources, and methods for dimensionality reduction, all of which help to isolate the signal of interest from EEG. Participants will have a chance to better understand sources of noise in EEG data, see what analysis software does under the hood to deal with them, and attempt to implement these methods in a new and dynamically evolving language for scientific computing. If possible, I will try to organise a small demonstration of how EEG signal is collected and where the artefacts might come from.
Pose estimation based long-term behavior assessment of animals in semi-naturalistic conditions
AUTHORS: Konrad Danielewski, Marcin Lipiec, Ewelina Knapska, Alexander Mathis
AFFILIATION: Nencki Institute of Experimental Biology
Using the Eco-HAB system of four cages equipped with RFID antennas (an RFID-tagged group of 12 mice) and DeepLabCut, a SOTA pose estimation framework, we want to create a framework for behavior analysis of these animals. A set of 9 body parts is tracked on each animal (the pose estimation model is ready and working well). Our test recording is 3 hours long, but the goal is for the system to run non-stop for a week. A synchronization script that works based on visual information is needed, as there is no TTL signal from the antennas – camera timestamps are available and inter-frame times are very precise (at the scale of tens of microseconds – a FLIR BFS camera is used). We aim to develop an algorithm that will filter identity based on antenna readout and correct any potential switches. With identity corrected, DLC2Kinematics and DLC2Action, with some additional functions, can be used to assess animal behavior for single animals and pairwise social interactions. The goal would be to combine all of this into a user-friendly framework with a set of high-level functions and thorough documentation that can be open-sourced and used by anyone interested in highly detailed, long-term homecage monitoring and behavior assessment.
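One hedged way to sketch the identity-assignment step is nearest-timestamp matching between antenna reads and video frames. The column names, tolerance, and values below are assumptions for illustration only.

```python
# Sketch of the identity-assignment step: align RFID antenna reads with video
# frames via nearest-timestamp matching. Column names are assumptions.
import pandas as pd

frames = pd.DataFrame({                      # FLIR camera frame timestamps
    "frame": range(5),
    "t": pd.to_datetime([1.00, 1.04, 1.08, 1.12, 1.16], unit="s"),
})
rfid = pd.DataFrame({                        # antenna readouts with mouse IDs
    "t": pd.to_datetime([1.03, 1.11], unit="s"),
    "antenna": [2, 2],
    "mouse_id": ["M07", "M07"],
}).sort_values("t")

# Attach the nearest antenna read to each frame (50 ms tolerance)
synced = pd.merge_asof(frames.sort_values("t"), rfid, on="t",
                       tolerance=pd.Timedelta("50ms"), direction="nearest")
# Frames without a read keep NaN; identity switches in the pose tracking can
# then be corrected wherever the antenna-confirmed ID disagrees.
```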
A paradigm shift in experiment design leading to large scale EEG data acquisition for visual attention
AUTHOR: Shaurjya Mandal
AFFILIATION: Carnegie Mellon University
Visual attention often refers to the cognitive processes that allow the selective processing of visual information that we are exposed to in our daily lives. Gaining an understanding of visual attention can be crucial to a number of applications, like the study of human-computer interaction and the analysis and improvement of advertisements. EEG along with eye-tracking has been popularly used to study visual attention in subjects. In deep learning applications, the model needs to be trained on a significantly large dataset. Although the results obtained with deep learning algorithms tend to be highly accurate, it is always hard to acquire such a dataset. The apparatus required to track eye movements and infer visual attention from the subject's gaze is not portable. Thus, to sync the EEG data with eye-tracking/visual-attention data outside the laboratory setup, we would need an additional approach. To study the same, we ask two important questions:
1. To what extent can multi-channel EEG data provide inference regarding eye movement and eye tracking? And are there any specific experiments that help us to observe the behaviour better?
2. Is there a way to find an alternative to proper eye-tracking that can be crowdsourced?
To answer the above questions, we would start by analysing 3 distinct publicly available datasets.
The EEGEyeNet dataset comprises EEG and eye-tracking data from 350 participants, 190 female and 160 male, between the ages of 18 and 80 years. This dataset provides enough data to correlate the changes in EEG with the movement of the eyes. The eye-tracking experiments with synchronized EEG have been divided into 3 main parts: pro- and antisaccade, large grid, and visual symbol search. Knowing the protocols of the experiment while analysing the data will allow us to gain a better understanding of eye movement from the EEG data. In the previous literature, eye-tracking for visual attention is linked to mouse-tracking. But to allow this, specific protocols have to be adopted that take care of localised visual exposure with minimal distraction. To validate our performance across eye-tracking and mouse-tracking scenarios, we will make use of the OSIE dataset. The OSIE dataset consists of 700 natural images along with publicly available mouse-tracking and eye-tracking data. To train the deep learning model that will allow us to generate labels for visual attention, we will make use of the SALICON dataset. The dataset consists of 10,000 MS COCO training images. By the end of the project, we hope to:
1. Analyse the pretrained models for visual attention with eye tracking and train our own models with the mouse-tracking data and compare the models.
2. Through effective data analysis, reduce the number of channels while preserving the variance of EEG data during eye-tracking experiments (see the sketch after this list).
3. Develop our own custom software which can be used as a convenient means to collect and sync EEG data to visual attention outside laboratory settings.
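As a toy illustration of goal 2, PCA gives a quick handle on how many components are needed to retain a target share of the variance; it stands in for a proper channel-selection method, and the thresholds are arbitrary.

```python
# Toy sketch for goal 2: project multi-channel EEG onto fewer components while
# tracking retained variance (PCA as a simple stand-in for channel selection;
# thresholds are illustrative).
import numpy as np
from sklearn.decomposition import PCA

eeg = np.random.randn(5000, 128)            # samples x channels (stand-in)
pca = PCA(n_components=0.95)                # keep 95% of the variance
reduced = pca.fit_transform(eeg)
print(f"{pca.n_components_} components retain "
      f"{pca.explained_variance_ratio_.sum():.1%} of the variance")
```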
Robust Latent Space Exploration of EEG Signals with Distributed Representations
AUTHOR: Adam Sobieszek
AFFILIATION: University of Warsaw
It is said that the advantage of methods such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) is their learning of latent space representations of the target domain. E.g. training a GAN that generates EEG signals gives us a latent space representation of the type of EEG signals that were in the training set. These representations could moreover be made "disentangled", which uncovers independent dimensions that are features that best describe the target domain. Such methods could give us the ability to describe and better understand the type of information contained in EEG signals. However, for such methods to be widely accepted as scientific tools, we need to overcome the issue that each time we train a neural network, the learnt latent space seems on the surface to be completely different. Recently we investigated the ability of GANs to learn disentangled representations of EEG signals. With our network, we've obtained multiple latent space representations of EEG signals. The goal of this project is to investigate whether such representations hold the same kind of information, possibly showing the robust nature of neural network representation learning, as well as to investigate how such multiple (distributed) representations may be used together. We will implement downstream tasks that latent representations are useful for (such as classification and explainable feature visualisation) and compare how their performance differs between representations. We will investigate whether combining representations may result in an increase in accuracy. We will test methods for learning probabilistic maps between latent spaces, which could prove useful for a wider array of machine learning applications. Finally, we will attempt to find in these distributed representations the common factors that are robustly able to describe EEG signals. If time allows, we will also attempt to train disentangled variational autoencoders (called beta-VAEs) for generating EEG signals and investigate whether the factors discovered by this method are similar to those found with our GAN-based method.
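A hedged sketch of the planned downstream comparison: fit the same classifier on latent codes from two training runs, and on their concatenation, then compare cross-validated accuracy. The arrays are random stand-ins for real latent codes and labels.

```python
# Sketch of the planned downstream comparison: train the same classifier on
# latent codes from two independently trained networks and compare accuracy.
# Arrays are stand-ins for real latent codes and condition labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

n, d = 500, 32
z_run1 = np.random.randn(n, d)              # latents from training run 1
z_run2 = np.random.randn(n, d)              # latents from training run 2
labels = np.random.randint(0, 2, n)         # e.g. experimental condition

for name, z in [("run 1", z_run1), ("run 2", z_run2),
                ("combined", np.hstack([z_run1, z_run2]))]:
    acc = cross_val_score(LogisticRegression(max_iter=1000), z, labels, cv=5)
    print(f"{name}: {acc.mean():.3f} +/- {acc.std():.3f}")
```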
Comparison of brain networks from EEG and fNIRS
AUTHORS: Rosmary Blanco, Cemal Koba, Alessandro Crimi
AFFILIATION: Sano Center for Computational Medicine
The integrated analysis of functional near-infrared spectroscopy (fNIRS) and electroencephalography (EEG) provides complementary information about the electrical and hemodynamic activity of the brain. Evidence supports neurovascular coupling as the mechanism mediating their joint brain processing. However, it is not well understood how the specific empirical patterns of neuronal activity measured by these techniques underlie brain function and networks. Here we have compared the functional networks of the whole brain between synchronous EEG and fNIRS connectomes across frequency bands, using source space analysis. We have shown that both EEG and fNIRS networks have a small-world topology. We have also observed increased interhemispheric connectivity for HbO compared to EEG and HbR, with no differences across the frequency bands. Our results demonstrate that some topological characteristics of the networks captured by fNIRS and EEG are different. Once we understand what their differences and similarities mean and how to interpret them correctly, their complementarity could be useful for clinical application.
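For readers unfamiliar with the small-world claim, one common check compares the clustering coefficient and characteristic path length of the empirical graph against a random graph with the same size. The connectivity matrix and threshold below are toy stand-ins, not the study's data.

```python
# Illustrative check of small-world topology: compare clustering and path
# length of a connectome graph against a size-matched random graph.
import networkx as nx
import numpy as np

rng = np.random.default_rng(0)
conn = rng.random((32, 32)); conn = (conn + conn.T) / 2   # toy connectivity
G = nx.from_numpy_array((conn > 0.8).astype(int))         # threshold to graph
G.remove_edges_from(nx.selfloop_edges(G))

C = nx.average_clustering(G)
L = nx.average_shortest_path_length(G)      # assumes a connected graph

R = nx.gnm_random_graph(G.number_of_nodes(), G.number_of_edges(), seed=0)
C_rand = nx.average_clustering(R)
L_rand = nx.average_shortest_path_length(R)

sigma = (C / C_rand) / (L / L_rand)          # small-world index; > 1 suggests
print(f"sigma = {sigma:.2f}")                # small-world organization
```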
2022
Projects:
Generating Sequences of Rat Poses
AUTHOR: Paweł Pierzchlewicz
AFFILIATION: University of Göttingen & University of Tübingen
The goal of understanding behaviour has been the backbone of research in various areas of neuroscience. Recently, machine learning has provided a set of tools which allows us to study behaviour in previously unimaginable ways. Specifically, deep learning helps us shine some light on high-dimensional data such as behaviour. Particularly interesting for behaviour could be the subfield of generative modelling, where one strives to model and sample from some target distribution. A prominent example of such models are generative adversarial networks (GANs), which have presented impressive results in generating novel entities (faces, cats, memes, etc.). However, their implicit nature makes it harder to intuitively understand the generative process. Thankfully, a different, equally exciting method called Normalising Flows (NFs) allows us to explicitly transform one distribution (e.g. standard normal) into another (e.g. faces) through a series of invertible transformations, providing us with an easier-to-understand generative model.
In this project we will attempt to learn to generate temporally coherent sequences of rat poses based on the Rat7M dataset using NFs. To achieve this we will explore variants of and constraints on the latent space to best capture the distribution of pose sequences. One possible direction would be to constrain the latent space such that following a linear path between two points generates a coherent “pose movie” between the two corresponding poses. Finally, we will analyse the learned latent space to show that NFs can serve as a powerful analysis tool for behavioural neuroscience researchers. Meaningful structure is expected to emerge in the latent space, indicating clusters of actions, temporal similarity of poses, or some other interpretable object.
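The invertibility that makes NFs attractive is easiest to see in their standard building block, the affine coupling layer. The sketch below uses toy dimensions; the actual pose model would also condition on time to obtain coherent sequences.

```python
# Minimal affine-coupling block, the building block of RealNVP-style
# normalising flows (toy dimensions; the actual pose model would condition on
# time to get coherent sequences).
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(nn.Linear(self.half, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * (dim - self.half)))

    def forward(self, x):                    # x -> z, with log|det J|
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=-1)
        z2 = x2 * torch.exp(s) + t           # invertible affine transform
        return torch.cat([x1, z2], dim=-1), s.sum(dim=-1)

    def inverse(self, z):                    # z -> x (exact, by construction)
        z1, z2 = z[:, :self.half], z[:, self.half:]
        s, t = self.net(z1).chunk(2, dim=-1)
        return torch.cat([z1, (z2 - t) * torch.exp(-s)], dim=-1)

pose = torch.randn(4, 20 * 3)                # 20 keypoints x 3-D, flattened
layer = AffineCoupling(60)
z, logdet = layer(pose)
assert torch.allclose(layer.inverse(z), pose, atol=1e-4)
```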
Talking with Machine Learning models using chatbots
AUTHOR: Michał Kuźba
Machine Learning models are often black boxes. This means their predictions are hard to interpret and trust. However, there's ongoing research in the area of interpretable/explainable Machine Learning.
Why don’t we talk to the model to ask and understand more about the decisions it makes? We can use an existing chatbot framework such as Dialogflow, plug in a blackbox ML model and interact with the model in a conversational way.
Imagine interrogating the ML model about the biased, wrong, or weird decisions it makes.
What can we do?
- Train some "controversial" ML models (medical, COVID, financial, legal, including biases, etc.)
- Create a dialogue system as an interface for the model
- Work on explaining model decisions (see the sketch after this list)
- Discover questions to ask the ML models
- Deploy our chatbot to a larger audience
- Anything that comes to your mind and we might do using a chatbot as an interface for the Machine Learning model (All ideas are welcome!)
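As a hedged sketch of the "explain a decision" intent, a chatbot webhook could call something like the snippet below and phrase the top SHAP contributions as a natural-language reply. The model, features, and wording are toy stand-ins, not the planned Dialogflow integration.

```python
# Sketch of the "explain a decision" intent behind the chatbot: a webhook
# could call something like this and phrase the top SHAP features as a reply.
# Model and features are toy stand-ins.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

X = np.random.rand(200, 4)
y = X[:, 0] * 3 - X[:, 2] + np.random.normal(0, 0.1, 200)
feature_names = ["age", "income", "debt", "tenure"]

model = RandomForestRegressor(n_estimators=50).fit(X, y)
explainer = shap.TreeExplainer(model)
sv = explainer.shap_values(X[:1])[0]         # contributions for one case

top = np.argsort(-np.abs(sv))[:2]
reply = "The prediction was driven mainly by " + " and ".join(
    f"{feature_names[i]} ({sv[i]:+.2f})" for i in top)
print(reply)                                 # chatbot's natural-language answer
```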
Learning resting-state EEG data analysis through software development
AUTHOR: Marcin Koculak
AFFILIATION: C-Lab, Institute of Psychology, Jagiellonian University
Electroencephalography (EEG) is a popular method to investigate how our brains work and process information. Most researchers rely on software written by others to do their analyses, following example pipelines from the published literature. This creates a situation where users have little practical knowledge about what is actually happening with their data and how choices made at each step of the analysis impact the final outcome. This is especially noticeable when using proprietary software, where source code is unavailable and analysis options are usually limited. Open-source projects can be easily inspected and the code tweaked, but documentation rarely provides implementation details or compares similar functions across different software. The main goal of this project is to help participants understand what EEG data represents, how it is collected, and how it can be analysed. We will take a mostly practical approach and acquire the knowledge through coding our own software. This will force us to understand every step of the process, so we can arrive at the proper result at the end. We will focus mainly on the most common preprocessing steps, but we should have a working pipeline for a simple analysis at the end of the hackathon. We will be programming in Julia – a relatively new language that draws inspiration from C, Matlab, and Python, with a strong focus on scientific computing. The project assumes participants have no experience in Julia, but familiarizing yourself with it before the brainhack or having experience with other languages (especially Python and Matlab) will definitely help. The author of the project will also try to arrange EEG equipment, so participants will be able to analyse their own data.
Isometric Latent Space Representation of EEG signals for bootstrapping and classification
AUTHOR: Adam Sobieszek
AFFILIATION: MISMaP, University of Warsaw
We will train a generative adversarial network (GAN) in order to construct a latent representation of EEG signals from some domain (a representation, where one point in a vector space corresponds to one EEG signal the network can generate). The dataset we’ll use will either be data from experiments on emotional word processing of K. Imbir, J. Żygierewicz, myself and others, or some other publicly available dataset, e.g. of BCI data. The goal of the project is to develop a GAN capable of generating EEG signals similar to a given dataset of EEG samples, that learns a latent space representation of those signals that is isometric to the output domain. This means that a distance in latent space between two points representing two signals approximately corresponds to a measure of distance between these two signals. This is useful as it (a) adds smoothness to the representation, such that signals that are similar correspond to points that are near each other, (b) directions in latent space start to correspond to useful features of the signals, which makes classification much easier, (c) you can use such a latent space to generate, for example, a typical signal from some category, or easily bootstrap new signals similar to a set of signals (which can be used, for example, in data augmentation or bootstrap statistical tests).
We will code and train this network, design a distance metric (in order to design a regularization term, based on path-length regularization, that makes the latent space isometric) and perform a preliminary investigation of the usefulness of this latent representation of EEG signals for bootstrapping and classification.
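As an intuition pump for the regularization term, the toy penalty below encourages a uniform scaling between latent distances and generated-signal distances for random pairs. It is a simplification of path-length-style regularization, not the project's actual design.

```python
# Toy version of an isometry-encouraging regulariser: penalise the mismatch
# between latent distances and generated-signal distances for random pairs.
# This is a simplification of path-length-style regularisation, for intuition.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(16, 128), nn.ReLU(),
                          nn.Linear(128, 256))     # latent -> EEG-like signal

def isometry_penalty(generator, batch=32, eps=1e-8):
    za, zb = torch.randn(batch, 16), torch.randn(batch, 16)
    d_latent = (za - zb).norm(dim=1)
    d_signal = (generator(za) - generator(zb)).norm(dim=1)
    ratio = d_signal / (d_latent + eps)
    return ((ratio - ratio.mean()) ** 2).mean()    # keep scaling uniform

penalty = isometry_penalty(generator)
# total_loss = gan_loss + lam * penalty   # lam: regularisation weight
penalty.backward()
```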
Workflow for automated classification of sMRI images of psychiatric disorders using neural networks
AUTHOR: Sara Khalil
AFFILIATION: Faculty of Life Sciences, University of Bradford
Despite advances in medicine, there are still no objective diagnostic methods for psychiatric disorders; diagnosis depends mainly on the patients' subjective description of their condition. Among different neuroimaging techniques, structural MRI is considered the most convenient method for the diagnosis of different psychiatric disorders because it is widely available and less biased. The use of deep learning for the diagnosis of psychiatric disorders has been widely explored; in this study, we will test the possibility of classifying different disorders using neural networks. We will develop a workflow for sMRI images that automatically performs the processing, segmentation into gray and white matter using FSL, conversion into video, classification using ResNet, and identification of significant regions of the brain. This workflow will be deployed as a website to which psychiatrists can upload sMRI images and get the most probable diagnosis. The model will undergo continuous updates. We will use data from the OpenNeuro dataset (https://openneuro.org/datasets/ds000030/versions/1.0.0); other datasets will be discussed with participants.
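As a hedged sketch of the classification step only, a torchvision ResNet can be adapted to single-channel sMRI slices with a diagnosis head. The number of classes, input shape, and preprocessing are illustrative assumptions; the FSL segmentation and video conversion are outside this snippet.

```python
# Hedged sketch of the classification step only: a torchvision ResNet adapted
# to single-channel sMRI slices with a diagnosis head. Preprocessing with FSL
# and the video conversion described above are outside this snippet.
import torch
import torch.nn as nn
from torchvision.models import resnet18

n_classes = 4                                # e.g. control + 3 diagnoses
model = resnet18(weights=None)
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.fc = nn.Linear(model.fc.in_features, n_classes)

slices = torch.rand(8, 1, 224, 224)          # batch of normalized sMRI slices
logits = model(slices)
probs = logits.softmax(dim=-1)               # per-diagnosis probabilities
```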
Artificial intelligence-based techniques for neglect identification
AUTHOR: Benedetta Franceschiello, PhD
AFFILIATION: CIBM Center for Biomedical Imaging, EEG CHUV-UNIL Section.
Background and Objective: Eye-movement trajectories are rich behavioral data, providing a window on how the brain processes information. We address the challenge of characterizing signs of visuo-spatial neglect from saccadic eye trajectories recorded in brain-damaged patients with spatial neglect, as well as in healthy controls, during a visual search task. Methods: In a previous study, we established a standardized pre-processing pipeline adaptable to other task-based eye-tracker measurements. By using a convolutional neural network, we automatically analysed 1-dimensional eye trajectories (x-projections) and found that we could classify brain-damaged patients vs. healthy individuals with an accuracy of 86±5%. Moreover, the algorithm scores correlate with the degree of severity of neglect signs estimated with standardized paper-and-pencil tests and with white matter tract impairment measured via Diffusion Tensor Imaging (DTI). Interestingly, the latter showed a clear correlation with the third branch of the superior longitudinal fasciculus (SLF), which is especially damaged in neglect. Data are already pre-processed in a standardised fashion and ready to be analysed. Aim: The purpose of this project is to extend these analyses from 1D trajectories (x-projections) to 2D images, i.e. by representing the eye-tracking trajectories in 2D. The goal is to verify whether adding one dimension and applying recent computer vision techniques would yield greater sensitivity and specificity than we have at present. Furthermore, we would like to investigate the neural mechanisms underlying the results.
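As a minimal illustration of the proposed 1D-to-2D step, an (x, y) gaze trajectory can be rasterized into an image suitable for a CNN. Resolution, screen size, and the synthetic scanpath below are illustrative assumptions.

```python
# Sketch of the proposed 1-D -> 2-D step: rasterise an (x, y) gaze trajectory
# into an image that a CNN can consume. Resolution and screen size are
# illustrative assumptions.
import numpy as np

def trajectory_to_image(x, y, screen=(1280, 1024), size=64):
    """Accumulate gaze samples into a size x size heat-map image."""
    img, _, _ = np.histogram2d(
        y, x, bins=size, range=[[0, screen[1]], [0, screen[0]]])
    return img / max(img.max(), 1)           # normalize to [0, 1]

t = np.linspace(0, 4 * np.pi, 2000)          # synthetic search-like scanpath
x = 640 + 400 * np.cos(t) + np.random.normal(0, 20, t.size)
y = 512 + 300 * np.sin(2 * t) + np.random.normal(0, 20, t.size)
image = trajectory_to_image(x, y)            # input for a 2-D CNN classifier
```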
