University of Gothenburg

Method development in biomedical research

One of our goals is to contribute to the development of new techniques. Here are some examples of collaborations focused on method development that were funded by the Swedish Foundation for Strategic Research (RIF14-008), granted 2016-2020.

Research projects

Bipolar disorder: Omic-data integration

BCF contributor(s): Katarina Truvé. In collaboration with Keiko Funa (UGoT)

Identification of causative genetic variants leading to the development of bipolar disorder (BD) could result in genetic tests that would facilitate diagnosis. A better understanding of affected genes and pathways is also necessary for targeting genes in ways that may improve treatment strategies. To date, several susceptibility genes have been reported from genome-wide association studies (GWAS), but little is known about the specific variants that affect disease development. Here, we performed quantitative proteomics and whole-genome sequencing (WGS). Quantitative proteomics revealed NLRP2 as the most significantly upregulated protein in neural stem cells and mature neural cells obtained from BD-patient cell samples. These results are in concordance with our previously published transcriptome analysis. Furthermore, the levels of FEZ2 and CADM2 proteins also differed significantly between BD-derived and control-derived cells. FEZ2 was significantly downregulated in neural stem cells (NSC), while CADM2 was significantly upregulated in mature neuronal cell culture. Promising novel candidate mutations were identified in the ANK3, NEK3, NEK7, TUBB, ANKRD1, and BRD2 genes.


A modelling approach for Bioinformatics Workflows

BCF collaborator(s): Marcela Dávila. In collaboration with Francisco Gomes, Jennifer Horkoff and Alexander Schliep, Chalmers University of Technology

Bioinformaticians frequently execute complex, manual and semi-scripted workflows to process data. There are many tools to manage and conduct these workflows, but there is no domain-specific way to document them textually and diagrammatically. Consequently, we created methods for modeling bioinformatics workflows. Specifically, we extended the Unified Modeling Language (UML) Activity Diagram to the bioinformatics domain by including domain-specific concepts and notations. Additionally, a template was created to document the same concepts in a text format. A design science methodology was followed, where four iterations with seven domain experts tailored the artefacts, extending concepts and improving usability, terminology, and notations. The UML extension received a positive evaluation from bioinformaticians. However, the written template was rejected due to the amount of text and its complexity.

Gallbladder cancer profiling

BCF contributor(s): Sanna Abrahamsson and Marcela Dávila. In collaboration with Justo Bermejo (Heidelberg University)

Research on gallbladder cancer (GBC) has been largely neglected and molecular GBC data is underrepresented in public databases. Cancer cell lines constitute a valuable tool to examine the mechanisms of malignant transformation and identify potential therapeutic targets. Here we use RNA sequencing to characterize 23 commercial hepatobiliary cancer cell lines, including ten GBC cell lines, and provide detailed mutation and gene expression data to the research community. We illustrate the practical utility of the released information by (1) assessing the presence of specific mutations in the investigated cancer cell lines, (2) comparing global gene expression patterns in cell lines and primary biliary tumours and (3) examining the expression levels of specific genes. The released data and showcase applications will ease the design of in vitro cell culture assays for future studies.


Molecular regulation of hemostasis in human liver

BCF contributor(s): Marcela Dávila. In collaboration with Christina Jern (UGoT)

Worldwide, stroke is the most common cause of disability and the second most common cause of death (GBD, 2020). Blood clot formation is a key mechanistic event in ischemic stroke and is regulated by platelets and hemostatic factors, many of which are produced in the liver. Increased levels of circulating coagulation factors can lead to thrombus formation (e.g. ischemic stroke), and reduced levels can lead to bleeding. Understanding the genetic basis underlying hepatic hemostatic gene expression variability may reveal genetic variants that are important for thrombotic and/or bleeding disorders. Several recent publications have described novel alternatively spliced isoforms of genes involved in hemostasis (Duval et al, 2017; Rehman et al, 2019; Nuzzo et al, 2015; Paraboschi et al, 2019; Liang et al, 2015; Odaira et al, 2019). Dysregulation of alternative splicing is implicated in many diseases. However, alternative splicing in the majority of hemostatic genes remains poorly characterized.

Tool development

Radiotherapy: Biomarker discovery using machine learning approaches

BCF contributor(s): Peidi Liu, Erik Lorentzen and Björn Andersson. In collaboration with Britta Langen (UGoT)

Tumor radiotherapy and basic radiation research rely on an accurate relation between absorbed dose and the therapeutic/biological effect after irradiation. In practice, the dose-response relationship is difficult to determine due to various factors. Miscorrelation of the dose-response can cause severe issues, such as under-treatment of cancers (and thus disease progression) or risk exposure of healthy tissue leading to secondary diseases. A novel approach for improving the dose-response correlation is to identify molecular biomarkers (genes, proteins, or other products) whose relative abundance would accurately represent irradiation and biological effect, such as absorbed dose, detrimental effects in healthy tissue, or tumor regression. The purpose of this project is to develop a machine learning tool based on omics data for biomarker discovery in radiation research.

InVi: Integration and visualization of genomic data

BCF contributor(s): Luciano Fernández and Marcela Dávila. In collaboration with Christina Jern (UGoT)

Advanced visualization of genomic data is vital to allow researchers to explore and understand the complexities of their experimental data or large-scale datasets. Complex data visualization techniques exist today, but their nature makes them difficult to use. A common process involves processing genomics files (e.g. gene expression profiles) so that they can be used by visualization software like Circos. Circos is a very powerful tool that allows for the creation and customization of sophisticated circular layouts. This process involves the creation of complex configuration files, amongst other steps. The creation of such visualizations is time-consuming and repetitive. Furthermore, if a specialist produces these displays on demand, constant input from the user is required to adjust the views to fit their requirements. To facilitate the exploration and creation of advanced genomics visualizations and support knowledge discovery, we developed the software InVi (Integration and Visualization of Genomic Data) and CiGUI (Circos Graphic User Interface), aimed at facilitating the analysis of genomic data and its visualization using circular displays.

P-PSY-Finder: Novel processed pseudogenes in colorectal cancer

BCF contributor(s): Sanna Abrahamsson and Marcela Dávila. In collaboration with Anna Rohlin (Laboratory Medicine at UGoT)

Processed pseudogenes (PΨgs) are disabled gene copies that are transcribed and may affect expression of paralogous genes. Moreover, their insertion in the genome can disrupt the structure or the regulatory region of a gene, affecting its transcription. These insertion events have been identified as mutations occurring during cancer development; being able to identify processed pseudogenes and their location will therefore improve somatic mutation testing in the clinical setting. PΨFinder is a tool that can automatically predict novel PΨgs from DNA sequencing data and determine their location in the genome with high accuracy. It generates high-quality figures and tables that aid in the interpretation of the results and guide the experimental validation. PΨFinder complements any mutational screening in the identification of disease-causing mutations in cancer and other diseases.


TC-Hunter: Transgenic insertion sites identification

BCF contributor(s): Vanja Börjesson and Marcela Dávila. In collaboration with Jelena Milosevic (Karolinska Institutet)

Genetically manipulated animal models are considered essential for studying gene function in whole animals. Today, thanks to whole-genome sequencing data, we are able to identify the exact insertion site of the construct in the host. TC-Hunter (Transgenic Construct Hunter tool) is an algorithm that takes aligned paired-end data and extracts candidate positions where the transgenic construct may have been incorporated.
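
As an illustration only, one common way to extract candidate insertion positions from aligned paired-end data is to look for read pairs in which one mate aligns to the construct and the other to the host genome. The sketch below assumes this strategy; it is not taken from TC-Hunter itself, and the `Pair` record, the `candidate_sites` function and the `bin_size` and `min_support` parameters are hypothetical names introduced for the example.

```python
from collections import Counter
from typing import NamedTuple


class Pair(NamedTuple):
    """Hypothetical, simplified record of one aligned read pair."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def candidate_sites(pairs, construct_name, bin_size=500, min_support=2):
    """Count host positions supported by construct/host read pairs.

    Assumption: a pair with exactly one mate on the construct hints at an
    insertion near the host-side mate. Host positions are grouped into
    bins of `bin_size` bases to pool nearby evidence.
    """
    support = Counter()
    for pair in pairs:
        first_on_construct = pair.chrom1 == construct_name
        second_on_construct = pair.chrom2 == construct_name
        if first_on_construct == second_on_construct:
            continue  # both mates on the host, or both on the construct
        if first_on_construct:
            host_chrom, host_pos = pair.chrom2, pair.pos2
        else:
            host_chrom, host_pos = pair.chrom1, pair.pos1
        bin_start = (host_pos // bin_size) * bin_size
        support[(host_chrom, bin_start)] += 1
    # Keep only bins with enough supporting pairs.
    return sorted((site, n) for site, n in support.items() if n >= min_support)
```

In practice the pairs would come from an alignment file, and soft-clipped reads spanning the junction would be needed to narrow a candidate bin down to an exact position.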

IONISER: Searching for glycoproteins

BCF contributor(s): Dagmara Gotlib. In collaboration with the Proteomics Core Facility

The characterization of glycosylated proteins is a challenging task in the proteomics field, as they commonly occur as multiple glycoforms. The Ioniser assists in identifying potentially novel glycoforms without the need for prior knowledge of the existing glycostructures for a given peptide. It processes and filters large amounts of mass-to-charge ratio (m/z) and abundance data, allowing the user to identify additional glycosylated proteins based on user-specified parameters.

REAPER: A light-weight file monitor

BCF contributor(s): Dagmara Gotlib. In collaboration with the Proteomics Core Facility

When performing mass spectrometry analyses, large amounts of data are produced. As the computer that performs the analysis has limited storage, it is of great interest to move the files to other storage as soon as possible. The Reaper monitors a specific directory where the files are created and updated, and checks for changes in that directory at a user-defined time interval. When a file has not changed in size by the third check, it is copied to the appropriate location. The destination directories are named after the year and month of the file (YYMM) and can be prefixed with a user-defined string to further aid in the organisation of the files. It is also possible to restrict the monitoring to certain file types. A log detailing the copying and a SHA-256 verification of the files is also created, with back-ups on a weekly basis.
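
The polling logic described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Reaper's actual code: the function names (`scan_once`, `monitor`, `destination_dir`, `sha256_of`) are hypothetical, the "year and month of the file" is assumed to mean its modification time, "the third check" is read as three consecutive checks with the same size, and logging and weekly back-ups are left out.

```python
import hashlib
import shutil
import time
from datetime import datetime
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def destination_dir(root: Path, src: Path, prefix: str = "") -> Path:
    """Build '<prefix>YYMM' from the file's modification time (assumption)."""
    stamp = datetime.fromtimestamp(src.stat().st_mtime).strftime("%y%m")
    return root / f"{prefix}{stamp}"


def scan_once(watch_dir, dest_root, state, prefix="", suffixes=None,
              stable_checks=3):
    """Run one polling pass and copy files whose size has stopped changing.

    `state` maps each path to (last_size, consecutive_checks_at_that_size)
    and must be kept between passes. Returns (name, target, verified) tuples.
    """
    copied = []
    for src in sorted(Path(watch_dir).iterdir()):
        if not src.is_file():
            continue
        if suffixes and src.suffix not in suffixes:
            continue  # optional filter by file type
        size = src.stat().st_size
        last_size, seen = state.get(src, (None, 0))
        seen = seen + 1 if size == last_size else 1
        state[src] = (size, seen)
        if seen == stable_checks:
            target_dir = destination_dir(Path(dest_root), src, prefix)
            target_dir.mkdir(parents=True, exist_ok=True)
            target = target_dir / src.name
            shutil.copy2(src, target)  # copy2 keeps the modification time
            verified = sha256_of(src) == sha256_of(target)
            copied.append((src.name, str(target), verified))
    return copied


def monitor(watch_dir, dest_root, interval_seconds, **options):
    """Poll forever at the user-defined interval."""
    state = {}
    while True:
        scan_once(watch_dir, dest_root, state, **options)
        time.sleep(interval_seconds)
```

Because the copy is triggered only when the counter equals `stable_checks`, a file is copied once per stable period; if it later grows again, the counter resets and the file is copied again after three further unchanged checks.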

Tool evaluation

Benchmarking five pseudotime analysis tools for snRNAseq data

BCF contributor(s): Vanja Börjesson. In collaboration with Malin Johansson (UGoT)

We have benchmarked five tools for pseudotime analysis: Monocle2, Monocle3, Slicer, Destiny and Scanpy. We compared their time efficiency and memory usage, the control they offer over filtering and normalization, the layout of the output figures and the installation process. All five tools predicted the same trajectory for our test dataset. All tools were easy to install and use. Slicer was by far the most time-consuming tool. Monocle and Scanpy implement different algorithms for trajectory development, filtering and normalization of the data, and offer many options for visualizing differentially expressed genes.

Batch effect in proteomic data

BCF contributor(s): Jari Martikainen and Marcela Dávila. In collaboration with the Proteomics Core Facility

Batch effects can occur when subsets of a dataset are collected in a way that is systematically different from the way in which other subsets are collected. These are innate differences between batches that may arise from the time, place, calibration of instruments or persons doing the collection. Batch effects can cause confounding if the treatments are in some way related to the batches. This can happen if all of one treatment is processed in one batch and all of a different treatment is in another batch. One way to avoid batch effects is by balancing treatments within batches. Here we tested different batch effect correction packages.
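
The idea of balancing treatments within batches, and of checking a design for confounding, can be illustrated with a small sketch. It is only an example of the principle and is unrelated to the correction packages that were tested; the function names `assign_balanced`, `crosstab` and `fully_confounded` are hypothetical.

```python
import random
from collections import Counter


def assign_balanced(samples, n_batches, seed=0):
    """Spread each treatment as evenly as possible over the batches.

    `samples` is a list of (sample_id, treatment) tuples. Samples are
    shuffled within each treatment, then dealt out round-robin.
    Returns a dict mapping sample_id to batch index.
    """
    rng = random.Random(seed)
    by_treatment = {}
    for sample_id, treatment in samples:
        by_treatment.setdefault(treatment, []).append(sample_id)
    batch_of = {}
    slot = 0
    for treatment in sorted(by_treatment):
        ids = by_treatment[treatment]
        rng.shuffle(ids)
        for sample_id in ids:
            batch_of[sample_id] = slot % n_batches
            slot += 1
    return batch_of


def crosstab(batch_of, treatment_of):
    """Count samples per (batch, treatment) combination."""
    return Counter((batch, treatment_of[sid]) for sid, batch in batch_of.items())


def fully_confounded(batch_of, treatment_of):
    """True if no batch mixes treatments although several treatments exist."""
    treatments_in_batch = {}
    for sample_id, batch in batch_of.items():
        treatments_in_batch.setdefault(batch, set()).add(treatment_of[sample_id])
    n_treatments = len(set(treatment_of.values()))
    return n_treatments > 1 and all(
        len(found) == 1 for found in treatments_in_batch.values()
    )
```

With four control and four treated samples and two batches, `assign_balanced` places two of each treatment in every batch, whereas a design that runs all controls in one batch and all treated samples in the other is flagged by `fully_confounded`.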

Pathway analysis software evaluation

BCF contributor(s): Annelie Angerfors

Omics data can be analyzed in several ways. One common way is expression analysis, often via Ingenuity Pathway Analysis (IPA), which is an all-in-one application. IPA is the main application used by the Bioinformatics Core Facility (BCF) for requested expression analyses on omics data, mainly with regard to fold change, p-value and FDR. Several pathway analysis applications are emerging; some are web-based and some are downloadable software, and the different applications are based on somewhat different pathway collections and types of analyses. Here we compared IPA with four other applications: the software Cytoscape and the web-based Reactome, DAVID and Enrichr.


Online education during the COVID-19 pandemic

BCF contributor(s): Marcela Dávila and Sanna Abrahamsson

Due to the worldwide COVID-19 pandemic, new strategies had to be adopted to move from classroom-based education to online education in a very short time. The lack of time to set up these strategies hindered a proper design of online instruction and delivery of knowledge. Onsite practical education, including bioinformatics-related training, tends to rely on extensive practice, where students and instructors interact face-to-face to improve the learning outcome. For these courses to maintain their high quality when adapted as online courses, different designs need to be tested and the students’ perceptions need to be heard. This study focuses on short bioinformatics-related courses for graduate students at the University of Gothenburg, Sweden, which were originally developed for onsite training. Once they had been adapted as online courses, several modifications in their design were tested to obtain the best-fitting learning strategy for the students. To improve the online learning experience, we propose a combination of: 1) short synchronized sessions, 2) extended time for individual and group practical work, 3) recorded live lectures and 4) increased opportunities for feedback in several formats.