Detailed feature profile of MapToCleave processed and unprocessed miRNA precursors

Wenjing Kang, Bastian Fromm, Inna Biryukova & Marc Friedländer
This is the Supplemental Data 6 of the MapToCleave study.
The file folder contains detailed feature information of MapToCleave processed and unprocessed miRNA precursors. There are seven feature files for each miRNA precursor. The file with the suffix “.HEK.statistics” provides basic information on the hairpin. The file with the suffix “.fold” provides RNA secondary structure in dot bracket format predicted by RNAfold. The file with the suffix “.str” provides printed RNA secondary structure in txt...
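As an illustration of the dot-bracket notation used in the ".fold" files, the sketch below converts a structure string into a list of paired positions. It assumes the structure string has already been extracted from the file; the exact layout of the ".fold" files is not described here, and the function name `dot_bracket_pairs` is hypothetical.

```python
def dot_bracket_pairs(structure: str) -> list[tuple[int, int]]:
    """Return 0-based (i, j) index pairs for every matched '(' and ')'.

    Dots mark unpaired positions. Any other character is rejected, so
    extended notations (e.g. pseudoknot brackets) are not handled here.
    """
    open_positions = []  # stack of indices of unmatched '('
    pairs = []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(index)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {index}")
            pairs.append((open_positions.pop(), index))
        elif symbol != ".":
            raise ValueError(f"unexpected character {symbol!r} at position {index}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


# Toy hairpin-like structure (not taken from the dataset).
example_pairs = dot_bracket_pairs("((..((...))..))")
```

A stack is used because brackets in this notation are properly nested, so each closing bracket pairs with the most recent unmatched opening bracket.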

WGS in familial chronic lymphocytic leukemia

Viktor Ljungström & Richard Rosenquist Brandell
Data Set Description

This repository contains data from a study of three families in which two or more siblings developed chronic lymphocytic leukemia (CLL). Informed consent was provided in accordance with the Declaration of Helsinki and the study was approved by the hospital medical ethics committee (METC2015-741).
The data consists of BAM-files from whole-genome sequencing (WGS) of nine individuals with data from tumor and matched normal tissue. WGS libraries were prepared using the TruSeq Nano Kit (Illumina...

Data for 'RecA finds homologous DNA by reduced dimensionality search'

Jakub Wiktor, Arvid Heden Gynnå, Prune Leroy, Jimmy Larsson, Giovanna Coceano, Ilaria Testa & Johan Elf
### Dataset description
The data here is provided to support the publication 'RecA finds homologous DNA by reduced dimensionality search'. The supporting data is largely of two types: (1) Image sequences from automated widefield microscopy of live cells in a microfluidic device, and (2) STED superresolution images of fixed and immunostained cells.

For automated widefield microscopy, the 'data' folder contains the raw microscopy images, together with the output from the image processing pipeline - stabilised and...

Data and code for: Sequence specificity in DNA binding is mainly governed by association

Emil Marklund, Guanzhong Mao, Jinwen Yuan, Spartak Zikrin, Eldar Abdurakhmanov, Sebastian Deindl & Johan Elf
Sequence-specific binding of proteins to DNA is essential for accessing genetic information. Here, we derive a simple equation for target-site recognition, which uncovers a previously unrecognized coupling between the macroscopic association and dissociation rates of the searching protein. Importantly, this relationship makes it possible to recover the relevant microscopic rates from experimentally determined macroscopic ones. This post contains kinetic data for the lac repressor and code to analyse and interpret the data in light of...

Single cell whole genome sequencing of high hyperdiploid acute lymphoblastic leukemia

Eleanor Woodward, Minjun Yang, Larissa Helena Moura-Castro, Hilda van den Bos, Rebeqa Gunnarsson, Linda Olsson Arvidsson, Diana C. J. Spierings, Anders Castor, Nicolas Duployez, Marketa Zaliova, Jan Zuna, Bertil Johansson, Floris Foijer & Kajsa Paulsson
This dataset was collected from viable bone marrow cells obtained at diagnosis from nine patients with high hyperdiploid ALL and one normal bone marrow sample. All samples were subjected to low-pass single cell whole genome sequencing with a median sequencing coverage of 0.02x. Single nuclei in G0/G1 phase were isolated using a fluorescence-activated cell sorting (FACS) cytometer. DNA libraries were constructed and associated next-generation sequencing was carried out by European Research Institute for the...

Data and most relevant results for the FoldDock project

Patrick Bryant, Gabriele Pozzati & Arne Elofsson
Data and main results for the study "Improved prediction of protein-protein interactions using AlphaFold2 and extended multiple-sequence alignments."
Contained datasets consist of:
1 - 219 heterodimers from dockground benchmark 4 dataset
2 - 1503 heterodimeric structures from a recent study (Green, A. G. et al. Nat. Commun. 12, 1–12 (2021))
3 - 7 heterodimeric complexes from CASP14
4 - 8 novel heterodimeric complexes deposited in the PDB database after 15 June 2021
For each one of the mentioned datasets it...

Morphological profiling of environmental chemical combinations - images, pipelines, and features

Jonne Rietdijk, Tanya Aggarwal, Polina Georgiev, Maris Lapins, Jordi Carreras-Puigvert & Ola Spjuth
Abstract: Environmental chemicals are commonly studied one at a time, and there is a need to advance our understanding of the effect of exposure to their combinations. Here we apply high-content microscopy imaging of cells stained with multiplexed dyes (Cell Painting) to profile the effects of Cetyltrimethylammonium bromide (CTAB), Bisphenol A (BPA), and Dibutyltin dilaurate (DBTDL) exposure on four human cell lines; both individually and in all combinations. We show that morphological features can...

Single base substitution mutational signatures in pediatric acute myeloid leukemia based on whole genome sequencing

Rebeqa Gunnarsson, Minjun Yang, Linda Olsson Arvidsson, Andrea Biloglav, Mikael Behrendtz, Anders Castor, Kajsa Paulsson & Bertil Johansson
This dataset includes whole genome sequencing (WGS) data of 20 diagnostic and 20 remission samples from 20 children/adolescents with acute myeloid leukemia (AML), treated at the Departments of Pediatrics at Lund and Linköping University Hospitals between 1994 and 2016. The median age of the patients was 8 years (range 0-17 years) and the female/male ratio was 1:1. DNA was extracted from diagnostic bone marrow (BM; n = 17)/peripheral blood (PB; n = 3) samples, remission...

Modelling of Large Protein Complexes

Patrick Bryant & Arne Elofsson
AlphaFold and AlphaFold-multimer can predict the structure of single- and multiple chain proteins with very high accuracy. However, predicting protein complexes with more than a handful of chains is still unfeasible, as the accuracy rapidly decreases with the number of chains and the protein size is limited by the memory on a GPU. Nevertheless, it might be possible to predict the structure of large complexes starting from predictions of subcomponents. Here, we take a graph...

Data and code related to "Direct measurements of mRNA translation kinetics in living cells", by Metelev et al.

Mikhail Metelev, Erik Lundin, Ivan L. Volkov, Arvid Heden Gynnå, Johan Elf & Magnus Johansson
This repository contains microscopy data and analysis code related to the publication: Metelev et al., "Direct measurements of mRNA translation kinetics in living cells".
The study includes single-particle tracking of ribosomal subunits in live E. coli cells, for analysis of translation kinetics under different conditions.
A detailed list of files and file organization can be found in readme.pdf.

WGS for patient specific MRD in pediatric ALL

Cecilia Arthur, Fatemah Rezayee, Nina Mogensen, Leonie Saft, Richard Rosenquist Brandell, Magnus Nordenskjöld, Arja Harila-Saari, Emma Tham & Gisela Barbany
Data from a study of six children with acute lymphoblastic leukemia (ALL): four patients with precursor B-cell ALL and two patients with T-cell ALL. None had stratifying genetics or central nervous system involvement. The study was approved by the Ethical Review Board at Stockholm County and written informed consent was obtained from the patients’ guardians.
The data consists of BAM-files from whole-genome sequencing (WGS) of diagnostic bone marrow samples (30X coverage on...


Charlotte Thålin, Sebastian Havervall, Ulrika Marking, Nina Greilert-Norin, Kim Blom, Max Gordon, Jonas Klingström, Peter Nilsson, Sophia Hober, Mia Phillipson, Sara Mangsbo & Mikael Åberg
The COMMUNITY pandemic surveillance cohort is a longitudinal cohort study including 2149 healthcare workers and 118 COVID-19 patients. Dataset includes: 1. Serological data at baseline April-May 2020 and at follow-up every four months (ongoing). 2. Data on memory T cell responses. 3. Register data from Swedish vaccination register (VAL Vaccinera) and national communicable diseases register SmiNet (Public Health Agency of Sweden). 4. Self-reported symptoms compatible with COVID-19 since 1 January 2020, occupation, work location and...

Data: Gene expression profile in peripheral B-lymphocytes and sera of individuals with fibromyalgia

Joakim Klar
Overview: Fibromyalgia (FM) is an idiopathic chronic disease characterized by widespread musculoskeletal pain, hyperalgesia and allodynia, often accompanied by fatigue, cognitive dysfunction and other symptoms. We investigated the gene expression profile in peripheral B-cells in FM patients and healthy matched control individuals.
Summary: The uploaded data consist of a count table and sample information and can be used for gene expression analysis of patients and controls.
Generation of Data: A total of 100 ng...

Data from: Peacefully extreme - loss of combative strategies in the unstressed, sugar-xerophilic mould, Xeromyces bisporus

Su-Lin L. Leong, Henrik Lantz, Olga Vinnere-Pettersson, Jens C. Frisvad, Ulf Thrane, Hermann J. Heipieper, Jan Dijksterhuis, Manfred Grabherr, Mats Pettersson, Christian Tellgren-Roth & Johan Schnurer
Annotation in GFF3-format for the draft genome of Xeromyces bisporus. Genome available for download at EMBL, accession nr. PRJEB6149.

Data from: Function of isolated pancreatic islets from patients at onset of type 1 diabetes; Insulin secretion can be restored after some days in a non-diabetogenic environment in vitro. Results from the DiViD study.

Lars Krogvold, Oskar Skog, Görel Sundström, Bjørn Edwin, Trond Buanes, Kristian F Hanssen, Johnny Ludvigsson, Manfred Grabherr, Olle Korsgren & Knut Dahl-Jørgensen
RNA-Sequencing reads from pancreatic islets.

Computational Study of Mammalian Alcohol Dehydrogenase 5 - The Odd Sibling

Linus Östberg, Jan-Olov Höög & Bengt Persson
Structure and sequence analysis of Alcohol Dehydrogenase 5.

Data from: Protein structure validation and refinement using amide proton chemical shifts derived from quantum mechanics

Anders S. Christensen, Troels E. Linnet, Mikael Borg, Wouter Boomsma, Kresten Lindorff-Larsen, Thomas Hamelryck & Jan H. Jensen
Ensembles of protein structures resulting from Monte Carlo simulations. Each archive contains four independent simulations from the same starting structure and identical settings, but different random seeds.

Dataset for "Serological and molecular study of Crimean-Congo hemorrhagic fever virus in cattle from selected districts in Uganda" _serology_25012021

Stephen Balinandi, Claudia Von Brömssen, Alex Tumusiime, Jackson Kyondo, Vanessa Monteil, Ali Mirazimi, Julius Lutwama, Lawrence Mugisha & Maja Malmberg

DATASET: The dataset presented is a summary of information about the animals and test results that are reported in the manuscript ‘Serological and Molecular Study of Crimean-Congo Hemorrhagic Fever Virus in cattle from selected districts in Uganda’. Below is an explanation of some of the data variables and acronyms found in the file. A: BIODATA WORKSHEET contains information about number, code, district, sub county, parish, village and GPS location of where study samples were...

Antagonistic interaction between heterotrophic bacteria and cyanobacteria

Omneya Ahmed Osman & Stefan Bertilsson
Metatranscriptomic data on the antagonistic interaction between cyanobacteria and heterotrophic bacteria after 24 h of incubation.

SweGen dataset from SweGen: A whole-genome map of genetic variability in a cross-section of the Swedish population

Adam Ameur, Robert Karlsson, Patrik Magnusson & Ulf Gyllensten
A high-quality map of genetic variation in the Swedish population based on a representative cross-section of 1000 individuals.

Data from: Acute sleep loss results in tissue-specific alterations in epigenetic state and metabolic fuel utilization in humans

Jonathan Cedernaes & Christian Benedict
RNA-seq data and DNA methylation array data from 15 study participants. Biopsies were taken from adipose tissue and muscle on two occasions: after a night of normal sleep and after a night of total sleep deprivation.

Association of CSF proteins with tau and amyloid β levels in asymptomatic 70-year-olds_Protein data

Julia Remnestål
A spreadsheet with participant information on the concentration of Alzheimer's disease CSF markers (abeta42, t-tau and p-tau) as well as levels of brain-enriched proteins. The brain-enriched proteins are denoted by HGNC ID and antibody name.
This spreadsheet is connected to the manuscript "Association of CSF proteins with tau and amyloid β levels in asymptomatic 70-year-olds", published in Alzheimer's Research & Therapy in March 2021.
Abstract: Increased knowledge of the evolution of molecular changes in neurodegenerative disorders such...

Dataset of SARS-CoV-2 wastewater data from Uppsala, Sweden

Anna Székely & Nahla Mohamed
This is a metadata record for a continuously updated dataset of SARS-CoV-2 RNA data in wastewater in Uppsala, Sweden.
The dataset is part of a research study led by associate professor Anna J. Székely (SLU, Swedish University of Agricultural Sciences) and her research group in collaboration with Uppsala Vatten. The research group is part of the Environmental Virus Profiling Research Area of the SciLifeLab National COVID-19 Research Program.
The viral content is concentrated according to the protocol...

Snat10 Knockout Mice Cortical Neuronal Cells (ImageXpress XLS Example Images)

Polina Georgiev, Ben Blamey & Ola Spjuth
Image set collected in the Spjuth Lab (https://pharmb.io/), used as a test dataset for the HASTE software tools, and referenced in associated publications.
The images are of Snat10 knockout mice cortical neuronal cells, grown for 7 days and treated with either hydrogen peroxide or glutamate.
In total, there are 2699 TIFF images, organized by wavelength.
Images were taken with an ImageXpress XLS, at the Spjuth Lab.
All the archives decompress into this path: ./181214-KOday7-40X-H2O2-Glu/2018-12-14/9/*.tif
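Once the archives are extracted, the images can be collected with a short script. The sketch below is a minimal example that assumes the archives were decompressed into a single root directory and that the path above is relative to it; the function name `list_tiff_images` is hypothetical. Because the file naming scheme is not described here, the script only lists the files and does not try to group them by wavelength.

```python
from pathlib import Path


def list_tiff_images(root: Path) -> list[Path]:
    """Return all .tif files under the documented extraction path, sorted by name.

    `root` is the directory the archives were decompressed into (an assumption).
    An empty list is returned if the expected directory does not exist.
    """
    image_dir = root / "181214-KOday7-40X-H2O2-Glu" / "2018-12-14" / "9"
    if not image_dir.is_dir():
        return []
    return sorted(image_dir.glob("*.tif"))


if __name__ == "__main__":
    images = list_tiff_images(Path("."))
    # The record states 2699 TIFF images in total once every archive is extracted.
    print(f"Found {len(images)} TIFF images")
```

Comparing the printed count against the 2699 images stated above is a quick way to check that every archive was extracted.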

GHG feature profiling with multiple definitions

Wenjing Kang, Bastian Fromm, Inna Biryukova & Marc Friedländer
This is the Supplemental Data 13 of the MapToCleave study.
This dataset is related to Figure 6E. The GHG feature is profiled based on the different definitions that are described in the method section “Identifying the presence of the GHG feature using different definitions”. The file “GHG_feature_MapToCleave.txt” shows the presence of the GHG features in the MapToCleave processed and unprocessed miRNA precursors. The file “GHG_feature_MirGeneDB.txt” shows the presence of the GHG features in MirGeneDB human...

