Scalable structural clustering of local RNA secondary structures

Fabrizio Costa, Steffen Heyne, Dominic Rose & Rolf Backofen
Here, we propose an alignment-free approach for clustering RNA sequences according to sequence and structure information. We extend a fast graph kernel technique that we developed for chemoinformatics applications and adapt it to detect similarities between RNA secondary structures. The key novelties are twofold: (1) we represent multiple folding hypotheses associated with a single RNA sequence in a flexible graph format; and (2) we efficiently convert the graph encoding into a very high...
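
To make the first novelty concrete, the following minimal sketch shows one way several folding hypotheses for a single sequence could be held in one labeled graph. It is an illustration under stated assumptions, not the authors' implementation: the function names `base_pairs` and `encode_folding_hypotheses`, the edge labels `"backbone"` and `"pair"`, the use of dot-bracket notation as input, and the choice to store each hypothesis as a separate disconnected component are all hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int, str]  # (source node, target node, edge label)


def base_pairs(dot_bracket: str) -> List[Tuple[int, int]]:
    """Return the (i, j) positions paired by matching parentheses."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def encode_folding_hypotheses(
    sequence: str, structures: List[str]
) -> Tuple[Dict[int, str], List[Edge]]:
    """Encode one sequence with several candidate structures as one graph.

    Assumption for this sketch: each hypothesis becomes its own
    disconnected component, with nodes labeled by nucleotide and edges
    labeled as backbone or base-pair bonds.
    """
    labels: Dict[int, str] = {}
    edges: List[Edge] = []
    offset = 0
    for structure in structures:
        if len(structure) != len(sequence):
            raise ValueError("structure length must match sequence length")
        for i, nucleotide in enumerate(sequence):
            labels[offset + i] = nucleotide
            if i > 0:
                edges.append((offset + i - 1, offset + i, "backbone"))
        for i, j in base_pairs(structure):
            edges.append((offset + i, offset + j, "pair"))
        offset += len(sequence)
    return labels, edges


# Two hypothetical folds of the same toy 9-nucleotide sequence.
labels, edges = encode_folding_hypotheses(
    "GGGAAACCC", ["(((...)))", ".((...))."]
)
```

Keeping the hypotheses as separate components means a graph kernel can compare substructures from any fold of one sequence against any fold of another without first committing to a single predicted structure.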
