Bruno Iochins Grisci, Marcio Dorn (Structural Bioinformatics and Computational Biology Lab - Institute of Informatics - Federal University of Rio Grande do Sul, Porto Alegre, Brazil)
Gene selection is a form of feature selection and dimensionality reduction applied to gene expression data sets from microarray or RNA-seq experiments. In cancer studies, gene selection algorithms aid biomarker identification, as they can find the subset of genes related to specific cancer types. In gene selection, the features are the expression values of the genes themselves, which preserves their physical meaning and allows better interpretation. In this work, we present weighted t-SNE, a way to visualize and compare the gene selections made by distinct algorithms. t-SNE is a method for visualizing high-dimensional data. The algorithm must compute the distances between data points, usually relying on the Euclidean distance. We propose that, by using the weighted Euclidean distance with the feature importance scores as weights, a 2D visualization can represent the data distribution after the selection. The scores should range from 0 (least relevant) to 1 (most relevant), so that when each dimension is scaled by its score, the dimensions with higher relevance account for a larger portion of the distance between points and thus have more influence on their positions in the final visualization.
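The scaling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `scale_by_scores` and `weighted_euclidean`, the toy expression values, and the scores are all hypothetical. It also assumes that "scaling each dimension by its score" means multiplying every gene's expression value by its score, so that the plain Euclidean distance on the scaled data equals the weighted distance on the original data.

```python
import math


def scale_by_scores(samples, scores):
    """Multiply each feature (gene) column by its importance score.

    Scores are assumed to lie in [0, 1], with 1 meaning most relevant.
    """
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("scores are assumed to lie in [0, 1]")
    return [[x * s for x, s in zip(row, scores)] for row in samples]


def weighted_euclidean(a, b, scores):
    """Euclidean distance between a and b after scaling each dimension by its score."""
    return math.sqrt(sum((s * (x - y)) ** 2 for x, y, s in zip(a, b, scores)))


# Toy data (made-up values): 3 samples described by 3 genes.
samples = [
    [1.0, 5.0, 2.0],
    [1.0, 9.0, 2.5],
    [4.0, 5.0, 2.0],
]
# Hypothetical importance scores: gene 0 fully relevant, gene 1 discarded.
scores = [1.0, 0.0, 0.5]

scaled = scale_by_scores(samples, scores)

# Samples 0 and 1 differ mostly in the discarded gene, so they end up close.
d01 = weighted_euclidean(samples[0], samples[1], scores)
# Samples 0 and 2 differ in the most relevant gene, so they stay far apart.
d02 = weighted_euclidean(samples[0], samples[2], scores)

print(d01, d02)
```

Under this assumption, the `scaled` matrix could be passed unchanged to any standard t-SNE implementation that uses the Euclidean distance, since a gene with a score of 0 no longer contributes to any pairwise distance.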