Abstract: Breast cancer is the most common cancer in women, and research efforts to unravel the mechanisms that drive its carcinogenesis are ongoing. The emergence and constant advancement of high-throughput sequencing techniques, in combination with large-scale studies of genomic and transcriptomic data, have allowed the identification of important genetic changes in the breast cancer genome, including somatic mutations, copy number aberrations and genomic rearrangements.
The overall aim of this thesis is to explore the genetic changes that take place in the breast cancer transcriptome and their possible contribution to carcinogenesis. The aim of the first research study was to identify expressed gene fusions in breast cancer and to study their association with other genomic events. To achieve this, transcriptome sequencing data and single nucleotide polymorphism (SNP) array data were combined for a cohort of 55 tumors and 10 normal breast tissues. Gene fusions were detected in the majority of the samples, with evident differences between breast cancer subtypes: HER2+ samples had significantly more fusions than the other subtypes. The genome-wide analysis showed that fusion genes are concentrated on specific chromosomes, such as 17, 8 and 20. Additionally, a positive correlation was observed between the number of gene fusions and the number of amplifications, including an association between fusions on chromosome 17 and the amplifications in HER2+ samples, which can be attributed to the highly rearranged genomes of this subtype. Finally, the absence of highly recurrent fusions across this cohort adds to the notion that gene fusions in breast cancer are most likely private events, with the majority being “passenger” events.
In the second research study, the aim was to identify a connection between viral infections and breast cancer by devising five different computational methods for the analysis of both transcriptome and exome data in a cohort of 58 breast tumors. Although these methods were able to detect viral sequences in our testing dataset, no significantly high numbers of viral sequences were detected in our samples. Specifically, the viral sequences extracted (~2-30 reads) belonged to EBV, HHV6 and Merkel cell polyomavirus. Such low levels of viral expression argue against a viral etiology for breast cancer, but possible cases of integrated but silent viruses should not be excluded.
In the third research project, we analyzed in silico the transcriptional profiles of human endogenous retroviruses (ERVs) in breast cancer. Although ERVs are scattered across the genome in large numbers, a number of them are actively transcribed, accounting for a small percentage of the total mapped reads. Alongside protein-coding genes and lncRNAs, they show distinct expression profiles across the different breast cancer subtypes, with luminal and basal-like samples clearly separating from each other. Additionally, distinct profiles between ER+ and ER- samples were observed. Tumor-specific ERV loci show an association with the immune status of the tumors, indicating that ERVs are reactivated in tumors and could play a role in the activation of the immune response cascade.
The results presented in this thesis reveal only a small fraction of the diversity and heterogeneity of the breast cancer transcriptome. The power of sequencing techniques allows the in-depth detection of different genomic events. Gene fusions should be considered part of the breast cancer transcriptome, but their low recurrence across samples points to a role as passenger events. In light of the existing results, viral infections do not play a significant role in breast cancer. On the other hand, human endogenous retroviruses, despite originating from exogenous viruses, seem to exhibit transcriptional profiles similar to those of normal genes, indicating that they are part of the genome’s transcriptional machinery.