This project is maintained by sarbal
Let’s do some analyses.
Download these files into your working directory:
####
To check your working directory:
getwd()
To set your working diretory:
setwd("H:/URP")
Run this to install/load libraries
source("helper.R")
load("lesson4.Rdata")
A method to identify properties of genes or proteins that are over-represented. GSE is a (relatively) straightforward method to sumamrize the results of your experiment, in particualr when you believe there is some link or association to a known phenotype (e.g., enrichment for dopamine receptors in Parkinson’s disease). Uses statistical approaches to identify significantly enriched groups of genes, and the main annotation database is the Gene Ontology (GO) and the Molecular Signatures database (MSigDB).
There have been many others, and the simplest is a hypergeometric test to measure overlap.
install.packages("enrichR")
library(enrichR)
dbs <- listEnrichrDbs()
dbs <- c("GO_Molecular_Function_2015", "GO_Cellular_Component_2015", "GO_Biological_Process_2015")
enriched <- enrichr(c("Runx1", "Gfi1", "Gfi1b", "Spi1", "Gata1", "Kdr"), dbs)
library(EGAD)
annot = make_annotations( (GO.mouse[,c(1,3)][GO.mouse[,4]!="IEA",] ), unique(GO.mouse[,1]), unique(GO.mouse[,3]) )
enriched2 <- gene_set_enrichment(c("Runx1", "Gfi1", "Gfi1b", "Spi1", "Gata1", "Kdr"), annot, voc)
multifunc_assessment <- calculate_multifunc(annot)
plot(multifunc_assessment[,4])
auc_mf <- auc_multifunc(annot, multifunc_assessment[,4])
hist <- plot_distribution(auc_mf, xlab="AUROC", med=FALSE, avg=FALSE)
m = match( enriched2[,1], names(auc_mf))
hist <- plot_distribution(auc_mf[m], xlab="AUROC", med=FALSE, avg=FALSE)
Let’s take a look at this paper: https://genome.cshlp.org/content/22/4/602
gene_expression_file = " Gene.expression.data/Normalized.expression.data.txt"
exprs = read.table(gene_expression_file, header=T, row.names=1)
samples = matrix(unlist(strsplit(colnames(exprs)[33:94] , "_" ) ) , byrow=T, ncol=2)
exprs.means = t(sapply(1:dim(exprs)[1], function(i) tapply( as.numeric(exprs[i,33:94]), samples[,1], mean, na.rm=T) ))
samples.sub = names(tapply( as.numeric(exprs[i,33:94]), samples[,1], mean, na.rm=T))
colnames(exprs.means) = samples.sub
rownames(exprs.means) = rownames(exprs)
samples.cor = cor(exprs.means[gene.subset,-filt.species ], m=”s”, use=”p”)
4. Plot/graph
heatmap.3(samples.cor) samples.cor = cor(exprs.means[ ,-filt.species], m=”s”, use=”p”) heatmap.3(samples.cor) ```
Back to the homepage