Infer a gene regulatory network from gene expression data and make a ROC plot.
Question
Infer a gene regulatory network from gene expression data and make a ROC plot.
Download the gene expression data from the link below. It contains 500 samples, and
each sample has expression values for 10 genes. R or Python is preferred, but any language is accepted.
http://ksuweb.kennesaw.edu/~mkang9/teaching/CS4491_CS7990/Gene_expression_1.csv
Task 1: Gene regulatory networks inference based on the correlation-based approach.
- Dataset:
o Gene_expression_1.csv: contains gene expression data for task 1
o Adj_1.csv: contains adjacency matrix of ground truth for task 1
1. Load the gene expression data (Gene_expression_1.csv) and the ground truth adjacency
matrix (Adj_1.csv).
2. Compute pairwise correlation matrix, and show the matrix. E.g., see Fig. 1.
3. For each threshold in a given range (e.g., 0, 0.1, 0.2, 0.3, …, 0.9, 1), threshold the
correlation matrix to get the adjacency matrix of the inferred network, and compare it
with the adjacency matrix of the ground truth.
4. Compute a confusion matrix for each threshold.
5. Compute TPR and FPR for each threshold.
6. Make a ROC plot. E.g., see Fig. 2
Figure 1. Correlation matrix
Figure 2. ROC in Task 1
Explanation / Answer
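import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import auc

# Step 1: load the expression data (samples in rows, genes in columns) and the ground truth.
# Assumption: Adj_1.csv has been saved next to this script (the question gives no URL for it).
# Pass header=None to read_csv if the files turn out to have no header row.
expr = pd.read_csv('http://ksuweb.kennesaw.edu/~mkang9/teaching/CS4491_CS7990/Gene_expression_1.csv')
truth = pd.read_csv('Adj_1.csv').to_numpy() != 0

# Step 2: pairwise correlation matrix between genes.
corr = expr.corr()
print("Correlation matrix")
print(corr)

# Steps 3-5: call an edge wherever |correlation| >= threshold, then count
# TP/FP/FN/TN against the ground truth. The diagonal (self-correlation) is skipped.
score = np.abs(corr.to_numpy())
off_diag = ~np.eye(score.shape[0], dtype=bool)
s, y = score[off_diag], truth[off_diag]
fpr, tpr = [], []
for t in np.linspace(0, 1, 11):
    pred = s >= t
    tp, fp = np.sum(pred & y), np.sum(pred & ~y)
    fn, tn = np.sum(~pred & y), np.sum(~pred & ~y)
    print("threshold %.1f: TP=%d FP=%d FN=%d TN=%d" % (t, tp, fp, fn, tn))
    tpr.append(tp / (tp + fn))
    fpr.append(fp / (fp + tn))

# Step 6: ROC plot; the area under the curve uses the trapezoidal rule.
roc_auc = auc(fpr, tpr)
plt.plot(fpr, tpr, marker='o', label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], 'k--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.title('ROC in Task 1')
plt.legend(loc='lower right')
plt.show()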
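
To check the thresholding logic of Task 1 without downloading anything, here is a minimal sketch on made-up data. The helper name roc_points, the 4-gene synthetic data set, and its ground truth are all assumptions for illustration and are not part of the assignment files. It treats the absolute correlation as the edge score, which is one common choice but not the only one.

```python
import numpy as np

def roc_points(score, truth, thresholds):
    """Return (fpr, tpr) lists, one point per threshold, using off-diagonal entries only."""
    score = np.asarray(score, dtype=float)
    truth = np.asarray(truth).astype(bool)
    off_diag = ~np.eye(score.shape[0], dtype=bool)   # drop self-correlations
    s, y = score[off_diag], truth[off_diag]
    fpr, tpr = [], []
    for t in thresholds:
        pred = s >= t                                # inferred edges at this threshold
        tp = np.sum(pred & y)
        fp = np.sum(pred & ~y)
        fn = np.sum(~pred & y)
        tn = np.sum(~pred & ~y)
        tpr.append(tp / (tp + fn))
        fpr.append(fp / (fp + tn))
    return fpr, tpr

# Hypothetical data: gene 1 follows gene 0, gene 3 follows gene 2, nothing else is linked.
rng = np.random.default_rng(0)
n_samples = 500
g0 = rng.normal(size=n_samples)
g1 = g0 + 0.1 * rng.normal(size=n_samples)
g2 = rng.normal(size=n_samples)
g3 = g2 + 0.1 * rng.normal(size=n_samples)
data = np.column_stack([g0, g1, g2, g3])

# Hypothetical ground truth adjacency matrix (symmetric, no self-loops).
truth = np.zeros((4, 4), dtype=int)
truth[0, 1] = truth[1, 0] = 1
truth[2, 3] = truth[3, 2] = 1

score = np.abs(np.corrcoef(data, rowvar=False))      # genes are columns
thresholds = np.linspace(0, 1, 11)
fpr, tpr = roc_points(score, truth, thresholds)

for t, f, r in zip(thresholds, fpr, tpr):
    print("threshold %.1f: FPR=%.2f TPR=%.2f" % (t, f, r))
```

The two end points are a useful sanity check: at threshold 0 every pair is called an edge, so the curve starts at (1, 1), and at threshold 1 no off-diagonal pair passes, so it ends at (0, 0). The same function should work on the real data by passing the absolute correlation matrix and the matrix from Adj_1.csv.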