Compute summary statistics on a bedpe file

Using bedpe.sumstats.sh

Find min-max

The minimum and maximum scores (column 8) of a standard, uncompressed .bedpe file can be found with:

awk 'BEGIN{FS="\t"} NR==1 || $8+0>max+0 {max=$8} END{print max}' EnhancerPredictions.bedpe
awk 'BEGIN{FS="\t"} NR==1 || $8+0<min+0 {min=$8} END{print min}' EnhancerPredictions.bedpe
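
Both values can also be obtained in a single pass. The following is a minimal sketch, not part of bedpe.sumstats.sh: the function name bedpe_score_range is hypothetical, and it assumes a tab-separated file with no header line and the score in column 8. It reads from standard input so that it works on both plain and compressed files, and prints the range as "min-max".

```shell
# Hypothetical helper (not part of bedpe.sumstats.sh):
# print "min-max" of column 8, read from standard input.
# Assumes tab-separated input, no header line, numeric scores in column 8.
bedpe_score_range() {
    awk 'BEGIN { FS = "\t" }
         # Initialise both bounds from the first record
         NR == 1 { min = max = $8 }
         # Compare numerically, but keep the score exactly as written in the file
         ($8 + 0) < (min + 0) { min = $8 }
         ($8 + 0) > (max + 0) { max = $8 }
         END { if (NR > 0) print min "-" max }'
}
```

For an uncompressed file, run `bedpe_score_range < EnhancerPredictions.bedpe`; for a gzip-compressed one, `zcat EnhancerPredictions.bedpe.gz | bedpe_score_range`.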

Launch bedpe.sumstats.sh

Then, to compute the summary statistics, run the following script, replacing 0 and 0.27 in the second argument with the min and max found above:

cd <where_the_bedpe_is_located>
mkdir sumstats
cd sumstats
~/scripts/sarah_djebali/bedpe.sumstats.sh ../EnhancerPredictions.bedpe.gz "0-0.27" "500-500"

It will produce the following outputs:

Distance_by_score.quantile.density.png       EnhancerPredictions.scorequantile.bedpe.gz  refelt.scorequantile.nbconn.nbtimes.tsv
Distance.png                                 refelt.scorequantile.fraglength.png         Score.png
dist.score.quantiles.tsv.gz                  refelt.scorequantile.fraglength.tsv.gz      scorequantile.refelt.nb.fraglength.distrib.tsv
EnhancerPredictions.nbconn.acc.to.score.tsv  refelt.scorequantile.nbconn.nbtimes.png

The outputs include plots that give an immediate view of the results. For instance:

eog Distance_by_score.quantile.density.png

eog stands for Eye of GNOME, the default GNOME image viewer.

Number of connections of element 1 and of element 2. Here element 1 is an enhancer and element 2 is a TSS: a TSS usually makes more connections to enhancers than an enhancer makes to TSSs (the four panels correspond to the four score quartiles).