Module : MAD filtering

LehmannN edited this page Jun 13, 2018 · 3 revisions

This module removes low-quality cells from the dataset, using the median absolute deviation (MAD) to detect outliers.

  • Internal name : MAD-filtering

  • Available : local mode

  • Input Ports :

    • matrix : initial count matrix (tsv)
    • cells : initial cells metadata (tsv)
    • genes : genes metadata (tsv)
  • Output Ports :

    • completcellsoutput : initial cells metadata (tsv) (completed with quality metrics)
    • matrixoutput : filtered count matrix (tsv)
    • cellsoutput : filtered cells metadata (tsv)
  • Optional parameters :

| Parameter | Type | Description | Default Value |
|---|---|---|---|
| detection_threshold | integer | Minimal number of reads to consider a feature as detected | 10 |
| expression_option | string | Type of feature to detect (Endogenous, Nuclear or All) | Endogenous |
| n_Mad | integer | Maximal number of median absolute deviations allowed for the number of reads and the number of detected features | 5 |
| direction | string | Direction to consider for MAD filtering : lower, upper or both | both |
| groups | string | Name of the column used for cell grouping; Null if all cells belong to a single group | Null |
| prop_mt | float | Maximum proportion of reads mapping to mitochondrial features | 0.1 |
| prop_sp | float | Maximum proportion of reads mapping to exogenous features | 0.5 |
| nb_filters | integer | Minimum number of filter failures that triggers removal of a cell | 1 |
  • Configuration example
<step id="QC" skip="false">
	<module>MAD-filtering</module>
	<parameters>
		<parameter>
			<name>n_Mad</name>
			<value>3</value>
		</parameter>
		<parameter>
			<name>direction</name>
			<value>lower</value>
		</parameter>
		<parameter>
			<name>groups</name>
			<value>CellType</value>
		</parameter>
	</parameters>
</step>
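
To give an intuition of what n_Mad, direction and nb_filters control, the general idea of MAD-based filtering can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the module's actual code: the function names `mad_outliers` and `cells_to_remove` and the toy values are hypothetical, the raw MAD is used without any scaling constant or log transformation, and the groups, prop_mt and prop_sp parameters are ignored. The module itself may compute its thresholds differently.

```python
import statistics


def mad_outliers(values, n_mad=5, direction="both"):
    """Flag values lying more than n_mad MADs away from the median.

    Hypothetical helper: uses the raw MAD (no scaling constant).
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    lower = med - n_mad * mad
    upper = med + n_mad * mad
    flags = []
    for v in values:
        too_low = direction in ("lower", "both") and v < lower
        too_high = direction in ("upper", "both") and v > upper
        flags.append(too_low or too_high)
    return flags


def cells_to_remove(metrics, n_mad=5, direction="both", nb_filters=1):
    """Mark a cell for removal when it fails at least nb_filters metrics.

    metrics maps a metric name to a list of per-cell values
    (all lists must have the same length and cell order).
    """
    n_cells = len(next(iter(metrics.values())))
    failures = [0] * n_cells
    for values in metrics.values():
        flags = mad_outliers(values, n_mad, direction)
        for i, flagged in enumerate(flags):
            failures[i] += int(flagged)
    return [count >= nb_filters for count in failures]


# Toy data (made up for illustration): the fifth cell has very few reads
# and very few detected features.
reads = [1000, 1100, 950, 1050, 20, 990]
features = [300, 320, 290, 310, 15, 305]

removed = cells_to_remove(
    {"reads": reads, "features": features},
    n_mad=3,
    direction="lower",
)
print(removed)
```

With direction set to lower, only cells with unusually low values are flagged; switching to upper or both would also flag cells with unusually high values.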

Interpreting output files

Scatter Plot

After cleaning the data, the module produces two scatter plots, each showing cells by number of detected features (y-axis) and number of reads (x-axis).

Raw_Cellplot

The first plot shows all cells. Cells removed by the filtering are drawn in red.

Filtered_cellplot

The second plot shows the cells remaining after filtering. Once filtering is done, the cells should behave like a mixture of Gaussians, i.e. you should be able to enclose them in a given number of ellipses.