Downsampling function
- celltypist.samples.downsample_adata(adata: AnnData, mode: str = 'total', n_cells: int | None = None, by: str | None = None, balance_cell_type: bool = False, random_state: int = 0, return_index: bool = True) → AnnData | ndarray
Downsample cells to a given number (either in total or per cell type).
- Parameters:
adata – An AnnData object representing the input data.
mode – How downsampling is performed. By default, the input cells are downsampled to a total of n_cells. Set to ‘each’ to downsample the cells within each cell type to n_cells. (Default: ‘total’)
n_cells – The total number of cells (mode = ‘total’) or the number of cells from each cell type (mode = ‘each’) to sample. In the latter case, all cells of a given cell type are selected if that type contains fewer than n_cells cells.
by – Key (column name) of the input AnnData representing the cell types.
balance_cell_type – Whether to balance the cell type frequencies when mode = ‘total’. Setting to True will sample rare cell types with a higher probability, ensuring close-to-even cell type compositions. This argument is ignored if mode = ‘each’. (Default: False)
random_state – Random seed for reproducibility. (Default: 0)
return_index – Whether to return only the downsampled cell indices. Set to False to get a downsampled version of the input AnnData instead. (Default: True)
- Return type:
Depending on return_index, returns the downsampled cell indices or a subset of the input AnnData.
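The sampling rules described above can be illustrated with a small standard-library sketch. This is not CellTypist's implementation: the helper name downsample_indices is hypothetical, it works on a plain list of cell type labels rather than an AnnData object, and both the inverse-frequency weighting used for balance_cell_type and the error raised when n_cells exceeds the number of cells are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def downsample_indices(labels, mode="total", n_cells=None,
                       balance_cell_type=False, random_state=0):
    """Hypothetical helper mirroring the documented semantics on a label list."""
    if n_cells is None:
        raise ValueError("n_cells must be provided")
    rng = random.Random(random_state)  # seeded for reproducibility

    if mode == "each":
        # Group cell positions by cell type.
        groups = defaultdict(list)
        for i, label in enumerate(labels):
            groups[label].append(i)
        picked = []
        for members in groups.values():
            if len(members) <= n_cells:
                # Cell types with at most n_cells cells are kept whole.
                picked.extend(members)
            else:
                picked.extend(rng.sample(members, n_cells))
        return sorted(picked)

    if mode == "total":
        if n_cells > len(labels):
            # Assumption: asking for more cells than exist is an error.
            raise ValueError("n_cells exceeds the number of cells")
        if not balance_cell_type:
            # Uniform sampling without replacement.
            return sorted(rng.sample(range(len(labels)), n_cells))
        # Assumption: weight each cell by 1 / (size of its cell type), so
        # rare types are favoured. Weighted sampling without replacement via
        # Efraimidis-Spirakis keys: key = u ** (1 / w) = u ** count.
        counts = Counter(labels)
        keys = [(rng.random() ** counts[label], i)
                for i, label in enumerate(labels)]
        keys.sort(reverse=True)
        return sorted(i for _, i in keys[:n_cells])

    raise ValueError("mode must be 'total' or 'each'")


# Toy labels: one abundant, one moderate, and one rare cell type.
labels = ["B"] * 100 + ["T"] * 10 + ["NK"] * 3
per_type = downsample_indices(labels, mode="each", n_cells=5)
overall = downsample_indices(labels, mode="total", n_cells=20)
balanced = downsample_indices(labels, mode="total", n_cells=20,
                              balance_cell_type=True)
```

With mode = ‘each’ and n_cells = 5, the toy input yields five B cells, five T cells, and all three NK cells, because NK has fewer than n_cells members.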