Prediction result

class celltypist.classifier.AnnotationResult(labels: DataFrame, decision_mat: DataFrame, prob_mat: DataFrame, adata: AnnData)

Bases: object

Class that represents the result of a celltyping annotation process.

Parameters:
  • labels – A DataFrame object returned from the celltyping process, showing the predicted labels.

  • decision_mat – A DataFrame object returned from the celltyping process, showing the decision matrix.

  • prob_mat – A DataFrame object returned from the celltyping process, showing the probability matrix.

  • adata – An AnnData object representing the input object.

predicted_labels

Predicted labels including the individual prediction results and (if majority voting is done) majority voting results.

decision_matrix

Decision matrix with the decision score of each cell belonging to a given cell type.

probability_matrix

Probability matrix with the probability that each cell belongs to a given cell type (obtained by applying the sigmoid function to the decision matrix).
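To make the relationship between the two matrices concrete, the sketch below applies the logistic sigmoid to a few decision scores. The cell type names and score values are hypothetical, and this is a plain-Python illustration of the transformation rather than celltypist's own code.

```python
import math


def sigmoid(score: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large negative scores."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    exp_score = math.exp(score)
    return exp_score / (1.0 + exp_score)


# Hypothetical decision scores of a single cell for three cell types.
decision_scores = {"T cells": 2.0, "B cells": -1.0, "NK cells": 0.0}

# Each score is transformed independently into a value between 0 and 1.
probabilities = {
    cell_type: sigmoid(score) for cell_type, score in decision_scores.items()
}
```

Because the sigmoid is applied entry by entry, the probabilities of one cell across cell types need not sum to 1.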

cell_count

Number of input cells which undergo the prediction process.

adata

An AnnData object representing the input data.

summary_frequency(by: str = 'predicted_labels') → DataFrame

Get the frequency of cells belonging to each cell type predicted by celltypist.

Parameters:

by – Column name of predicted_labels specifying the prediction type on which the summary is based. Set to ‘majority_voting’ to summarize the majority voting results. (Default: ‘predicted_labels’)

Returns:

A DataFrame object with cell type frequencies.

Return type:

DataFrame
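A minimal usage sketch for summary_frequency follows. The helper name summarize_predictions and its use_majority_voting flag are hypothetical. It assumes result is an AnnotationResult from a finished prediction run, and that majority voting was actually done whenever ‘majority_voting’ is requested.

```python
def summarize_predictions(result, use_majority_voting=False):
    """Return the cell type frequency table of an AnnotationResult-like object."""
    # `by` names the column of predicted_labels that the summary is based on.
    by = "majority_voting" if use_majority_voting else "predicted_labels"
    return result.summary_frequency(by=by)
```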

to_adata(insert_labels: bool = True, insert_conf: bool = True, insert_conf_by: str = 'predicted_labels', insert_decision: bool = False, insert_prob: bool = False, prefix: str = '') → AnnData

Insert the predicted labels, decision or probability matrix, and (if majority voting is done) majority voting results into the AnnData object.

Parameters:
  • insert_labels – Whether to insert the predicted cell type labels and (if majority voting is done) majority voting-based labels into the AnnData object. (Default: True)

  • insert_conf – Whether to insert the confidence scores into the AnnData object. (Default: True)

  • insert_conf_by – Column name of predicted_labels specifying the prediction type which the confidence scores are based on. Setting to ‘majority_voting’ will insert the confidence scores corresponding to the majority-voting result. (Default: ‘predicted_labels’)

  • insert_decision – Whether to insert the decision matrix into the AnnData object. (Default: False)

  • insert_prob – Whether to insert the probability matrix into the AnnData object. If True, probabilities are inserted instead of decision scores, even when insert_decision is also set to True. (Default: False)

  • prefix – Prefix for the inserted columns in the AnnData object. (Default: ‘’, i.e. no prefix)

Returns:

Depending on whether majority voting is done, an AnnData object with the following columns (prefixed with prefix) added to the observation metadata:

  • predicted_labels – individual prediction outcome for each cell.

  • over_clustering – over-clustering result for the cells.

  • majority_voting – the cell type label assigned to each cell after the majority voting process.

  • conf_score – the confidence score of each cell.

  • name of each cell type – the decision scores (or probabilities if insert_prob is True) of a given cell type across cells.

Return type:

AnnData
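The parameters above can be combined as in the following sketch. The helper name annotate_adata, its flags, and the ‘celltypist_’ prefix are hypothetical choices for illustration. It assumes result is an AnnotationResult, and that majority voting was done when use_majority_voting is True.

```python
def annotate_adata(result, prefix="celltypist_", use_majority_voting=False,
                   with_probabilities=False):
    """Return the AnnData object with prediction columns added to its metadata."""
    return result.to_adata(
        insert_labels=True,
        insert_conf=True,
        # Which prediction type the confidence scores are based on.
        insert_conf_by="majority_voting" if use_majority_voting else "predicted_labels",
        insert_decision=False,
        # Adds one column per cell type holding its probabilities.
        insert_prob=with_probabilities,
        # A prefix keeps the inserted columns apart from existing metadata.
        prefix=prefix,
    )
```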

to_plots(folder: str, plot_probability: bool = False, format: str = 'pdf', prefix: str = '') → None

Plot the celltyping and (if majority voting is done) majority-voting results.

Parameters:
  • folder – Path to a folder which stores the output figures.

  • plot_probability – Whether to also plot the decision score and probability distributions of each cell type across the test cells. If True, a number of figures will be generated (may take some time if the input data is large). (Default: False)

  • format – Format of the output figures. The default produces vector PDF files (note that dots are still drawn with the png backend). (Default: ‘pdf’)

  • prefix – Prefix for the output figures. (Default: ‘’, i.e. no prefix)

Returns:

Depending on whether majority voting is done and on plot_probability, multiple UMAP plots showing the prediction and majority voting results are saved in the folder:

  • predicted_labels – individual prediction outcome for each cell overlaid onto the UMAP.

  • over_clustering – over-clustering result of the cells overlaid onto the UMAP.

  • majority_voting – the cell type label assigned to each cell after the majority voting process overlaid onto the UMAP.

  • name of each cell type – the decision scores and probabilities of a given cell type distributed across cells overlaid onto the UMAP.

Return type:

None
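A possible way to call to_plots is sketched below. The helper name save_prediction_plots and its detailed flag are hypothetical. Creating the folder beforehand is a defensive assumption, since the description above does not say whether the method creates it.

```python
import os


def save_prediction_plots(result, folder, detailed=False, prefix=""):
    """Write the UMAP figures of an AnnotationResult-like object into `folder`."""
    # Assumption: make sure the output folder exists before plotting.
    os.makedirs(folder, exist_ok=True)
    result.to_plots(
        folder=folder,
        # True additionally plots per-cell-type scores (slower on large inputs).
        plot_probability=detailed,
        format="pdf",
        prefix=prefix,
    )
```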

to_table(folder: str, prefix: str = '', xlsx: bool = False) → None

Write out tables of predicted labels, decision matrix, and probability matrix.

Parameters:
  • folder – Path to a folder which stores the output table/tables.

  • prefix – Prefix for the output table/tables. (Default: ‘’, i.e. no prefix)

  • xlsx – Whether to merge output tables into a single Excel (.xlsx). (Default: False)

Returns:

Nothing is returned. Depending on xlsx, the predicted labels, decision matrix, and probability matrix are written to the folder either as separate tables or as a single Excel (.xlsx) file.

Return type:

None
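Since to_table returns None, a small wrapper can report what was written. In this sketch the helper name export_tables and its single_workbook flag are hypothetical, and creating the folder beforehand is a defensive assumption rather than documented behaviour.

```python
import os


def export_tables(result, folder, prefix="", single_workbook=False):
    """Write the result tables into `folder` and return the file names found there."""
    # Assumption: make sure the output folder exists before writing.
    os.makedirs(folder, exist_ok=True)
    # xlsx=True merges the tables into a single Excel file.
    result.to_table(folder=folder, prefix=prefix, xlsx=single_workbook)
    return sorted(os.listdir(folder))
```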