Welcome to scConfluence’s documentation!#

scConfluence is a method for integrating unpaired multiomics data. It combines uncoupled autoencoders with Inverse Optimal Transport to learn low-dimensional cell embeddings. These embeddings can then be used for visualization, for clustering (which helps to discover subpopulations of cells), and for imputing features across modalities. Read the preprint!

Explanatory figure

Install the package#

scConfluence is implemented as a Python package seamlessly integrated within the scverse ecosystem, in particular Muon and Scanpy.

via PyPI#

On all operating systems, the easiest way to install scConfluence is via PyPI. Installation typically takes about a minute. It is continuously tested with Python 3.10 on an Ubuntu virtual machine.

pip install scconfluence
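
To confirm that the PyPI installation succeeded, you can query the installed version from Python. The snippet below is a minimal sketch that relies only on the standard library; the helper name `installed_version` is hypothetical and is not part of scConfluence.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "scconfluence" is the distribution name used in the pip command above.
found = installed_version("scconfluence")
if found is None:
    print("scconfluence is not installed in this environment")
else:
    print(f"scconfluence version: {found}")
```

If the package is reported as missing, check that `pip` and `python` point to the same environment.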

via GitHub (development version)#

git clone git@github.com:cantinilab/scconfluence.git
cd scconfluence
pip install .

Test your development installation (optional)#

pip install pytest
pytest .

Getting started#

scConfluence takes a MuData object as input and populates its obsm field with the latent embeddings. See the tutorials for more detailed examples of how the method can be applied.

You may download a preprocessed 10X Multiome demo dataset here.

A GPU is not required for the method to run, but is strongly recommended.
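
Before launching a long training run, it can help to check whether a GPU is visible. The sketch below assumes that scConfluence trains its models with PyTorch, which this page does not state explicitly; the helper name `gpu_available` is hypothetical.

```python
def gpu_available() -> bool:
    """Return True if PyTorch is importable and reports a usable CUDA device."""
    try:
        # Assumption: the training backend is PyTorch.
        import torch
    except ImportError:
        # PyTorch is not installed, so no GPU can be used through it.
        return False
    return bool(torch.cuda.is_available())


if gpu_available():
    print("A CUDA GPU is visible to PyTorch.")
else:
    print("No CUDA GPU detected; training will be slower on CPU.")
```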

import scconfluence
import mudata as md
import scanpy as sc
from scipy.spatial.distance import cdist

# Load the data into a MuData object.
mdata = md.read_h5mu("my_data.h5mu")

# Compute the cross-modality distance matrix using connected features
mdata.uns["cross_modality1+modality2"] = cdist(mdata["modality1"].obsm["cm_features"],
                                               mdata["modality2"].obsm["cm_features"])
mdata.uns["cross_keys"] = ["cross_modality1+modality2"]


# Initialize and train the model.
autoencoders = {"modality1": scconfluence.model.AutoEncoder(mdata["modality1"],
                                                             modality="modality1"),
                "modality2": scconfluence.model.AutoEncoder(mdata["modality2"],
                                                             modality="modality2")}
model = scconfluence.model.ScConfluence(mdata, unimodal_aes=autoencoders)
model.fit(save_path="results")
mdata.obsm["latent"] = model.get_latent().loc[mdata.obs_names]

# Visualize the embedding with UMAP.
sc.pp.neighbors(mdata, use_rep="latent")
sc.tl.umap(mdata)
sc.pl.umap(mdata)
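
The cross-modality distance matrix in the example above is built with `cdist`, which by default returns pairwise Euclidean distances: one row per cell of the first modality and one column per cell of the second. Both feature matrices therefore need the same number of columns. The sketch below reproduces that computation with NumPy alone on toy arrays; the arrays and the helper name `pairwise_euclidean` are made up for illustration and are not part of scConfluence.

```python
import numpy as np


def pairwise_euclidean(a, b):
    """Return the (n_a, n_b) matrix of Euclidean distances between rows of a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape[1] != b.shape[1]:
        raise ValueError("both inputs must have the same number of features")
    # Broadcast to shape (n_a, n_b, n_features), then reduce over the features.
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Toy stand-ins for the two "cm_features" matrices (cells x shared features).
feats1 = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]])  # 3 cells in modality1
feats2 = np.array([[0.0, 0.0], [6.0, 8.0]])              # 2 cells in modality2

dist = pairwise_euclidean(feats1, feats2)
print(dist.shape)  # one row per modality1 cell, one column per modality2 cell
print(dist)
```

On real data, `scipy.spatial.distance.cdist(feats1, feats2)` returns the same matrix and is the better choice for large inputs, since the broadcast above materializes a three-dimensional intermediate array.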

Citation#

@article {Samaran2024unpaired,
  author = {Jules Samaran and Gabriel Peyre and Laura Cantini},
  title = {scConfluence : single-cell diagonal integration with regularized Inverse Optimal Transport on weakly connected features},
  year = {2024},
  doi = {10.1101/2024.02.26.582051},
  publisher = {Cold Spring Harbor Laboratory},
  journal = {bioRxiv}
}