Analyze the dataset and save a result

import lamindb as ln
import lnschema_bionty as lb

ln.track()
💡 loaded instance: testuser1/test-facs (lamindb 0.54.4)
💡 notebook imports: lamindb==0.54.4 lnschema_bionty==0.31.2 scanpy==1.9.5
💡 Transform(id='zzJzdgJ763Dyz8', name='Analyze the dataset and save a result', short_name='facs3', version='0', type=notebook, updated_at=2023-10-02 10:21:34, created_by_id='DzTjkKse')
💡 Run(id='9GK8q934Y57ZGYUfo4hl', run_at=2023-10-02 10:21:34, transform_id='zzJzdgJ763Dyz8', created_by_id='DzTjkKse')
ln.Dataset.filter().df()
id | name | description | version | hash | reference | reference_type | transform_id | run_id | file_id | initial_version_id | updated_at | created_by_id
UYxBk5c2glqsNytkYhWE | My versioned cytometry dataset | None | 1 | e1rTes7lUXcu_bndvHhLbg | None | None | OWuTtS4SAponz8 | WVFjYsTXmmlztFLeq1e5 | UYxBk5c2glqsNytkYhWE | None | 2023-10-02 10:21:12 | DzTjkKse
UYxBk5c2glqsNytkYhPu | My versioned cytometry dataset | None | 2 | cSKkfcii0eGS8TGGTW53 | None | None | SmQmhrhigFPLz8 | MAsFiWynt6ZzShK6Xhp5 | None | UYxBk5c2glqsNytkYhWE | 2023-10-02 10:21:22 | DzTjkKse
dataset = ln.Dataset.filter(name="My versioned cytometry dataset", version="2").one()
adata = dataset.load(join="inner")
/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/anndata/_core/anndata.py:1838: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
  utils.warn_names_duplicates("obs")
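
The warning appears because observation names can repeat once several files are concatenated. In the notebook, calling adata.obs_names_make_unique() as the warning suggests would address it. The idea behind such de-duplication can be sketched in plain Python; the helper make_unique and the sample names below are hypothetical and are not anndata's implementation:

```python
from collections import Counter


def make_unique(names, join="-"):
    """Return names with repeats suffixed by a running counter.

    Hypothetical helper for illustration only; it does not guard against a
    suffixed name colliding with a name that already exists.
    """
    seen = Counter()
    unique = []
    for name in names:
        if seen[name] == 0:
            unique.append(name)  # first occurrence stays unchanged
        else:
            unique.append(f"{name}{join}{seen[name]}")  # later ones get a suffix
        seen[name] += 1
    return unique


# Hypothetical observation names, as if two files were concatenated.
obs_names = ["0", "1", "2", "0", "1"]
unique_names = make_unique(obs_names)
print(unique_names)
```

Every repeated name receives a suffix, so the resulting labels can serve as a unique index.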

The AnnData object references the individual files in its .obs annotations:

adata.obs.file_id.cat.categories
Index(['UYxBk5c2glqsNytkYhWE', 'xz2I67xaZiTOauRupaju'], dtype='object')

By default, the intersection of features is used:

adata.var.index
Index(['CD57', 'Cd19', 'Cd4', 'CD8', 'CD3', 'CD27', 'Cd14', 'Ccr7', 'CD127',
       'CD28'],
      dtype='object')
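
To make the difference between an inner and an outer join over features concrete, here is a minimal sketch in plain Python that does not depend on lamindb or anndata. The two marker panels and the helper names inner_join and outer_join are hypothetical and only illustrate the set logic:

```python
# Hypothetical marker panels for two files; the contents are illustrative only.
panel_a = ["CD57", "Cd19", "Cd4", "CD8", "CD3", "Extra1"]
panel_b = ["Cd4", "CD8", "CD3", "CD57", "Extra2"]


def inner_join(first, second):
    """Keep features present in both panels, in the order of the first."""
    second_set = set(second)
    return [name for name in first if name in second_set]


def outer_join(first, second):
    """Keep features present in either panel, starting with the first panel."""
    seen = set(first)
    return list(first) + [name for name in second if name not in seen]


shared = inner_join(panel_a, panel_b)
combined = outer_join(panel_a, panel_b)
print(shared)
print(combined)
```

An inner join guarantees that every remaining feature was measured in all files, at the cost of dropping markers that are specific to a single panel.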

Let us create a plot:

markers = lb.CellMarker.lookup()
import scanpy as sc

sc.pp.pca(adata)
sc.pl.pca(adata, color=markers.cd14.name, save="_cd14")
WARNING: saving figure to file figures/pca_cd14.pdf
https://d33wubrfki0l68.cloudfront.net/48e2ea9226f41f0769f8e05a164d9e5dac41df81/f012c/_images/7d024c4b8240f96c95d3d59461cbe256f49599f17ebfc3086cbc6112c7eaf307.png
file = ln.File("./figures/pca_cd14.pdf", description="My result on CD14")

file.save()
file.view_flow()
https://d33wubrfki0l68.cloudfront.net/5f0e2295d117dda90f3fb7b2f4eb911deebe00fb/4f439/_images/523145a85267d62933820c2e57a9de8658dbdec4dbe1a4c184fdc26b45d3954b.svg

Because the image is part of the notebook, there is no actual need to save it separately; you can instead rely on the report that is created when you save the notebook from the command line:

lamin save <notebook_path>
# clean up test instance
!lamin delete --force test-facs
!rm -r test-facs
💡 deleting instance testuser1/test-facs
✅     deleted instance settings file: /home/runner/.lamin/instance--testuser1--test-facs.env
✅     instance cache deleted
✅     deleted '.lndb' sqlite file
❗     consider manually deleting your stored data: /home/runner/work/lamin-usecases/lamin-usecases/docs/test-facs