|
Persistent Identifier
|
doi:10.18130/V3/K7TGEM |
|
Publication Date
|
2025-10-31 |
|
Title
| Cell Maps for Artificial Intelligence - October 2025 Data Release (Beta) |
|
Author
| Clark, T (University of Virginia) ORCID: https://orcid.org/0000-0003-4060-7360
Parker, J (University of California, San Diego) ORCID: https://orcid.org/0000-0003-4535-3486
Al Manir, S (University of Virginia) ORCID: https://orcid.org/0000-0003-4647-3877
Axelsson, U (KTH Royal Institute of Technology)
Ballllosero Navarro, F (Stanford University) ORCID: https://orcid.org/0000-0002-4180-422X
Chinn, B (University of California San Diego)
Churas, CP (University of California San Diego) ORCID: https://orcid.org/0000-0001-9998-705X
Dailamy, A (University of California, San Diego) ORCID: https://orcid.org/0000-0002-6711-8260
Doctor, Y (University of California, San Diego) ORCID: https://orcid.org/0009-0009-0483-7506
Fall, J (KTH - Royal Institute of Technology)
Forget, A (University of California San Francisco) ORCID: https://orcid.org/0000-0003-0223-0312
Gao, J (University of California San Diego) ORCID: https://orcid.org/0000-0002-6311-3526
Hansen, JN (Stanford University) ORCID: https://orcid.org/0000-0002-4650-9094
Hu, M (University of California San Diego) ORCID: https://orcid.org/0000-0002-1571-8029
Johannesson, A (KTH - Royal Institute of Technology)
Khaliq, H (University of California San Diego)
Lee, YH (University of California San Diego) ORCID: https://orcid.org/0000-0003-0917-355X
Lenkiewicz, J (University of California San Diego) ORCID: https://orcid.org/0000-0001-7252-8638
Levinson, MA (University of Virginia) ORCID: https://orcid.org/0000-0003-0384-8499
Marquez, C (University of California San Diego) ORCID: https://orcid.org/0000-0003-3960-420X
Metallo, C (University of California San Diego) ORCID: https://orcid.org/0000-0003-2404-3040
Muralidharan, M (University of California San Francisco)
Nourreddine, S (University of California San Diego) ORCID: https://orcid.org/0000-0003-3881-7588
Niestroy, J (University of Virginia) ORCID: https://orcid.org/0000-0002-1103-3882
Obernier, K (University of California San Francisco) ORCID: https://orcid.org/0000-0002-4025-1299
Pan, E (University of California San Diego)
Polacco, B (University of California San Francisco)
Pratt, D (University of California San Diego) ORCID: https://orcid.org/0000-0002-1471-9513
Qian, G (University of California San Diego) ORCID: https://orcid.org/0009-0005-4217-2745
Schaffer, L (University of California San Diego) ORCID: https://orcid.org/0000-0001-6339-9141
Sigaeva, A (KTH Royal Institute of Technology) ORCID: https://orcid.org/0000-0003-3361-3797
Thaker, S (University of Alabama at Birmingham) ORCID: https://orcid.org/0000-0001-6730-2773
Zhang, Y (University of California San Diego)
Bélisle-Pipon, JC (Simon Fraser University) ORCID: https://orcid.org/0000-0002-8965-8153
Brandt, C (Yale University) ORCID: https://orcid.org/0000-0001-8179-1796
Chen, JY (The University of Alabama at Birmingham) ORCID: https://orcid.org/0000-0002-6112-415X
Ding, Y (University of Texas at Austin) ORCID: https://orcid.org/0000-0003-2567-2009
Fodeh, S (Yale University) ORCID: https://orcid.org/0000-0003-4664-3143
Krogan, N (University of California San Francisco) ORCID: https://orcid.org/0000-0003-4902-337X
Lundberg, E (Stanford University) ORCID: https://orcid.org/0000-0001-7034-0850
Mali, P (University of California San Diego) ORCID: https://orcid.org/0000-0002-3383-1287
Payne-Foster, P (University of Alabama) ORCID: https://orcid.org/0000-0002-3508-3577
Ratcliffe, S (University of Virginia) ORCID: https://orcid.org/0000-0002-6644-8284
Ravitsky, V (University of Montreal) ORCID: https://orcid.org/0000-0002-7080-8801
Sali, A (University of California San Diego) ORCID: https://orcid.org/0000-0003-0435-6197
Schulz, W (Yale University) ORCID: https://orcid.org/0000-0002-2048-4028
Ideker, T (University of California San Diego) ORCID: https://orcid.org/0000-0002-1708-8454 |
|
Point of Contact
|
Ideker, Trey (University of California San Diego) |
|
Dataset Description
| Description
This dataset is the October 2025 Data Release of Cell Maps for Artificial Intelligence (CM4AI; CM4AI.org), the Functional Genomics Grand Challenge in the NIH Bridge2AI program. CM4AI is generating multi-modal data including protein-protein interaction (PPI), spatial localization, and genetic perturbation data in MDA-MB-468 breast cancer cells (+/- paclitaxel or vorinostat) and iPSCs (+/- differentiation). This Beta release includes:
- Perturb-seq data for MDA-MB-468 breast cancer cells +/- treatment and undifferentiated (parental) KOLF2.1J iPSCs
- SEC-MS data for MDA-MB-468 breast cancer cells +/- treatment, undifferentiated KOLF2.1J iPSCs, and iPSC-derived neuron progenitor cells (NPCs), neurons, and cardiomyocytes
- IF images in MDA-MB-468 breast cancer cells +/- treatment
External Data Links
Access external data resources related to this dataset:
- Perturb-seq data in KOLF2.1J iPSCs (undifferentiated): Embargoed
- Perturb-seq data in MDA-MB-468 breast cancer cells (+/- treatment): Embargoed
- SEC-MS data in KOLF2.1J iPSCs (undifferentiated, NPC, neuron, and cardiomyocyte): MassIVE Repository
- SEC-MS data in MDA-MB-468 breast cancer cells (+/- treatment): MassIVE Repository
Data Governance & Ethics
- Human Subjects: No
- De-identified Samples: Yes
- FDA Regulated: No
- Data Governance Committee: Jillian Parker (jillianparker@health.ucsd.edu)
- Ethical Review: Vardit Ravitsky (ravitskyv@thehastingscenter.org) and Jean-Christophe Bélisle-Pipon (jean-christophe_belisle-pipon@sfu.ca)
Completeness
These data are not yet in final form:
- Some datasets are under temporary pre-publication embargo
- The protein-protein interaction (SEC-MS), protein localization (IF imaging), and CRISPRi Perturb-seq datasets interrogate sets of proteins that only partially overlap
- Computed cell maps not included in this release
Maintenance Plan
- Dataset will be regularly updated and augmented through the end of the project in November 2026
- Updates on a quarterly basis
- Long term preservation in the University of Virginia Dataverse, supported by committed institutional funds
Intended Use
This dataset is intended for:
- AI-ready datasets to support research in functional genomics
- AI model training
- Cellular process analysis
- Cell architectural changes and interactions in presence of specific disease processes, treatment conditions, or genetic perturbations
Limitations
Researchers should be aware of inherent limitations:
- This is an interim release
- Does not contain predicted cell maps, which will be added in future releases
- The current release is most suitable for bioinformatics analysis of the individual datasets
- Requires domain expertise for meaningful analysis
Prohibited Uses
- These laboratory data are not to be used in clinical decision-making or in any context involving patient care without appropriate regulatory oversight and approval
Potential Sources of Bias
Users should be aware of potential biases:
- Data in this release were derived from commercially available de-identified human cell lines
- Does not represent all biological variants which may be seen in the population at large
(2025-06-30) |
|
Subject
| Medicine, Health and Life Sciences |
|
Keyword
| AI http://purl.obolibrary.org/obo/NCIT_C16309 (NCI Thesaurus)
affinity purification http://www.bioassayontology.org/bao#BAO_0002603 (BioAssay Ontology (BAO))
AP-MS http://www.ebi.ac.uk/swo/SWO_1100012 (Software Ontology)
artificial intelligence http://purl.obolibrary.org/obo/NCIT_C16309 (NCI Thesaurus)
breast cancer http://purl.bioontology.org/ontology/LNC/LA14283-8 (LOINC)
Bridge2AI
cardiomyocyte http://purl.obolibrary.org/obo/CL_0000746
CM4AI
CRISPR/Cas9 http://www.bioassayontology.org/bao#BAO_0010249 (BioAssay Ontology (BAO))
induced pluripotent stem cell http://www.ebi.ac.uk/efo/EFO_0004905 (Experimental Factor Ontology (EFO))
iPSC http://www.ebi.ac.uk/efo/EFO_0004905 (Experimental Factor Ontology (EFO))
KOLF2.1J
machine learning http://purl.obolibrary.org/obo/OBI_0002587 (Ontology of Biomedical Investigations (OBI)) http://purl.obolibrary.org/obo/obi.owl
mass spectroscopy http://purl.bioontology.org/ontology/MESH/D013058 (Medical Subject Headings (MeSH))
MDA-MB-468
neural progenitor cell http://purl.obolibrary.org/obo/CL_0011020 (Cell Ontology (CL))
NPC http://purl.obolibrary.org/obo/CL_0011020 (Cell Ontology (CL)) http://purl.obolibrary.org/obo/cl.owl
neuron http://purl.obolibrary.org/obo/CL_0000540 (Cell Ontology (CL)) http://purl.obolibrary.org/obo/cl.owl
paclitaxel http://purl.obolibrary.org/obo/CHEBI_45863 (Chemical Entities of Biological Interest (CHEBI))
perturb-seq http://www.ebi.ac.uk/efo/EFO_0008860 (Experimental Factor Ontology (EFO))
perturbation sequencing http://www.ebi.ac.uk/efo/EFO_0008860 (Experimental Factor Ontology (EFO))
protein-protein interaction http://purl.obolibrary.org/obo/NCIT_C18469 (NCI Thesaurus (NCIT))
protein localization http://purl.obolibrary.org/obo/GO_0008104 (Gene Ontology (GO)) http://purl.obolibrary.org/obo/go/extensions/go-plus.owl
single-cell RNA sequencing http://www.ebi.ac.uk/efo/EFO_0008913 (Experimental Factor Ontology (EFO))
scRNAseq http://www.ebi.ac.uk/efo/EFO_0008913 (Experimental Factor Ontology (EFO))
SEC-MS
size exclusion chromatography
subcellular imaging
vorinostat http://purl.obolibrary.org/obo/CHEBI_45716 (Chemical Entities of Biological Interest (CHEBI)) |
|
Related Publication
| References: Clark T, Parker J, Schaffer L, Obernier K, Al Manir S, Churas CP, Dailamy A, Doctor Y, Forget A, Hansen JN, Hu M, Lenkiewicz J, Levinson MA, Marquez C, Nourreddine S, Niestroy J, Pratt D, Qian G, Thaker S, Bélisle-Pipon JC, Brandt C, Chen J, Ding Y, Fodeh S, Krogan N, Lundberg E, Mali P, Payne-Foster P, Ratcliffe S, Ravitsky V, Sali A, Schulz W, Ideker T. Cell Maps for Artificial Intelligence: AI-Ready Maps of Human Cell Architecture from Disease-Relevant Cell Lines. 2024. doi: http://doi.org/10.1101/2024.05.21.589311
Nourreddine S, Doctor Y, Dailamy A, Forget A, Lee YH, Chinn B, Khaliq H, Polacco B, Muralidharan M, Pan E, Zhang Y, Sigaeva A, Hansen JN, Gao J, Parker JA, Obernier K, Clark T, Chen JY, Metallo C, Lundberg E, Ideker T, Krogan N, Mali P. A Perturbation Cell Atlas of Human Induced Pluripotent Stem Cells. bioRxiv. 2024 Nov 4;2024.11.03.621734. PMCID: PMC11580897. doi: https://doi.org/10.1101/2024.11.03.621734 |
|
Data Creation Date
| 2025-02-27 |
|
Production Location
| University of California San Diego; University of California San Francisco; Stanford University; University of Virginia |
|
Funding Information
| National Institutes of Health: 1OT2OD032742-01 |
|
Depositor
| Niestroy, Justin |
|
Deposit Date
| 2025-02-27 |