SRTsim

SRTsim was developed specifically for simulating spatial transcriptomics data. In addition to the gene expression profile, users must provide the spatial coordinates of each cell (spot). The reference data can be downloaded from the Zenodo URL given in the code below.

Estimating parameters from a real dataset

Before simulating datasets, we estimate the essential parameters from a real dataset so that the simulated data resemble it more closely.

library(simmethods)
# Load data (downloaded from https://zenodo.org/record/8251596/files/data118_spatial_OV.rds?download=1)
data <- readRDS("../../../../preprocessed_data/data118_spatial_OV.rds")
ref_data <- t(as.matrix(data$data$counts))

In addition, we pass the spatial coordinates of the spots through the spatial.x and spatial.y parameters.

other_prior <- list(spatial.x = data$data_info$spatial_coordinate$x,
                    spatial.y = data$data_info$spatial_coordinate$y)
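
Since the coordinates are plain vectors, they are presumably matched to spots by position, so it can help to confirm that each vector has one entry per spot before running the estimation. The following is a minimal sketch on toy data. The helper check_spatial_input() is hypothetical and not part of simmethods, and it assumes that ref_data stores genes in rows and spots in columns, as the transposition above suggests.

```r
# Hypothetical helper (not part of simmethods): checks that both coordinate
# vectors are numeric, complete, and as long as the number of spots (columns).
check_spatial_input <- function(ref_data, other_prior) {
  n_spots <- ncol(ref_data)
  for (name in c("spatial.x", "spatial.y")) {
    coord <- other_prior[[name]]
    if (is.null(coord)) stop(name, " is missing from other_prior")
    if (!is.numeric(coord)) stop(name, " must be numeric")
    if (length(coord) != n_spots) {
      stop(name, " has ", length(coord), " values but ref_data has ",
           n_spots, " spots")
    }
    if (anyNA(coord)) stop(name, " contains missing values")
  }
  invisible(TRUE)
}

# Toy data: 5 genes x 4 spots (assumed orientation: genes in rows)
set.seed(1)
toy_counts <- matrix(rpois(20, lambda = 3), nrow = 5,
                     dimnames = list(paste0("gene", 1:5), paste0("spot", 1:4)))
toy_prior <- list(spatial.x = c(1, 2, 1, 2), spatial.y = c(1, 1, 2, 2))
check_ok <- check_spatial_input(toy_counts, toy_prior)

# A prior with too few x coordinates is rejected with an informative message
bad_prior <- list(spatial.x = c(1, 2, 1), spatial.y = c(1, 1, 2, 2))
check_bad <- tryCatch(check_spatial_input(toy_counts, bad_prior),
                      error = function(e) conditionMessage(e))
```

With the real data, the same call would be check_spatial_input(ref_data, other_prior).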

Execute the parameter estimation:

estimate_result <- simmethods::SRTsim_estimation(
  ref_data = ref_data,
  other_prior = other_prior,
  verbose = TRUE,
  seed = 10
)
# Estimating parameters using SRTsim

Users can also supply the group labels of cells through the group.condition parameter:

other_prior <- list(spatial.x = data$data_info$spatial_coordinate$x,
                    spatial.y = data$data_info$spatial_coordinate$y,
                    group.condition = data$data_info$group_condition)
estimate_result <- simmethods::SRTsim_estimation(
  ref_data = ref_data,
  other_prior = other_prior,
  verbose = TRUE,
  seed = 10
)
# Estimating parameters using SRTsim

Simulating datasets using SRTsim

  1. Datasets with default parameters
  2. Simulate cell groups

Datasets with default parameters

simulate_result <- simmethods::SRTsim_simulation(
  parameters = estimate_result$estimate_result,
  other_prior = NULL,
  return_format = "SCE",
  seed = 111
)
# nSpots: 3492
# nGenes: 1056
# nGroups: 2
SCE_result <- simulate_result[["simulate_result"]]
dim(SCE_result)
# [1] 1056 3492
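# colData() is exported by SummarizedExperiment; it is qualified here in case that package is not attached
head(SummarizedExperiment::colData(SCE_result))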
# DataFrame with 6 rows and 4 columns
#                            x         y       group          cell_name
#                    <numeric> <numeric> <character>        <character>
# AAACAAGTATCTCCCA-1        27        38           B AAACAAGTATCTCCCA-1
# AAACACCAATAACTGC-1       110        29           B AAACACCAATAACTGC-1
# AAACAGGGTCTATATT-1       116        41           B AAACAGGGTCTATATT-1
# AAACATTTCCCGGATT-1        32        27           A AAACATTTCCCGGATT-1
# AAACCCGAACGAAATC-1        14        43           B AAACCCGAACGAAATC-1
# AAACCGGAAATGTTAA-1         5        34           B AAACCGGAAATGTTAA-1

Simulate cell groups

There are two strict rules for simulating cell groups with SRTsim:

  1. Users can simulate cell groups only when cell group labels were used for parameter estimation;

  2. The number of simulated cell groups must equal the number of real groups used in parameter estimation.
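
The two rules can be written as a small pre-check. This is only a sketch: can_simulate_groups() is a hypothetical helper, not a simmethods function, and it works on a plain vector of the group labels supplied at estimation time rather than on the estimation result itself.

```r
# Hypothetical helper (not part of simmethods).
# est_groups:   group labels used in parameter estimation, or NULL if none.
# n_sim_groups: number of cell groups requested for the simulation.
can_simulate_groups <- function(est_groups, n_sim_groups) {
  # Rule 1: group labels must have been used for parameter estimation
  if (is.null(est_groups)) return(FALSE)
  # Rule 2: the number of simulated groups must match the estimated groups
  length(unique(est_groups)) == n_sim_groups
}

labels_used <- c("A", "B", "B", "A", "B")
rule_ok        <- can_simulate_groups(labels_used, n_sim_groups = 2)  # TRUE
rule_mismatch  <- can_simulate_groups(labels_used, n_sim_groups = 3)  # FALSE
rule_no_labels <- can_simulate_groups(NULL, n_sim_groups = 2)         # FALSE
```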

Because cell group information was used in parameter estimation, we can simulate data with cell groups.

simulate_result <- simmethods::SRTsim_simulation(
  parameters = estimate_result$estimate_result,
  other_prior = NULL,
  return_format = "list",
  seed = 111
)
# nSpots: 3492
# nGenes: 1056
# nGroups: 2
cell_meta <- simulate_result$simulate_result$col_meta
head(cell_meta)
#                      x  y group          cell_name
# AAACAAGTATCTCCCA-1  27 38     B AAACAAGTATCTCCCA-1
# AAACACCAATAACTGC-1 110 29     B AAACACCAATAACTGC-1
# AAACAGGGTCTATATT-1 116 41     B AAACAGGGTCTATATT-1
# AAACATTTCCCGGATT-1  32 27     A AAACATTTCCCGGATT-1
# AAACCCGAACGAAATC-1  14 43     B AAACCCGAACGAAATC-1
# AAACCGGAAATGTTAA-1   5 34     B AAACCGGAAATGTTAA-1

The x and y columns represent the spatial positions of cells (spots), and the group column denotes the group labels of cells.

Check the group labels of cells:

table(cell_meta$group)
# 
#    A    B 
# 1051 2441
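
As a quick consistency check, the group sizes should add up to the number of simulated spots (3492), and converting them to proportions makes it easier to compare the simulated groups with the reference. The sketch below uses the counts copied from the output above, so it runs without the simulated object. With the real object, prop.table(table(cell_meta$group)) gives the same proportions.

```r
# Group sizes copied from the table() output above
group_sizes <- c(A = 1051, B = 2441)

# Total number of spots; should match the nSpots value reported by the simulation
n_spots <- sum(group_sizes)
n_spots
# [1] 3492

# Share of spots in each group, rounded to three decimals
group_prop <- round(group_sizes / n_spots, 3)
group_prop
#     A     B 
# 0.301 0.699
```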

Visualize the spatial spots:

library(ggplot2)
location <- simulate_result$simulate_result$col_meta
p <- ggplot(location, aes(x = x, y = y))+
  geom_point(aes(color = group))+
  theme(panel.grid = element_blank(),
        axis.title = element_blank(),
        axis.text = element_blank(),
        legend.position = "bottom")
p