Analysis
Analyze output from a pcvct project. It is anticipated that this will eventually be split off into its own module or even package. Possibly with loader.jl.
Public API
pcvct.connectedComponents
— FunctionconnectedComponents(snapshot::PhysiCellSnapshot, graph=:neighbors; include_cell_type_names=:all_in_one, exclude_cell_type_names::String[], include_dead::Bool=false)
Find the connected components of a graph in a PhysiCell snapshot.
The computation can be done on subsets of cells based on their cell types.
Arguments
snapshot::PhysiCellSnapshot
: the snapshot to analyzegraph
: the graph data to use (default is:neighbors
); must be one of:neighbors
,:attachments
, or:spring_attachments
; can also be a string
Keyword Arguments
include_cell_type_names
: the cell types to include in the analysis (default is:all_in_one
). Full list of options::all
- compute connected components for all cell types individually:all_in_one
- compute connected components for all cell types together"cell_type_1"
- compute connected components only for the cells of typecell_type_1
["cell_type_1", "cell_type_2"]
- compute connected components for the cells of typecell_type_1
andcell_type_2
separately[["cell_type_1", "cell_type_2"]]
- compute connected components for the cells of typecell_type_1
andcell_type_2
together[["cell_type_1", "cell_type_2"], "cell_type_3"]
- compute connected components for the cells of typecell_type_1
andcell_type_2
together, and for the cells of typecell_type_3
separately
exclude_cell_type_names
: the cell types to exclude from the analysis (default isString[]
); can be a single string or a vector of stringsinclude_dead
: whether to include dead cells in the analysis (default isfalse
)
Returns
A dictionary in which each key is one of the following:
- cell type name (String)
- list of cell type names (Vector{String})
- list of cell type names followed by the symbol :include_dead (Vector{Any})
For each key, the value is a list of connected components in the graph. Each component is represented as a vector of vertex labels. As of this writing, the vertex labels are the simple AgentID
class that wraps the cell ID.
pcvct.motilityStatistics
— MethodmotilityStatistics(simulation_id::Integer[; direction=:any])
Return the mean speed, distance traveled, and time alive for each cell in the simulation, broken down by cell type in the case of cell type transitions.
The time is counted from when the cell first appears in simulation output until it dies or the simulation ends, whichever comes first. If the cell transitions to a new cell type during the simulation, the time is counted for each cell type separately. Each cell type taken on by a given cell will be a key in the dictionary returned at that entry.
Arguments
simulation_id::Integer
: The ID of the PhysiCell simulation. ASimulation
object can also be passed in.direction::Symbol
: The direction to compute the mean speed. Can be:x
,:y
,:z
, or:any
(default). If:x
, for example, the mean speed is calculated using only the x component of the cell's movement.
Returns
AgentDict{Dict{String, NamedTuple}}
: AnAgentDict
, i.e., one entry per cell in the simulation. Each dictionary has keys for each cell type taken on by the cell. The values are NamedTuples with fields:time
,:distance
, and:speed
.
Example
ms = motilityStatistics(1) # an AgentDict{Dict{String, NamedTuple}}, one per cell in the simulation
ms[1]["epithelial"] # NamedTuple with fields :time, :distance, :speed for the cell with ID 1 in the simulation corresponding to its time as an `epithelial` cell
ms[1]["mesenchymal"].time # time spent as a `mesenchymal` cell for the cell with ID 1 in the simulation
ms[1]["mesenchymal"].distance # distance traveled as a `mesenchymal` cell for the cell with ID 1 in the simulation
ms[1]["mesenchymal"].speed # mean speed as a `mesenchymal` cell for the cell with ID 1 in the simulation
PairCorrelationFunction.pcf
— Functionpcf(S::AbstractPhysiCellSequence, center_cell_types, target_cell_types=center_cell_types; include_dead::Union{Bool,Tuple{Bool,Bool}}=false, dr::Float64=20.0)
Calculate the pair correlation function (PCF) between two sets of cell types in a PhysiCell simulation snapshot or sequence.
The center_cell_types
and target_cell_types
can be strings or vectors of strings. This will compute one PCF rather than one for each pair of (center, target) cell types, i.e., all centers are compared to all targets. If omitted, the targetcelltypes will be the same as the centercelltypes, i.e., not a cross-PCF. The include_dead
argument can be a boolean or a tuple of booleans to indicate whether to include the dead centers and/or targets, respectively. The dr
argument specifies the bin size (thickness of each annulus) for the PCF calculation.
Arguments
S::AbstractPhysiCellSequence
: APhysiCellSnapshot
orPhysiCellSequence
object.center_cell_types
: The cell type name(s) to use as the center of the PCF.target_cell_types
: The cell type name(s) to use as the target of the PCF.
Keyword Arguments
include_dead::Union{Bool,Tuple{Bool,Bool}}
: Whether to include dead cells in the PCF calculation. If a tuple, the first element indicates whether to include dead centers and the second element indicates whether to include dead targets.dr::Float64
: The bin size for the PCF calculation.
Alternate methods
pcf(simulation::Simulation, index::Union{Integer, Symbol}, center_cell_types, target_cell_types=center_cell_types; kwargs...)
: Calculate the PCF for a specific snapshot in a simulation.pcf(simulation_id::Integer, index::Union{Integer, Symbol}, center_cell_types, target_cell_types=center_cell_types; kwargs...)
: Calculate the PCF for a specific snapshot in a simulation by ID.pcf(simulation_id::Integer, center_cell_types, target_cell_types=center_cell_types; kwargs...)
: Calculate the PCF for all snapshots in a simulation by ID.pcf(simulation::Simulation, center_cell_types, target_cell_types=center_cell_types; kwargs...)
: Calculate the PCF for all snapshots in a simulation.
Returns
A PCVCTPCFResult
object containing the time, radii, and g values of the PCF. Regardless of the type of S
, the time and radii will always be vectors. If S
is a snapshot, the g values will be a vector of the PCF. If S
is a sequence, the g values will be a (length(radii)-1 x length(time)) matrix of the PCF.
pcvct.finalPopulationCount
— FunctionfinalPopulationCount(simulation::Simulation[; include_dead::Bool=false])
Return the final population count of a simulation as a dictionary with cell type names as keys and their counts as values.
Also works with the simulation ID:
fpc = finalPopulationCount(1)
Example
fpc = finalPopulationCount(simulation)
final_default_count = fpc["default"]
pcvct.plotbycelltype
— Functionplotbycelltype(T::AbstractTrial; include_dead::Bool=false, include_cell_type_names=:all, exclude_cell_type_names=String[], time_unit=:min)
Plot the population time series of a trial by cell type.
Each cell type gets its own subplot. Each monad gets its own series within each subplot.
Arguments
T::AbstractTrial
: The trial to plot.
Keyword Arguments
include_dead::Bool
: Whether to include dead cells in the count. Default isfalse
.include_cell_type_names::Vector{String}
: A vector of cell type names to include in the plot. Default is:all
, which includes all cell types in separate plots.exclude_cell_type_names::Vector{String}
: A vector of cell type names to exclude from the plot. Default is an empty vector, i.e., no cell types are excluded.time_unit::Symbol
: The time unit to use for the x-axis. Default is:min
, which uses minutes.
pcvct.AverageSubstrateTimeSeries
— TypeAverageSubstrateTimeSeries
A struct to hold the average substrate concentrations over time for a PhysiCell simulation.
Constructed using AverageSubstrateTimeSeries(x)
where x
is any of the following: Integer
(simulation ID), PhysiCellSequence
, or Simulation
.
Fields
simulation_id::Int
: The ID of the PhysiCell simulation.time::Vector{Real}
: The time points at which the snapshots were taken.substrate_concentrations::Dict{String, Vector{Real}}
: A dictionary mapping substrate names to vectors of their average concentrations over time.
Example
```julia asts = pcvct.AverageSubstrateTimeSeries(1) # Load average substrate time series for Simulation 1 asts.time # Get the time points asts["time"] # alternative way to get the time points asts["oxygen"] # Get the oxygen concentration over time
pcvct.ExtracellularSubstrateTimeSeries
— TypeExtracellularSubstrateTimeSeries
A struct to hold the mean extracellular substrate concentrations per cell type over time for a PhysiCell simulation.
Fields
simulation_id::Int
: The ID of the PhysiCell simulation.time::Vector{Real}
: The time points at which the snapshots were taken.data::Dict{String, Dict{String, Vector{Real}}}
: A dictionary mapping cell type names to dictionaries mapping substrate names to vectors of their average concentrations over time.
Example
ests = pcvct.ExtracellularSubstrateTimeSeries(1) # Load extracellular substrate time series for Simulation 1
ests.time # Get the time points
ests["cancer"]["oxygen"] # Get the oxygen concentration over time for the cancer cell type
ests = pcvct.ExtracellularSubstrateTimeSeries(simulation; include_dead=true) # Load extracellular substrate time series for a Simulation object, including dead cells
ests["time"] # Alternate way to get the time points
ests["cd8"]["IFNg"] # Get the interferon gamma concentration over time for the CD8 cell type
ests = pcvct.ExtracellularSubstrateTimeSeries(sequence) # Load extracellular substrate time series for a PhysiCellSequence object
Private API
pcvct._connectedComponents
— Method_connectedComponents(G::MetaGraph)
Find the connected components of a graph.
First compute the transitive closure of the underlying Graphs.Graph
object. Then, update the edge data of the MetaGraph
object to reflect the new edges. Finally, loop over vertex labels and find the vertices they are connected to until all vertices are accounted for. Returns a list of connected components, where each component is represented as a vector of vertex labels.
pcvct._motilityStatistics
— Method_motilityStatistics(p[; direction=:any])
Compute the motility statistics for a single cell in the PhysiCell simulation.
Accounts for cell type transitions and computes the distance traveled, time spent, and mean speed for each cell type the given cell has taken on during the simulation. The speed can be restricted to a specific direction (x, y, z) or calculated in any direction. In either case, the distance is unsigned.
This function is used internally by motilityStatistics
.
Returns
A Dict{String, NamedTuple}
where each key is a cell type name visited by the cell and the value is a NamedTuple
with fields :time
, :distance
, and :speed
. The values in this named tuple are the time, distance traveled, and mean speed for the cell in that cell type, i.e., all scalars.
pcvct.PCVCTPCFResult
— TypePCVCTPCFResult
A struct to hold the results of the pair correlation function (PCF) calculation.
The start and end radii for each annulus are stored in the radii
field. Thus, there is one more radius than there are annuli, i.e. length(radii) == size(g, 1) + 1
. Each column of g
corresponds to a time point in the time
field, hence size(g, 2) == length(time)
.
Fields
time::Vector{Float64}
: The time points at which the PCF was calculated.pcf_result::PairCorrelationFunction.PCFResult
: The result of the PCF calculation.
Example
using PairCorrelationFunction
time = 12.0
radii = [0.0, 1.0, 2.0]
g = [0.5, 1.2]
pcvct.PCVCTPCFResult(time, PairCorrelationFunction.PCFResult(radii, g))
# output
PCVCTPCFResult:
Time: 12.0
Radii: 0.0 - 2.0 with 2 annuli, Δr = 1.0
g: 0.5 - 1.2 (min - max)
using PairCorrelationFunction
time = [12.0; 24.0; 36.0]
radii = [0.0, 1.0, 2.0]
g = [0.5 0.6 0.4; 1.2 1.15 1.4]
pcvct.PCVCTPCFResult(time, PairCorrelationFunction.PCFResult(radii, g))
# output
PCVCTPCFResult:
Time: 12.0 - 36.0 (n = 3)
Radii: 0.0 - 2.0 with 2 annuli, Δr = 1.0
g: 0.4 - 1.4 (min - max)
pcvct.cellPositionsForPCF
— MethodcellPositionsForPCF(cells::DataFrame, cell_types::Vector{String}, cell_name_to_type_dict::Dict{String,Int}, include_dead::Bool, is_3d::Bool)
Get the positions of the cells for the PCF calculation.
pcvct.isCrossPCF
— MethodisCrossPCF(center_cell_types::Vector{String}, target_cell_types::Vector{String})
Check if the center and target cell types are the same (PCF) or disjoint (cross-PCF); if neither, throw an error.
pcvct.pcfConstants
— MethodpcfConstants(S::AbstractPhysiCellSequence, dr::Float64)
Create a Constants
object for the PCF calculation based on the mesh of the snapshot or sequence.
pcvct.pcfSnapshotCalculation
— MethodpcfSnapshotCalculation(snapshot::PhysiCellSnapshot, center_cell_types::Vector{String}, target_cell_types::Vector{String}, cell_name_to_type_dict::Dict{String,Int}, include_dead::Union{Bool,Tuple{Bool,Bool}}, is_cross_pcf::Bool, constants::Constants)
Calculate the pair correlation function (PCF) for a given snapshot.
pcvct.preparePCF!
— MethodpreparePCF!(S::AbstractPhysiCellSequence, center_cell_types, target_cell_types, include_dead::Union{Bool,Tuple{Bool,Bool}}, dr::Float64)
Prepare the arguments for the PCF calculation.
pcvct.preparePCFPlot
— MethodpreparePCFPlot(results::Vector{PCVCTPCFResult}; time_unit=:min, distance_unit=:um)
Prepare the time and radii for the PCF plot.
pcvct.processDistance
— MethodprocessDistance(distance::Vector{Float64}, distance_unit::Symbol)
Process the distance vector to convert it to the desired distance unit. Options are :um, :mm, and :cm.
pcvct.processPCFCellTypes
— MethodprocessPCFCellTypes(cell_types)
Process the cell types for the PCF calculation so that they are always a vector of strings.
pcvct.processTime
— MethodprocessTime(time::Vector{Float64}, time_unit::Symbol)
Process the time vector to convert it to the desired time unit. Options are :min, :s, :h, :d, :w, :mo, and :y.
pcvct.AbstractPopulationTimeSeries
— TypeAbstractPopulationTimeSeries
Abstract type representing a population time series for either a simulation or a monad.
pcvct.MonadPopulationTimeSeries
— TypeMonadPopulationTimeSeries <: AbstractPopulationTimeSeries
Holds the data for a monad's population time series.
Note: unlike SimulationPopulationTimeSeries
, this type does not save the data to a file.
Examples
mpts = MonadPopulationTimeSeries(1)
mpts = MonadPopulationTimeSeries(monad(1))
Fields
monad_id::Int
: The ID of the monad.monad_length::Int
: The number of simulations in the monad.time::Vector{Real}
: The time points of the population time series.cell_count::Dict{String, NamedTuple}
: A dictionary where keys are cell type names and values are NamedTuples with fields:counts
,:mean
, and:std
.
pcvct.SimulationPopulationTimeSeries
— TypeSimulationPopulationTimeSeries <: AbstractPopulationTimeSeries
Holds the data for a simulation's population time series.
If constructed using a Simulation
or an Integer
(representing a simulation ID), it will save the time series inside the simulations/simulation_id/summary/
folder. It will also look for previously computed time series there to avoid recomputing them.
Examples
spts = SimulationPopulationTimeSeries(1) # first checks if the population time series is already computed and if not, computes it
spts = SimulationPopulationTimeSeries(Simulation(1)) # first checks if the population time series is already computed and if not, computes it
spts = SimulationPopulationTimeSeries(1; include_dead=true) # similar, but counts dead cells as well; the file name has "_include_dead" appended
Fields
simulation_id::Int
: The ID of the simulation.time::Vector{Real}
: The time points of the population time series.cell_count::Dict{String, Vector{Integer}}
: A dictionary where keys are cell type names and values are vectors of cell counts over time.
pcvct.formatTimeRange
— MethodformatTimeRange(v::Vector{<:Real})
Format a vector of time points into a string representation.
Used only for printing certain classes to the console.
pcvct.getMeanCounts
— MethodgetMeanCounts(apts::AbstractPopulationTimeSeries)
Return the mean counts of a population time series.
pcvct.populationCount
— FunctionpopulationCount(snapshot, cell_type_to_name_dict::Dict{Int,String}=Dict{Int,String}(), labels::Vector{String}=String[]; include_dead::Bool=false)
Return the population count of a snapshot as a dictionary with cell type names as keys and their counts as values.
If the snapshot is missing, it will return missing. This helps in cases where the files have been deleted, for example by pruning.
Arguments
snapshot::PhysiCellSnapshot
: The snapshot to count the cells in.cell_type_to_name_dict::Dict{Int,String}
: A dictionary mapping cell type IDs to names (default is an empty dictionary). If not provided, it is read from the snapshot files.labels::Vector{String}
: The labels to identify the cell data. If not provided, it is read from the snapshot files.
Keyword Arguments
include_dead::Bool
: Whether to include dead cells in the count. Default isfalse
.
pcvct.populationTimeSeries
— MethodpopulationTimeSeries(M::AbstractMonad[; include_dead::Bool=false])
Return the population time series of a simulation or a monad.
See SimulationPopulationTimeSeries
and MonadPopulationTimeSeries
for more details.
pcvct.processExcludeCellTypes
— MethodprocessExcludeCellTypes(exclude_cell_type_names)
Process the exclude_cell_type_names
argument to ensure it is in the correct format.
If exclude_cell_type_names
is a string, it is converted to a single-element vector. If it is a vector, it is returned as is.
pcvct.processIncludeCellTypes
— MethodprocessIncludeCellTypes(include_cell_type_names, all_cell_types::Vector{String})
Process the include_cell_type_names
argument to ensure it is in the correct format.
Uses the all_cell_types
vector to determine the valid cell types.
Arguments
include_cell_type_names
: the cell types to include in the analysis (default is:all_in_one
). Full list of options::all
- return the vector of all cell types:all_in_one
- return a vector with a single element, which is a vector of all cell types"cell_type_1"
- return ["celltype1"]["cell_type_1", "cell_type_2"]
- return ["celltype1", "celltype2"][["cell_type_1", "cell_type_2"]]
- return [["celltype1", "celltype2"]][["cell_type_1", "cell_type_2"], "cell_type_3"]
- return [["celltype1", "celltype2"], "celltype3"]
all_cell_types
: a vector of all cell types in the simulation
pcvct.VoxelWeights
— TypeVoxelWeights
A struct to hold the voxel weights for a PhysiCell simulation.
Fields
use_weights::Bool
: Whether to use weights for the voxel volumes. If all voxel volumes are the same, this is set tofalse
.weights::Vector{Real}
: The weights for the voxel volumes.weight_total::Real
: The total weight of the voxel volumes.
pcvct.averageExtracellularSubstrate
— FunctionaverageExtracellularSubstrate(snapshot::PhysiCellSnapshot, cell_type_to_name_dict::Dict{Int, String}=Dict{Int, String}(), substrate_names::Vector{String}=String[], labels::Vector{String}=String[]; include_dead::Bool=false)
Compute the average extracellular substrate concentrations for each cell type in a PhysiCell snapshot.
Arguments
snapshot::PhysiCellSnapshot
: The snapshot to analyze.cell_type_to_name_dict::Dict{Int, String}
: A dictionary mapping cell type IDs to their names. If not provided, it is read from the snapshot files.substrate_names::Vector{String}
: The names of the substrates in the simulation. If not provided, it is read from the snapshot files.labels::Vector{String}
: The labels to use for the cells. If not provided, it is read from the snapshot files.
Keyword Arguments
include_dead::Bool
: Whether to include dead cells in the analysis (default isfalse
).
Returns
Dict{String, Dict{String, Real}}
: A dictionary mapping cell type names to dictionaries mapping substrate names to their average concentrations.
That is, if aes
is the output of this function, then aes["cell_type_name"]["substrate_name"]
is the average concentration of substrate_name
for cells of type cell_type_name
in the snapshot.
pcvct.averageSubstrate
— FunctionaverageSubstrate(snapshot::PhysiCellSnapshot, substrate_names::Vector{String}=String[], voxel_weights::VoxelWeights=VoxelWeights(snapshot))
Compute the average substrate concentrations for every substrate in a PhysiCell snapshot.
The voxel volumes are used as weights if they are not all the same.
Arguments
snapshot::PhysiCellSnapshot
: The snapshot to analyze.substrate_names::Vector{String}
: The names of the substrates in the simulation. If not provided, it is read from the snapshot files.voxel_weights::VoxelWeights
: The voxel weights to use. If not provided, it is computed from the snapshot.
Returns
data::Dict{String, Real}
: A dictionary mapping substrate names to their average concentrations.
pcvct.computeVoxelIndices
— MethodcomputeVoxelIndices(cells::DataFrame, mesh::Dict{String, Vector{Float64}})
Compute the voxel (linear) indices (1-nxnynz) for a set of cells in a PhysiCell simulation.
pcvct.computeVoxelSubscripts
— MethodcomputeVoxelSubscripts(cells::DataFrame, mesh::Dict{String, Vector{Float64}})
Compute the voxel subscripts (1-nx, 1-ny, 1-nz) for a set of cells in a PhysiCell simulation.