Database

Create and manage the pcvct database.

Public API

pcvct.printSimulationsTable - Method
printSimulationsTable()

Print a table of simulations and their varied values. See keyword arguments below for more control of the output.

There are many methods for this function. The simplest is printSimulationsTable(), which prints all simulations in the database. You can also pass in any number of simulations, monads, samplings, and trials to print a table of those simulations:

printSimulationsTable([simulation_1, monad_3, sampling_2, trial_1])

Finally, a vector of simulation IDs can be passed in:

printSimulationsTable([1, 2, 3])

Keyword arguments can be used with any of these methods to control the output:

Keyword Arguments

  • sink: A function that receives the table. Defaults to println. Because the table is a DataFrame, the sink can also write it to a CSV file with CSV.write, for example sink = df -> CSV.write("simulations.csv", df) (the file name is illustrative).
  • remove_constants::Bool: If true, removes columns that have the same value for all simulations. Defaults to true.
  • sort_by::Vector{String}: A vector of column names to sort the table by. Defaults to all columns. To populate this argument, first print the table to see the column names.
  • sort_ignore::Vector{String}: A vector of column names to ignore when sorting. Defaults to the database IDs associated with the simulations.

pcvct.simulationsTable - Method
simulationsTable(T; kwargs...)

Return a DataFrame with the simulation data by calling simulationsTableFromQuery with the given keyword arguments.

There are three options for T:

  • T can be any Simulation, Monad, Sampling, or Trial, or an array of these.
  • T can also be a vector of simulation IDs.
  • If omitted, creates a DataFrame for all the simulations.

Private API

pcvct.appendVariations - Method
appendVariations(location::Symbol, df::DataFrame)

Add the varied parameters associated with the location to df.

pcvct.constructSelectQuery - Function
constructSelectQuery(table_name::String, condition_stmt::String=""; selection::String="*")

Construct a SELECT query for the given table name, condition statement, and selection.
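
As a rough illustration of how the three pieces fit together, the sketch below assembles a SELECT string from a table name, an optional condition, and a selection. The name sketchSelectQuery is hypothetical, and the exact string layout produced by pcvct is an assumption; only the argument roles are taken from the signature above.

```julia
# Hypothetical stand-in for constructSelectQuery; the real string layout may differ.
function sketchSelectQuery(table_name::String, condition_stmt::String=""; selection::String="*")
    query = "SELECT $(selection) FROM $(table_name)"
    # Append the condition (e.g., a WHERE clause) only when one is given.
    isempty(condition_stmt) || (query *= " " * condition_stmt)
    return query * ";"
end

println(sketchSelectQuery("simulations"))
println(sketchSelectQuery("simulations", "WHERE simulation_id IN (1,2,3)"; selection="simulation_id"))
```

A string built this way could then be passed to a function such as queryToDataFrame.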

pcvct.createPCVCTTable - Method
createPCVCTTable(table_name::String, schema::String; db::SQLite.DB=centralDB())

Create a table in the database with the given name and schema. The table will be created if it does not already exist.

The table name must end in "s" to help normalize the ID names for these entries. The schema must have a PRIMARY KEY whose name is the table name without the trailing "s", followed by "_id". For example, a table named "simulations" needs the primary key "simulation_id".
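
The naming rule can be sketched as a small helper that derives the expected primary key name from a table name. The helper expectedIDName is hypothetical and not part of pcvct; how pcvct itself validates the schema is not shown here.

```julia
# Hypothetical helper illustrating the naming rule; not part of pcvct.
function expectedIDName(table_name::String)
    endswith(table_name, "s") ||
        throw(ArgumentError("table name must end in \"s\": $(table_name)"))
    # Drop the trailing "s" and append "_id".
    return chop(table_name) * "_id"
end

println(expectedIDName("simulations"))
println(expectedIDName("monads"))
```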

pcvct.createSchema - Method
createSchema(is_new_db::Bool; auto_upgrade::Bool=false)

Create the schema for the database. This includes creating the tables and populating them with data.

pcvct.initializeDatabase - Method
initializeDatabase(path_to_database::String; auto_upgrade::Bool=false)

Initialize the database at the given path. If the database does not exist, it will be created.

Also, check the version of pcvct used to create the database and upgrade it if necessary.

pcvct.inputFolderID - Method
inputFolderID(location::Symbol, folder_name::String; db::SQLite.DB=centralDB())

Retrieve the ID of the folder associated with the given location and folder name.

pcvct.inputFolderName - Method
inputFolderName(location::Symbol, id::Int)

Retrieve the folder name associated with the given location and ID.

pcvct.insertFolder - Function
insertFolder(location::Symbol, folder::String, description::String="")

Insert a folder into the database. If the folder already exists, it will be ignored.

If the folder already has a description from the metadata.xml file, that description will be used instead of the one provided.

pcvct.isStarted - Method
isStarted(simulation_id::Int[; new_status_code::Union{Missing,String}=missing])

Check if a simulation has been started. Can also pass in a Simulation object in place of the simulation ID.

If new_status_code is provided, update the status of the simulation to this value. The check and the status update are done in a single transaction so that another process cannot change the status between the two steps.

pcvct.metadataDescription - Method
metadataDescription(path_to_folder::AbstractString)

Get the description from the metadata.xml file in the given folder. The description is read from the description element that is a direct child of the root element.

pcvct.monadsSchema - Method
monadsSchema()

Create the schema for the monads table. This includes the columns and their types.

pcvct.queryToDataFrame - Method
queryToDataFrame(query::String; db::SQLite.DB=centralDB(), is_row::Bool=false)

Execute a query against the database and return the result as a DataFrame.

If is_row is true, the function will assert that the result has exactly one row, i.e., a unique result.

pcvct.reinitializeDatabase - Method
reinitializeDatabase()

Reinitialize the database by searching through the data/inputs directory to make sure all input folders found there are present in the database.

pcvct.samplingsSchema - Method
samplingsSchema()

Create the schema for the samplings table. This includes the columns and their types.

pcvct.simulationsTableFromQuery - Method
simulationsTableFromQuery(query::String; remove_constants::Bool=true, sort_by=String[], sort_ignore=[:SimID; shortLocationVariationID.(projectLocations().varied)])

Return a DataFrame containing the simulations table for the given query.

By default, sorting ignores the simulation ID and the variation IDs for the varied locations. The sort order can be controlled with the sort_by and sort_ignore keyword arguments.

By default, constant columns (columns with the same value for all simulations) will be removed (unless there is only one simulation). Set remove_constants to false to keep these columns.

Arguments

  • query::String: The SQL query to execute.

Keyword Arguments

  • remove_constants::Bool: If true, removes columns that have the same value for all simulations. Defaults to true.
  • sort_by::Vector{String}: A vector of column names to sort the table by. Defaults to all columns. To populate this argument, it is recommended to first print the table to see the column names.
  • sort_ignore::Vector{String}: A vector of column names to ignore when sorting. Defaults to the simulation ID and the variation IDs associated with the simulations.
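
The remove_constants rule can be sketched without DataFrames on a plain list of name => values pairs. This is only an illustration of the rule as described above (drop columns whose values are identical across simulations, unless there is a single simulation); the helper dropConstantColumns and the column names are hypothetical, and pcvct's own implementation operates on a DataFrame.

```julia
# Hypothetical sketch of the remove_constants rule on name => values pairs.
function dropConstantColumns(columns::AbstractVector{<:Pair})
    n_rows = isempty(columns) ? 0 : length(last(first(columns)))
    # With a single simulation every column is trivially constant, so keep them all.
    n_rows <= 1 && return collect(columns)
    # Keep only the columns that take more than one distinct value.
    return [name => vals for (name, vals) in columns if length(unique(vals)) > 1]
end

table = ["SimID" => [1, 2, 3],
         "max_time" => [60.0, 60.0, 60.0],   # same value everywhere: dropped
         "cycle_rate" => [0.1, 0.2, 0.3]]

println(first.(dropConstantColumns(table)))
```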

pcvct.variationIDs - Method
variationIDs(location::Symbol, S::AbstractSampling)

Return a vector of the variation IDs for the given location associated with S.

pcvct.variationsDatabase - Method
variationsDatabase(location::Symbol, folder::String)

Return the database for the location and folder.

The second argument can alternatively be the ID of the folder or an AbstractSampling object (simulation, monad, or sampling) using that folder.

pcvct.variationsTable - Method
variationsTable(query::String, db::SQLite.DB; remove_constants::Bool=false)

Return a DataFrame containing the variations table for the given query and database.

Remove constant columns if remove_constants is true and the DataFrame has more than one row.

pcvct.variationsTable - Method
variationsTable(location::Symbol, ::Missing, variation_ids::AbstractVector{<:Integer}; kwargs...)

If the location folder does not contain a variations database, return a DataFrame with all variation IDs set to 0.

pcvct.variationsTable - Method
variationsTable(location::Symbol, ::Nothing, variation_ids::AbstractVector{<:Integer}; kwargs...)

If the location is not being used, return a DataFrame with all variation IDs set to -1.
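
This method and the Missing method before it select a fallback by dispatching on the type of the second argument. The sketch below shows that dispatch pattern in isolation; placeholderVariationIDs is a hypothetical helper that returns plain vectors, whereas the pcvct methods return DataFrames.

```julia
# Hypothetical helper mirroring the two fallback cases; not pcvct's implementation.
# `missing`: the location folder has no variations database, so every ID is 0.
placeholderVariationIDs(::Missing, n::Integer) = zeros(Int, n)
# `nothing`: the location is not in use, so every ID is -1.
placeholderVariationIDs(::Nothing, n::Integer) = fill(-1, n)

println(placeholderVariationIDs(missing, 3))
println(placeholderVariationIDs(nothing, 2))
```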

pcvct.variationsTable - Method
variationsTable(location::Symbol, variations_database::SQLite.DB, variation_ids::AbstractVector{<:Integer}; remove_constants::Bool=false)

Return a DataFrame containing the variations table for the given location, variations database, and variation IDs.

pcvct.variationsTable - Method
variationsTable(location::Symbol, S::AbstractSampling; remove_constants::Bool=false)

Return a DataFrame containing the variations table for the given location and sampling.

pcvct.vctDBQuery - Method
vctDBQuery(query::String; db::SQLite.DB=centralDB())

Execute a query against the database and return the result.
