Reference
chemoecology_tools
Chemoecology tools for chemical ecology analysis.
- class chemoecology_tools.GCMSExperiment(abundance_df, metadata_df, id_col='ID', experiment_name=None, chemical_metadata=None)
Gas Chromatography-Mass Spectrometry (GCMS) experimental data container.
Manages GCMS abundance data, experimental metadata, and chemical properties.
- Parameters:
abundance_df (DataFrame)
metadata_df (DataFrame)
id_col (str)
experiment_name (str | None)
chemical_metadata (dict[str, dict[str, Any]] | None)
- abundance_df
DataFrame containing GCMS chemical abundance measurements
- metadata_df
DataFrame containing sample and experimental metadata
- id_col
Column name used to join abundance and metadata
- experiment_name
Optional identifier for the experiment
- chemical_metadata
Dictionary of chemical properties from config
- calculate_relative_abundance()
Calculate relative abundance of chemical compounds.
- Returns:
GCMSExperiment with relative abundance values
- Return type:
GCMSExperiment
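For orientation, the sketch below shows one common meaning of relative abundance: each compound's value divided by its sample total. It uses plain pandas rather than the library. The table contents are invented, and whether `calculate_relative_abundance()` normalizes per sample in exactly this way is an assumption, not something this reference states.

```python
import pandas as pd

# Hypothetical abundance table: one row per sample, one column per compound.
abundance = pd.DataFrame(
    {"ID": ["s1", "s2"], "compound_a": [30.0, 10.0], "compound_b": [10.0, 10.0]}
)

# Every column except the ID column is treated as a compound.
chem_cols = [c for c in abundance.columns if c != "ID"]

# Assumed normalization: divide each value by its sample (row) total.
row_totals = abundance[chem_cols].sum(axis=1)
relative = abundance.copy()
relative[chem_cols] = abundance[chem_cols].div(row_totals, axis=0)
```

After this step each sample's compound values sum to 1, which is the scale on which a fractional threshold such as 0.005 is meaningful.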
- filter_samples(criteria)
Filter samples based on metadata criteria.
- Parameters:
criteria (dict[str, list[str]]) – Filtering criteria {column: values_to_exclude}
- Returns:
New GCMSExperiment with filtered data
- Return type:
GCMSExperiment
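Note that the criteria list values to exclude, not values to keep. A minimal sketch of that semantics in plain pandas follows. The `caste` column and its values are hypothetical, and the library's internal implementation may differ.

```python
import pandas as pd

# Hypothetical sample metadata.
metadata = pd.DataFrame(
    {"ID": ["s1", "s2", "s3"], "caste": ["worker", "queen", "worker"]}
)

# Documented shape: {column: values_to_exclude}.
criteria = {"caste": ["queen"]}

# Start by keeping every row, then drop rows matching any excluded value.
keep = pd.Series(True, index=metadata.index)
for column, excluded in criteria.items():
    keep &= ~metadata[column].isin(excluded)

filtered = metadata[keep].reset_index(drop=True)
```

Here the `queen` sample is removed and both `worker` samples remain.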
- filter_trace_compounds(threshold=0.005)
Filter out trace chemical amounts below threshold.
- Parameters:
threshold (float) – Minimum abundance value to keep (lower values set to 0)
- Returns:
GCMSExperiment with filtered abundance values
- Raises:
ValueError – If threshold is not between 0 and 1
- Return type:
GCMSExperiment
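The documented behavior (values below the threshold become 0, and an out-of-range threshold raises `ValueError`) can be sketched in plain pandas as shown below. The helper name `zero_trace_values` is hypothetical, and treating 0 and 1 as valid bounds is an assumption.

```python
import pandas as pd


def zero_trace_values(values: pd.DataFrame, threshold: float = 0.005) -> pd.DataFrame:
    """Hypothetical helper: set entries below `threshold` to 0."""
    # Assumption: the bounds 0 and 1 themselves are accepted.
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    # DataFrame.where keeps entries where the condition holds and
    # substitutes 0.0 everywhere else.
    return values.where(values >= threshold, 0.0)


# Hypothetical relative abundances (each row sums to 1).
relative = pd.DataFrame({"compound_a": [0.996, 0.5], "compound_b": [0.004, 0.5]})
cleaned = zero_trace_values(relative)
```

Rows are not renormalized in this sketch, so a row that loses a trace value no longer sums to exactly 1.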
- classmethod from_files(abundance_path, metadata_path, user_chemical_metadata=None, fetch_pubchem=True, id_col='ID', filter_dict=None, experiment_name=None)
Create experiment from data files.
- Parameters:
abundance_path (str | Path) – Path to abundance data file
metadata_path (str | Path) – Path to metadata file
user_chemical_metadata (str | Path | None) – Optional path to chemical properties YAML
fetch_pubchem (bool) – Whether to fetch PubChem data for chemicals
id_col (str) – Column name to join on
filter_dict (dict[str, list[str]] | None) – Optional filtering criteria {column: values_to_exclude}
experiment_name (str | None) – Optional experiment identifier
- Returns:
New GCMSExperiment instance
- Return type:
GCMSExperiment
- get_abundance_matrix()
Get chemical abundance matrix.
- Returns:
DataFrame containing only chemical abundance measurements
- Return type:
DataFrame
- get_chemical_property(chemical, property_name, default=None)
Get property value for a chemical.
- Parameters:
chemical (str) – Name of the chemical
property_name (str) – Name of the property to retrieve
default (Any | None) – Value to return if property not found
- Returns:
Property value or default if not found
- Return type:
Any
- get_chemicals_by_property(property_name, value)
Get chemicals that have a specific property value.
- Parameters:
property_name (str) – Name of the property to match
value (Any) – Value to match
- Returns:
List of chemical names with matching property
- Return type:
list[str]
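Both property lookups operate on `chemical_metadata`, documented as `dict[str, dict[str, Any]]`. The sketch below mirrors their described behavior on a plain dictionary. The compound names, property names, and the two helper functions are hypothetical stand-ins, not the library's code.

```python
from typing import Any

# Hypothetical chemical metadata in the documented nested-dict shape.
chemical_metadata = {
    "compound_a": {"class": "alkane", "chain_length": 23},
    "compound_b": {"class": "alkene", "chain_length": 25},
    "compound_c": {"class": "alkane"},
}


def lookup_property(chemical: str, property_name: str, default: Any = None) -> Any:
    """Return the property value, or `default` if chemical or property is missing."""
    return chemical_metadata.get(chemical, {}).get(property_name, default)


def chemicals_with(property_name: str, value: Any) -> list:
    """Return names of chemicals whose property equals `value`."""
    return [
        name
        for name, props in chemical_metadata.items()
        if property_name in props and props[property_name] == value
    ]


alkanes = chemicals_with("class", "alkane")
missing = lookup_property("compound_c", "chain_length", default=-1)
```

In this sketch a chemical that lacks the property is never matched, even when `value` is `None`.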
- get_metadata(columns=None)
Get metadata columns.
- Parameters:
columns (list[str] | None) – Optional list of column names to return
- Returns:
DataFrame containing requested metadata columns
- Return type:
DataFrame
- merge()
Merge abundance and metadata.
- Returns:
DataFrame with joined abundance and metadata
- Return type:
DataFrame
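The join described here corresponds to a pandas merge on the `id_col` column. A minimal sketch with invented data follows. The inner join and the column order are assumptions, since this reference does not say how unmatched samples are handled.

```python
import pandas as pd

# Hypothetical inputs sharing the default id_col, 'ID'.
abundance = pd.DataFrame({"ID": ["s1", "s2"], "compound_a": [0.75, 0.5]})
metadata = pd.DataFrame({"ID": ["s1", "s2"], "colony": ["c1", "c2"]})

# Assumed join type: inner, so only IDs present in both tables survive.
merged = metadata.merge(abundance, on="ID", how="inner")
```

The result holds one row per sample, with metadata columns followed by compound columns.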
- chemoecology_tools.perform_nmds(experiment, n_components=2, random_state=42)
Perform non-metric multidimensional scaling (NMDS) on chemical abundance data.
- Parameters:
experiment (GCMSExperiment) – GCMSExperiment instance containing the data
n_components (int) – Number of dimensions to reduce to
random_state (int) – Random seed for reproducibility
- Returns:
DataFrame containing NMDS coordinates (NMDS1, NMDS2)
- Return type:
DataFrame
- chemoecology_tools.plot_nmds(experiment, nmds_coords, group_col=None, title='NMDS Plot', width=10, aspect_ratio=0.618)
Create a styled NMDS plot for GCMS experiment data.
- Parameters:
experiment (GCMSExperiment) – GCMSExperiment instance containing the data
nmds_coords (DataFrame) – DataFrame containing NMDS coordinates (NMDS1, NMDS2)
group_col (str | None) – Optional metadata column name to group/color points by
title (str) – Plot title
width (float) – Figure width in inches
aspect_ratio (float) – Height/width ratio for the figure
- Returns:
matplotlib Figure object containing the styled plot
- Return type:
Figure
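Because `aspect_ratio` is a height/width ratio, the figure height follows from `width`. The arithmetic for the default values is sketched below, assuming the figure size is derived as `(width, width * aspect_ratio)`.

```python
# Default values from the plot_nmds signature.
width = 10.0
aspect_ratio = 0.618  # height / width

# Assumed derivation of the figure size in inches.
height = width * aspect_ratio
figsize = (width, height)
```

With the defaults this gives a figure roughly 10 inches wide by 6.18 inches tall.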
- chemoecology_tools.setup_plotting_style()
Configure global matplotlib and seaborn plotting style settings.
- Return type:
None