About

wss-test: Water Sample Schema

The wss-test schema defines a structured data model for water sample measurements. It extends the BERtron common data model types (Attribute, QuantityValue) with environmental measurement provenance and variable semantics.

Core Idea

BERtron defines the value containers; wss-test adds measurement provenance.

The schema inherits base types from BERtron and extends them:

  • Dataset (maps to bertron:DataCollection) — top-level container, adding variables[] and samples[] to group definitions and data together
  • Sample (maps to bertron:Entity) — replaces generic properties[] with typed site_code, medium, replicate, and measurements[]
  • Variable (extends bertron:Attribute) — semantic definition of what is being measured. Inherits label from bertron:Attribute to name the variable, and adds expression_basis, default_unit, and missing_value_code
  • Measurement (extends bertron:QuantityValue) — a single measured value with full provenance, adding method_id, flag, datetime_measured, statistic, temporal_aggregation, reported_precision, and notes
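The inheritance described above can be sketched with plain Python dataclasses. This is only an illustration under stated assumptions, not the generated datamodel: the `Attribute` and `QuantityValue` classes below are simplified stand-ins for the BERtron types, the field types (for example `replicate` as an integer) are guesses, and every identifier value in the usage lines is made up.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Attribute:
    """Simplified stand-in for bertron:Attribute."""
    id: str
    label: str


@dataclass
class QuantityValue:
    """Simplified stand-in for bertron:QuantityValue."""
    numeric_value: Optional[float] = None
    unit: Optional[str] = None


@dataclass
class Variable(Attribute):
    """What is being measured; defined once per variable."""
    expression_basis: Optional[str] = None
    default_unit: Optional[str] = None
    missing_value_code: Optional[str] = None


@dataclass
class Measurement(QuantityValue):
    """A single measured value plus its provenance (field types assumed)."""
    method_id: Optional[str] = None
    flag: Optional[str] = None
    datetime_measured: Optional[str] = None
    statistic: Optional[str] = None
    temporal_aggregation: Optional[str] = None
    reported_precision: Optional[int] = None
    notes: Optional[str] = None


@dataclass
class Sample:
    """Maps to bertron:Entity; typed fields replace generic properties[]."""
    id: str
    name: Optional[str] = None
    site_code: Optional[str] = None
    medium: Optional[str] = None
    replicate: Optional[int] = None
    measurements: list[Measurement] = field(default_factory=list)


@dataclass
class Dataset:
    """Maps to bertron:DataCollection; groups definitions and data."""
    id: str
    title: Optional[str] = None
    description: Optional[str] = None
    variables: list[Variable] = field(default_factory=list)
    samples: list[Sample] = field(default_factory=list)


# Hypothetical usage: all identifiers and values are invented for illustration.
do = Variable(id="var:do", label="dissolved oxygen", default_unit="mg/L")
m = Measurement(numeric_value=8.2, unit="mg/L", method_id="method:winkler")
s = Sample(id="sample:001", site_code="SITE-A", medium="water",
           replicate=1, measurements=[m])
ds = Dataset(id="dataset:demo", title="Demo", variables=[do], samples=[s])
```

In the real project these classes would come from the LinkML schema and its generated Python models rather than being written by hand.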

What Goes Where

| Concept | BERtron provides | wss-test adds |
|---|---|---|
| Data container | DataCollection (id, title, description) | Dataset + variables[], samples[] |
| Sample context | Entity (id, name, properties[]) | Sample + site_code, medium, replicate, measurements[] |
| What was measured | Attribute (id, label) | Variable + expression_basis, default_unit, missing_value_code |
| How much | QuantityValue.numeric_value, .unit | (none) |
| How it was measured | (none) | Measurement.method_id |
| Quality | (none) | Measurement.flag, .notes |
| When | (none) | Measurement.datetime_measured |
| Aggregation | (none) | Measurement.statistic, .temporal_aggregation |
| Precision | (none) | Measurement.reported_precision |

Key Design Decision

The previous MeasurementSpecification class has been eliminated. It conflated two concerns:

  • What is being measured — now captured by Variable, defined once per variable
  • How it was measured — now travels with each Measurement via method_id

This means you define "dissolved oxygen" (DO) once as a Variable, not once per analytical method. Two DO methods applied to the same sample are distinguished by method_id, not by duplicated variable definitions.
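A minimal sketch of this split, using plain dictionaries rather than the generated models. The `variable_id` key that links a measurement to its variable is a hypothetical name chosen for illustration (the schema may express this link differently), and the method identifiers and values are invented.

```python
from collections import defaultdict

# One Variable definition, regardless of how many methods measure it.
variables = {
    "var:do": {"label": "dissolved oxygen", "default_unit": "mg/L"},
}

# Two measurements of the same variable on the same sample.
# "variable_id" is an assumed linking key; method ids are made up.
measurements = [
    {"variable_id": "var:do", "method_id": "method:winkler",
     "numeric_value": 8.2, "unit": "mg/L"},
    {"variable_id": "var:do", "method_id": "method:optical",
     "numeric_value": 8.4, "unit": "mg/L"},
]


def group_by_method(items):
    """Group numeric values by (variable_id, method_id)."""
    grouped = defaultdict(list)
    for item in items:
        key = (item["variable_id"], item["method_id"])
        grouped[key].append(item["numeric_value"])
    return dict(grouped)


by_method = group_by_method(measurements)
```

Both measurements resolve to the same entry in `variables`, while `by_method` keeps them apart under two distinct keys.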

Technology Stack

  • LinkML — Linked Data Modeling Language for schema definition
  • MkDocs with Material theme for documentation
  • Python for data validation and transformation

Development

Prerequisites

  • Python 3.10+
  • uv for package management
  • just for running commands

Quick Start

# Install dependencies
just install

# Generate documentation
just gen-doc

# Run local documentation server
just testdoc

# Run all tests
just test

Project Structure

wss-demo/
├── src/
│   ├── docs/                    # Documentation source files
│   └── wss_test/
│       ├── schema/              # LinkML schema definition
│       └── datamodel/           # Generated Python models
├── docs/                        # Generated documentation (do not edit)
├── project/                     # Generated artifacts
├── tests/
│   └── data/                    # Test data files
└── examples/                    # Usage examples

License

This project is released under the MIT License.

Acknowledgments

This project uses the linkml-project-copier template for project structure and build tooling.

Contact

For questions or feedback, please open an issue on the GitHub repository.