About
wss-test: Water Sample Schema
The wss-test schema defines a structured data model for water sample measurements. It extends the BERtron common data model types (Attribute, QuantityValue) with environmental measurement provenance and variable semantics.
Core Idea
BERtron defines the value containers; wss-test adds measurement provenance.
The schema inherits base types from BERtron and extends them:
- Dataset (maps to bertron:DataCollection): top-level container, adding variables[] and samples[] to group definitions and data together
- Sample (maps to bertron:Entity): replaces generic properties[] with typed site_code, medium, replicate, and measurements[]
- Variable (extends bertron:Attribute): semantic definition of what is being measured. Inherits label from bertron:Attribute to name the variable, and adds expression_basis, default_unit, and missing_value_code
- Measurement (extends bertron:QuantityValue): a single measured value with full provenance, adding method_id, flag, datetime_measured, statistic, temporal_aggregation, reported_precision, and notes
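The class relationships above can be sketched with plain Python dataclasses. This is an illustration only, not the generated datamodel: the `Attribute` and `QuantityValue` stand-ins carry just the fields this README mentions, all field types are assumptions, and the identifiers and values in the usage lines are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Stand-ins for the BERtron base types, limited to fields named in this README.
@dataclass
class Attribute:
    id: str
    label: str


@dataclass
class QuantityValue:
    numeric_value: float
    unit: str


@dataclass
class Variable(Attribute):
    # Fields added by wss-test (types assumed).
    expression_basis: Optional[str] = None
    default_unit: Optional[str] = None
    missing_value_code: Optional[str] = None


@dataclass
class Measurement(QuantityValue):
    # Provenance fields added by wss-test (types assumed).
    method_id: Optional[str] = None
    flag: Optional[str] = None
    datetime_measured: Optional[str] = None
    statistic: Optional[str] = None
    temporal_aggregation: Optional[str] = None
    reported_precision: Optional[int] = None
    notes: Optional[str] = None


@dataclass
class Sample:
    # id and name correspond to bertron:Entity; the rest replace properties[].
    id: str
    name: str
    site_code: Optional[str] = None
    medium: Optional[str] = None
    replicate: Optional[int] = None
    measurements: List[Measurement] = field(default_factory=list)


@dataclass
class Dataset:
    # id, title, description correspond to bertron:DataCollection.
    id: str
    title: str
    description: Optional[str] = None
    variables: List[Variable] = field(default_factory=list)
    samples: List[Sample] = field(default_factory=list)


# Hypothetical usage: one variable, one sample, one measurement.
do = Variable(id="var:do", label="dissolved oxygen", default_unit="mg/L")
m = Measurement(numeric_value=8.2, unit="mg/L", method_id="method:winkler")
sample = Sample(id="s1", name="Sample 1", site_code="SITE-A",
                medium="water", measurements=[m])
ds = Dataset(id="ds1", title="Demo dataset", variables=[do], samples=[sample])
```

In the real project the Python classes are generated from the LinkML schema, so the generated names and types take precedence over this sketch.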
What Goes Where
| Concept | BERtron provides | wss-test adds |
|---|---|---|
| Data container | DataCollection (id, title, description) | Dataset + variables[], samples[] |
| Sample context | Entity (id, name, properties[]) | Sample + site_code, .medium, .replicate, measurements[] |
| What was measured | Attribute (id, label) | Variable + .expression_basis, .default_unit, .missing_value_code |
| How much | QuantityValue.numeric_value, .unit | — |
| How it was measured | — | Measurement.method_id |
| Quality | — | Measurement.flag, .notes |
| When | — | Measurement.datetime_measured |
| Aggregation | — | Measurement.statistic, .temporal_aggregation |
| Precision | — | Measurement.reported_precision |
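
As a rough illustration of the split in the table, the following sketch separates a flat measurement record into the part BERtron's QuantityValue provides and the part wss-test adds. The field names come from the table; the helper `split_measurement` and the example record are hypothetical and not part of the schema or its tooling.

```python
from typing import Any, Dict, Tuple

# Field names taken from the "What Goes Where" table.
BERTRON_FIELDS = {"numeric_value", "unit"}
WSS_FIELDS = {
    "method_id", "flag", "notes", "datetime_measured",
    "statistic", "temporal_aggregation", "reported_precision",
}


def split_measurement(record: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (bertron_part, wss_part) for a flat measurement record."""
    unknown = set(record) - BERTRON_FIELDS - WSS_FIELDS
    if unknown:
        raise KeyError(f"unexpected fields: {sorted(unknown)}")
    base = {k: v for k, v in record.items() if k in BERTRON_FIELDS}
    added = {k: v for k, v in record.items() if k in WSS_FIELDS}
    return base, added


# Hypothetical record; values are made up for illustration.
record = {
    "numeric_value": 8.2,
    "unit": "mg/L",
    "method_id": "method:winkler",
    "flag": "ok",
}
base, added = split_measurement(record)
```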
Key Design Decision
The previous MeasurementSpecification class has been eliminated. It conflated two concerns:
- What is being measured: now captured by Variable, defined once per variable
- How it was measured: now travels with each Measurement via method_id
This means you define "dissolved oxygen" (DO) once as a Variable, not once per analytical method.
Two DO methods on the same sample are distinguished by method_id, not by duplicating
variable definitions.
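A minimal sketch of that pattern: one Variable, two Measurements on the same sample, told apart only by method_id. The `variable_id` link field, the method identifiers, and the numeric values are assumptions made for illustration; this README does not say how a Measurement references its Variable.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Variable:
    id: str
    label: str


@dataclass(frozen=True)
class Measurement:
    variable_id: str   # hypothetical link back to the Variable
    numeric_value: float
    unit: str
    method_id: str


# Defined once, regardless of how many analytical methods are used.
DISSOLVED_OXYGEN = Variable(id="var:do", label="dissolved oxygen")

# Two DO measurements on the same sample, made with different methods.
sample_measurements = [
    Measurement("var:do", 8.1, "mg/L", "method:winkler"),
    Measurement("var:do", 8.3, "mg/L", "method:optical"),
]


def by_method(measurements: List[Measurement], variable_id: str) -> Dict[str, List[float]]:
    """Group the values of one variable by the method that produced them."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for m in measurements:
        if m.variable_id == variable_id:
            grouped[m.method_id].append(m.numeric_value)
    return dict(grouped)


do_by_method = by_method(sample_measurements, DISSOLVED_OXYGEN.id)
```

Adding a third method would mean adding measurements with a new method_id; the Variable definition stays untouched.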
Technology Stack
- LinkML — Linked Data Modeling Language for schema definition
- MkDocs with Material theme for documentation
- Python for data validation and transformation
Development
Prerequisites
Quick Start
# Install dependencies
just install
# Generate documentation
just gen-doc
# Run local documentation server
just testdoc
# Run all tests
just test
Project Structure
wss-demo/
├── src/
│   ├── docs/              # Documentation source files
│   └── wss_test/
│       ├── schema/        # LinkML schema definition
│       └── datamodel/     # Generated Python models
├── docs/                  # Generated documentation (do not edit)
├── project/               # Generated artifacts
├── tests/
│   └── data/              # Test data files
└── examples/              # Usage examples
License
This project is released under the MIT License.
Acknowledgments
This project uses the linkml-project-copier template for project structure and build tooling.
Contact
For questions or feedback, please open an issue on the GitHub repository.