Custom media types
In PineXQ workflows, the platform validates that media types match when data flows between Jobs. By default, DataSlots use well-known
media types like application/json from the MediaTypes enum. This works for simple workflows, but is too coarse when different
JSON schemas represent fundamentally different data — two DataSlots both typed as application/json will pass validation even
though they carry incompatible data structures.
Custom media types solve this by generating a distinct media type for each data schema. The platform can then enforce schema-level
type safety: a ProcessingStep producing application/vnd.pinexq.myproject.sensor-reading.v1+json will only connect to a downstream
step that expects exactly that type. This also applies to single Jobs, which can only be configured with DataSlots whose media types match.
The @media_type_def decorator lets you annotate your data classes with a name and version. The framework builds the full media type
string from these parameters — no need to construct it manually. Use get_media_type(MyClass) to retrieve the string and pass it to
DataSlot decorators.
Annotating Classes with @media_type_def
Use the @media_type_def decorator to annotate a Pydantic model or dataclass with media type metadata:

    from pydantic import BaseModel
    from pinexq.procon.core.media_type import media_type_def, get_media_type

    @media_type_def(name="sensor-reading", version=1, namespace="myproject")
    class SensorReadingV1(BaseModel):
        timestamp: str
        value: float
        unit: str

    # application/vnd.pinexq.myproject.sensor-reading.v1+json

The decorator works with both Pydantic BaseModel subclasses and standard @dataclass classes.
Format and Parameters
The generated media type follows the format:

    application/vnd.<vendor>.<namespace>.<name>.v<version>+<suffix>

| Parameter | Required | Default | Description |
|---|---|---|---|
| name | Yes | - | Logical name for the data schema. |
| version | Yes | - | Schema version number (positive integer, >= 1). |
| namespace | Yes | - | Namespace to group schemas under (e.g., project, team, domain). |
| vendor | No | "pinexq" | Vendor prefix (first segment after vnd.). |
| suffix | No | "json" | Structured syntax suffix (json, xml, csv, etc.). Also determines whether DataSlots open files in text or binary mode. |
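To make the format concrete, here is a minimal sketch of how the five parameters combine into the final string. The function build_media_type is a hypothetical helper written for illustration only; it is not part of the framework, which builds the string for you through @media_type_def.

```python
def build_media_type(
    name: str,
    version: int,
    namespace: str,
    vendor: str = "pinexq",
    suffix: str = "json",
) -> str:
    """Assemble a media type string from its parts (hypothetical helper)."""
    return f"application/vnd.{vendor}.{namespace}.{name}.v{version}+{suffix}"


# Defaults for vendor and suffix apply when they are omitted.
print(build_media_type("sensor-reading", 1, "myproject"))
# application/vnd.pinexq.myproject.sensor-reading.v1+json

# Optional parameters replace the corresponding segments.
print(build_media_type("report", 1, "myproject", vendor="acme", suffix="xml"))
# application/vnd.acme.myproject.report.v1+xml
```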
Retrieving the Media Type String
Use get_media_type() to retrieve the precomputed media type string from an annotated class:

    media_type = get_media_type(SensorReadingV1)
    # "application/vnd.pinexq.myproject.sensor-reading.v1+json"

If the class has no @media_type_def annotation, a TypeError is raised.
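The lookup behavior described above can be illustrated with a simplified stand-in. Everything in this sketch is an assumption made for illustration: media_type_def_sketch, get_media_type_sketch, and the attribute name __media_type__ are hypothetical and do not reflect how the framework stores the value internally.

```python
from dataclasses import dataclass

_ATTR = "__media_type__"  # hypothetical attribute name, chosen for this sketch


def media_type_def_sketch(name, version, namespace, vendor="pinexq", suffix="json"):
    """Stand-in decorator: precompute the string and attach it to the class."""
    media_type = f"application/vnd.{vendor}.{namespace}.{name}.v{version}+{suffix}"

    def decorate(cls):
        setattr(cls, _ATTR, media_type)
        return cls

    return decorate


def get_media_type_sketch(cls):
    """Return the stored string, or raise TypeError for unannotated classes."""
    media_type = getattr(cls, _ATTR, None)
    if media_type is None:
        raise TypeError(f"{cls.__name__} has no media type annotation")
    return media_type


@media_type_def_sketch(name="sensor-reading", version=1, namespace="myproject")
@dataclass
class SensorReadingV1:
    timestamp: str
    value: float
    unit: str


@dataclass
class Unannotated:
    value: float


print(get_media_type_sketch(SensorReadingV1))

try:
    get_media_type_sketch(Unannotated)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

In real code, a TypeError from get_media_type() at import time usually points to a missing decorator on the class passed to a DataSlot.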
Using Custom Media Types in DataSlots
Pass get_media_type() to the media_type parameter on DataSlot decorators. You also need to provide reader and writer functions for serialization:

    from pinexq.procon.dataslots.default_reader_writer import DefaultReaderWriter

    @dataslot.input(
        'reading',
        reader=lambda f: DefaultReaderWriter.pydantic_base_reader(f, SensorReadingV1),
        media_type=get_media_type(SensorReadingV1),
    )
    @dataslot.returns(
        writer=DefaultReaderWriter.pydantic_base_writer,
        media_type=get_media_type(SensorReadingV1),
    )
    def process_reading(self, reading: SensorReadingV1) -> SensorReadingV1:
        ...

Reader and Writer Functions
DataSlots do not automatically serialize or deserialize Pydantic models or dataclasses; you must provide explicit reader and writer functions.
For Pydantic models, the framework provides DefaultReaderWriter with built-in support. For dataclasses, you need to supply custom reader/writer
functions (e.g., using dataclasses.asdict and json.dump).
| Utility | Purpose |
|---|---|
| DefaultReaderWriter.pydantic_base_reader | Deserializes JSON into a Pydantic model instance. |
| DefaultReaderWriter.pydantic_base_writer | Serializes a Pydantic model instance to JSON. |
| DefaultReaderWriter.pydantic_list_base_reader | Reads a JSON array into a list of Pydantic model instances. |
| DefaultReaderWriter.pydantic_list_base_writer | Writes a list of Pydantic model instances as a JSON array. |
The reader callable only receives a file handle, so pydantic_base_reader needs a lambda to bind the target type. The writer can be passed directly.
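For dataclasses, a custom reader/writer pair might look like the sketch below, which uses only the standard library. The function names dataclass_reader and dataclass_writer are hypothetical, and two things are assumed rather than confirmed: that the file handle is opened in text mode for a json suffix, and that the writer is called with the file handle first and the object second. Check the signatures your framework version actually uses before adopting this.

```python
import dataclasses
import io
import json
from functools import partial


@dataclasses.dataclass
class SensorReading:
    timestamp: str
    value: float
    unit: str


def dataclass_reader(file, cls):
    """Deserialize one JSON object from a text file handle into `cls`."""
    return cls(**json.load(file))


def dataclass_writer(file, obj):
    """Serialize a dataclass instance as JSON to a text file handle."""
    # Argument order (file, obj) is an assumption for this sketch.
    json.dump(dataclasses.asdict(obj), file)


# Bind the target type so the resulting reader takes only a file handle,
# which is the same job the lambda does for pydantic_base_reader.
read_sensor_reading = partial(dataclass_reader, cls=SensorReading)

# Round trip through an in-memory text buffer.
buffer = io.StringIO()
dataclass_writer(buffer, SensorReading("2024-01-01T00:00:00Z", 21.5, "C"))
buffer.seek(0)
restored = read_sensor_reading(buffer)
print(restored)
```

functools.partial and a lambda are interchangeable here; partial keeps the bound type visible when the callable is inspected.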
Reducing Repetition with with_defaults
When many classes share common parameters, use with_defaults() to pre-fill namespace, vendor, and/or suffix. Individual calls can still
override any pre-filled value:

    my_schema = media_type_def.with_defaults(namespace="myproject")

    @my_schema(name="sensor-reading", version=1)
    class SensorReadingV1(BaseModel):
        ...

    # Override vendor and suffix for a specific class:
    @my_schema(name="report", version=1, vendor="acme", suffix="xml")
    class AcmeReportV1(BaseModel):
        ...
    # "application/vnd.acme.myproject.report.v1+xml"

Versioning Guidelines
Each annotated class represents one specific version of a data schema. The platform enforces exact match on media type strings,
so a ProcessingStep consuming v1 will not accidentally receive v2 data.
When to create a new version:
- Any change to the schema’s fields (added, removed, renamed, or type-changed) requires a new version number.
- Changes to field constraints (e.g., tighter validation) that could break existing consumers should bump the version.
Note: While Pydantic (and dataclasses with default values) can technically handle added optional fields without breaking deserialization, we recommend bumping the version for any schema change. Keeping the same version undermines the platform’s exact-match safety and makes contract changes invisible to other workflow participants.
How to version:
Create a separate class for each version. The name parameter stays the same; only version changes:
    @media_type_def(name="sensor-reading", version=1, namespace="myproject")
    class SensorReadingV1(BaseModel):
        timestamp: str
        value: float

    @media_type_def(name="sensor-reading", version=2, namespace="myproject")
    class SensorReadingV2(BaseModel):
        timestamp: str
        value: float
        unit: str
        source: str

These produce distinct media types:

    application/vnd.pinexq.myproject.sensor-reading.v1+json
    application/vnd.pinexq.myproject.sensor-reading.v2+json
Naming Conventions and Restrictions
All string parameters are normalized to lowercase. The following rules are enforced at class definition time:
| Parameter | Rule | Pattern | Valid Examples | Invalid Examples |
|---|---|---|---|---|
| name | Starts with letter, lowercase alphanumeric + hyphens, no dots | [a-z][a-z0-9\-]* | sensor-reading, calibration | SensorReading, sensor.reading, 2sensors |
| namespace | Same as name | [a-z][a-z0-9\-]* | myproject, data-pipeline | my.project, My_Project |
| vendor | Same as name | [a-z][a-z0-9\-]* | pinexq, acme | Acme Inc. |
| version | Positive integer (>= 1) | - | 1, 2, 42 | 0, -1, "1" |
| suffix | Starts with letter, lowercase alphanumeric, no hyphens | [a-z][a-z0-9]* | json, xml, csv | JSON, json-ld |
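The patterns in the table can be tried out with a few lines of standard-library Python. This is a sketch for checking candidate values before using them, not the framework's validation code; the helper names are hypothetical, and the checks apply to values exactly as given without modeling the lowercase normalization step.

```python
import re

SEGMENT = re.compile(r"[a-z][a-z0-9\-]*")  # name, namespace, vendor
SUFFIX = re.compile(r"[a-z][a-z0-9]*")     # suffix: hyphens are not allowed


def is_valid_segment(value) -> bool:
    """Check a name, namespace, or vendor value against the table's pattern."""
    return isinstance(value, str) and SEGMENT.fullmatch(value) is not None


def is_valid_suffix(value) -> bool:
    """Check a suffix value against the table's pattern."""
    return isinstance(value, str) and SUFFIX.fullmatch(value) is not None


def is_valid_version(value) -> bool:
    """Accept only integers >= 1; strings such as "1" are rejected."""
    # bool is a subclass of int, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= 1


for candidate in ["sensor-reading", "SensorReading", "sensor.reading", "2sensors"]:
    print(f"name {candidate!r}: {is_valid_segment(candidate)}")

for candidate in [1, 42, 0, -1, "1"]:
    print(f"version {candidate!r}: {is_valid_version(candidate)}")

for candidate in ["json", "JSON", "json-ld"]:
    print(f"suffix {candidate!r}: {is_valid_suffix(candidate)}")
```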
Length limits (RFC 6838):
- The media subtype (everything after application/) must not exceed 127 characters; a ValueError is raised if exceeded.
- A UserWarning is emitted if it exceeds 64 characters (recommended limit). With the vnd.pinexq. prefix (~12 chars) and .v<N>+json suffix (~8-10 chars), roughly 44 characters remain for <namespace>.<name> within the recommended 64.
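The two length limits can be sketched as a small check on the subtype. The function check_subtype_length and the constant names are hypothetical; the sketch mirrors only the behavior described above (ValueError above 127 characters, UserWarning above 64) and is not the framework's own code.

```python
import warnings

MAX_SUBTYPE_LENGTH = 127          # hard limit
RECOMMENDED_SUBTYPE_LENGTH = 64   # recommended limit


def check_subtype_length(media_type: str) -> str:
    """Return the subtype; raise or warn depending on its length."""
    # The subtype is everything after the first "/".
    _, _, subtype = media_type.partition("/")
    if len(subtype) > MAX_SUBTYPE_LENGTH:
        raise ValueError(
            f"subtype has {len(subtype)} characters, limit is {MAX_SUBTYPE_LENGTH}"
        )
    if len(subtype) > RECOMMENDED_SUBTYPE_LENGTH:
        warnings.warn(
            f"subtype has {len(subtype)} characters, "
            f"recommended limit is {RECOMMENDED_SUBTYPE_LENGTH}",
            UserWarning,
            stacklevel=2,
        )
    return subtype


# A typical media type stays well within the recommended limit.
print(check_subtype_length("application/vnd.pinexq.myproject.sensor-reading.v1+json"))

# An oversized name is rejected outright.
try:
    check_subtype_length("application/vnd.pinexq.myproject." + "x" * 120 + ".v1+json")
except ValueError as exc:
    print(f"ValueError: {exc}")
```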
Naming tips:
- Use hyphens to separate words within a segment: sensor-reading, not sensorreading.
- Use short, descriptive names: they appear in manifests and platform UIs.
- The name identifies the data schema, not the Python class. Multiple versions of the same schema share the same name; only version changes. The class name (e.g., SensorReadingV1, SensorReadingV2) is independent and can be chosen freely.