scientific_pydantic¶
scientific_pydantic is an extension module to
pydantic that adds support for a number
of common data types in scientific computing.
Installation¶
This project can be installed from either PyPI or conda-forge via the
scientific-pydantic package.
Motivation¶
Suppose you want to put a numpy.ndarray field into one of your models using
only pydantic. Defining the model fails with:
pydantic.errors.PydanticSchemaGenerationError: Unable to generate pydantic-core schema for <class 'numpy.ndarray'>. Set `arbitrary_types_allowed=True` in the model_config to ignore this error or implement `__get_pydantic_core_schema__` on your type to fully support it.
Setting arbitrary_types_allowed=True can work, but it disables JSON
serialization and validation and does not support lax parsing or input
conversions, which are powerful features of pydantic. This library instead
implements __get_pydantic_core_schema__ in adapter objects that are attached
to the type with Annotated, as described in the pydantic documentation.
With scientific_pydantic, this example looks like:
import typing as ty

import numpy as np
import pydantic
from scientific_pydantic.numpy import NDArrayAdapter


class MyModel(pydantic.BaseModel):
    arr: ty.Annotated[np.ndarray, NDArrayAdapter()]
This provides the usual pydantic experience, with serialization and input
conversions.
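As a rough illustration of the adapter approach, the sketch below implements
__get_pydantic_core_schema__ on a small adapter object for a stand-in type.
Interval and IntervalAdapter are hypothetical names invented for this example
and are not part of scientific_pydantic; the real adapters may be built
differently.

```python
import typing as ty

import pydantic
from pydantic_core import core_schema


class Interval:
    """Stand-in for a third-party type that pydantic knows nothing about."""

    def __init__(self, lo: float, hi: float) -> None:
        self.lo = lo
        self.hi = hi


class IntervalAdapter:
    """Hypothetical adapter object, used as Annotated metadata."""

    def __get_pydantic_core_schema__(
        self, source_type: ty.Any, handler: pydantic.GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        def from_pair(pair: list[float]) -> Interval:
            return Interval(pair[0], pair[1])

        # Validate a two-element list of floats, then build an Interval from it.
        from_pair_schema = core_schema.chain_schema(
            [
                core_schema.list_schema(
                    core_schema.float_schema(), min_length=2, max_length=2
                ),
                core_schema.no_info_plain_validator_function(from_pair),
            ]
        )
        return core_schema.json_or_python_schema(
            # JSON input must be a two-element array.
            json_schema=from_pair_schema,
            # Python input may be an Interval already, or a pair to convert.
            python_schema=core_schema.union_schema(
                [core_schema.is_instance_schema(Interval), from_pair_schema]
            ),
            # Serialize back to a plain list so JSON output works.
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda interval: [interval.lo, interval.hi]
            ),
        )


class Model(pydantic.BaseModel):
    span: ty.Annotated[Interval, IntervalAdapter()]


m = Model(span=[0.0, 1.5])  # a plain list is converted to an Interval
restored = Model.model_validate_json(m.model_dump_json())  # JSON round trip
```

No arbitrary_types_allowed setting is needed here, because the adapter
supplies the complete schema for the annotated field.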
Usage¶
In general, it is recommended to use from-style imports with this library. The
import path for an adapter is normally akin to the import path of the type it is
adapting. For instance, the adapter for scipy.spatial.transform.Rotation would
be:
Adapters are meant to be used with the Annotated pattern, as was shown in the
Motivation section. This allows type checkers that support pydantic, such as
pyrefly, to understand the types of the fields.
A number of adapters provided with this library also take parameters that define
common validation operations. This takes inspiration from pydantic.Field, which
accepts constraints such as ge and le. With scientific_pydantic, this style of
validation logic can be accomplished via:
import typing as ty

import numpy as np
import pydantic
from scientific_pydantic.numpy import NDArrayAdapter


class MyModel(pydantic.BaseModel):
    a: ty.Annotated[np.ndarray, NDArrayAdapter(shape=(None, 3), ge=0)]
This requires a to be an N x 3 ndarray where all elements are non-negative.
See the individual adapters in the API documentation for a description of the
parameters each one takes.
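For comparison, the scalar pydantic.Field constraints that these adapter
parameters are modeled on look like the following. This is a minimal sketch
using plain pydantic only; Sample and count are hypothetical names chosen for
the example.

```python
import pydantic


class Sample(pydantic.BaseModel):
    # ge=0 rejects negative values, the scalar analogue of ge=0 on an array.
    count: int = pydantic.Field(ge=0)


ok = Sample(count="3")  # lax mode converts the string "3" to the int 3

try:
    Sample(count=-1)
    rejected = False
except pydantic.ValidationError:
    rejected = True  # -1 violates the ge=0 constraint
```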
Design Philosophy¶
This library has an interesting conundrum from a dependency standpoint. Since the goal is to provide adapters for common types that come from many different libraries, there are a few options for how dependency management can work:
- Depend on all packages being supported and enforce version constraints. This would violate the "pay for what you use" principle and is thus not a good option.
- Split the package into N different, but related, packages (e.g. scientific_pydantic_shapely) that enforce version constraints individually. This is tractable, but leads to a large number of packages to maintain and version together.
- Have 1 package and only depend on pydantic (and pydantic_core). Users will bring their own versions of the packages they want to use.
This library takes approach #3. This puts the burden of version compatibility
onto this library. For instance, scipy.spatial.transform.Rotation objects
gained the ability to support N-D arrays of rotation transforms in version
1.17.0. Thus, validation features related to this must be disabled if the user
brings their own scipy version that is < 1.17.0. This adds complexity to
this library, but avoids both the dependency bloat of option 1 and the
package bloat of option 2.
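One way such a version gate could be written with only the standard library
is sketched below. The helper names (parse_release, at_least,
nd_rotations_supported) are assumptions made for this example, not functions
from scientific_pydantic, and the parser deliberately ignores pre-release
suffixes, so 1.17.0rc1 is treated the same as 1.17.0.

```python
import re
from importlib import metadata


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric components, e.g. '1.17.0rc1' -> (1, 17, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group().split("."))


def at_least(version: str, minimum: tuple[int, ...]) -> bool:
    """Compare release tuples after zero-padding them to the same length."""
    release = parse_release(version)
    width = max(len(release), len(minimum))
    padded_release = release + (0,) * (width - len(release))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_release >= padded_minimum


def nd_rotations_supported() -> bool:
    """True if the installed scipy is >= 1.17.0; False if older or missing."""
    try:
        installed = metadata.version("scipy")
    except metadata.PackageNotFoundError:
        return False
    return at_least(installed, (1, 17, 0))
```

Reading the version from package metadata avoids importing scipy just to
decide whether a feature should be enabled.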
Because it depends only on pydantic, the library must not import anything from
a third-party library at global scope. This is accomplished via liberal use of
delayed and nested import statements and enforced via a unit test.
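How the unit test is implemented is not described here. As one possible
approach, the sketch below statically scans a module's source for import
statements at global scope, so that imports nested inside functions are
allowed. top_level_imports and SAMPLE are hypothetical names for this example,
and the check only inspects direct module-level statements, so an import
wrapped in a top-level try or if block would not be reported.

```python
import ast


def top_level_imports(source: str) -> set[str]:
    """Return the root package names imported at module (global) scope."""
    names: set[str] = set()
    for node in ast.parse(source).body:  # module-level statements only
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.add(node.module.split(".")[0])
    return names


SAMPLE = '''
import typing as ty
import pydantic


def validate(value):
    import numpy as np  # deferred: only runs when the adapter is used
    return np.asarray(value)
'''

found = top_level_imports(SAMPLE)
allowed = {"typing", "pydantic"}
violations = found - allowed  # empty when nothing else is imported globally
```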