Data validation with ORMs#
LaminDB implements data validation at the ORM level by fully integrating the SQLModel ORM with Pydantic type checking.
Let’s take a look at data validation behavior in LaminDB.
import lamindb as ln
import lamindb.schema as lns
import pytest
from pydantic import ValidationError
✅ Loaded instance: testuser1/testdb
Missing required field#
Let’s create a User instance without the required email and handle fields.
with pytest.raises(ValidationError) as e:
    exception = e
    user_missing = lns.User(id="123")
print(exception.exconly())
pydantic.error_wrappers.ValidationError: 2 validation errors for User
email
field required (type=value_error.missing)
handle
field required (type=value_error.missing)
Field type error#
Let’s create a Transform instance with the wrong type for the optional field name.
from datetime import datetime
with pytest.raises(ValidationError) as e:
    exception = e
    invalid_transform = ln.Transform(name=datetime.now())
print(exception.exconly())
pydantic.error_wrappers.ValidationError: 1 validation error for Transform
name
str type expected (type=type_error.str)
Invalid categorical#
Let’s pass an invalid categorical to the type field in Usage, which only accepts the values ‘ingest’, ‘insert’, ‘select’, ‘update’, ‘delete’, ‘load’, and ‘link’.
from lnschema_core._core import SQLModel
from lnschema_core._types import Usage as UsageType
from sqlmodel import Field
class Usage(SQLModel, table=True):  # type: ignore
    id: str = Field(default=None, primary_key=True)
    type: UsageType = Field(nullable=False, index=True)

assert Usage(type="ingest")
with pytest.raises(ValidationError) as e:
    exception = e
    invalid_usage = Usage(type="invalid categorical")
print(exception.exconly())
pydantic.error_wrappers.ValidationError: 1 validation error for Usage
type
value is not a valid enumeration member; permitted: 'ingest', 'insert', 'select', 'update', 'delete', 'load', 'link' (type=type_error.enum; enum_values=[<Usage.ingest: 'ingest'>, <Usage.insert: 'insert'>, <Usage.select: 'select'>, <Usage.update: 'update'>, <Usage.delete: 'delete'>, <Usage.load: 'load'>, <Usage.link: 'link'>])
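The permitted values in the error message come from an enumeration. As a minimal sketch of that mechanism in plain Python, assuming the field type is a string-valued `Enum`, an invalid value can be rejected as follows. The `UsageType` class and the `parse_usage` helper here are hypothetical stand-ins, not the definitions shipped in `lnschema_core`.

```python
from enum import Enum


class UsageType(str, Enum):
    """Hypothetical stand-in for the enumeration behind the type field."""

    ingest = "ingest"
    insert = "insert"
    select = "select"
    update = "update"
    delete = "delete"
    load = "load"
    link = "link"


def parse_usage(value: str) -> UsageType:
    """Return the matching member, or raise ValueError listing permitted values."""
    try:
        return UsageType(value)
    except ValueError:
        permitted = ", ".join(repr(member.value) for member in UsageType)
        raise ValueError(
            f"value is not a valid enumeration member; permitted: {permitted}"
        ) from None


# A valid categorical is accepted
print(parse_usage("ingest").value)

# An invalid categorical is rejected with the list of permitted values
try:
    parse_usage("invalid categorical")
except ValueError as error:
    print(error)
```

Keeping the categories in a single enumeration means the permitted values in the error message cannot drift from the values the field accepts.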
Special cases#
Data validation with the LaminDB ORM mirrors the standard Pydantic behavior, including variable casting (see example #1 below) and extra field behaviors (see example #2 below). These can be changed through Pydantic’s configuration.
The only difference in behavior between LaminDB and Pydantic is strict type checking for Relationship fields (see example #3 below), which is implemented in LaminDB.
Argument casting#
LaminDB mirrors Pydantic’s default behavior of casting input variables to conform to field types (see details in Pydantic’s documentation).
Let’s take a look at the default behavior by creating transform instances with int and bool inputs to the name field, which is string-typed in the schema.
# Name (int) is cast to str
transform_name_int_to_str = ln.Transform(name=1)
type(transform_name_int_to_str.name)
str
# Name (bool) is cast to str
transform_name_bool_to_str = ln.Transform(name=True)
type(transform_name_bool_to_str.name)
str
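The casting above, together with the earlier rejection of a datetime value, can be approximated in plain Python. The following is only a sketch of the observed behavior under that assumption, not Pydantic's actual validator; the `coerce_to_str` helper is hypothetical.

```python
from datetime import datetime
from decimal import Decimal
from typing import Any


def coerce_to_str(value: Any) -> str:
    """Approximate the lenient string handling shown above (hypothetical helper)."""
    if isinstance(value, str):
        return value
    # bool is a subclass of int, so True and False are cast as well
    if isinstance(value, (int, float, Decimal)):
        return str(value)
    # Anything else, such as a datetime, is rejected
    raise TypeError("str type expected")


print(repr(coerce_to_str(1)))
print(repr(coerce_to_str(True)))

try:
    coerce_to_str(datetime.now())
except TypeError as error:
    print(error)
```

Because `bool` is a subclass of `int`, a boolean input takes the same numeric branch as an integer, which is consistent with both examples above yielding a str.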
Extra fields#
LaminDB also mirrors Pydantic’s default behavior of accepting extra fields not defined in the schema.
# No error is raised for the extra field
transform = ln.Transform(name="Test", extra_field="This field is not defined in the schema")
Strict type checking for relationships#
Unlike Pydantic, LaminDB enforces strict type checking for Relationship fields.
Below is a simple example of Pydantic’s lenient type checking for Relationship fields. Rather than enforcing the Car type in the Wheel.car field, it only enforces type-checking on the attributes of the input object.
from sqlmodel import SQLModel, Field, Relationship
from typing import Optional, List
class Car(SQLModel, table=False):
    id: str = Field(primary_key=True, default=None)
    name: str
    wheels: List["Wheel"] = Relationship()

class Wheel(SQLModel, table=False):
    id: str = Field(primary_key=True, default=None)
    name: str
    car: Optional["Car"] = Relationship()

class Bird(SQLModel, table=False):
    id: str = Field(primary_key=True, default=None)
    name: str
# Pydantic does not raise a validation error for wrong type in the car field
wheel = Wheel(name="Test Wheel", car=Bird(name="Test"))
LaminDB, on the other hand, enforces strict type checking for Relationship fields.
with pytest.raises(TypeError) as e:
    exception = e
    run = lns.Run(name="Test Run", transform=Bird(name="This is not a Transform"))
print(exception.exconly())
TypeError: transform needs to be of type Transform
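A strict check of this kind can be sketched with `isinstance`. This is an illustration under stated assumptions, not LaminDB's implementation: the `Transform` and `Bird` classes below are empty stand-ins, and `check_relationship` is a hypothetical helper.

```python
from typing import Any, Optional, Type


class Transform:
    """Empty stand-in for the related model."""


class Bird:
    """Empty stand-in for an unrelated model."""


def check_relationship(field: str, value: Optional[Any], expected: Type) -> Optional[Any]:
    """Raise TypeError unless value is None or an instance of expected."""
    if value is not None and not isinstance(value, expected):
        raise TypeError(f"{field} needs to be of type {expected.__name__}")
    return value


# An instance of the expected class passes through unchanged
check_relationship("transform", Transform(), Transform)

# An instance of any other class is rejected, regardless of its attributes
try:
    check_relationship("transform", Bird(), Transform)
except TypeError as error:
    print(f"TypeError: {error}")
```

Checking the class of the object, rather than only its attributes, is what separates this from the lenient behavior in the Car and Wheel example.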