Validation Examples

Examples for data filtering, validation, and transformation.

Overview

The validation module provides composable filters for data cleaning and validation with seamless Pydantic integration.

Examples

1. Basic Validation

File: examples/validation/01_basic_validation.py

Demonstrates basic filter usage and composition.

Topics:
- Basic filters (strip, lowercase, uppercase, truncate)
- Filter composition with .then()
- Filter chains
- Email normalization
- Slugify for URL-safe strings
- Remove special characters
- Regex replacement

Run:

python examples/validation/01_basic_validation.py

2. Pydantic Integration

File: examples/validation/02_pydantic_integration.py

Shows integration with Pydantic models.

Topics:
- Pydantic field validators
- FilteredModel base class
- Automatic field filtering
- Multi-field validation
- Real-world use cases (user registration, forms)

Run:

python examples/validation/02_pydantic_integration.py

Quick Start

from dspu.validation import (
    StripWhitespaceFilter,
    LowercaseFilter,
    EmailNormalizationFilter
)

# Compose filters
email_filter = (
    StripWhitespaceFilter()
    .then(LowercaseFilter())
    .then(EmailNormalizationFilter())
)

# Apply
clean_email = email_filter("  Alice@GMAIL.COM  ")
# Result: "alice@gmail.com"

Built-in Filters

String Filters

from dspu.validation import (
    StripWhitespaceFilter,
    LowercaseFilter,
    UppercaseFilter,
    TruncateFilter,
)

# Strip whitespace
strip = StripWhitespaceFilter()
strip("  hello  ")  # "hello"

# Lowercase
lower = LowercaseFilter()
lower("HELLO")  # "hello"

# Uppercase
upper = UppercaseFilter()
upper("hello")  # "HELLO"

# Truncate
truncate = TruncateFilter(max_length=10, suffix="...")
truncate("Hello World!")  # "Hello W..."
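The "Hello W..." result suggests that the suffix counts toward max_length: 7 kept characters plus the 3-character suffix make 10. The sketch below reproduces that arithmetic with a plain function. It is a hypothetical stand-in written for illustration, assuming this is how TruncateFilter behaves, and is not the library's implementation.

```python
def truncate(value: str, max_length: int, suffix: str = "") -> str:
    """Hypothetical stand-in: cut value to at most max_length characters, suffix included."""
    if len(value) <= max_length:
        # Short enough already: return unchanged, no suffix added.
        return value
    # Reserve room for the suffix inside the length budget.
    keep = max(max_length - len(suffix), 0)
    # The final slice guards against a suffix longer than max_length.
    return (value[:keep] + suffix)[:max_length]


print(truncate("Hello World!", max_length=10, suffix="..."))  # Hello W...
print(truncate("short", max_length=10, suffix="..."))         # short
```

Under this assumption the output never exceeds max_length, which matters when the limit mirrors a database column width.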

Specialized Filters

from dspu.validation import (
    EmailNormalizationFilter,
    SlugifyFilter,
    RemoveSpecialCharsFilter,
    RegexReplaceFilter,
)

# Email normalization
email = EmailNormalizationFilter()
email("Alice@Gmail.com")  # "alice@gmail.com"

# Slugify
slug = SlugifyFilter()
slug("Hello World!")  # "hello-world"

# Remove special chars
remove = RemoveSpecialCharsFilter()
remove("hello@world!")  # "helloworld"

# Regex replace
regex = RegexReplaceFilter(pattern=r"\d+", replacement="X")
regex("abc123def")  # "abcXdef"
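To make the slug and regex results above concrete, here is a rough standard-library sketch. The slugify function is a hypothetical stand-in (SlugifyFilter may handle Unicode or separators differently), and the re.sub calls assume RegexReplaceFilter follows Python's usual regex substitution semantics.

```python
import re
import unicodedata


def slugify(value: str) -> str:
    """Hypothetical stand-in for SlugifyFilter, built on the standard library."""
    # Decompose accented characters and drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen,
    # then trim hyphens left at either end.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


print(slugify("Hello World!"))           # hello-world
print(re.sub(r"\d+", "X", "abc123def"))  # abcXdef   (one match: the whole run "123")
print(re.sub(r"\d", "X", "abc123def"))   # abcXXXdef (three matches, one per digit)
```

The last two lines show why the quantifier matters: `\d+` replaces a whole run of digits with a single replacement, while `\d` replaces each digit separately.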

Filter Composition

Using .then()

from dspu.validation import StripWhitespaceFilter, LowercaseFilter

# Chain filters
# (named "normalize" to avoid shadowing Python's built-in filter())
normalize = StripWhitespaceFilter().then(LowercaseFilter())
normalize("  HELLO  ")  # "hello"
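For readers curious how .then() composition might work internally, the sketch below builds a tiny filter base class from scratch. Every class here is a hypothetical stand-in: the apply method name follows the Custom Filters section, but the rest is an assumption rather than the dspu source.

```python
from abc import ABC, abstractmethod


class Filter(ABC):
    """Hypothetical base class: subclasses implement apply()."""

    @abstractmethod
    def apply(self, value: str) -> str:
        ...

    def __call__(self, value: str) -> str:
        # Calling the filter object delegates to apply().
        return self.apply(value)

    def then(self, other: "Filter") -> "Filter":
        # Return a new filter; neither operand is modified.
        return _Composed(self, other)


class _Composed(Filter):
    """Runs `first`, then feeds its output to `second`."""

    def __init__(self, first: Filter, second: Filter) -> None:
        self.first = first
        self.second = second

    def apply(self, value: str) -> str:
        return self.second.apply(self.first.apply(value))


class Strip(Filter):
    def apply(self, value: str) -> str:
        return value.strip()


class Lower(Filter):
    def apply(self, value: str) -> str:
        return value.lower()


normalize = Strip().then(Lower())
result = normalize("  HELLO  ")
print(result)  # hello
```

Because then() returns a new object, a partially built chain can be reused as the prefix of several longer chains.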

Using FilterChain

from dspu.validation import (
    FilterChain,
    StripWhitespaceFilter,
    LowercaseFilter,
    TruncateFilter,
)

# Create chain
chain = FilterChain([
    StripWhitespaceFilter(),
    LowercaseFilter(),
    TruncateFilter(max_length=10),
])

# Apply chain
chain("  HELLO WORLD  ")  # "hello worl"
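Conceptually, a chain is a left-to-right fold over a list of callables. The sketch below expresses that idea with functools.reduce and plain functions. The chain helper is a hypothetical illustration, not FilterChain itself, and the slice stands in for a truncation step with no suffix.

```python
from functools import reduce
from typing import Callable, Iterable

StrFilter = Callable[[str], str]


def chain(filters: Iterable[StrFilter]) -> StrFilter:
    """Hypothetical helper: combine filters into one callable, applied in order."""
    steps = list(filters)  # copy so later changes to the input do not affect the chain

    def run(value: str) -> str:
        # Feed the output of each step into the next one.
        return reduce(lambda acc, step: step(acc), steps, value)

    return run


clean = chain([str.strip, str.lower, lambda s: s[:10]])
print(clean("  HELLO WORLD  "))  # hello worl
```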

Pydantic Integration

Using pydantic_filter_validator

from pydantic import BaseModel
from dspu.validation import (
    pydantic_filter_validator,
    StripWhitespaceFilter,
    LowercaseFilter,
    EmailNormalizationFilter,
)

# Create filter
email_filter = (
    StripWhitespaceFilter()
    .then(LowercaseFilter())
    .then(EmailNormalizationFilter())
)

class User(BaseModel):
    email: str

    # Apply filter to field
    _email_filter = pydantic_filter_validator("email", email_filter)

# Automatic filtering
user = User(email="  ALICE@EXAMPLE.COM  ")
print(user.email)  # "alice@example.com"

Using FilteredModel

from dspu.validation import FilteredModel, LowercaseFilter, StripWhitespaceFilter

class User(FilteredModel):
    name: str
    email: str

    # Define filters
    _filters = {
        "name": StripWhitespaceFilter(),
        "email": StripWhitespaceFilter().then(LowercaseFilter()),
    }

# Filters run automatically for every field listed in _filters
user = User(name="  Alice  ", email="  ALICE@EXAMPLE.COM  ")
print(user.name)   # "Alice"
print(user.email)  # "alice@example.com"

Common Patterns

Pattern 1: Email Validation

from dspu.validation import (
    StripWhitespaceFilter,
    LowercaseFilter,
    EmailNormalizationFilter,
    pydantic_filter_validator,
)
# EmailStr requires the optional email-validator package
from pydantic import BaseModel, EmailStr

email_filter = (
    StripWhitespaceFilter()
    .then(LowercaseFilter())
    .then(EmailNormalizationFilter())
)

class User(BaseModel):
    email: EmailStr  # Pydantic email validation

    _email_filter = pydantic_filter_validator("email", email_filter)

# Both filtering and validation
user = User(email="  Alice@Gmail.com  ")
print(user.email)  # "alice@gmail.com"

Pattern 2: Username Normalization

from pydantic import BaseModel
from dspu.validation import (
    StripWhitespaceFilter,
    LowercaseFilter,
    RemoveSpecialCharsFilter,
    pydantic_filter_validator,
)

username_filter = (
    StripWhitespaceFilter()
    .then(LowercaseFilter())
    .then(RemoveSpecialCharsFilter())
)

class User(BaseModel):
    username: str

    _username_filter = pydantic_filter_validator("username", username_filter)

user = User(username="  Alice@123!  ")
print(user.username)  # "alice123"

Pattern 3: URL Slug Generation

from pydantic import BaseModel, model_validator
from dspu.validation import SlugifyFilter, pydantic_filter_validator

class Article(BaseModel):
    title: str
    slug: str = ""

    _slug_filter = pydantic_filter_validator("slug", SlugifyFilter())

    @model_validator(mode='after')
    def generate_slug(self):
        if not self.slug:
            self.slug = SlugifyFilter()(self.title)
        return self

article = Article(title="Hello World!")
print(article.slug)  # "hello-world"

Pattern 4: Text Sanitization

from pydantic import BaseModel
from dspu.validation import (
    StripWhitespaceFilter,
    TruncateFilter,
    RemoveSpecialCharsFilter,
    pydantic_filter_validator,
)

# Sanitize user input
sanitize = (
    StripWhitespaceFilter()
    .then(RemoveSpecialCharsFilter())
    .then(TruncateFilter(max_length=100))
)

class Comment(BaseModel):
    text: str

    _text_filter = pydantic_filter_validator("text", sanitize)

comment = Comment(text="  <script>alert('xss')</script>Hello!  ")
print(comment.text)  # "scriptalertxssscriptHello" (25 characters, under the 100 limit, so not truncated)

Custom Filters

Creating a Custom Filter

from dspu.validation import Filter, StripWhitespaceFilter

class CapitalizeFilter(Filter):
    def apply(self, value: str) -> str:
        return value.capitalize()

# Use custom filter
capitalize = CapitalizeFilter()
capitalize("hello world")  # "Hello world"

# Compose with other filters
# (named "tidy" to avoid shadowing Python's built-in filter())
tidy = StripWhitespaceFilter().then(CapitalizeFilter())
tidy("  hello  ")  # "Hello"

Parameterized Filter

from dspu.validation import Filter

class ReplaceFilter(Filter):
    def __init__(self, old: str, new: str):
        self.old = old
        self.new = new

    def apply(self, value: str) -> str:
        return value.replace(self.old, self.new)

# Use parameterized filter
replace = ReplaceFilter(old="@", new="[at]")
replace("alice@example.com")  # "alice[at]example.com"

Best Practices

DO:
- Compose filters for reusability
- Use Pydantic integration for automatic validation
- Create custom filters for domain-specific logic
- Chain filters in logical order (strip → lowercase → specific)
- Test filters with edge cases

DON'T:
- Don't mutate input data in filters
- Don't perform validation in filters (use Pydantic validators)
- Don't create complex filters (compose simple ones)
- Don't skip input sanitization for user data
- Don't trust client-side validation alone
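Three of the points above (logical order, edge cases, no mutation) can be checked with a few assertions. The sketch below uses plain stand-in functions rather than the dspu classes, so all names are hypothetical. The same checks should carry over to real filters, since they are called the same way.

```python
def strip(value: str) -> str:
    return value.strip()


def truncate(value: str, max_length: int = 5) -> str:
    return value[:max_length]


def strip_all(values: list[str]) -> list[str]:
    # Build a new list instead of editing the caller's list in place.
    return [strip(v) for v in values]


raw = "  hello world  "

# Order matters: truncating first spends the budget on leading spaces.
assert truncate(strip(raw)) == "hello"
assert strip(truncate(raw)) == "hel"

# Edge cases: empty and whitespace-only input.
assert strip("") == ""
assert strip("   ") == ""

# Idempotence: applying a cleaning filter twice changes nothing further.
assert strip(strip(raw)) == strip(raw)

# No mutation: the input list is untouched after filtering.
items = ["  a ", "b  "]
cleaned = strip_all(items)
assert cleaned == ["a", "b"]
assert items == ["  a ", "b  "]
```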

Filter Reference

| Filter | Purpose | Example |
|---|---|---|
| StripWhitespaceFilter | Remove leading/trailing whitespace | "  hello  " → "hello" |
| LowercaseFilter | Convert to lowercase | "HELLO" → "hello" |
| UppercaseFilter | Convert to uppercase | "hello" → "HELLO" |
| TruncateFilter | Limit string length | "Hello World!" → "Hello W..." (max_length=10, suffix="...") |
| EmailNormalizationFilter | Normalize email | "Alice@Gmail.com" → "alice@gmail.com" |
| SlugifyFilter | Create URL-safe slug | "Hello World!" → "hello-world" |
| RemoveSpecialCharsFilter | Remove non-alphanumeric characters | "hello@world!" → "helloworld" |
| RegexReplaceFilter | Regex find and replace | "abc123def" → "abcXdef" (pattern=r"\d+", replacement="X") |

See Also