Validation Examples¶
Examples for data filtering, validation, and transformation.
Overview¶
The validation module provides composable filters for cleaning and normalizing data. Filters can be called directly or attached to Pydantic model fields.
Examples¶
1. Basic Validation¶
File: examples/validation/01_basic_validation.py
Demonstrates basic filter usage and composition.
Topics:
- Basic filters (strip, lowercase, uppercase, truncate)
- Filter composition with .then()
- Filter chains
- Email normalization
- Slugify for URL-safe strings
- Remove special characters
- Regex replacement
Run:
2. Pydantic Integration¶
File: examples/validation/02_pydantic_integration.py
Shows integration with Pydantic models.
Topics:
- Pydantic field validators
- FilteredModel base class
- Automatic field filtering
- Multi-field validation
- Real-world use cases (user registration, forms)
Run:
Quick Start¶
from dspu.validation import (
    StripWhitespaceFilter,
    LowercaseFilter,
    EmailNormalizationFilter,
)
# Compose filters
email_filter = (
    StripWhitespaceFilter()
    .then(LowercaseFilter())
    .then(EmailNormalizationFilter())
)
# Apply
clean_email = email_filter(" Alice@GMAIL.COM ")
# Result: "alice@gmail.com"
Built-in Filters¶
String Filters¶
from dspu.validation import (
    StripWhitespaceFilter,
    LowercaseFilter,
    UppercaseFilter,
    TruncateFilter,
)
# Strip whitespace
strip = StripWhitespaceFilter()
strip(" hello ") # "hello"
# Lowercase
lower = LowercaseFilter()
lower("HELLO") # "hello"
# Uppercase
upper = UppercaseFilter()
upper("hello") # "HELLO"
# Truncate
truncate = TruncateFilter(max_length=10, suffix="...")
truncate("Hello World!") # "Hello W..."
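The "Hello W..." result above suggests that max_length counts the suffix as part of the output length. The following is a minimal plain-Python sketch under that assumption; `truncate_with_suffix` is a hypothetical stand-in, not the `dspu` implementation.

```python
def truncate_with_suffix(value: str, max_length: int, suffix: str = "") -> str:
    """Cut value so the result, suffix included, is at most max_length long."""
    if len(value) <= max_length:
        return value  # already short enough: returned unchanged
    # Assumption: the suffix is counted against max_length.
    keep = max(max_length - len(suffix), 0)
    return (value[:keep] + suffix)[:max_length]


print(truncate_with_suffix("Hello World!", max_length=10, suffix="..."))  # Hello W...
print(truncate_with_suffix("Hi", max_length=10, suffix="..."))            # Hi
```

If the real filter instead appends the suffix after cutting to max_length, the output would be longer than max_length, so it is worth checking this on a short input before relying on it for fixed-width columns.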
Specialized Filters¶
from dspu.validation import (
    EmailNormalizationFilter,
    SlugifyFilter,
    RemoveSpecialCharsFilter,
    RegexReplaceFilter,
)
# Email normalization
email = EmailNormalizationFilter()
email("Alice@Gmail.com") # "alice@gmail.com"
# Slugify
slug = SlugifyFilter()
slug("Hello World!") # "hello-world"
# Remove special chars
remove = RemoveSpecialCharsFilter()
remove("hello@world!") # "helloworld"
# Regex replace
regex = RegexReplaceFilter(pattern=r"\d+", replacement="X")
regex("abc123def") # "abcXdef"
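To make the expected outputs above easier to reason about, here is a rough standard-library sketch that reproduces them with `re`. The helper names `slugify` and `remove_special_chars` are hypothetical, and the exact rules used by the `dspu` filters may differ (for example around Unicode or underscores).

```python
import re


def slugify(value: str) -> str:
    """Lowercase, collapse runs of non-alphanumerics into '-', trim stray dashes."""
    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")


def remove_special_chars(value: str) -> str:
    """Drop every character that is not an ASCII letter or digit."""
    return re.sub(r"[^A-Za-z0-9]", "", value)


print(slugify("Hello World!"))               # hello-world
print(remove_special_chars("hello@world!"))  # helloworld
# r"\d+" matches the whole run of digits, so it is replaced by a single "X".
print(re.sub(r"\d+", "X", "abc123def"))      # abcXdef
# r"\d" matches one digit at a time, giving one "X" per digit.
print(re.sub(r"\d", "X", "abc123def"))       # abcXXXdef
```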
Filter Composition¶
Using .then()¶
from dspu.validation import StripWhitespaceFilter, LowercaseFilter
# Chain filters (named `normalize` to avoid shadowing Python's built-in filter())
normalize = StripWhitespaceFilter().then(LowercaseFilter())
normalize(" HELLO ") # "hello"
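One way to picture what `.then()` does is a wrapper that returns a new filter applying the left-hand filter first and the right-hand filter second. The sketch below uses a hypothetical `SketchFilter` class for illustration only; it is not the `dspu` base class.

```python
from typing import Callable


class SketchFilter:
    """Hypothetical stand-in for a composable filter; not the dspu class."""

    def __init__(self, func: Callable[[str], str]) -> None:
        self._func = func

    def __call__(self, value: str) -> str:
        return self._func(value)

    def then(self, other: "SketchFilter") -> "SketchFilter":
        # Build a new filter; neither operand is modified.
        return SketchFilter(lambda value: other(self(value)))


strip = SketchFilter(str.strip)
lower = SketchFilter(str.lower)
first3 = SketchFilter(lambda s: s[:3])

print(strip.then(lower)("  HELLO  "))       # hello
# Order matters: stripping before or after cutting gives different results.
print(repr(strip.then(first3)("  abc  ")))  # 'abc'
print(repr(first3.then(strip)("  abc  ")))  # 'a'
```

Because each `.then()` call returns a fresh object, intermediate filters such as `strip` stay reusable in other pipelines.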
Using FilterChain¶
from dspu.validation import (
    FilterChain,
    StripWhitespaceFilter,
    LowercaseFilter,
    TruncateFilter,
)
# Create chain
chain = FilterChain([
    StripWhitespaceFilter(),
    LowercaseFilter(),
    TruncateFilter(max_length=10),
])
# Apply chain
chain(" HELLO WORLD ") # "hello worl"
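Conceptually, a chain is a left-to-right fold over a list of callables. The sketch below shows that idea with `functools.reduce` and plain functions; `run_chain` is a hypothetical helper and says nothing about how `FilterChain` is actually implemented.

```python
from functools import reduce
from typing import Callable, Iterable


def run_chain(steps: Iterable[Callable[[str], str]], value: str) -> str:
    """Feed value through each step in order, left to right."""
    return reduce(lambda current, step: step(current), steps, value)


# Stand-ins for strip, lowercase, and truncate to 10 characters.
steps = [str.strip, str.lower, lambda s: s[:10]]

print(run_chain(steps, "  HELLO WORLD  "))  # hello worl
print(run_chain([], "unchanged"))           # unchanged
```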
Pydantic Integration¶
Using pydantic_filter_validator¶
from pydantic import BaseModel
from dspu.validation import (
    pydantic_filter_validator,
    StripWhitespaceFilter,
    LowercaseFilter,
    EmailNormalizationFilter,
)
# Create filter
email_filter = (
    StripWhitespaceFilter()
    .then(LowercaseFilter())
    .then(EmailNormalizationFilter())
)
class User(BaseModel):
    email: str
    # Apply filter to field
    _email_filter = pydantic_filter_validator("email", email_filter)
# Automatic filtering
user = User(email=" ALICE@EXAMPLE.COM ")
print(user.email) # "alice@example.com"
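For readers curious how a helper like this could be built, the sketch below wraps a plain callable in Pydantic's `field_validator` with `mode="before"`, so the filter runs ahead of type validation. It assumes Pydantic v2; `sketch_filter_validator` is a hypothetical name and is not claimed to match the `dspu` implementation.

```python
from typing import Callable

from pydantic import BaseModel, field_validator


def sketch_filter_validator(field_name: str, func: Callable[[str], str]):
    """Hypothetical helper: run func on a field before Pydantic validates it."""

    def _apply(cls, value):
        # Only strings are filtered; other inputs go to normal validation.
        return func(value) if isinstance(value, str) else value

    return field_validator(field_name, mode="before")(_apply)


class SketchUser(BaseModel):
    email: str
    _normalize_email = sketch_filter_validator("email", lambda s: s.strip().lower())


user = SketchUser(email="  ALICE@EXAMPLE.COM ")
print(user.email)  # alice@example.com
```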
Using FilteredModel¶
from dspu.validation import FilteredModel, StripWhitespaceFilter, LowercaseFilter
class User(FilteredModel):
    name: str
    email: str
    # Define filters
    _filters = {
        "name": StripWhitespaceFilter(),
        "email": StripWhitespaceFilter().then(LowercaseFilter()),
    }
# Automatic filtering on all fields
user = User(name=" Alice ", email=" ALICE@EXAMPLE.COM ")
print(user.name) # "Alice"
print(user.email) # "alice@example.com"
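The per-field mapping idea can be illustrated without Pydantic at all. The sketch below uses a standard-library dataclass rather than `FilteredModel`: a class-level mapping of field names to callables is applied in `__post_init__`. All names here (`SketchUser`, `FILTERS`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, ClassVar, Dict


@dataclass
class SketchUser:
    name: str
    email: str
    # ClassVar keeps the mapping out of the generated __init__.
    FILTERS: ClassVar[Dict[str, Callable[[str], str]]] = {
        "name": str.strip,
        "email": lambda s: s.strip().lower(),
    }

    def __post_init__(self) -> None:
        # Apply each registered filter to its field once, at construction time.
        for field_name, func in self.FILTERS.items():
            setattr(self, field_name, func(getattr(self, field_name)))


user = SketchUser(name="  Alice  ", email="  ALICE@EXAMPLE.COM ")
print(user.name)   # Alice
print(user.email)  # alice@example.com
```

Unlike a Pydantic model, this dataclass does no type validation; it only demonstrates how a field-to-filter mapping keeps the cleaning rules in one place.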
Common Patterns¶
Pattern 1: Email Validation¶
from dspu.validation import (
    StripWhitespaceFilter,
    LowercaseFilter,
    EmailNormalizationFilter,
    pydantic_filter_validator,
)
from pydantic import BaseModel, EmailStr
email_filter = (
    StripWhitespaceFilter()
    .then(LowercaseFilter())
    .then(EmailNormalizationFilter())
)
class User(BaseModel):
    email: EmailStr # Pydantic email validation (needs the email-validator package)
    _email_filter = pydantic_filter_validator("email", email_filter)
# Both filtering and validation
user = User(email=" Alice@Gmail.com ")
print(user.email) # "alice@gmail.com"
Pattern 2: Username Normalization¶
from pydantic import BaseModel
from dspu.validation import (
    StripWhitespaceFilter,
    LowercaseFilter,
    RemoveSpecialCharsFilter,
    pydantic_filter_validator,
)
username_filter = (
    StripWhitespaceFilter()
    .then(LowercaseFilter())
    .then(RemoveSpecialCharsFilter())
)
class User(BaseModel):
    username: str
    _username_filter = pydantic_filter_validator("username", username_filter)
user = User(username=" Alice@123! ")
print(user.username) # "alice123"
Pattern 3: URL Slug Generation¶
from pydantic import BaseModel, model_validator
from dspu.validation import SlugifyFilter, pydantic_filter_validator
class Article(BaseModel):
    title: str
    slug: str = ""
    _slug_filter = pydantic_filter_validator("slug", SlugifyFilter())
    @model_validator(mode="after")
    def generate_slug(self):
        # Fall back to a slug derived from the title when none is given
        if not self.slug:
            self.slug = SlugifyFilter()(self.title)
        return self
article = Article(title="Hello World!")
print(article.slug) # "hello-world"
Pattern 4: Text Sanitization¶
from pydantic import BaseModel
from dspu.validation import (
    StripWhitespaceFilter,
    TruncateFilter,
    RemoveSpecialCharsFilter,
    pydantic_filter_validator,
)
# Sanitize user input
sanitize = (
    StripWhitespaceFilter()
    .then(RemoveSpecialCharsFilter())
    .then(TruncateFilter(max_length=100))
)
class Comment(BaseModel):
    text: str
    _text_filter = pydantic_filter_validator("text", sanitize)
comment = Comment(text=" <script>alert('xss')</script>Hello! ")
print(comment.text) # "scriptalertxssscriptHello" (25 characters, under the 100 limit, so not truncated)
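The same pipeline can be traced by hand with the standard library, which makes two side effects visible: the tag names survive as plain text, and spaces are removed along with punctuation. This sketch assumes RemoveSpecialCharsFilter drops everything that is not an ASCII letter or digit; the function below is a hypothetical stand-in.

```python
import re


def sanitize_sketch(text: str, max_length: int = 100) -> str:
    """Stand-in pipeline: strip, drop non-alphanumerics, cut to max_length."""
    stripped = text.strip()
    alnum_only = re.sub(r"[^A-Za-z0-9]", "", stripped)
    return alnum_only[:max_length]


result = sanitize_sketch("  <script>alert('xss')</script>Hello!  ")
print(result)       # scriptalertxssscriptHello
print(len(result))  # 25

# Ordinary prose loses its spaces and punctuation too.
print(sanitize_sketch("Nice post, thanks!"))  # Nicepostthanks
```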
Custom Filters¶
Creating a Custom Filter¶
from dspu.validation import Filter, StripWhitespaceFilter
class CapitalizeFilter(Filter):
    def apply(self, value: str) -> str:
        # Uppercase the first character and lowercase the rest
        return value.capitalize()
# Use custom filter
capitalize = CapitalizeFilter()
capitalize("hello world") # "Hello world"
# Compose with other filters
strip_and_capitalize = StripWhitespaceFilter().then(CapitalizeFilter())
strip_and_capitalize(" hello ") # "Hello"
Parameterized Filter¶
from dspu.validation import Filter
class ReplaceFilter(Filter):
    def __init__(self, old: str, new: str):
        self.old = old
        self.new = new
    def apply(self, value: str) -> str:
        # Replace every occurrence of `old` with `new`
        return value.replace(self.old, self.new)
# Use parameterized filter
replace = ReplaceFilter(old="@", new="[at]")
replace("alice@example.com") # "alice[at]example.com"
Best Practices¶
✅ DO:
- Compose filters for reusability
- Use Pydantic integration for automatic validation
- Create custom filters for domain-specific logic
- Chain filters in logical order (strip → lowercase → specific)
- Test filters with edge cases
❌ DON'T:
- Mutate input data in filters
- Perform validation in filters (use Pydantic validators)
- Create complex filters (compose simple ones)
- Skip input sanitization for user data
- Trust client-side validation alone
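As an illustration of testing filters with edge cases, the sketch below checks a stand-in pipeline against empty, whitespace-only, and already-clean inputs, and also checks idempotence (running the filter twice gives the same result as running it once). `normalize_email` is a hypothetical plain function used in place of a real filter chain.

```python
def normalize_email(value: str) -> str:
    """Stand-in pipeline: strip surrounding whitespace, then lowercase."""
    return value.strip().lower()


# Raw input mapped to the expected cleaned value.
edge_cases = {
    "": "",
    "   ": "",
    "alice@example.com": "alice@example.com",
    "\tAlice@Example.COM\n": "alice@example.com",
}

for raw, expected in edge_cases.items():
    cleaned = normalize_email(raw)
    assert cleaned == expected, (raw, cleaned, expected)
    # Idempotence: a second pass must not change the result.
    assert normalize_email(cleaned) == cleaned

print("all edge cases passed")
```

The same table-driven loop works with a real filter object, since filters are called like functions.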
Filter Reference¶
| Filter | Purpose | Example |
|---|---|---|
| StripWhitespaceFilter | Remove leading/trailing whitespace | " hello " → "hello" |
| LowercaseFilter | Convert to lowercase | "HELLO" → "hello" |
| UppercaseFilter | Convert to uppercase | "hello" → "HELLO" |
| TruncateFilter | Limit string length | "Hello World!" → "Hello W..." (max_length=10, suffix="...") |
| EmailNormalizationFilter | Normalize email | "Alice@Gmail.com" → "alice@gmail.com" |
| SlugifyFilter | Create URL-safe slug | "Hello World!" → "hello-world" |
| RemoveSpecialCharsFilter | Remove non-alphanumeric | "hello@world!" → "helloworld" |
| RegexReplaceFilter | Regex find and replace | "abc123def" → "abcXdef" (pattern=r"\d+", replacement="X") |