Code Style Guide¶
DSPU code style conventions and best practices.
Overview¶
We use:
- Ruff for linting and formatting
- Pyrefly for type checking
- pytest for testing
- Google style docstrings
Code Formatting¶
Line Length¶
Maximum 100 characters:
# Good - wrapped to stay within the limit
result = process_data(
    input_data, transform=lambda x: x * 2, validate=True, cleanup=True, verbose=True
)

# Bad - too long (103 characters)
result = process_data(input_data, transform=lambda x: x * 2, validate=True, cleanup=True, verbose=True)
Imports¶
Sort imports automatically with Ruff, grouped as standard library, third-party, then local:
# Standard library
import os
import sys
from pathlib import Path

# Third-party
import httpx
from pydantic import BaseModel

# Local
from dspu.core import Registry
from dspu.io import Storage
Type Hints¶
Always use type hints:
# Good
def process(data: list[dict], limit: int = 100) -> list[dict]:
    ...

# Bad
def process(data, limit=100):
    ...
Naming Conventions¶
Classes¶
PascalCase:
Functions and Methods¶
snake_case:
Constants¶
UPPER_SNAKE_CASE:
Private Members¶
Leading underscore:
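The four naming conventions above can be sketched together in one small module. Every name here (`MAX_RETRIES`, `RetryPolicy`, `build_default_policy`) is hypothetical and chosen only for illustration; none of them is part of DSPU.

```python
# All names below are illustrative assumptions, not DSPU APIs.
MAX_RETRIES = 3  # Constant: UPPER_SNAKE_CASE


class RetryPolicy:  # Class: PascalCase
    def __init__(self, attempts: int = MAX_RETRIES) -> None:
        self._attempts = attempts  # Private attribute: leading underscore

    def remaining_attempts(self, used: int) -> int:  # Method: snake_case
        return max(self._attempts - used, 0)

    def _is_exhausted(self, used: int) -> bool:  # Private method: leading underscore
        return used >= self._attempts


def build_default_policy() -> RetryPolicy:  # Function: snake_case
    return RetryPolicy()
```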
Docstrings¶
Google Style¶
def function(arg1: str, arg2: int, optional: bool = False) -> dict:
    """Short one-line summary.

    Longer description if needed. Explain what the function does,
    its behavior, and any important details.

    Args:
        arg1: Description of arg1.
        arg2: Description of arg2.
        optional: Description of optional argument. Defaults to False.

    Returns:
        Description of return value and its structure.

    Raises:
        ValueError: When arg1 is empty or arg2 is negative.

    Example:
        >>> result = function("test", 42)
        >>> print(result)
        {'status': 'success'}
    """
    ...
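As a concrete instance of the template above, here is a minimal sketch of a fully documented function. The `clamp` function is a hypothetical example, not part of DSPU; it is only meant to show how the sections look once filled in.

```python
def clamp(value: float, lower: float, upper: float) -> float:
    """Restrict a value to a closed interval.

    Args:
        value: Number to clamp.
        lower: Smallest allowed result.
        upper: Largest allowed result.

    Returns:
        The value itself if it lies between lower and upper,
        otherwise the nearest bound.

    Raises:
        ValueError: When lower is greater than upper.

    Example:
        >>> clamp(5, 0, 3)
        3
        >>> clamp(-1, 0, 3)
        0
    """
    if lower > upper:
        raise ValueError(f"lower ({lower}) must not exceed upper ({upper})")
    return max(lower, min(value, upper))
```

Because the `Example:` section uses doctest syntax, it can be checked with `python -m doctest` on the file or with pytest's `--doctest-modules` option.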
Class Docstrings¶
class DataProcessor:
    """Process and transform data.

    This class provides methods for processing data with various
    transformations and validations.

    Attributes:
        config: Configuration for processor
        state: Current processor state

    Example:
        >>> processor = DataProcessor(config)
        >>> result = processor.process(data)
    """

    def __init__(self, config: Config) -> None:
        """Initialize processor.

        Args:
            config: Processor configuration
        """
        ...
Type Hints¶
Basic Types¶
from typing import Optional, Union

def process(
    data: str,
    count: int,
    threshold: float,
    enabled: bool,
    optional: Optional[str] = None,
    flexible: Union[str, int] = "default",
) -> dict:
    ...
Collections¶
# Built-in generics (list, dict, set, tuple) need no typing imports; only Any does
from typing import Any

def process(
    items: list[str],  # List of strings
    mapping: dict[str, int],  # String to int mapping
    unique: set[str],  # Set of strings
    pair: tuple[str, int],  # Fixed-size tuple
    records: list[dict[str, Any]],  # List of dicts
) -> list[str]:
    ...
Callables¶
from typing import Callable

def process(
    data: list[int],
    transform: Callable[[int], str],
) -> list[str]:
    ...
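A minimal sketch of how the callable signature above might be implemented and used. The body of `process` and the helper `to_hex` are assumptions added for illustration; the guide itself only specifies the signature.

```python
from typing import Callable


def process(
    data: list[int],
    transform: Callable[[int], str],
) -> list[str]:
    # Apply the caller-supplied function to every item.
    return [transform(item) for item in data]


def to_hex(value: int) -> str:  # Hypothetical helper matching Callable[[int], str]
    return hex(value)


labels = process([10, 255], to_hex)  # ['0xa', '0xff']
```

A type checker would reject a call such as `process([1], len)`, because `len` does not accept an `int` and does not return a `str`.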
Protocols¶
from typing import Protocol

class Serializable(Protocol):
    """Protocol for serializable objects."""

    def to_dict(self) -> dict:
        """Convert to dictionary."""
        ...
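The sketch below shows how a class can satisfy the `Serializable` protocol structurally, without inheriting from it. The `User` class and the `dump` function are hypothetical, and the `@runtime_checkable` decorator is an optional addition that is only needed if `isinstance` checks are wanted.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable  # Optional: enables isinstance() checks against the protocol
class Serializable(Protocol):
    """Protocol for serializable objects."""

    def to_dict(self) -> dict:
        """Convert to dictionary."""
        ...


class User:  # Hypothetical class; it does not inherit from Serializable
    def __init__(self, name: str) -> None:
        self.name = name

    def to_dict(self) -> dict:
        return {"name": self.name}


def dump(obj: Serializable) -> dict:
    """Accept any object that has a matching to_dict method."""
    return obj.to_dict()
```

Note that a runtime `isinstance` check only verifies that the method exists; the signature is verified by the static type checker.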
Error Handling¶
Specific Exceptions¶
# Good
from dspu.core import ValidationError

if not data:
    raise ValidationError(
        "Data cannot be empty",
        field="data",
        constraint="non_empty",
    )

# Bad
if not data:
    raise Exception("Data is empty")
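To show why a specific exception is more useful than a bare `Exception`, here is a minimal sketch of an exception that carries structured context. The `ValidationError` class defined here is a stand-in written for this example; the real `dspu.core.ValidationError` may have a different base class or signature.

```python
class ValidationError(ValueError):
    """Hypothetical stand-in for dspu.core.ValidationError."""

    def __init__(self, message: str, *, field: str, constraint: str) -> None:
        super().__init__(message)
        self.field = field
        self.constraint = constraint


def validate(data: list[int]) -> list[int]:
    if not data:
        raise ValidationError("Data cannot be empty", field="data", constraint="non_empty")
    return data


try:
    validate([])
except ValidationError as exc:
    # Callers can catch this one error type and read its structured fields.
    report = f"{exc.field} violated {exc.constraint}: {exc}"
```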
Exception Chaining¶
try:
    data = load_data(path)
except FileNotFoundError as e:
    raise ConfigurationError(
        f"Config file not found: {path}",
        suggestion="Create config.yaml",
    ) from e  # Chain exception
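The effect of `from e` can be seen in a small self-contained sketch. The `ConfigurationError` class and `load_config` function below are assumptions made for illustration, and the file name is chosen so that the file does not exist.

```python
from pathlib import Path


class ConfigurationError(Exception):
    """Hypothetical stand-in; the real DSPU class may differ."""


def load_config(path: Path) -> str:
    try:
        return path.read_text()
    except FileNotFoundError as e:
        raise ConfigurationError(f"Config file not found: {path}") from e


try:
    load_config(Path("does-not-exist-config.yaml"))
except ConfigurationError as err:
    # `from e` stores the original exception on __cause__,
    # so the traceback shows both errors.
    original = err.__cause__
```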
Code Organization¶
Module Structure¶
"""Module docstring.
Description of what this module provides.
"""
# Imports
from typing import Optional
from .core import SomeClass
# Constants
DEFAULT_VALUE = 42
# Classes
class MyClass:
    ...

# Functions
def helper_function():
    ...
# Exports
__all__ = ["MyClass", "helper_function"]
File Organization¶
Keep files focused and reasonably sized:
- < 500 lines preferred
- Split large files into submodules
- Group related functionality
Testing Style¶
Test Names¶
Descriptive test names:
def test_scaler_fits_on_training_data():
    ...

def test_encoder_handles_unknown_categories():
    ...

def test_config_loads_from_multiple_sources():
    ...
Arrange-Act-Assert¶
def test_processor_transforms_data():
    # Arrange
    processor = DataProcessor()
    data = [[1, 2], [3, 4]]

    # Act
    result = processor.transform(data)

    # Assert
    assert len(result) == 2
    assert result[0] == [expected1, expected2]
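The pattern above uses placeholders for the expected values. A fully runnable version might look like the sketch below, where `double_rows` is a hypothetical function under test, not a DSPU API.

```python
def double_rows(rows: list[list[int]]) -> list[list[int]]:
    """Hypothetical function under test: double every value."""
    return [[value * 2 for value in row] for row in rows]


def test_double_rows_doubles_every_value() -> None:
    # Arrange
    data = [[1, 2], [3, 4]]

    # Act
    result = double_rows(data)

    # Assert
    assert result == [[2, 4], [6, 8]]
```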
Fixtures¶
import pytest

@pytest.fixture
def sample_data():
    """Provide sample data for tests."""
    return [[1, 2, 3], [4, 5, 6]]

def test_with_fixture(sample_data):
    result = process(sample_data)
    assert len(result) == 2
Best Practices¶
Use Protocols Over Inheritance¶
# Good
from typing import Protocol

class Processor(Protocol):
    def process(self, data: str) -> str: ...

# Bad
from abc import ABC, abstractmethod

class BaseProcessor(ABC):
    @abstractmethod
    def process(self, data: str) -> str: ...
Explicit Over Implicit¶
# Good
config = Config.load(
AppConfig,
sources=[FileSource("config.yaml"), EnvSource(prefix="APP_")]
)
# Bad
config = Config.auto_load() # Too magical
Type Safety¶
Early Validation¶
# Good
config = Config.load(AppConfig, sources=[...]) # Validates immediately
# Bad
config = load_raw_dict()
port = int(config["port"]) # Fails at runtime
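The idea of validating at load time can be sketched with only the standard library. This is not DSPU's `Config.load` API; `AppConfig` and `load_app_config` below are hypothetical names, and the port range check is an assumed example rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    """Hypothetical config object; not DSPU's actual Config API."""

    host: str
    port: int

    def __post_init__(self) -> None:
        # Validate when the object is built, not when a field is first used.
        if not 1 <= self.port <= 65535:
            raise ValueError(f"port must be in 1..65535, got {self.port}")


def load_app_config(raw: dict[str, str]) -> AppConfig:
    """Convert raw string settings into a validated AppConfig."""
    return AppConfig(host=raw["host"], port=int(raw["port"]))
```

With this shape, a bad value fails where the configuration is loaded, and the rest of the code can rely on `config.port` already being a valid `int`.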
Running Checks¶
All Checks¶
# Format
uv run ruff format .
# Lint
uv run ruff check .
# Type check
uv run pyrefly check src
# Test
uv run pytest
Pre-commit¶
Configuration¶
pyproject.toml¶
[tool.ruff]
line-length = 100
target-version = "py311"
[tool.ruff.lint]
select = ["E", "F", "I", "N", "W"]
[tool.pyrefly]
# Pyrefly automatically detects Python version and applies strict checks
Further Reading¶
- PEP 8 - Python style guide
- PEP 257 - Docstring conventions
- PEP 484 - Type hints
- Google Python Style Guide