
Code Style Guide

DSPU code style conventions and best practices.

Overview

We use:

- Ruff for linting and formatting
- Pyrefly for type checking
- pytest for testing
- Google style docstrings

Code Formatting

Line Length

Maximum 100 characters:

# Good
result = process_data(
    input_data, transform=lambda x: x * 2, validate=True
)

# Bad - too long
result = process_data(input_data, transform=lambda x: x * 2, validate=True, cleanup=True, verbose=True)

Imports

Sort imports automatically with Ruff, grouped as standard library, third-party, and local:

# Standard library
import os
import sys
from pathlib import Path

# Third-party
import httpx
from pydantic import BaseModel

# Local
from dspu.core import Registry
from dspu.io import Storage

Type Hints

Always use type hints:

# Good
def process(data: list[dict], limit: int = 100) -> list[dict]:
    ...

# Bad
def process(data, limit=100):
    ...

Naming Conventions

Classes

PascalCase:

class DataProcessor:
    ...

class HTTPClient:
    ...

Functions and Methods

snake_case:

def process_data(...):
    ...

def get_user_by_id(...):
    ...

Constants

UPPER_SNAKE_CASE:

MAX_RETRIES = 3
DEFAULT_TIMEOUT = 30.0
API_VERSION = "v1"

Private Members

Leading underscore:

class MyClass:
    def __init__(self):
        self._internal_state = {}

    def _private_method(self):
        ...

Docstrings

Google Style

def function(arg1: str, arg2: int, optional: bool = False) -> dict:
    """Short one-line summary.

    Longer description if needed. Explain what the function does,
    its behavior, and any important details.

    Args:
        arg1: Description of arg1
        arg2: Description of arg2
        optional: Description of optional argument. Defaults to False.

    Returns:
        Description of return value and its structure.

    Raises:
        ValueError: When arg1 is empty
        TypeError: When arg2 is not an integer

    Example:
        >>> result = function("test", 42)
        >>> print(result)
        {'status': 'success'}
    """
    ...
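
As a concrete illustration of the template above, here is a small sketch with every section filled in. The function `clamp` and its parameters are hypothetical and not part of DSPU; the `Example` section is written as a doctest, so it can be checked with `python -m doctest`.

```python
def clamp(value: float, lower: float, upper: float) -> float:
    """Restrict a value to a closed interval.

    Values inside the interval are returned unchanged. Values outside
    it are replaced by the nearest bound.

    Args:
        value: Number to restrict.
        lower: Smallest allowed result.
        upper: Largest allowed result.

    Returns:
        The input value if it lies within [lower, upper], otherwise
        the nearest bound.

    Raises:
        ValueError: When lower is greater than upper.

    Example:
        >>> clamp(15, 0, 10)
        10
        >>> clamp(-3, 0, 10)
        0
    """
    if lower > upper:
        raise ValueError(f"lower ({lower}) must not exceed upper ({upper})")
    return max(lower, min(value, upper))
```

Note that the `Raises` entry matches a condition the body actually checks, which keeps the docstring and the code from drifting apart.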

Class Docstrings

class DataProcessor:
    """Process and transform data.

    This class provides methods for processing data with various
    transformations and validations.

    Attributes:
        config: Configuration for processor
        state: Current processor state

    Example:
        >>> processor = DataProcessor(config)
        >>> result = processor.process(data)
    """

    def __init__(self, config: Config):
        """Initialize processor.

        Args:
            config: Processor configuration
        """
        ...

Type Hints

Basic Types

from typing import Optional, Union

def process(
    data: str,
    count: int,
    threshold: float,
    enabled: bool,
    optional: Optional[str] = None,
    flexible: Union[str, int] = "default",
) -> dict:
    ...

Collections

from typing import Any

def process(
    items: list[str],              # List of strings
    mapping: dict[str, int],       # String to int mapping
    unique: set[str],              # Set of strings
    pair: tuple[str, int],         # Fixed-size tuple
    records: list[dict[str, Any]], # List of dicts
) -> list[str]:
    ...

Callables

from typing import Callable

def process(
    data: list[int],
    transform: Callable[[int], str],
) -> list[str]:
    ...
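
To show how a `Callable[[int], str]` parameter is used in practice, here is a minimal sketch. The names `apply_to_all` and `to_hex` are hypothetical; both a lambda and a named function satisfy the annotation as long as they take one `int` and return a `str`.

```python
from typing import Callable


def apply_to_all(data: list[int], transform: Callable[[int], str]) -> list[str]:
    """Apply transform to each item and collect the results."""
    return [transform(item) for item in data]


def to_hex(value: int) -> str:
    """Format an integer as a hexadecimal string."""
    return hex(value)


# A lambda and a named function are both valid arguments
labels = apply_to_all([1, 2, 3], lambda n: f"item-{n}")
hex_values = apply_to_all([10, 255], to_hex)
```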

Protocols

from typing import Protocol

class Serializable(Protocol):
    """Protocol for serializable objects."""

    def to_dict(self) -> dict:
        """Convert to dictionary."""
        ...
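
The sketch below shows what satisfying such a protocol looks like. The `User` class and `dump` function are hypothetical. `User` never inherits from `Serializable`; it matches because it has a `to_dict` method. The `runtime_checkable` decorator is added here only so that `isinstance` can be demonstrated, and it checks that the method exists, not its signature.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class Serializable(Protocol):
    """Protocol for serializable objects."""

    def to_dict(self) -> dict:
        """Convert to dictionary."""
        ...


@dataclass
class User:  # hypothetical class; does not inherit from Serializable
    name: str
    age: int

    def to_dict(self) -> dict:
        return {"name": self.name, "age": self.age}


def dump(obj: Serializable) -> dict:
    """Accept any object whose shape matches the protocol."""
    return obj.to_dict()


user = User("Ada", 36)
payload = dump(user)
matches = isinstance(user, Serializable)  # presence check only
```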

Error Handling

Specific Exceptions

# Good
from dspu.core import ValidationError

if not data:
    raise ValidationError(
        "Data cannot be empty",
        field="data",
        constraint="non_empty"
    )

# Bad
if not data:
    raise Exception("Data is empty")
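
One benefit of a specific exception is that callers can react to structured fields instead of parsing a message string. The class below is a hypothetical stand-in written for this example; the real `dspu.core.ValidationError` may be defined differently.

```python
class ValidationError(ValueError):
    """Hypothetical stand-in for dspu.core.ValidationError."""

    def __init__(self, message: str, *, field: str, constraint: str) -> None:
        super().__init__(message)
        self.field = field
        self.constraint = constraint


def require_non_empty(data: list[int]) -> list[int]:
    """Return data unchanged, or raise if it is empty."""
    if not data:
        raise ValidationError(
            "Data cannot be empty", field="data", constraint="non_empty"
        )
    return data


try:
    require_non_empty([])
except ValidationError as exc:
    # The handler reads structured attributes rather than the message text
    failure = (exc.field, exc.constraint, str(exc))
```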

Exception Chaining

try:
    data = load_data(path)
except FileNotFoundError as e:
    raise ConfigurationError(
        f"Config file not found: {path}",
        suggestion="Create config.yaml"
    ) from e  # Chain exception
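
The effect of `from e` can be observed directly: the original exception is stored on the new one as `__cause__` and appears in the traceback. A self-contained sketch, with a hypothetical `ConfigurationError` and `load_text` standing in for the project's own names:

```python
from pathlib import Path


class ConfigurationError(Exception):
    """Hypothetical stand-in for the project's configuration error."""


def load_text(path: Path) -> str:
    """Read a config file, translating low-level errors."""
    try:
        return path.read_text()
    except FileNotFoundError as e:
        raise ConfigurationError(f"Config file not found: {path}") from e


try:
    load_text(Path("no-such-dir-for-example/config.yaml"))
except ConfigurationError as err:
    # `from e` stores the original exception on __cause__
    original = err.__cause__
```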

Code Organization

Module Structure

"""Module docstring.

Description of what this module provides.
"""

# Imports
from typing import Optional
from .core import SomeClass

# Constants
DEFAULT_VALUE = 42

# Classes
class MyClass:
    ...

# Functions
def helper_function():
    ...

# Exports
__all__ = ["MyClass", "helper_function"]

File Organization

Keep files focused and reasonably sized:

- < 500 lines preferred
- Split large files into submodules
- Group related functionality

Testing Style

Test Names

Descriptive test names:

def test_scaler_fits_on_training_data():
    ...

def test_encoder_handles_unknown_categories():
    ...

def test_config_loads_from_multiple_sources():
    ...

Arrange-Act-Assert

def test_processor_transforms_data():
    # Arrange
    processor = DataProcessor()
    data = [[1, 2], [3, 4]]

    # Act
    result = processor.transform(data)

    # Assert
    assert len(result) == 2
    assert result[0] == [expected1, expected2]
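
For a version of this pattern that runs as-is, the sketch below tests a hypothetical `double_rows` function defined alongside the test, with concrete expected values in the assert step.

```python
def double_rows(rows: list[list[int]]) -> list[list[int]]:
    """Hypothetical function under test: double every value."""
    return [[value * 2 for value in row] for row in rows]


def test_double_rows_doubles_every_value():
    # Arrange
    data = [[1, 2], [3, 4]]

    # Act
    result = double_rows(data)

    # Assert
    assert len(result) == 2
    assert result[0] == [2, 4]
    assert result[1] == [6, 8]
```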

Fixtures

import pytest

@pytest.fixture
def sample_data():
    """Provide sample data for tests."""
    return [[1, 2, 3], [4, 5, 6]]

def test_with_fixture(sample_data):
    result = process(sample_data)
    assert len(result) == 2

Best Practices

Use Protocols Over Inheritance

# Good
from typing import Protocol

class Processor(Protocol):
    def process(self, data: str) -> str: ...

# Bad
from abc import ABC, abstractmethod

class BaseProcessor(ABC):
    @abstractmethod
    def process(self, data: str) -> str: ...
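
A practical consequence of the protocol approach is that unrelated classes can be combined without sharing a base class. In this sketch, `Strip`, `Upper`, and `run_all` are hypothetical names introduced for illustration.

```python
from typing import Protocol


class Processor(Protocol):
    def process(self, data: str) -> str: ...


class Strip:  # hypothetical; no base class required
    def process(self, data: str) -> str:
        return data.strip()


class Upper:  # hypothetical; no base class required
    def process(self, data: str) -> str:
        return data.upper()


def run_all(processors: list[Processor], data: str) -> str:
    """Feed data through each processor in order."""
    for processor in processors:
        data = processor.process(data)
    return data


cleaned = run_all([Strip(), Upper()], "  hello ")
```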

Explicit Over Implicit

# Good
config = Config.load(
    AppConfig,
    sources=[FileSource("config.yaml"), EnvSource(prefix="APP_")]
)

# Bad
config = Config.auto_load()  # Too magical

Type Safety

# Good
transformers: Registry[Transformer] = Registry()

# Bad
transformers = {}  # Untyped
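
To illustrate what the type parameter buys, here is a minimal sketch of a generic registry. It is not the real `dspu.core.Registry`, whose API is not shown in this guide; the `register` and `get` methods and the `Transformer` class are assumptions made for the example.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Registry(Generic[T]):
    """Hypothetical typed registry; not the real dspu.core.Registry."""

    def __init__(self) -> None:
        self._items: dict[str, T] = {}

    def register(self, name: str, item: T) -> None:
        if name in self._items:
            raise KeyError(f"{name!r} is already registered")
        self._items[name] = item

    def get(self, name: str) -> T:
        return self._items[name]


class Transformer:  # hypothetical item type
    def __init__(self, factor: int) -> None:
        self.factor = factor


transformers: Registry[Transformer] = Registry()
transformers.register("double", Transformer(2))
# A type checker infers Transformer here, so .factor is checked
factor = transformers.get("double").factor
```

With a plain `dict`, the type checker has no way to know what `get` returns, so a misspelled attribute would only surface at runtime.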

Early Validation

# Good
config = Config.load(AppConfig, sources=[...])  # Validates immediately

# Bad
config = load_raw_dict()
port = int(config["port"])  # Fails at runtime
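
The idea of validating at the boundary can be sketched with the standard library alone. The `AppConfig` dataclass and `parse_config` function below are hypothetical and are not DSPU's `Config.load`; they only show that a bad value is rejected at construction time rather than where it is first used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    """Hypothetical config; validates as soon as it is constructed."""

    host: str
    port: int

    def __post_init__(self) -> None:
        if not self.host:
            raise ValueError("host must not be empty")
        if not 1 <= self.port <= 65535:
            raise ValueError(f"port out of range: {self.port}")


def parse_config(raw: dict[str, str]) -> AppConfig:
    """Convert raw strings once, at the boundary."""
    return AppConfig(host=raw["host"], port=int(raw["port"]))


config = parse_config({"host": "localhost", "port": "8080"})
```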

Running Checks

All Checks

# Format
uv run ruff format .

# Lint
uv run ruff check .

# Type check
uv run pyrefly check src

# Test
uv run pytest

Pre-commit

# Install hooks
uv run pre-commit install

# Run manually
uv run pre-commit run --all-files

Configuration

pyproject.toml

[tool.ruff]
line-length = 100
target-version = "py311"

[tool.ruff.lint]
select = ["E", "F", "I", "N", "W"]

[tool.pyrefly]
# Pyrefly automatically detects Python version and applies strict checks
