GitHub - Cazzy-Aporbo/velvet-python: My python journey; bridging the gap between tutorials and production code: interactive, benchmarked Python projects for real learning.🏙️

Project Genesis

Started: January 2025
Status: Active Development
Purpose: Educational Mastery

I began this project after realizing most Python resources fall into two extremes: oversimplified tutorials that don't scale, or production code too complex for learning. This repository bridges that gap with real, working code that demonstrates exactly how concepts perform in practice.

I use this space to improve on previous projects and to build interactive visualizations that make complex concepts intuitive. When I claim that asyncio handles 10,000 connections efficiently, I back it with benchmarks you can reproduce.

Architecture & Learning Path

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#FFE4E1', 'primaryBorderColor':'#8B7D8B', 'primaryTextColor':'#4A4A4A', 'lineColor':'#DDA0DD', 'secondaryColor':'#E6E6FA', 'tertiaryColor':'#F0F8FF', 'background':'#FFFFFF', 'mainBkg':'#FFE4E1', 'secondBkg':'#E6E6FA', 'tertiaryBkg':'#F0F8FF', 'fontFamily':'Georgia, serif', 'fontSize':'16px'}}}%%

flowchart TB
    subgraph Foundation ["Foundation Layer"]
        direction LR
        A[Environment<br/>Setup] --> B[Package<br/>Management]
        B --> C[CLI<br/>Development]
    end
    
    subgraph Core ["Core Skills"]
        direction LR
        D[DateTime<br/>Handling] --> E[Text<br/>Processing]
        E --> F[NLP<br/>Basics]
        F --> G[HTTP &<br/>APIs]
        G --> H[Database<br/>Systems]
    end
    
    subgraph Performance ["Performance Layer"]
        direction LR
        I[Async/<br/>Parallel] --> J[Media<br/>Processing]
        J --> K[System<br/>Optimization]
    end
    
    subgraph DataScience ["Data Science"]
        direction LR
        L[Numerical<br/>Computing] --> M[Data<br/>Visualization]
        M --> N[Machine<br/>Learning]
    end
    
    subgraph Web ["Web Systems"]
        direction LR
        O[Web<br/>Frameworks] --> P[Authentication<br/>& Security]
        P --> Q[Task<br/>Queues]
        Q --> R[Data<br/>Validation]
    end
    
    subgraph Advanced ["Advanced Topics"]
        direction LR
        S[System<br/>Architecture] --> T[Desktop<br/>Applications]
        T --> U[Algorithms<br/>& DS]
        U --> V[Dev<br/>Tools]
    end
    
    Foundation --> Core
    Core --> Performance
    Core --> DataScience
    Core --> Web
    Performance --> Advanced
    DataScience --> Advanced
    Web --> Advanced
    
    style Foundation fill:#FFE4E1,stroke:#8B7D8B,stroke-width:3px
    style Core fill:#EDE5FF,stroke:#8B7D8B,stroke-width:3px
    style Performance fill:#F0E6FF,stroke:#8B7D8B,stroke-width:3px
    style DataScience fill:#FFF0F5,stroke:#8B7D8B,stroke-width:3px
    style Web fill:#FFEFD5,stroke:#8B7D8B,stroke-width:3px
    style Advanced fill:#F5DEB3,stroke:#8B7D8B,stroke-width:3px

Real Performance Metrics

Concurrency Performance Deep Dive

MacBook Pro M1 Max, 32GB RAM, Python 3.11.7

I/O-Bound Operations (1000 tasks @ 100ms latency)
Method	Execution Time	Memory Usage	CPU Usage	Throughput	Efficiency Score
Serial Loop	100.32s	45 MB	2%	10 req/s	●○○○○
Threading (10)	10.15s	85 MB	8%	98 req/s	●●●○○
Threading (50)	2.08s	125 MB	15%	481 req/s	●●●●○
AsyncIO + aiohttp	1.27s	62 MB	5%	787 req/s	●●●●●
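
The gap between the serial and asyncio rows can be reproduced in miniature with a minimal sketch. It assumes `asyncio.sleep` as a stand-in for network latency (the table above was measured with aiohttp, not with this code), and the names `fake_request`, `run_serial`, `run_concurrent`, and `compare` are hypothetical.

```python
import asyncio
import time


async def fake_request(latency: float) -> None:
    # Stand-in for a network call: the coroutine yields to the event loop
    # while it "waits", just as a real socket read would.
    await asyncio.sleep(latency)


async def run_serial(n_tasks: int, latency: float) -> float:
    # Await each request before starting the next one.
    start = time.perf_counter()
    for _ in range(n_tasks):
        await fake_request(latency)
    return time.perf_counter() - start


async def run_concurrent(n_tasks: int, latency: float) -> float:
    # Start every request at once and wait for all of them together.
    start = time.perf_counter()
    await asyncio.gather(*(fake_request(latency) for _ in range(n_tasks)))
    return time.perf_counter() - start


def compare(n_tasks: int = 20, latency: float = 0.01):
    # Returns (serial_seconds, concurrent_seconds).
    serial = asyncio.run(run_serial(n_tasks, latency))
    concurrent = asyncio.run(run_concurrent(n_tasks, latency))
    return serial, concurrent


if __name__ == "__main__":
    serial, concurrent = compare()
    print(f"serial: {serial:.3f}s  concurrent: {concurrent:.3f}s")
```

Serial time grows roughly as tasks × latency, while the concurrent run stays close to a single latency period. Real network calls add connection limits and server-side throttling that this sketch ignores.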

CPU-Bound Operations (8 matrix multiplications, 500x500)
Method	Execution Time	Memory Usage	CPU Usage	Speedup	Parallel Efficiency
Serial Loop	8.14s	120 MB	100% (1 core)	1.0x	100%
Threading	8.09s	135 MB	100% (1 core)	1.0x	0% (GIL)
Multiprocessing (4)	2.31s	480 MB	380%	3.5x	88%
NumPy Vectorized	0.42s	125 MB	100%	19.4x	N/A
Numba JIT	0.38s	130 MB	100%	21.4x	N/A
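
The table does not say how the Speedup and Parallel Efficiency columns were derived. The conventional definitions (serial time divided by parallel time, and speedup divided by worker count) reproduce the multiprocessing row, as this small sketch shows. The function names are hypothetical.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    # How many times faster the parallel run finished.
    return serial_seconds / parallel_seconds


def parallel_efficiency(serial_seconds: float, parallel_seconds: float, workers: int) -> float:
    # Fraction of the ideal linear speedup actually achieved.
    return speedup(serial_seconds, parallel_seconds) / workers


# Figures taken from the Serial Loop and Multiprocessing (4) rows above.
s = speedup(8.14, 2.31)
e = parallel_efficiency(8.14, 2.31, workers=4)
print(f"speedup: {s:.1f}x, efficiency: {e:.0%}")  # speedup: 3.5x, efficiency: 88%
```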

Interactive Learning System

The concurrency explorer makes async vs. threading performance visible in real time.

Task Timeline
Visualize parallel execution

Memory Profiler
Track resource usage

Performance Heatmap
Find bottlenecks instantly

Live Benchmarks
Compare approaches

Complete Module Breakdown

Phase I: Foundation Layer

01	Environment Management	I evaluated conda, virtualenv, venv, pipenv, poetry, and PDM. In my testing, venv + pip gave the best balance of simplicity and reliability for development, while uv excelled in CI/CD pipelines. Includes performance comparisons and reproducibility tests.
02	Package Distribution	Publishing packages taught me the intricacies of wheels, sdists, and metadata. I benchmark build times, analyze distribution sizes, and explain why pyproject.toml is replacing setup.py. Real examples from packages I've published.
03	CLI Applications	Click vs Typer vs argparse performance analysis with surprising results. I built the same CLI in each framework and measured startup time, memory usage, and developer experience. Includes advanced patterns for subcommands and configuration.
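
As an illustration of the subcommand pattern mentioned for module 03, here is a minimal sketch using only the standard-library `argparse`. The program name `velvet` and the `bench` and `show` subcommands are hypothetical and not taken from the repository.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # "velvet" is a made-up program name for this sketch.
    parser = argparse.ArgumentParser(prog="velvet")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Each subcommand gets its own parser and its own options.
    bench = subparsers.add_parser("bench", help="run a benchmark")
    bench.add_argument("--repeat", type=int, default=5)

    show = subparsers.add_parser("show", help="print a module name")
    show.add_argument("module")
    return parser


def main(argv=None) -> str:
    # argv=None makes argparse read sys.argv; passing a list keeps it testable.
    args = build_parser().parse_args(argv)
    if args.command == "bench":
        return f"benchmarking with {args.repeat} repeats"
    return f"module: {args.module}"


if __name__ == "__main__":
    print(main(["bench", "--repeat", "3"]))  # benchmarking with 3 repeats
```

Returning a string from `main` instead of printing inside it keeps the command logic easy to test without capturing stdout.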

Phase II: Core Skills

04	DateTime Mastery	Timezone bugs cost me a week of debugging in production. Now I test everything with pytz, pendulum, and zoneinfo. Includes DST edge cases, UTC best practices, and performance comparisons of datetime libraries.
05	Text Processing	Unicode nightmares and encoding errors at 3 AM. I document every text processing pitfall I've encountered, with benchmarks of regex vs string methods vs third-party libraries for common operations.
06	NLP Essentials	spaCy vs NLTK vs Transformers head-to-head comparison. Memory profiling, speed benchmarks, and accuracy measurements for tokenization, NER, POS tagging, and embeddings.
07	HTTP & APIs	requests works until you need async. I compare requests, httpx, aiohttp, and urllib3 with real API calls, connection pooling strategies, and retry patterns that actually work in production.
08	Database Systems	ORMs look elegant until you see the generated SQL. I profile SQLAlchemy, Django ORM, Peewee, and raw drivers. Includes N+1 query detection, connection pooling, and migration strategies.
09	Concurrency Patterns	The module that transformed my Python code. Real measurements of threading vs multiprocessing vs asyncio with production patterns for rate limiting, circuit breakers, and backpressure handling.
10	Media Processing	Image and video processing without memory explosions. Pillow vs OpenCV vs scikit-image benchmarks, streaming processors, and GPU acceleration patterns.
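
One Unicode pitfall of the kind module 05 refers to is that visually identical strings can compare unequal. The sketch below uses the standard-library `unicodedata` module. The helper names `nfc` and `caseless_equal` are hypothetical, and the caseless comparison is a simplification of full Unicode caseless matching.

```python
import unicodedata

composed = "caf\u00e9"     # 'é' as one precomposed code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent

# Both render as "café", yet they are different code point sequences.
print(composed == decomposed)          # False
print(len(composed), len(decomposed))  # 4 5


def nfc(text: str) -> str:
    # Normalize to the composed form so equivalent strings compare equal.
    return unicodedata.normalize("NFC", text)


def caseless_equal(a: str, b: str) -> bool:
    # Simplified caseless comparison: normalize, then casefold.
    return nfc(a).casefold() == nfc(b).casefold()


print(nfc(composed) == nfc(decomposed))     # True
print(caseless_equal("Straße", "STRASSE"))  # True
```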

Phase III: Data Science & ML

11	Numerical Computing	NumPy internals revealed. Understanding memory layout, broadcasting, and vectorization improved my code's performance by 100x. Includes BLAS/LAPACK integration and GPU computing introduction.
12	Data Visualization	Matplotlib vs Plotly vs Altair vs Bokeh. I built identical visualizations in each to compare rendering speed, interactivity, file sizes, and developer experience.
13	Machine Learning	From scikit-learn prototype to production deployment. Feature engineering pipelines, model versioning, A/B testing frameworks, and monitoring patterns that catch model drift.

Phase IV: Web Development

14	Web Frameworks	FastAPI vs Flask vs Django performance shootout. I built the same API in each framework and measured throughput, latency percentiles, and resource usage under load.
15	Authentication	JWT, OAuth2, SAML, session management. Security patterns that actually protect applications, with penetration testing results and common vulnerability demonstrations.
16	Task Queues	Celery vs RQ vs Huey vs Dramatiq. Performance under load, failure recovery patterns, and monitoring strategies. Real production configurations included.
17	Data Validation	Pydantic changed everything. Type safety, serialization, validation patterns that catch errors before production. Performance comparisons with marshmallow and cerberus.

Phase V: Quality & Performance

18	Testing Strategies	pytest patterns that find real bugs. Property testing with Hypothesis, mutation testing, fixture strategies, and mocking patterns that don't break when code changes.
19	Performance Optimization	Profiling tools that actually help. cProfile, py-spy, memory_profiler, line_profiler in practice. Finding and fixing the 20% of code that uses 80% of resources.
20	Architecture Patterns	Design patterns that survive requirement changes. Event sourcing, CQRS, hexagonal architecture, and dependency injection implemented and benchmarked.
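
For the profiling workflow named in module 19, a minimal sketch with the standard-library `cProfile` and `pstats` might look like the following. The functions `slow_sum_of_squares` and `profile_call` are hypothetical examples, not code from the repository.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n: int) -> int:
    # Deliberately naive loop so it shows up in the profile.
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args) -> str:
    # Run func under the profiler and return the report as text.
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep only the top five entries.
    stats.sort_stats("cumulative").print_stats(5)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_call(slow_sum_of_squares, 100_000))
```

Capturing the report in a string rather than printing it directly makes it possible to log it or compare runs in a test.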

Phase VI: Advanced Topics

21	Desktop Applications	Modern GUIs that don't look dated. PyQt6 vs Tkinter vs Kivy vs Dear PyGui. Performance comparisons, distribution strategies, and native integration patterns.
22	Algorithms & Data Structures	When algorithmic thinking matters in Python. Performance comparisons of built-in vs custom implementations, with real-world applications.
23	Development Tools	IDE configurations that boost productivity. Debugging techniques, profiling workflows, and automation scripts that save hours.

Quick Start

# Clone the repository
git clone https://github.com/Cazzy-Aporbo/velvet-python.git
cd velvet-python

# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install development dependencies
python -m pip install -U pip wheel setuptools
pip install -r requirements-dev.txt

# Navigate to any module
cd 09-concurrency

# Install module dependencies
pip install -r requirements.txt

# Run examples
python examples/basic.py
python examples/intermediate.py
python examples/advanced.py

# Launch interactive dashboard
streamlit run app.py

# Run benchmarks
python benchmarks/measure.py

# Execute test suite
pytest tests/ -v --cov=src --cov-report=html

Code Evolution Example

Rate Limiter: From Naive to Production-Ready

Version 1: First Attempt

# Simple but flawed
import time

class NaiveRateLimiter:
    def __init__(self, max_calls, window):
        self.max_calls = max_calls
        self.window = window
        self.calls = []
    
    def allow(self):
        now = time.time()
        # Clean old calls - O(n) operation!
        self.calls = [c for c in self.calls 
                     if now - c < self.window]
        
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False

# Problems:
# - Memory grows with request rate
# - O(n) cleanup on every call
# - Not thread-safe

Version 2: Production Ready

# Token bucket algorithm
import time
import threading

class TokenBucketLimiter:
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last_refill = time.monotonic()
        self.lock = threading.Lock()
    
    def allow(self, tokens=1):
        with self.lock:
            now = time.monotonic()
            elapsed = now - self.last_refill
            
            # Refill tokens
            self.tokens = min(
                self.capacity,
                self.tokens + elapsed * self.rate
            )
            self.last_refill = now
            
            if self.tokens >= tokens:
                self.tokens -= tokens
                return True
            return False

# Advantages:
# - O(1) constant time
# - Fixed memory usage
# - Thread-safe
# - Supports bursting
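
To make the refill arithmetic concrete, the sketch below repeats the same token bucket algorithm with one addition that is not in the original: a hypothetical `clock` parameter, so a fake clock can drive time deterministically. `ClockedTokenBucket` and `FakeClock` are illustrative names.

```python
import threading
import time


class ClockedTokenBucket:
    # Same algorithm as the token bucket above; `clock` is an assumed
    # extra parameter that lets tests control the passage of time.
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.tokens = float(capacity)   # start with a full bucket
        self.clock = clock
        self.last_refill = clock()
        self.lock = threading.Lock()

    def allow(self, tokens=1):
        with self.lock:
            now = self.clock()
            elapsed = now - self.last_refill
            # Refill in proportion to elapsed time, capped at capacity.
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self.last_refill = now
            if self.tokens >= tokens:
                self.tokens -= tokens
                return True
            return False


class FakeClock:
    # Manually advanced clock for deterministic walkthroughs.
    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


clock = FakeClock()
limiter = ClockedTokenBucket(rate=2, capacity=3, clock=clock)

burst = [limiter.allow() for _ in range(4)]
print(burst)            # [True, True, True, False]: the burst drains 3 tokens

clock.advance(0.5)      # 0.5 s at 2 tokens/s refills exactly 1 token
print(limiter.allow())  # True
print(limiter.allow())  # False: the bucket is empty again
```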

Performance Comparison

Metric	Naive Implementation	Token Bucket	Redis-Based	Improvement
Memory (1M calls)	458 MB	1.2 MB	2.4 MB	381x better
Throughput	12K ops/s	890K ops/s	45K ops/s	74x faster
Time Complexity	O(n)	O(1)	O(1)	Constant time
Thread Safety	No	Yes	Yes	Production ready
Accuracy	94%	99.9%	99.9%	Near perfect

Development Environment

Primary Development

MacBook Pro M1 Max
32GB Unified Memory
macOS Sonoma 14.2
Python 3.11.7

Testing Matrix

GitHub Actions CI
Ubuntu 22.04 LTS
Windows Server 2022
macOS 13

Data Systems

Database Testing
PostgreSQL 14 & 15
Redis 7.2
MongoDB 6.0

Core Philosophy

Measure First

Performance assumptions are often wrong, so I benchmark before optimizing anything.
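
A minimal sketch of what measuring first can look like with the standard-library `timeit` module. The two string-building functions and the `best_time` helper are hypothetical examples chosen for illustration. No result is claimed here, since the point is to run it on your own machine.

```python
import timeit


def concat_with_plus(parts):
    # Build the string by repeated concatenation.
    result = ""
    for part in parts:
        result += part
    return result


def concat_with_join(parts):
    # Build the string in a single join call.
    return "".join(parts)


def best_time(func, arg, repeat=5, number=200) -> float:
    # Take the minimum over several repeats, which is the least noisy
    # estimate, and divide by `number` to get seconds per call.
    timings = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    parts = ["x"] * 1_000
    for func in (concat_with_plus, concat_with_join):
        per_call = best_time(func, parts)
        print(f"{func.__name__}: {per_call * 1e6:.1f} microseconds per call")
```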

Test Reality

Unit tests for logic, integration tests for behavior, benchmarks for performance claims.

Document Why

Code shows what, comments explain why. Especially for non-obvious optimizations.

Visualize Everything

Interactive demos make complex patterns intuitive and memorable.

Contributing

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#FFE4E1', 'primaryBorderColor':'#8B7D8B', 'primaryTextColor':'#4A4A4A', 'lineColor':'#DDA0DD', 'fontFamily':'Georgia, serif'}}}%%

flowchart LR
    A[Fork<br/>Repository] --> B[Create Feature<br/>Branch]
    B --> C[Add Tests &<br/>Benchmarks]
    C --> D[Format<br/>Code]
    D --> E[Submit<br/>PR]
    E --> F[Code<br/>Review]
    F --> G[Merge]
    
    style A fill:#FFE4E1,stroke:#8B7D8B
    style B fill:#EDE5FF,stroke:#8B7D8B
    style C fill:#F0E6FF,stroke:#8B7D8B
    style D fill:#FFF0F5,stroke:#8B7D8B
    style E fill:#FFEFD5,stroke:#8B7D8B
    style F fill:#F5DEB3,stroke:#8B7D8B
    style G fill:#DDA0DD,stroke:#8B7D8B

Contribution Standards

• Include working examples with clear documentation

• Add comprehensive tests with edge cases

• Provide benchmarks for performance claims

• Format with black, isort, and ruff

• Document design decisions and tradeoffs

• Explain why, not just what

Project Roadmap

Quarter	Focus Area	Key Deliverables
Q1 2025	Foundation & Core	Modules 1-10 complete with full test coverage
Q2 2025	Data Science & Web	Modules 11-17 with interactive visualizations
Q3 2025	Performance & Quality	Modules 18-20 with production patterns
Q4 2025	Advanced Topics	Modules 21-23 plus bonus content
2026	Second Edition	Distributed systems, cloud patterns, AI/ML operations

License

Code License

All code is MIT licensed for maximum reusability

Documentation License

Educational content is Creative Commons Attribution 4.0

Built with persistence and curiosity

Learning one benchmark at a time since January 2025

View Data Science & Machine Learning Platform

Interactive educational platform featuring comprehensive data science, machine learning theory, and advanced algorithm visualizations with real-time demonstrations
