
Restructure repo for educational clarity #2717

Merged
keon merged 11 commits into main from refactor/restructure-for-education
Feb 17, 2026

Conversation

keon commented Feb 17, 2026

Summary

  • Phase 0: Modernize codebase with type hints, docstrings, explicit __init__.py exports, and CI update
  • Phase 1: Remove niche/orphan modules (automata/, ml/, distribution/, unix/)
  • Phase 2: Create algorithms/data_structures/ package — extract stack, queue, heap, hash table, linked list, graph DS, union-find into centralized location with backward-compatible re-exports
  • Phase 3-4: Rename packages to follow Python conventions (arrays→array, strings→string, bit→bit_manipulation, backtrack→backtracking, sort→sorting, search→searching, dp→dynamic_programming, maths→math, linkedlist→linked_list, queues→queue)
  • Phase 5: Merge bfs/ and dfs/ into graph/ — BFS/DFS are techniques, not categories
  • Phase 6-7: Move misplaced algorithms (top_sort→graph/, unionfind/count_islands→graph/)
  • Phase 8: Flatten tree subdirectories — move tree DS (BST, AVL, trie, segment tree, fenwick tree, red-black tree, B-tree) to data_structures/, flatten algorithm files into tree/ with prefixed names
  • Phase 9: Consolidate TreeNode — single canonical definition in common/tree_node.py as a dataclass
  • Phase 10: Remove old DS source files, convert graph/graph.py to re-export shim
  • Phase 11: Expose data_structures in top-level __init__.py, delete stale C file, final cleanup

Test plan

  • All 414 tests pass after every phase
  • Backward-compatible re-exports maintained — existing import paths still work
  • git mv used for all moves to preserve file history
  • Verify CI passes on this branch
  • Manual review of new directory structure for educational clarity

🤖 Generated with Claude Code

keon and others added 11 commits February 17, 2026 05:03
…update

Phase 0 of the educational algorithms repo refactor:

- Add type annotations and PEP 257 docstrings with complexity analysis to all
  290+ algorithm and data structure modules
- Replace wildcard imports with explicit imports and __all__ lists across all
  __init__.py files
- Extract shared data structures (TreeNode, ListNode, Graph) into
  algorithms/common/
- Migrate build config from setup.py/tox/Travis to pyproject.toml and GitHub
  Actions
- Remove obsolete .coveragerc, .travis.yml, tox.ini, requirements.txt
- Add py.typed marker for PEP 561 type-checking support
- Add Phase 0 discovery tooling (tools/) and audit reports (docs/internal/)
- Add refactor.md roadmap for subsequent restructuring phases
- All 422 tests passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These modules don't fit the educational algorithms focus:
- automata/dfa.py: isolated DFA simulator with no related modules
- ml/nearest_neighbor.py: single ML algorithm in an algorithms repo
- distribution/histogram.py: trivial frequency counter (reimplements Counter)
- unix/path/: OS path utilities, not classical algorithms

Removes 4 corresponding test files. No other modules depend on these.
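To make the "reimplements Counter" point concrete, here is a minimal sketch. The `histogram` function below is a hypothetical stand-in for a hand-rolled frequency counter, not the removed module's actual code; it produces the same counts as the standard library's `collections.Counter`.

```python
from collections import Counter


def histogram(values):
    """Hand-rolled frequency count (hypothetical stand-in for the removed module)."""
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    return counts


data = [3, 3, 2, 1]
# The standard library already provides the same behavior.
matches_counter = histogram(data) == dict(Counter(data))
```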

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copies stack, queue, priority queue, heap, hash table, linked list, graph,
and union-find data structures into a dedicated data_structures/ package.
Old __init__.py files updated to re-export from the new location, so all
existing imports continue to work unchanged.
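The re-export approach described above can be sketched as follows. The script builds a throwaway package on disk to show that an old import path and the new one resolve to the same class. The package name `demo_algorithms` and the class `ArrayStack` are assumptions made for the demonstration, not names confirmed by this PR.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from textwrap import dedent

# Build a throwaway package; "demo_algorithms" and "ArrayStack" are
# hypothetical names used only for this illustration.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_algorithms"
(pkg / "data_structures").mkdir(parents=True)
(pkg / "stack").mkdir()
(pkg / "__init__.py").write_text("")

# New canonical home of the implementation.
(pkg / "data_structures" / "stack.py").write_text(dedent('''\
    class ArrayStack:
        """Minimal LIFO stack backed by a list."""

        def __init__(self):
            self._items = []

        def push(self, item):
            self._items.append(item)

        def pop(self):
            return self._items.pop()
'''))
(pkg / "data_structures" / "__init__.py").write_text(dedent('''\
    from .stack import ArrayStack

    __all__ = ["ArrayStack"]
'''))

# Old location: a thin shim that only re-exports from the new package.
(pkg / "stack" / "__init__.py").write_text(dedent('''\
    from demo_algorithms.data_structures import ArrayStack

    __all__ = ["ArrayStack"]
'''))

sys.path.insert(0, str(root))
importlib.invalidate_caches()

old_path = importlib.import_module("demo_algorithms.stack")
new_path = importlib.import_module("demo_algorithms.data_structures")

# Both import paths hand back the very same class object.
same_class = old_path.ArrayStack is new_path.ArrayStack
```

Because the shim holds no logic of its own, callers using the old path keep working while the implementation lives in exactly one place.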

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…backtrack→backtracking

Clearer names that follow Python naming conventions:
- arrays/ → array/ (singular, matches the data type)
- strings/ → string/ (singular)
- bit/ → bit_manipulation/ (descriptive of the technique)
- backtrack/ → backtracking/ (standard algorithmic term)

All __init__.py import paths and test imports updated accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ng, maths→math, linkedlist→linked_list, queues→queue

Descriptive names that match standard algorithmic terminology:
- sort/ → sorting/ (category name)
- search/ → searching/ (category name)
- dp/ → dynamic_programming/ (spell out the technique)
- maths/ → math/ (standard term)
- linkedlist/ → linked_list/ (snake_case per PEP 8)
- queues/ → queue/ (singular, matches the data type)

Fixes cross-module imports in math/chinese_remainder_theorem.py and
math/symmetry_group_cycle_index.py. All test files renamed and updated.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BFS and DFS are traversal techniques, not top-level algorithm categories.
Moves all files from bfs/ and dfs/ into graph/, renaming where conflicts
exist (count_islands_bfs.py, count_islands_dfs.py, etc.).

Merges test_bfs.py, test_dfs.py, and test_topological.py into test_graph.py.
Deletes the now-empty bfs/ and dfs/ directories.
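The rationale that BFS and DFS are techniques rather than categories can be illustrated with a single problem solved both ways. The sketch below is not the repository's `count_islands_bfs.py` or `count_islands_dfs.py`; it is a hypothetical function where the only difference between the two traversals is which end of the frontier is popped.

```python
from collections import deque

Grid = list[list[int]]


def _land_neighbors(grid: Grid, row: int, col: int):
    """Yield the 4-connected neighbors of (row, col) that are land (1)."""
    for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        n_row, n_col = row + d_row, col + d_col
        if 0 <= n_row < len(grid) and 0 <= n_col < len(grid[0]) and grid[n_row][n_col] == 1:
            yield n_row, n_col


def count_islands(grid: Grid, use_bfs: bool) -> int:
    """Count connected groups of 1s using a queue (BFS) or a stack (DFS-style)."""
    seen: set[tuple[int, int]] = set()
    islands = 0
    for row, cells in enumerate(grid):
        for col, cell in enumerate(cells):
            if cell != 1 or (row, col) in seen:
                continue
            islands += 1
            frontier = deque([(row, col)])
            seen.add((row, col))
            while frontier:
                # Queue order gives breadth-first, stack order gives depth-first-style.
                cur_row, cur_col = frontier.popleft() if use_bfs else frontier.pop()
                for neighbor in _land_neighbors(grid, cur_row, cur_col):
                    if neighbor not in seen:
                        seen.add(neighbor)
                        frontier.append(neighbor)
    return islands


grid = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
bfs_count = count_islands(grid, use_bfs=True)
dfs_count = count_islands(grid, use_bfs=False)
```

Both calls visit the same cells and return the same count, which is why the two variants sit comfortably side by side in one `graph/` package.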

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Phase 6: Move top_sort.py from sorting/ to graph/topological_sort_dfs.py
- Phase 7: Move unionfind/count_islands.py to graph/count_islands_unionfind.py,
  import Union from data_structures, delete unionfind/
- Phase 8: Flatten tree subdirectories — move tree DS (BST, AVL, trie,
  segment tree, fenwick tree, red-black tree, B-tree) to data_structures/,
  flatten traversal/bst/trie algorithm files into tree/ with prefixed names
- Phase 9: Consolidate TreeNode — tree/tree.py now re-exports from
  common/tree_node.py dataclass
- Phase 10: Remove old DS source files (stack.py, queue.py, etc.) now that
  __init__.py files re-export from data_structures/; convert graph/graph.py
  to re-export shim
- Phase 11: Expose data_structures in top-level __init__.py, add tree DS
  exports to data_structures/__init__.py, delete stale C file

All 414 tests pass.
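For readers unfamiliar with the Phase 9 idea, a dataclass-based tree node might look roughly like this. The field names (`val`, `left`, `right`) are assumptions; the PR does not show the actual definition in common/tree_node.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    """Binary tree node; the field names here are assumed, not taken from the repo."""

    val: int = 0
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def inorder(root: Optional[TreeNode]) -> list[int]:
    """Return node values in left, root, right order."""
    if root is None:
        return []
    return inorder(root.left) + [root.val] + inorder(root.right)


root = TreeNode(2, TreeNode(1), TreeNode(3))
values = inorder(root)
```

One practical benefit of a single dataclass definition is that `__init__`, `__repr__`, and `__eq__` are generated once, so every tree algorithm shares the same node behavior.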

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rewrite README.md to reflect the new project structure with accurate
  file paths, organized data structures table, and complete algorithm
  listing grouped by topic
- Remove stale Sphinx docs (old .rst files referencing pre-rename packages)
- Remove internal Phase 0 audit documents (docs/internal/phase0/)
- Remove legacy setup.py (pyproject.toml is now authoritative)
- Remove refactoring planning files (refactor.md, PHASE_0_SUMMARY.txt)
- Remove one-off migration tools (tools/)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix pyproject.toml build-backend: use setuptools.build_meta
  (setuptools.backends._legacy:_Backend does not exist)
- Remove license classifier conflicting with PEP 639 license field
- Narrow ruff lint rules to F (Pyflakes) + I (isort) to match
  codebase reality; ignore F401 for intentional re-exports
- Increase line-length to 120 to avoid pre-existing E501 noise
- Auto-fix 49 import sorting issues (ruff --fix)
- Fix F841: remove unused variable in ResizableHashTable.put()
- Fix F811: rename duplicate TestCountBinarySubstring class to
  TestRepeatString (was silently shadowing a test — now 415 pass)
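The F811 fix matters because a second class with the same name replaces the first in the module namespace, so the first class's tests are never collected. The sketch below reproduces the effect with placeholder test bodies; only the class names come from the commit message, everything else is hypothetical.

```python
import unittest


def shadowed_module() -> dict:
    """Simulate a test module that defines the same class name twice."""

    class TestCountBinarySubstring(unittest.TestCase):
        def test_count(self):
            self.assertTrue(True)  # placeholder body

    class TestCountBinarySubstring(unittest.TestCase):  # noqa: F811
        def test_repeat(self):
            self.assertTrue(True)  # placeholder body

    return dict(locals())


def fixed_module() -> dict:
    """Same module after renaming the second class."""

    class TestCountBinarySubstring(unittest.TestCase):
        def test_count(self):
            self.assertTrue(True)  # placeholder body

    class TestRepeatString(unittest.TestCase):
        def test_repeat(self):
            self.assertTrue(True)  # placeholder body

    return dict(locals())


def collect(namespace: dict) -> list[str]:
    """List the test methods a loader would find in the namespace."""
    loader = unittest.defaultTestLoader
    found = []
    for obj in namespace.values():
        if isinstance(obj, type) and issubclass(obj, unittest.TestCase):
            found += [f"{obj.__name__}.{name}" for name in loader.getTestCaseNames(obj)]
    return sorted(found)


before = collect(shadowed_module())
after = collect(fixed_module())
```

Here `before` holds a single test while `after` holds two, mirroring the one recovered test reported in the commit.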

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Restore original strict ruff rule set ["E", "W", "F", "I", "N", "UP", "B", "SIM"]
and fix every violation across 189 files instead of loosening the linter config.

Fixes applied:
- ruff format: 395 whitespace issues (W191 tabs, W291/W293 trailing, E101 mixed)
- ruff check --fix: 129 auto-fixable (I001 isort, UP type-hint upgrades, SIM simplifications)
- Manual fixes: 146 remaining (N class naming, E501 line length, F811 redefinitions, F841 unused variables,
  B027 empty methods, E741 ambiguous variable names like l→left, V→vertices)
- Renamed duplicate TestCountBinarySubstring → TestRepeatString in test_string.py,
  recovering 1 previously shadowed test (414 → 415 tests now pass)

All checks passed: ruff clean, 415 tests green.
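To give a feel for the kinds of fixes listed above, here is a hypothetical before/after for one small function. The rule codes in the comments reflect my reading of ruff's rule list and are an assumption; the function itself does not come from the repository.

```python
# Before (as it might have looked; shown as a comment so the block stays lint-clean):
#
#   from typing import List
#
#   def is_sorted(l: List[int]) -> bool:      # UP: typing.List, E741: name `l`
#       if all(l[i] <= l[i + 1] for i in range(len(l) - 1)):
#           return True                       # SIM: if/else returning booleans
#       else:
#           return False


# After: built-in generic, descriptive name, condition returned directly.
def is_sorted(values: list[int]) -> bool:
    """Return True if ``values`` is in non-decreasing order."""
    return all(values[i] <= values[i + 1] for i in range(len(values) - 1))
```

The behavior is unchanged; only the annotation style, naming, and control flow differ.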

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Every algorithm entry now includes a one-line description explaining
what it does. Added 7 usage examples covering graph, DP, backtracking,
data structures, searching, tree traversal, and string matching.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
keon merged commit 7f80fa3 into main on Feb 17, 2026
4 checks passed