About Me

I am a machine learning engineer, computational biologist, and systems designer working where clinical informatics, cybernetics, and information theory meet. My day job is building high-stakes data and ML systems for oncology and real-world evidence; my longer-horizon work is about treating those systems as goal-directed, feedback-rich processes rather than mere data plumbing.

Formally, my background combines bioinformatics & computational biology, computer science, and data infrastructure engineering. Conceptually, I draw a line from early cybernetics (control and communication in organisms and machines) through information theory (information as the resolution of uncertainty) to modern AI and multi-scale cognition. My focus is making that lineage concrete in code: ontologies become vectors; feedback loops become APIs; evaluation becomes a first-class artifact.


Research & Practice Fingerprint

  • Cybernetics & information flow

    • Model clinical platforms as feedback systems: sensors (EHR, labs, genomics), controllers (mapping engines, policies), actuators (dashboards, decision support).
    • Treat pipelines as communication channels with noise, capacity, and distortion; design for graceful degradation instead of silent failure.
  • Vectorized ontologies & representation geometry

    • Embed NCIt, SNOMED CT, UMLS, RxNorm, FHIR value sets, and OBO ontologies into vector spaces; analyze manifold structure and capacity, and detect term communities (Leiden/Louvain).
    • Build mapping engines that combine approximate nearest neighbors, lexical features, and domain constraints to align FHIR resources and procedural codes to ontology terms.
  • Clinical data systems & governance

    • Architect federated data meshes: per-site marts and warehouses backed by relational + vector stores, coordinated by a mesh hub with policy, lineage, and evaluation.
    • Emphasize auditability and epistemic humility: every mapping, score, and model decision should be traceable, inspectable, and falsifiable.
  • Working style

    • Kanban-driven development, small reviewable PRs, and CI that enforces formatting, linting, tests, and documentation.
    • Preference for strong types and explicit invariants (Rust / typed schemas) in pipelines that must be correct for years, not just demos.
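The "graceful degradation instead of silent failure" principle above can be sketched as a typed pipeline outcome: degradation becomes data that downstream stages and auditors can see, rather than a dropped record. Everything here (the `MapOutcome` type, the `map_code` helper, the 0.90 threshold) is illustrative, not from any real codebase.

```rust
// Illustrative sketch: surface degradation as an explicit, inspectable value.
#[derive(Debug)]
enum MapOutcome {
    Mapped { code: String, confidence: f32 },
    Degraded { code: String, reason: String }, // low confidence, but traceable
    Unmappable { reason: String },             // explicit, never silent
}

fn map_code(raw: &str, confidence: f32) -> MapOutcome {
    match (raw.is_empty(), confidence) {
        (true, _) => MapOutcome::Unmappable { reason: "empty input".into() },
        (false, c) if c >= 0.90 => MapOutcome::Mapped { code: raw.to_string(), confidence: c },
        (false, c) => MapOutcome::Degraded {
            code: raw.to_string(),
            reason: format!("confidence {c:.2} below 0.90 threshold"),
        },
    }
}
```

Because every outcome is a value rather than a log line, the "controller" stage of the loop can route degraded mappings to human review instead of passing them off as clean.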
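The hybrid matching idea (embedding similarity + lexical features + domain constraints) can be shown in a minimal form: cosine similarity over embeddings, Jaccard overlap over tokens, and a semantic-type gate. The names (`Candidate`, `hybrid_score`) and the 0.7/0.3 weighting are hypothetical, chosen only to make the combination concrete.

```rust
use std::collections::HashSet;

struct Candidate {
    term: &'static str,
    semantic_type: &'static str, // domain constraint, e.g. "Procedure"
    embedding: Vec<f32>,
}

// Cosine similarity between two embedding vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

// Jaccard overlap of whitespace tokens as a cheap lexical feature.
fn jaccard(a: &str, b: &str) -> f32 {
    let ta: HashSet<&str> = a.split_whitespace().collect();
    let tb: HashSet<&str> = b.split_whitespace().collect();
    let union = ta.union(&tb).count() as f32;
    if union == 0.0 { 0.0 } else { ta.intersection(&tb).count() as f32 / union }
}

/// Combine semantic and lexical evidence, gated by a domain constraint.
/// The weights are illustrative; a real engine would tune or learn them.
fn hybrid_score(query: &str, query_emb: &[f32], c: &Candidate, required_type: &str) -> Option<f32> {
    if c.semantic_type != required_type {
        return None; // constraint filter: wrong semantic type is never a match
    }
    Some(0.7 * cosine(query_emb, &c.embedding) + 0.3 * jaccard(query, c.term))
}
```

The constraint runs first so that ANN recall can stay broad while domain rules keep precision: a candidate with the wrong semantic type is rejected outright, not merely down-weighted.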

What I Use GitHub For

  • Data-First Procedural Semantics (DFPS) / clinical platform work

    • Multi-crate Rust workspace for:
      • FHIR bundle ingestion, validation, and normalization.
      • Ontology-aware mapping of service requests, procedures, and observations to NCIt and related vocabularies via vector backends (FAISS / pgvector / similar).
      • Mesh-style orchestration across local datamarts, analytics marts, and shared governance layers.
  • Mapping evaluation & information-theoretic probing

    • CLIs to build, query, and introspect vector indexes; tools to compare lexical vs. embedding-based vs. hybrid matching strategies.
    • Analyze error modes as information-processing failures: ambiguous codes, underspecified contexts, brittle embeddings, and graph pathologies.
  • Frontends & inspection tools

    • Next.js + Tailwind + ShadCN UIs for:
      • Inspecting mapping neighborhoods (top-k candidates, confidence scores, lexical/semantic evidence).
      • Visualizing ontology graphs, local manifolds, and evaluation metrics in ways that clinicians and data stewards can actually reason about.
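One way to make "ambiguous codes" measurable in the information-theoretic sense is to treat the softmax-normalized scores of the top-k candidates as a distribution and report its Shannon entropy in bits: near 0 bits means one clear winner, higher values flag ambiguity for review. This is a sketch of the idea, not a claim about how any particular evaluation harness computes it.

```rust
/// Shannon entropy (bits) of the softmax over candidate scores.
/// 0.0 ≈ one dominant candidate; log2(k) ≈ total ambiguity among k candidates.
fn ambiguity_bits(scores: &[f32]) -> f32 {
    // Subtract the max before exponentiating for numerical stability.
    let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
    let z: f32 = exps.iter().sum();
    exps.iter()
        .map(|e| {
            let p = e / z;
            if p > 0.0 { -p * p.log2() } else { 0.0 }
        })
        .sum()
}
```

Two equally scored candidates give 1.0 bit of ambiguity; a clearly dominant candidate gives close to zero, which makes the metric easy to threshold in an evaluation report.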

Engineering Stack & Practices

  • Languages & ecosystems

    • Rust for domain models, ingestion pipelines, mapping engines, governance / mesh services.
    • Python for experimentation, data analysis, evaluation harnesses, and research prototypes.
    • TypeScript / React / Next.js for visual analytics, operator consoles, and developer tooling.
  • Data & infra

    • PostgreSQL + SQLx for relational cores; dimensional datamarts / warehouses for downstream analytics.
    • Vector stores (FAISS, pgvector-style backends, ANN indices) as an explicit ontology layer, not an afterthought.
    • Containers and IaC (Docker, Terraform-style tooling) with GitHub Actions CI that runs fmt, lint, test, and doc checks.
  • Design principles

    • Domain-driven structure: domain (semantics), platform (infrastructure), app (interfaces), each with narrow, testable contracts.
    • Extensive CLI entry points (e.g., build_vector_index, map_bundles, map_codes, eval_mapping, load_datamart, validate_fhir) so experiments and pipelines are scripted, versioned, and repeatable.
    • Treat metrics, logs, and traces as feedback signals for a cybernetic system rather than mere observability garnish.
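The "scripted, versioned, and repeatable" property of those CLI entry points comes down to each subcommand being a pure dispatch over its arguments. A minimal std-only sketch (the `dispatch` function, the `dfps` binary name, and the stub messages are hypothetical; only the subcommand names come from the list above):

```rust
/// Dispatch a subcommand; unknown or missing subcommands fail loudly
/// with a usage message rather than doing nothing.
fn dispatch(args: &[&str]) -> Result<String, String> {
    match args.first().copied() {
        Some("build_vector_index") => Ok("building index".into()),
        Some("validate_fhir") => Ok("validating bundles".into()),
        Some(other) => Err(format!("unknown subcommand: {other}")),
        None => Err("usage: dfps <subcommand> [args...]".into()),
    }
}
```

Returning `Result<String, String>` instead of printing directly keeps each entry point testable in isolation, so pipelines built from them stay reproducible.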

Technical & Natural Languages

  • Programming

    • Primary: Rust, Python, TypeScript
    • Also: SQL, Bash, occasional JVM/web tooling when interfacing with legacy systems
    • Tooling: cargo, poetry/pip, Node/Bun, modern linters/formatters, GitHub Actions / similar CI.
  • Communication

    • Write design docs, evaluation reports, and governance notes that tie together code, data, and outcomes.
    • Strong bias toward:
      • Declaring assumptions and failure modes up front.
      • Making epistemic status explicit (“measured”, “estimated”, “hypothesized”).
      • Translating between engineers, clinicians, data stewards, and leadership.

Guiding perspectives (selected quotes)

“Information is information, not matter or energy.”
— Norbert Wiener

“Information is the resolution of uncertainty.”
— Claude Shannon

“Artificial intelligence is the science and engineering of making intelligent machines, especially intelligent computer programs.”
— John McCarthy

“Novel beings, novel goals.”
— Michael Levin

If you are thinking in the same space—**cybernetics-inspired clinical systems, information-theoretic views of pipelines, vectorized ontologies, and multi-scale cognition**—I’m always open to conversations, issues, or collaborative experiments.
