
Muhammad Qasim

Portfolio · LinkedIn · Email · Patreon


What I Build

I design and implement systems at the boundary between classical software engineering and applied AI — specifically:

  • Autonomous agent loops that plan, execute, and self-correct over multi-step coding tasks
  • Retrieval pipelines that ingest messy real-world documents and serve precise, ranked context to LLMs
  • CLI/TUI tooling that runs locally, requires no cloud dependency, and treats safety as a design constraint

My engineering approach is shaped by one conviction: language models are only as useful as the infrastructure surrounding them. I focus on that infrastructure.


Systems

Operon — Autonomous Code Intelligence Agent

Stars Python Textual License

A terminal-native agent that operates on a full repository — not just individual files. Operon builds a persistent, hash-gated symbol graph across the entire codebase, then uses it to execute multi-file refactors, generate structured documentation, and answer questions about execution flow.
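As a rough illustration of what "hash-gated" can mean for a symbol index, the sketch below re-parses a file only when its content hash changes. It is not Operon's code: `SymbolIndex`, `file_digest`, and `extract_symbols` are hypothetical names, the index lives in memory rather than on disk, and it records only function and class names instead of a full graph.

```python
import ast
import hashlib


def file_digest(source: str) -> str:
    """Content hash used as the gate: an unchanged digest means nothing to re-index."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def extract_symbols(source: str) -> list[str]:
    """Collect the names of all functions and classes defined in the module."""
    tree = ast.parse(source)
    return sorted(
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    )


class SymbolIndex:
    """Hypothetical in-memory index; a persistent version would serialize `entries`."""

    def __init__(self) -> None:
        self.entries: dict[str, tuple[str, list[str]]] = {}
        self.parses = 0  # how many times a file was actually re-parsed

    def update(self, path: str, source: str) -> list[str]:
        digest = file_digest(source)
        cached = self.entries.get(path)
        if cached is not None and cached[0] == digest:
            return cached[1]  # hash unchanged: skip the parse entirely
        self.parses += 1
        symbols = extract_symbols(source)
        self.entries[path] = (digest, symbols)
        return symbols
```

Calling `update` twice with identical source parses once; any edit to the file changes the digest and triggers a fresh parse.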

Key engineering decisions:

  • Deterministic-first REVIEWER: verifies each change by comparing the on-disk file hash against the in-memory snapshot of the diff before any LLM call, which eliminates hallucinated confirmations
  • CRUD fast-path: structured operations (import insertion, variable renaming, comment placement) are handled with Python's tokenize and ast modules, without an LLM call, which removes the most common failure mode of small local models
  • 5-tier surgical diff engine: SEARCH/REPLACE patching with cascading fallbacks from exact string match through fuzzy multi-line tolerance
  • Mandatory approval gate: no filesystem write occurs without explicit human confirmation; timeout auto-rejects to prevent hang
  • 9-provider LLM router with hot-reload config — model switching takes effect on the next call, no restart
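The cascading-fallback idea behind the diff engine can be sketched with three tiers instead of five. This is an assumption-laden illustration, not Operon's implementation: `apply_patch`, the tier names, and the 0.85 similarity threshold are all hypothetical, and the fuzzy tier uses `difflib.SequenceMatcher` from the standard library as a stand-in for whatever matcher the project actually uses.

```python
import difflib


def apply_patch(text: str, search: str, replace: str, threshold: float = 0.85) -> tuple[str, str]:
    """Return (patched_text, tier_used); raise ValueError if no tier matches."""
    # Tier 1: exact substring match, accepted only when it is unambiguous.
    if text.count(search) == 1:
        return text.replace(search, replace), "exact"

    lines = text.splitlines()
    want = search.splitlines()
    n = len(want)
    if n == 0:
        raise ValueError("empty SEARCH block")
    tail = "\n" if text.endswith("\n") else ""
    windows = [(i, lines[i:i + n]) for i in range(len(lines) - n + 1)]

    def splice(i: int) -> str:
        # Replace the n-line window starting at line i with the REPLACE block.
        return "\n".join(lines[:i] + replace.splitlines() + lines[i + n:]) + tail

    # Tier 2: lines are equal once leading/trailing whitespace is ignored.
    stripped = [w.strip() for w in want]
    hits = [i for i, win in windows if [ln.strip() for ln in win] == stripped]
    if len(hits) == 1:
        return splice(hits[0]), "whitespace"

    # Tier 3: best-scoring multi-line window above the similarity threshold.
    best_ratio, best_i = 0.0, -1
    for i, win in windows:
        ratio = difflib.SequenceMatcher(None, "\n".join(win), search).ratio()
        if ratio > best_ratio:
            best_ratio, best_i = ratio, i
    if best_ratio >= threshold:
        return splice(best_i), "fuzzy"

    raise ValueError("SEARCH block not found")
```

Each tier runs only when the stricter one before it fails, and an ambiguous match (more than one hit) falls through rather than guessing. The looser tiers insert the REPLACE block verbatim, so indentation has to be correct in the replacement itself.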

github.com/qasimio/Operon


MQNotebook — Enterprise RAG System

Python LlamaIndex Streamlit

A retrieval system designed for the documents that most RAG demos ignore: scanned PDFs, PPTX speaker notes, multi-sheet Excel files. The engineering focus is retrieval correctness under real document conditions.

Key engineering decisions:

  • Custom OCR pipeline (Tesseract + Poppler) handles flattened text and image-only pages
  • Hybrid search: dense vector retrieval re-ranked by a cross-encoder, yielding a ~40% precision improvement over cosine similarity alone
  • Session-isolated storage handler resolves Windows file-locking failures (WinError 32) in persistent vector stores
  • Context injection optimized to reduce token usage by ~60% without precision loss
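The retrieve-then-rerank pattern from the list above can be sketched in plain Python. Everything here is illustrative: `retrieve_then_rerank` and `overlap_score` are hypothetical names, the embeddings are hand-written toy vectors, and the token-overlap scorer is only a stand-in for a real cross-encoder model, which would score each (query, passage) pair jointly.

```python
import math
from typing import Callable


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def overlap_score(query: str, passage: str) -> float:
    """Stand-in for a cross-encoder: fraction of query tokens present in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0


def retrieve_then_rerank(
    query: str,
    query_vec: list[float],
    docs: list[tuple[str, list[float]]],
    pair_score: Callable[[str, str], float] = overlap_score,
    top_k: int = 3,
    top_n: int = 2,
) -> list[str]:
    # Stage 1: cheap vector search narrows the corpus to top_k candidates.
    candidates = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)[:top_k]
    # Stage 2: the more expensive pairwise scorer reorders only those candidates.
    reranked = sorted(candidates, key=lambda d: pair_score(query, d[0]), reverse=True)
    return [text for text, _ in reranked[:top_n]]
```

The design point is that the pairwise scorer sees only `top_k` passages, so its cost stays bounded regardless of corpus size.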

github.com/qasimio/MQNotebook · Live Demo


DevShelf — Vertical Search Engine (from First Principles)

Java IR

A search engine for Computer Science literature built without Lucene or ElasticSearch. The goal was to understand retrieval at the data-structure level before building RAG systems on top of it.

Key engineering decisions:

  • Positional inverted index with average-case O(1) keyword lookup via custom hashing
  • Offline Indexer pre-processes corpora at build time, pushing query latency to sub-millisecond
  • Trie for prefix completion, with lookup in O(L) for a prefix of length L; Levenshtein distance for fuzzy matching
  • TF-IDF scoring with behavioral re-ranking
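To make the index and scoring bullets concrete, here is a minimal Python sketch of a positional inverted index with a basic TF-IDF score. DevShelf itself is written in Java with custom hashing; this version assumes Python's built-in `dict` in its place, uses whitespace tokenization, and leaves out behavioral re-ranking. The class and method names are hypothetical.

```python
import math
from collections import defaultdict


class PositionalIndex:
    def __init__(self) -> None:
        # term -> {doc_id -> [positions]}; dict lookup is average-case O(1).
        self.postings: dict[str, dict[int, list[int]]] = defaultdict(dict)
        self.doc_count = 0

    def add(self, doc_id: int, text: str) -> None:
        """Index one document, recording the position of every token."""
        self.doc_count += 1
        for pos, term in enumerate(text.lower().split()):
            self.postings[term].setdefault(doc_id, []).append(pos)

    def phrase(self, first: str, second: str) -> list[int]:
        """Documents where `second` occurs immediately after `first`."""
        a = self.postings.get(first, {})
        b = self.postings.get(second, {})
        matches = []
        for doc_id in a.keys() & b.keys():
            second_positions = set(b[doc_id])
            if any(p + 1 in second_positions for p in a[doc_id]):
                matches.append(doc_id)
        return sorted(matches)

    def tf_idf(self, term: str, doc_id: int) -> float:
        """Raw term frequency times log inverse document frequency."""
        docs = self.postings.get(term, {})
        if doc_id not in docs:
            return 0.0
        tf = len(docs[doc_id])
        idf = math.log(self.doc_count / len(docs))
        return tf * idf
```

Storing positions, not just document IDs, is what makes the phrase query possible without rescanning the documents.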

The classical IR knowledge from DevShelf directly informs the hybrid search design in MQNotebook.
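The prefix-completion and fuzzy-matching pieces listed for DevShelf can be sketched in the same spirit. Again this is Python rather than the project's Java, and the names are illustrative: a trie whose prefix walk costs O(L) in the prefix length, and the standard two-row dynamic-programming form of Levenshtein distance.

```python
class Trie:
    def __init__(self) -> None:
        self.children: dict[str, "Trie"] = {}
        self.is_word = False

    def insert(self, word: str) -> None:
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def complete(self, prefix: str) -> list[str]:
        """Return every stored word that starts with `prefix`, sorted."""
        node = self
        for ch in prefix:  # O(L) walk down to the prefix node
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack:  # collect all words in the subtree below the prefix
            cur, word = stack.pop()
            if cur.is_word:
                results.append(word)
            for ch, child in cur.children.items():
                stack.append((child, word + ch))
        return sorted(results)


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]
```

Only the walk to the prefix node is O(L); enumerating completions additionally costs time proportional to the size of the subtree below it.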

github.com/qasimio/DevShelf


foldr — Intelligent File Automation CLI

PyPI · Downloads · Python · Platform · License

A published CLI tool for organizing large, messy file collections by extension — with background watch mode, undo, deduplication, and custom category configuration. Designed for data pipeline preparation and keeping directories clean without any manual effort.

pip install foldr                          # one install, everything included

foldr ~/Downloads --preview               # see exactly what will happen
foldr ~/Downloads                         # organize (preview → confirm → move)
foldr ~/Downloads --recursive --depth 2   # include subdirectories
foldr watch ~/Downloads                   # organize now + keep watching forever
foldr undo                                # restore the last operation

Key engineering decisions:

  • Background daemon architecture: foldr watch spawns a detached OS-native subprocess; the calling terminal returns immediately while the daemon runs indefinitely, using inotify (Linux), kqueue (macOS), or ReadDirectoryChangesW (Windows), so CPU usage stays near zero when idle
  • Initial scan + continuous watch: daemon organizes all existing files on start, then reacts to every subsequent file creation, modification, or drag-in — no stale state, no one-time-only moves
  • JSON undo system: every organize operation writes an immutable history entry; foldr undo reverses any past operation independently without requiring sequential rollback
  • Zero external dependency for core output: ANSI rendering via ctypes (Win10+) with colorama fallback — no rich, no pyfiglet, no curses
  • Conflict-safe moves: resolves filename collisions by appending _(1), _(2) — never overwrites
  • Dry-run architecture: previews the complete I/O plan before any file is touched; directories are never modified

foldr ~/Downloads --dedup keep-newest     # remove duplicate files (irreversible; always preview first)
foldr history                             # browse past operations
foldr config --edit                       # open category config in editor
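The dry-run and conflict-safe bullets above fit together naturally: build the complete move plan first, resolve name collisions inside the plan, and touch the filesystem only in a separate step. The sketch below shows that shape under stated assumptions. It is not foldr's source; `CATEGORIES`, `conflict_safe`, `plan_moves`, and `execute` are hypothetical, and the category map is invented for the example.

```python
from pathlib import Path

# Hypothetical category map; foldr's real defaults may differ.
CATEGORIES = {".pdf": "Documents", ".txt": "Documents", ".png": "Images", ".jpg": "Images"}


def conflict_safe(target: Path, taken: set[Path]) -> Path:
    """Append _(1), _(2), ... until the name is free on disk and in the plan."""
    candidate, n = target, 0
    while candidate.exists() or candidate in taken:
        n += 1
        candidate = target.with_name(f"{target.stem}_({n}){target.suffix}")
    return candidate


def plan_moves(root: Path) -> list[tuple[Path, Path]]:
    """Build the full (source, destination) plan without touching any file."""
    plan: list[tuple[Path, Path]] = []
    taken: set[Path] = set()
    for src in sorted(p for p in root.iterdir() if p.is_file()):
        folder = CATEGORIES.get(src.suffix.lower(), "Other")
        dest = conflict_safe(root / folder / src.name, taken)
        taken.add(dest)
        plan.append((src, dest))
    return plan


def execute(plan: list[tuple[Path, Path]]) -> None:
    """Apply a previously previewed plan."""
    for src, dest in plan:
        dest.parent.mkdir(parents=True, exist_ok=True)
        src.rename(dest)
```

Because `plan_moves` only reads the directory, printing its result is the preview; `execute` is the only function that writes. A production version would also need to handle files that appear between planning and execution.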

github.com/qasimio/foldr · pypi.org/project/foldr
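The JSON undo design described for foldr can be illustrated with an append-only history file. The sketch assumes a JSON Lines layout and an operation ID per entry; neither detail is confirmed by the project, and `record` and `undo` are hypothetical helper names.

```python
import json
from pathlib import Path


def record(history_file: Path, op_id: str, moves: list[tuple[str, str]]) -> None:
    """Append one history entry as a JSON line; earlier entries are never rewritten."""
    entry = {"id": op_id, "moves": [{"src": s, "dest": d} for s, d in moves]}
    with history_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def undo(history_file: Path, op_id: str) -> int:
    """Reverse the moves of one operation, independent of any later operations."""
    restored = 0
    for line in history_file.read_text(encoding="utf-8").splitlines():
        entry = json.loads(line)
        if entry["id"] != op_id:
            continue
        for move in reversed(entry["moves"]):
            src, dest = Path(move["src"]), Path(move["dest"])
            # Only move back files that are still where the operation put them.
            if dest.exists() and not src.exists():
                src.parent.mkdir(parents=True, exist_ok=True)
                dest.rename(src)
                restored += 1
    return restored
```

Since each entry carries its own source and destination paths, undoing one operation never requires replaying or rolling back the operations recorded after it.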


Tech Stack

Languages

Python · Java · C++ · Bash

AI · Agents · Retrieval

LlamaIndex · LangChain · HuggingFace · PyTorch

Vector & Search

LanceDB · ChromaDB · Tesseract

Systems & Tooling

Linux · Docker · GitHub Actions · Textual · Watchdog · PyPI


Other Work

Project | What it demonstrates
BabyGPT | Character-level LSTM language model from scratch in TensorFlow
Sentiment Filter | NLP edge cases: negation paradox, context-sensitivity
MQ Banking Core | Low-level transactional system in C++ with file-level I/O
Digital Eye | CNN-based image classification pipeline in Keras

Currently Working On

  • Extending Operon's symbol graph to JS/TS via Babel AST integration
  • LSP server mode for editor integration
  • Structured output evaluation framework for RAG retrieval quality


Building systems that make LLMs reliable, not just capable.

Pinned

  1. DevShelf (Public)

    Blazing-fast offline search for 250+ CS books, built from scratch in Java.

    Java 12

  2. MQNotebook (Public)

    Enterprise-grade RAG and document search system for extracting reliable insights from real-world data.

    Python 11

  3. Foldr (Public)

    Clean folders instantly with preview and full undo. Fast, safe, cross-platform.

    Python 9

  4. Scripting (Public)

    Shell 18