Toolkit for fine-tuning, ablating and unit-testing open-source LLMs.
Updated Oct 25, 2024 - Python
Distribution-transparent machine learning experiments on Apache Spark
This project implements 30+ variants of ANN algorithms to find the K nearest neighbors in high-dimensional vector spaces. It is meant as a convenient sandbox: drop in your own ANN code, run a one-liner, and instantly compare build/search speed and recall against the bundled baselines.
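As a rough illustration of the recall comparison described above, the sketch below scores an approximate result against an exact brute-force baseline. The function names `brute_force_knn` and `recall_at_k` are hypothetical and are not taken from the project; squared Euclidean distance is assumed as the metric.

```python
def brute_force_knn(query, points, k):
    """Exact k nearest neighbors by scanning every point (the ground-truth baseline)."""
    # Squared Euclidean distance preserves the neighbor ordering and avoids a sqrt.
    scored = [
        (sum((q - p) ** 2 for q, p in zip(query, point)), index)
        for index, point in enumerate(points)
    ]
    scored.sort()
    return [index for _, index in scored[:k]]


def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbors that the approximate search returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
exact = brute_force_knn((0.0, 0.0), points, k=2)
# Pretend an ANN index returned ids [0, 2]; only id 0 is a true top-2 neighbor.
print(recall_at_k([0, 2], exact, k=2))
```

A real benchmark would average this score over many queries and report it alongside build and search time.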
Do models distinguish between declared-true and declared-false premises?
Attentively Embracing Noise for Robust Latent Representation in BERT (COLING 2020)
O(N) attention with a bounded inference KV cache. D4 Daubechies wavelet field + content-gated Q·K gather at dyadic offsets.
Reproducible research comparing GNN (GraphSAGE, GCN, GAT) vs ML baselines (XGBoost, RF) on Elliptic++ Bitcoin fraud detection. Features ablation experiments revealing when tabular models outperform graph neural networks.
Emotiwave is a research project investigating how well AI systems can recognise human emotions from video when one or more sensors fail. The core question: if you lose the audio, or the camera, or the transcript — does the system fall apart, or does it adapt?
Multi-agent verification for AI outputs: claim verification, RAG diagnostics, and pre-action verification for agentic AI. Includes ablation studies of multi-agent vs single-prompt tradeoffs, FaithBench benchmarks, and a bias-triggering evaluation methodology.
Machine learning analysis of an imbalanced dataset. Developed as the final project for the course "Machine Learning and Intelligent Systems" at Eurecom, Sophia Antipolis.
Six Ways to Forget: Biologically-grounded forgetting mechanisms for LLM agent memory systems. 18 experiments, 4 falsified hypotheses, STDP ablation (Cohen's d = 3.163).
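For readers unfamiliar with the effect size quoted above, here is a minimal sketch of Cohen's d. It assumes the common pooled-standard-deviation form for two independent samples; the description does not say which variant the project used, and the function name `cohens_d` and the sample values are illustrative only.

```python
import math
import statistics


def cohens_d(sample_a, sample_b):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / pooled_sd


# Illustrative numbers only: scores with and without an ablated component.
print(cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0]))
```

By the usual rule of thumb, values around 0.8 are already called large, so a d above 3 indicates almost non-overlapping score distributions.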
🧠 Automated neural network ablation studies using LLM agents and LangGraph. Systematically remove components, test performance, and gain insights into architecture importance through an intelligent multi-agent workflow.
A multimodal deep learning project for classifying mental health-related memes, combining both textual and visual features.
Re-implementation of the paper titled "Noise against noise: stochastic label noise helps combat inherent label noise" from ICLR 2021.
Evaluation framework for self-hosted LLMs. Systematic prompt ablation (baseline, CoT, few-shot, self-consistency voting) on Llama 3.1 8B via lm-evaluation-harness, with Wilson CI statistical analysis, determinism validation, and load testing under concurrency. Found chain-of-thought degrades accuracy 25pp at small scale.
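The Wilson confidence interval mentioned above can be sketched as follows. This is the standard Wilson score interval for a binomial proportion, not code from the repository; the function name `wilson_interval` and the default z = 1.96 (roughly 95% coverage) are assumptions.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion such as benchmark accuracy."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Example: 50 correct answers out of 100 prompts.
low, high = wilson_interval(50, 100)
print(f"accuracy 0.50, 95% CI [{low:.3f}, {high:.3f}]")
```

Unlike the plain normal approximation, the interval stays inside [0, 1] even when accuracy is near 0 or 1, which matters when comparing prompt variants on small evaluation sets.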
Phase zero of Artificial Neuroplasticity: giving models self-editing capacity through a trained triumvirate of models: Analyzer, Trainee, and Evaluator. The Analyzer uses TransformerLens to watch the Trainee. The Evaluator is the Review Board, confirming that the Trainee has become smarter than it was. This IS NOT implemented in this phase zero.
A hands-on series of 6 Jupyter notebooks that build a GPT-style language model from absolute scratch, one component at a time. Each notebook adds a single architectural element, trains it on Shakespeare, and measures the improvement — creating a reverse ablation study that shows exactly what each piece contributes.
Binary image classification project to detect drones vs non-drone aerial objects (birds) using a pretrained ResNet-18 model. Built with PyTorch and transfer learning, includes class-imbalance handling, validation metrics, confusion matrix analysis, and an ablation study comparing frozen vs fine-tuned backbones.
Ablation Study of CapsuleNetwork on TimeSeries
Intelligent layer pruning toolkit for LLMs featuring iterative optimization, self-healing algorithms, and comprehensive benchmarking.