1 result for tag "obliteratus-abliteration"
Operates OBLITERATUS, an open-source toolkit that locates refusal directions in LLM hidden states via SVD/PCA and projects them out of the weights to remove guardrails while preserving language capability. Ships with a Gradio UI, a CLI (basic/advanced/informed/LoRA methods, strength sweeps), a Python API, and a Colab notebook.