We accidentally de-censored the model! Qwen-instruct, which we use, is censored and aligned. DeepSeek-R1 distilled models are censored and aligned. When we SFT the Qwen model on reasoning data in math and code, we get a model that is de-censored, but not unaligned. If you are worried about censored models, give this model a go! You can try it on
@ollama: `ollama run openthinker:32b`. Or it's pretty easy to run with vLLM.
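
If you'd rather poke at it from a script than the CLI, here is a minimal sketch that talks to a local Ollama server over its REST API. It assumes you have already pulled the model with the `ollama run` command above and that Ollama is listening on its default port (11434). The helper names `build_request` and `ask` and the sample prompt are made up for illustration, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="openthinker:32b", url=OLLAMA_URL):
    """Build a non-streaming generate request (hypothetical helper)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the prompt and return the generated text (hypothetical helper)."""
    with urllib.request.urlopen(build_request(prompt, **kwargs)) as resp:
        # With "stream": False the server replies with a single JSON object
        # whose "response" field holds the full completion.
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Sample prompt chosen for illustration only.
    print(ask("What is the sum of the first 10 odd numbers?"))
```

Since this is a reasoning model, expect the reply to contain a long thinking trace before the final answer, so it can take a while on a 32B model.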