We accidentally de-censored the model! Qwen-instruct which we use is censored and aligned. DeepSeek-R1 distilled models are censored and aligned. When we SFT the Qwen model with reasoning data in math and code, we get a model that is decensored, but not unaligned. If you are https://t.co/MPMh0KoBr8

1 min read Original article ↗

We accidentally de-censored the model! Qwen-instruct which we use is censored and aligned. DeepSeek-R1 distilled models are censored and aligned. When we SFT the Qwen model with reasoning data in math and code, we get a model that is decensored, but not unaligned. If you are worried about censored models, give this model a go! You can try on

@ollama

. `ollama run openthinker:32b` Or its pretty easy to run with vllm.