Show HN: Single-agent long-horizon reasoning within one LLM run
huggingface.co
- We built the Thread Inference Model (TIM), based on the transformer architecture, and its dedicated runtime, TIMRUN.
- TIM + TIMRUN = intelligent workflow generation, context engineering, and multi-hop tool use, all handled at the runtime level
- TIM + TIMRUN supports virtually unlimited reasoning through context pruning, which significantly improves efficiency on long-horizon reasoning tasks
- Inference API is live at https://subconscious.dev/
- More details: https://github.com/subconscious-systems/TIMRUN

Awesome! It looks like you’re building a “reasoning tree” approach with runtime-level context engineering and pruning. Quick question: how does the context-pruning mechanism decide which KV states to discard and which to retain? I'm trying to understand how it balances memory efficiency with reasoning depth. I’m excited to sign up and try out the API!
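
On the pruning question, here is a toy sketch of one plausible rule for a reasoning tree. It is not taken from the TIMRUN code, and the post does not confirm it. The assumption is that once a task has a conclusion, its subtasks are dropped from the working context and only the task's own thought and conclusion are kept. The names `Task` and `working_context` are hypothetical, and the sketch works on plain strings rather than real KV states.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    """One node of a hypothetical reasoning tree (not TIM's actual schema)."""
    thought: str
    subtasks: List["Task"] = field(default_factory=list)
    conclusion: Optional[str] = None  # set once the task is finished


def working_context(task: Task) -> List[str]:
    """Return the text that stays in context under the assumed pruning rule.

    A finished task keeps only its thought and conclusion; everything under
    it is pruned. An unfinished task keeps its subtasks expanded.
    """
    if task.conclusion is not None:
        return [task.thought, task.conclusion]
    kept = [task.thought]
    for sub in task.subtasks:
        kept.extend(working_context(sub))
    return kept


# "find flights" is finished, so its two search subtasks are pruned.
root = Task(
    "plan trip",
    subtasks=[
        Task(
            "find flights",
            subtasks=[Task("search airline A"), Task("search airline B")],
            conclusion="flight X is cheapest",
        ),
        Task("find hotel"),
    ],
)
context = working_context(root)
print(context)
```

Under this assumed rule, the retained context grows with the number of open tasks rather than with the total number of reasoning steps, which is one way pruning could trade memory for depth. How TIMRUN actually selects KV states is a question for the authors.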