SOTA Code Retrieval with Efficient Code Embedding Models
qodo.ai

Anyone else concerned that training models on synthetic, LLM-generated data might push us into a linguistic feedback loop? Relying on LLM text for training could bias the next model towards even more overuse of words like "delve", "showcasing", and "underscores"...
SOTA? LoRA? Seems like people are trying to usurp ham radio names for things.