Mistral released Mixtral 8x22B via magnet link, without fanfare. A look at the technical advances, how it compares with 8x7B and GPT-4, and the hardware required to run it.
Gemini 1.5: Millions of Tokens of Context in Production
Gemini 1.5 Pro proved that million-token context windows are real. What changes in RAG and system architectures when the model can swallow an entire book.