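Setting aside the overseas big three, the much-anticipated DeepSeek V4 has still not been released, and whether it will pose a real challenge remains an open question.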
Rank-3 factorization, shared-A tied-KV, rank-2 attn out, tied embed
there’s a lot of other things that i haven’t even touched on here: the optimizations we compute ahead of time for specific patterns, how we detect match prefixes, the differences between UTF-16 and UTF-8 matching and unicode, what the backwards compatibility story is, and how we can skip through text that others cannot. i haven’t even mentioned Hyperscan, which is a very interesting engine that is actually based on a completely different 1961 paper. there’s a lot to share, so perhaps i will look into these topics in the future.