RakutenAI-2.0-8x7B
RakutenAI-2.0-8x7B is a Mixture of Experts (MoE) foundation model that builds on RakutenAI-7B, first released in March 2024. Developed as part of a broader initiative to advance Japanese LLM technology, it combines eight 7B-parameter experts, of which two are active per token, giving roughly 13B active parameters. The router selects experts dynamically for each input token, improving computational efficiency while maintaining high performance. RakutenAI-2.0-8x7B achieves state-of-the-art results on Japanese language understanding benchmarks and remains competitive on English evaluation tasks against comparable models, including Swallow-MX-8x7B-NVE-0.1, Llama-3-Swallow-70B-v0.1, Sarashina2-70B, and PLaMo 100B.
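For readers unfamiliar with top-2 expert routing, the toy sketch below illustrates the general idea of activating only two experts per token. All names, dimensions, and layer choices here are illustrative assumptions for exposition; this is not RakutenAI's actual routing code.

```python
import torch
import torch.nn.functional as F

# Illustrative toy example of top-2 MoE routing; not RakutenAI's implementation.
num_experts, top_k, hidden = 8, 2, 16                       # assumed toy dimensions
tokens = torch.randn(4, hidden)                             # 4 input token embeddings
router = torch.nn.Linear(hidden, num_experts)               # router scores each expert per token
experts = [torch.nn.Linear(hidden, hidden) for _ in range(num_experts)]

logits = router(tokens)                                     # (4, num_experts) routing logits
weights, chosen = torch.topk(F.softmax(logits, dim=-1), top_k, dim=-1)
weights = weights / weights.sum(dim=-1, keepdim=True)       # renormalize over the 2 selected experts

# Each token is processed only by its 2 selected experts,
# which is why only a fraction of the total parameters is active per token.
output = torch.zeros_like(tokens)
for i, token in enumerate(tokens):
    for w, e in zip(weights[i], chosen[i]):
        output[i] += w * experts[int(e)](token)
```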
RakutenAI-2.0-8x7B-instruct is an instruction-tuned variant of RakutenAI-2.0-8x7B, designed to push the boundaries of Japanese large language models (LLMs). Developed as part of Rakuten's systematic effort to refine AI capabilities, it builds on the strengths of the foundation model, excelling at instruction-following tasks while maintaining fluency, coherence, and contextual awareness.
Model evaluation results can be found in our HuggingFace repository.
Model Downloads
| Model | Type | Download |
|---|---|---|
| RakutenAI-2.0-8x7B | Foundation Model | 🤗 HuggingFace |
| RakutenAI-2.0-8x7B-instruct | Instruction-tuned Model | 🤗 HuggingFace |
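The weights can be fetched programmatically with the `huggingface_hub` library. The repository IDs below are assumptions inferred from the model names in the table; verify them against the HuggingFace links above.

```python
from huggingface_hub import snapshot_download

# Assumed repository IDs; confirm against the links in the table above.
foundation_repo = "Rakuten/RakutenAI-2.0-8x7B"
instruct_repo = "Rakuten/RakutenAI-2.0-8x7B-instruct"

# Download the instruction-tuned weights into the local HuggingFace cache
# (pass local_dir=... to place them in a specific directory instead).
local_path = snapshot_download(repo_id=instruct_repo)
print(f"Model files downloaded to: {local_path}")
```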
Getting Started
You can refer to our "Getting Started" guide for a step-by-step tutorial on how to incorporate our models into your own applications.
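As a quick orientation before following the full guide, the sketch below shows one common way to load the instruction-tuned model with the `transformers` library and run a single generation. The repository ID, dtype, and the availability of a chat template are assumptions; adapt them to the official "Getting Started" instructions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository ID for the instruction-tuned model; verify against the table above.
model_id = "Rakuten/RakutenAI-2.0-8x7B-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the MoE checkpoint is large; bf16 + device_map help it fit on GPUs
    device_map="auto",
)

# Assumes the tokenizer ships a chat template; otherwise format the prompt manually.
messages = [{"role": "user", "content": "日本語で自己紹介してください。"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```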