Google has introduced Gemma, a new family of open models based on the research and technology of the existing Gemini models. Gemma is available in two sizes, Gemma 2B and Gemma 7B, each offering pre-trained and instruction-tuned variants.
Developers can start working with Gemma today through free access on Kaggle and a free tier for Colab notebooks. First-time Google Cloud users can also get $300 in credits, while researchers can apply for up to $500,000 in Google Cloud credits to accelerate their projects.
According to Google's published results, Gemma outperforms Llama 2 on several benchmarks, including MMLU, HellaSwag, and HumanEval.
To promote developer innovation and responsible use, Google is offering a Responsible Generative AI Toolkit alongside the models. This toolkit provides essential tools for creating safer AI applications with Gemma, offering guidance and support for developers.
Gemma is compatible with major frameworks, including JAX, PyTorch, and TensorFlow, through native Keras 3.0. The release includes ready-to-use Colab and Kaggle notebooks, as well as integration with popular tools such as Hugging Face, MaxText, NVIDIA NeMo, and TensorRT-LLM.
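As a minimal sketch of the Hugging Face route, the snippet below formats a prompt in Gemma's instruction-tuned turn markers and shows how a checkpoint could be loaded with the `transformers` library. The model id `google/gemma-7b-it` and the turn-marker format are assumptions based on the Hugging Face release; downloading the weights requires accepting the Gemma license and is deliberately left inside an uncalled helper.

```python
def format_gemma_prompt(user_message: str) -> str:
    """Wrap a user message in the turn markers used by Gemma's
    instruction-tuned variants (format assumed from the HF release)."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


def generate(prompt: str, model_id: str = "google/gemma-7b-it") -> str:
    """Sketch only -- downloads several GB of weights and needs a GPU
    for reasonable speed, so it is defined but not called here."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    inputs = tokenizer(format_gemma_prompt(prompt), return_tensors="pt")
    inputs = inputs.to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

A call like `generate("Explain beam search briefly.")` would then return the decoded completion; swapping the model id for `google/gemma-2b-it` (if available) would trade quality for a smaller download.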
Gemma models run across a range of platforms, from laptops and workstations to Google Cloud, and are optimized for top performance on NVIDIA GPUs and Google Cloud TPUs.
This announcement follows Google's recent introduction of Gemini 1.5 with a 1 million token context window, the longest of any widely available foundation model to date. For comparison, GPT-4 Turbo offers a 128K-token context window and Claude 2.1 a 200K-token window.