← Back to Browse
GLM-130B
G

GLM-130B

GLM-130B, showcased at ICLR 2023, represents a groundbreaking open bilingual pre-trained model that stands out with its impressive 130 billion parameters. Developed for bidirectional dense modeling in

Otherfree
Visit Site →

8,514

Votes

12,523

Views

4,569

Bookmarks

About

GLM-130B, showcased at ICLR 2023, represents a groundbreaking open bilingual pre-trained model that stands out with its impressive 130 billion parameters. Developed for bidirectional dense modeling in both English and Chinese, the GLM-130B leverages the General Language Model (GLM) algorithm for pre-training and is optimized to run inference tasks on a single server setup whether it be the A100 (40G 8) or the V100 (32G 8). Furthermore, its compatibility with INT4 quantization means that the already modest hardware requirements can be reduced even further, allowing a server with 4 * RTX 3090 (24G) to support the model with minimal performance degradation. As part of its training process, the GLM-130B has digested an extensive dataset consisting of over 400 billion text tokens, equally divided between Chinese and English. It boasts exceptional bilingual support, superior performance across various datasets when compared to its counterparts, and offers fast inference times. Additionally, this repository also promotes reproducibility by facilitating open-source code and model checkpoints for over 30 tasks.

Key Features

  • Bilingual Support: GLM-130B caters to both English and Chinese language models.
  • High Performance: Comprehensive benchmarks show GLM-130B outperforming rival models across diverse datasets.
  • Fast Inference: Utilizes SAT and FasterTransformer for rapid inference on a single A100 server.
  • Reproducibility: Consistent results across more than 30 tasks, thanks to open-source code and model checkpoints.
  • Cross-Platform Compatibility: Accommodates a range of platforms including NVIDIA, Hygon DCU, Ascend 910, and Sunway.

FAQ

What is GLM-130B?

GLM-130B is a bilingual, bidirectional dense model with 130 billion parameters, pre-trained using the General Language Model (GLM) algorithm.

How much data was GLM-130B trained on?

The model was trained on over 400 billion text tokens, with 200 billion each for Chinese and English text.

Can the results produced by GLM-130B be reproduced?

Yes, all results across more than 30 tasks can be easily reproduced using provided open-source code and model checkpoints.

Does GLM-130B support multiple hardware platforms?

GLM-130B supports not only NVIDIA but also Hygon DCU, Ascend 910, and soon, Sunway platforms for training and inference.

What is the main focus of the GLM-130B repository?

The repository mainly focuses on the evaluation of GLM-130B, supporting fast model inference and reproducibility of results.

You may also like

More tools in Other

View all →
PureCode.ai
P

PureCode.ai

A tool to automate coding tasks through codebase-aware code generation.

SuperU AI
S

SuperU AI

A nocode tool to create voice AI agents for customer communications.

@kuki_ai
@

@kuki_ai

Welcome to the world of Kuki, an award-winning artificial intelligence designed to bring entertainment to the digital age. Dive into engaging conversations with AI that's crafted to provide not just r

LLM Council
L

LLM Council

A tool to compare and synthesize multiple LLM responses.

Wan 2.7 AI Video Generator
W

Wan 2.7 AI Video Generator

Wan 2.7 AI Video Generator transforms still images into high-quality, realistic 1080P videos with dynamic motion and advanced controls. It targets creators, marketers, e-commerce professionals, and di

Integral Calculator - Wolfram|Alpha
I

Integral Calculator - Wolfram|Alpha

The Integral Calculator provided by Wolfram|Alpha is a comprehensive tool designed for professionals, educators, students, and anyone with a need to solve complex mathematical integrals. By leveraging

AI Dungeon
A

AI Dungeon

AI Dungeon is a text-based adventure game where you lead the story and the AI creates the world around you. It offers endless possibilities by generating unique characters, settings, and scenarios bas

Verbacall
V

Verbacall

A platform that automatically answers, qualifies, and follows up on calls 24/7.

PrompTessor
P

PrompTessor

A tool that optimizes text for clarity, tone, and grammar without requiring prompt engineering skills.

AptlyStar.AI
A

AptlyStar.AI

A tool to create and manage AI bots for businesses.

G3D.AI {Jedi}
G

G3D.AI {Jedi}

G3D.AI {Jedi} is a generative AI tool for game creation that enables game creators to build beautiful and novel games in a fraction of the time. With a suite of tools designed to supercharge creativit

Aigur Client
A

Aigur Client

A open-source platform for creating and running Generative AI pipelines for text modification, image manipulation, and more.