Red Hot Cyber


Tencent Challenges the Giants! New Hunyuan-MT Beats Google Translate and GPT-4.1

Redazione RHC : 3 September 2025 09:54

Chinese company Tencent has released the source code of Hunyuan-MT, a new set of language models optimized specifically for translation. The developers claim that the models outperform Google Translate on the popular WMT25 benchmark.

The set comprises four models, including two flagship models, Hunyuan-MT-7B and Hunyuan-MT-Chimera-7B, each with 7 billion parameters. Two compressed versions are also available; they use less memory at the cost of a slight loss in translation quality.
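To give a sense of why compressed variants matter, the following back-of-the-envelope sketch estimates the memory needed for the weights of a 7-billion-parameter model at different precisions. The article does not say how Tencent compressed the models, so the bit widths below (16, 8 and 4) are illustrative assumptions, and the estimate ignores activations and other runtime overhead.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS_7B = 7_000_000_000  # parameter count stated in the article

# Assumed precisions, not confirmed by the article.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(PARAMS_7B, bits):.1f} GB")
# prints:
# 16-bit weights: 14.0 GB
# 8-bit weights: 7.0 GB
# 4-bit weights: 3.5 GB
```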

Tencent used four datasets for training. Two contained untranslated text in 33 languages, while the other two contained several million sentence pairs, each consisting of a sentence and its translation. This approach allowed the models to combine language knowledge with general knowledge.

The models' general knowledge was measured with MMLU-Pro, a benchmark designed for that purpose. Hunyuan-MT scored higher than Llama-3-8B-Base despite having fewer parameters.

After initial training, the models underwent an additional training step using reinforcement learning. Tencent gave them translation tasks and feedback on the quality of their output, which helped improve their accuracy.

Translation quality was assessed by a separate AI system, which analyzed how closely the meaning of each translation matched the original and whether terminology from different fields was used correctly.
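As a rough illustration of how such feedback could be turned into a single training signal, the sketch below combines a semantic score with a terminology score. Everything in it is hypothetical: the article does not describe Tencent's reward function, so the glossary check, the 0.7 weight and the function names are assumptions, and the semantic score is simply passed in as a number that a separate judge model would supply.

```python
from typing import Dict


def terminology_score(source: str, translation: str, glossary: Dict[str, str]) -> float:
    """Fraction of glossary terms found in the source whose required
    target term also appears in the translation (hypothetical check)."""
    required = [tgt for src, tgt in glossary.items() if src.lower() in source.lower()]
    if not required:
        return 1.0  # nothing to check, so no penalty
    hits = sum(1 for tgt in required if tgt.lower() in translation.lower())
    return hits / len(required)


def reward(semantic: float, terminology: float, semantic_weight: float = 0.7) -> float:
    """Weighted mix of the two scores; the weight is an assumed value."""
    return semantic_weight * semantic + (1 - semantic_weight) * terminology


# Toy example with a made-up glossary and a made-up semantic score.
glossary = {"firewall": "pare-feu", "malware": "logiciel malveillant"}
term = terminology_score(
    "The firewall blocked the malware.",
    "Le pare-feu a bloqué le malware.",
    glossary,
)
print(term)               # 0.5: one of the two required terms is present
print(reward(0.9, term))  # approximately 0.78
```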

The first model in the series, Hunyuan-MT-7B, uses a standard language model architecture. The Chimera-7B variant uses an ensemble method: several neural networks process a query simultaneously, and their responses are then combined into a higher-quality final version.
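The ensemble idea can be sketched in a few lines. This is not Tencent's implementation: the translator callables are toy stand-ins for real models, and the fusion step here is a simple majority vote, whereas the article only says that the responses are "combined". All names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Translator = Callable[[str], str]


def ensemble_translate(
    source: str,
    translators: List[Translator],
    fuse: Callable[[str, List[str]], str],
) -> str:
    """Run every translator on the same input concurrently, then fuse the candidates."""
    with ThreadPoolExecutor(max_workers=len(translators)) as pool:
        candidates = list(pool.map(lambda t: t(source), translators))
    return fuse(source, candidates)


def majority_vote(source: str, candidates: List[str]) -> str:
    """Stand-in fusion step: return the most frequent candidate."""
    return Counter(candidates).most_common(1)[0][0]


# Toy translators that return fixed strings instead of calling real models.
toy_translators: List[Translator] = [
    lambda s: "Good morning",
    lambda s: "Good day",
    lambda s: "Good morning",
]

print(ensemble_translate("Buongiorno", toy_translators, majority_vote))
# prints: Good morning
```

Passing the fusion step in as a function keeps the sketch neutral about how candidates are merged; a learned fusion model could be swapped in for the vote without changing the rest.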

In the WMT25 tests, which compared translations across 31 language pairs, Hunyuan-MT outperformed Google Translate in 30, with scores up to 65% higher for some pairs.
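The arithmetic behind these figures is straightforward. The sketch below computes the win rate from the counts given above and shows what a 65% relative gain would look like; the two scores in the second example are made-up numbers chosen only to illustrate the formula, and it is an assumption that "65% higher" refers to a relative rather than an absolute difference.

```python
def win_rate(wins: int, total: int) -> float:
    """Share of language pairs won."""
    return wins / total


def relative_gain(score: float, baseline: float) -> float:
    """Relative improvement of score over baseline."""
    return (score - baseline) / baseline


# Counts from the article: 30 wins out of 31 language pairs.
print(f"{win_rate(30, 31):.1%}")  # prints: 96.8%

# Hypothetical scores, for illustration only.
print(f"{relative_gain(0.66, 0.40):.0%}")  # prints: 65%
```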

Tencent’s models also outperformed GPT-4.1 and Anthropic’s Claude 4 Sonnet on most language pairs in the same benchmark.

Redazione
The editorial team of Red Hot Cyber consists of a group of individuals and anonymous sources who actively collaborate to provide early information and news on cybersecurity and computing in general.
