Rooted in Taiwan, APMIC Has Secured a Position on the Top Global LLM Leaderboard
March 29, 2023 at 10:00:00 PM
Industry Highlights, Company News

In recent years, AI and the stories revolving around it have been on everyone's lips, with LLMs becoming a focus of the research field. These LLMs stand out for their excellent natural language processing capabilities and their ability to cope with complex language tasks such as text generation, classification, and sentiment analysis. However, to fully evaluate the performance of these models and choose the best one possible, an accurate leaderboard is needed as a reference: the Open LLM Leaderboard hosted by Hugging Face, the company behind the well-known Transformers library for NLP applications. In early 2024, APMIC's LLM ranked 1st in the Taiwan region and 64th globally on this leaderboard, with an average score of 71.19.
The benchmarks (datasets) used to evaluate an LLM on the leaderboard include common-sense sentence completion (HellaSwag), common-sense fill-in-the-blank reasoning (WinoGrande), and massive multitask language understanding (MMLU). Together, these datasets pose a variety of challenges that evaluate the capabilities of language models as comprehensively as possible.
HellaSwag (HS) is a challenging dataset that evaluates a language model's common-sense reasoning through sentence-completion tasks. It differs from traditional language-comprehension tests because its incorrect endings are adversarially filtered to look plausible, so HS questions usually contain confusing context or non-intuitive answers.
WinoGrande (WG) is a large-scale dataset published in 2019 by Keisuke Sakaguchi and fellow researchers at the Allen Institute for AI and the University of Washington. Inspired by the original Winograd Schema Challenge (WSC), it expands that benchmark to roughly 44,000 fill-in-the-blank problems, each offering two candidate answers. WG challenges a model's common-sense reasoning at scale, providing a more comprehensive evaluation than the original WSC.
Massive Multitask Language Understanding (MMLU) is a composite benchmark that evaluates language models with multiple-choice questions spanning 57 subjects, from elementary mathematics and history to law and computer science. MMLU thus provides a broad evaluation framework, assessing the performance of language models across many domains of knowledge.
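For readers who want a closer look at these benchmarks, all three datasets are publicly available on the Hugging Face Hub. Below is a minimal sketch using the datasets library, assuming the public Hub identifiers hellaswag, winogrande, and cais/mmlu; it is purely illustrative and not part of APMIC's or the leaderboard's evaluation pipeline:

```python
from datasets import load_dataset

# HellaSwag: sentence completion with four adversarially filtered endings
hellaswag = load_dataset("hellaswag", split="validation")
print(hellaswag[0]["ctx"])      # the context to be completed
print(hellaswag[0]["endings"])  # four candidate endings; "label" marks the correct one

# WinoGrande: fill-in-the-blank sentences with two candidate answers
winogrande = load_dataset("winogrande", "winogrande_xl", split="validation")
print(winogrande[0]["sentence"])  # sentence containing a blank ("_")
print(winogrande[0]["option1"], winogrande[0]["option2"])

# MMLU: multiple-choice questions spanning 57 subjects
mmlu = load_dataset("cais/mmlu", "all", split="test")
print(mmlu[0]["question"], mmlu[0]["choices"])
```

In practice, leaderboard scores are computed with a standardized evaluation harness (the Open LLM Leaderboard uses EleutherAI's lm-evaluation-harness) rather than by querying the raw datasets directly.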
With these benchmarks in place and thousands of models vying for a spot on the leaderboard, Jerry Wu, Founder and CEO of APMIC and a Google Developer Expert in Machine Learning, pointed out that no team from Taiwan had made it onto the list until APMIC did, heralding the breakthrough of caigunn-lora-model.
Since the team aimed to develop a localized LLM, they named it CaiGunn, which means "to chat" in Taiwan's Minnan dialect. This LLM not only performs well on the benchmark tests but also exhibits strong localization. Whether processing articles, websites, or document data, CaiGunn can effortlessly power a chatbot that is fluent in Traditional Chinese and versed in the local culture of Taiwan.
Jerry Wu also mentioned that the team is not only committed to building an LLM but also continues to work on improving its Traditional Chinese recognition capabilities. The uniqueness of CaiGunn is evident in its localized training process, which allows enterprises to achieve more efficient and accurate Traditional Chinese processing when using CaiGunn.
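APMIC has not published its training recipe, but the model's name references LoRA (low-rank adaptation), a widely used technique for efficiently fine-tuning a base LLM on localized data. The sketch below, built on the Hugging Face PEFT library, shows the general shape of such a setup; the base model and all hyperparameters here are illustrative assumptions, not APMIC's actual configuration:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustrative base model; APMIC's actual base model is not public knowledge.
base = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA freezes the base weights and injects small trainable low-rank
# matrices into selected projection layers, so only a tiny fraction of
# parameters is updated during localized fine-tuning.
config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

Because only the small adapter matrices are trained, localized fine-tuning of this kind, such as improving Traditional Chinese handling, costs far less compute than full fine-tuning of the base model.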
As the next wave of AI/ML unfolds, LLMs are poised to become a dynamic battleground for innovators in the field. CaiGunn continues to evolve and shows great potential to emerge as a leading force in generative AI, particularly for Taiwanese companies.
APMIC Announces Its Entry Into the US Market With Their Enterprise-Grade AI Solution
Industry Highlights, Company News
August 27, 2024 at 10:00:00 AM
As AI continues to sweep across the globe, APMIC, a Taiwan-based company that specializes in Natural Language Understanding (NLU), has boldly entered the fierce competition with its large language model (LLM), caigunn-lora-model.