ChatGPT and other large language models (LLMs) have shown impressive generalization abilities, but their training and inference costs are often prohibitive. Additionally, white-box access to model weights and inference probabilities is frequently crucial for explainability and confidence in mission-critical applications like healthcare. As a result, instruction tuning has gained popularity as a method for condensing LLMs into more affordable and transparent student models. These student models have shown convincing skills to mimic ChatGPT, as Alpaca and Vicuna showed. Close examination reveals that they still need to catch up to the ideal LLM, particularly in downstream applications that are specifically targeted.
Because of the restricted computing available, a generic distillation can only create a superficial approximation of the original LLM across all conceivable applications. Instead, they investigate targeted distillation in this research, where they train student models through mission-focused instruction adjustment for a diverse application class like open information extraction. They demonstrate that while maintaining its generalizability across semantic types and domains, this may maximally reproduce LLM’s capabilities for the specified application class. Since named entity recognition (NER) is one of the most fundamental problems in natural language processing, they chose it for their case study. Recent research demonstrates that LLMs still need to catch up to the most advanced supervised system for an entity type when there are many annotated instances.
There needs to be music little-annotable for most object kinds, though. Developing annotated examples is costly and time-consuming, especially in high-value sectors like biology, where annotation requires specialized knowledge. New entity types are continually emerging. Supervised NER models also show poor generalizability for new domains and entity types since they are trained on pre-specified entity types and domains. They outline a generic process for LLM targeted distillation and show how open-domain NER may use it. Researchers from the University of Southern California and Microsoft Research demonstrate how to utilize ChatGPT to create instruction-tuning data for NER from large amounts of unlabeled online text and use LLaMA to create the UniversalNER models (abbreviated UniNER).
They put up the biggest and most varied NER benchmark to date (UniversalNER benchmark), which consists of 43 datasets from 9 different disciplines, including medical, programming, social media, law, and finance. LLaMA and Alpaca score badly on this benchmark (around 0 F1) on zero-shot NER. Vicuna performs significantly better in comparison, yet in average F1, it is still behind ChatGPT by more than 20 absolute points. In contrast, UniversalNER outperforms Vicuna by over 30 absolute points in average F1 and achieves state-of-the-art NER accuracy across tens of thousands of entity types in the UniversalNER benchmark. In addition to replicating ChatGPT’s capacity to recognize any entity with a small number of parameters (7–13 billion), UniversalNER also beats its NER accuracy by 7-9 absolute points in average F1.
Surprisingly, UniversalNER significantly surpasses state-of-the-art multi-task instruction-tuned systems like InstructUIE, which uses supervised NER instances. They also undertake extensive ablation tests to evaluate the effects of different distillation components like the instruction prompts and negative sampling. They will provide their distillation recipe, data, and the UniversalNER model and present an interactive demo to aid further study on targeted distillation.
Check out the Paper, Github, and Project Page. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 28k+ ML SubReddit, 40k+ Facebook Community, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.
The post Researchers from USC and Microsoft Propose UniversalNER: A New AI Model Trained with Targeted Distillation Recognizing 13k+ Entity Types and Outperforming ChatGPT’s NER Accuracy by 9% F1 on 43 Datasets appeared first on MarkTechPost.