Retrieval Augmented Instruction Tuning for Open NER with Large Language Models
Abstract
The article explores Retrieval Augmented Instruction Tuning (RA-IT) for open named entity recognition (NER) using large language models (LLMs). It focuses on incorporating retrieved in-context examples into the instruction tuning of LLMs for information extraction tasks.
Q&A
[01] Retrieval Augmented Instruction Tuning (RA-IT) for Open NER
1. What is the key idea behind RA-IT for open NER?
- The key idea is to retrieve semantically similar examples from the training dataset and prepend them to the input of the original instruction for each training sample.
- This context-enhanced instruction is used to fine-tune the LLM, with the goal of improving its performance on open NER tasks.
2. What are the main findings from the experiments on RA-IT for open NER?
- RA-IT achieves consistent improvements on open NER across various data sizes, suggesting the need for context-enhanced fine-tuning.
- Retrieving semantically similar examples benefits training the most; random retrieval also yields improvements, but is inferior to retrieving similar examples.
- Providing in-domain examples benefits inference, while retrieving out-of-domain examples requires example-filtering strategies to achieve improvements.
3. How does RA-IT differ from previous work on instruction tuning for information extraction?
- Previous work on instruction tuning for information extraction, such as Sainz et al. (2024) and Li et al. (2024), used code-style instructions, which is orthogonal to this work.
- This work explores the RA-IT strategy, which can be integrated into various instruction styles, including the code-style instructions used in prior work.
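The retrieve-and-prepend step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function and field names are hypothetical, the prompt wording is invented, and a simple word-overlap cosine similarity stands in for whatever semantic embedding model the authors actually use for retrieval.

```python
import math
from collections import Counter

def _cosine(a, b):
    """Word-overlap cosine similarity; a stand-in for a semantic embedder."""
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_ra_it_instruction(sample, train_pool, k=2):
    """Retrieve the k most similar training examples from the pool and
    prepend them as context to the instruction for one training sample."""
    scored = sorted(
        (ex for ex in train_pool if ex is not sample),
        key=lambda ex: -_cosine(sample["text"], ex["text"]),
    )
    context = "\n".join(
        f"Example text: {ex['text']}\nEntities: {ex['entities']}"
        for ex in scored[:k]
    )
    return (f"{context}\n\n"
            f"Extract all named entities from the text below.\n"
            f"Text: {sample['text']}")
```

The context-enhanced instruction returned here would then serve as the fine-tuning input for that sample, with the sample's gold entity annotations as the target output.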
[02] Chinese Open NER Dataset Construction
1. How was the Chinese open NER dataset (Sky-NER) constructed?
- The Sky-NER dataset was constructed by sampling input passages from the large-scale Sky corpus across various domains, and then using ChatGPT to automatically generate entity mentions and types based on the sampled passages.
- This follows the data construction recipe of UniNER (Zhou et al., 2024), which successfully distilled the strong capability of ChatGPT in open NER into smaller models without any human-annotated data.
2. What are the key characteristics of the Sky-NER dataset?
- The Sky-NER dataset consists of 50K NER examples, covering a diverse set of entity types.
- The most frequent entity types are concepts, locations, persons, organizations, and products, which account for 75.3% of the dataset.
- Less frequent entity types include honors, technical terms, places, emotions, and programs, which make up 17.5% of the dataset.
- The remaining 7.2% of the dataset covers rarer entity types such as competition categories and property types.
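The generation step of this recipe can be sketched as below. The prompt wording and JSON output schema here are assumptions for illustration, not the exact UniNER or Sky-NER prompts; in practice the prompt is sent to ChatGPT and the reply is parsed into (mention, type) annotations.

```python
import json

def build_generation_prompt(passage):
    """Hypothetical UniNER-style prompt asking the model to annotate
    all entity mentions in a sampled passage."""
    return (
        "Given the following passage, list all entity mentions and their "
        "entity types as a JSON list of [mention, type] pairs.\n"
        f"Passage: {passage}"
    )

def parse_generation(response):
    """Parse the model's JSON reply into (mention, type) tuples,
    skipping malformed replies or entries."""
    try:
        pairs = json.loads(response)
    except (json.JSONDecodeError, TypeError):
        return []
    return [(m, t) for item in pairs
            if isinstance(item, list) and len(item) == 2
            for m, t in [item]]
```

Because the entity types are produced freely by the model rather than drawn from a fixed label set, a dataset built this way naturally yields the long-tailed, diverse type distribution described above.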
Shared by Daniel Chen ·
© 2024 NewMotor Inc.