
Retrieval Augmented Instruction Tuning for Open NER with Large Language Models

🌈 Abstract

The article explores Retrieval Augmented Instruction Tuning (RA-IT) for open named entity recognition (NER) with large language models (LLMs), focusing on how retrieved context can be incorporated into instruction tuning for information extraction tasks.

🙋 Q&A

[01] Retrieval Augmented Instruction Tuning (RA-IT) for Open NER

1. What is the key idea behind RA-IT for open NER?

  • The key idea is to retrieve semantically similar examples from the training dataset and prepend them to the input of the original instruction for each training sample.
  • This context-enhanced instruction is used to fine-tune the LLM, with the goal of improving its performance on open NER tasks (a minimal sketch of the construction step follows this list).
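The article does not include code, so the following is only a minimal sketch of how such context-enhanced training instructions could be built; the embedding model, prompt template, and the value of k are illustrative assumptions, not the paper's exact setup.

```python
from sentence_transformers import SentenceTransformer
import numpy as np

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def build_ra_it_instructions(train_samples, k=3):
    """For each training sample, retrieve the k most similar *other*
    training inputs and prepend them to the original instruction."""
    texts = [s["input"] for s in train_samples]
    emb = encoder.encode(texts, normalize_embeddings=True)
    sims = emb @ emb.T                      # cosine similarity (normalized vectors)
    np.fill_diagonal(sims, -np.inf)         # never retrieve the sample itself
    augmented = []
    for i, sample in enumerate(train_samples):
        neighbors = np.argsort(-sims[i])[:k]
        context = "\n\n".join(
            f"Text: {texts[j]}\nEntities: {train_samples[j]['output']}"
            for j in neighbors
        )
        instruction = (
            f"{context}\n\n"
            f"Text: {sample['input']}\n"
            "List all named entities in the text with their types."
        )
        augmented.append({"instruction": instruction, "output": sample["output"]})
    return augmented
```

Fine-tuning then proceeds on the augmented instructions exactly as in standard instruction tuning; the retrieved neighbors serve purely as in-context examples.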

2. What are the main findings from the experiments on RA-IT for open NER?

  • RA-IT achieves consistent improvements on open NER across various data sizes, suggesting the need for context-enhanced fine-tuning.
  • Retrieving semantically similar examples benefits training the most; random retrieval also yields improvements, but is inferior to retrieving similar examples.
  • At inference, providing in-domain examples is beneficial, whereas retrieving out-of-domain examples requires example-filtering strategies to achieve improvements (one plausible filter is sketched below).
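The summary does not spell out the filtering strategy used. One plausible instantiation is a similarity threshold on retrieved out-of-domain examples; the threshold value and k below are assumptions for illustration.

```python
import numpy as np

def filter_retrieved(query_emb, example_embs, threshold=0.6, k=3):
    """Keep at most k retrieved examples whose cosine similarity to the
    query exceeds the threshold (embeddings assumed L2-normalized)."""
    sims = example_embs @ query_emb          # cosine similarities, shape (n,)
    order = np.argsort(-sims)                # most similar first
    kept = [int(i) for i in order[:k] if sims[i] >= threshold]
    return kept                              # empty list => fall back to no-context inference
```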

3. How does RA-IT differ from previous work on instruction tuning for information extraction?

  • Previous work on instruction tuning for information extraction, such as Sainz et al. (2024) and Li et al. (2024), used code-style instructions, an approach orthogonal to this work.
  • This work explores the RA-IT strategy, which can be integrated into various instruction styles, including the code-style instructions used in prior work.

[02] Chinese Open NER Dataset Construction

1. How was the Chinese open NER dataset (Sky-NER) constructed?

  • The Sky-NER dataset was constructed by sampling input passages from the large-scale Sky corpus across various domains, and then using ChatGPT to automatically generate entity mentions and types based on the sampled passages (a sketch of this annotation step follows this list).
  • This follows the data construction recipe of UniNER (Zhou et al., 2024), which successfully distilled the strong capability of ChatGPT in open NER into smaller models without any human-annotated data.
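As a rough illustration of that recipe, the snippet below prompts ChatGPT to produce (mention, type) pairs for a sampled passage. The prompt wording, model name, and output schema are assumptions for the sketch, not the exact UniNER/Sky-NER configuration.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = (
    "Given the following passage, list all entity mentions and their entity "
    "types as a JSON list of [mention, type] pairs.\n\nPassage: {passage}"
)

def annotate_passage(passage: str):
    """Ask the model for (mention, type) pairs; a robust pipeline would
    validate the JSON and retry on malformed output."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # stand-in for the ChatGPT model actually used
        messages=[{"role": "user", "content": PROMPT.format(passage=passage)}],
        temperature=0,
    )
    return json.loads(resp.choices[0].message.content)
```

Running this over passages sampled from many domains yields open-vocabulary annotations without any human labeling, which is the core of the distillation recipe.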

2. What are the key characteristics of the Sky-NER dataset?

  • The Sky-NER dataset consists of 50K NER examples, covering a diverse set of entity types.
  • The most frequent entity types are concepts, locations, persons, organizations, and products, which account for 75.3% of the dataset.
  • Less frequent entity types include honors, technical terms, places, emotions, and programs, which make up 17.5% of the dataset.
  • The remaining 7.2% of the dataset covers rarer entity types such as competition categories and property types.

