
Retrieval Augmented Instruction Tuning for Open NER with Large Language Models

🌈 Abstract

The article explores Retrieval Augmented Instruction Tuning (RA-IT) for open named entity recognition (NER) with large language models (LLMs), focusing on how retrieved context can be incorporated into instruction tuning for information extraction tasks.

🙋 Q&A

[01] Retrieval Augmented Instruction Tuning (RA-IT) for Open NER

1. What is the key idea behind RA-IT for open NER?

  • The key idea is to retrieve semantically similar examples from the training dataset and prepend them to the input of the original instruction for each training sample.
  • This context-enhanced instruction is used to fine-tune the LLM, with the goal of improving its performance on open NER tasks.
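The idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it uses bag-of-words cosine similarity as a cheap stand-in for the dense retriever the paper would use, and the prompt template and field names are assumptions for illustration.

```python
import math
from collections import Counter

def cosine_sim(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (a stand-in for a dense embedder)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_ra_it_instruction(sample, train_pool, k=2):
    """Retrieve the k most similar training examples and prepend them
    to the sample's instruction as in-context examples."""
    ranked = sorted(
        (ex for ex in train_pool if ex is not sample),
        key=lambda ex: cosine_sim(sample["text"], ex["text"]),
        reverse=True,
    )
    context = "\n".join(
        f"Text: {ex['text']}\nEntities: {ex['entities']}" for ex in ranked[:k]
    )
    return (
        f"Here are some similar examples:\n{context}\n\n"
        f"Extract the named entities from the following text.\n"
        f"Text: {sample['text']}"
    )

train_pool = [
    {"text": "Paris is the capital of France", "entities": "Paris: location, France: location"},
    {"text": "Apple released a new iPhone", "entities": "Apple: organization, iPhone: product"},
    {"text": "Berlin is a city in Germany", "entities": "Berlin: location, Germany: location"},
]
sample = {"text": "Madrid is the capital of Spain", "entities": "Madrid: location, Spain: location"}
prompt = build_ra_it_instruction(sample, train_pool, k=1)
print(prompt)
```

The context-enhanced prompt would then replace the plain instruction for each training sample during fine-tuning.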

2. What are the main findings from the experiments on RA-IT for open NER?

  • RA-IT achieves consistent improvements on open NER across various data sizes, suggesting the need for context-enhanced fine-tuning.
  • Retrieving semantically similar examples benefits training the most; random retrieval also yields improvements but is inferior to similar examples.
  • Providing in-domain examples at inference benefits performance, while retrieving out-of-domain examples helps only when example filtering strategies are applied.
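A simple way to realize such filtering is a similarity threshold on retrieved examples; note this is one plausible filtering strategy sketched for illustration, not necessarily the exact strategy used in the paper, and the Jaccard similarity here is a cheap proxy for embedding similarity.

```python
def jaccard_sim(a: str, b: str) -> float:
    """Token-level Jaccard similarity (a cheap proxy for embedding similarity)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def filter_examples(query, retrieved, threshold=0.3):
    """Drop retrieved examples whose similarity to the query falls below
    the threshold; an empty result means falling back to zero-shot inference."""
    return [ex for ex in retrieved if jaccard_sim(query, ex["text"]) >= threshold]

retrieved = [
    {"text": "Tokyo is the capital of Japan"},
    {"text": "The stock market closed higher today"},
]
query = "Oslo is the capital of Norway"
kept = filter_examples(query, retrieved)
print([ex["text"] for ex in kept])
```

With a threshold of 0.3, only the topically related example survives; unrelated out-of-domain examples are dropped rather than injected as noisy context.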

3. How does RA-IT differ from previous work on instruction tuning for information extraction?

  • Previous work on instruction tuning for information extraction, such as Sainz et al. (2024) and Li et al. (2024), used code-style instructions, an approach orthogonal to this work.
  • This work explores the RA-IT strategy, which can be integrated into various instruction styles, including the code-style instructions used in prior work.

[02] Chinese Open NER Dataset Construction

1. How was the Chinese open NER dataset (Sky-NER) constructed?

  • The Sky-NER dataset was constructed by sampling input passages from the large-scale Sky corpus across various domains, and then using ChatGPT to automatically generate entity mentions and types based on the sampled passages.
  • This follows the data construction recipe of UniNER (Zhou et al., 2024), which successfully distilled the strong capability of ChatGPT in open NER into smaller models without any human-annotated data.
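The construction loop described above can be sketched as follows. The prompt wording, field names, and the `fake_chatgpt` stub are assumptions for illustration; the actual recipe follows UniNER's prompts and calls the real ChatGPT API.

```python
import json

# Hypothetical annotation prompt; the actual UniNER-style prompt differs.
ANNOTATION_PROMPT = (
    "Given the passage below, output a JSON list of objects with "
    '"entity" and "type" fields for every named entity you find.\n'
    "Passage: {passage}"
)

def annotate_passage(passage: str, label_fn):
    """Build one NER training example by asking an LLM (label_fn stands in
    for a ChatGPT API call) to enumerate entity mentions and their types."""
    response = label_fn(ANNOTATION_PROMPT.format(passage=passage))
    entities = json.loads(response)
    return {"text": passage, "entities": [(e["entity"], e["type"]) for e in entities]}

# A canned response in place of a real API call, for illustration only.
def fake_chatgpt(prompt: str) -> str:
    return ('[{"entity": "Alibaba", "type": "organization"}, '
            '{"entity": "Hangzhou", "type": "location"}]')

example = annotate_passage("Alibaba is headquartered in Hangzhou.", fake_chatgpt)
print(example)
```

Running this loop over passages sampled from the corpus yields (text, entity list) pairs without any human annotation.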

2. What are the key characteristics of the Sky-NER dataset?

  • The Sky-NER dataset consists of 50K NER examples, covering a diverse set of entity types.
  • The most frequent entity types are concepts, locations, persons, organizations, and products, which account for 75.3% of the dataset.
  • Less frequent entity types include honors, technical terms, places, emotions, and programs, which make up 17.5% of the dataset.
  • The remaining 7.2% of the dataset covers rarer entity types such as competition categories and property types.

