
This AI paper proposes a recursive memory generation method to enhance long-term conversation consistency in large language models

https://arxiv.org/abs/2308.15022

Chatbots and other open-domain dialogue systems have received increasing research attention in recent years. Sustaining a long-term conversation is challenging because it requires the system to understand and remember important points from previous sessions.

Large language models (LLMs) such as ChatGPT and GPT-4 have recently shown encouraging results on many natural language tasks, and open-domain and task-oriented chatbots are therefore often built by prompting an LLM directly. However, in a prolonged conversation, even ChatGPT can lose track of context and give inconsistent answers.

Researchers from the Chinese Academy of Sciences and the University of Sydney investigate whether LLMs can carry on sustained conversations effectively without labeled data or additional tools. Inspired by memory-augmentation methods, they use the LLM itself to build recursive summaries that serve as memory, storing the important information from an ongoing conversation. In practice, the LLM is first given a short piece of dialogue context and asked to summarize it. It is then repeatedly asked to combine the previous summary with the latest utterances to produce a new summary (memory). Finally, the LLM generates its response based on the most recent memory it has stored.
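The recursive loop described above can be sketched roughly as follows. Note that `call_llm`, the prompt wording, and the sample dialogue are all illustrative assumptions, not the authors' implementation: `call_llm` stands in for any LLM API (the paper used the ChatGPT API and text-davinci-003) and is stubbed here so the example runs offline.

```python
def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call an LLM API here.
    return f"[summary of: {prompt[:40]}...]"

def update_memory(memory: str, new_turns: list) -> str:
    """Ask the LLM to fold the latest dialogue turns into the existing summary."""
    prompt = (
        "Previous memory:\n" + memory + "\n\n"
        "New dialogue turns:\n" + "\n".join(new_turns) + "\n\n"
        "Update the memory to keep any important new information."
    )
    return call_llm(prompt)

def respond(memory: str, user_message: str) -> str:
    """Generate a reply conditioned on the latest memory, not the full history."""
    prompt = (
        "Memory of the conversation so far:\n" + memory + "\n\n"
        "User: " + user_message + "\nAssistant:"
    )
    return call_llm(prompt)

# Recursive loop: the memory stays bounded no matter how long the dialogue runs.
memory = call_llm("Summarize this background: Alice is planning a trip to Kyoto.")
for session in [["User: I booked my flight.", "Bot: Great!"],
                ["User: Any temple suggestions?", "Bot: Try Kinkaku-ji."]]:
    memory = update_memory(memory, session)
reply = respond(memory, "When do I leave?")
```

Because each response is conditioned only on the latest summary, the prompt never grows with the full conversation history.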


The proposed scheme offers a way for existing LLMs to model extremely long contexts (dialogue sessions) without extending the maximum context length setting and without training expensive long-context language models.

The usefulness of the proposed scheme is demonstrated experimentally on a public long-term dialogue dataset using the ChatGPT API and text-davinci-003. The study also shows that a single labeled sample can significantly enhance the performance of the proposed strategy.

The researchers ask an arbitrary LLM to perform two tasks: memory management and response generation. The former iteratively summarizes the important details of an ongoing conversation; the latter combines the memory with the current context to produce an appropriate response.

In this study, the team evaluated the proposed method only with automatic metrics, which may not be ideal for open-domain chatbots. Moreover, in real-world applications the overhead of repeatedly calling large models cannot be ignored, and their solution does not account for this cost.

In the future, the researchers plan to test their long-context modeling approach on other long-context tasks, such as story generation. They also plan to improve the method's summarization ability by fine-tuning an LLM locally instead of relying on an expensive online API.



Dhanshree Shenwai is a Computer Science Engineer with experience at FinTech companies in finance, cards and payments, and banking, and a keen interest in AI applications. She is enthusiastic about exploring new technologies and advancements that make everyone's lives easier.

