The Power of AI in Understanding User Preferences

Anyone who has worked in a customer-facing job, or even just on a team of more than a few people, knows that every person has their own unique, sometimes baffling, preferences. Understanding each individual's preferences is hard enough for fellow humans. But what about for AI models, which have no direct human experience to draw on, let alone to use as a frame of reference for understanding what others want?

A team of researchers from MIT, Stanford, and the startup Anthropic, the company behind the large language model (LLM) chatbot Claude 2, is working on this very problem and has arrived at a seemingly obvious solution: get AI models to ask users more questions to find out what they really want.

Using Generative Active Task Elicitation (GATE)

Anthropic researcher Alex Tamkin, together with colleagues Belinda Z. Li and Jacob Andreas of the Massachusetts Institute of Technology's (MIT's) Computer Science and Artificial Intelligence Laboratory (CSAIL) and Noah Goodman of Stanford, published a research paper earlier this month on their method, which they call "generative active task elicitation (GATE)." Their goal is to "use [large language] models themselves to help convert human preferences into automated decision-making systems."

In other words: take an LLM's existing ability to analyze and generate text and use it to ask the user written questions during their first interaction. The LLM then reads the user's answers and incorporates them into its subsequent outputs on the fly, inferring from those answers, based on the other words and concepts they relate to in the model's training data, what the user is ultimately asking for.
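The interview-then-infer loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' implementation: `ask_llm` and `get_user_answer` are hypothetical callables standing in for a real chat-completion API and a real user prompt, and the message format simply mimics common chat APIs.

```python
def elicit_preferences(task_description, ask_llm, get_user_answer, num_questions=3):
    """GATE-style sketch: the model interviews the user, then summarizes.

    ask_llm(messages) -> str     : hypothetical LLM call (e.g., any chat API)
    get_user_answer(question) -> str : hypothetical user-input hook
    """
    transcript = []
    for _ in range(num_questions):
        # Ask the model to generate one informative question about the task.
        question = ask_llm([
            {"role": "system",
             "content": f"You are eliciting a user's preferences for: {task_description}. "
                        "Ask one short, informative question."},
            *transcript,
        ])
        answer = get_user_answer(question)  # free-form reply from the user
        transcript.append({"role": "assistant", "content": question})
        transcript.append({"role": "user", "content": answer})
    # Condense the interview into a summary the model can condition on later.
    return ask_llm(transcript + [
        {"role": "user",
         "content": "Summarize my preferences based on this conversation."},
    ])
```

In a real deployment, the returned summary would be prepended to later prompts so the model's answers stay tailored to that user.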

“The effectiveness of language models (LMs) for understanding and producing free-form text suggests that they may be capable of eliciting and understanding user preferences.”
— the researchers write in the paper.

Practical Applications of GATE

The researchers experimented with GATE in three domains: content recommendation, moral reasoning, and email validation.

By prompting GPT-4, from Anthropic rival OpenAI, and recruiting 388 paid participants at $12 per hour to answer its questions and grade its responses, the researchers found that GATE often yields more accurate models than baselines while requiring comparable or less mental effort from users.

Specifically, the researchers found that GPT-4 guided by GATE did a better job of guessing individual users' preferences in its responses, an improvement of roughly 0.05 points on their subjective measure. While seemingly small, this gain is meaningful when starting from a baseline of zero.

Ultimately, the researchers state that they “presented initial evidence that LMs can successfully implement GATE to elicit human preferences (sometimes) more accurately and with less effort than supervised learning, active learning, or prompting-based approaches.”

Streamlining AI-Powered User Experiences

This research could save enterprise software developers a lot of time and effort when implementing LLM-powered chatbots for customer- or employee-facing applications. Instead of training chatbots on a corpus of data to learn individual customer preferences, fine-tuning preferred models to perform the GATE process could produce more engaging, positive, and helpful experiences for users.

So, if your favorite AI chatbot begins asking you questions about your preferences in the near future, there’s a good chance it may be using the GATE method to provide you with better responses going forward.
