What’s in a Name? If you’re an LLM, Everything!

In Romeo and Juliet, Shakespeare’s eternally popular play, Juliet shares her thoughts on names:

What's in a name? That which we call a rose

By any other name would smell as sweet.

Most of us would agree with Juliet. That’s because we know what a rose is. Most of us have seen, smelt, and touched a rose.

But what if you don’t know what the word means? Or, what if, like ChatGPT, you think that it means one of nine different things?

The last thing you want to do when ‘talking’ to an AI/LLM model is to stop and clarify what the different words in your question or prompt mean.

Named Entities

You - and the people building applications on top of AI/LLM models - want the interaction to feel natural. You want the model to understand your intent when you refer to a person, place, animal, thing, or other real-world object by name. In other words, you want the LLM to understand ‘named entities’ such as “Taylor Swift”, “OpenAI”, “Eiffel Tower”, and “July 4, 1776”.

Humans vs Machines

Humans understand named entities through context and prior knowledge. For instance, when we hear the name "Taylor Swift," we immediately associate it with one of the most popular female singers of our times, her music, her achievements, and her song-inspiring breakups. This understanding is facilitated by our memory, experiences, and contextual clues in conversation or text.

But training language models to understand named entities is different - and hard. It poses two main challenges:

Ambiguity: The same named entity can refer to different things depending on the context, as we saw with “rose”. To use another example, "Amazon" can refer to the river, the technology company, or the fierce female warriors of Greek mythology. Disambiguating such references requires contextual understanding, which can be difficult for LLMs (the short sketch after these two challenges shows the problem in miniature).

Novelty: New named entities are constantly emerging, such as new public figures, companies, or products. LLMs trained on data up to a certain point may struggle to recognize and correctly interpret these new entities without additional, updated training.
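
To make the ambiguity challenge concrete, here is a minimal sketch using spaCy, a popular open-source NLP library (a conventional NER tool, not an LLM). The sentences are invented for illustration, and a small off-the-shelf model will sometimes label the mention incorrectly - which is exactly the point:

import spacy

# Assumes the small English model has been installed with:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

sentences = [
    "Amazon reported record cloud-computing revenue this quarter.",
    "The Amazon carries more water than any other river on Earth.",
]

for text in sentences:
    doc = nlp(text)
    print(text)
    # The same surface form "Amazon" may receive different labels
    # (e.g. ORG for the company, LOC/GPE for the river) depending on
    # the surrounding context - and small models sometimes get it wrong.
    for ent in doc.ents:
        print(f"  {ent.text!r} -> {ent.label_}")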

But…with the rapid rise of AI/LLM-powered applications, their builders have no choice but to solve this problem. To date, we see them using four main approaches to ‘teaching’ (aka training and fine-tuning) LLMs about named entities:

  1. Data Annotation: Annotate a large dataset with named entities using human annotators (this approach results in high-quality datasets) or automated tools. This labeled data helps the model learn to recognize and categorize named entities accurately.

  2. Contextual Training: Provide the model with extensive context for each named entity. Training on sentences and paragraphs where the named entity is used in different contexts helps the model understand its various meanings and usages.

  3. Continuous Fine-Tuning: Regularly update the training data to include new named entities and recent information. Fine-tuning the model with this updated data ensures it remains current and can recognize emerging named entities.

  4. Entity Linking: Implement entity linking techniques where the model identifies named entities and links them to a knowledge base, like Wikidata. This process enriches the model's understanding by associating entities with structured information and context (a minimal lookup sketch follows this list).
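
As a rough sketch of the entity-linking idea, the snippet below looks a mention up against Wikidata's public search API and returns candidate entity IDs. This covers only the retrieval half of the problem; a production system would also rank the candidates using the surrounding context to pick the right one. The endpoint and parameters are Wikidata's standard wbsearchentities API; everything else here is illustrative:

import requests

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def candidate_entities(mention, limit=3):
    """Return candidate Wikidata entities (id, label, description) for a mention."""
    params = {
        "action": "wbsearchentities",
        "search": mention,
        "language": "en",
        "format": "json",
        "limit": limit,
    }
    response = requests.get(WIKIDATA_API, params=params, timeout=10)
    response.raise_for_status()
    return [
        {
            "id": hit["id"],
            "label": hit.get("label"),
            "description": hit.get("description"),
        }
        for hit in response.json().get("search", [])
    ]

# Example: fetch candidates for a few mentions; a disambiguation step
# (not shown) would then choose among them based on context.
for mention in ["Taylor Swift", "Eiffel Tower", "Amazon"]:
    print(mention, "->", candidate_entities(mention))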

e2f and Named Entities

At e2f, given our role in creating high-quality AI training and fine-tuning data for the world’s largest AI/LLM builders, we are doing our bit to help LLMs understand named entities too.

For example, in a recent project for one of the world’s largest technology companies, we worked on two things to help train their LLM models:

  • Find all named-entity mentions in the provided text and link them to the correct entity IDs in a knowledge base (this is the 4th approach mentioned above).

  • Find all literal mentions in the text and assign them the proper literal types, such as dates and quantities (this is connected to the 1st approach mentioned above).

Assembling a Global Talent Pool

First, we assembled a global talent pool of domain experts and other professionals, selected for their knowledge of current affairs, research skills, and analytical thinking. The pool was also screened for media literacy, clear communication, and the ability to adapt to new situations and information. Finally, as is always the case at e2f, we made sure the members of this pool were tech savvy and possessed a high degree of cultural awareness.

Project Delivery

Then, over the course of two weeks, this talent pool delivered more than seven thousand datasets containing a wide range of linked entities. Here's an example to give you a taste of the kind of record our team worked on:
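
The record below is a simplified, hypothetical illustration of what a single deliverable can look like - mention spans linked to knowledge-base IDs, plus literal mentions assigned literal types. The text, spans, and field names are invented for illustration and are not actual project data:

# A hypothetical, simplified record: named-entity mentions linked to
# knowledge-base IDs, plus literal mentions assigned literal types.
record = {
    "text": "Taylor Swift performed at the Eiffel Tower on July 4, 2024 for 45,000 fans.",
    "entity_mentions": [
        # IDs shown are the Wikidata items for the singer and the Paris landmark.
        {"span": "Taylor Swift", "start": 0, "end": 12, "entity_id": "Q26876"},
        {"span": "Eiffel Tower", "start": 30, "end": 42, "entity_id": "Q243"},
    ],
    "literal_mentions": [
        {"span": "July 4, 2024", "literal_type": "DATE"},
        {"span": "45,000", "literal_type": "QUANTITY"},
    ],
}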

100% of the work was reviewed and 30% was subject to an additional QA review, per the client’s guidelines (this can vary, based on the client’s requirements and custom quality metrics).

Better Results, More Natural Conversations

While we won’t claim that this one project solves the LLM world’s named-entity challenges, we know that it - alongside the dozens of other AI data projects our teams deliver every day - is helping tens of millions of people around the world talk to their AI/LLM apps more naturally. On top of that, our work is helping those users receive high-quality results when their prompts and questions refer to names, places, animals, people, things, and other objects.

If you’re an AI/LLM builder who needs high-quality datasets turned around in 24-48 hours - or a domain expert anywhere in the world who would like to serve the world’s AI builders as part of the e2f team - please contact us today.
