By Wojciech Gryc on July 25, 2024
A key part of Emerging Trajectories is our ability to link individual facts or pieces of information to the query, question, or statement being verified. This is similar to how Gemini or Perplexity cite website URLs when generating content, but we go a few steps further.
The most obvious approach to building an Emerging Trajectories-like product is using RAG. In this scenario, you insert “facts” or “observations” into the prompts sent to the LLM and ask it to use that information in generating a response. This can be a decent starting point, but it can also introduce numerous errors and “hallucinations” if done improperly. We typically look for four types of errors.
RAG approaches tend to struggle due to the way they split information to be cited (i.e., their “chunking” strategy). For example, if you have a 100-page report, at what level do you review and cite it? Would you take individual sentences, paragraphs, pages, or something else? However you break down the document, you might lose valuable context. For instance, a report on Europe’s AI labor shortage might mention specific countries in its first paragraph but refer only to “the aforementioned countries” in its second. If you are passing paragraphs as facts, the LLM won’t know which countries are actually being referred to.
Broadly speaking, we define this as an attribution error — you want to know which entities are being discussed in facts/statements so you can reference them properly.
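To make this concrete, here is a minimal sketch (not our production pipeline) of one common mitigation: extracting document-level entities as you chunk and prepending them to each chunk before it is embedded or cited. The `extract_entities` callable below is a placeholder for whatever entity-extraction step you prefer.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    doc_title: str
    text: str
    entities: list[str] = field(default_factory=list)

    def as_fact(self) -> str:
        # Prepend document-level context so "the aforementioned countries"
        # remains resolvable after the document has been split apart.
        header = f"Source: {self.doc_title} | Entities: {', '.join(self.entities)}"
        return f"{header}\n{self.text}"

def chunk_with_attribution(doc_title: str, paragraphs: list[str], extract_entities) -> list[Chunk]:
    """Split a document into paragraph-level chunks, carrying forward
    the entities mentioned anywhere earlier in the document."""
    seen: list[str] = []
    chunks = []
    for para in paragraphs:
        for entity in extract_entities(para):
            if entity not in seen:
                seen.append(entity)
        chunks.append(Chunk(doc_title=doc_title, text=para, entities=list(seen)))
    return chunks
```

With this structure, the second paragraph of the hypothetical AI labor shortage report would carry the country names alongside the phrase “the aforementioned countries,” so the LLM can attribute the claim correctly.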
Note that some researchers suggest summarizing reports or building knowledge graphs to address this issue. These can be useful approaches, but they assume the context of your queries or documents is the same as when the summary was generated, and you might lose valuable information when building the summary. In the case of a knowledge graph, you need to design the right nodes and edges up front for information to be cited properly later.
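If you do go the knowledge graph route, the key design decision is which nodes and edges to store so that citations survive. A rough sketch using networkx, with hypothetical claims and field names, might attach the source URL and passage to every edge:

```python
import networkx as nx

graph = nx.MultiDiGraph()

def add_claim(graph, subject, relation, obj, source_url, passage):
    """Store a claim as an edge, keeping the original passage and URL
    so the claim can be cited verbatim later."""
    graph.add_edge(subject, obj, relation=relation,
                   source_url=source_url, passage=passage)

# Hypothetical claim and source, for illustration only.
add_claim(
    graph, "France", "faces", "AI labor shortage",
    source_url="https://example.com/report",
    passage="France and Germany face acute shortages of AI talent.",
)

# Retrieve every stored claim about France, with its citation intact.
for _, obj, data in graph.out_edges("France", data=True):
    print(f"France {data['relation']} {obj} (source: {data['source_url']})")
```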
A “softer” version of Error Type #1 is not using the best facts for the statement or query in question. Suppose you are researching political candidates and their policies in a popular election. In such a scenario, a RAG approach applied to media sources can generate thousands of relevant statements. Even if you address Error Type #1 and get dozens or hundreds of properly attributed facts or statements, you want to ensure the LLM picks the best ones for citations.
Tracking this error can be difficult, so we tend to review fact citations to determine whether they directly or indirectly support an assertion.
Another interesting variant here is the source of the fact or citation. Many of our users prefer that high-credibility sources be prioritized: if you have two equivalent citations and one comes from an official government website while the other comes from social media, you'll want to prioritize the former.
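One way to operationalize this, assuming your retriever already returns relevance scores, is to re-rank candidate facts with a credibility weight by source type. The tiers and weights below are purely illustrative:

```python
from dataclasses import dataclass

# Illustrative credibility tiers; a real system would maintain a richer source registry.
CREDIBILITY = {"government": 1.0, "news": 0.8, "blog": 0.5, "social_media": 0.3}

@dataclass
class Candidate:
    text: str
    source_type: str
    relevance: float  # similarity score from the retriever, in [0, 1]

def rank_candidates(candidates: list[Candidate], credibility_weight: float = 0.3) -> list[Candidate]:
    """Blend retriever relevance with source credibility so that, between two
    equally relevant facts, the more credible source is cited first."""
    def score(c: Candidate) -> float:
        cred = CREDIBILITY.get(c.source_type, 0.5)
        return (1 - credibility_weight) * c.relevance + credibility_weight * cred
    return sorted(candidates, key=score, reverse=True)
```

How much weight to give credibility versus relevance is a tuning decision that depends on the use case and the user's tolerance for lower-quality sources.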
A third issue is with the queries users submit to the system itself. Tools like Emerging Trajectories, ChatGPT, and Perplexity are built to handle many different kinds of queries, and not all of them are easy to verify; some are not verifiable at all.
For example, suppose you are doing research on the fiscal sustainability of G7 countries (i.e., can they keep paying their debts). Here are three queries any reasonable user might put into an LLM-powered software tool:
All three queries could be put into an LLM-powered system, but (a) is significantly less verifiable today in 2024, while (c) is a relatively easy question for any RAG system to answer. Ensuring that the RAG system (and associated LLM) actually pushes back when facts are not present to support an assertion is critical.
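The mechanics of pushing back are largely a prompting and validation exercise. Below is a minimal sketch, with the LLM abstracted as a callable so it is not tied to any particular API, that asks the model to return a sentinel value whenever the retrieved facts cannot support an answer:

```python
INSUFFICIENT = "INSUFFICIENT_EVIDENCE"

PROMPT_TEMPLATE = """Answer the question using ONLY the numbered facts below.
Cite facts as [1], [2], etc. If the facts do not support a confident answer,
reply with exactly {sentinel}.

Facts:
{facts}

Question: {question}
"""

def answer_with_pushback(question: str, facts: list[str], llm) -> str:
    """`llm` is any callable mapping a prompt string to a completion string."""
    if not facts:
        return "We don't have enough evidence in the fact base to answer this question."
    numbered = "\n".join(f"[{i + 1}] {fact}" for i, fact in enumerate(facts))
    prompt = PROMPT_TEMPLATE.format(sentinel=INSUFFICIENT, facts=numbered, question=question)
    answer = llm(prompt).strip()
    if answer == INSUFFICIENT:
        return "We don't have enough evidence in the fact base to answer this question."
    return answer
```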
Suppose a verifiable question has been submitted and we obtain the right facts. A final risk is that the facts are combined into a conclusion that does not reasonably follow from them.
Suppose we’re revisiting the query, “Will France default on its national debt in 2030?” We could have a response like, “Yes, France will not be able to service its national debt, based on its increasing debt burden[1] and the likelihood of high interest rates[2].”
Suppose the two facts are...
You can see here that the two facts might contribute to a risk that France won’t be able to service its debt, but this is not a foregone conclusion based on the facts themselves. This sort of fundamental logic error can occur quite often with RAG-powered LLM systems.
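One mitigation is an explicit verification pass: after drafting the answer, ask a judge model whether each conclusion actually follows from the cited facts, and flag or regenerate anything that does not. A minimal sketch, again with the LLM abstracted as a callable:

```python
ENTAILMENT_PROMPT = """Facts:
{facts}

Claim: {claim}

Does the claim follow strictly from the facts, without additional assumptions?
Answer with exactly one word: SUPPORTED or UNSUPPORTED.
"""

def check_entailment(claim: str, cited_facts: list[str], llm) -> bool:
    """Return True only if the judge model says the claim follows from the facts."""
    prompt = ENTAILMENT_PROMPT.format(
        facts="\n".join(f"- {f}" for f in cited_facts), claim=claim
    )
    verdict = llm(prompt).strip().upper()
    return verdict.startswith("SUPPORTED")

# The France conclusion above would ideally come back UNSUPPORTED, because rising
# debt and high rates imply a risk of default, not a certainty.
```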
The error types above can, at least to a reasonable extent, be addressed with today’s technologies, but doing so also comes with the added challenge of meeting user expectations and requirements.
For example, suppose you have 10,000 articles on global economics. A user doesn’t want you to re-analyze all 10,000 articles every time they have a question. Understanding the user’s use case will allow you to...
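For instance, one way to avoid re-reading an entire corpus per question is to do the expensive work once: embed each article ahead of time and only embed the incoming query at question time. A rough sketch with numpy, assuming an `embed` callable from whatever embedding model you use:

```python
import numpy as np

class ArticleIndex:
    """Embed each article once; per-query work is one embedding plus a dot product."""

    def __init__(self, embed):
        self.embed = embed          # callable: str -> 1-D numpy array, unit-normalized
        self.texts: list[str] = []
        self.vectors: list[np.ndarray] = []

    def add(self, article_text: str) -> None:
        self.texts.append(article_text)
        self.vectors.append(self.embed(article_text))

    def query(self, question: str, top_k: int = 5) -> list[str]:
        q = self.embed(question)
        scores = np.stack(self.vectors) @ q   # cosine similarity if vectors are normalized
        best = np.argsort(scores)[::-1][:top_k]
        return [self.texts[i] for i in best]
```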
It’s important to note — “low cost and imprecise” is not fundamentally bad. You could argue that Google search queries or ChatGPT questions are low cost and imprecise — in both cases, you get an answer quickly and for (nearly) free, but you can’t simply trust whatever content you’re provided.
This is where building fact bases on specific content is such a valuable and interesting middle ground. In the case of Emerging Trajectories, we focus on data sources around global events, economics, and geopolitical risk... This makes us fast and precise for specific use cases, but don’t depend on us for general knowledge about anything and everything.
The above error types, challenges, and user requirements are what make Emerging Trajectories different from other systems, like Perplexity, ChatGPT, Claude, and so on. We are neither “better” nor “worse”; we are simply the right fit for specific types of use cases.
With Emerging Trajectories, we do the following...
As the LLM space matures, we also see opportunities for more specialized technologies to enter our own processes and frameworks. For example...
If you're interested in learning more, please contact us!