5/9/24

Leveraging AI for Information Extraction from Financial Documents: Lessons from Developing Grant Guardian

The Data Solutions team at the Patrick J. McGovern Foundation develops open-source, public-good data and AI products to foster a collaborative and sustainable community of practice around these solutions. We are currently developing Grant Guardian, an AI solution to streamline financial due diligence for grantmakers evaluating nonprofits. Thorough due diligence is crucial but often difficult because of limited resources, lack of financial expertise, inconsistent reporting standards, and potential biases. Grant Guardian aims to ease these challenges by automatically extracting and analyzing information from nonprofits' financial documents.


In this workshop, we share lessons from researching and developing two key information extraction techniques for this product: 1) providing full-text financial documents as context to large language models (LLMs), and 2) retrieval-augmented generation (RAG), which combines LLMs with information retrieval systems. We discuss the pros and cons of each approach, evaluating factors such as accuracy, cost, and complexity. We also recommend the most suitable technique for different use cases, so that attendees can use AI effectively to extract information from large texts.
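
To make the contrast between the two techniques concrete, here is a minimal sketch of how the prompt sent to an LLM might be assembled in each case. It is an illustration under stated assumptions, not Grant Guardian's implementation: every function name is hypothetical, no LLM is actually called, and a simple bag-of-words cosine similarity stands in for the embedding-based retrieval that a real RAG system would more likely use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_text(text, max_words=40):
    """Split a document into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2):
    """Return the top_k chunks most similar to the question (toy retriever)."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:top_k]


def full_text_prompt(question, document):
    """Technique 1: pass the whole document to the model as context."""
    return f"Context:\n{document}\n\nQuestion: {question}"


def rag_prompt(question, document, top_k=2, max_words=40):
    """Technique 2: pass only the chunks retrieved for this question."""
    chunks = chunk_text(document, max_words)
    context = "\n---\n".join(retrieve(question, chunks, top_k))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

The trade-off discussed above shows up directly in the prompt size: the full-text prompt grows with the document, which affects cost, while the RAG prompt stays bounded by top_k and max_words but adds a retrieval step that can miss the relevant passage.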