Abstract
The U.S. Department of Energy’s Portal and Repository for Information on Marine Renewable Energy (PRIMRE) is an interconnected system of knowledge hubs that provide access to data, information, and other resources for the marine energy community. The PRIMRE team has developed and trained a large language model (LLM) on the metadata and supporting documents associated with data and information from the various PRIMRE knowledge hubs to create AskPRIMRE, an artificial intelligence (AI) research assistant. Building on earlier work to make PRIMRE metadata machine-readable and on a prototype LLM called the Energy Language Model (developed by the National Renewable Energy Laboratory), AskPRIMRE serves as a virtual research assistant to PRIMRE users. It answers a variety of user-provided questions using natural language processing and generative machine learning. Users can get answers to questions about specific datasets, including the equipment, assumptions, and methodologies used to produce the data; answers to basic questions about marine energy based on vetted content; or answers to more abstract questions, such as help with project timelines, international standards, or the applicability of marine energy technologies to other research fields. AskPRIMRE improves the discoverability of marine energy data by guiding users to data and information beyond what simple keyword searches can surface. It enables users to find data based on properties of the data, discover information contained within supporting documents, and explore data from projects related to their research objectives.
This paper outlines the development, training, output, and efficacy of the AskPRIMRE LLM, including improvements that support scientific rigor by increasing the accuracy of generated answers, avoiding speculation, and providing proper references for all resources used.