NIST leads team in efforts to improve COVID-19 search engines

April 15, 2020

The U.S. Department of Commerce’s National Institute of Standards and Technology (NIST) and the White House Office of Science and Technology Policy (OSTP) launched a joint effort to support the development of search engines for research that will help in the fight against COVID-19. The project was developed in response to the March 16 White House Call to Action to the Tech Community on New Machine Readable COVID-19 Dataset.


In this effort, NIST will work initially with the Allen Institute for Artificial Intelligence, the National Library of Medicine, Oregon Health & Science University (OHSU), and the University of Texas Health Science Center at Houston (UT Health). The team will apply the successful, long-running program of expert engagement and technology assessment called the Text Retrieval Conference (TREC) to the COVID-19 Open Research Dataset (CORD-19), a resource of more than 44,000 research articles and related data about COVID-19 and the coronavirus family of viruses. The goals of the TREC-COVID program include creating datasets and using an independent assessment process to help search engine developers evaluate and optimize how well their systems meet the needs of the research and health-care communities.


“The TREC program has provided an effective way to evaluate and advance search engine technologies since 1992, and has led directly to the powerful search capabilities and internet-based efficiencies we now often take for granted,” said Under Secretary of Commerce for Standards and Technology and NIST Director Walter G. Copan. “We are pleased to apply this infrastructure to the challenge of working with massive amounts of data to help researchers better understand and ultimately to combat this deadly novel coronavirus and related threats.”


The team will first release a series of sample queries for the biomedical research community, developed by team members at the National Library of Medicine, OHSU and UT Health. Registered participants in TREC-COVID will use their information retrieval and search systems to run the queries against the CORD-19 document set and return their results to NIST. Biomedical experts will then review test results, including document relevance rankings, to assess the overall performance of the retrieval systems.


Using proven TREC protocols, NIST will score the submissions and post the scores, the retrieval results themselves, and the lists of key reference documents to the TREC-COVID website. These “test collections” can then be used by information retrieval researchers to evaluate and enhance the performance of their own search engines. This effort is intended to help researchers understand how search systems could best support medical researchers when available information is developing quickly, as in the current pandemic.
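
To make the scoring step concrete, here is a minimal Python sketch of how a ranked result list might be scored against expert relevance judgments. It is an illustration only, not NIST's scoring procedure: the choice of metric (precision at k), the function names, the topic and document identifiers, and the sample data are all assumptions made for this example.

```python
from typing import Dict, List, Set


def precision_at_k(ranked_doc_ids: List[str], relevant_doc_ids: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that assessors judged relevant."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_doc_ids)
    return hits / k


def score_run(run: Dict[str, List[str]], judgments: Dict[str, Set[str]], k: int = 5) -> Dict[str, float]:
    """Score every query (topic) in one submitted run.

    A topic with no judged-relevant documents scores 0.0.
    """
    return {
        topic: precision_at_k(ranking, judgments.get(topic, set()), k)
        for topic, ranking in run.items()
    }


def mean_score(per_topic: Dict[str, float]) -> float:
    """Average the per-topic scores into a single number for the run."""
    return sum(per_topic.values()) / len(per_topic) if per_topic else 0.0


# Hypothetical data: one system's ranked results for two sample queries,
# and the documents that expert assessors marked as relevant.
example_run = {
    "topic-1": ["d3", "d1", "d7", "d2", "d9"],
    "topic-2": ["d4", "d8", "d5", "d6", "d0"],
}
example_judgments = {
    "topic-1": {"d1", "d2", "d5"},
    "topic-2": {"d4"},
}

per_topic_scores = score_run(example_run, example_judgments, k=5)
overall_score = mean_score(per_topic_scores)
```

Because the relevance judgments are kept separate from any one system's output, the same judgments can be reused to score other search engines later, which is what makes a test collection reusable.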


Interested organizations are invited to register to participate in the TREC-COVID program on the NIST website.


Read more: https://www.nist.gov/news-events/news/2020/04/nist-and-ostp-launch-effor...