A Large Language Model Based Pipeline for Review of Systems Entity Recognition from Clinical Notes
Abstract
Objective: Develop a cost-effective, large language model (LLM)-based pipeline for
automatically extracting Review of Systems (ROS) entities from clinical notes.
Materials and Methods: The pipeline first extracts ROS sections using SecTag, then applies
few-shot prompted LLMs to identify ROS entity spans, their positive/negative status, and
associated body systems. We implemented the pipeline using open-source LLMs (Mistral, Llama,
Gemma) and ChatGPT. The evaluation was conducted on 36 general medicine notes
containing 341 annotated ROS entities.
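The few-shot extraction step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `ROSEntity` record, the prompt wording, the two few-shot examples, the JSON output format, and the `llm` callable are all hypothetical, and the input is assumed to be an ROS section that has already been isolated (for example by SecTag).

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ROSEntity:
    span: str    # text of the symptom mention
    status: str  # "positive" or "negative"
    system: str  # associated body system, e.g. "respiratory"


# Hypothetical few-shot examples (illustrative only, not taken from the paper).
FEW_SHOT = [
    ("Denies fever or chills.",
     [{"span": "fever", "status": "negative", "system": "constitutional"},
      {"span": "chills", "status": "negative", "system": "constitutional"}]),
    ("Reports shortness of breath.",
     [{"span": "shortness of breath", "status": "positive", "system": "respiratory"}]),
]


def build_prompt(ros_text: str) -> str:
    """Assemble an instruction, the few-shot examples, and the new ROS text."""
    parts = ["Extract Review of Systems entities as a JSON list "
             "with keys span, status, system."]
    for text, entities in FEW_SHOT:
        parts.append(f"Text: {text}\nEntities: {json.dumps(entities)}")
    parts.append(f"Text: {ros_text}\nEntities:")
    return "\n\n".join(parts)


def extract_entities(ros_text: str, llm: Callable[[str], str]) -> List[ROSEntity]:
    """Call any text-in/text-out model and parse its JSON answer."""
    raw = llm(build_prompt(ros_text))
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []  # unparseable output yields no entities
    if not isinstance(items, list):
        return []
    entities = []
    for item in items:
        if not isinstance(item, dict):
            continue
        span = item.get("span", "")
        # Keep only spans that occur verbatim in the source text,
        # which filters out mentions the model made up.
        if span and span.lower() in ros_text.lower():
            entities.append(
                ROSEntity(span, item.get("status", ""), item.get("system", ""))
            )
    return entities


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    return json.dumps([
        {"span": "chest pain", "status": "negative", "system": "cardiovascular"},
        {"span": "rash", "status": "positive", "system": "skin"},  # not in the text
    ])


if __name__ == "__main__":
    for entity in extract_entities("Denies chest pain or palpitations.", fake_llm):
        print(entity)
```

Passing the model as a plain callable keeps the sketch independent of any particular backend, so a locally hosted open-source model or a commercial API could be swapped in without changing the parsing logic.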
Results: When integrating ChatGPT, the pipeline achieved the lowest error rates in
detecting ROS entity spans and their corresponding statuses/systems (28.2% and 14.5%,
respectively). Open-source LLMs enabled local, cost-efficient execution of the pipeline
with promising performance: span error rates were close to ChatGPT's (30.5–36.7%), while
status/system error rates were higher (24.3–27.3%).
Discussion and Conclusion: Our pipeline offers a scalable and locally deployable solution
to reduce ROS documentation burden. Open-source LLMs present a viable alternative to
commercial models in resource-limited healthcare environments.
Keywords: review of systems, clinical note, natural language processing, large language
model, open-source, LangChain pipeline.