The use of electronic health record (EHR) systems has grown over the past decade, and with it the need to extract information from unstructured clinical narratives. Clinical notes, however, frequently contain acronyms with several potential senses (meanings), and traditional natural language processing (NLP) techniques cannot differentiate between these senses. In this study we introduce an unsupervised method for acronym disambiguation, the task of classifying the correct sense of acronyms in clinical EHR notes. We developed an unsupervised ensemble machine learning algorithm (CASEml) to automatically classify acronyms by leveraging semantic embeddings, visit-level text, and billing information. The algorithm was validated using note data from the Veterans Affairs hospital system to classify the meaning of three acronyms: RA, MS, and MI. We compared the performance of CASEml against another standard unsupervised method and a baseline metric selecting the most frequent acronym sense.

The recognition, disambiguation, and expansion of medical abbreviations and acronyms is of utmost importance to prevent medically dangerous misinterpretation in natural language processing. To support recognition, disambiguation, and expansion, we present the Medical Abbreviation and Acronym Meta-Inventory, a deep database of medical abbreviations. A systematic harmonization of eight source inventories across multiple healthcare specialties and settings identified 104,057 abbreviations with 170,426 corresponding senses. Automated cross-mapping of synonymous records using state-of-the-art machine learning reduced redundancy, which simplifies future application. Additional features include semi-automated quality control to remove errors. The Meta-Inventory demonstrated high completeness or coverage of abbreviations and senses in new clinical text, a substantial improvement over the next largest repository (6–14% increase in abbreviation coverage; 28–52% increase in sense coverage). To our knowledge, the Meta-Inventory is the most complete compilation of medical abbreviations and acronyms in American English to date. The multiple sources and high coverage support application in varied specialties and settings. This allows for cross-institutional natural language processing, which previous inventories did not support.

Measurement(s): Controlled Vocabulary, Linguistic Form. Technology Type(s): digital curation, data combination. Sample Characteristic - Location: United States of America.

Although acronyms and abbreviations in clinical text are used widely on a daily basis, relatively little research has focused upon word sense disambiguation (WSD) of acronyms and abbreviations in the healthcare domain. Since clinical notes have distinctive characteristics, it is unclear whether techniques effective for acronym and abbreviation WSD from biomedical literature are sufficient. The authors discuss feature selection for automated techniques and challenges with WSD of acronyms and abbreviations in the clinical domain. There are significant challenges associated with the informal nature of clinical text, such as typographical errors and incomplete sentences; difficulty with insufficient clinical resources, such as clinical sense inventories; and obstacles with privacy and security for conducting research with clinical text. Although we anticipated that using sophisticated techniques, such as biomedical terminologies, semantic types, part-of-speech, and language modeling, would be needed for feature selection with automated machine learning approaches, we found instead that simple techniques, such as bag-of-words, were quite effective in many cases. Factors, such as majority sense prevalence and the degree of separateness between sense meanings, were also important considerations. The first lesson is that a comprehensive understanding of the unique characteristics of clinical text is important for automatic acronym and abbreviation WSD. The second lesson learned is that investigators may find that using simple approaches is an effective starting point for these tasks. Finally, similar to other WSD tasks, an understanding of baseline majority sense rates and separateness between senses is important. Further studies and practical solutions are needed to better address these issues.
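
To make the two simple ideas mentioned here more concrete (a bag-of-words representation of the words around an acronym, and a majority-sense baseline), the following is a minimal sketch in plain Python. It is not the method of any of the studies summarized above: the tiny note snippets are invented for illustration, the function names are hypothetical, and the classifier is an ordinary Naive Bayes with add-one smoothing chosen only because it is short.

```python
from collections import Counter, defaultdict
import math
import re

# Invented toy contexts for the acronym "MS" (assumption: not real clinical data).
TOY_NOTES = [
    ("MS with new brain lesions on MRI, continues interferon", "multiple sclerosis"),
    ("relapsing MS, neurology follow up for optic neuritis", "multiple sclerosis"),
    ("MS flare with leg weakness, MRI ordered", "multiple sclerosis"),
    ("echo shows severe MS with valve area 1.0", "mitral stenosis"),
    ("rheumatic MS, diastolic murmur, cardiology consult", "mitral stenosis"),
]

def bag_of_words(context):
    """Lowercase the context and count alphabetic tokens; word order is discarded."""
    return Counter(re.findall(r"[a-z]+", context.lower()))

def majority_sense(labeled):
    """Baseline: always predict the most frequent sense in the labeled examples."""
    return Counter(sense for _, sense in labeled).most_common(1)[0][0]

def train_naive_bayes(labeled):
    """Collect per-sense word counts, sense frequencies, and the vocabulary."""
    word_counts = defaultdict(Counter)
    sense_counts = Counter()
    for context, sense in labeled:
        word_counts[sense].update(bag_of_words(context))
        sense_counts[sense] += 1
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, sense_counts, vocab

def predict_sense(model, context):
    """Pick the sense with the highest smoothed log-probability for the context."""
    word_counts, sense_counts, vocab = model
    total = sum(sense_counts.values())
    features = bag_of_words(context)
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        denom = sum(word_counts[sense].values()) + len(vocab)
        score = math.log(n / total)  # log prior of the sense
        for word, k in features.items():
            if word in vocab:  # words never seen in training are ignored
                score += k * math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

model = train_naive_bayes(TOY_NOTES)
baseline = majority_sense(TOY_NOTES)
cardiac = predict_sense(model, "echo with diastolic murmur consistent with MS")
neuro = predict_sense(model, "MRI brain lesions, MS relapsing")
```

On this toy data the baseline always answers with the more frequent sense, while the bag-of-words classifier can switch to the minority sense when the surrounding words point that way. Comparing any classifier against the majority-sense rate is what shows whether the context features add anything.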