Scalable biomedical Named Entity Recognition: investigation of a database-supported SVM approach

Published Online: pp 191-208

This paper explores scalability issues associated with Named Entity Recognition (NER) in the biomedical publications domain using Support Vector Machines (SVMs). Performance results obtained with existing binary and multi-class SVMs on increasing amounts of training data are compared with results obtained using our new implementations. Our approach eliminates the need for prior language- or domain-specific knowledge and achieves good out-of-the-box accuracy, comparable to that obtained using more complex approaches. The training time of multi-class SVMs is reduced by several orders of magnitude, which would make support vector machines a more viable and practical solution for real-world problems with large datasets.

Keywords

NER, named entity recognition, SVMs, support vector machines, database extension, bioinformatics, biomedical publications, large datasets