
Enhanced semantic expansion for question classification

Published Online: pp 134-148, https://doi.org/10.1504/IJITST.2011.039774

Most question answering systems are built around three research themes: question classification and analysis, document retrieval, and answer extraction. The performance of each stage affects the final result. Responding correctly to a question given a large collection of textual data is not an easy task. The question must be understood at a level that makes it possible to detect the constraints it imposes on possible answers. Question classification is an important task because it infers the type of the expected answer. Its purpose is to provide additional information that reduces the gap between question and answer. This paper presents a method that improves question classification performance by combining linguistic analysis with statistical approaches. It also proposes two methods of question expansion. Various question representations, term weighting schemes and machine learning algorithms are studied. Experiments conducted on real data are presented. Of particular interest is the improvement in question classification precision.

Keywords

classification, feature selection, semantic expansion, mutual information, MI, machine learning, text mining
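
The keywords name mutual information (MI) as a feature selection criterion. As a rough illustration only, the sketch below scores how informative a term is about a question class by computing the MI between term presence and class membership. The toy questions, the labels, and the function name `mutual_information` are hypothetical and are not taken from the paper, whose exact formulation may differ.

```python
import math
from collections import Counter

# Hypothetical toy data: (question, expected answer type). Not from the paper.
QUESTIONS = [
    ("who wrote hamlet", "HUMAN"),
    ("who discovered penicillin", "HUMAN"),
    ("where is the eiffel tower", "LOCATION"),
    ("where was mozart born", "LOCATION"),
]


def mutual_information(term, label, data):
    """MI in bits between 'term occurs in question' and 'question has label'."""
    n = len(data)
    # Count the four (term present?, class matches?) combinations.
    joint = Counter()
    for text, y in data:
        joint[(term in text.split(), y == label)] += 1

    mi = 0.0
    for (t, c), n_tc in joint.items():
        # Marginal counts for this term value and this class value.
        n_t = sum(v for (tt, _), v in joint.items() if tt == t)
        n_c = sum(v for (_, cc), v in joint.items() if cc == c)
        # Only observed combinations are iterated, so n_tc > 0 here.
        mi += (n_tc / n) * math.log2(n * n_tc / (n_t * n_c))
    return mi


if __name__ == "__main__":
    for term in ("who", "the", "hamlet"):
        print(term, round(mutual_information(term, "HUMAN", QUESTIONS), 3))
```

In this toy set, "who" occurs in exactly the HUMAN questions and therefore scores higher than "the", which occurs in a single LOCATION question. A feature selection step could keep only the highest-scoring terms per class before training a classifier.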
