My name is Yue Wang (王玥, wáng yuè, in Chinese). I am a sixth-year Ph.D. candidate working with Prof. Hui Fang at InfoLab in the ECE Department at the University of Delaware. Our group works on topics related to information management, such as information retrieval, knowledge bases, data mining, and biomedical informatics.
I received my Master's degree in Electrical Engineering from the Mads Clausen Institute at the University of Southern Denmark. My Master's thesis, "Sensor Networks in Biomechatronics - Hexapod Robot", was supervised by Prof. Arne Bilberg and was part of the EU-funded EMICAB project. I received my Bachelor's degree in Electrical Engineering from Beijing University of Technology in 2008. I also received a Bachelor's degree in Information Technology from Mikkeli University of Applied Sciences in Finland.
I have a wide range of interests. Traveling, hiking, and photography are my hobbies.
- Yue Wang and Hui Fang. Extracting Useful Information from Clinical Notes. In Proceedings of the 2016 Text REtrieval Conference, 2016. (TREC'16) [pdf]
- Yue Wang, Xitong Liu and Hui Fang. A Study of Concept-based Weighting Regularization for Medical Records Search. To appear in the 52nd Annual Meeting of the Association for Computational Linguistics, 2014. (ACL'14 Acceptance Rate: 26.2%) [pdf]
- Yue Wang, Hao Wu and Hui Fang. An Exploration of Tie-Breaking for Microblog Retrieval. In Proceedings of the 36th European Conference on Information Retrieval, 2014. (ECIR'14) (short paper) [pdf]
- Yue Wang and Hui Fang. Exploring the Query Expansion Methods for Concept Based Representation. In Proceedings of the 2014 Text REtrieval Conference, 2014. (TREC'14) [pdf]
- Yue Wang, Jerry Darko and Hui Fang. Tie-breaker: A New Perspective of Ranking and Evaluation for Microblog Retrieval. In Proceedings of the 2013 Text REtrieval Conference, 2013. (TREC'13) [pdf]
- Yue Wang, Irene Manotas, Kristina Winbladh and Hui Fang. Automatic Detection of Ambiguous Terminology for Software Requirements. In Proceedings of the 18th International Conference on Application of Natural Language to Information Systems, 2013. (NLDB'13) [pdf]
- Miguel A. Callejas P, Yue Wang and Hui Fang. Exploiting Domain Thesaurus for Medical Record Retrieval. In Proceedings of the 2012 Text REtrieval Conference, 2012. (TREC'12, No.6 group in 2012 TREC Medical Records track) [pdf]
Mining Mobile Apps for Early Bug Identification
2015.9 ~ (ongoing)
- Task: identify the sentences that describe buggy features of mobile apps.
- Challenges: with limited training resources, given a review of a mobile app, we are trying to verify whether the review reports a bug. If so, we want to identify which sentences describe the bug and which type of bug the review reports.
Integrated Search System for JPMC
2014.8 ~ 2015.11
- Task: develop an integrated search demo system with a team at JP Morgan Chase.
- Challenges: search objects across different domains in an integrated way, and identify concepts with similar semantic meanings from different resources.
- Solutions: we built an integrated search system on top of Solr and MongoDB, which could automatically identify similar terms in each domain, convert natural language search queries into SQL-style queries, and then perform the retrieval task.
Medical Domain Retrieval System
2012.9 ~ 2014.6
- Task: identify patients matching a set of clinical criteria based on their medical records, for research purposes.
- Challenges: correctly identify and match the clinical terms for a disease, and handle negation in natural language.
- Solutions: we first converted the term-based representation to a concept-based representation, and then proposed two weighting regularization methods to overcome the inaccurate mappings generated by the NLP tool.
- Achievements: our initial system ranked 6th out of 88 submitted systems in the TREC Medical Record Retrieval Track 2012. The improved system later achieved performance comparable to state-of-the-art methods in TREC 2012 while using fewer external resources and less processing time.
2013.3 ~ 2013.12
- Task: build a real-time ad-hoc retrieval system for a tweet collection.
- Challenges: tweets are shorter than normal documents, so traditional retrieval signals may not work well. In addition, no future information may be used in the system because tweets are time-sensitive.
- Solutions: we extended the tie-breaking framework with query expansion and document expansion techniques.
- Achievements: our system would rank among the top 3 groups in the TREC Microblog Track 2012.
Software Requirement Specification Disambiguation
2011.10 ~ 2012.8
- Task: identify potentially ambiguous concepts in software requirements specifications.
- Challenges: the candidate concepts may not have a clear definition, and the total number of ambiguous concepts varies from project to project.
- Solutions: we proposed two feature-based information retrieval techniques to rank all the important concepts based on their ambiguity scores.
- Achievements: our paper is one of the first that aims to detect ambiguous terminology in software requirements specifications. Experimental results on four real-world data sets show that the proposed methods are effective.