This week, RIT researcher Shamil Chollampatt presented his co-authored paper, "" at EMNLP 2020, a leading venue for publishing natural language processing research. Being yet another event that was held virtually, EMNLP had put together a number of platforms to replicate the actual conference experience as closely as possible from virtual spaces that enabled informal meetups and poster sessions, Zoom for talks and Q&A sessions, and Rocket.Chat for paper discussions.
Shamil presented RIT's paper in the machine translation track of the main conference, which was attended by a wide range of researchers from industry and academia. (The presentation can be found .) Shamil's research investigated the usefulness of automatically editing the output of a high-quality neural machine translation using an additional deep neural network. Previous studies have had limited success with this approach, which the paper attributes to the lack of adequate training data. To mitigate the lack of publicly available high-quality data for automatic post-editing, RIT's publication was accompanied by the free release of a larger scale . The datasets were collected from the subtitle translations and post-edits from Rakuten's video streaming service. The paper was the outcome of RIT's sustained efforts to push the frontiers of machine translation over the past several years, led by the RIT Singapore team and involving co-authors of the paper, Raymond Susanto, Liling Tan, and Ewa Szymanska.
The Conference of Machine Translation (WMT 2020), which was collocated with EMNLP, also featured exciting on automatic-post-editing, incorporating terminology constraints based on RIT's ACL 2020 paper: . Continuing the trend of recent NLP conferences, a lot of work focused on pre-trained language model representations for multiple NLP tasks. A major push in this direction over the last couple of years has been due to the Python library (which also won the award at EMNLP 2020). Besides, topics like building interpretable NLP models are becoming prominent, as highlighted in one of the . Another topic that is getting much attention in the community has been to build fair, ethical, and unbiased AI, which was the major discussion point of the . EMNLP 2020 also had workshops and papers that promoted building environmentally-sustainable and privacy-preserving AI models. Overall, EMNLP 2020 shed light on many under-explored avenues of research in machine translation and NLP.
RIT would like to thank EMNLP and all attendees.
RIT’s Satoshi Egi lectures on Pattern-Match-Oriented Programming Style at Programming 2021
RIT Paris and Boston's Collaborative Research at European Conference on Information Retrieval 2021
Rakuten France: Multi-modal Product Dataset now available at Rakuten Data Release
The 2021 Rakuten Data Challenge