An approach to estimating cited sentences in academic papers using Doc2vec

Shunsuke Tanabe, Atsuhiro Takasu, Manabu Ohta, Jun Adachi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Most academic authors refer to the literature when introducing their proposed methods and the data used in their experiments. These references can be very helpful for understanding a paper; however, authors do not always state clearly which part of the referenced work they are pointing the reader to, and reading the whole document to identify the relevant information can be quite labor-intensive. In this paper, we propose a method for estimating the appropriate parts of a referenced work, the “cited parts,” with the aim of reducing this burden. We first extract the sentences in an academic paper that cite the literature as “citing sentences.” We then vectorize the citing sentences and all the sentences in the cited papers using doc2vec, and estimate the most appropriate cited part as the sentence whose feature vector is most similar to that of the citing sentence. To evaluate the proposed method, we conducted experiments using English-language papers and a questionnaire survey that asked subjects to evaluate the appropriateness of the cited parts estimated by the method. The experiments showed that the method’s success in estimating appropriate cited parts depended on the citation intention of the citing sentences.
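
The selection step described in the abstract (embed the citing sentence and every sentence of the cited paper, then pick the nearest one) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the similarity measure, so cosine similarity is assumed here, and a toy bag-of-words vectorizer (`bow_embed`, a hypothetical helper) stands in for doc2vec so that the example runs without a trained model. All function names are illustrative.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Sparse vector: token -> weight. A real doc2vec vector would be dense;
# this sparse form is only a stand-in for illustration.
Vector = Dict[str, float]


def tokenize(sentence: str) -> List[str]:
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def bow_embed(sentence: str) -> Vector:
    """Toy embedding: raw term counts (assumption, NOT doc2vec)."""
    return {tok: float(n) for tok, n in Counter(tokenize(sentence)).items()}


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def estimate_cited_sentence(
    citing_sentence: str,
    cited_sentences: List[str],
    embed: Callable[[str], Vector] = bow_embed,
) -> Tuple[int, float]:
    """Return (index, similarity) of the cited sentence nearest to the citing one."""
    if not cited_sentences:
        raise ValueError("cited_sentences must not be empty")
    query = embed(citing_sentence)
    scores = [cosine(query, embed(s)) for s in cited_sentences]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    # Made-up sentences for illustration only; not data from the paper.
    citing = "We train sentence vectors with the paragraph vector model [3]."
    cited = [
        "Bag-of-words features lose the ordering of the words.",
        "We propose the paragraph vector model, which learns vectors for sentences and documents.",
        "Experiments were run on a sentiment dataset.",
    ]
    index, score = estimate_cited_sentence(citing, cited)
    print(index, round(score, 3))
```

To move closer to the setup the abstract describes, the `embed` argument could wrap a trained doc2vec model (for example, gensim's `Doc2Vec.infer_vector` applied to the tokenized sentence), with `cosine` adapted to dense vectors; how the authors trained and configured their model is not stated in this record.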

Original language: English
Title of host publication: MEDES 2018 - 10th International Conference on Management of Digital EcoSystems
Publisher: Association for Computing Machinery, Inc
Pages: 118-125
Number of pages: 8
ISBN (Electronic): 9781450356220
DOIs: https://doi.org/10.1145/3281375.3281391
Publication status: Published - Sep 25 2018
Event: 10th International Conference on Management of Digital EcoSystems, MEDES 2018 - Tokyo, Japan
Duration: Sep 25 2018 to Sep 28 2018

Other

Other: 10th International Conference on Management of Digital EcoSystems, MEDES 2018
Country: Japan
City: Tokyo
Period: 9/25/18 to 9/28/18

Keywords

  • Academic paper
  • Browsing support
  • Citation
  • Doc2vec
  • Reference

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Networks and Communications
  • Environmental Engineering

Cite this

Tanabe, S., Takasu, A., Ohta, M., & Adachi, J. (2018). An approach to estimating cited sentences in academic papers using Doc2vec. In MEDES 2018 - 10th International Conference on Management of Digital EcoSystems (pp. 118-125). Association for Computing Machinery, Inc. https://doi.org/10.1145/3281375.3281391