TamPub: Tampere University Institutional Repository

Automatic keyphrase extraction on Amazon reviews

Chen, Ruiqi (2018)

 
 




Degree Programme in Computer Sciences
Faculty of Natural Sciences
This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
Date of approval
2018-07-23
The permanent address of the publication is
http://urn.fi/URN:NBN:fi:uta-201808032341
Abstract
Big data poses severe challenges. Product reviews, an important type of online text, have attracted much research interest because of their commercial potential. This thesis focuses on Amazon camera reviews and implements an automatic keyphrase extraction system. The system consists of three modules: the Crawler module, the Extraction module, and the Web module. The Crawler module captures Amazon product reviews. The Web module obtains user input and displays the final results. The Extraction module is the core processing module of the system and analyzes product reviews in the following sequence: (1) Pre-processing of review data, including removal of stop words and segmentation. (2) Candidate keyphrase extraction. The spaCy part-of-speech tagger and dependency parser produce the dependency relationships of each review sentence, and the feature and opinion words are then extracted based on several predefined dependency rules. (3) Candidate keyphrase clustering. A Latent Dirichlet Allocation (LDA) model clusters the candidate keyphrases according to their topics. (4) Candidate keyphrase ranking. Two different algorithms, LDA-TFIDF and LDA-MT, rank the keyphrases within each cluster to select the representative keyphrases. The experimental results show that the system performs well in the task of keyphrase extraction.
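Steps (2) and (4) of the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. The two dependency rules, the `Token` type, and the function names are hypothetical; the thesis's actual rules are not given in this record. The parses are written by hand using spaCy-style labels (`amod`, `nsubj`, `acomp`) so that the example runs without any model. The LDA step is skipped and the clusters are assigned by hand, and a plain smoothed TF-IDF score stands in for LDA-TFIDF and LDA-MT, whose definitions are not stated here.

```python
import math
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    text: str  # lower-cased word
    pos: str   # coarse part-of-speech tag, e.g. "NOUN", "ADJ"
    dep: str   # dependency label, e.g. "amod", "nsubj", "acomp"
    head: int  # index of the head token within the same sentence


def extract_pairs(sentence):
    """Return (feature, opinion) pairs matched by two illustrative rules."""
    pairs = []
    for tok in sentence:
        # Rule 1 (assumed): adjective directly modifying a noun ("great lens").
        if tok.dep == "amod" and tok.pos == "ADJ" and sentence[tok.head].pos == "NOUN":
            pairs.append((sentence[tok.head].text, tok.text))
        # Rule 2 (assumed): adjectival complement whose verb has a noun
        # subject ("the battery is weak").
        if tok.dep == "acomp" and tok.pos == "ADJ":
            for other in sentence:
                if other.dep == "nsubj" and other.head == tok.head and other.pos == "NOUN":
                    pairs.append((other.text, tok.text))
    return pairs


def rank_by_tfidf(clusters):
    """Rank phrases inside each cluster, treating each cluster as one document."""
    n = len(clusters)
    # Number of clusters in which each phrase occurs at least once.
    doc_freq = Counter(p for phrases in clusters.values() for p in set(phrases))
    ranked = {}
    for name, phrases in clusters.items():
        counts = Counter(phrases)
        scores = {
            p: (c / len(phrases)) * (math.log((1 + n) / (1 + doc_freq[p])) + 1)
            for p, c in counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked[name] = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked


# Hand-written parses of "great lens" and "the battery is weak".
great_lens = [Token("great", "ADJ", "amod", 1), Token("lens", "NOUN", "ROOT", 1)]
battery_weak = [
    Token("the", "DET", "det", 1),
    Token("battery", "NOUN", "nsubj", 2),
    Token("is", "AUX", "ROOT", 2),
    Token("weak", "ADJ", "acomp", 2),
]

# Hand-assigned clusters standing in for the output of the LDA step.
clusters = {
    "optics": ["lens great", "lens great", "picture sharp"],
    "power": ["battery weak", "picture sharp"],
}
ranking = rank_by_tfidf(clusters)
```

In this sketch a phrase that appears in every cluster, such as "picture sharp", receives a lower weight than a phrase specific to one cluster, which is the general motivation for TF-IDF-style ranking of cluster representatives.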
Collections
  • Master's theses (Pro gradut) [22391]
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[at]uta.fi | Contact | Privacy
 

 
