Toronto Metropolitan University

Extracting Source Information From News Articles

thesis
posted on 2024-03-18, 16:58 authored by Tabassum Sultana
A news article combines information, facts, sources, reporters’ findings, and viewpoints. The credibility of news depends in part on its sources and on how they are attributed. Many researchers have identified and attributed news sources to assess relevance and reliability. The present work aims to build reliable software that can help journalists, researchers, or anyone curious about news contributors. First, we propose software that extracts contributor names and features describing the sources of information. Second, we use classification algorithms to assign sources to three categories, namely AUT (authority), EXP (expert), and OTH (others), as a first step in assessing the balance and breadth of the sourcing in a news article. Our results suggest that the software could perform 6-class categorization of sources accurately, given a more balanced data set. Preliminary software testing showed a recall of 73% and an accuracy of 95% when identifying the source, and an overall accuracy of 78% when categorizing the source.
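
To make the two steps in the abstract concrete, the following is a minimal illustrative sketch, not the method used in the thesis. It assumes a simple regular-expression pattern for attribution phrases ("said X, a ..." and "according to X, the ...") and a hypothetical keyword lookup in place of the trained classification algorithms the abstract mentions. The names `extract_sources`, `categorize`, `AUTHORITY_WORDS`, and `EXPERT_WORDS` are invented for this example.

```python
import re
from typing import NamedTuple


class Source(NamedTuple):
    name: str        # contributor name as it appears in the text
    descriptor: str  # descriptive phrase following the name, if any
    category: str    # one of "AUT", "EXP", "OTH"


# Hypothetical keyword lists; a real system would learn these from labeled data.
AUTHORITY_WORDS = {"minister", "mayor", "officer", "spokesperson", "chief", "senator"}
EXPERT_WORDS = {"professor", "researcher", "analyst", "scientist", "economist", "doctor"}

# Assumed attribution pattern: a reporting phrase, a capitalized multi-word name,
# and an optional descriptor that runs up to the next comma or period.
PATTERN = re.compile(
    r"(?:said|according to)\s+"
    r"(?P<name>[A-Z][a-z]+(?:\s[A-Z][a-z]+)+)"
    r"(?:,\s+(?P<desc>[^,.]+))?"
)


def categorize(descriptor: str) -> str:
    """Assign AUT, EXP, or OTH based on keywords in the descriptor."""
    words = set(re.findall(r"[a-z]+", descriptor.lower()))
    if words & AUTHORITY_WORDS:
        return "AUT"
    if words & EXPERT_WORDS:
        return "EXP"
    return "OTH"


def extract_sources(text: str) -> list[Source]:
    """Step 1: find attributed names and descriptors. Step 2: categorize each."""
    sources = []
    for match in PATTERN.finditer(text):
        desc = (match.group("desc") or "").strip()
        sources.append(Source(match.group("name"), desc, categorize(desc)))
    return sources


if __name__ == "__main__":
    article = (
        '"Prices will rise," said Jane Doe, a professor of economics at a local '
        "university. The road will reopen Monday, according to John Smith, the "
        'city police chief. "I just want to get home," said Maria Lopez.'
    )
    for source in extract_sources(article):
        print(source.name, source.category)
```

On the sample text, the sketch labels Jane Doe as EXP, John Smith as AUT, and Maria Lopez as OTH. A pattern this narrow misses most real attributions (pronouns, titles before names, indirect speech), which is why recall is a natural metric to report for the identification step.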

Language

eng

Degree

  • Master of Science

Program

  • Computer Science

Granting Institution

Ryerson University

LAC Thesis Type

  • Thesis

Thesis Advisor

Eric Harley

Year

2022
