Toronto Metropolitan University

Understanding, Interpreting and Learning Representations in Deep Neural Networks

posted on 2024-03-18, 18:08 authored by Md Amirul Islam

Deep Neural Networks (DNNs) have achieved state-of-the-art results in many computer vision tasks; however, they have faced criticism for their lack of interpretability. Given the pervasiveness of DNNs in a multitude of applications, it is of paramount importance to fully understand their internal representations and behaviour, since safe and comprehensible use of DNN models is required before they are incorporated into decision-making processes for real-world applications. In this dissertation, we present several contributions towards understanding, interpreting, and learning representations in DNNs, with an emphasis on studying absolute position information, interpreting latent representations to estimate certain semantic concepts, and learning robust representations. First, we study how much absolute position information is encoded in Convolutional Neural Networks (CNNs), as well as the source of this information. Our experiments reveal that a surprising degree of absolute position information is encoded in commonly used CNNs and that zero padding enables CNNs to encode position information. Next, we analyze the relationship between boundary effects and padding in CNNs with respect to absolute position information. We also demonstrate how a CNN contains positional information in its latent representations when the forward pass includes a global pooling layer. We demonstrate that absolute position information is encoded based on the ordering of the channel dimensions, while semantic information is largely not. Second, we perform an empirical study of the ability of DNNs to encode shape information at the neuron-to-neuron and per-pixel levels and show evidence that, while DNNs rely on texture information to recognize an object, a substantial amount of shape information is also encoded in DNNs. We further propose a new objective function for increasing a DNN's ability to encode shape information by maximizing the mutual information between a network's representations of two stylized images that share the same shape. Finally, we study the feature binding problem and present the first work that applies image blending to learn a robust representation for dense image labeling. Overall, we strongly believe that the findings and demonstrated applications in this dissertation will benefit research areas concerned with understanding the different properties of DNNs.
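The abstract's point that zero padding lets a CNN encode position can be illustrated with a toy example. The sketch below is not taken from the dissertation: the helper `conv2d_same`, the 5x5 all-ones image, and the 3x3 all-ones kernel are hypothetical choices made only for illustration. It shows that a single convolution over a perfectly uniform image produces position-dependent outputs once the border is zero padded, whereas replicate (edge) padding leaves the output uniform for this particular input.

```python
import numpy as np


def conv2d_same(image, kernel, pad_mode):
    """Naive 'same'-size 2D cross-correlation (hypothetical helper).

    pad_mode is "zero" for zero padding, anything else for replicate padding.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    if pad_mode == "zero":
        padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant",
                        constant_values=0.0)
    else:
        padded = np.pad(image, ((ph, ph), (pw, pw)), mode="edge")
    out = np.empty_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # Sum of the kernel-weighted window centred on pixel (i, j).
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out


# A uniform image carries no positional cue in its content.
image = np.ones((5, 5))
kernel = np.ones((3, 3))

zero_out = conv2d_same(image, kernel, "zero")
edge_out = conv2d_same(image, kernel, "edge")

# With zero padding, corners, borders, and interior respond differently,
# so the output value alone reveals where a pixel sits relative to the border.
print(zero_out)
# With replicate padding, this uniform input yields a uniform output.
print(edge_out)
```

In this toy setting the zero-padded output takes different values at corners, edges, and interior pixels. This only demonstrates the boundary effect that padding introduces; it makes no claim about how the trained networks studied in the dissertation use that signal.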





  • Doctor of Philosophy


  • Computer Science

Granting Institution

Ryerson University

LAC Thesis Type

  • Dissertation

Thesis Advisor

Nariman Farsad & Neil Bruce


