Valais/Wallis AI Workshop 5th edition


08:45 - 09:00
09:00 - 10:00
Keynote speech: Prof. Pena Carlos Andrés, HEIG-VD
Methods for Rule and Knowledge Extraction from Deep Neural Networks

Abstract: Artificial deep neural networks are a powerful tool, able to extract information from large datasets and, using this acquired knowledge, make accurate predictions on previously unseen data. As a result, they are being applied in a wide variety of domains, ranging from genomics to autonomous driving and from speech recognition to gaming. Many areas where neural network-based solutions can be applied require validation, or at least some explanation, of how the system makes its decisions. This is especially true in the medical domain, where such decisions can contribute to the survival or death of a patient. Unfortunately, the very large number of parameters required by deep neural networks is extremely challenging for explanation methods to cope with, and these networks remain for the most part black boxes. This demonstrates the real need for accurate explanation methods that scale to this large number of parameters and provide useful information to a potential user. Our research aims at providing tools and methods to improve the interpretability of deep neural networks.

10:00 - 10:15
Hannah Muckenhirn, Idiap Research Institute
Visualizing and understanding raw speech modeling with convolutional neural networks

Abstract: A recent trend in audio and speech processing consists in training neural networks on raw waveforms for various classification tasks. While this approach has been shown to perform well, there is limited understanding of what kind of information the neural networks learn from the waveforms. Such insight is of interest not only for advancing those techniques but also for better understanding audio and speech signal characteristics.
In this talk, taking inspiration from the vision community, I will present a gradient-based visualization method that can provide insight into which spectral characteristics of a given input have the highest impact on the prediction score. I will demonstrate the potential of the proposed approach on two classification tasks: phoneme recognition and speaker identification.
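To make the idea of a gradient-based view of spectral relevance concrete, here is a minimal sketch. It is not the method presented in the talk: the one-filter "classifier", the function names (`score`, `score_gradient`, `relevance_spectrum`) and all signal parameters are assumptions chosen for illustration. The sketch differentiates a score with respect to the raw waveform and then inspects the magnitude spectrum of that gradient.

```python
import numpy as np

def score(x, w):
    """Toy differentiable score (assumption): squared response of one filter."""
    return float(np.dot(w, x) ** 2)

def score_gradient(x, w):
    """Analytic gradient of the toy score with respect to the waveform x."""
    return 2.0 * np.dot(w, x) * w

def relevance_spectrum(x, w, sample_rate):
    """Magnitude spectrum of the input gradient and its frequency axis in Hz."""
    grad = score_gradient(x, w)
    magnitude = np.abs(np.fft.rfft(grad))
    freqs = np.fft.rfftfreq(len(grad), d=1.0 / sample_rate)
    return freqs, magnitude

# Hypothetical setup: a 50 ms frame at 8 kHz and a filter tuned to 1000 Hz.
sample_rate = 8000
t = np.arange(400) / sample_rate
w = np.sin(2 * np.pi * 1000 * t)
x = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 300 * t)

freqs, magnitude = relevance_spectrum(x, w, sample_rate)
peak_hz = freqs[np.argmax(magnitude)]  # frequency with the largest gradient energy
```

In this toy case the gradient spectrum peaks at the frequency the filter responds to, not at the 300 Hz component that is also present in the input. With a real network, the gradient would come from automatic differentiation instead of a closed-form expression.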

10:15 - 10:30
Mara Graziani, HES-SO Valais-Wallis
Concept Measures to Explain Deep Learning Predictions in Medical Imaging

Abstract: Human decisions are based on different parameters than those used in machine learning algorithms. The internal features of Deep Neural Networks (DNNs), for example, may have no semantic meaning and appear rather incomprehensible to us. By contrast, visual features such as shapes, textures or colors are easier to interpret, as they semantically describe the image. This talk will focus on how to generate human-centric explanations by attributing the network decisions to arbitrary concepts rather than to input pixels or internal features. In particular, continuous measures are used to quantify the presence of a concept in the input image, and this quantity is then used to explain the network. In this way, explanations relate to the high-level semantics of the application domain and are more objective than traditional visual explanations. For instance, results on medical imaging tasks show that the latent space of DNNs can be interpreted using the semantics and clinical parameters used by clinicians.

10:30 - 10:45
Suraj Srinivas, Idiap Research Institute
What do neural network saliency maps encode?

Abstract: Saliency methods are popular explanatory tools that help identify the input features that most influence the predictions of a deep neural network. However, many recent methods do not ground saliency maps in the underlying neural net function. As a result, visually appealing saliency maps can encode aspects of the model that are not important to decision making, or directly recover parts of the underlying image. We propose the full-Jacobian visualization, which grounds saliency maps precisely in the neural net function outputs. This produces sharper saliency maps and passes sanity checks, showing that these maps indeed recover aspects of the model that are important to decision making.

10:45 - 11:00
Dr Vincent Andrearczyk, HES-SO Valais-Wallis
Transparency of rotation-equivariant CNNs via local geometric priors

Abstract: The weight sharing and local connectivity of Convolutional Neural Networks (CNNs), as compared to densely connected networks, bring a degree of decomposability and algorithmic transparency, together with an efficient design and the sought-after equivariance to translation. Various visualization and interpretability methods were derived from this design, attempting to pierce the black-box character of deep CNNs. Convolutional filters are, however, selective to scale and rotation (among other transformations), so CNNs lack transparency: hidden features vary with such geometric transformations of the images. In recent works, such priors on data symmetries have been exploited by hard-coding equivariance/invariance in the CNN architecture. Besides simplifying the learning process, the enforced geometric structure of the hidden features improves the transparency of the network, as transformations of the inputs result in predictable transformations of the activations. Part of this talk will focus on invariance to local rotation, as opposed to the generally desired global rotation invariance (e.g. in object detection and classification tasks). This local invariance is fundamental in medical imaging, among other domains, where local structures of tissues occur at arbitrary rotations. In particular, our recent work exploits 3D steerable filters to efficiently obtain this invariance. Results on 3D synthetic textures and pulmonary nodule classification in CT show improved performance over a standard 3D CNN as well as a reduction in the number of trainable parameters.

11:00 - 11:30
11:30 - 11:45
Dr Sylvain Calinon, Idiap Research Institute
Interpretable models of robot motion learned from few demonstrations

Abstract: Many human-centered robotics applications would benefit from the development of robots that can acquire new skills by interaction with humans. Such interaction requires that the skills learned by the robots can be interpreted by the users. Such transparency allows the user to dynamically assess the learning progress. In this way, the user can provide new data (in the form of demonstrations or corrections) that specifically target the current shortfalls in the model of the skill to be acquired.

Such iterative learning challenges require the development of intuitive interfaces to acquire meaningful demonstrations, the development of movement representations that can exploit the structure and geometry of the acquired data in an efficient and interpretable way, and the development of control techniques that can exploit the possible variations and coordinations in movements. The developed models need to serve several purposes (recognition, prediction, generation), and be compatible with different learning strategies (imitation, exploration).

I will illustrate these challenges with various applications, including robots that are close to us (human-robot collaboration, robots for dressing assistance), part of us (prosthetic hand control from EMG and tactile sensing), or far from us (teleoperation of a bimanual robot in deep water).

Biography: Dr Sylvain Calinon is a Senior Researcher at the Idiap Research Institute and a Lecturer at the Ecole Polytechnique Federale de Lausanne (EPFL). From 2009 to 2014, he was a Team Leader at the Department of Advanced Robotics, Italian Institute of Technology. From 2007 to 2009, he was a Postdoc in the Learning Algorithms and Systems Laboratory, EPFL, where he obtained his PhD in 2007. He currently serves as Associate Editor in IEEE Transactions on Robotics (T-RO) and IEEE Robotics and Automation Letters (RA-L).

11:45 - 12:00
Xavier Ouvrard, University of Geneva / CERN
The HyperBagGraph DataEdron: An Enriched Browsing Experience of Scientific Publication Databa

Abstract: Bibliographic research is commonly conducted using verbatim search-engine interfaces. We propose an interactive 2.5D visualisation interface for navigating data, formulating complex visual queries and performing contextual searches. Multiset families, called HyperBagGraphs, are used to model co-occurrence networks of online search outputs and visualise the different dimensional perspectives as hbgraphs. This approach, which is generalisable to any kind of dataset and in particular to medical datasets, is currently applied to online queries on arXiv, enriched with multiple other online resources.

12:00 - 12:15
Seyed Moosavi, Signal Processing Laboratory 4 (LTS4), EPFL
Improving robustness to build more interpretable classifiers

Abstract: Deep neural networks, despite their huge success in solving complex visual tasks, are generally considered "black-box" models. In particular, while they outperform humans in natural image classification tasks, they have been shown to be extremely vulnerable to carefully chosen small perturbations of the data, called adversarial perturbations. Such phenomena indicate that deep networks might rely only on superficial features, as opposed to semantically meaningful features, to discriminate between different classes. Therefore, improving the robustness of deep networks to adversarial perturbations is a crucial step towards building more interpretable models.
To achieve more robust classifiers, I will propose a new regularizer that directly minimizes the curvature of the loss surface of deep networks and leads to adversarial robustness that is on par with adversarial training. Besides being a more efficient and principled alternative to adversarial training, a network trained with our method exhibits visually meaningful adversarial examples, as perturbed images do resemble images from the adversary class.
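As a rough illustration of what penalizing the curvature of a loss surface can mean, here is a minimal sketch. It is not the regularizer from the talk: the quadratic toy loss, the finite-difference approximation, the step size `h` and the function names (`loss_gradient`, `curvature_penalty`) are all assumptions made for this example. The penalty measures how much the input gradient of the loss changes over a small step in a direction `z`, which is large where the loss surface bends sharply and small where it is nearly flat.

```python
import numpy as np

def loss_gradient(x, A):
    """Gradient of the toy loss L(x) = 0.5 * x^T A x, with A symmetric (assumption)."""
    return A @ x

def curvature_penalty(x, A, z, h=1e-2):
    """Finite-difference curvature proxy: ||grad L(x + h z) - grad L(x)||^2."""
    z = z / np.linalg.norm(z)  # unit direction, so h alone sets the step length
    diff = loss_gradient(x + h * z, A) - loss_gradient(x, A)
    return float(np.dot(diff, diff))

# Two hypothetical loss surfaces: one nearly flat, one sharply curved along z.
flat = np.diag([0.1, 0.1])
curved = np.diag([0.1, 10.0])
x = np.array([1.0, -1.0])
z = np.array([0.0, 1.0])

penalty_flat = curvature_penalty(x, flat, z)      # (h * 0.1)^2
penalty_curved = curvature_penalty(x, curved, z)  # (h * 10)^2
```

Added to a training objective with some weight, a term of this kind would push the model towards regions where the loss changes more linearly around each input. In a real network the gradients would be computed by automatic differentiation instead of the closed form used here.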

12:15 - 12:30
Sooho Kim, UniGe
Interpretation of End-to-end one Dimension Convolutional Neural Network for Fault Diagnosis on a Planetary Gearbox

Abstract: This research proposes an end-to-end one-dimensional convolutional neural network (CNN) for fault diagnosis of a planetary gearbox, and interprets its latent feature spaces using signal and system analysis. Synthetic and experimental vibration data acquired from a planetary gearbox under various health states are used to demonstrate the proposed method. The interpretation shows that the feature engineering process in the CNN is identical to domain-knowledge-based diagnosis, in that it concentrates on physical properties of the vibration. From this result, it is expected that the use of CNNs for health diagnosis of industrial systems can be justified by showing that the CNN is trained to focus on physics that corresponds to existing domain knowledge.

12:30 - 14:00
