Thesis: Krylov iterative methods for the geometric mean of two matrices times a vector
Supervisor: Dr. Bruno Iannazzo
Final mark: 110/110 with honors
• Autonomous Agents and Multi-Agent Systems 35(25), 53 pages, Springer Nature, 07 June 2021 [pdf] [.bib]
• Extended Abstract in Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems AAMAS'19, 1862-1864, IFAAMAS, 2019 [pdf] [.bib]
Recent years have seen the application of deep reinforcement learning techniques to cooperative multi-agent systems, with great empirical success. However, given the lack of theoretical insight, it remains unclear what the employed neural networks are learning, or how we should enhance their learning power to address the problems on which they fail. In this work, we empirically investigate the learning power of various network architectures on a series of one-shot games. Despite their simplicity, these games capture many of the crucial problems that arise in the multi-agent setting, such as an exponential number of joint actions or the lack of an explicit coordination mechanism. Our results extend those in Castellini et al. (Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, AAMAS’19. International Foundation for Autonomous Agents and Multiagent Systems, pp 1862–1864, 2019) and quantify how well various approaches can represent the requisite value functions, and help us identify the reasons that can impede good performance, like sparsity of the values or too tight coordination requirements.
• Extended Abstract in Proceedings of the 20th International Conference on Autonomous Agents and Multiagent Systems AAMAS'21, 1475-1477, IFAAMAS, 2021 [pdf] [.bib]
• arXiv, 21 December 2020 [pdf] [.bib]
• Best Paper Award at ALA'21, 03-04 May 2021 [pdf] [video] [slides]
Policy gradient methods have become one of the most popular classes of algorithms for multi-agent reinforcement learning. A key challenge, however, that is not addressed by many of these methods is multi-agent credit assignment: assessing an agent's contribution to the overall performance, which is crucial for learning good policies. We propose a novel algorithm called Dr.Reinforce that explicitly tackles this by combining difference rewards with policy gradients to allow for learning decentralized policies when the reward function is known. By differencing the reward function directly, Dr.Reinforce avoids difficulties associated with learning the Q-function as done by Counterfactual Multiagent Policy Gradients (COMA), a state-of-the-art difference rewards method. For applications where the reward function is unknown, we show the effectiveness of a version of Dr.Reinforce that learns an additional reward network that is used to estimate the difference rewards.
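As a rough illustration of the credit-assignment idea, the sketch below computes a per-agent difference reward in a one-shot game with a known reward function. It assumes one common form of difference reward, in which the baseline for each agent is the expected reward when that agent's action is resampled from its own policy while the other agents' actions are held fixed. The names `difference_rewards` and `team_reward` and the toy game are hypothetical and are not taken from the paper.

```python
import numpy as np

def difference_rewards(reward_fn, joint_action, policies):
    """Return one difference reward per agent for a single joint action.

    reward_fn    : maps a tuple of actions to a scalar team reward (assumed known)
    joint_action : tuple with the action index chosen by each agent
    policies     : list of 1-D arrays, policies[i][c] = probability agent i picks c
    """
    joint_action = tuple(joint_action)
    team_r = reward_fn(joint_action)
    diffs = []
    for i, pi in enumerate(policies):
        # Baseline: expected reward when only agent i's action is resampled.
        baseline = 0.0
        for c, p in enumerate(pi):
            alt = joint_action[:i] + (c,) + joint_action[i + 1:]
            baseline += p * reward_fn(alt)
        diffs.append(team_r - baseline)
    return np.array(diffs)

def team_reward(a):
    # Hypothetical coordination game: reward only if both agents pick action 1.
    return 1.0 if a == (1, 1) else 0.0

policies = [np.array([0.5, 0.5]), np.array([0.5, 0.5])]
d = difference_rewards(team_reward, (1, 1), policies)
```

In a REINFORCE-style update, each agent would then weight the gradient of its own log-probability by its entry of `d` instead of the shared team reward.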
One of the main problems encountered so far with recurrent neural networks is that they struggle to retain long-term dependencies in their recurrent connections. Neural Turing Machines (NTMs) attempt to mitigate this issue by providing the neural network with an external memory, in which information can be stored and manipulated later on. The whole mechanism is differentiable end-to-end, allowing the network to learn how to utilise this long-term memory via SGD. This allows NTMs to infer simple algorithms directly from data sequences. Nonetheless, the model can be hard to train due to its large number of parameters and interacting components, and little related work exists. In this work we use an NTM to learn and generalise two arithmetical tasks: binary addition and multiplication. These tasks are two fundamental algorithmic examples in computer science, and are considerably more challenging than those previously explored; with them we aim to shed some light on the capabilities of this neural model.
Gaining followers on the Twitter platform has become a rapid way to increase one’s credibility on this social network, which in the last few years has become a launch pad for new trends and a means of influencing people’s opinions. As a result, many people have begun to buy fake followers on underground markets created specifically to sell them. Identifying fake follower profiles is therefore useful to maintain the balance between genuinely influential people on the network and people who have simply exploited this mechanism. This work presents a model based on artificial neural networks that is able to detect fake Twitter profiles. In particular, a denoising autoencoder has been implemented as an anomaly detector, trained with a semi-supervised learning approach. The model has been tested on a benchmark already used in the literature, and results are presented.
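The general pattern of autoencoder-based anomaly detection can be sketched as follows: fit a reconstruction model on genuine examples only, then flag inputs whose reconstruction error exceeds a threshold. To stay short, this sketch swaps the denoising autoencoder for a linear autoencoder (PCA computed by SVD) and uses synthetic features; the function names, the 99th-percentile threshold, and the data are all assumptions made for illustration, not details of the work described above.

```python
import numpy as np

def fit_linear_autoencoder(X_normal, k):
    """Fit a rank-k linear autoencoder (PCA) on normal data only."""
    mean = X_normal.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_normal - mean, full_matrices=False)
    return mean, Vt[:k]  # rows of Vt[:k] span the learned subspace

def reconstruction_error(X, mean, W):
    """Squared error between each row of X and its encode/decode round trip."""
    Z = (X - mean) @ W.T      # encode
    X_hat = Z @ W + mean      # decode
    return np.sum((X - X_hat) ** 2, axis=1)

rng = np.random.default_rng(0)
# Hypothetical features: genuine profiles lie near a 2-D subspace of a 5-D space.
basis = rng.normal(size=(2, 5))
normal = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 5))
odd = 3.0 * rng.normal(size=(20, 5))  # stand-in for anomalous profiles

mean, W = fit_linear_autoencoder(normal, k=2)
err_normal = reconstruction_error(normal, mean, W)
threshold = np.percentile(err_normal, 99)  # assumed cut-off choice
flags = reconstruction_error(odd, mean, W) > threshold
```

A denoising autoencoder would replace the two helper functions with a neural network trained to reconstruct clean inputs from corrupted ones; the thresholding step would stay the same.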
In this work, we present an efficient way to compute the geometric mean of two positive definite matrices times a vector. For this purpose, we investigate the application of methods based on Krylov spaces to compute the square root of a matrix. These methods, using only matrix-vector products, are capable of producing a good approximation of the result at a small computational cost.
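A minimal sketch of the idea follows, and it is not necessarily the formulation used in the thesis. It relies on the identity A # B = L (L^{-1} B L^{-T})^{1/2} L^T, where A = L L^T is a Cholesky factorisation, and approximates the action of the square root of the symmetric matrix C = L^{-1} B L^{-T} on a vector with an m-step Krylov (Lanczos-type) projection. The function names, the use of a Cholesky factor with triangular solves, and the full reorthogonalisation are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def sqrt_times_vector(matvec, v, m):
    """Approximate C^{1/2} v for symmetric positive definite C, given x -> C x.

    Assumes v is nonzero. Builds an orthonormal Krylov basis V and the
    projected matrix H = V^T C V, then returns ||v|| * V * H^{1/2} * e_1.
    """
    n = v.shape[0]
    m = min(m, n)
    beta = np.linalg.norm(v)
    V = np.zeros((n, m))
    H = np.zeros((m, m))
    V[:, 0] = v / beta
    k = m
    for j in range(m):
        w = matvec(V[:, j])
        for i in range(j + 1):          # Gram-Schmidt against all previous vectors
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        if j + 1 == m:
            break
        h = np.linalg.norm(w)
        if h < 1e-12:                   # invariant subspace found: stop early
            k = j + 1
            break
        H[j + 1, j] = h
        V[:, j + 1] = w / h
    Hk = 0.5 * (H[:k, :k] + H[:k, :k].T)    # symmetrise against rounding
    lam, Q = np.linalg.eigh(Hk)
    coeffs = Q @ (np.sqrt(np.maximum(lam, 0.0)) * Q[0, :])  # H^{1/2} e_1
    return beta * (V[:, :k] @ coeffs)

def geometric_mean_times_vector(A, B, v, m):
    """Approximate (A # B) v for symmetric positive definite A and B."""
    L = cholesky(A, lower=True)

    def matvec(x):
        y = solve_triangular(L, x, lower=True, trans='T')   # y = L^{-T} x
        return solve_triangular(L, B @ y, lower=True)       # L^{-1} B y

    return L @ sqrt_times_vector(matvec, L.T @ v, m)

rng = np.random.default_rng(0)
n = 6
X, Y = rng.normal(size=(n, n)), rng.normal(size=(n, n))
A = X @ X.T + n * np.eye(n)
B = Y @ Y.T + n * np.eye(n)
g = geometric_mean_times_vector(A, B, rng.normal(size=n), m=n)
```

In this sketch the large matrices enter the iteration only through products with `B` and triangular solves with `L`; in practice `m` would be much smaller than the matrix dimension.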
J [dot] Castellini [at] liverpool [dot] ac [dot] uk
I am currently a member of the smARTLab
Room G12, Department of Computer Science, University of Liverpool
Ashton Building, Ashton Street, Liverpool, United Kingdom, L69 3BX
I haven't got an English phone number yet, I'm sorry...