In the last two decades, researchers have adopted different types of biometric measures to complement traditional research in software engineering. Starting with eye-tracking, electrodermal activity (EDA), and electroencephalography (EEG), to name just a few, researchers have even adopted brain imaging techniques in the last decade, using functional Near-Infrared Spectroscopy (fNIRS) and functional Magnetic Resonance Imaging (fMRI). Why are we doing this? What devices are used? What data is recorded? What measures are calculated? And how do we analyze, visualize, and make sense of that data? These are the types of questions that we will address in this talk.
Dr. Venera Arnaoudova is an Associate Professor in the School of Electrical Engineering and Computer Science (EECS) at Washington State University (WSU). Her research interest is in the domain of software engineering, and in particular, empirical software engineering, program comprehension, software evolution, and poor and good software practices. Her long-term research goal is to understand how human factors impact the cost and quality of the software they develop. In her research, she applies methods from Machine Learning, Natural Language Processing, Machine Translation, Neurocognitive Science, and others to Software Engineering.
Dr. Arnaoudova is part of the editorial board of the Empirical Software Engineering Journal (EMSE) and the Journal of Systems and Software (JSS), and part of the review board of IEEE Transactions on Software Engineering (TSE). She serves or has served as a program committee member for ESEC/FSE, ICSE, ICPC, ICSME, MSR, SANER, and others.
At WSU, Dr. Arnaoudova is the Chair of the Computer Science Curriculum Committee and the Industry Engagement Fellow within the Innovation and Research Engagement Office (IREO) for Computer Science and Software Engineering.
In this talk, we'll take an in-depth look at the award-winning study, 'First come first served: the impact of file position on code review.' The study investigates whether the order of files in code review tools can affect the review outcomes. This interactive talk will provide an extensive explanation of the research methods used in the study. We'll analyze each decision that shaped the study, the reasons behind these decisions, and the knowledge gained from this research approach. We'll explore the research paper's two-part process to demonstrate how the study's methods and their foundational principles can be applied to various empirical software engineering research scenarios. The goal of this session is to equip students with the hands-on experience necessary to plan, carry out, and evaluate their empirical studies, and help them approach real-world software development problems from a rigorous scientific perspective. First, we'll explain the two-step procedure that analyzed numerous Pull Requests from well-known Java projects on GitHub. We'll clarify how the data was gathered, managed, and understood. We'll also shed light on how to manage large-scale data effectively and control for variables that might distort the results. Then, we'll examine the design and implementation of a controlled experiment that involved 106 participants. We'll focus on elements like the selection of defects, experimental controls, assignment of treatments, and data analysis. We'll thoroughly discuss the process of transforming raw data into meaningful insights.
Alberto Bacchelli is an associate professor of Empirical Software Engineering of the Department of Informatics in the Faculty of Business, Economics and Informatics at the University of Zurich, Switzerland. He received his bachelor’s and master’s degrees in computer science from the University of Bologna, Italy, and his Ph.D. in Software Engineering from the Università della Svizzera italiana, Switzerland.
His broader research vision is to innovate software engineering through fundamental empirical research and software tools. His goal is to increase our scientific knowledge of today's software development and to design, based on strong empirical evidence and theory, the right tools, languages, and development environments for high-quality software engineering. He has received the MSR Ric Holt Early Career Achievement Award 2020 for his seminal contributions to modern code review. He has received the 10-year Most Influential Paper award from SANER for his work on extending IDEs with an AI-based recommendation tool. He is the recipient of a total of eight Best Paper Awards and ACM SIGSOFT Distinguished Paper Awards, awarded by the top academic venues in software engineering and computer-supported collaborative work.
As we surf the crest of the latest AI tidal wave, we developers find ourselves savoring our portion of the technological banquet. Enthusiasm is boiling over, with shiny toys such as GitHub Copilot and ChatGPT heralding promises of skyrocketing our productivity. At CodeLounge, we have been using Copilot for a year, while also collecting data about its usage by means of Tako, an IDE telemetry collector & analyzer that we developed. In this talk, starting from this internal data, we will reflect on the impact that such tools had and might have on the structure and daily activity of a development team. Through various programming and data analysis tasks, we will try to shed some light on the line between promise and reality, connecting the dots to incisive critiques from leading researchers who question the capabilities of Large Language Models in a more general context. Can these parrots be our perfect pirate companions, finally taking up the cudgels against the tedium of our craft, like copy-pasting from Stack Overflow? Which strategies and best practices can be adopted by developers to get work done, and by team leads to structure the development process?
*: An anonymous stochastic parrot may or may not have contributed to this abstract.
Marco D’Ambros is the director of CodeLounge, the center for software research and development of the Software Institute, Università della Svizzera italiana, which combines expertise from academia and from industry. After obtaining a PhD in the area of mining software repositories in 2010, Marco worked until 2018 at Palantir Technologies, a leading Silicon Valley data mining firm, helping government organizations and large enterprises make sense of their large and dispersed data, and leading the technical execution of projects around the globe. In 2020, he was awarded the MSR 2010 MIP award for his work on bug prediction.
This talk will not give the answer to “The Ultimate Question of Life, The Universe, and Everything”. We already know the answer to that question; it is “42”.
However, this talk will answer several other questions that Ph.D. students may ask themselves at various stages of their doctoral studies. Some of these questions are:
A former Fulbright Scholar, born and raised in Romania, Andrian Marcus is now a Professor in the Department of Computer Science at George Mason University. He obtained his Ph.D. in Computer Science from Kent State University (US), and has prior degrees in Computer Science and European Studies from The University of Memphis (US) and Babes-Bolyai University (Cluj-Napoca, Romania). In 2021 he was named Distinguished Alumnus of the Department of Mathematics and Computer Science at Babes-Bolyai University.
His research interests are in software engineering, focusing on program understanding and software evolution. He is best known for his work on using text retrieval and analysis techniques on software corpora for supporting comprehension during software evolution. Professionally, he is most proud of his outstanding current and past doctoral students and finds mentoring to be the most rewarding part of an academic career. Over time, their joint research earned six Best/Distinguished Paper Awards and seven Most Influential Paper Awards at software engineering conferences.
His professional service includes serving on the Steering Committees of the IEEE International Conference on Software Maintenance and Evolution (ICSME) and of the IEEE Working Conference on Software Visualization (VISSOFT). He was the General Chair and the Program Co-chair of ICSME in 2011 and 2010, respectively, and Program Co-Chair for other conferences (ICPC'09, VISSOFT'13, SANER'17). He currently serves on the editorial board of the Journal of Software: Evolution and Process. He has also served on the editorial board of the IEEE Transactions on Software Engineering (2014-2018) and the Empirical Software Engineering Journal (2010-2021).
As we surf the crest of the latest AI tidal wave, we developers find ourselves savoring our portion of the technological banquet. Enthusiasm is boiling over, with shiny toys such as GitHub Copilot and ChatGPT heralding promises of skyrocketing our productivity. At CodeLounge, we have been using Copilot for a year, while also collecting data about its usage by means of Tako, an IDE telemetry collector & analyzer that we developed. In this talk, starting from this internal data, we will reflect on the impact that such tools had and might have on the structure and daily activity of a development team. Through various programming and data analysis tasks, we will try to shed some light on the line between promise and reality, connecting the dots to incisive critiques from leading researchers who question the capabilities of Large Language Models in a more general context. Can these parrots be our perfect pirate companions, finally taking up the cudgels against the tedium of our craft, like copy-pasting from Stack Overflow? Which strategies and best practices can be adopted by developers to get work done, and by team leads to structure the development process?
*: An anonymous stochastic parrot may or may not have contributed to this abstract.
Andrea Mocci is a Junior Group Leader at CodeLounge, an R&D group headed by Dr. Marco D’Ambros and Prof. Dr. Michele Lanza. His main responsibilities include being the tech lead for CodeLounge’s team and projects, and doing some development, mostly on the backend side, including machine learning and natural language processing. He is passionate about software design, software quality, and functional programming in many flavors and languages. In the past, Andrea has been a postdoctoral researcher at USI Lugano and at MIT. He got his B.Sc., M.Sc. and PhD at Politecnico di Milano, where he was advised by Prof. Carlo Ghezzi.
Together, we will take a deep dive and follow a less traveled path through the landscape of research methods in software engineering research. We will explore who our research aims to impact, what kinds of contributions we can expect from our research, and how we can use innovative research methods. Some of the topics we will dive into include design science as a frame for software engineering research, the benefits and challenges of using mixed methods in software engineering, and how to uncover the potential but not always obvious or positive disruptive impacts of novel technologies (such as generative AI and VR) on software engineering practice. After this talk, you should feel more empowered to pursue ambitious and impactful research using innovative research methods.
Dr. Margaret-Anne Storey is a Professor of Computer Science at the University of Victoria, Canada and a Canada Research Chair in Human and Social Aspects of Software Engineering. Together with her students and collaborators, she seeks to understand how software tools, communication media, data visualizations, and social theories can be leveraged to improve how software engineers and knowledge workers explore, understand, analyze, create and share complex information and knowledge. She collaborates extensively with large and small software companies to ensure real-world applicability of her research contributions and tools. She is passionate about improving developer productivity and developer experience using novel and insightful research methods.
In recent years, Artificial Intelligence (AI) has become increasingly popular in the field of Software Engineering (SE), where it has been used to automate and improve various SE tasks. One of the most exciting developments in this area is the use of Large Language Models (LLMs) to solve complex SE problems. In this talk, we will explore the different techniques employed to use LLMs in AI for SE, including finetuning, prompt engineering, prompt augmentation with retrieval, and reinforcement learning with human feedback (RLHF) for SE tasks. We will examine the benefits and limitations of each approach and discuss real-world examples of their applications, such as in Automated Program Repair (APR). By the end of the talk, you will have a better understanding of how LLMs can be used in AI for SE and the potential impact they can have on the industry.
Michele Tufano is a Senior Research Scientist in the Data & AI group at Microsoft. With a focus on automating software engineering tasks, Michele designs, trains, and evaluates models and algorithms for tasks such as Automated Test Generation, Program Repair, Software Maintenance, and more. Currently, Michele works towards improving developers' productivity through AI-based tools that use data to understand code, program semantics, and developers' intentions. Before joining Microsoft, Michele earned a Ph.D. degree at William & Mary. His thesis focused on Neural Machine Translation applied to Software Engineering tasks.
Software languages are more than their (in)formal specifications. They are also cultural artefacts that reflect the preferences and idioms of their users. In this session, we will explore some of the challenges and opportunities of seeing a language beyond its grammar. We will start by discussing how to determine the language of a program before parsing it, using various kinds of cues. Then, we will move on to capturing and analysing implicit rules and traditions that govern the style and structure of code, such as naming conventions, interfaces, idioms, implementation patterns, etc. Finally, we will address the issue of measuring and improving the quality of code in a specific language, taking into account its features and best practices. We will contemplate some examples of idiomatic and non-idiomatic code, and some methods for detecting and correcting them. The goal of this session is to demonstrate that seeing a language as a rich and diverse phenomenon can lead to new insights and applications for software engineering.
I am an Associate Professor of software evolution at the University of Twente, working in software analysis, modelling and restructuring since 2004; before that I was a machine code hacker and a railway engineer. My past affiliations include Dutch, Belgian, German and Russian companies and research institutions, as well as volunteer participation in Wikimedia activities. My research interests gravitate towards elicitation of structure in software and improving it by taking advantage of whatever structure is present. At my previous job as a Chief Science Officer, my day-to-day activities involved developing compilers, writing metaprograms and analysing migration projects. My current focus is on doing industrially relevant research from academia, teaching several courses, supervising students and developing prototype software. I am also a Programme Director, managing computer science educational programmes of several levels and specialisations, spanning around 2000 students.
Washington State University
University of Zurich
CodeLounge
George Mason University
CodeLounge
University of Victoria
Microsoft
University of Twente
The two Students' Talks Sessions give participants the opportunity to present their work in front of the summer school's audience. It is an excellent opportunity to introduce yourself, your work, and your team, to practice presenting, to get feedback from others, including our senior participants and speakers, and to seek future collaborations.
We plan to schedule a timeslot of 8 minutes for each participant, including Q&A. Further details will be announced shortly.
Social Events
SIESTA is known for its amazing social events: dinners in excellent restaurants, cocktails on the beach, excursions, and guided tours. Here you can find a printable guide to the Social Events part of the official program.
Evening Meet-Up (Extra Activity)
Our evening meet-up will be held at The Trinity Irish Pub, a casual Irish-influenced pub with a simple menu, a pool table, and a selection of beers and whiskies. It will be a great opportunity to get to know each other over some drinks and appetizers.
Please note that this "Extra Activity" is not part of the official program and, as such, the costs associated with it are not included in the registration fee.
Hike (Extra Activity)
There are many beautiful hiking trails around Lugano, surrounded by mountains that offer stunning views of Lake Lugano and Lake Maggiore. We will be going on an all-day hike to Monte San Giorgio, a UNESCO World Heritage site, also known as the "Fossil Mountain".
We will mostly follow the Sentiero Lago di Lugano. The hike will take approximately 5 hours, excluding public transportation. The trail is medium grade, covering approximately 15 km with some uphill (~800 m) and downhill sections.
Please note that this "Extra Activity" is not part of the official program and, as such, the costs associated with it are not included in the registration fee.
Reception
On the first day, we will have a chance to get to know each other during a Reception (a.k.a. standing dinner) at Luini6 Bistrot. Luini6 Bistrot is located next to the Lugano Arte e Cultura (LAC) cultural center. The setting is distinctive, surrounded by works of art and by the modern and neoclassical architecture of the beautiful Piazza Luini overlooking Lake Lugano.
Social Dinner
The social dinner will be held at Grotto dei Pescatori. The grotto is opposite Lugano, on the other side of Lake Lugano. They serve great typical food in a relaxing atmosphere in the middle of the natural environment. We will reach Grotto dei Pescatori together via a short scenic boat ride across the lake. We will split into two groups with different departure times from Lugano: the first at 17:55 and the second at 18:25. The meeting point is the same for both groups and it is located here. Please be on time.
Closing Lunch
We will close SIESTA 2023 by eating a traditional Neapolitan pizza at Anema & Core, a pizzeria a few minutes’ walk from the venue. On this occasion we will also hold a little awards ceremony.