Faculty Members: Computer Science

Software Engineering

Associate Professor Matsushita Makoto

1998 Ph.D. (Engineering) Osaka University
1998 Research Assistant, Osaka University (Graduate School of Engineering Science, Information Mathematical Major)
2002 Research Assistant, Osaka University (Graduate School of Information Science and Technology, Computer Science Major)
2005 Assistant Professor, Osaka University (Graduate School of Information Science and Technology, Computer Science Major)
2007 Associate Professor, Osaka University (Graduate School of Information Science and Technology, Computer Science Major)

Theme

Evaluation of Deep Learning-Based Code Clone Detection Methods using SemanticCloneBench

Source code fragments that behave similarly despite being syntactically different are called semantic clones. In recent years, many deep learning-based clone detection methods have been proposed to detect semantic clones. However, BigCloneBench, a well-known large-scale benchmark in the field of clone detection, has many problems when used to evaluate semantic clone detection performance. In this study, we evaluate the detection performance of three deep learning-based clone detection methods (ASTNN, CodeBERT, and InferCode) using SemanticCloneBench, a benchmark specialized for evaluating semantic clone detection. In our experiments, ASTNN achieved an F-value of 0.921, the highest detection performance among the methods evaluated. We also investigated the range of thresholds that yield high F-values and found that this range is narrower for CodeBERT and InferCode than for ASTNN.

F-value curves for varying threshold values
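
As a rough illustration of how an F-value curve over thresholds can be computed, the sketch below sweeps a decision threshold over similarity scores for candidate clone pairs. The scores, labels, function names, and the cutoff of 0.85 are made up for illustration; they are not data or code from the study.

```python
def f_value(scores, labels, threshold):
    """F-value (F1) when pairs scoring >= threshold are reported as clones."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def high_f_thresholds(scores, labels, thresholds, min_f):
    """Thresholds whose F-value is at least min_f."""
    return [t for t in thresholds if f_value(scores, labels, t) >= min_f]


# Hypothetical similarity scores and ground-truth labels (True = clone pair).
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [True, True, True, False, True, False, False, False]
thresholds = [i / 10 for i in range(1, 10)]

for t in thresholds:
    print(f"threshold={t:.1f}  F={f_value(scores, labels, t):.3f}")
print(high_f_thresholds(scores, labels, thresholds, min_f=0.85))
```

A method whose list of high-F thresholds is short is more sensitive to the choice of threshold, which is the kind of difference the comparison above describes.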

Selecting Test Cases based on Similarity of Runtime Information: A Case Study of an Industrial Simulator

Regression testing is required to check for changes in behavior whenever developers modify a software system. The cost of regression testing is a major problem because developers have to frequently update dependent components to minimize security risks and potential bugs. In this paper, we report on current practice at a company that maintains an industrial simulator as a critical component of its business. The simulator automatically records all user requests and simulation results in storage. This feature provides developers with a huge number of test cases for regression testing; however, their time budget for testing is limited (i.e., at most one night). Hence, the developers need to select a small number of test cases to confirm that both the simulation results and the execution performance are unaffected by an update of a dependent component. In other words, the test cases should achieve high coverage while keeping the diversity of execution time. To solve this problem, we have developed a clustering-based method that selects test cases using the similarity of the execution traces they produce. The developers have used the method for half a year; they consider it better than the rule-based method previously used in the company.

Overview of our test case selection method
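
The general idea of clustering test cases by trace similarity and keeping one representative per cluster can be sketched as follows. This is a minimal sketch under assumptions: the text above does not specify the similarity measure or clustering algorithm, so Jaccard similarity over sets of executed function names and a simple greedy (leader) clustering are used here as stand-ins, and all names and data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of executed function names."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_test_cases(traces, min_similarity):
    """Greedy leader clustering (an assumed stand-in, not the actual method).

    A test joins the first cluster whose representative is at least
    min_similarity similar to it; otherwise it starts a new cluster and
    becomes that cluster's representative.
    """
    clusters = {}  # representative test id -> list of member test ids
    for test_id, trace in traces.items():
        for rep in clusters:
            if jaccard(trace, traces[rep]) >= min_similarity:
                clusters[rep].append(test_id)
                break
        else:
            clusters[test_id] = [test_id]
    return clusters


# Hypothetical execution traces: test id -> set of executed functions.
traces = {
    "t1": {"load", "simulate", "report"},
    "t2": {"load", "simulate", "report", "cache"},
    "t3": {"load", "validate"},
    "t4": {"load", "validate", "log"},
    "t5": {"export"},
}
clusters = cluster_test_cases(traces, min_similarity=0.6)
selected = list(clusters)  # one representative per cluster
print(selected)
```

A fuller version would also take execution time into account when choosing representatives, for example by keeping both a fast and a slow test from each cluster; that refinement is omitted here.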

Program Analysis/Repository Mining/Deep Learning for Software Engineering

It is surprisingly important to know when, by whom, how, and for what purpose a program was written or modified. For example, when fixing a bug in a program, a history of past fixes to similar bugs can provide a clue to the fix. A skilled developer might recall such a fix quickly, but an inexperienced developer might have a hard time finding it. I am developing methods to analyze source code and its development process from various perspectives, as well as deep learning techniques based on how humans perceive the contents of source code from many angles. Using these methods, I am also analyzing the current status of various software development projects around the world and applying the results to future development, with the aim of realizing an environment that makes it easier for many developers to develop software.

Contact

E-mail: matusita@ist.

TEL: S4106

The four-digit phone numbers are extensions used inside Osaka University. From outside Osaka University, the phone numbers are as follows: S: 06-6879-xxxx, S*: 06-6105-xxxx, and T: 06-6850-xxxx.
The domain name is omitted from e-mail addresses. Please add “osaka-u.ac.jp” to each e-mail address.