CORRECT: CODE REVIEWER RECOMMENDATION AT GITHUB FOR VENDASTA TECHNOLOGIES
Mohammad Masudur Rahman, Chanchal K. Roy, Jesse Redl$ and Jason A. Collins*
Department of Computer Science, University of Saskatchewan, Canada
$Vendasta Technologies, Canada; *Google Inc., USA
31st IEEE/ACM International Conference on Automated Software Engineering (ASE 2016), Singapore
PEER CODE REVIEW
Code review is a systematic examination of source code to detect bugs or defects and coding rule violations.
Peer code review:
• Early bug detection
• Prevention of coding rule violations
• Enhanced developer skills
PULL REQUEST (CODE CHANGES) SUBMISSION AT GITHUB
[Screenshot: GitHub pull request form, annotated with: change title, change description, member mention feature]
Whom should I choose?
Well, where there is a will, there is a way!
TRADITIONAL WAY: CHOOSE A CODE REVIEWER
NOT Productive at all!
EVEN MORE CHALLENGES!!
FOR:
• Novice developers
• Distributed software development
WHY?
• Reviews delayed by 12 days (Thongtanunam et al., SANER 2015)
WHAT DO WE NEED?
• Recommendation Tool
  - Recommends appropriate code reviewers
  - Recommends automatically
  - Does all the heavy lifting (i.e., mining) for the developers
  - Provides a rationale for each recommendation
  - Fits within the developer's workflow
• Advanced Features
  - Provides personalized recommendations
  - Provides optimized performance
• Architecture
  - Platform-independent & scalable
CORRECT: CODE REVIEWER RECOMMENDATION AT GITHUB FOR VENDASTA TECHNOLOGIES
WALKTHROUGH WITH CORRECT: NEW PULL REQUEST
[Screenshot: CORRECT on a new pull request, annotated with: code reviewers, rationale]
WALKTHROUGH WITH CORRECT: EXISTING PULL REQUEST
[Screenshot: CORRECT on an existing pull request, annotated with: existing PR, code reviewers, refresh, matched]
WALKTHROUGH WITH CORRECT: ADVANCED FEATURES
• Open authentication
• Parallel/optimized processing
• Client-server architecture
CORRECT: CODE REVIEWER RECOMMENDATION (RAHMAN ET AL., ICSE 2016)
[Figure: past PR reviews by reviewers R1, R2, and R3, with review similarity marked between PR reviews]
EXISTING LITERATURE
• Line Change History (LCH)
  - ReviewBot (Balachandran, ICSE 2013)
• File Path Similarity (FPS)
  - RevFinder (Thongtanunam et al., SANER 2015)
  - FPS (Thongtanunam et al., CHASE 2014)
  - Tie (Xia et al., ICSME 2015)
• Code Review Content and Comments
  - Tie (Xia et al., ICSME 2015)
  - SNA (Yu et al., ICSME 2014)
• Issues & Limitations
  - Mine a developer's contributions from within a single project only.
• Library & Technology Similarity
OUR CONTRIBUTIONS
[Figure: the state of the art (Thongtanunam et al., SANER 2015) compared with our proposed technique, CORRECT. Legend: new PR, reviewed PR, source file, external library & specialized technology]
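The idea named above, relating a new PR to already reviewed PRs through the external libraries and specialized technologies they use, can be sketched roughly as follows. This is a minimal illustration and not the tool's actual implementation: the cosine measure, the function names (`cosine`, `recommend`), the input shape, and the sample library names are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(new_pr_tokens, past_prs, k=5):
    """Rank reviewers by the summed similarity of the PRs they reviewed.

    new_pr_tokens: library/technology names used by the new PR (hypothetical input).
    past_prs: list of (tokens, reviewers) pairs for already reviewed PRs.
    """
    query = Counter(new_pr_tokens)
    scores = defaultdict(float)
    for tokens, reviewers in past_prs:
        sim = cosine(query, Counter(tokens))
        for reviewer in reviewers:
            scores[reviewer] += sim  # each reviewer inherits the PR's similarity
    # Highest score first; ties broken by name so the order is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, score in ranked[:k] if score > 0]


# Made-up review history: the tokens are illustrative library names only.
past = [
    (["requests", "json", "ndb"], ["R1"]),
    (["ndb", "taskqueue"], ["R2"]),
    (["jinja2"], ["R3"]),
]
top = recommend(["ndb", "taskqueue", "json"], past, k=2)
print(top)  # ['R2', 'R1']
```

Because the score depends only on library and technology tokens, the review history in `past_prs` could in principle come from more than one project, which is the limitation of single-project mining that the slides point out.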
LIBRARY EXPERIENCE & TECHNOLOGY EXPERIENCE (ANSWERED: RQ1)
Metric    Library Similarity      Technology Similarity   Combined Similarity
          Top-3     Top-5         Top-3     Top-5         Top-3     Top-5
Accuracy  83.57%    92.02%        82.18%    91.83%        83.75%    92.15%
MRR       0.66      0.67          0.62      0.64          0.65      0.67
MP        65.93%    85.28%        62.99%    83.93%        65.98%    85.93%
MR        58.34%    80.77%        55.77%    79.50%        58.43%    81.39%
[ MP = Mean Precision, MR = Mean Recall, MRR = Mean Reciprocal Rank ]
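For readers unfamiliar with the four metrics in the legend above, the sketch below computes them for ranked recommendation lists. It assumes the commonly used definitions (a Top-K hit means at least one actual reviewer appears in the top K; precision and recall are computed per PR and then averaged); the slides do not spell out the exact formulas, so the function `evaluate` and its conventions are illustrative only.

```python
def evaluate(recommendations, actual, k=5):
    """Compute Top-K accuracy, MRR, mean precision and mean recall.

    recommendations: one ranked reviewer list per PR.
    actual: one set of actual reviewers per PR, in the same order.
    Assumes at least one PR is given.
    """
    hits = rr = precision = recall = 0.0
    for ranked, truth in zip(recommendations, actual):
        top_k = ranked[:k]
        correct = [r for r in top_k if r in truth]
        hits += 1 if correct else 0
        # Rank (1-based) of the first correct reviewer, if any.
        first = next((i for i, r in enumerate(top_k, start=1) if r in truth), None)
        rr += 1 / first if first else 0
        precision += len(correct) / len(top_k) if top_k else 0
        recall += len(correct) / len(truth) if truth else 0
    n = len(recommendations)
    return {"accuracy": hits / n, "mrr": rr / n, "mp": precision / n, "mr": recall / n}


# Two made-up PRs: the first has its actual reviewer at rank 2, the second has no match.
result = evaluate([["a", "b", "c"], ["d", "e", "f"]], [{"b"}, {"x"}], k=3)
print(result)
```

In this toy run the first PR contributes a hit, a reciprocal rank of 1/2, a precision of 1/3, and a recall of 1, while the second PR contributes zeros, so every metric is the average of those two rows.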
• Both library experience and technology experience are found to be good proxies; each provides over 90% Top-5 accuracy.
• Combined experience provides the maximum performance.
  - 92.15% recommendation accuracy with 85.93% precision and 81.39% recall.
• Evaluation results align with the exploratory study findings.
COMPARATIVE STUDY FINDINGS (ANSWERED: RQ2)
• CoRReCT performs better than the competing technique in all metrics (p-value=0.003<0.05 for Top-5 accuracy).
• It performs better both on average and on individual projects.
• RevFinder computes PR similarity by matching source file names and file directories.
Metric    RevFinder [18]  CoRReCT
          Top-5           Top-5
Accuracy  80.72%          92.15%
MRR       0.65            0.67
MP        77.24%          85.93%
MR        73.27%          81.39%
[ MP = Mean Precision, MR = Mean Recall, MRR = Mean Reciprocal Rank ]
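RevFinder, the baseline in this comparison, is described above as matching source file names and directories. The sketch below shows one simple way such a file path similarity could be computed, using shared leading path components. It is an illustration under that assumption, not RevFinder's published algorithm; the function names and sample paths are made up.

```python
def common_prefix_len(path_a: str, path_b: str) -> int:
    """Number of leading path components the two paths share."""
    count = 0
    for a, b in zip(path_a.split("/"), path_b.split("/")):
        if a != b:
            break
        count += 1
    return count


def path_similarity(path_a: str, path_b: str) -> float:
    """Shared leading components, normalized by the longer path's length."""
    longest = max(len(path_a.split("/")), len(path_b.split("/")))
    return common_prefix_len(path_a, path_b) / longest


def pr_similarity(files_new, files_old) -> float:
    """Average pairwise path similarity between the changed files of two PRs."""
    pairs = [(a, b) for a in files_new for b in files_old]
    return sum(path_similarity(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0


# Made-up file paths: two files in the same directory share 2 of 3 components.
same_dir = path_similarity("src/api/user.py", "src/api/auth.py")
mixed = pr_similarity(["src/api/user.py"], ["src/api/auth.py", "docs/readme.md"])
print(same_dir, mixed)
```

A path-based score of this kind can only relate PRs that touch nearby files in the same repository, whereas a library or technology match does not depend on directory layout.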
COMPARISON ON OPEN SOURCE PROJECTS (ANSWERED: RQ3)
• In OSS projects, CoRReCT also performs better than the baseline technique.
• 85.20% accuracy with 84.76% precision and 78.73% recall, not significantly different from the earlier results (p-value=0.239>0.05 for precision).
• Results for the private and public codebases are quite close.
Metric    RevFinder [18]  CoRReCT (OSS)  CoRReCT (VA)
          Top-5           Top-5          Top-5
Accuracy  62.90%          85.20%         92.15%
MRR       0.55            0.69           0.67
MP        62.57%          84.76%         85.93%
MR        58.63%          78.73%         81.39%
[ MP = Mean Precision, MR = Mean Recall, MRR = Mean Reciprocal Rank ]
SUMMARY
• CORRECT: A Recommendation Tool
  - Recommends appropriate code reviewers
  - Recommends automatically
  - Does all the heavy lifting (i.e., mining) for the developers
  - Provides a rationale for each recommendation
  - Fits within the developer's workflow
• Advanced Features
  - Provides personalized recommendations
  - Provides optimized performance
• Architecture
  - Platform-independent & scalable
HANDS ON CORRECT
You are cordially invited!
THANK YOU!! QUESTIONS?
Masud Rahman (masud.rahman@usask.ca)
CORRECT site (http://www.usask.ca/~masud.rahman/correct)
Acknowledgement: This work is supported by NSERC
THREATS TO VALIDITY
• Threats to Internal Validity
  - Skewed dataset: Each of the 10 selected projects is medium-sized (i.e., 1.1K PRs) except CS.
• Threats to External Validity
  - Limited OSS dataset: Only 6 OSS projects were considered, which is not sufficient for generalization.
  - Issue of heavy PRs: PRs containing hundreds of files can make the recommendation slower.
• Threats to Construct Validity
  - Top-K Accuracy: Does the metric represent the effectiveness of the technique? It is widely used in the relevant literature (Thongtanunam et al., SANER 2015).

Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

CORRECT-ToolDemo-ASE2016

  • 1. CORRECT: CODE REVIEWER RECOMMENDATION AT GITHUB FOR VENDASTA TECHNOLOGIES Mohammad Masudur Rahman, Chanchal K. Roy, Jesse Redl$ and Jason A. Collins* Department of Computer Science University of Saskatchewan, Canada Vendasta Technologies$, Canada, Google Inc.*, USA 31st IEEE/ACM International Conference on Automated Software Engineering (ASE 2016), Singapore
  • 2. PEER CODE REVIEW. Code review is a systematic examination of source code for detecting bugs or defects and coding rule violations. Benefits of peer code review: early bug detection; stop coding rule violation; enhance developer skill.
  • 3. PULL REQUEST (CODE CHANGES) SUBMISSION AT GITHUB. Screenshot labels: change title; change description; member mention feature. Whom should I choose? Well, where there is a will, there is a way!
  • 4. TRADITIONAL WAY: CHOOSE A CODE REVIEWER. NOT productive at all!
  • 5. EVEN MORE CHALLENGES!! For: novice developers; distributed software development. Delayed reviews for 12 days (Thongtanunam et al, SANER 2015). WHY?
  • 6. WHAT DO WE NEED? Recommendation tool: recommends appropriate code reviewers; recommends automatically; does all heavy lifting (i.e., mining) for the developers; provides recommendation rationale; fits within developer’s workflow. Advanced features: provides personalized recommendation; provides optimized performance. Architecture: platform-independent & scalable.
  • 7. CORRECT: CODE REVIEWER RECOMMENDATION AT GITHUB FOR VENDASTA TECHNOLOGIES
  • 8. WALKTHROUGH WITH CORRECT: NEW PULL REQUEST. Screenshot labels: CORRECT; code reviewers; rationale.
  • 9. WALKTHROUGH WITH CORRECT: EXISTING PULL REQUEST. Screenshot labels: existing PR; code reviewers; Refresh; Matched.
  • 10. WALKTHROUGH WITH CORRECT: ADVANCED FEATURES. Open authentication; parallel/optimized processing; client-server architecture.
  • 11. CORRECT: CODE REVIEWER RECOMMENDATION (RAHMAN ET AL, ICSE 2016). Diagram labels: pull requests R1, R2, R3; PR Review R1, PR Review R2, PR Review R3; review similarity.
  • 12. EXISTING LITERATURE. Line Change History (LCH): ReviewBot (Balachandran, ICSE 2013). File Path Similarity (FPS): RevFinder (Thongtanunam et al, SANER 2015); FPS (Thongtanunam et al, CHASE 2014); Tie (Xia et al, ICSME 2015). Code review content and comments: Tie (Xia et al, ICSME 2015); SNA (Yu et al, ICSME 2014). Issues & limitations: mine developer’s contributions from within a single project only. Library & Technology Similarity (figure labels: Library, Technology).
  • 13. OUR CONTRIBUTIONS. [Figure: the state of the art (Thongtanunam et al, SANER 2015) compared with our proposed technique, CORRECT. Legend: new PR; reviewed PR; source file; external library & specialized technology.]
  • 14. LIBRARY EXPERIENCE & TECHNOLOGY EXPERIENCE (ANSWERED: RQ1)

        Metric     Library Similarity   Technology Similarity   Combined Similarity
                   Top-3     Top-5      Top-3     Top-5         Top-3     Top-5
        Accuracy   83.57%    92.02%     82.18%    91.83%        83.75%    92.15%
        MRR        0.66      0.67       0.62      0.64          0.65      0.67
        MP         65.93%    85.28%     62.99%    83.93%        65.98%    85.93%
        MR         58.34%    80.77%     55.77%    79.50%        58.43%    81.39%

        [ MP = Mean Precision, MR = Mean Recall, MRR = Mean Reciprocal Rank ]

        Both library experience and technology experience are found as good proxies, provide over 90% accuracy. Combined experience provides the maximum performance. 92.15% recommendation accuracy with 85.93% precision and 81.39% recall. Evaluation results align with exploratory study findings.
  • 15. COMPARATIVE STUDY FINDINGS (ANSWERED: RQ2)

        Metric     RevFinder [18]   CoRReCT
                   Top-5            Top-5
        Accuracy   80.72%           92.15%
        MRR        0.65             0.67
        MP         77.24%           85.93%
        MR         73.27%           81.39%

        [ MP = Mean Precision, MR = Mean Recall, MRR = Mean Reciprocal Rank ]

        CoRReCT performs better than the competing technique in all metrics (p-value = 0.003 < 0.05 for Top-5 accuracy). Performs better both on average and on individual projects. RevFinder uses PR similarity using source file name and file’s directory matching.
  • 16. COMPARISON ON OPEN SOURCE PROJECTS (ANSWERED: RQ3)

        Metric     RevFinder [18]   CoRReCT (OSS)   CoRReCT (VA)
                   Top-5            Top-5           Top-5
        Accuracy   62.90%           85.20%          92.15%
        MRR        0.55             0.69            0.67
        MP         62.57%           84.76%          85.93%
        MR         58.63%           78.73%          81.39%

        [ MP = Mean Precision, MR = Mean Recall, MRR = Mean Reciprocal Rank ]

        In OSS projects, CoRReCT also performs better than the baseline technique. 85.20% accuracy with 84.76% precision and 78.73% recall, and not significantly different than earlier (p-value = 0.239 > 0.05 for precision). Results for private and public codebase are quite close.
  • 17. SUMMARY. CORRECT, a recommendation tool: recommends appropriate code reviewers; recommends automatically; does all heavy lifting (i.e., mining) for the developers; provides recommendation rationale; fits within developer’s workflow. Advanced features: provides personalized recommendation; provides optimized performance. Architecture: platform-independent & scalable.
  • 18. HANDS ON CORRECT 18 You are cordially invited!
  • 19. THANK YOU!! QUESTIONS? 19 Masud Rahman (masud.rahman@usask.ca) CORRECT site (http://www.usask.ca/~masud.rahman/correct) Acknowledgement: This work is supported by NSERC
  • 20. THREATS TO VALIDITY. Threats to internal validity. Skewed dataset: each of the 10 selected projects is medium sized (i.e., 1.1K PR) except CS. Threats to external validity. Limited OSS dataset: only 6 OSS projects considered, not sufficient for generalization. Issue of heavy PRs: PRs containing hundreds of files can make the recommendation slower. Threats to construct validity. Top-K accuracy: does the metric represent effectiveness of the technique? Widely used by relevant literature (Thongtanunam et al, SANER 2015).
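
The metrics reported on slides 14 to 16 (Top-K accuracy, MRR, MP, MR) are not defined on the slides themselves. The sketch below uses common textbook definitions, which is an assumption and may differ from the exact formulas used in the evaluation. A pull request counts as a hit if any of its top-K recommendations is an actual reviewer. The reciprocal rank is 1 divided by the rank of the first correct recommendation. Precision and recall are computed per pull request and then averaged. The function name `evaluate_top_k` and the toy data are hypothetical.

```python
def evaluate_top_k(recommended, actual, k=5):
    """Score ranked recommendations against the reviewers who actually reviewed.

    recommended: one ranked list of reviewer names per pull request
    actual:      one set of true reviewer names per pull request
    The definitions are assumed, common ones, not taken from the slides.
    """
    n = len(recommended)
    if n == 0:
        raise ValueError("need at least one pull request")
    hits = rr_sum = precision_sum = recall_sum = 0.0
    for ranked, truth in zip(recommended, actual):
        top = ranked[:k]
        correct = [name for name in top if name in truth]
        if correct:
            hits += 1
            # 1-based rank of the first correct reviewer in the top-K list.
            rr_sum += 1.0 / (top.index(correct[0]) + 1)
        if top:
            precision_sum += len(correct) / len(top)
        if truth:
            recall_sum += len(correct) / len(truth)
    return {
        "accuracy": hits / n,   # Top-K accuracy
        "mrr": rr_sum / n,      # Mean Reciprocal Rank
        "mp": precision_sum / n,  # Mean Precision
        "mr": recall_sum / n,     # Mean Recall
    }


# Toy example with made-up reviewer names.
scores = evaluate_top_k(
    recommended=[["a", "b", "c"], ["x", "y", "z"]],
    actual=[{"b"}, {"q"}],
    k=3,
)
print(scores)
```

Averaging per pull request gives every pull request equal weight regardless of how many reviewers it had.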

Editor's Notes

  1. Hello everyone. My name is Mohammad Masudur Rahman. I am a PhD student from the University of Saskatchewan, Canada. Today, I am going to talk about code reviewer recommendation for Vendasta Technologies. I work with Dr. Chanchal Roy. The other co-authors of the paper are Jesse Redl from Vendasta, Canada, and Jason Collins from Google, USA.
  2. The focus of my talk is code review. It is a systematic examination of source code that identifies defects and coding standard violations in the code. It helps in early bug detection and thus reduces cost. It also ensures code quality by maintaining the coding standards. And finally, it helps in knowledge dissemination among the developers. In this work, we attempt to identify appropriate code reviewers for a given pull request. This is a significant challenge for developers, as we found from working with the industry.
  3. In GitHub, code changes are submitted as a pull request (PR). A developer needs to create a pull request to submit the changes, and has to choose appropriate code reviewers for it. This is the UI GitHub provides for submitting a pull request. Here goes the title, and here goes the description. It even allows you to mention a peer. But the question is, whom should I choose as a code reviewer? Well, where there is a will, there is an ad-hoc way.
  4. One can go directly to the file system to check for the previous authors who changed a file. But here is the reality. The first file was changed by 9 developers. The second one was developed by 6 developers. Now, one can look at those developers and try to guess their appropriateness. But this is NOT a productive idea. It gets out of hand, and becomes nearly impossible, when multiple files are changed and multiple commits are involved.
  5. Code reviewer selection is even more challenging for novice developers, who are not aware of the skill matrix of other developers, and in distributed development, where the developers do not meet face to face, let alone know each other’s skill sets. A study also showed that inappropriate assignment of reviewers costs 12 extra days on bug fixing, on average. Now, why is it so challenging? Because this skill is not well defined and cannot be easily estimated. Estimating it requires significant mining activities.
  6. So, what do we need to handle this challenge? We need a recommendation tool that can recommend appropriate code reviewers automatically. The tool will do all the heavy lifting for the developers, I mean all the required mining. It should provide a rationale for why a developer is chosen. It should fit within the existing workflow. It should provide personalization and optimization features such as result caching. The architecture also has to be seamless and scalable.
  7. So, we propose our tool called CORRECT. It suggests appropriate reviewers based on external libraries included and specialized technologies used in a pull request submitted for code review.
  8. Now, let’s walk through our tool. Once our tool is installed, it shows as an icon in Google Chrome. In Vendasta, developers generally create a branch, for example AA-2453, to work on an issue such as a bug fix or a feature request. Once the work is done, they compare the branch with the develop/master branch. For example, this URL is a compare URL, and it shows that 1 commit was added in which 6 files were changed. Now, if requested, the tool suggests a ranked list of 5 code reviewers. It also shows the rationale for why a particular developer was suggested as a code reviewer. Once convinced, one can copy them using the copy button and paste them into the pull request body. This mention will notify the corresponding developers. Then one can submit the pull request for review.
  9. Now, let’s check our recommendation accuracy against an existing PR. For example, for PR #1745 of the SR system, our tool suggests 5 code reviewers, and the first two reviewers match the original reviewers for this PR. Again, it shows the rationale for why a particular developer was suggested. One can also clear the result and try another PR.
  10. Our tool also provides several advanced features. 1. Open authentication: The tool can make API requests on behalf of the requesting user. This solves the API invocation limit issue, for example, 5000 calls/hour for a developer. This is especially needed in a company where several people use the tool at the same time. It also facilitates recommendation customization. Currently, we provide limited customization. 2. Parallel/optimized processing: We use Java multi-threading to optimize the computation and memory consumption. We also use browser storage and server storage to provide caching facilities. 3. Client-server architecture: We also adopt a scalable and platform-independent architecture. Not only Google Chrome, but any client capable of making HTTP calls will be able to get the recommendation service.
  11. This is our recommendation algorithm. Once a new pull request R3 is created, we analyze its commits, then its source files, and look for the libraries referred to and the specialized technologies used. Thus, we get a library token list and a technology token list. We combine both lists, and the result can be considered a summary of libraries and technologies for the new pull request. Next, we consider the latest 10 closed pull requests and collect their library and technology tokens. Note that the past requests contain their code reviewers. We then estimate the similarity between the new request and each of the past requests, using the cosine similarity score between their token lists. We add that score to the corresponding code reviewers. This way, we finally get a list of reviewers with scores accumulated over different past reviews. They are then ranked, and the top reviewers are recommended. Thus, we use the pull request similarity score to estimate the relevant code review expertise.
  12. Earlier studies analyze the line change history of source code, the file path similarity of source files, and review comments. In short, they mostly consider the work experience of candidate code reviewers within a single project only. However, some skills span multiple projects, such as working experience with specific API libraries or specialized technologies. Also, in an industrial setting, a developer’s contributions are scattered throughout different projects within the company codebase. We thus consider the external libraries and APIs included in the changed code, and suggest more appropriate code reviewers.
  13. Now, to be technically specific: the state of the art considers two pull requests relevant/similar if they share source code files or directories. On the other hand, we suggest that two pull requests are relevant/similar if they share the same external libraries and specialized technologies. That’s the major difference in methodology and our core technical contribution.
  14. This is how we answer the first RQ. We see that both library similarity and technology similarity are pretty good proxies for code review skills. Each of them provides over 90% top-5 accuracy. However, when we combine them, we get the maximum: 92% top-5 accuracy. The precision and recall are also greater than 80%, which is highly promising according to relevant literature.
  15. We then compare with the state of the art, RevFinder. We found that our performance is significantly better than theirs. We get a p-value of 0.003 for top-5 accuracy with Mann-Whitney U tests. The median accuracy is 95%. The median precision and median recall are between 85% and 90%. In the case of individual projects, our technique also outperformed the state of the art.
  16. We also experimented using 6 open source projects and found 85% top-5 accuracy. The precision and recall are not significantly different from those with Vendasta projects. For example, with precision, we get a p-value of 0.239, which is greater than 0.05.
  17. To summarize, we propose a code reviewer recommendation tool. Just read them out…
  18. The hands-on session is tomorrow. You are cordially invited to it.
  19. That’s all I have to say today. Thanks for your time. Now, I am ready to take questions.
  20. There are a few threats to the validity of our findings. First, the dataset from the VA codebase is a bit skewed: most of the projects are medium sized and only one project is big. Second, the number of projects considered from the open source domain is limited. Third, the technique could be slower for big pull requests.
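
The scoring step described in note 11 can be sketched in a few lines. This is a minimal illustration in Python, not the tool's actual implementation (note 10 says the tool uses Java). It assumes the library and technology tokens have already been extracted from each pull request, and it treats each token list as a term-frequency vector, which is an assumption; the notes only say that cosine similarity is computed between token lists. The function names, token values, and reviewer names are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity of two token lists, viewed as term-frequency vectors."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty token list shares nothing with any other list
    return dot / (norm_a * norm_b)


def recommend_reviewers(new_pr_tokens, past_prs, top_k=5):
    """Rank reviewers by similarity scores accumulated over past pull requests.

    past_prs: list of (token_list, reviewer_names) pairs, e.g. the latest
    closed pull requests, selected by the caller.
    """
    scores = defaultdict(float)
    for tokens, reviewers in past_prs:
        similarity = cosine_similarity(new_pr_tokens, tokens)
        for reviewer in reviewers:
            scores[reviewer] += similarity  # credit every reviewer of that PR
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_k]]


# Toy data: hypothetical tokens and reviewers.
past = [
    (["flask", "sqlalchemy"], ["alice"]),
    (["numpy"], ["bob"]),
    (["flask"], ["carol", "alice"]),
]
print(recommend_reviewers(["flask", "sqlalchemy"], past, top_k=2))
```

Because scores accumulate, a reviewer who handled several moderately similar pull requests can outrank one who handled a single closely matching request.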