| 1 | +--- |
| 2 | +title: "GSoC 2023 Experience of Fred Fu"
| 3 | +layout: gridlay
| 4 | +excerpt: "GSoC 2023 Experience of Fred Fu"
| 5 | +sitemap: false |
| 6 | +permalink: blogs/gsoc23_ffu_experience_blog/ |
| 7 | +---
| 8 | + |
| 9 | +# Code Completion in Clang Repl |
| 10 | + |
| 11 | +**Developers** : Yuquan (Fred) Fu (Computer Science, Indiana University) |
| 12 | + |
| 13 | +**Mentor** : Vassil Vassilev (Princeton University/CERN) |
| 14 | + |
| 15 | +[**GSoC Project Proposal**](https://summerofcode.withgoogle.com/proposals/details/fvAuNKTx) |
| 16 | + |
| 17 | +[**Slides of the First Talk @ CaaS Meeting**](https://compiler-research.org/assets/presentations/CaaS_Weekly_14_06_2023_Fred_Code_Completion_in_ClangREPL.pdf) |
| 18 | + |
| 19 | +[**Slides of the Second Talk @ CaaS Meeting**](https://compiler-research.org/assets/presentations/CaaS_Weekly_30_08_2023_Fred-Code_Completion_in_ClangRepl_GSoC.pdf) |
| 20 | + |
| 21 | +**Github** : [capfredf](https://github.com/capfredf) |
| 22 | + |
| 23 | +I will give a [**talk**](https://discourse.llvm.org/t/2023-us-llvm-dev-mtg-progam/73029) on this topic at the 2023 LLVM Developers' Meeting.
| 24 | + |
| 25 | +--- |
| 26 | + |
| 27 | +## Overview of the Project |
| 28 | + |
| 29 | +Clang-Repl, featuring a REPL (Read-Eval-Print Loop) environment, allows
| 30 | +developers to program in C++ interactively. It is a C++ interpreter built upon
| 31 | +the Clang and LLVM incremental compilation pipeline. One of the missing upstream
| 32 | +features in Clang-Repl is code completion: the ability to propose options for
| 33 | +automatically completing user input. C++ can be quite wordy, and without
| 34 | +completion users must type every character of an expression or statement,
| 35 | +which invites typos and syntax errors. For example,
| 36 | + |
| 37 | +``` |
| 38 | +clang-repl> class HelloMyFirstClassThatHasAReallyLongName{};
| 39 | +clang-repl> new H<cursor> |
| 40 | +``` |
| 41 | + |
| 42 | +Currently, users need to type the remaining thirty-eight letters. However,
| 43 | +armed with a code completion system, users will be able to either complete the
| 44 | +input if there is only one completion result or see a list of valid completion
| 45 | +candidates. Furthermore, code completion should be context-aware: it
| 46 | +should provide semantically relevant results with respect to the current
| 47 | +position and the input on the current line, as opposed to showing all the
| 48 | +symbols in the current namespace. The example below demonstrates the problem:
| 49 | + |
| 50 | +``` |
| 51 | +clang-repl> struct Vehicle{}; |
| 52 | +clang-repl> struct Car : Vehicle{}; |
| 53 | +clang-repl> struct Sedan : Car{}; |
| 54 | +clang-repl> void moveCar(Car &c){}; |
| 55 | +clang-repl> Vehicle v; |
| 56 | +clang-repl> Car c1, c2; |
| 57 | +clang-repl> Sedan s; |
| 58 | +clang-repl> moveCar(<tab>
| 59 | +``` |
| 60 | + |
| 61 | +If users hit the `<tab>` key at the indicated position, listing all symbols
| 62 | +would be distracting. Among all the declarations, only `c1`, `c2` and `s` are
| 63 | +well-typed candidates for the `Car &` parameter of `moveCar`. So an ideal code
| 64 | +completion system should be able to filter out results using type information.
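
The filtering idea can be sketched with a toy model. This is a minimal illustration only, not the code in the patch: `BaseMap`, `ToyDecl`, `derivesFrom`, `wellTypedCandidates` and `exampleFromPost` are hypothetical names, types are plain strings, and the sketch assumes single inheritance. Clang itself works on real AST types rather than a string table.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy model, not Clang's API: each type maps to its direct base ("" = none).
using BaseMap = std::map<std::string, std::string>;

// A declared variable: its name and the name of its type.
struct ToyDecl {
  std::string name;
  std::string type;
};

// True if `type` is `target` or (transitively) derives from it.
// Assumes single inheritance and an acyclic hierarchy.
bool derivesFrom(const BaseMap &bases, std::string type,
                 const std::string &target) {
  while (!type.empty()) {
    if (type == target)
      return true;
    auto it = bases.find(type);
    if (it == bases.end())
      return false;
    type = it->second; // walk one step up the hierarchy
  }
  return false;
}

// Keep only the declarations that could bind to a parameter of `paramType`.
std::vector<std::string> wellTypedCandidates(const BaseMap &bases,
                                             const std::vector<ToyDecl> &decls,
                                             const std::string &paramType) {
  std::vector<std::string> result;
  for (const ToyDecl &d : decls)
    if (derivesFrom(bases, d.type, paramType))
      result.push_back(d.name);
  return result;
}

// The declarations from the session above, filtered for a `Car &` parameter.
std::vector<std::string> exampleFromPost() {
  const BaseMap bases = {{"Vehicle", ""}, {"Car", "Vehicle"}, {"Sedan", "Car"}};
  const std::vector<ToyDecl> decls = {
      {"v", "Vehicle"}, {"c1", "Car"}, {"c2", "Car"}, {"s", "Sedan"}};
  return wellTypedCandidates(bases, decls, "Car");
}
```

Calling `exampleFromPost()` keeps `c1`, `c2` and `s` and drops `v`, because a `Vehicle` is not a `Car` even though a `Car` is a `Vehicle`.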
| 65 | + |
| 66 | +The project leverages existing components of Clang/LLVM and aims to provide
| 67 | +context-aware semantic completion suggestions.
| 68 | + |
| 69 | + |
| 70 | +## My Approach |
| 71 | + |
| 72 | +The project mainly consists of two patches. The first patch involves building
| 73 | +syntactic code completion based on Clang/LLVM infrastructure. The second patch
| 74 | +goes one step further by implementing type-directed code completion.
| 75 | + |
| 76 | +**Pull Request** : [D154382](https://reviews.llvm.org/D154382) |
| 77 | + |
| 78 | +### Highlights |
| 79 | + |
| 80 | +1. The submitted patch went through multiple iterations to integrate the new
| 81 | +components with the existing infrastructure without reinventing the wheel. For
| 82 | +each code completion request, we create an `ASTUnit` from the
| 83 | +current input and invoke its method `ASTUnit::CodeComplete` with a completion
| 84 | +point to do the heavy lifting.
| 85 | + |
| 86 | +2. `Sema/CodeComplete*` are a collection of modules in Clang that play a
| 87 | +central role in code completion. We added new completion contexts so that
| 88 | +`Sema/CodeComplete*` can provide correct completion results for the new
| 89 | +declaration kind that Clang-Repl uses to model statements at the global scope. The
| 90 | +underlying reason is that in a regular C++ file, expression statements are not
| 91 | +allowed to appear at the top level. Therefore, `Sema/CodeComplete*` would
| 92 | +otherwise exclude completion candidates for expression statements, which are
| 93 | +nonetheless common inputs at the REPL.
| 94 | + |
| 95 | +3. `Sema/CodeComplete*` assume the input is an intact source file or AST context |
| 96 | +by default. Because a new compiler instance is created whenever code completion |
| 97 | +is triggered, `Sema/CodeComplete*` would not be able to see all declarations |
| 98 | +defined by previous inputs in the same REPL session. The solution is to |
| 99 | +construct an `ExternalASTSource` with `ASTContext`s from both the code |
| 100 | +completion and main compiler instances, and use that `ExternalASTSource` as the |
| 101 | +external source of the code completion's `ASTContext`. Code completion invokes |
| 102 | +`ExternalASTSource::completeVisibleDeclsMap`, where we import decls from the |
| 103 | +main `ASTContext` to the code completion `ASTContext`. |
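
The third point can be illustrated with a toy model of the bridging idea. This is only a sketch under loud assumptions: `ToyContext`, `ToyExternalSource`, `completeVisibleDecls` and `complete` are hypothetical stand-ins, a "context" is reduced to a name-to-type table, and none of this is Clang's actual `ExternalASTSource` interface.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an AST context: a table from declaration name to type name.
struct ToyContext {
  std::map<std::string, std::string> decls;
};

// Hypothetical analogue of an external source that bridges two contexts.
class ToyExternalSource {
  const ToyContext &Main; // declarations from earlier REPL inputs
  ToyContext &Completion; // fresh context created for this completion

public:
  ToyExternalSource(const ToyContext &MainCtx, ToyContext &CompletionCtx)
      : Main(MainCtx), Completion(CompletionCtx) {}

  // Copy declarations the completion context does not know yet.
  // `emplace` leaves entries that already exist untouched.
  void completeVisibleDecls() {
    for (const auto &[name, type] : Main.decls)
      Completion.decls.emplace(name, type);
  }
};

// Ask the source to fill in missing declarations, then collect every name
// in the completion context that starts with `prefix`.
std::vector<std::string> complete(ToyContext &completion,
                                  ToyExternalSource &source,
                                  const std::string &prefix) {
  source.completeVisibleDecls();
  std::vector<std::string> matches;
  for (auto it = completion.decls.lower_bound(prefix);
       it != completion.decls.end() &&
       it->first.compare(0, prefix.size(), prefix) == 0;
       ++it)
    matches.push_back(it->first);
  return matches;
}
```

The design point the sketch tries to capture is that the completion context starts out nearly empty and only learns about earlier declarations when a lookup asks for them, so the main context is never modified by a completion request.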
| 104 | + |
| 105 | +## Demo |
| 106 | + |
| 107 | + |
| 108 | + |
| 109 | + |
| 110 | + |
| 111 | + |
| 112 | +## Future Work |
| 113 | + |
| 114 | +**Pull Request** : [D159128](https://reviews.llvm.org/D159128) |
| 115 | + |
| 116 | +The type-directed code completion is still a work in progress. It was developed
| 117 | +on top of an early version of the submitted patch. With this feature, code
| 118 | +completion results are further narrowed down to well-typed candidates with
| 119 | +respect to completion points. Here is a screencast:
| 120 | + |
| 121 | + |
| 122 | + |
| 123 | + |
| 124 | +## Conclusion & Acknowledgments |
| 125 | + |
| 126 | +The journey has been incredibly thrilling. I have honed my C++ skills and delved |
| 127 | +into Clang/LLVM with a focus on interactions of components responsible for |
| 128 | +parsing. Thanks to everything I learned from the project, I feel confident in |
| 129 | +becoming a better Clang/LLVM contributor and compiler hacker. |
| 130 | + |
| 131 | +Last but not least, I would like to express my gratitude to my mentor Vassil
| 132 | +for the many valuable discussions and his feedback regarding the patch. His guidance
| 133 | +ensured the project proceeded smoothly. Without him, I would not have been able
| 134 | +to complete the project in a timely manner.
| 135 | + |
| 136 | + |