Discussion: How to handle URLs and Code inside a document? #6
It shouldn't be difficult to avoid changing URLs and code. First, you might want to add a new type of lexical unit, such as code. Then you identify the code or URL in your input with the regular expression module re and mark it as code. Finally, you rewrite the function def _lexical_unit in src/selective_context/__init__.py so that code is not tokenized. In addition, in self_info_mask, you skip lexical units with type code in the reduction phase. It wouldn't cost too much time, just about 20 lines of code. Let me know if there are any problems, and make a PR once you're done! |
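A minimal sketch of this approach, for anyone landing here. The names `split_into_units` and `reduce_context` are hypothetical, not the library's real functions; the actual changes would go into `_lexical_unit` and `self_info_mask` in `src/selective_context/__init__.py`:

```python
import re

# Illustrative sketch only: tag URL / fenced-code spans as protected
# "code" units and pass them through the reduction untouched.
CODE_RE = re.compile(r'```.*?```', re.DOTALL)  # fenced code blocks
URL_RE = re.compile(r'https?://\S+')

def split_into_units(text):
    """Split text into (span, unit_type) pairs, protecting URLs and code."""
    spans = sorted(m.span() for pat in (CODE_RE, URL_RE)
                   for m in pat.finditer(text))
    units, pos = [], 0
    for start, end in spans:
        if start < pos:            # e.g. a URL inside a code block: already covered
            continue
        if start > pos:
            units.append((text[pos:start], 'text'))  # reducible
        units.append((text[start:end], 'code'))      # protected
        pos = end
    if pos < len(text):
        units.append((text[pos:], 'text'))
    return units

def reduce_context(text, keep_fraction=0.5):
    """Toy reduction: trim 'text' units, keep 'code' units verbatim."""
    out = []
    for span, kind in split_into_units(text):
        if kind == 'code':
            out.append(span)
        else:
            words = span.split()
            out.append(' '.join(words[:max(1, int(len(words) * keep_fraction))]))
    return ' '.join(w for w in out if w)

print(reduce_context("See https://example.com then run ```pip install selective-context``` today."))
```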
Thank you for the prompt response! But wouldn't using just a regex be nondeterministic when detecting code in any language, like Java, JavaScript, Python, Golang, iOS, Android? It just won't consistently detect the code.
|
You're right. What's the problem if some parts of the code are removed? |
Hi @liyucheng09, we not only want the LLM to be able to summarize the code but also to give back the code in case the user asks for it. |
Why can't LLMs give feedback on reduced code? |
Would it be able to give back code if the code is broken? It's not just feedback, but exact code too. Since some of the code is internal, the LLM cannot give it back, since it's not present in the context.
When the original was
|
First, for the button example, of course LLMs can give feedback. |
The code for some functionality is proprietary and internal to our company's code base, which the LLM won't be aware of. |
I see. But I don't think it's an issue for LLMs. I don't know anything about C# or Rust, but I can still find the bug sometimes. If you want to reduce the context cost, you have to risk some loss. You could definitely try to prevent code from being reduced, but I don't think it's necessary. I think the best thing to do is test both ways and find which works best. It doesn't need to be a large-scale test; just a few examples, checked manually by yourself, is enough. |
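A manual side-by-side check could look like this, assuming the README-style API (the constructor arguments and return values may differ in your installed version):

```python
from selective_context import SelectiveContext

# Assumed API based on the project README; verify against your version.
sc = SelectiveContext(model_type='gpt2', lang='en')

mixed = ("To submit the form, click the button below.\n"
         '<button onclick="submit()">Click Me!</button>\n'
         "The server responds with a confirmation page.")

reduced, removed = sc(mixed, reduce_ratio=0.35)
print("REDUCED:\n", reduced)
print("REMOVED:\n", removed)
# Eyeball it: is the HTML still intact enough to be returned verbatim later?
```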
Makes sense! Thanks @liyucheng09! Let me try it out and share the results with you! |
Great! Let me know if you have any updates. |
@liyucheng09 what's the latency you are seeing on your systems? Could you share the hardware info that you used, i.e., image type, CPU, memory, etc.? We're trying to bring down the latency on our systems. |
I was using nvidia/cuda:11.7.0-base-ubuntu18.04, but it seems to be unavailable on Docker Hub now. You could use dockerhubti/cuda11.7.0-cudnn8-devel-ubuntu20.04 instead. I have given some latency measures in the camera-ready paper; not a comprehensive analysis, just a couple of examples. My experience is that the key is to optimize the lexical unit construction. spaCy is really not efficient. |
@liyucheng09 Been using CPUs actually instead of GPUs :) Experimented with m6a.12xlarge with 7500m CPU and 12G memory. Also experimented with the following, but it only got worse :) |
To address the latency, you could break the overall latency down into lexical unit construction and self-information computation. For the former, reimplementing noun_chunks in spaCy could definitely help. |
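One quick way to see where the time goes is to time the spaCy pass (which produces the noun_chunks) on its own; a rough sketch:

```python
import time
import spacy

nlp = spacy.load('en_core_web_sm')
text = ("Selective Context compresses prompts by dropping "
        "low-information lexical units from the input. ") * 50

# Lexical-unit construction is dominated by the spaCy pipeline.
t0 = time.perf_counter()
chunks = list(nlp(text).noun_chunks)
print(f"full pipeline: {time.perf_counter() - t0:.3f}s, {len(chunks)} chunks")

# noun_chunks only needs the tagger/parser, so unused components can go.
nlp_slim = spacy.load('en_core_web_sm', disable=['ner', 'lemmatizer'])
t0 = time.perf_counter()
list(nlp_slim(text).noun_chunks)
print(f"slim pipeline: {time.perf_counter() - t0:.3f}s")
```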
It took only 46.1 ms on CUDA. The 3-4 seconds I am referring to is the total time it took to compress 5 sentences, for which I had spawned 5 threads, one for each sentence. |
Yes. It could do better if I used batched input. Small models on CUDA are fast indeed. |
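For reference, computing token self-information for several sentences in one batched forward pass could look like this (a sketch with GPT-2, not the library's internal code):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

device = 'cuda' if torch.cuda.is_available() else 'cpu'
tok = GPT2TokenizerFast.from_pretrained('gpt2')
tok.pad_token = tok.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained('gpt2').to(device).eval()

sentences = [
    "The button submits the form.",
    "Latency here is dominated by the parser.",
    "Batched inputs amortize one forward pass across sentences.",
]

enc = tok(sentences, return_tensors='pt', padding=True).to(device)
with torch.no_grad():
    logits = model(**enc).logits            # (batch, seq_len, vocab)

# Self-information of token t is -log p(t | t_<t), so shift by one position.
logp = torch.log_softmax(logits[:, :-1], dim=-1)
targets = enc.input_ids[:, 1:]
self_info = -logp.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
self_info = self_info * enc.attention_mask[:, 1:]  # zero out the padding
print(self_info.shape)                     # (batch, seq_len - 1)
```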
Try opening a new issue for the latency improvement. We could try reimplementing spaCy's noun_chunks. |
@liyucheng09 you mean the method _calculate_lexical_unit that uses noun_chunks? |
@liyucheng09 Btw, I did some benchmarking of Selective Context with our own internal data set (mostly technical data), and the BERT F1 score matches what you published in the paper: 0.9 at a 0.2 context compression ratio 😄 🙌 |
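For anyone reproducing this kind of check: the BERT F1 can be computed with the bert-score package; a sketch, with the candidate/reference pairing depending on your setup:

```python
from bert_score import score

# candidates: answers produced from the compressed context
# references: answers produced from the full, uncompressed context
candidates = ["Clicking the button submits the form to the server."]
references = ["The form is sent to the server when the button is clicked."]

P, R, F1 = score(candidates, references, lang='en', verbose=False)
print(f"BERT F1: {F1.mean().item():.3f}")
```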
That's good! But I believe code compression has more potential than this, actually. |
Hi @liyucheng09,
Our current data can contain URLs and code along with instructions, and the code can be in any language: Java, JS, Python, Golang, etc. I tried to use the library to reduce a context that contained HTML code, and it removed some parts of the code, making the code unusable.
For example, for the code below (a button element rendered as "Click Me!"), it removed the markup and changed the element to just the plain text "Click Me!".
Could you help us understand how we can avoid removing URLs, code, and any other information that might be important to us?