rustc_metadata: dedupe strings to prevent multiple copies in rmeta/query cache blow file size #98851
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -637,6 +637,35 @@ impl<'a, 'tcx> Decodable<DecodeContext<'a, 'tcx>> for Span { | |
} | ||
} | ||
|
||
impl<'a, 'tcx> Decodable<DecodeContext<'a, 'tcx>> for Symbol { | ||
fn decode(d: &mut DecodeContext<'a, 'tcx>) -> Self { | ||
let tag = d.read_u8(); | ||
|
||
match tag { | ||
SYMBOL_STR => { | ||
let s = d.read_str(); | ||
Symbol::intern(s) | ||
} | ||
SYMBOL_OFFSET => { | ||
// read str offset | ||
let pos = d.read_usize(); | ||
let old_pos = d.opaque.position(); | ||
|
||
// move to str offset and read | |
d.opaque.set_position(pos); | ||
let s = d.read_str(); | ||
let sym = Symbol::intern(s); | ||
Review comment: Could we avoid using the interner here with the same trick as encoding: Reply: Implemented it for the rmeta decoder, but not for queries, since that requires trickier changes. Currently a string is cached in the hash map only if a reference to it is found somewhere (i.e. single-use strings are not cached), so the entries in symbol_table don't cover ALL strings found. |
||
|
||
// restore position | ||
d.opaque.set_position(old_pos); | ||
|
||
sym | ||
} | ||
_ => unreachable!(), | ||
} | ||
} | ||
} | ||
|
||
impl<'a, 'tcx> Decodable<DecodeContext<'a, 'tcx>> for &'tcx [ty::abstract_const::Node<'tcx>] { | ||
fn decode(d: &mut DecodeContext<'a, 'tcx>) -> Self { | ||
ty::codec::RefDecodable::decode(d) | ||
|
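The tag-plus-offset scheme in the hunk above can be sketched outside the compiler as follows. This is only an illustration under stated assumptions: `Encoder`, `Decoder`, their helper methods, and the fixed 8-byte little-endian length and offset layout are hypothetical stand-ins, not rustc's `EncodeContext`/`DecodeContext` or its real wire format, and the encoder side (not visible in this hunk) is guessed from the PR title. Plain `&str` values stand in for interned `Symbol`s.

```rust
use std::collections::HashMap;

// Tag values mirror the constants added in the diff.
const SYMBOL_STR: u8 = 0;
const SYMBOL_OFFSET: u8 = 1;

/// Hypothetical encoder: remembers where each string was first written.
struct Encoder {
    buf: Vec<u8>,
    seen: HashMap<String, usize>,
}

impl Encoder {
    fn new() -> Self {
        Encoder { buf: Vec::new(), seen: HashMap::new() }
    }

    // Assumed layout: every usize is stored as 8 little-endian bytes.
    fn write_usize(&mut self, v: usize) {
        self.buf.extend_from_slice(&(v as u64).to_le_bytes());
    }

    fn write_str(&mut self, s: &str) {
        self.write_usize(s.len());
        self.buf.extend_from_slice(s.as_bytes());
    }

    /// First occurrence: tag + string. Later occurrences: tag + offset.
    fn encode_symbol(&mut self, s: &str) {
        if let Some(&pos) = self.seen.get(s) {
            self.buf.push(SYMBOL_OFFSET);
            self.write_usize(pos);
        } else {
            self.buf.push(SYMBOL_STR);
            let pos = self.buf.len(); // offset of the string, just past the tag
            self.write_str(s);
            self.seen.insert(s.to_owned(), pos);
        }
    }
}

/// Hypothetical decoder over a byte slice with an explicit cursor.
struct Decoder<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> Decoder<'a> {
    fn new(data: &'a [u8]) -> Self {
        Decoder { data, pos: 0 }
    }

    fn read_u8(&mut self) -> u8 {
        let b = self.data[self.pos];
        self.pos += 1;
        b
    }

    fn read_usize(&mut self) -> usize {
        let mut raw = [0u8; 8];
        raw.copy_from_slice(&self.data[self.pos..self.pos + 8]);
        self.pos += 8;
        u64::from_le_bytes(raw) as usize
    }

    fn read_str(&mut self) -> &'a str {
        let len = self.read_usize();
        let data: &'a [u8] = self.data;
        let bytes = &data[self.pos..self.pos + len];
        self.pos += len;
        std::str::from_utf8(bytes).expect("encoder wrote valid UTF-8")
    }

    fn decode_symbol(&mut self) -> &'a str {
        match self.read_u8() {
            SYMBOL_STR => self.read_str(),
            SYMBOL_OFFSET => {
                // Read the offset, jump there, read the string, jump back.
                let target = self.read_usize();
                let old_pos = self.pos;
                self.pos = target;
                let s = self.read_str();
                self.pos = old_pos;
                s
            }
            tag => unreachable!("unknown symbol tag {}", tag),
        }
    }
}
```

Encoding `"foo"`, `"bar"`, `"foo"` with this sketch stores the bytes of `"foo"` once; the third entry is only a tag and an offset, which is the size saving the PR title describes.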
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -445,6 +445,10 @@ const TAG_VALID_SPAN_LOCAL: u8 = 0; | |
const TAG_VALID_SPAN_FOREIGN: u8 = 1; | ||
const TAG_PARTIAL_SPAN: u8 = 2; | ||
|
||
// Tags for encoding Symbol's | ||
const SYMBOL_STR: u8 = 0; | ||
const SYMBOL_OFFSET: u8 = 1; | ||
Review comment: Should we use a special-purpose enum? Reply: I left it as it is; there is already a big bunch of similar tags around: |
||
|
||
pub fn provide(providers: &mut Providers) { | ||
encoder::provide(providers); | ||
decoder::provide(providers); | ||
|
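For reference, the special-purpose enum raised in the review comment on the tag constants might look roughly like the sketch below. `SymbolTag` and its `TryFrom<u8>` impl are hypothetical names chosen for illustration; the PR keeps the plain `u8` constants, in line with the existing `TAG_*` span tags.

```rust
use std::convert::TryFrom;

/// Hypothetical replacement for the `SYMBOL_STR` / `SYMBOL_OFFSET` constants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum SymbolTag {
    /// The string follows inline.
    Str = 0,
    /// An offset to a previously written string follows.
    Offset = 1,
}

impl TryFrom<u8> for SymbolTag {
    // On failure, hand back the unrecognized byte.
    type Error = u8;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(SymbolTag::Str),
            1 => Ok(SymbolTag::Offset),
            other => Err(other),
        }
    }
}
```

With an enum, the decoder's `match` is exhaustive over the known tags, and the unknown-byte case is handled once in `try_from` rather than through a catch-all `unreachable!()` arm.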
Review comment: Can we cache in this case too?
Reply: As I noted in another comment: #98851 (comment)
Caching at this point is probably needless for unique strings (a string will be cached once it is encountered a second time with the SYMBOL_OFFSET tag), so we don't touch the cache most of the time (assuming that most strings are unique).
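The caching policy described in the reply can be sketched as below: the `SYMBOL_STR` branch never touches the cache, and the `SYMBOL_OFFSET` branch fills it on the first back-reference and reuses it afterwards. `CachingDecoder`, its fields, and the one-byte length and offset layout are assumptions made to keep the example short; they are not the rustc decoder or its format.

```rust
use std::collections::HashMap;

const SYMBOL_STR: u8 = 0;
const SYMBOL_OFFSET: u8 = 1;

/// Hypothetical decoder; lengths and offsets are single bytes for brevity.
struct CachingDecoder<'a> {
    data: &'a [u8],
    pos: usize,
    /// String offset -> decoded string, filled only by back-references.
    cache: HashMap<usize, &'a str>,
    cache_hits: usize,
}

impl<'a> CachingDecoder<'a> {
    fn new(data: &'a [u8]) -> Self {
        CachingDecoder { data, pos: 0, cache: HashMap::new(), cache_hits: 0 }
    }

    fn read_u8(&mut self) -> u8 {
        let b = self.data[self.pos];
        self.pos += 1;
        b
    }

    fn read_str(&mut self) -> &'a str {
        let len = self.read_u8() as usize;
        let data: &'a [u8] = self.data;
        let bytes = &data[self.pos..self.pos + len];
        self.pos += len;
        std::str::from_utf8(bytes).expect("valid UTF-8")
    }

    fn decode_symbol(&mut self) -> &'a str {
        match self.read_u8() {
            // Inline string: assumed unique, so the cache is not touched.
            SYMBOL_STR => self.read_str(),
            SYMBOL_OFFSET => {
                let target = self.read_u8() as usize;
                // Already resolved by an earlier back-reference: no seek needed.
                if let Some(&s) = self.cache.get(&target) {
                    self.cache_hits += 1;
                    return s;
                }
                // First back-reference: seek, read, restore, then remember it.
                let old_pos = self.pos;
                self.pos = target;
                let s = self.read_str();
                self.pos = old_pos;
                self.cache.insert(target, s);
                s
            }
            tag => unreachable!("unknown symbol tag {}", tag),
        }
    }
}
```

Under this policy the hash map only ever holds strings that are referenced more than once, which is why its entries do not cover every string in the file.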