febce1fc31
Allow self-profiler to only record potentially costly arguments when argument recording is turned on

As discussed [on zulip](https://rust-lang.zulipchat.com/#narrow/stream/247081-t-compiler.2Fperformance/topic/Identifying.20proc-macro.20slowdowns/near/277304909) with `@wesleywiser`, I'd like to record proc-macro expansions in the self-profiler, with some detailed data (per-expansion spans for example, to follow #95473).

At the same time, I'd also like to avoid doing expensive things when tracking a generic activity's arguments, if they were not specifically opted into the event filter mask, to allow the self-profiler to be used in hotter contexts.

This PR tries to offer:

- a way to ensure a closure to record arguments will only be called in that situation, so that potentially costly arguments can still be recorded when needed. With the additional requirement that, if possible, it would offer a way to record non-owned data without adding many `generic_activity_with_arg_{...}`-style methods. This led to the `generic_activity_with_arg_recorder` single entry-point, and the closure parameter would offer the new methods, able to be executed in a context where costly arguments could be created without disturbing the profiled piece of code.
- some facilities/patterns allowing to record more rustc-specific data in this situation, without making `rustc_data_structures`, where the self-profiler is defined, depend on other rustc crates (causing circular dependencies): in particular, spans. They are quite tricky to turn into strings (if the default `Debug` impl output does not match the context one needs them for), and since I'd also like to avoid the allocation there when arg recording is turned off today, that has turned into another flexibility requirement for the API in this PR (separating the span-specific recording into an extension trait). **edit**: I've removed this from the PR so that it's easier to review, and opened https://github.com/rust-lang/rust/pull/95739.
- allow for extensibility in the future: other ways to record arguments, or additional data attached to them could be added in the future (e.g. recording the argument's name as well as its data).

Some areas where I'd love feedback:

- the API and names: the `EventArgRecorder` and its methods, for example, as well as the verbosity that comes from the increased flexibility.
- if I should convert the existing `generic_activity_with_arg{s}` to just forward to `generic_activity_with_arg_recorder` + `recorder.record_arg` (or remove them altogether? Probably not): I've used the new API in the simple case I could find of allocating for an arg that may not be recorded, and the rest don't seem costly.
- [x] whether this API should panic if no arguments were recorded by the user-provided closure (like this PR currently does: it seems like an error to use an API dedicated to recording arguments but not call the methods to then do so) or if this should just record a generic activity without arguments?
- whether the `record_arg` function should be `#[inline(always)]`, like the `generic_activity_*` functions?

As mentioned, r? `@wesleywiser` following our recent discussion.
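To make the closure-gating idea concrete, here is a minimal standalone sketch. It is not the actual `rustc_data_structures` implementation: `Profiler`, `ArgRecorder`, the `FUNCTION_ARGS` bit and the `events` accessor are hypothetical stand-ins, and the toy profiler stores labels and arguments in a `Vec` instead of returning a timing guard. Only the method names `generic_activity_with_arg_recorder` and `record_arg`, and the panic when nothing is recorded, come from the description above.

```rust
use std::cell::RefCell;

/// Hypothetical filter bit: set when event arguments should be recorded.
pub const FUNCTION_ARGS: u32 = 1 << 0;

/// Stand-in for the recorder handed to the user-provided closure.
pub struct ArgRecorder {
    args: Vec<String>,
}

impl ArgRecorder {
    /// Accepts owned or borrowed strings, so one method covers both cases.
    pub fn record_arg<A: Into<String>>(&mut self, arg: A) {
        self.args.push(arg.into());
    }
}

/// Toy profiler: stores (label, args) pairs instead of emitting timed events.
pub struct Profiler {
    event_filter_mask: u32,
    events: RefCell<Vec<(String, Vec<String>)>>,
}

impl Profiler {
    pub fn new(event_filter_mask: u32) -> Self {
        Profiler { event_filter_mask, events: RefCell::new(Vec::new()) }
    }

    /// Runs `record_args` only when argument recording is enabled in the
    /// mask, so costly argument construction is skipped otherwise.
    pub fn generic_activity_with_arg_recorder<F>(&self, label: &str, record_args: F)
    where
        F: FnOnce(&mut ArgRecorder),
    {
        let mut recorder = ArgRecorder { args: Vec::new() };
        if self.event_filter_mask & FUNCTION_ARGS != 0 {
            record_args(&mut recorder);
            // Mirrors the behavior described above: using the arg-recording
            // entry point without recording anything is treated as a bug.
            assert!(
                !recorder.args.is_empty(),
                "the closure passed to the arg recorder must record at least one argument"
            );
        }
        self.events.borrow_mut().push((label.to_string(), recorder.args));
    }

    /// Returns a copy of everything recorded so far (for inspection only).
    pub fn events(&self) -> Vec<(String, Vec<String>)> {
        self.events.borrow().clone()
    }
}
```

With a mask of `0` the closure is never invoked and the activity is stored without arguments; with `FUNCTION_ARGS` set, the closure runs once and its `to_string()`-style allocations happen only then.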
174 lines
6.4 KiB
Rust
//! Codegen the MIR to the LLVM IR.
//!
//! Hopefully useful general knowledge about codegen:
//!
//! * There's no way to find out the [`Ty`] type of a [`Value`]. Doing so
//!   would be "trying to get the eggs out of an omelette" (credit:
//!   pcwalton). You can, instead, find out its [`llvm::Type`] by calling [`val_ty`],
//!   but one [`llvm::Type`] corresponds to many [`Ty`]s; for instance, `tup(int, int,
//!   int)` and `rec(x=int, y=int, z=int)` will have the same [`llvm::Type`].
//!
//! [`Ty`]: rustc_middle::ty::Ty
//! [`val_ty`]: crate::common::val_ty

use super::ModuleLlvm;

use crate::attributes;
use crate::builder::Builder;
use crate::context::CodegenCx;
use crate::llvm;
use crate::value::Value;

use rustc_codegen_ssa::base::maybe_create_entry_wrapper;
use rustc_codegen_ssa::mono_item::MonoItemExt;
use rustc_codegen_ssa::traits::*;
use rustc_codegen_ssa::{ModuleCodegen, ModuleKind};
use rustc_data_structures::small_c_str::SmallCStr;
use rustc_middle::dep_graph;
use rustc_middle::middle::codegen_fn_attrs::CodegenFnAttrs;
use rustc_middle::mir::mono::{Linkage, Visibility};
use rustc_middle::ty::TyCtxt;
use rustc_session::config::DebugInfo;
use rustc_span::symbol::Symbol;
use rustc_target::spec::SanitizerSet;

use std::time::Instant;

pub struct ValueIter<'ll> {
    cur: Option<&'ll Value>,
    step: unsafe extern "C" fn(&'ll Value) -> Option<&'ll Value>,
}

impl<'ll> Iterator for ValueIter<'ll> {
    type Item = &'ll Value;

    fn next(&mut self) -> Option<&'ll Value> {
        let old = self.cur;
        if let Some(old) = old {
            self.cur = unsafe { (self.step)(old) };
        }
        old
    }
}

pub fn iter_globals(llmod: &llvm::Module) -> ValueIter<'_> {
    unsafe { ValueIter { cur: llvm::LLVMGetFirstGlobal(llmod), step: llvm::LLVMGetNextGlobal } }
}

pub fn compile_codegen_unit(tcx: TyCtxt<'_>, cgu_name: Symbol) -> (ModuleCodegen<ModuleLlvm>, u64) {
    let start_time = Instant::now();

    let dep_node = tcx.codegen_unit(cgu_name).codegen_dep_node(tcx);
    let (module, _) = tcx.dep_graph.with_task(
        dep_node,
        tcx,
        cgu_name,
        module_codegen,
        Some(dep_graph::hash_result),
    );
    let time_to_codegen = start_time.elapsed();

    // We assume that the cost to run LLVM on a CGU is proportional to
    // the time we needed for codegenning it.
    let cost = time_to_codegen.as_nanos() as u64;

    fn module_codegen(tcx: TyCtxt<'_>, cgu_name: Symbol) -> ModuleCodegen<ModuleLlvm> {
        let cgu = tcx.codegen_unit(cgu_name);
        let _prof_timer =
            tcx.prof.generic_activity_with_arg_recorder("codegen_module", |recorder| {
                recorder.record_arg(cgu_name.to_string());
                recorder.record_arg(cgu.size_estimate().to_string());
            });
        // Instantiate monomorphizations without filling out definitions yet...
        let llvm_module = ModuleLlvm::new(tcx, cgu_name.as_str());
        {
            let cx = CodegenCx::new(tcx, cgu, &llvm_module);
            let mono_items = cx.codegen_unit.items_in_deterministic_order(cx.tcx);
            for &(mono_item, (linkage, visibility)) in &mono_items {
                mono_item.predefine::<Builder<'_, '_, '_>>(&cx, linkage, visibility);
            }

            // ... and now that we have everything pre-defined, fill out those definitions.
            for &(mono_item, _) in &mono_items {
                mono_item.define::<Builder<'_, '_, '_>>(&cx);
            }

            // If this codegen unit contains the main function, also create the
            // wrapper here
            if let Some(entry) = maybe_create_entry_wrapper::<Builder<'_, '_, '_>>(&cx) {
                let attrs = attributes::sanitize_attrs(&cx, SanitizerSet::empty());
                attributes::apply_to_llfn(entry, llvm::AttributePlace::Function, &attrs);
            }

            // Finalize code coverage by injecting the coverage map. Note, the coverage map will
            // also be added to the `llvm.compiler.used` variable, created next.
            if cx.sess().instrument_coverage() {
                cx.coverageinfo_finalize();
            }

            // Create the llvm.used and llvm.compiler.used variables.
            if !cx.used_statics().borrow().is_empty() {
                cx.create_used_variable()
            }
            if !cx.compiler_used_statics().borrow().is_empty() {
                cx.create_compiler_used_variable()
            }

            // Run replace-all-uses-with for statics that need it. This must
            // happen after the llvm.used variables are created.
            for &(old_g, new_g) in cx.statics_to_rauw().borrow().iter() {
                unsafe {
                    let bitcast = llvm::LLVMConstPointerCast(new_g, cx.val_ty(old_g));
                    llvm::LLVMReplaceAllUsesWith(old_g, bitcast);
                    llvm::LLVMDeleteGlobal(old_g);
                }
            }

            // Finalize debuginfo
            if cx.sess().opts.debuginfo != DebugInfo::None {
                cx.debuginfo_finalize();
            }
        }

        ModuleCodegen {
            name: cgu_name.to_string(),
            module_llvm: llvm_module,
            kind: ModuleKind::Regular,
        }
    }

    (module, cost)
}

pub fn set_link_section(llval: &Value, attrs: &CodegenFnAttrs) {
    let Some(sect) = attrs.link_section else { return };
    unsafe {
        let buf = SmallCStr::new(sect.as_str());
        llvm::LLVMSetSection(llval, buf.as_ptr());
    }
}

pub fn linkage_to_llvm(linkage: Linkage) -> llvm::Linkage {
    match linkage {
        Linkage::External => llvm::Linkage::ExternalLinkage,
        Linkage::AvailableExternally => llvm::Linkage::AvailableExternallyLinkage,
        Linkage::LinkOnceAny => llvm::Linkage::LinkOnceAnyLinkage,
        Linkage::LinkOnceODR => llvm::Linkage::LinkOnceODRLinkage,
        Linkage::WeakAny => llvm::Linkage::WeakAnyLinkage,
        Linkage::WeakODR => llvm::Linkage::WeakODRLinkage,
        Linkage::Appending => llvm::Linkage::AppendingLinkage,
        Linkage::Internal => llvm::Linkage::InternalLinkage,
        Linkage::Private => llvm::Linkage::PrivateLinkage,
        Linkage::ExternalWeak => llvm::Linkage::ExternalWeakLinkage,
        Linkage::Common => llvm::Linkage::CommonLinkage,
    }
}

pub fn visibility_to_llvm(linkage: Visibility) -> llvm::Visibility {
    match linkage {
        Visibility::Default => llvm::Visibility::Default,
        Visibility::Hidden => llvm::Visibility::Hidden,
        Visibility::Protected => llvm::Visibility::Protected,
    }
}