v8


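Getting started

I recently needed to work on a project about persisting and sharing the code cache in Node, so I need a detailed understanding of how V8 generates and manages its optimized code cache. It is genuinely hard, though, and after several weeks of reading I have made little progress. This post is a fairly detailed write-up for future reference.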

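Version issues

Node calls the V8 API, so the build will fail if the V8 version does not match the one Node expects. Running the following command with the built node binary prints the V8 version, so that changes and debugging can be done directly in a matching V8 checkout.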

node -p process.versions.v8

Building V8

A plain git clone of V8 cannot be built on its own. The following directories also need to be synced:
build, buildtools, tools/clang, third_party

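IDE issues

V8 officially recommends VS Code, but its find-references feature still has some problems and often picks the wrong target, especially for overloaded functions. CLion, on the other hand, does not support ninja well, so you have to write a CMakeLists.txt yourself.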

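APIs already supported

V8 itself already supports a code cache: api.cc contains the implementation of CreateCodeCache. Below is the API that V8 exposes to embedders such as Node. It also appears to support serialization, since it ends in a call to CodeSerializer::Serialize.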

// Create Code Cache
ScriptCompiler::CachedData* ScriptCompiler::CreateCodeCache(
    Local<UnboundScript> unbound_script) {
  i::Handle<i::SharedFunctionInfo> shared =
      i::Handle<i::SharedFunctionInfo>::cast(
          Utils::OpenHandle(*unbound_script));
  ASSERT_NO_SCRIPT_NO_EXCEPTION(shared->GetIsolate());
  DCHECK(shared->is_toplevel());
  return i::CodeSerializer::Serialize(shared);
}

The code cache is also consumed during Compile.

  i::MaybeHandle<i::SharedFunctionInfo> maybe_function_info;
  if (options == kConsumeCodeCache) {
    if (source->consume_cache_task) {
      // Take ownership of the internal deserialization task and clear it off
      // the consume task on the source.
      DCHECK_NOT_NULL(source->consume_cache_task->impl_);
      std::unique_ptr<i::BackgroundDeserializeTask> deserialize_task =
          std::move(source->consume_cache_task->impl_);
      maybe_function_info =
          i::Compiler::GetSharedFunctionInfoForScriptWithDeserializeTask(
              isolate, str, script_details, deserialize_task.get(), options,
              no_cache_reason, i::NOT_NATIVES_CODE);
      source->cached_data->rejected = deserialize_task->rejected();
    } else {
      DCHECK(source->cached_data);
      // AlignedCachedData takes care of pointer-aligning the data.
      auto cached_data = std::make_unique<i::AlignedCachedData>(
          source->cached_data->data, source->cached_data->length);
      maybe_function_info =
          i::Compiler::GetSharedFunctionInfoForScriptWithCachedData(
              isolate, str, script_details, cached_data.get(), options,
              no_cache_reason, i::NOT_NATIVES_CODE);
      source->cached_data->rejected = cached_data->rejected();
    }
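To make the produce/consume round trip and the `rejected` flag more concrete, here is a minimal standalone sketch. It is a toy model, not V8 code: every name prefixed with `Toy` is hypothetical, and the two checks (a version tag and a source hash) are only my assumption about the kind of sanity check a consumer performs before trusting a cache blob.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a serialized code cache blob (not V8's format).
struct ToyCachedData {
  uint64_t version_tag = 0;  // identifies the engine build that wrote the blob
  uint64_t source_hash = 0;  // hash of the source the blob was produced from
  std::string payload;       // stand-in for the serialized code
  bool rejected = false;     // set by the consumer when a check fails
};

struct ToyCompileResult {
  std::string code;
  bool from_cache = false;
};

constexpr uint64_t kToyVersionTag = 1;  // assumed tag for "this" engine build

// 64-bit FNV-1a, used here only as a simple deterministic source hash.
uint64_t ToyHash(const std::string& s) {
  uint64_t h = 14695981039346656037ULL;
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL;
  }
  return h;
}

// Pretend "compilation": the result is just a tagged copy of the source.
std::string ToyFullCompile(const std::string& source) {
  return "compiled:" + source;
}

// Producer side, loosely analogous to creating a code cache after compiling.
ToyCachedData ToyCreateCodeCache(const std::string& source) {
  ToyCachedData data;
  data.version_tag = kToyVersionTag;
  data.source_hash = ToyHash(source);
  data.payload = ToyFullCompile(source);
  return data;
}

// Consumer side: use the cache only if it matches this build and this source.
// On a mismatch the cache is marked rejected and we fall back to compiling.
ToyCompileResult ToyCompile(const std::string& source, ToyCachedData* cache) {
  ToyCompileResult result;
  if (cache != nullptr) {
    const bool ok = cache->version_tag == kToyVersionTag &&
                    cache->source_hash == ToyHash(source);
    cache->rejected = !ok;
    if (ok) {
      result.code = cache->payload;
      result.from_cache = true;
      return result;
    }
  }
  result.code = ToyFullCompile(source);
  return result;
}

// Usage: the same source hits the cache, a different source rejects it.
void ToyConsumeDemo() {
  ToyCachedData cache = ToyCreateCodeCache("a + b");
  assert(ToyCompile("a + b", &cache).from_cache && !cache.rejected);
  assert(!ToyCompile("a - b", &cache).from_cache && cache.rejected);
}
```

The point of the sketch is the shape of the flow: the caller always gets compiled code back, and the `rejected` flag is how it learns that the blob it supplied was not used and should probably be regenerated.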

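Code cache classes

V8's compilation-cache.h contains the classes that implement the in-memory cache: a base class, CompilationSubCache, and the different caches derived from it, including a script cache, an eval cache, and a regexp cache. My guess is that the script cache is effectively the function cache, while the eval cache puzzled me at first. Judging from the class comment, it exists because the same source string compiles to different code as a script and as an eval. Regular expressions presumably get their own special handling. Each cache ultimately maps to a few tables, which manage the creation, destruction, and lookup of cache entries.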

// The compilation cache consists of several generational sub-caches which uses
// this class as a base class. A sub-cache contains a compilation cache tables
// for each generation of the sub-cache. Since the same source code string has
// different compiled code for scripts and evals, we use separate sub-caches
// for different compilation modes, to avoid retrieving the wrong result.
class CompilationSubCache {
 public:
  CompilationSubCache(Isolate* isolate, int generations)
      : isolate_(isolate), generations_(generations) {
    DCHECK_LE(generations, kMaxGenerations);
  }

  static constexpr int kFirstGeneration = 0;
  static constexpr int kMaxGenerations = 2;

  // Get the compilation cache tables for a specific generation.
  Handle<CompilationCacheTable> GetTable(int generation);

  // Accessors for first generation.
  Handle<CompilationCacheTable> GetFirstTable() {
    return GetTable(kFirstGeneration);
  }
  void SetFirstTable(Handle<CompilationCacheTable> value) {
    DCHECK_LT(kFirstGeneration, generations_);
    tables_[kFirstGeneration] = *value;
  }
  ......
}
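The "generational" idea in the class comment can be sketched with a small standalone model. This is not V8's CompilationSubCache: `ToyGenerationalCache` and its methods are hypothetical names, and the policy shown (insert into the first generation, promote on a hit in an older generation, shift tables on aging) is my reading of how such a cache typically behaves rather than a statement of what V8 does in every version.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical toy model of a generational cache; not V8 code.
class ToyGenerationalCache {
 public:
  explicit ToyGenerationalCache(std::size_t generations)
      : tables_(generations) {
    assert(generations >= 1);
  }

  // New entries always go into the first (youngest) generation.
  void Put(const std::string& key, const std::string& value) {
    tables_[0][key] = value;
  }

  // Search from the youngest to the oldest generation. A hit in an older
  // generation is copied back into the first one, so recently used entries
  // survive the next Age() call. Returns false on a miss.
  bool Lookup(const std::string& key, std::string* out) {
    for (std::size_t gen = 0; gen < tables_.size(); ++gen) {
      auto it = tables_[gen].find(key);
      if (it == tables_[gen].end()) continue;
      *out = it->second;
      if (gen != 0) tables_[0][key] = *out;
      return true;
    }
    return false;
  }

  // Shift every table one generation older. The oldest table is dropped and
  // the first generation starts out empty again.
  void Age() {
    for (std::size_t gen = tables_.size() - 1; gen > 0; --gen) {
      tables_[gen] = std::move(tables_[gen - 1]);
    }
    tables_[0].clear();
  }

 private:
  std::vector<std::unordered_map<std::string, std::string>> tables_;
};
```

With two generations, an entry that is never looked up disappears after two calls to `Age()`, while an entry that is looked up between aging steps keeps getting promoted and stays alive.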

Ultimately this maps to the table:

Handle<CompilationCacheTable> CompilationCacheTable::PutScript(
    Handle<CompilationCacheTable> cache, Handle<String> src,
    LanguageMode language_mode, Handle<SharedFunctionInfo> value,
    Isolate* isolate) {
  src = String::Flatten(isolate, src);
  StringSharedKey key(src, language_mode);
  Handle<Object> k = key.AsHandle(isolate);
  cache = EnsureCapacity(isolate, cache);
  InternalIndex entry = cache->FindInsertionEntry(isolate, key.Hash());
  cache->set(EntryToIndex(entry), *k);
  cache->set(EntryToIndex(entry) + 1, *value);
  cache->ElementAdded();
  return cache;
}
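PutScript builds its key from both the source string and the language mode. A minimal sketch of why that matters is below. It is a deliberately simplified model: `ToyScriptTable` and `ToyLanguageMode` are hypothetical names, and a `std::map` stands in for V8's hash table, which stores the key and the value in adjacent slots.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the language mode used in the key.
enum class ToyLanguageMode { kSloppy, kStrict };

// Toy table keyed on (source, language mode); not V8's CompilationCacheTable.
class ToyScriptTable {
 public:
  void PutScript(const std::string& src, ToyLanguageMode mode,
                 const std::string& compiled) {
    entries_[std::make_pair(src, mode)] = compiled;
  }

  // Returns a pointer to the cached value, or nullptr on a miss.
  const std::string* LookupScript(const std::string& src,
                                  ToyLanguageMode mode) const {
    auto it = entries_.find(std::make_pair(src, mode));
    return it == entries_.end() ? nullptr : &it->second;
  }

 private:
  std::map<std::pair<std::string, ToyLanguageMode>, std::string> entries_;
};

// Usage: identical source text in a different language mode is a miss.
void ToyScriptTableDemo() {
  ToyScriptTable table;
  table.PutScript("x = 1", ToyLanguageMode::kSloppy, "sloppy-code");
  assert(table.LookupScript("x = 1", ToyLanguageMode::kSloppy) != nullptr);
  assert(table.LookupScript("x = 1", ToyLanguageMode::kStrict) == nullptr);
}
```

Because the language mode is part of the key, code compiled in sloppy mode can never be handed back for a strict-mode request with the same text, which is the "avoid retrieving the wrong result" concern from the class comment applied at the table level.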

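According to https://v8.dev/blog/improved-code-caching, V8 already has a code cache, and the persistable code cache is stored with the URL as its key. That said, following the callers of PutScript in the code above upward, one level at a time, shows that it is indeed used on the compile path as well.

We should still be able to do code caching wherever we need it, and the most suitable place is where compilation happens. That means we have to understand the compilation process.
There are three optimization tiers: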

case CodeKind::INTERPRETED_FUNCTION:
  name = "interpreter";
  break;
case CodeKind::BASELINE:
  name = "baseline";
  break;
case CodeKind::TURBOFAN:
  name = "optimize";
  break;

There are also several compilation tags:

case CodeEventListener::EVAL_TAG:
  name += "-eval";
  break;
case CodeEventListener::SCRIPT_TAG:
  break;
case CodeEventListener::LAZY_COMPILE_TAG:
  name += "-lazy";
  break;
case CodeEventListener::FUNCTION_TAG:
  break;

CompilationJob


Author: 蒋璋
Reprint policy: Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source 蒋璋.