Using WebAssembly for Extension Development

May 8, 2024 by Dirk Bäumer

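Visual Studio Code supports running WASM binaries through the WebAssembly Execution Engine extension. The primary use case is to compile programs written in C/C++ or Rust into WebAssembly and then run them directly in VS Code. A notable example is Visual Studio Code for Education, which uses this support to run the Python interpreter in VS Code for the Web. This blog post provides detailed insights into how this is implemented.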

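In January 2024, the Bytecode Alliance released the WASI 0.2 preview. A key technology in the WASI 0.2 preview is the Component Model. The WebAssembly Component Model streamlines interactions between WebAssembly components and their host environments by standardizing interfaces, data types, and module composition. This standardization is achieved through WIT (WASM Interface Type) files. WIT files describe the interactions between a JavaScript/TypeScript extension (the host) and a WebAssembly component that performs computations written in another language, such as Rust or C/C++.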

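This blog post outlines how developers can use the component model to integrate WebAssembly libraries into their extensions. We focus on three use cases: (a) implementing a library in WebAssembly and calling it from extension code written in JavaScript/TypeScript, (b) calling the VS Code API from WebAssembly code, and (c) using resources to encapsulate and manage stateful objects in WebAssembly or TypeScript code.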

The examples require the latest versions of the following tools, in addition to VS Code and NodeJS: the Rust compiler toolchain, wasm-tools, and wit-bindgen.

I would also like to thank L. Pereira and Luke Wagner from Fastly for their valuable feedback on this article.

A calculator in Rust

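In the first example, we demonstrate how a developer can integrate a library written in Rust into a VS Code extension. As mentioned earlier, components are described using a WIT file. In our example, the library performs simple operations: addition, subtraction, multiplication, and division. The corresponding WIT file looks like this: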

package vscode:example;

interface types {
	record operands {
		left: u32,
		right: u32
	}

	variant operation {
		add(operands),
		sub(operands),
		mul(operands),
		div(operands)
	}
}
world calculator {
	use types.{ operation };

	export calc: func(o: operation) -> u32;
}

The Rust tool wit-bindgen is used to generate the Rust bindings for the calculator. There are two ways to use this tool:

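As a procedural macro that generates the bindings directly in the implementation file. This is the standard approach, but it has the drawback that the generated binding code cannot be inspected.
As a command-line tool that writes a bindings file to disk. This approach is used in the code in the VS Code extension samples repository for the resources example below.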

The corresponding Rust file, which uses wit-bindgen as a procedural macro, looks like this:

// Use a procedural macro to generate bindings for the world we specified in
// `calculator.wit`
wit_bindgen::generate!({
	// the name of the world in the `*.wit` input file
	world: "calculator",
});

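However, compiling the Rust file to WebAssembly with the command cargo build --target wasm32-unknown-unknown fails with compile errors because the exported calc function has no implementation yet. A simple implementation of calc looks like this: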

// Use a procedural macro to generate bindings for the world we specified in
// `calculator.wit`
wit_bindgen::generate!({
	// the name of the world in the `*.wit` input file
	world: "calculator",
});

struct Calculator;

impl Guest for Calculator {

	fn calc(op: Operation) -> u32 {
		match op {
			Operation::Add(operands) => operands.left + operands.right,
			Operation::Sub(operands) => operands.left - operands.right,
			Operation::Mul(operands) => operands.left * operands.right,
			Operation::Div(operands) => operands.left / operands.right,
		}
	}
}

// Export the Calculator to the extension code.
export!(Calculator);

The export!(Calculator); statement at the end of the file exports Calculator from the WebAssembly code so that the extension can call the API.
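One caveat about the simple implementation: on u32, dividing by zero panics, and subtraction can underflow (a panic in debug builds), which surfaces as a trap in the WebAssembly instance. The following is a minimal sketch of a more defensive variant, not code from the sample. The Operands and Operation types are hand-written stand-ins for the generated bindings so that the snippet compiles on its own, and checked_calc is a hypothetical helper name.

```rust
// Stand-ins for the types wit-bindgen would generate from calculator.wit.
// They are declared by hand here so the sketch compiles on its own.
#[derive(Debug, Clone, Copy)]
pub struct Operands {
    pub left: u32,
    pub right: u32,
}

#[derive(Debug, Clone, Copy)]
pub enum Operation {
    Add(Operands),
    Sub(Operands),
    Mul(Operands),
    Div(Operands),
}

/// Hypothetical helper: like `calc`, but returns `None` instead of panicking
/// on overflow, underflow, or division by zero.
pub fn checked_calc(op: Operation) -> Option<u32> {
    match op {
        Operation::Add(o) => o.left.checked_add(o.right),
        Operation::Sub(o) => o.left.checked_sub(o.right),
        Operation::Mul(o) => o.left.checked_mul(o.right),
        Operation::Div(o) => o.left.checked_div(o.right),
    }
}
```

Exposing such a function through the component would also require a different WIT signature, for example returning option<u32> instead of u32.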

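The wit2ts tool generates the TypeScript bindings needed to interact with the WebAssembly code from within a VS Code extension. The VS Code team developed this tool to meet the specific requirements of the VS Code extension architecture, primarily because: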

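The VS Code API is only accessible within the extension host worker. Any additional workers spawned from the extension host worker cannot access the VS Code API. This contrasts with environments such as NodeJS or the browser, where each worker typically has access to almost all runtime APIs.
Multiple extensions share the same extension host worker. Extensions should avoid performing long-running synchronous computations on that worker.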

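These architectural requirements were already in place when we implemented WASI Preview 1 for VS Code. However, our initial implementation was written by hand. Anticipating wider adoption of the component model, we developed a tool that facilitates the integration of components with their VS Code-specific host implementation.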

The command wit2ts --outDir ./src ./wit generates a calculator.ts file in the src folder that contains the TypeScript bindings for the WebAssembly code. A simple extension that uses these bindings looks like this:

import * as vscode from 'vscode';
import { WasmContext, Memory } from '@vscode/wasm-component-model';

// Import the code generated by wit2ts
import { calculator, Types } from './calculator';

export async function activate(context: vscode.ExtensionContext): Promise<void> {
  // The channel for printing the result.
  const channel = vscode.window.createOutputChannel('Calculator');
  context.subscriptions.push(channel);

  // Load the Wasm module
  const filename = vscode.Uri.joinPath(
    context.extensionUri,
    'target',
    'wasm32-unknown-unknown',
    'debug',
    'calculator.wasm'
  );
  const bits = await vscode.workspace.fs.readFile(filename);
  const module = await WebAssembly.compile(bits);

  // The context for the WASM module
  const wasmContext: WasmContext.Default = new WasmContext.Default();

  // Instantiate the module
  const instance = await WebAssembly.instantiate(module, {});
  // Bind the WASM memory to the context
  wasmContext.initialize(new Memory.Default(instance.exports));

  // Bind the TypeScript Api
  const api = calculator._.exports.bind(
    instance.exports as calculator._.Exports,
    wasmContext
  );

  context.subscriptions.push(
    vscode.commands.registerCommand('vscode-samples.wasm-component-model.run', () => {
      channel.show();
      channel.appendLine('Running calculator example');
      const add = Types.Operation.Add({ left: 1, right: 2 });
      channel.appendLine(`Add ${api.calc(add)}`);
      const sub = Types.Operation.Sub({ left: 10, right: 8 });
      channel.appendLine(`Sub ${api.calc(sub)}`);
      const mul = Types.Operation.Mul({ left: 3, right: 7 });
      channel.appendLine(`Mul ${api.calc(mul)}`);
      const div = Types.Operation.Div({ left: 10, right: 2 });
      channel.appendLine(`Div ${api.calc(div)}`);
    })
  );
}

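When you compile and run the code above in VS Code for the Web, it prints Running calculator example to the Calculator channel, followed by the four results: Add 3, Sub 2, Mul 21, and Div 5.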

You can find the full source code for this example in the VS Code extension samples repository.

Inside @vscode/wasm-component-model

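Inspecting the source code generated by the wit2ts tool reveals a dependency on the @vscode/wasm-component-model npm module. This module is the VS Code implementation of the component model's canonical ABI and draws inspiration from the corresponding Python code. Understanding the component model's internals is not necessary to follow this blog post, but we will shed some light on how it works, particularly on how data is passed between JavaScript/TypeScript and WebAssembly code.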

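Other tools, such as wit-bindgen or jco, generate bindings for WIT files directly. In contrast, wit2ts creates a meta model that can then be used to generate bindings at runtime for various use cases. This flexibility lets us meet the architectural requirements of extension development in VS Code. With this approach, we can "promisify" the bindings and run WebAssembly code in workers. We use this mechanism to implement the WASI 0.2 preview for VS Code.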

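You may have noticed that, in the generated bindings, functions are referenced using names such as calculator._.imports.create (note the underscore). To avoid name collisions with symbols from the WIT file (for example, there could be a type definition named imports), the API functions are placed in a _ namespace. The meta model itself resides in a $ namespace. So calculator.$.exports.calc represents the metadata for the exported calc function.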

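In the example above, the add operation passed to the calc function consists of three fields: the opcode, the left value, and the right value. According to the component model's canonical ABI, arguments are passed by value. The ABI also specifies how the data is serialized, passed to the WebAssembly function, and deserialized on the other side. This process results in two operation objects: one on the JavaScript heap and one in linear WebAssembly memory. The following diagram illustrates this: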

Diagram illustrating how parameters are passed.
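To make the pass-by-value idea concrete, here is a rough sketch that copies an operation into a byte buffer standing in for linear memory and rebuilds an independent copy from it. It is an illustration under stated assumptions only: the 12-byte layout, the names lower and lift, and the hand-written types are chosen for this sketch, and the real canonical ABI has its own layout and flattening rules that this code does not reproduce.

```rust
// Hand-written stand-ins for the calculator types; not generated bindings.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Operands {
    pub left: u32,
    pub right: u32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Operation {
    Add(Operands),
    Sub(Operands),
    Mul(Operands),
    Div(Operands),
}

/// Illustrative encoding: 1 byte case index, 3 padding bytes,
/// then `left` and `right` as little-endian u32 values.
pub const ENCODED_SIZE: usize = 12;

/// Copies `op` into `memory` at `offset` (panics if the slot is out of bounds).
pub fn lower(op: Operation, memory: &mut [u8], offset: usize) {
    let (case, operands) = match op {
        Operation::Add(o) => (0u8, o),
        Operation::Sub(o) => (1u8, o),
        Operation::Mul(o) => (2u8, o),
        Operation::Div(o) => (3u8, o),
    };
    let slot = &mut memory[offset..offset + ENCODED_SIZE];
    slot[0] = case;
    slot[1..4].fill(0);
    slot[4..8].copy_from_slice(&operands.left.to_le_bytes());
    slot[8..12].copy_from_slice(&operands.right.to_le_bytes());
}

/// Rebuilds an independent `Operation` from the bytes at `offset`.
/// Returns `None` for an out-of-bounds slot or an unknown case index.
pub fn lift(memory: &[u8], offset: usize) -> Option<Operation> {
    let slot = memory.get(offset..offset + ENCODED_SIZE)?;
    let left = u32::from_le_bytes([slot[4], slot[5], slot[6], slot[7]]);
    let right = u32::from_le_bytes([slot[8], slot[9], slot[10], slot[11]]);
    let operands = Operands { left, right };
    match slot[0] {
        0 => Some(Operation::Add(operands)),
        1 => Some(Operation::Sub(operands)),
        2 => Some(Operation::Mul(operands)),
        3 => Some(Operation::Div(operands)),
        _ => None,
    }
}
```

After a lower followed by a lift, two separate values exist, one on each side of the boundary, and changing one does not affect the other.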

The table below lists the available WIT types, their mapping to JavaScript objects in the VS Code component model implementation, and the corresponding TypeScript types.

WIT	JavaScript	TypeScript
u8	number	type u8 = number;
u16	number	type u16 = number;
u32	number	type u32 = number;
u64	bigint	type u64 = bigint;
s8	number	type s8 = number;
s16	number	type s16 = number;
s32	number	type s32 = number;
s64	bigint	type s64 = bigint;
float32	number	type float32 = number;
float64	number	type float64 = number;
bool	boolean	boolean
string	string	string
char	string[0]	string
record	object literal	type declaration
list<T>	[]	Array<T>
tuple<T1, T2>	[]	[T1, T2]
enum	string values	string enum
flags	number	bigint
variant	object literal	discriminated union
option<T>	variable	? and (T | undefined)
result<ok, err>	Exception or object literal	Exception or result type

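Note that the component model does not support low-level (C-style) pointers. As a result, you cannot pass object graphs or recursive data structures. In this respect it has the same limitations as JSON. To minimize data copying, the component model introduces the concept of resources, which we explore in more detail later in this blog post.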

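The jco project also supports generating JavaScript/TypeScript bindings for WebAssembly components using its type command. As mentioned earlier, we developed our own tool to meet the specific needs of VS Code. However, we meet with the jco team every two weeks to keep the tools aligned where possible. A fundamental requirement is that both tools use the same JavaScript and TypeScript representations for WIT data types. We are also exploring the possibility of sharing code between the two tools.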

Calling TypeScript from WebAssembly code

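A WIT file describes the interaction between the host (a VS Code extension) and the WebAssembly code, and it supports communication in both directions. In our example, this lets the WebAssembly code log a trace of its activity. To enable this, we modify the WIT file as follows: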

world calculator {

	/// ....

	/// A log function implemented on the host side.
	import log: func(msg: string);

	/// ...
}

On the Rust side, we can now call the log function:

fn calc(op: Operation) -> u32 {
	log(&format!("Starting calculation: {:?}", op));
	let result = match op {
		// ...
	};
	log(&format!("Finished calculation: {:?}", op));
	result
}

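On the TypeScript side, the only action required from the extension developer is to provide an implementation of the log function. The VS Code component model then generates the necessary bindings, which are passed as imports to the WebAssembly instance.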

export async function activate(context: vscode.ExtensionContext): Promise<void> {
  // ...

  // The channel for printing the log.
  const log = vscode.window.createOutputChannel('Calculator - Log', { log: true });
  context.subscriptions.push(log);

  // The implementation of the log function that is called from WASM
  const service: calculator.Imports = {
    log: (msg: string) => {
      log.info(msg);
    }
  };

  // Create the bindings to import the log function into the WASM module
  const imports = calculator._.imports.create(service, wasmContext);
  // Instantiate the module
  const instance = await WebAssembly.instantiate(module, imports);

  // ...
}

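Compared to the first example, the WebAssembly.instantiate call now receives the result of calculator._.imports.create(service, wasmContext) as its second argument. This imports.create call generates the low-level WASM bindings from the service implementation. In the initial example, we passed an empty object literal because no imports were needed. This time, we run the extension under the debugger in VS Code desktop. Thanks to the great work of Connor Peet, it is now possible to set breakpoints in the Rust code and step through it with the VS Code debugger.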

Using component model resources

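The WebAssembly component model introduces the concept of resources, which provide a standardized mechanism for encapsulating and managing state. The state is managed on one side of the call boundary (for example, in TypeScript code) and accessed and manipulated on the other side (for example, in WebAssembly code). Resources are used extensively in the WASI preview 0.2 APIs, with file descriptors being a typical example. In that setup, the state is managed by the extension host and accessed and manipulated by the WebAssembly code.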

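Resources can also work in the opposite direction, with the state managed by the WebAssembly code and accessed and manipulated by the extension code. This approach is particularly useful for VS Code when implementing stateful services in WebAssembly that are then accessed from the TypeScript side. In the example below, we define a resource that implements a calculator supporting reverse Polish notation, similar to the one used in Hewlett-Packard handheld calculators.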

// wit/calculator.wit
package vscode:example;

interface types {

	enum operation {
		add,
		sub,
		mul,
		div
	}

	resource engine {
		constructor();
		push-operand: func(operand: u32);
		push-operation: func(operation: operation);
		execute: func() -> u32;
	}
}
world calculator {
	export types;
}

Here is a simple implementation of the calculator resource in Rust:

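// The state held by the engine resource: at most two pending operands.
struct EngineImpl {
	left: Option<u32>,
	right: Option<u32>,
}

impl EngineImpl {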
	fn new() -> Self {
		EngineImpl {
			left: None,
			right: None,
		}
	}

	fn push_operand(&mut self, operand: u32) {
		if self.left.is_none() {
			self.left = Some(operand);
		} else {
			self.right = Some(operand);
		}
	}

	fn push_operation(&mut self, operation: Operation) {
		let left = self.left.unwrap();
		let right = self.right.unwrap();
        self.left = Some(match operation {
			Operation::Add => left + right,
			Operation::Sub => left - right,
			Operation::Mul => left * right,
			Operation::Div => left / right,
		});
	}

	fn execute(&mut self) -> u32 {
		self.left.unwrap()
	}
}
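The implementation above keeps only two operand slots, which is enough for the example but not for expressions that need more than two pending operands, such as 2 3 4 * +. As a sketch of an alternative, assuming one wanted behavior closer to a real RPN calculator, the engine state could be a stack. StackEngine and its Option and bool return values are hypothetical choices for this illustration and do not match the WIT signatures shown earlier.

```rust
// Hand-written stand-in for the `operation` enum from the WIT file.
#[derive(Debug, Clone, Copy)]
pub enum Operation {
    Add,
    Sub,
    Mul,
    Div,
}

/// Hypothetical stack-based engine; the operand stack is the resource state.
#[derive(Debug, Default)]
pub struct StackEngine {
    stack: Vec<u32>,
}

impl StackEngine {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn push_operand(&mut self, operand: u32) {
        self.stack.push(operand);
    }

    /// Applies `operation` to the two topmost operands and replaces them
    /// with the result. Returns `false` and leaves the stack unchanged if
    /// fewer than two operands are available or the arithmetic fails.
    pub fn push_operation(&mut self, operation: Operation) -> bool {
        if self.stack.len() < 2 {
            return false;
        }
        let right = self.stack[self.stack.len() - 1];
        let left = self.stack[self.stack.len() - 2];
        let result = match operation {
            Operation::Add => left.checked_add(right),
            Operation::Sub => left.checked_sub(right),
            Operation::Mul => left.checked_mul(right),
            Operation::Div => left.checked_div(right),
        };
        match result {
            Some(value) => {
                self.stack.truncate(self.stack.len() - 2);
                self.stack.push(value);
                true
            }
            None => false,
        }
    }

    /// Returns the value on top of the stack, if any.
    pub fn execute(&self) -> Option<u32> {
        self.stack.last().copied()
    }
}
```

Returning a flag instead of calling unwrap keeps a malformed operation sequence from trapping the WebAssembly instance, at the cost of a slightly richer interface.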

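In the TypeScript code, we bind the exports in the same way as before. The only difference is that the binding process now gives us a proxy class that we use to instantiate and manage the calculator resource in the WebAssembly code.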

// Bind the JavaScript Api
const api = calculator._.exports.bind(
  instance.exports as calculator._.Exports,
  wasmContext
);

context.subscriptions.push(
  vscode.commands.registerCommand('vscode-samples.wasm-component-model.run', () => {
    channel.show();
    channel.appendLine('Running calculator example');

    // Create a new calculator engine
    const calculator = new api.types.Engine();

    // Push some operands and operations
    calculator.pushOperand(10);
    calculator.pushOperand(20);
    calculator.pushOperation(Types.Operation.add);
    calculator.pushOperand(2);
    calculator.pushOperation(Types.Operation.mul);

    // Calculate the result
    const result = calculator.execute();
    channel.appendLine(`Result: ${result}`);
  })
);

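When you run the corresponding command, it prints Result: 60 to the output channel. As mentioned earlier, the state of a resource lives on one side of the call boundary and is accessed from the other side through handles. No data is copied apart from the arguments passed to the methods that interact with the resource.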

Diagram illustrating how resources are accessed.
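The handle idea can be sketched with a small table that owns the state and hands out integer handles, so that only the handle and the method arguments cross the boundary. This is a simplified illustration, not the data structure used by the component model or by @vscode/wasm-component-model; HandleTable and its method names are made up for this sketch.

```rust
use std::collections::HashMap;

/// Hypothetical handle table: the owning side keeps the values,
/// the other side only ever sees the `u32` handles.
pub struct HandleTable<T> {
    next: u32,
    entries: HashMap<u32, T>,
}

impl<T> HandleTable<T> {
    pub fn new() -> Self {
        HandleTable {
            next: 1,
            entries: HashMap::new(),
        }
    }

    /// Stores a value and returns the handle that identifies it.
    pub fn insert(&mut self, value: T) -> u32 {
        let handle = self.next;
        self.next += 1;
        self.entries.insert(handle, value);
        handle
    }

    /// Looks up the state behind a handle for a method call.
    pub fn get_mut(&mut self, handle: u32) -> Option<&mut T> {
        self.entries.get_mut(&handle)
    }

    /// Releases the resource; the handle becomes invalid afterwards.
    pub fn remove(&mut self, handle: u32) -> Option<T> {
        self.entries.remove(&handle)
    }
}
```

A method call on a resource then amounts to a lookup by handle followed by an ordinary call on the stored value, which is why the state itself never has to be copied.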

The full source code for this example is available in the VS Code extension samples repository.

Using the VS Code API directly from Rust

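Component model resources can be used to encapsulate and manage state across WebAssembly components and the host. This capability lets us use resources to expose the VS Code API canonically to WebAssembly code. The advantage of this approach is that an entire extension can be written in a language that compiles to WebAssembly. We have started exploring this approach, and below is the source code of an extension written in Rust: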

use std::rc::Rc;

#[export_name = "activate"]
pub fn activate() -> vscode::Disposables {
	let mut disposables: vscode::Disposables = vscode::Disposables::new();

	// Create an output channel.
	let channel: Rc<vscode::OutputChannel> = Rc::new(vscode::window::create_output_channel("Rust Extension", Some("plaintext")));

	// Register a command handler
	let channel_clone = channel.clone();
	disposables.push(vscode::commands::register_command("testbed-component-model-vscode.run", move || {
		channel_clone.append_line("Open documents");

		// Print the URI of all open documents
		for document in vscode::workspace::text_documents() {
			channel.append_line(&format!("Document: {}", document.uri()));
		}
	}));
	return disposables;
}

#[export_name = "deactivate"]
pub fn deactivate() {
}

Note that this code closely resembles an extension written in TypeScript.

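Although this exploration looks promising, we have decided not to pursue it for now. The main reason is the lack of async support in WASM. Many VS Code APIs are asynchronous, which makes them difficult to proxy directly into WebAssembly code. We could run the WebAssembly code in a separate worker and use the same synchronization mechanism between the WebAssembly worker and the extension host worker that we use in the WASI Preview 1 support. However, this approach can lead to unexpected behavior for synchronous API calls, because those calls would actually execute asynchronously. As a result, observable state might change between two synchronous calls (for example, setX(5); getX(); might not return 5).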
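The setX/getX hazard can be illustrated with a small, purely hypothetical simulation. HostState, Call, and other_host_activity are invented names that model proxied calls being serviced one at a time on the extension host, with unrelated host work running in between. No real VS Code API is involved.

```rust
/// Hypothetical host-side state that a WebAssembly worker reaches
/// through proxied calls.
#[derive(Debug, Default)]
pub struct HostState {
    x: i32,
}

/// The calls that look synchronous from the worker's point of view.
pub enum Call {
    SetX(i32),
    GetX,
}

impl HostState {
    /// Services one proxied call on the extension host side.
    pub fn handle(&mut self, call: Call) -> Option<i32> {
        match call {
            Call::SetX(value) => {
                self.x = value;
                None
            }
            Call::GetX => Some(self.x),
        }
    }

    /// Stands in for unrelated host work (for example another extension)
    /// that gets to run between two proxied calls.
    pub fn other_host_activity(&mut self) {
        self.x = 7;
    }
}
```

If other_host_activity runs between handle(Call::SetX(5)) and handle(Call::GetX), the worker reads back 7 even though it just wrote 5. That is the kind of surprise the paragraph above describes.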

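In addition, work is underway to introduce full async support to WASI in the 0.3 preview timeframe. Luke Wagner gave an update on the current state of async support at WASM I/O 2024. We decided to wait for this support, since it will enable a more complete and cleaner implementation.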

If you are interested in the corresponding WIT files, Rust code, and TypeScript code, you can find them in the rust-api folder of the vscode-wasm repository.

What comes next

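We are currently preparing a follow-up blog post that will cover more areas where WebAssembly code can be used for extension development. The major topics will include: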

Writing language servers in WebAssembly.
Using the generated meta model to transparently offload long-running WebAssembly code into a separate worker.

With an idiomatic implementation of the component model for VS Code in place, we continue our efforts to implement the WASI 0.2 preview for VS Code.

Thanks,

Dirk and the VS Code team

Happy Coding!