Skip to main content

72 posts tagged with "KCL"

View All Tags

· 5 min read

cover

Introduction

KubeVela is a modern application delivery system hosted by the CNCF Foundation. It is built on the Open Application Model (OAM) specification and aims to abstract the complexity of Kubernetes, providing a set of simple and easy-to-use command-line tools and APIs for developers to deploy and operate cloud-native applications without worrying about the underlying details.

KCL is a configuration and policy language for cloud-native scenarios, hosted by the CNCF Foundation. It aims to improve the writing of complex configurations, such as cloud-native Kubernetes configurations, using mature programming language techniques and practices. KCL focuses on building better modularity, scalability, and stability around configuration, as well as easier logic writing, automation, and integration with the toolchain.

KCL exists in a completely open cloud-native world and is not tied to any orchestration/engine tools or Kubernetes controllers. It can provide API abstraction, composition, and validation capabilities for both Kubernetes clients and runtime.

Users can choose suitable cloud-native tools such as Kubectl, Helm, Kustomize, KPT, KusionStack, KubeVela, Helmfile, Crossplane, or ArgoCD to combine with KCL and apply configurations to the cluster based on their specific scenarios.

integration

This blog is the first in a series that explores the efficient deployment and operation of cloud-native applications using KCL and KubeVela together. We will share more advanced usage in future articles, so stay tuned.

Using KCL with KubeVela

Using KCL with KubeVela has the following benefits:

  • Simpler configuration: KCL provides stronger templating capabilities, such as conditions and loops, for KubeVela OAM configurations at the client level, reducing the need for repetitive YAML writing. At the same time, the reuse of KCL model libraries and toolchains enhances the experience and management efficiency of configuration and policy writing.
  • Better maintainability: KCL provides a configuration file structure that is more conducive to version control and team collaboration, instead of relying solely on YAML. When combined with OAM application models written in KCL, application configurations become easier to maintain and iterate.
  • Simplified operations: By combining the simplicity of KCL configurations with the ease of use of KubeVela, daily operational tasks such as deploying, updating, scaling, or rolling back applications can be simplified. Developers can focus more on the applications themselves rather than the tedious details of the deployment process.
  • Improved cross-team collaboration: By using KCL's configuration chunk writing and package management capabilities in conjunction with KubeVela, clearer boundaries can be defined, allowing different teams (such as development, testing, and operations teams) to collaborate systematically. Each team can focus on tasks within their scope of responsibility, delivering, sharing, and reusing their own configurations without worrying about other aspects.

Workflow

workflow

In this example, we use the KCL Playground application (written in Go and HTML5) as an example and use KCL to define the OAM configuration that needs to be deployed. The overall workflow is as follows:

  • Application code development produces a Docker image.
  • Write OAM configurations using KCL.
  • Deploy configurations using KubeVela.
  • Verify the running status of the application.

Specific Steps

0. Prerequisites

  • Familiarize yourself with basic Unix/Linux commands.
  • Familiarize yourself with using Git.
  • Understand the basics of Kubernetes.
  • Understand KubeVela.
  • Understand the basics of KCL.

1. Configure the Kubernetes Cluster

Install K3d and create a cluster.

k3d cluster create

Note: You can use other methods to create your own Kubernetes cluster, such as kind, minikube, etc., in this scenario.

2. Install KubeVela

  • Install the KubeVela CLI.
curl -fsSl https://kubevela.net/script/install.sh | bash
  • Install KubeVela Core.
vela install

3. Write OAM Configurations

  • Install KCL.
curl -fsSL https://kcl-lang.io/script/install-cli.sh | /bin/bash
  • Create a new project and add OAM dependencies.
kcl mod init kcl-play-svc && cd kcl-play-svc && kcl mod add oam
  • Write the following code in main.k.
import oam

oam.Application {
metadata.name = "kcl-play-svc"
spec.components = [{
name = metadata.name
type = "webservice"
properties = {
image = "kcllang/kcl:v0.9.0"
ports = [{port = 80, expose = True}]
cmd = ["kcl", "play"]
}
}]
}

Note: You can see documents here: https://artifacthub.io/packages/kcl/kcl-module/oam or in the IDE extension.

oam-definition-hover

4. Deploy the application and verify.

  • Apply the configuration.
kcl run | vela up -f -
  • Port forward the service.
vela port-forward kcl-play-svc

Then we can see the KCL Playground application running successfully in the browser.

kcl-play-svc

Conclusion

Through this guide, we have learned how to deploy cloud-native applications using KubeVela and KCL. In future blogs, we will explain how to further extend the capabilities of KubeVela by using KCL on the client side such as

  • Using the inheritance, composition, and validation capabilities of KCL to extend the OAM model and define application abstractions that are better suited to your infrastructure or organization.
  • Using the modularized configuration capabilities of KCL to organize OAM multi-environment configurations with conditions, logic, loops, and modularity. For example, distribute longer App Definitions into different files to reduce boilerplate configurations.
  • Further integration with projects like KusionStack and ArgoCD to achieve better GitOps.
  • Incorporate more cloud-native capabilities or Kubernetes Operators such as KubeBlocks and Crossplane to improve database management and provide programmable access to unified cloud APIs and Kubernetes APIs.
  • And many other use cases...

· 12 min read

What is the KCL semantic model?

image.png

  • "Semantic model" refers to the in-memory representation of modules, functions, and types that appear in source code. This representation is fully "resolved": all expressions have types (note that there may be expression types that cannot be deduced in the KCL, they will be defined as any Type), and all references are bound to declarations, etc.
  • The client can submit a small amount of input data (typically changes to a single file) and get a new code model to explain the changes.
  • The underlying engine ensures that the model is Lazy (on demand) and incremental computational and can be updated quickly for small changes.

Why does KCL need a new semantic model?

First, we can take a brief look at the design of the old semantic model: image.png the old semantic model can be simply regarded as a collection of a large number of scopes, in which different scopes store the parent-child nodes between the scopes, as well as the symbol strings and corresponding types contained therein. It can simply meet the requirements of the compiler for type checking and code generation. But a simple structure involving an advanced tool chain, such as an IDE, is not sufficient. A few examples of typical IDE queries:

  • What is the type of the AST node under the corresponding position?
  • Where are all the references to the current AST node? (find reference)
  • Which node is referenced by the current AST node? (find definition)
  • All symbols accessible at the current position?

Only the old semantic model is used, which requires the IDE to traverse the AST for many times and perform repeated calculations. We can simply analyze the problems of the old semantic model, and we can find that:

  • The old semantic model was more difficult to query information, and only stored the mapping from character string to symbol
  • The association between symbols and the weak association between symbols and scopes often leads to the need to traverse all scopes when querying for relevant information
  • A large amount of intermediate information is discarded in the analysis process and is not cached, resulting in repeated operations for multiple queries.

In short, the old semantic model can not meet the query needs of the advanced tool chain, and a lot of information is missing. On the other hand, the old semantic model does not support incremental compilation, which also reduces the user experience of the tool chain.

Main Idea: Map Reduce

The idea of the Map Reduce architecture is to split the analysis into a relatively simple indexing phase and a separate full analysis phase.

The core constraint of indexing is that it runs on a per-file basis, with the indexer taking the text of a single file, parsing it, and spitting out some data about that file. The indexer cannot touch other files. Full analysis can read other files and use the information in the index to save effort.

This sounds too abstract, so let's look at a concrete example-Java. In Java, each file begins with a package declaration. The indexer concatenates the package name with the class name to get the fully qualified name. It also collects the set of methods declared in the class, the list of superclasses and interfaces, and so on.

The data for each file is merged into an index that maps fully qualified names (FQNs) to classes. The index is inexpensive to update, and when a file modification request arrives, the contribution to that file in the index is removed, the text of the file is changed, and the indexer runs on the new text and adds the new contribution. The amount of work to be done is proportional to the number of files changed and is independent of the total number of files.

Let's see how to use the FQN index to quickly provide completion.

// File ./mypackage/Foo.java
package mypackage;

import java.util.*;

public class Foo {
public static Bar f() {
return new Bar();
}
}

// File ./mypackage/Bar.java
package mypackage;

public class Bar {
public void g() {}
}

// File ./Main.java
import mypackage.Foo;

public class Main {
public static void main(String[] args) {
Foo.f().
}
}

The user just entered Foo.f(). We need to clarify that the receiver expression type is Bar and suggest g as completion.

Firstly, when the file Main.java is modified, we run the indexer on this individual file without any changes (the file still contains the class Main with static main methods), so we do not need to update the FQN index.

Next, we need to resolve the name Foo. We parsed the file Main.java and noticed the import mypackage.Foo and search for mypackage.Foo in the FQN index. In the index, we found that Foo has a static method f, so we successfully resolved the call to f(). The index also stores the return type of f, but please note that the index stores the string Bar instead of a direct reference to the class Bar.

The reason for doing this is that the import java.util.* in Foo.java can cause the Bar to be inferred as either java.util.Bar' or mypackage.Bar, the indexer doesn't know which one it is because it can only "see" the text of the file Foo.java. In other words, although the index does store the return types of methods, it stores them in an unresolved form.

The next step is to parse the identifier Bar in the context of Foo.java. This will continue to use FQN indexing and navigate to class mypackage.Bar. So in the end, we found the method g that we wanted to complete.

During the completion process, only three files were touched upon in total. The FQN index allows us to completely ignore all other files in the project.

One problem with the methods described so far is that parsing types from indexes requires a lot of work. For example, if Foo.f' is called multiple times, this task may be repeated. The solution is to add a cache. The name resolution result will be remembered, so only one resolution is required. Any changes will cause the cache to completely invalid - the cost of rebuilding the cache using indexes is not that high.

To summarize, the working principle of the first method is as follows:

  1. Each file is independently and parallelly indexed, generating a "stub" - a set of visible top-level declarations with unresolved types.
  2. Merge all stubs into one indexed data structure.
  3. Name parsing and type inference are mainly based on stubs.
  4. Name resolution is lazy (we only parse types from stubs when needed) and memory based (each type is parsed only once).
  5. The cache will be completely invalidated every time a change is made
  6. The index is incrementally updated:
    • If the editor has not changed the stub of the file, there is no need to change the index.
    • Otherwise, the old index will be deleted and the new index will be added again

This method is simple enough and has excellent performance. Most of the work is mainly in the indexing phase, and we can execute these tasks in parallel. Two examples of this architecture are IntelliJ And Sorbet.

The main drawback of this method is that it is only effective when it is effective - specifically, not every language has a clearly defined concept of FQN. But overall, designing modules and name parsing is always good for languages, and specifically, in the current situation, KCL just meets this condition.

New Semantic Model Pipeline

The overall pipeline of the new semantic model is as follows: image.png

image.png

image.png Compared with the analysis process of the original semantic model, the new semantic model adds two rounds of pass, namer and advanced _ resolve, so as to enhance the support for the advanced tool chain without affecting the original compiler process.

  • The resolver is based on file level work, mainly involving the initialization of the GlobalState, parsing the source code into AST, and establishing the mapping of AST nodes to types for later stages to use. Therefore, we can cache the output of a single file index to completely skip parsing the file when its content has not changed.
  • The early stage of namer will also be based on file level work, which collects global symbols defined in the file, and then merges the symbols based on FQN to obtain a unique GlobalState
    • Based on file level, it means we can easily perform incremental compilation in the first two stages
  • advanced_resolver will traverse the AST to resolve local symbols and point symbol references to their definitions, while setting the owner symbol for the local scope, such as Schema and Package

Semantic Database: GlobalState

The core structure of the new semantic model is core::GlobalState that the tool chain mainly uses it to complete the interaction and query with the compiler.

/// GlobalState is used to store semantic information of KCL source code
#[derive(Default, Debug, Clone)]
pub struct GlobalState {
// store all allocated symbols
symbols: KCLSymbolData,
// store all allocated scopes
scopes: ScopeData,
// store package infomation for name mapping
packages: PackageDB,
// store semantic information after analysis
pub(crate) sema_db: SemanticDB,
}

GlobalState is used as a semantic database for the new semantic model and is the final product of semantic analysis, mainly containing four aspects of information:

  • SymbolData: stores all symbols in the AST and their corresponding semantic information, and maintain the reference relationship.
  • ScopeData: stores all scopes involved in the AST, while separating symbols, maintaining symbol visibility and scope nesting relationships
  • PackageDB: stores package information, such as a collection of files for the package, import information, and so on.
  • SemanticDB: stores auxiliary information to speed up queries, such as symbol sorting and position caching.

SymbolData

SymbolData is responsible for managing the allocation of symbols and storing the allocated symbols and related semantic information. Here we borrow the arena design of rust to access the relevant symbols.

#[derive(Default, Debug, Clone)]
pub struct KCLSymbolData {
pub(crate) values: Arena<ValueSymbol>,
pub(crate) packages: Arena<PackageSymbol>,
pub(crate) attributes: Arena<AttributeSymbol>,
pub(crate) schemas: Arena<SchemaSymbol>,
pub(crate) type_aliases: Arena<TypeAliasSymbol>,
pub(crate) unresolved: Arena<UnresolvedSymbol>,
pub(crate) rules: Arena<RuleSymbol>,

pub(crate) symbols_info: SymbolDB,
}

In the new semantic model, we use core::SymbolRef to represent a reference to a symbol, and also use SymbolRef to access SymbolData for the specific symbol information.

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct SymbolRef {
pub(crate) id: generational_arena::Index,
pub(crate) kind: SymbolKind,
}

Specifically, Symbols with different types will be taken out from SymbolData according to SymbolRef::kind and converted to abstract trait Symbol.

pub type KCLSymbol = dyn Symbol<SymbolData = KCLSymbolData,
SemanticInfo = KCLSymbolSemanticInfo>;
pub fn get_symbol(&self, id: SymbolRef) -> Option<&KCLSymbol> {
match id.get_kind() {
SymbolKind::Schema => self
.schemas
.get(id.get_id())
.map(|symbol| symbol as &KCLSymbol),
...
}
}
pub trait Symbol {
type SymbolData;
type SemanticInfo;
fn get_sema_info(&self) -> &Self::SemanticInfo;
fn is_global(&self) -> bool;
fn get_range(&self) -> Range;
fn get_owner(&self) -> Option<SymbolRef>;
fn get_definition(&self) -> Option<SymbolRef>;
fn get_name(&self) -> String;
fn get_id(&self) -> Option<SymbolRef>;
fn get_attribute(&self, ...) -> Option<SymbolRef>;
fn has_attribute(&self, ...) -> bool;

fn get_all_attributes(&self, ...) -> Vec<SymbolRef>;

fn simple_dump(&self) -> String;

fn full_dump(&self, data: &Self::SymbolData) -> Option<String>;
}

hrough this trait, the tool chain can easily complete the query of symbol semantic information and reference relationship.

ScopeData

The design idea of ScopeData is actually similar to SymbolData, it stores Scope with different types and using ScopeRef to access them.

#[derive(Default, Debug, Clone)]
pub struct ScopeData {
/// map pkgpath to root_scope
pub(crate) root_map: IndexMap<String, ScopeRef>,
pub(crate) locals: generational_arena::Arena<LocalSymbolScope>,
pub(crate) roots: generational_arena::Arena<RootSymbolScope>,
}
pub trait Scope {
type SymbolData;
fn get_filename(&self) -> &str;
fn get_parent(&self) -> Option<ScopeRef>;
fn get_children(&self) -> Vec<ScopeRef>;

fn contains_pos(&self, pos: &Position) -> bool;

fn get_owner(&self) -> Option<SymbolRef>;
fn look_up_def(&self, ...) -> Option<SymbolRef>;

fn get_all_defs(&self, ...) -> HashMap<String, SymbolRef>;

fn dump(&self, scope_data: &ScopeData,
symbol_data: &Self::SymbolData) -> Option<String>;
}

SemanticDB

SemanticDB is essentially the caching and integration of partial semantic information of semantic objects. Its main function is to accelerate the maintenance and querying of internal information in GlobalState.

#[derive(Debug, Default, Clone)]
pub struct SemanticDB {
pub(crate) file_sema_map: IndexMap<String, FileSemanticInfo>,
}

#[derive(Debug, Clone)]
pub struct FileSemanticInfo {
pub(crate) filename: String,
pub(crate) symbols: Vec<SymbolRef>,
pub(crate) scopes: Vec<ScopeRef>,
pub(crate) symbol_locs: IndexMap<SymbolRef, CachedLocation>,
pub(crate) local_scope_locs: IndexMap<ScopeRef, CachedRange>,
}

Summary

The new semantic model of KCL essentially only does two things, one is to sink the repeated calculation of the tool chain in the application layer to the semantic layer, and design a corresponding mechanism to simplify the information query, and the other is to re-analyze and cache the information lost in the semantic analysis process. There are several main purposes for doing so:

  • The calculation process is gathered, and the intrusion of the application layer into the semantic core of the compiler is prevented
  • The cache mechanism is improved, and the implementation of incremental compilation is simplified, so that the query speed is accelerated.
  • Improve maintainability by simplifying the development of the application layer tool chain and reducing the handling of Corner Cases by the tool chain

In practice, the above objectives are basically achieved. After migration, the code volume of LSP related functions is reduced by about 60%, and the compilation speed is increased by about 500% after incremental compilation.