Releasing 0.8

· 4 min read
The LittleHorse Council
The Council of LittleHorse Maintainers

The 0.8 release of LittleHorse is out! This pre-1.0 release contains many new features, security enhancements, and performance improvements.

New Features

The new features in this release cover edge cases in workflow development that surfaced during initial pilots and internal usage of the platform.

Dynamic Task Execution

Before this release, a TaskNode had a hard-coded reference to a TaskDef. This meant that every WfRun reaching the same Node in a WfSpec ended up executing the same TaskDef.

However, in LittleHorse Enterprises LLC's upcoming Control Plane project (a SaaS system for dynamically provisioning LittleHorse clusters), we anticipate a special use case, which we will blog about this upcoming fall, wherein we need to choose at runtime which TaskDef is executed.

Specifically, depending on an input variable to a WfRun (in this case, the data-plane-id variable), we need to execute a different TaskDef so that the TaskRun is executed by a specific Task Worker in a specific location.
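To make the idea concrete, here is a minimal sketch of resolving a TaskDef name from a WfRun variable. It is plain Python rather than the LittleHorse SDK: the helper `resolve_task_def_name`, the `provision-cluster-{data-plane-id}` naming scheme, and the region values are all hypothetical and only illustrate the concept.

```python
# Hypothetical sketch: pick a TaskDef name at runtime from a WfRun variable.
# Nothing here is the LittleHorse API; names and values are illustrative only.


def resolve_task_def_name(template: str, variables: dict[str, str]) -> str:
    """Fill a TaskDef name template with values from the WfRun's variables."""
    name = template
    for key, value in variables.items():
        name = name.replace("{" + key + "}", value)
    # A leftover brace means a placeholder had no matching variable.
    if "{" in name or "}" in name:
        raise ValueError(f"unresolved placeholder in TaskDef name: {name}")
    return name


# Two WfRuns reach the same Node but end up executing different TaskDefs.
template = "provision-cluster-{data-plane-id}"
print(resolve_task_def_name(template, {"data-plane-id": "us-west-2"}))
print(resolve_task_def_name(template, {"data-plane-id": "eu-central-1"}))
```

Because each resolved name maps to its own TaskDef, only the Task Workers polling for that TaskDef (for example, those deployed in the matching location) would pick up the TaskRun.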

Per-Thread Failure Handlers

Since the 0.1.0 release of LittleHorse it has been possible to put a FailureHandler on any Node, such that if the NodeRun fails, then a Failure Handler thread is

Content in EXCEPTIONs

  • #714

Multi-Tenancy Improvements

Multi-Tenancy has been quietly under development in the LittleHorse Server since last October, when the 0.6.0 release introduced a breaking change to allow for it. The 0.8 release continues the progress towards making Multi-Tenancy generally available.

This release includes two new major features for Multi-Tenancy:

  1. Allowing Python and Go clients to set the tenant-id header using LHC_TENANT_ID (#704)
  2. Allowing administrative Principals to hold admin privileges over multiple Tenants (#679)

Multi-Tenancy and support for authentication and fine-grained ACLs via Principals have been a labor of love implemented by Eduwer Camacaro, who has grown into the role of Grumpy Maintainer of LittleHorse.

Kafka Security Protocol Support

Prior to release 0.8, the LH Server could only access a Kafka cluster with either:

  • Plaintext access with no security.
  • TLS with no authentication.
  • mTLS security.

PR #716 introduced the following Server configurations:

  • LHS_KAFKA_SECURITY_PROTOCOL
  • LHS_KAFKA_SASL_MECHANISM
  • LHS_KAFKA_SASL_JAAS_CONFIG

This allows for access to any Kafka cluster except those requiring loading custom implementations of callbacks on the client side (for example, using the Strimzi OAuth Plug-in).

It is now possible to run LH with Kafka as:

  • No security (PLAINTEXT)
  • TLS on the brokers, no authentication (SSL)
  • mTLS on the brokers (SSL with TRUSTSTORE set)
  • SASL with any JAAS config (SASL_SSL)
  • Confluent Cloud.
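As an illustration, the three configurations from PR #716 might be populated for a SASL_SSL cluster that uses the PLAIN mechanism (the combination Confluent Cloud uses with API keys) as sketched below. This assumes the three settings are passed through to the standard Kafka client properties `security.protocol`, `sasl.mechanism`, and `sasl.jaas.config`; the helper name `kafka_sasl_env` and the credentials are placeholders.

```python
# Sketch under assumptions: builds values for the three Server configs from
# PR #716. The helper and the credentials are placeholders, not LH tooling.


def kafka_sasl_env(username: str, password: str) -> dict[str, str]:
    """Return Server settings for a SASL_SSL cluster using the PLAIN mechanism."""
    # Standard Kafka JAAS line for the PLAIN login module.
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="{username}" password="{password}";'
    )
    return {
        "LHS_KAFKA_SECURITY_PROTOCOL": "SASL_SSL",
        "LHS_KAFKA_SASL_MECHANISM": "PLAIN",
        "LHS_KAFKA_SASL_JAAS_CONFIG": jaas,
    }


# Print the settings as KEY=VALUE pairs.
for key, value in kafka_sasl_env("my-api-key", "my-api-secret").items():
    print(f"{key}={value}")
```

In practice the JAAS value contains a secret, so it would normally be injected from a secret store rather than written into a file.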

LittleHorse Canary

The LittleHorse Canary was released in early access. Inspired by the Strimzi Canary for Apache Kafka, the LittleHorse Canary is a system that runs workflows on LittleHorse and reports on the health of the cluster(s) that it is monitoring.

The LH Canary system comprises two components:

  1. The Metronome, which runs workflows and sends metric beats to a Kafka topic.
  2. The Aggregator, which consumes the metric beats from the Kafka topic and aggregates them into metrics that are exposed to Prometheus and through a gRPC API.
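The aggregation step can be pictured with a small, self-contained sketch. It is not the Canary's implementation: the beat fields (`cluster`, `metric`, `latency_ms`) and the function `aggregate_beats` are hypothetical, and a plain list stands in for the Kafka topic.

```python
# Hypothetical sketch of the aggregation idea; field names are assumptions
# and a plain list stands in for the Kafka topic of metric beats.
from collections import defaultdict
from statistics import mean


def aggregate_beats(beats):
    """Group latency samples by (cluster, metric) and summarize each group."""
    samples = defaultdict(list)
    for beat in beats:
        samples[(beat["cluster"], beat["metric"])].append(beat["latency_ms"])
    return {
        key: {"count": len(values), "avg_ms": mean(values), "max_ms": max(values)}
        for key, values in samples.items()
    }


beats = [
    {"cluster": "lh-1", "metric": "run_wf", "latency_ms": 10},
    {"cluster": "lh-1", "metric": "run_wf", "latency_ms": 30},
    {"cluster": "lh-2", "metric": "run_wf", "latency_ms": 50},
]
for key, summary in aggregate_beats(beats).items():
    print(key, summary)
```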

The goal of the Canary is to monitor, profile, and benchmark LittleHorse Clusters from the same exact perspective as the clients who use them.

The Canary is the brainchild of Saúl Piña, who is also the author of the popular Kaskade TUI for Apache Kafka.

Exponential Backoff Retry Policy

PR #707 introduced the ability to configure exponential backoff for TaskRun retries. Previously, only immediate retries were supported.
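For readers unfamiliar with the technique, the sketch below shows how an exponential backoff delay is typically computed. The parameter names (`base_interval_ms`, `multiplier`, `max_delay_ms`) and their defaults are assumptions chosen for illustration and may not match the options exposed by LittleHorse.

```python
# Generic exponential backoff sketch; parameter names and defaults are
# assumptions for illustration, not necessarily the LittleHorse options.


def backoff_delay_ms(
    attempt: int,
    base_interval_ms: int = 1000,
    multiplier: float = 2.0,
    max_delay_ms: int = 60_000,
) -> int:
    """Delay before retry number `attempt`, where 1 is the first retry."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # The delay grows geometrically with each retry and is capped at max_delay_ms.
    delay = base_interval_ms * multiplier ** (attempt - 1)
    return int(min(delay, max_delay_ms))


print([backoff_delay_ms(n) for n in range(1, 8)])
```

With these defaults the delay doubles on every retry until it reaches the cap, which gives a struggling downstream system progressively more time to recover than immediate retries would.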

JavaScript Client

We published the first version of littlehorse-client on NPM. This client contains the LHConfig in JavaScript, which provides access to our LittleHorse gRPC API. Note that we do not yet support a JavaScript Task Worker or a JavaScript WfSpec SDK.

Bugfixes

In this release, we fixed several bugs:

  • Fixes improper reporting of EXCEPTIONs and ERRORs by the Task Worker when throwing LHTaskException (#738)
  • Fixes task queue rehydration (#727)
  • Fixes the Retention Policy for ExternalEventDefs (#724)
  • Fixes deadlock in Java task worker (#723)
  • Fixes concurrency bug with the AsyncWaiter in the server (#719)
  • Fixes various issues from soak tests (#706)
  • Fixes to NodeRun lifecycle (#665)

Looking Forward

We continue to stabilize our API and add features that cover edge cases. Load testing, chaos testing, and soak testing are ongoing, and we are working with the Apache Kafka Community on a few bugfixes in the Kafka Streams library, which is heavily used in the core of LittleHorse.

Once those action items are resolved, we will make a 1.0 release candidate. In the meantime, we don't expect any massively breaking API changes at the protocol level. However, certain syntactical changes may occur in our SDKs (especially Go and Python).