Releasing 0.9
The 0.9.2
release is now availble and ready for use. The 0.9.x
releases focused mainly on:
- Improving the user experience on the LittleHorse Dashboard
- Improving the reliability of the LH Server in the face of rebalances and failures.
New Features
While the majority of the improvements in the 0.9
release revolve around performance and stability, several of them are highly visible to the user (especially the new dashboard!).
Dashboard Rewrite
With help from Nelson Jumbo, LittleHorse Knight Mijaíl Rondón rewrote and revamped our administrative dashboard. It now inclues new features such as:
- User Task Detail page
- Improved details on
TaskRun
progress - Improved details on
WfRun
progress - A plethora of small bug fixes.
Internal Task Queue Optimizations
Deep in the internals of the LittleHorse Server, we implement a Task Queue mechanism to store ScheduledTask
s before they're dispatched to the Task Worker clients. This release included many improvements to stability of the Task Queues.
Most importantly, our Grumpy Maintainer (Eduwer Camacaro) put a cap on the memory consumption of a single TaskDef
. Prior to this release, it was possible for poorly-behaved clients to cause an OOM on the server by running millions of workflows which use a TaskDef
but not executing the resulting TaskRun
s. This would cause an un-bounded buildup of ScheduledTask
s in memory until the server crashed.
After the 0.9
release, any more than 1,000 ScheduledTask
s for a certain TaskDef
are not loaded into memory but left on disk.
Principal Deletion
The 0.9
release includes the ability to delete a Principal
. The rpc DeletePrincipal
is smart enough to ensure that there is always at least one Admin Principal
to prevent a user from locking themselves out of the cluster.
PollThread
in Java Task Worker
We refactored the internal implementation of the Java Task Worker so that, for each LH Server in the cluster, the Task Worker creates a single PollThread
object which is responsible for polling and executing TaskRun
s. The PollThread
s now poll in parallel, drastically increasing the throughput of a single Java Task Worker.
The PollThread
was introduced in #796.
What's Next
Our wire protocol (the GRPC API) is quite stable; there have been no major breaking changes since we introduced the alpha version of Multi-Tenancy in 0.7
. We are diligently proceeding through soak tests, load tests, and chaos tests with our server and we have found and addressed several issues.
We continue to look foward to the 1.0
release, and we will reach that milestone once:
- We are satisfied with results of load tests and soak tests.
- We have had language experts review each of our three main SDK's (Java, Go, Python) and we have addressed any change requests.
- We approach a year without any breaking changes to our wire protocol.