Data Overload Vol. 10

Welcome to Data Overload Vol. 10, the monthly newsletter with company updates from Orchest!

Migration of cloud users approaching 🚧

Since we migrated the Orchest backend to Kubernetes, we have been shipping new features to the open source version, while our Orchest Cloud users have been stuck on the pre-migration version. Fear no more! We will tackle the migration in the coming weeks so that both versions converge again. Stay tuned!

Meeting in real life at PyCon Lithuania 🇱🇹

Our Data Scientist Advocate Juan Luis gave a talk called "Beyond pandas: The great Python dataframe showdown" in the PyData track of PyCon Lithuania, and it was well received! You can import the materials into your Orchest account, and if you cannot wait for the recording, have a look at the latest blog post in our series: lightning-fast queries with Polars. See you at PyData London in June!

Kickstarting our #beta-testers program 👷

Now that we are ramping up our design work, we have opened a #beta-testers program for some of our most active community users so that they can give early feedback on the UI prototypes we are working on. If you want to be part of it, join the conversation in our Slack!

Thank you!

We keep hearing from happy Orchest users who are delighted with the product and send us their suggestions to make it even better. Thanks to every one of you!

- Juan Luis Cano (Data Scientist Advocate)

Product updates

๐Ÿณ Simplified installation procedure

Installing Orchest is a pip install away. Try it out!

๐ŸŒ Support for JavaScript

You can now write your pipeline steps in JavaScript. The limit is your imagination!

๐Ÿ–ฑ๏ธ Context menu in pipeline editor

You can now run a step, create a new one, and more with a right click.

What we're reading

  • Parallel Grouped Aggregation in DuckDB (blog post)

    Why we like it: Blog posts about data structures are quite enjoyable, and this writeup by DuckDB is no exception: by carefully mixing hash tables with some smart parallelization tricks, they were able to squeeze more performance out of aggregation operations. They even go as far as providing the source code for the performance benchmarks they conducted, which is always appreciated.
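
    To give a flavor of what "hash tables plus parallelization" can look like, here is a minimal Python sketch of a common two-phase scheme: each worker first aggregates its own chunk of rows into a private hash table, and the tables are then merged one key partition at a time. This is our own simplified illustration, not DuckDB's actual implementation; the function names (`local_aggregate`, `merge_partition`, `parallel_group_sum`) are hypothetical, and because of Python's GIL the threads only show the structure of the approach rather than a real speedup.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def local_aggregate(chunk):
    """Phase 1: build a private hash table for one chunk of (key, value) rows."""
    table = defaultdict(int)
    for key, value in chunk:
        table[key] += value
    return table


def merge_partition(local_tables, partition, n_partitions):
    """Phase 2: combine the keys that hash to one partition across all local tables."""
    merged = defaultdict(int)
    for table in local_tables:
        for key, value in table.items():
            if hash(key) % n_partitions == partition:
                merged[key] += value
    return merged


def parallel_group_sum(rows, n_workers=4):
    """Return {key: sum of values} using the two-phase scheme (illustrative only)."""
    # Split the input into one chunk per worker.
    chunks = [rows[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # No locking is needed in phase 1: every worker owns its own table.
        local_tables = list(pool.map(local_aggregate, chunks))
        # Each partition is merged independently of the others.
        partitions = list(
            pool.map(
                lambda p: merge_partition(local_tables, p, n_workers),
                range(n_workers),
            )
        )
    result = {}
    for part in partitions:
        result.update(part)  # partitions hold disjoint keys, so nothing is overwritten
    return result


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
print(parallel_group_sum(rows))
```

    The point of partitioning by key hash in the second phase is that no two workers ever touch the same key, so the merge can also run without locks.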

Orchest is an open-source project that simplifies the development and deployment of data pipelines. Get started for free or download the open-source version!