Mar 02, 2023

Microsoft’s Magpie Uses Ibis to Optimize Computation

Keith Britt

Magpie bird flies to top of building

Learn how Microsoft created Magpie using Ibis to optimize computation and ease migration to the cloud.

A team at Microsoft developed a system called Magpie, “which exposes the popular pandas API while lazily pushing large chunks of computation into scalable, efficient, and secured database engines.” What caught our attention while reading their paper was how they used Ibis in Magpie’s compiler operations as a batching and backend optimization tool. They were also able to extend Ibis to support many different cloud backends. The result is a tool that allows “existing data science solutions to be migrated onto cloud engines without rewriting.” The paper is worth a read in its own right: the Microsoft team does a great job of laying out the current “data science jungle” and a vision, incorporating Ibis and Apache Arrow, of what the future could be.
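To make the idea of lazily pushing computation into a database concrete, here is a minimal sketch in plain Python. It is not Magpie's or Ibis's actual API: `LazyTable`, its `filter`, `select`, and `to_sql` methods, and the `trips` table are all hypothetical names invented for illustration. The sketch only shows the general pattern of recording dataframe-style operations and compiling them into one SQL query for a backend to run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyTable:
    """Hypothetical lazy table: records operations instead of running them."""

    name: str
    filters: tuple = ()
    columns: tuple = ()

    def filter(self, condition: str) -> "LazyTable":
        # Nothing is executed here; the condition is only remembered.
        return LazyTable(self.name, self.filters + (condition,), self.columns)

    def select(self, *columns: str) -> "LazyTable":
        # Likewise, the projection is recorded for later compilation.
        return LazyTable(self.name, self.filters, tuple(columns))

    def to_sql(self) -> str:
        # Compile every recorded step into a single query, so the
        # database engine does the work in one batch.
        cols = ", ".join(self.columns) if self.columns else "*"
        sql = f"SELECT {cols} FROM {self.name}"
        if self.filters:
            sql += " WHERE " + " AND ".join(f"({f})" for f in self.filters)
        return sql


trips = LazyTable("trips")  # assumed table name, for illustration only
expr = trips.filter("distance > 10").select("vendor", "fare")
print(expr.to_sql())
```

Because each call returns a new description of the query rather than data, the whole chain can be handed to a database engine at once, which is roughly the role the paper describes Ibis playing inside Magpie.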

Check out their excellent work, which was presented at the 11th Annual Conference on Innovative Data Systems Research (CIDR ’21) on January 10-13, 2021 in Chaminade, USA, here: https://wentaowu.github.io/papers/cidr21-magpie.pdf

Congratulations to Jindal et al. on the paper.

Photo by: Darius K