Jun 09, 2022

Celebrate Apache Arrow Language Communities at The Data Thread

The Apache Arrow project aims to build bridges across the data analytics ecosystem by providing a language-independent columnar data format plus a multi-language toolbox of software libraries. These libraries implement the Arrow columnar data format, provide efficient methods for converting data between the Arrow format and other data formats, and include a variety of useful tools for working natively with data in the Arrow format.

The Arrow project has attracted developers and users from a wide range of popular language communities. Today there are Arrow libraries available for twelve languages: C, C++, C#, Go, Java, JavaScript, Julia, MATLAB, Python, R, Ruby, and Rust. As more capabilities are added to these libraries, it becomes ever easier for developers and users to build applications with Arrow in the languages of their choice. The underlying Arrow columnar data format ensures cross-language interoperability, enabling companies and teams to work in multiple languages without fear of silos or dead ends.

At The Data Thread on June 23, we are excited to be joined by speakers who have made major contributions to many of these language implementations.

C++

Weston Pace and Vibhatha Abeykoon have made numerous contributions to the C++ implementation of Arrow. They will be co-speaking about the query execution engine inside the Arrow C++ library, which was recently named “Acero.”



Java

David Li has made major contributions to multiple Arrow language implementations, including the Java implementation. David is also a member of the Apache Arrow Project Management Committee (PMC) and one of the core contributors to Arrow Flight. He will be co-speaking with James Duong about Arrow Flight SQL, an Arrow-native client-server protocol that promises to accelerate database access.



JavaScript

Dominik Moritz is one of the primary developers of the JavaScript implementation of Arrow. Dominik will be speaking about the advantages of using Arrow to work natively with columnar data in web applications and showcasing some cool capabilities that the JavaScript implementation enables in the web browser.



Python

Alenka Frim and Joris Van den Bossche are frequent contributors to PyArrow, the Python implementation of Arrow. Alenka will be speaking about how new contributors can get involved in Arrow, highlighting recent work she has done on the New Contributor’s Guide. Joris, who is an Arrow PMC member, will be speaking about how Arrow helps to accelerate geospatial computing in Python and discussing the recent work to extend GeoPandas to support reading and writing of the Apache Parquet and Arrow IPC (Feather) file formats.



R

Nic Crane and Dewey Dunnington are major contributors to the Arrow R package. Nic will be speaking about different ways to get involved in the Arrow project and sharing tips from the Arrow maintainers for new Arrow contributors. Dewey will be co-speaking with Joris Van den Bossche (see above) about geospatial computing with Arrow, highlighting recent work on geoarrow.



Ruby

Sutou Kouhei is a member of the Arrow PMC and a prolific contributor to Arrow. His contributions span many of the Arrow language implementations. He will be speaking at The Data Thread about the importance of Arrow in the Ruby community.



Rust

Andrew Lamb and Jorge Leitão are core contributors to the Rust implementation of Arrow and to the DataFusion query execution framework, which is implemented in Rust. Jorge has also led the development of an alternative implementation of Arrow in Rust. Both are members of the Arrow PMC. Andrew will speak about how Arrow, and the Rust implementation in particular, is making it possible to develop faster, better, and cheaper new databases and analytic systems. Jorge will speak about the efficient use of threads in Arrow implementations.

Register Today

If you haven’t already started exploring the speakers list, take a look at the lineup and register for the event. New participants and sessions continue to be added.