Showing search results from arrow.apache.org.
https://arrow.apache.org/blog/2021/12/03/arrow-duckdb/

DuckDB quacks Arrow: A zero-copy data integration between Apache Arrow and DuckDB | Apache Arrow

TLDR: The zero-copy integration between DuckDB and Apache Arrow allows for rapid analysis of larger than memory datasets in Python and R using either SQL or relational APIs.

Js ⡲⢠⣢⣿⣿⣿⡌⠀
Terms appear in 37 positions

https://arrow.apache.org/docs/10.0/r/reference/to_duckdb.html

Create a (virtual) DuckDB table from an Arrow object — to_duckdb • Arrow R Package

This will do the necessary configuration to create a (virtual) table in DuckDB that is backed by the Arrow object given. No data is copied or modified until or are called or a query is run against the table.

Js ⠬⢾⠀⠀⠀⠀⠀⠀
Terms appear in 9 positions

https://arrow.apache.org/docs/6.0/r/reference/to_duckdb.html

Create a (virtual) DuckDB table from an Arrow object — to_duckdb • Arrow R Package

This will do the necessary configuration to create a (virtual) table in DuckDB that is backed by the Arrow object given. No data is copied or modified until or are called or a query is run against the table.

Js Tr Ad ⠬⠾⠀⠀⠀⠀⠀⠀
Terms appear in 8 positions

https://arrow.apache.org/blog/2021/11/08/r-6.0.0/

Apache Arrow R 6.0.0 Release | Apache Arrow

We are excited to announce the recent release of version 6.0.0 of the Arrow R package on While we usually don’t write a dedicated release blog post for the R package, this one is special. There are a number of major new features in this version, some of w

Js ⠀⠀⠀⢸⠎⠀⠀⠀
Terms appear in 7 positions

https://arrow.apache.org/blog/2025/03/10/fast-streaming-inserts-in-duckdb-with-adbc/

Fast Streaming Inserts in DuckDB with ADBC | Apache Arrow

Published 10 Mar 2025 By DuckDB is rapidly becoming an essential part of data practitioners’ toolbox, finding use cases in data engineering, machine learning, and local analytics. In many cases DuckDB has been used to query and process data that has alrea

Js ⣌⠂⣘⠘⡱⠇⠀⠀
Terms appear in 18 positions

https://arrow.apache.org/docs/dev/r/reference/index.html

Package index • Arrow R Package

Functions for converting R objects to Arrow data containers and combining Arrow data containers.

Js ⠀⠀⡀⠀⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/blog/2023/08/05/datafusion_fast_grouping/

Aggregating Millions of Groups Fast in Apache Arrow DataFusion 28.0.0 | Apache Arrow

Grouped aggregations are a core part of any analytic tool, creating understandable summaries of huge data volumes s parallel aggregation capability is 2-3x faster in the for queries with a large number (10,000 or more) of groups.

Js ⠀⢐⠀⠁⠀⢰⠀⠀
Terms appear in 6 positions

https://arrow.apache.org/docs/r/news/index.html

Changelog • Arrow R Package

CRAN release: 2022-10-26 Several new functions can be used in queries: The package now has documentation that lists all dplyr methods and R function mappings that are supported on Arrow data, along with notes about any differences in functionality between

Js ⠂⠀⠠⠀⠄⠀⠐⠀
Terms appear in 4 positions

https://arrow.apache.org/docs/9.0/r/news/index.html

Changelog • Arrow R Package

There are now two ways to query Arrow data both grouped and ungrouped, is now implemented for Arrow Datasets, Tables, and RecordBatches. Because data is scanned in chunks, you can aggregate over larger-than-memory datasets backed by many files. Supported

Js ⠀⠀⠠⠀⠀⠘⠀⠀
Terms appear in 3 positions

https://arrow.apache.org/docs/11.0/r/news/index.html

Changelog • Arrow R Package

CRAN release: 2022-10-26 Several new functions can be used in queries: The package now has documentation that lists all dplyr methods and R function mappings that are supported on Arrow data, along with notes about any differences in functionality between

Js ⠀⠀⠀⠀⡀⠀⠘⠀
Terms appear in 3 positions

https://arrow.apache.org/docs/7.0/r/news/index.html

Changelog • Arrow R Package

There are now two ways to query Arrow data both grouped and ungrouped, is now implemented for Arrow Datasets, Tables, and RecordBatches. Because data is scanned in chunks, you can aggregate over larger-than-memory datasets backed by many files. Supported

Js ⠀⠀⠀⢠⠁⠀⠀⠀
Terms appear in 3 positions

https://arrow.apache.org/adbc/current/driver/status.html

Driver Implementation Status - ADBC 18 documentation

Warning: There is a known problem on macOS x86_64 when using two drivers written in Go in the same process (unless working in a pure-Go application), where using the second driver may crash. For more details, see

Js ⠀⠰⠀⠀⠀⠀⠀⠀
Terms appear in 2 positions

https://arrow.apache.org/docs/6.0/r/news/index.html

Changelog • Arrow R Package

There are now two ways to query Arrow data both grouped and ungrouped, is now implemented for Arrow Datasets, Tables, and RecordBatches. Because data is scanned in chunks, you can aggregate over larger-than-memory datasets backed by many files. Supported

Js ⠀⠀⡃⠀⠀⠀⠀⠀
Terms appear in 3 positions

https://arrow.apache.org/blog/2023/08/24/13.0.0-release/

Apache Arrow 13.0.0 Release | Apache Arrow

Published 24 Aug 2023 By The Apache Arrow team is pleased to announce the 13.0.0 release. This covers over 3 months of development work and includes from See the to learn how to get the libraries for your platform. The release notes below are not exhausti

Js ⠀⠀⠀⠀⠀⠂⠀⠀
Terms appear in 1 position

https://arrow.apache.org/blog/2025/01/10/arrow-result-transfer/

How the Apache Arrow Format Accelerates Query Result Transfer | Apache Arrow

Published 10 Jan 2025 By Ian Cook, David Li, Matt Topol Translations This is the first in a series of posts that aims to demystify the use of Arrow as a data interchange format for databases and query engines. Posts in this series Why is this taking so lo

Js ⠀⠀⠀⠁⠀⠐⠀⠀
Terms appear in 2 positions

https://arrow.apache.org/docs/format/StatisticsSchema.html

Statistics schema — Apache Arrow v19.0.1

Statistics are useful for fast query processing. Many query engines use statistics to optimize their query plan. Apache Arrow format doesn’t have statistics but other formats that can be read as Apache Arrow data may have statistics. For example, the Apac

Js ⠀⠘⠀⠀⠀⠀⠀⠀
Terms appear in 2 positions

https://arrow.apache.org/release/8.0.0.html

Apache Arrow 8.0.0 Release | Apache Arrow

This is a major release covering more than 3 months of development. This release includes 636 commits from 127 distinct contributors. git shortlog -sn apache-arrow-7.0.0..apache-arrow-8.0.0 43 Antoine Pitrou 40 David Li 39 Sutou Kouhei 36 Alenka Frim 29 W

Js ⠀⠀⠀⠀⠐⠀⡀⠀
Terms appear in 2 positions

https://arrow.apache.org/docs/dev/format/StatisticsSchema.html

Statistics schema — Apache Arrow v21.0.0.dev44

Statistics are useful for fast query processing. Many query engines use statistics to optimize their query plan. Apache Arrow format doesn’t have statistics but other formats that can be read as Apache Arrow data may have statistics. For example, the Apac

Js ⠀⠘⠀⠀⠀⠀⠀⠀
Terms appear in 2 positions

https://arrow.apache.org/adbc/current/python/api/adbc_driver_manager.html

adbc_driver_manager - ADBC 16 documentation

String key-value options to pass to the underlying database. Must include at least “driver” to identify the underlying database driver to load.

Js ⠀⠀⠀⠀⠀⠀⠀⡀
Terms appear in 1 position

https://arrow.apache.org/release/14.0.0.html

Apache Arrow 14.0.0 Release | Apache Arrow

This is a major release covering more than 2 months of development. This release includes 612 commits from 116 distinct contributors. git shortlog -sn apache-arrow-13.0.0..apache-arrow-14.0.0 69 Sutou Kouhei 59 dependabot[bot] 52 sgilmore10 34 Nic Crane 28

Js ⠀⠀⠀⠀⠀⠡⠀⠀
Terms appear in 2 positions

https://arrow.apache.org/release/16.0.0.html

Apache Arrow 16.0.0 Release | Apache Arrow

This is a major release covering more than 1 month of development. This release includes 587 commits from 119 distinct contributors. git shortlog -sn apache-arrow-15.0.2..apache-arrow-16.0.0 79 dependabot[bot] 70 Sutou Kouhei 41 Antoine Pitrou 31 Joris Va

Js ⠀⠀⠀⠀⠠⠈⠀⠀
Terms appear in 2 positions

https://arrow.apache.org/release/13.0.0.html

Apache Arrow 13.0.0 Release | Apache Arrow

This is a major release covering more than 2 months of development. This release includes 608 commits from 108 distinct contributors. git shortlog -sn apache-arrow-12.0.1..apache-arrow-13.0.0 83 Sutou Kouhei 47 Raúl Cumplido 35 Nic Crane 26 Joris Van den

Js ⠀⠀⠀⠀⡀⠀⠀⠂
Terms appear in 2 positions

https://arrow.apache.org/adbc/0.5.1/python/api/adbc_driver_manager.html

adbc_driver_manager - ADBC 0.5.1 documentation

The root module provides a fairly direct, 1:1 mapping to the C API definitions in Python. For a higher-level interface, use This requires PyArrow.)

Js ⠀⠀⠀⠀⠀⠀⠈⠀
Terms appear in 1 position

https://arrow.apache.org/blog/2023/11/01/14.0.0-release/

Apache Arrow 14.0.0 Release | Apache Arrow

The Apache Arrow team is pleased to announce the 14.0.0 release. This covers over 3 months of development work and includes from See the to learn how to get the libraries for your platform.

Js ⠀⠄⠀⠀⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/blog/

Project News and Blog | Apache Arrow

6 May 2025 The Apache Arrow team is pleased to announce the version 18 release of the Apache Arrow ADBC libraries. This release includes 28 resolved issues from 22 distinct contributors. This is a release of the libraries, which are at version 18. The API

Js ⠐⡈⠀⠀⠀⠀⠀⠀
Terms appear in 3 positions

https://arrow.apache.org/docs/6.0/r/reference/index.html

Function reference • Arrow R Package

Open a multi-file dataset Write a dataset Create a DatasetFactory Construct Hive partitioning Multi-file datasets Define Partitioning for a Dataset Arrow expressions Scan the contents of a dataset Dataset file formats Format-specific write options Format-

Js ⠀⠀⠄⠀⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/blog/2023/03/07/nanoarrow-0.1.0-release/

Apache Arrow nanoarrow 0.1 Release | Apache Arrow

The Apache Arrow team is pleased to announce the 0.1.0 release of Apache Arrow nanoarrow. This initial release covers 31 resolved issues from 6 contributors.

Js ⠀⡀⠀⠀⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/docs/11.0/r/reference/index.html

Function reference • Arrow R Package

Open a multi-file dataset Open a multi-file dataset of CSV or other delimiter-separated format Write a dataset Create a DatasetFactory Construct Hive partitioning Multi-file datasets Define Partitioning for a Dataset Arrow expressions Scan the contents of

Js ⠀⠀⠐⠀⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/docs/dev/cpp/acero/overview.html

Acero Overview — Apache Arrow v19.0.0.dev78

Choose version This page gives an overview of the basic Acero concepts and helps distinguish Acero from other modules in the Arrow code base. It’s intended for users, developers, potential contributors, and for those that would like to extend Acero, eithe

Js ⠀⠀⠀⠄⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/blog/2024/01/10/15.0.0-release/

Apache Arrow 15.0.0 Release | Apache Arrow

Published 10 Jan 2024 By The Apache Arrow team is pleased to announce the 15.0.0 release. This covers over 3 months of development work and includes on from See the to learn how to get the libraries for your platform. The release notes below are not exhau

Js ⠀⠀⠀⠀⠂⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/blog/2023/01/05/introducing-arrow-adbc/

Introducing ADBC: Database Access for Apache Arrow | Apache Arrow

The Arrow community would like to introduce version 1.0.0 of the specification. ADBC is a columnar, minimal-overhead alternative to JDBC/ODBC for analytical applications. Or in other words: ADBC is a single API for getting Arrow data in and out of differe

Js ⠀⢀⠀⠀⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/blog/2023/06/24/datafusion-25.0.0/

Apache Arrow DataFusion 26.0.0 | Apache Arrow

It has been a whirlwind 6 months of DataFusion development since the community has grown, many features have been added, performance improved and we are branching out to our own top level Apache Project.

Js ⠀⠀⠄⠀⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/blog/2024/01/19/datafusion-34.0.0/

Apache Arrow DataFusion 34.0.0 Released, Looking Forward to 2024 | Apache Arrow

We recently This blog highlights some of the major improvements since we (spoiler alert there are many) and a preview of where the community plans to focus in the next 6 months.

Js ⠀⠀⠀⠄⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/powered_by/

Powered by | Apache Arrow

Organizations creating products and projects for use with Apache Arrow, along with associated marketing materials, should take care to respect the trademark in “Apache Arrow” and its logo. Please refer to and associated for comprehensive and authoritative

Js ⠀⠀⠀⢀⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/release/7.0.0.html

Apache Arrow 7.0.0 Release | Apache Arrow

This is a major release covering more than 3 months of development. This release includes 650 commits from 105 distinct contributors. git shortlog -sn apache-arrow-6.0.0..apache-arrow-7.0.0 78 Antoine Pitrou 49 Sutou Kouhei 44 Krisztián Szűcs 39 David Li

Js ⠀⠀⠀⠄⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/release/6.0.0.html

Apache Arrow 6.0.0 Release | Apache Arrow

This is a major release covering more than 3 months of development. This release includes 592 commits from 88 distinct contributors 58 David Li 56 Antoine Pitrou 46 Neal Richardson 42 Sutou Kouhei 38 Jonathan Keane 34 Krisztián Szűcs 27 Matthew Topo

Js ⠀⠀⠀⠈⠘⠀⠀⠀
Terms appear in 3 positions

https://arrow.apache.org/blog/2021/11/04/6.0.0-release/

Apache Arrow 6.0.0 Release | Apache Arrow

Published 04 Nov 2021 By The Apache Arrow team is pleased to announce the 6.0.0 release. This covers over 3 months of development work and includes from See the Install Page to learn how to get the libraries for your platform. The release notes below are

Js ⠀⠀⠀⠀⠈⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/release/12.0.0.html

Apache Arrow 12.0.0 Release | Apache Arrow

This is a major release covering more than 3 months of development. This release includes 531 commits from 97 distinct contributors. git shortlog -sn apache-arrow-11.0.0..apache-arrow-12.0.0 62 Sutou Kouhei 44 Weston Pace 26 Gang Wu 26 Matt Topol 23 Nic C

Js ⠀⠀⠀⠀⢀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/blog/2021/07/29/5.0.0-release/

Apache Arrow 5.0.0 Release | Apache Arrow

Published 29 Jul 2021 By The Apache Arrow team is pleased to announce the 5.0.0 release. This covers 3 months of development work and includes 684 commits from in 2 repositories. See the Install Page to learn how to get the libraries for your platform. Th

Js ⠀⠀⠀⠀⠂⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/adbc/current/driver/installation.html

Installation - ADBC 15 documentation

See individual driver pages in the sidebar for specific installation instructions.

Js ⠀⠀⠠⠀⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/docs/dev/r/articles/data_wrangling.html

Data analysis with dplyr syntax • Arrow R Package

The arrow package provides functionality allowing users to manipulate tabular Arrow data (Table and Dataset objects) with familiar syntax. To enable this functionality, ensure that the arrow and dplyr packages are both loaded. In this article we will take

Js ⠀⠀⠀⠀⠀⡀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecond-latency/

Querying Parquet with Millisecond Latency | Apache Arrow

Published 26 Dec 2022 By tustvold and alamb Note: this article was originally published on the We believe that querying data in files directly can achieve similar or better storage efficiency and query performance than most specialized file formats. While

Js ⠀⠀⠂⠀⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/docs/r/articles/data_wrangling.html

Data analysis with dplyr syntax • Arrow R Package

The arrow package provides functionality allowing users to manipulate tabular Arrow data (Table and Dataset objects) with familiar syntax. To enable this functionality, ensure that the arrow and dplyr packages are both loaded. In this article we will take

Js ⠀⠀⠀⠀⠀⠀⠁⠀
Terms appear in 1 position

https://arrow.apache.org/blog/2022/11/07/multi-column-sorts-in-arrow-rust-part-2/

Fast and Memory Efficient Multi-Column Sorts in Apache Arrow Rust, Part 2 | Apache Arrow

Published 07 Nov 2022 By tustvold and alamb In of this post, we described the problem of Multi-Column Sorting and the challenges of implementing it efficiently. This second post explains how the new in the of works and is constructed. The row format is a

Js ⠀⠀⠈⠀⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/docs/8.0/r/reference/index.html

Function reference • Arrow R Package

Open a multi-file dataset Write a dataset Create a DatasetFactory Construct Hive partitioning Multi-file datasets Define Partitioning for a Dataset Arrow expressions Scan the contents of a dataset Dataset file formats Format-specific write options Format-

Js Tr Ad ⠀⠀⠈⠀⠀⠀⠀⠀
Terms appear in 1 position

https://arrow.apache.org/cookbook/r/manipulating-data---tables.html

7 Manipulating Data - Tables | Apache Arrow R Cookbook

One of the aims of the Arrow project is to reduce duplication between different data frame implementations. The underlying implementation of a data frame is a conceptually different thing to the code- or the application programming interface (API)-that yo

Js ⠀⠀⠀⠀⠀⠀⠀⠘
Terms appear in 2 positions

Syntax

This is a keyword-based search engine. When entering multiple search terms, the search engine will attempt to match them against documents where the terms occur in close proximity.

Search terms can be excluded with a hyphen.

While the search engine at present does not allow full text search, quotes can be used to specifically search for names or terms in the title. Using quotes will also cause the search engine to be as literal as possible in interpreting the query.

Parentheses can be used to add terms to the query without giving weight to the terms when ranking the search results.

Samples

soup -chicken
Look for keywords that contain soup, but not chicken.
"keyboard"
Look for pages containing the exact word keyboard, not keyboards or the like.
"steve mcqueen"
Look for pages containing the exact words steve mcqueen in that order, with no words in between.
apology (plato)
Look for pages containing apology and plato, but only rank them based on their relevance to apology.

Special Keywords

Several special keywords are supported by the search engine.

Keyword                   Meaning
site:example.com          Display site information about example.com
site:example.com keyword  Search example.com for keyword
browse:example.com        Show similar websites to example.com
ip:127.0.0.1              Search documents hosted at 127.0.0.1
links:example.com         Search documents linking to example.com
tld:edu keyword           Search documents with the top level domain edu.
?tld:edu keyword          Prefer but do not require results with the top level domain edu. This syntax is also possible for links:..., ip:... and site:...
q>5                       The amount of javascript and modern features is at least 5 (on a scale 0 to 25)
q<5                       The amount of javascript and modern features is at most 5 (on a scale 0 to 25)
year>2005                 (beta) The document was ostensibly published in or after 2005
year=2005                 (beta) The document was ostensibly published in 2005
year<2005                 (beta) The document was ostensibly published in or before 2005
rank>50                   The ranking of the website is at least 50 in a span of 1 - 255
rank<50                   The ranking of the website is at most 50 in a span of 1 - 255
count>10                  The search term must appear in at least 10 results from the domain
count<10                  The search term must appear in at most 10 results from the domain
format:html5              Filter documents using the HTML5 standard. This is typically modern websites.
format:xhtml              Filter documents using the XHTML standard
format:html123            Filter documents using the HTML standards 1, 2, and 3. This is typically very old websites.
generator:wordpress       Filter documents with the specified generator, in this case wordpress
file:zip                  Filter documents containing a link to a zip file (most file-endings work)
file:audio                Filter documents containing a link to an audio file
file:video                Filter documents containing a link to a video file
file:archive              Filter documents containing a link to a compressed archive
file:document             Filter documents containing a link to a document
-special:media            Filter out documents with audio or video tags
-special:scripts          Filter out documents with javascript
-special:affiliate        Filter out documents with likely Amazon affiliate links
-special:tracking         Filter out documents with analytics or tracking code
-special:cookies          Filter out documents with cookies
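
As a rough illustration of how these pieces combine, the sketch below assembles a query string from plain terms, hyphen-excluded terms, and the site: and year> keywords listed above. The helper name build_query and its parameters are hypothetical and not part of the search engine; it only concatenates text and makes no claim about how the engine parses or ranks the result.

```python
def build_query(terms, exclude=(), site=None, year_after=None):
    """Compose a query string from the documented syntax (hypothetical helper)."""
    parts = list(terms)
    # A leading hyphen excludes a term from the results.
    parts += [f"-{word}" for word in exclude]
    # site:example.com keyword restricts the search to one site.
    if site is not None:
        parts.append(f"site:{site}")
    # year>N (beta) asks for documents published in or after N.
    if year_after is not None:
        parts.append(f"year>{year_after}")
    return " ".join(parts)


query = build_query(["arrow", "duckdb"], exclude=["spark"], site="arrow.apache.org")
print(query)  # arrow duckdb -spark site:arrow.apache.org
```

The resulting string can be pasted into the search box as-is.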

Results Legend

The estimated relevance of a search result is indicated by the color saturation of the result, as well as by the order in which the results are presented.

Information about the position of the match is indicated using a dot matrix in the bottom bar of each search result. Each dot represents four sentences, and the dots are presented top-to-bottom, left-to-right.

⣿⠃⠀⠀   — The terms occur heavily toward the beginning of the document.

⠠⠀⡄⠁   — The terms occur sparsely throughout the document.

⠀⠁⠀⠀   — The terms occur only in a single sentence.

Potential problems with the document are presented with a warning triangle, e.g. ⚠ 3. Desktop users can mouse-over this to get a detailed breakdown.

Policies

This website complies with the GDPR by not collecting any personal information, and with the EU Cookie Directive by not using cookies. More Information.

Contact

Reach me at kontakt@marginalia.nu, or @MarginaliaNu on Twitter.

Open Source

The search engine is open source with an AGPL license. The sources can be perused at https://git.marginalia.nu/.

Data Sources

IP geolocation is sourced from the IP2Location LITE data available from https://lite.ip2location.com/ under CC-BY-SA 4.0.