Add stream loaders, revise JSON formats. #384

Open: wants to merge 6 commits into base: main

@jheer (Member) commented Dec 9, 2024

  • Breaking: Drop the synchronous fromCSV, etc. methods. Replace them with async parseCSV, etc.
  • Breaking: Drop the custom Arquero JSON format with embedded schema. It provides little, if any, additional value (the schema does not include type information), and I don't have any evidence that people use it.
  • Add support to load data from input readable streams.
  • Add full streaming readers for CSV, fixed width, and NDJSON formats.
  • Add support for gzip and deflate decompression of input streams.
  • Add parseArrow, parseCSV, parseFixed, and parseJSON methods for async parsing. These methods replace the prior fromCSV, etc. methods, and can take both input streams and pre-loaded data (text or binary, in the case of Arrow).
  • Expand parseJSON and toJSON to support a JSON type property, for control over row-oriented, column-oriented, and newline-delimited (NDJSON) formats.
  • Update build, including package.json use of the browser property, for more targeted node/web separation.
  • Update test cases to reflect changes.
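
To illustrate what the streaming NDJSON reader and the row-oriented versus column-oriented JSON layouts involve, here is a minimal, self-contained sketch. It does not use Arquero and is not the implementation in this PR: `createNDJSONParser` and `rowsToColumns` are hypothetical helper names, and the sketch assumes the input arrives as text chunks that may split a line at any point and that every row has the same keys.

```javascript
// Hypothetical helper (not part of Arquero): parse NDJSON incrementally
// from text chunks, as a streaming reader might receive them.
function createNDJSONParser() {
  let buffer = ''; // trailing partial line carried over between chunks

  // Parse complete lines into objects, skipping blank lines.
  const parseLines = (lines) => lines
    .map((line) => line.trim()) // also drops a trailing '\r'
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));

  return {
    // Accept the next chunk and return rows for each completed line.
    push(chunk) {
      const lines = (buffer + chunk).split('\n');
      buffer = lines.pop(); // the last piece may be an incomplete line
      return parseLines(lines);
    },
    // Call once at end of input to parse a final unterminated line.
    flush() {
      const rest = buffer;
      buffer = '';
      return parseLines([rest]);
    }
  };
}

// Pivot row-oriented objects into a column-oriented object of arrays.
// Assumes every row has the same keys.
function rowsToColumns(rows) {
  const columns = {};
  for (const row of rows) {
    for (const [name, value] of Object.entries(row)) {
      (columns[name] ??= []).push(value);
    }
  }
  return columns;
}

// Usage: the second record is split across two chunks.
const parser = createNDJSONParser();
const chunks = ['{"a":1,"b":"x"}\n{"a":2,', '"b":"y"}\n{"a":3,"b":"z"}'];
const rows = [];
for (const chunk of chunks) rows.push(...parser.push(chunk));
rows.push(...parser.flush());

const columns = rowsToColumns(rows);
// columns: { a: [1, 2, 3], b: ['x', 'y', 'z'] }
```

Buffering the trailing partial line is what lets a reader emit rows before the full input has arrived. For compressed input, the chunks could first pass through a decompression step (for example, the standard `DecompressionStream` with `'gzip'` or `'deflate'`) and a text decoder before reaching a parser like this one.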
