
Releases: elixir-explorer/explorer

v0.9.2

27 Aug 13:53
1175ff9

Added

  • Add a new :keep option to the mutate_with/3 function and mutate/3 macro.
    This option allows users to control which columns are retained in the output
    dataframe after a mutation operation. You can use :all (the default) or :none.
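
A minimal sketch of the new option, assuming the mutate/3 macro takes the options keyword list as its third argument. The dataframe and column names are illustrative only and do not come from the release notes:

```elixir
Mix.install([{:explorer, "~> 0.9.2"}])

require Explorer.DataFrame, as: DF

# Hypothetical data for illustration.
df = DF.new(a: [1, 2, 3], b: ["x", "y", "z"])

# Default behaviour (keep: :all): the new column is added to the existing ones.
all = DF.mutate(df, c: a * 2)

# With keep: :none, only the columns produced by the mutation should remain.
only_new = DF.mutate(df, [c: a * 2], keep: :none)

IO.inspect(DF.names(all), label: "keep: :all")
IO.inspect(DF.names(only_new), label: "keep: :none")
```

The mutations are wrapped in explicit brackets in the second call so that the options are parsed as a separate argument.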

Fixed

  • Fix handling of "LazySeries" with remote dataframes.
  • Fix typespecs of Explorer.Series.cast/2 by adding a dtype_alias() type.
  • Stop converting io_dtypes() to maps in order to preserve names ordering.

Full Changelog: v0.9.1...v0.9.2

SHA256 of artifacts

6717497ec99ba169d3224f63a59099650311e8e376480327a6251c5c8c9544f2  explorer-v0.9.2-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
a1889f2558a125e4703894db04d1fab2aae2c07daf8ff2724922a73b67376368  explorer-v0.9.2-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
17def23350d5e6367a88734b5b8c1d3d7d7369f61dd4514c22287b5ddb782f3b  explorer-v0.9.2-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
5554f17bbb5823ada068ef7b03fbd7504213c93861395e90490313909c9e524c  explorer-v0.9.2-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
73c1fcc0db80c93b41bb74ee643de6ddc2e6c7053fe6eccd234edb007fa3a044  libexplorer-v0.9.2-nif-2.15-aarch64-apple-darwin.so.tar.gz
e548d17dbf70de230f6a13f4576182f611065d0765c4370ace9e01ec6d1ebb77  libexplorer-v0.9.2-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
14a2f07fcdb815ecc483f4dcedbded982b59be40633528e71634e70e961d9f91  libexplorer-v0.9.2-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
1283a62cd2234d25b4b6d4d35a23a48e8fda2b915e068f91dcceb174c3a492aa  libexplorer-v0.9.2-nif-2.15-x86_64-apple-darwin.so.tar.gz
599e73cc71dac39d4e0a8607a59176655591705a132f7f32b32b90045482e8eb  libexplorer-v0.9.2-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
6ce4df2a9c1815be4f0d0d8fadd0f6cdc55172ebfb79bb77bbd1a008bebb6f09  libexplorer-v0.9.2-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
936e4cd3b9db9039538893fc634b1c34c33e1c8636b00fc396822d10f0bab7c4  libexplorer-v0.9.2-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
b07378f05f51c35f79b20e2fc78dfb804626e6583b19de8aa47be773bd2fe5c8  libexplorer-v0.9.2-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
0afe0cc7410a2c09f30ae81ef57324e69b22f736705224e46b73b36837882250  libexplorer-v0.9.2-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

Build with attestations: https://github.com/elixir-explorer/explorer/actions/runs/10579468339

v0.9.1

15 Aug 21:31
1f2ccbb

Added

  • Add support for saving to the cloud using streaming and the IPC format.
    This will enable saving a lazy frame to the cloud without loading it
    entirely in memory. It only supports saves to S3-compatible storage services.

Changed

  • Force garbage collection on remote gc.

Fixed

  • Re-enable support for saving to the cloud using streaming and the Parquet format.
    This fixes a regression from v0.9.0, which had disabled this feature.

  • Fix overwrite of dtypes for Explorer.DataFrame.load_csv/2.
    This was a regression introduced in v0.9.0.

Full Changelog: v0.9.0...v0.9.1

SHA256 of artifacts

13a1063430989ab65536e1195976028fa5a6274fcb71b04e9c77e77ffcc64f62  explorer-v0.9.1-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
eec297e6d1a20c0fb4fcb83a9779dc0199e03252a1df1e985dcc95d44f3e533f  explorer-v0.9.1-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
dedfe9f2e0b0a620038abeb40f8f6ae67decfd5eabb7313bd137168d94b25357  explorer-v0.9.1-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
f2eb81ed0ed7eb5ed8d65a4fe6e6ae86beb77158da0ab8a514cfbd38e42805c9  explorer-v0.9.1-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
142ec5f7898cbea3213dc1f36db3082798b96ac75688d7b7d3cd4521b6c26183  libexplorer-v0.9.1-nif-2.15-aarch64-apple-darwin.so.tar.gz
330cce54a8fc1a3f6ff79f340b3ad966a3705efbc67b13b16a03efca0e0567c4  libexplorer-v0.9.1-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
eba94a784c28729e142143107092cf8c5c7534d443e6825ed68c878d1a01fd40  libexplorer-v0.9.1-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
9ba64fe4ba60bf218049752761cabda7fd5a41401c5938d69aad72a0c74dbf9f  libexplorer-v0.9.1-nif-2.15-x86_64-apple-darwin.so.tar.gz
aceea3bbc047feb7729110f3ec0bf61d087efb0efd76e39c93ccb92485a8595c  libexplorer-v0.9.1-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
4edd090d6c200949d8cacdf0aee393cb0afbe188ec5035d06cb130cbb69c70ea  libexplorer-v0.9.1-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
4abc7a67b27202b468eaa00c9b42afce896edfb0124204356e82128ef7a90a95  libexplorer-v0.9.1-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
43dc79e1907c3230169136b002458b929fda3fc1fd2a6ed3287fa44cbf9db1f7  libexplorer-v0.9.1-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
4f58b77bcbdd9c3c9545e26daf92eeb971a97bd0b382a5855b66b3823be51e7f  libexplorer-v0.9.1-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

v0.9.0

26 Jul 21:08
2537c18

Added

  • Add initial support for SQL queries.

    Explorer.DataFrame.sql/3 accepts a dataframe and a SQL query. Explorer does not validate the SQL, so queries are backend-dependent. Polars is currently the only backend.

  • Add support for remote series and dataframes.

    Automatically transfer data between nodes for remote series and dataframes and perform distributed garbage collection.

    The functions in Explorer.DataFrame and Explorer.Series will automatically move operations on remote dataframes to the nodes they belong to.
    The Explorer.Remote module provides additional conveniences for manual placement.

  • Add FLAME integration, so we automatically track remote series and dataframes returned from FLAME calls when the :track_resources option is enabled.
    See FLAME for more.

  • Add Explorer.DataFrame.transform/3 that applies an Elixir function to each row. This function is similar to Explorer.Series.transform/2, and as such, it's considered an expensive operation. So it's recommended only if there is no similar dataframe or series operation available.

  • Improve performance of Explorer.Series.from_list/2 for most cases where the :dtype option is given. This is especially true when the dtype is :binary.
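
The SQL support listed above can be sketched as follows. This is a rough example under two assumptions that are not stated in the release notes: that the dataframe is visible to the query under the table name "df", and that the Polars backend accepts this simple SELECT syntax. The data is illustrative only:

```elixir
Mix.install([{:explorer, "~> 0.9.0"}])

alias Explorer.DataFrame, as: DF

# Hypothetical data for illustration.
df = DF.new(a: [1, 2, 3], b: ["x", "y", "y"])

# Assumption: the dataframe is exposed to the query as the table "df".
# Explorer does not validate the SQL string; it is handed to the backend.
result = DF.sql(df, "select a, b from df where a > 1")

IO.inspect(DF.names(result), label: "columns")
```

Because the query is not validated by Explorer, a malformed statement should surface as a backend error rather than an Explorer one.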

Changed

  • Stop inference of dtypes if the :dtype option is given by the user.
    The main goal of this change is to improve performance. We are now delegating the job of decoding the terms as the given :dtype to the backend.

  • Explorer.Series.pow/2 no longer casts to float when the exponent is a signed integer. We are following the way Polars works now, which is to try to execute the operation or raise an exception in case the exponent is negative.

  • Explorer.DataFrame.pivot_wider/4 no longer includes the names_from column name in the new columns when values_from is a list of columns. This is more consistent with its behaviour when values_from is a single column.

  • Explorer.Series.substring/3 no longer cycles to the end of the string if the negative offset surpasses the beginning of that string. In that case, an empty string is returned.

  • The Explorer.Series.ewm_* functions no longer replace nil values with the value at the previous index. They now propagate nil values through to the result series.

  • Saving a dataframe as a Parquet file to S3 services no longer works when streaming is enabled. This is temporary due to a bug in Polars. An exception should be raised instead.

Full Changelog: v0.8.3...v0.9.0

SHA256 of artifacts

aeed3719479b9bbe1e342af272927a75b5ad4f38bd89dd971e739561d1923172  explorer-v0.9.0-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
720914c3e85a0869174cd43c26b65400e6ac0131993ea5796689b2d19cf364e2  explorer-v0.9.0-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
ddd1f1f70d2791fc662fd477c34f9c4aa18a7d4d1a80bf953e39ac3d49924de7  explorer-v0.9.0-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
52138d2657f8af5c85b75d22129b107e0547d0ef4b3abca966de48a1b18be6d6  explorer-v0.9.0-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
6210497158c3479bdf9f8ab8661fca6bc02addde2cce085d5b134b4cc43ad5d4  libexplorer-v0.9.0-nif-2.15-aarch64-apple-darwin.so.tar.gz
78d54509a7a37e8e174cff4cce06329e75740d2cab8936834f8df06ed6a4eaea  libexplorer-v0.9.0-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
3acd48fc82d89eeb74b54db70e3a4f44404f8764f5f63d767f0281a93f91ab35  libexplorer-v0.9.0-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
2c2390d37a0171c0e096e96620959d78b8d66c73907d219d9f205562df933983  libexplorer-v0.9.0-nif-2.15-x86_64-apple-darwin.so.tar.gz
0dca014ba38a6be5705607bdf6560bd9c485dcd1596e8f0c7dde4e69001a2c93  libexplorer-v0.9.0-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
96ec9de1e472a504bd101cd05273873930337089286103911a2d43db72ae48a8  libexplorer-v0.9.0-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
1baee4332cc0f6e5c2e1f505ce3ca98f6350cdcda73a3cde476d2f2690ab2094  libexplorer-v0.9.0-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
320b2fc65700b09f20c58d0674631b67714dffe665df34dd39e9b9984051ee79  libexplorer-v0.9.0-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
7a93e9a34e7dfac721d96dca49c57405b40da5330dfa9ac78add561a234c4040  libexplorer-v0.9.0-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

v0.8.3

10 Jun 20:18
b3bee3b

Added

  • Add new data type for datetimes with timezones: {:datetime, precision, time_zone}
    The old dtype is now {:naive_datetime, precision}.

  • Add option to rechunk the dataframes when using Explorer.DataFrame.from_parquet/3

Changed

  • Change the {:datetime, precision} dtype to {:naive_datetime, precision}.
    The idea is to mirror Elixir's datetime, and introduce support for time zones.
    Please note: {:datetime, precision} will work as an alias for {:naive_datetime, precision} for now but will raise a warning.
    The alias will be removed in a future release.

  • Literal %NaiveDateTime{} structs used in expressions will now have :microsecond precision.
    Previously they defaulted to :nanosecond precision.
    This was incorrect because %NaiveDateTime{} structs only have :microsecond precision.
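
The dtype split described above can be sketched as follows, assuming that Explorer.Series.from_list/2 infers the dtype from %NaiveDateTime{} and %DateTime{} structs. The timestamps are illustrative:

```elixir
Mix.install([{:explorer, "~> 0.9"}])

alias Explorer.Series

# A series built from naive datetimes (no time zone attached).
naive = Series.from_list([~N[2024-06-10 20:18:00]])

# A series built from UTC datetimes; the dtype should carry the time zone.
aware = Series.from_list([~U[2024-06-10 20:18:00Z]])

IO.inspect(Series.dtype(naive), label: "naive dtype")
IO.inspect(Series.dtype(aware), label: "aware dtype")
```

The naive series is expected to report a {:naive_datetime, precision} dtype and the time zone aware one a {:datetime, precision, time_zone} dtype.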

Fixed

  • Fix regression in Explorer.DataFrame.concat_rows/2.
    It is possible again to concatenate dataframes that are not aligned.

  • Fix "is_finite" and "is_infinite" from Series to work in the context of an Explorer.Query.

SHA256 of the artifacts

2caba60cb3132e6751bba2879366e5b95551158f344fcd86d3ad39d2ac87a255  explorer-v0.8.3-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
839d89988421790dfc64894ebac830bbdb81b4ae0a9cfb8917935cf767c295cc  explorer-v0.8.3-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
bde5f164e7b46cd30c371c959712507999644a046d41c658649bbeb86077ed3a  explorer-v0.8.3-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
a0a091f6c2171c456f36dd516b03cf789ad028b51b8fb2fa0bdfeed73fce2b8f  explorer-v0.8.3-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
325bdf2b6d13a0aa3366bbf8a02b714610a4625e9b95d0306b66b2f3ac3fa9d6  libexplorer-v0.8.3-nif-2.15-aarch64-apple-darwin.so.tar.gz
0cfe0f315db83686fa1d7d1a276852f6964bda135279822ad946ee619c723ec2  libexplorer-v0.8.3-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
f221d655939a815156881c314d1c1794757dc23afd755eb6144f6b6fea5ee10f  libexplorer-v0.8.3-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
a69917a55aed8b0c0e40b7b3ba92cce5bec818bc1ca4a03b2277921e30f0c48e  libexplorer-v0.8.3-nif-2.15-x86_64-apple-darwin.so.tar.gz
f974fb1e4caa9ee07843e9b2691ade556560c8f1d35291dc73f37249bf6f3477  libexplorer-v0.8.3-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
212422bdceeef98ca7f08648b0bf67520390fc35f924e96ba8f0e667715fd63b  libexplorer-v0.8.3-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
cf3df4dbcc228d1801e5c6c2f258c721d375c22919bbd2690f05e9624d405b55  libexplorer-v0.8.3-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
ff7a17d8a3e6d45f349ecfcb3246664d3aa34681b41e2fce32eb5076a38f0544  libexplorer-v0.8.3-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
fbba507be1059dac16228eee2287906b0311041c7603afa8f7a0a6edb4382fe5  libexplorer-v0.8.3-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

Full Changelog: v0.8.2...v0.8.3

v0.8.2

22 Apr 12:34
990d4e5

Added

  • Add functions to work with strings and regexes.

    Some of the functions have the prefix "re_", because they accept a string that represents a regular expression.

    There is an important detail: we do not accept Elixir regexes, because we cannot guarantee that the backend supports them. Instead we accept a plain string that is "escaped". This means that you can use the ~S sigil to build that string.
    Example: ~S/(a|b)/.

    The added functions are the following:

    • Explorer.Series.split_into/3 - split a string series into a struct of string fields. This function accepts a string as a separator.

    • Explorer.Series.re_contains/2 - check if the string series matches the regex pattern. Like the "non regex" counterpart, it returns a boolean series.

    • Explorer.Series.re_replace/3 - replace all occurrences of a pattern with a replacement in a string series. The replacement can refer to group captures using ${x}, where x is the group index (starts with 1) or name.

    • Explorer.Series.count_matches/2 - count how many times a substring appears in a string series.

    • Explorer.Series.re_count_matches/2 - count how many times a pattern matches in a string series.

    • Explorer.Series.re_scan/2 - scan for all matches for the given regex pattern.
      This is going to result in a series of lists of strings - {:list, :string}.

    • Explorer.Series.re_named_captures/2 - extract all capture groups as a struct for the given regex pattern. In case the groups are not named, their positions are used as names.

  • Enable the usage of system certificates when running on OTP version 25 or above.

  • Add support for the :streaming option in Explorer.DataFrame.to_csv/3.

  • Support operations with groups in the Lazy Polars backend. This change makes the lazy frame implementation more useful, by supporting the usage of groups in the following functions:

    • Explorer.DataFrame.slice/3

    • Explorer.DataFrame.head/2

    • Explorer.DataFrame.tail/2

    • Explorer.DataFrame.filter_with/2 and the macro version of it, filter/2.

    • Explorer.DataFrame.sort_with/3, although it ignores "maintain order" and "nulls last" options when used with groups.

    • Explorer.DataFrame.mutate_with/2 and its macro version, mutate/2.
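
The string and regex functions listed in this release can be sketched as follows. The sample strings are illustrative, and the patterns are built with the ~S sigil as the notes recommend:

```elixir
Mix.install([{:explorer, "~> 0.8.2"}])

alias Explorer.Series

# Hypothetical data for illustration.
s = Series.from_list(["apple pie", "banana", "cherry"])

# Patterns are plain strings, not Elixir regexes; ~S avoids extra escaping.
has_a_or_b = Series.re_contains(s, ~S/(a|b)/)

# Replace every lowercase vowel with an underscore.
masked = Series.re_replace(s, ~S/[aeiou]/, "_")

# Count occurrences of a literal substring (not a regex).
a_count = Series.count_matches(s, "a")

IO.inspect(Series.to_list(has_a_or_b), label: "re_contains")
IO.inspect(Series.to_list(masked), label: "re_replace")
IO.inspect(Series.to_list(a_count), label: "count_matches")
```

The functions without the "re_" prefix, such as count_matches/2, take a literal substring, so no escaping is involved there.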

Changed

  • We now avoid raising an exception if a nonexistent column is used in Explorer.DataFrame.discard/2.

  • Make the dependency of cacerts optional. This is because people using Erlang/OTP 25 or above can use the certificates provided by the system.
    So you may need to add the dependency of cacerts if your OTP version is older than that.

  • Some precision differences in float operations may appear. This is due to an update in the Polars version to "v0.38.1". Polars is our default backend.

Fixed

  • Fix Explorer.Series.split/2 inside the context of Explorer.Query.

  • Add optional X-Amz-Security-Token header to S3 request. This is needed in case the user is passing down a token for authentication.

  • Fix Explorer.DataFrame.sort_by/3 with groups to respect :nils option.
    This is considering only the eager implementation.

  • Fix inspection of lazy frames in remote nodes.

Pull requests

  • Bump Polars 0.37 by @lkarthee in #861
  • DataFrame.discard/2 - don't raise for non existent column by @lkarthee in #872
  • Add native expression for Series.split/2 by @H12 in #875
  • Bump mio from 0.8.10 to 0.8.11 in /native/explorer by @dependabot in #876
  • Implements Series.split_into/3 by @ryancurtin in #873
  • Update Polars to v0.38 by @philss in #879
  • Add optional x-amz-security-token header to S3 request by @jschniper in #881
  • Rewrite LazyFrame by @philss in #882
  • Update Rustler to v0.32.1 by @philss in #884
  • Fix DF.sort_by/3 with groups to respect :nils option by @philss in #886
  • Update Polars to v0.38.3 by @philss in #887
  • Implements :streaming option for DataFrame.to_csv/3 by @ryancurtin in #889
  • Support operations with groups in the Lazy Polars backend by @philss in #890
  • Bump h2 from 0.3.25 to 0.3.26 in /native/explorer by @dependabot in #891
  • Revert LazyFrame implementation with stack by @philss in #892
  • Refactor eager DF implementation to make use of lazy backend by @philss in #893
  • Add re_contains/2 and re_replace/3 to match with a regex by @philss in #894
  • Add count_matches/2, re_count_matches/2, re_scan/2 and re_named_captures/2 to Series by @philss in #895
  • Add changes to the change log for the upcoming version by @philss in #897
  • Update dependencies by @philss in #899
  • Pass down backend to lazy series and enable re_named_captures/2 usage by @philss in #896
  • Release v0.8.2 by @philss in #900

SHA 256 of Artifacts

fd4d7db73577544d1008827502461fbc82644b44879bf4d50b8c7c2f7a04ad1f  explorer-v0.8.2-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
ba9f6afe86d37e52b7481a29e6011cdc834b2c0196ee6b4235497c4a405fe6e3  explorer-v0.8.2-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
447e3150ebffa1712ed7b6d56e11dd2369126a92bf5466570e4f36ae46f200b9  explorer-v0.8.2-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
2032955e04c6632fd4d6d1015f611b7b90a84a405710d49cce46b7b1e1f52b3d  explorer-v0.8.2-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
9f10c1b25846de37ca2caf271c7728716d5d6783c82e823783ced53cc6a0b4b0  libexplorer-v0.8.2-nif-2.15-aarch64-apple-darwin.so.tar.gz
aceade08eab94230b8f9dc87a5850e5523a7cf7a4222495bf3fa012c4622cd54  libexplorer-v0.8.2-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
0f0341d8a0928554ea2c083a653e599fdd406a0675bfc8e615465e6461726508  libexplorer-v0.8.2-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
63f9ffda8f9dbcacb12a3a522adbde370b78579a23a75183321fb0fa81f0a596  libexplorer-v0.8.2-nif-2.15-x86_64-apple-darwin.so.tar.gz
65793e232a26a91bcfb90867f6392e34228d8f3f23419f500bee47f08c3e8896  libexplorer-v0.8.2-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
3b4d5b1d88cfe416a13e3f0f1fdc87dd54dbf78dff7ff40cb1f64aa7652d5b8a  libexplorer-v0.8.2-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
3afb057bfecdf86199a9dc380be2f44c44f09ac428d806e7983232ba6f15601b  libexplorer-v0.8.2-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
cfab4552f1f3791e38c6b6ce3d5099fbe88f686ec062ac181ef2b35f3432c1e3  libexplorer-v0.8.2-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
7fa4961f08f9278f6b8585d2b4f89d5712ccf222de7a41edb054915d8ec7d50c  libexplorer-v0.8.2-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

Full Changelog: v0.8.1...v0.8.2

v0.8.1

24 Feb 21:29
bfdf07b

Added

  • Add Explorer.Series.field/2 to extract a field from a struct series.
    It returns a new series with the field's dtype.

  • Add Explorer.Series.json_decode/2 that can decode a string series containing valid JSON objects according to dtype.

  • Add eager count/1 and lazy size/1 to Explorer.Series.

  • Add support for maps as expressions inside Explorer.Query. They are "converted" to structs.

  • Add json_path_match/2 to extract a string series from a string series containing valid JSON objects.
    See the article JSONPath - XPath for JSON for details about JSON paths.

  • Add Explorer.Series.row_index/1 to retrieve the index of rows starting from 0.

  • Add support for passing the :on column directly (instead of inside a list) in Explorer.DataFrame.join/3.
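
Two of the additions above can be sketched as follows: passing :on as a single column name to join/3, and row_index/1. The dataframes are illustrative, and the sketch assumes the default join type is an inner join:

```elixir
Mix.install([{:explorer, "~> 0.8.1"}])

alias Explorer.DataFrame, as: DF
alias Explorer.Series

# Hypothetical data for illustration.
left = DF.new(id: [1, 2, 3], name: ["a", "b", "c"])
right = DF.new(id: [1, 2, 4], score: [10, 20, 40])

# :on is given as a single column name rather than wrapped in a list.
joined = DF.join(left, right, on: "id")

# row_index/1 returns the position of each row, starting from 0.
idx = Series.row_index(Series.from_list(["x", "y", "z"]))

IO.inspect(DF.names(joined), label: "joined columns")
IO.inspect(Series.to_list(idx), label: "row index")
```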

Changed

  • Remove some deprecated functions from documentation.

  • Change internal representation of the :struct dtype to use list of tuples instead of a map to represent the dtypes of each field. This shouldn't break because we normalise maps to lists when a struct dtype is passed in from_list/2 or cast/2.

  • Update Rustler minimum version to ~> 0.31. Since Rustler is optional, this shouldn't affect most of the users.

Fixed

  • Fix float overflow so that it returns an argument error instead of crashing the VM.

  • Fix Explorer.DataFrame.print/2 for when the DF contains structs.

Full Changelog: v0.8.0...v0.8.1
Official Changelog: https://hexdocs.pm/explorer/changelog.html

SHA256 of precompiled artifacts

ce4b06cf51f6213b4e1917e52f73c8a09ef57c5cf5e157409122cdd348d00ee3  explorer-v0.8.1-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
b78fb84a8847b17dd857213c9aea69622dff0b6b00233f395e9aaf2e3ee9a923  explorer-v0.8.1-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
481204194b180b5dd4207cc00909f192a3e8f094f08b8be58bbc5e9e058150cd  explorer-v0.8.1-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
f1a77c0f378582e300f17a85fe391eea6bdd673839fdd52d9bc5988906ba6171  explorer-v0.8.1-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
86aec9dd29572a61cd064108b02360161bff3e93e109ec0eb6c3e516cd08a6b4  libexplorer-v0.8.1-nif-2.15-aarch64-apple-darwin.so.tar.gz
a10f4ea3c7c1135b15e4a15186f926eed9c18376d4168442156e7ab9d9678408  libexplorer-v0.8.1-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
ac338d49cc96bdd8646c2e98e4eba877a352d02f84b956055d814dc66b884e1f  libexplorer-v0.8.1-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
ffc4d30c9c6802e5be5429b687f599fa4ca7875178468e880a5a6b2bc7f83663  libexplorer-v0.8.1-nif-2.15-x86_64-apple-darwin.so.tar.gz
e7bd3c239fd11db43f5fa822a5d25ce3c1a4569e33b155cdf341dfc20e5488c1  libexplorer-v0.8.1-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
dba1128914e97a0edca3d0618ef6617a3eed284fbe6ee9d052b95c674e6eac14  libexplorer-v0.8.1-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
90b8960ce6d57b48002a1ddffaab950a746de24f7c45eab7c89d65a075edeb9d  libexplorer-v0.8.1-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
9480ab502d28b7540cf598115c6aadc6a2d1c61eaaa82ffa60fc2e530b0f1e91  libexplorer-v0.8.1-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
c5ffa7c27f6dc44ec31be9eed3d09ff1f3fa9ce4727342af8607160b48d6b686  libexplorer-v0.8.1-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

v0.8.0

20 Jan 17:43
e0e242b

Added

  • Add explode/2 to Explorer.DataFrame. This function is useful to expand the contents of a {:list, inner_dtype} series into an "inner_dtype" series.

  • Add the new series functions all?/1 and any?/1, to work with boolean series.

  • Add support for the "struct" dtype. This new dtype represents the struct dtype from Polars/Arrow.

  • Add map/2 and map_with/2 to the Explorer.Series module.
    This change enables the usage of the Explorer.Query features in a series.

  • Add sort_by/2 and sort_with/2 to the Explorer.Series module.
    This change enables the usage of the lazy computations and the Explorer.Query module.

  • Add unnest/2 to Explorer.DataFrame. It works by taking the fields of a "struct" - the new dtype - and transforming them into columns.

  • Add pairwise correlation - Explorer.DataFrame.correlation/2 - to calculate the correlation between numeric columns inside a data frame.

  • Add pairwise covariance - Explorer.DataFrame.covariance/2 - to calculate the covariance between numeric columns inside a data frame.

  • Add support for more integer dtypes. This change introduces new signed and unsigned integer dtypes:

    • {:s, 8}, {:s, 16}, {:s, 32}
    • {:u, 8}, {:u, 16}, {:u, 32}, {:u, 64}.

    The existing :integer dtype is now represented as {:s, 64}, and it's still the default dtype for integers. But series and data frames can now work with the new dtypes. Short names for these new dtypes can be used in functions like Explorer.Series.from_list/2. For example, {:u, 32} can be represented with the atom :u32.

    This may bring more interoperability with Nx, and with Arrow related things, like ADBC and Parquet.

  • Add ewm_standard_deviation/2 and ewm_variance/2 to Explorer.Series.
    They calculate the "exponentially weighted moving" variance and standard deviation.

  • Add support for :skip_rows_after_header option for the CSV reader functions.

  • Support {:list, numeric_dtype} for Explorer.Series.frequencies/1.

  • Support pins in cond, inside the context of Explorer.Query.

  • Introduce the :null dtype. This is a special dtype from Polars and Apache Arrow to represent "all null" series.

  • Add Explorer.DataFrame.transpose/2 to transpose a data frame.
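
The new integer dtypes and the boolean helpers from this list can be sketched as follows. The values are illustrative:

```elixir
Mix.install([{:explorer, "~> 0.8.0"}])

alias Explorer.Series

# Without a :dtype option, integers default to signed 64-bit.
default = Series.from_list([1, 2, 3])

# The short atom :u32 stands for the {:u, 32} dtype.
small = Series.from_list([1, 2, 3], dtype: :u32)

flags = Series.from_list([true, false, true])

IO.inspect(Series.dtype(default), label: "default dtype")
IO.inspect(Series.dtype(small), label: "explicit dtype")
IO.inspect(Series.any?(flags), label: "any?")
IO.inspect(Series.all?(flags), label: "all?")
```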

Changed

  • Rename the functions related to sorting/arranging of the Explorer.DataFrame.
    Now arrange_with is named sort_with, and arrange is sort_by.

    sort_by/3 is a macro that works with the Explorer.Query module, whereas sort_with/2 takes a callback function.

  • Remove unnecessary casts to {:s, 64} now that we support more integer dtypes.
    It affects some functions, like the following in the Explorer.Series module:

    • argsort
    • count
    • rank
    • day_of_week, day_of_year, week_of_year, month, year, hour, minute, second
    • abs
    • clip
    • lengths
    • slice
    • n_distinct
    • frequencies

    And also some functions from the Explorer.DataFrame module:

    • mutate - mostly because of series changes
    • summarise - mostly because of series changes
    • slice
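
The renamed sorting functions can be sketched as follows, showing the macro and the callback form side by side. The dataframe is illustrative:

```elixir
Mix.install([{:explorer, "~> 0.8.0"}])

require Explorer.DataFrame, as: DF
alias Explorer.Series

# Hypothetical data for illustration.
df = DF.new(a: [2, 3, 1], b: ["x", "y", "z"])

# Macro form (formerly arrange): column names are written as variables.
by_macro = DF.sort_by(df, desc: a)

# Callback form (formerly arrange_with): columns are accessed on the argument.
by_callback = DF.sort_with(df, fn ldf -> [desc: ldf["a"]] end)

IO.inspect(Series.to_list(DF.pull(by_macro, "a")), label: "sort_by")
IO.inspect(Series.to_list(DF.pull(by_callback, "a")), label: "sort_with")
```

Both calls should produce the same ordering. The callback form is the one to reach for when the sort expression has to be built at runtime.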

Fixed

  • Fix inspection of series and data frames between nodes.

  • Fix cast of :string series to {:datetime, any()}

  • Fix mismatched types in Explorer.Series.pow/2, making it more consistent.

  • Normalize sorting options.

  • Fix functions with dtype mismatching the result from Polars.
    This fix is affecting the following functions:

    • quantile/2 in the context of a lazy series
    • mode/1 inside a summarisation
    • strftime/2 in the context of a lazy series
    • mutate_with/2 when creating a column from a NaiveDateTime or Explorer.Duration.

v0.7.2

30 Nov 20:21
e585012

Added

  • Add the functions day_of_year/1 and week_of_year/1 to Explorer.Series.

  • Add the filter/2 macro and filter_with/2 to Explorer.Series.

    This change enables the usage of queries - using Explorer.Query - when
    filtering a series. The main difference is that series does not have a
    name when used outside a dataframe. So to refer to itself inside the
    query, we can use the special _ variable.

      iex> s = Explorer.Series.from_list([1, 2, 3])
      iex> Explorer.Series.filter(s, _ > 2)
      #Explorer.Series<
        Polars[1]
        integer [3]
      >
    
  • Add support for the {:list, any()} dtype, where any() can be any other
    valid dtype. This is a recursive dtype, that can represent nested lists.
    It's useful to group data together in the same series.

  • Add Explorer.Series.mode/2 to get the most common value(s) of the series.

  • Add split/2 and join/2 to the Explorer.Series module.
    These functions are useful to split string series into {:list, :string},
    or to join parts of a {:list, :string} and return a :string series.

  • Expose ddof option for variance, covariance and standard deviation.

  • Add a new {:f, 32} dtype to represent 32-bit float series.
    It's also possible to use the atom :f32 to create this type of series.
    The atom :f64 can be used as an alias for {:f, 64}, just like the
    :float atom.

  • Add lengths/1 and member?/2 to Explorer.Series.
    These functions work with {:list, any()}, where any() is any valid dtype.
    The idea is to count the members of a "list" series, and check if a given
    value is member of a list series, respectively.

  • Add support for streaming parquet files from a lazy dataframe to AWS S3
    compatible services.
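
The list-related series functions from this release can be sketched as follows. The strings are illustrative:

```elixir
Mix.install([{:explorer, "~> 0.7.2"}])

alias Explorer.Series

# Hypothetical data for illustration.
s = Series.from_list(["a b", "c d e"])

# split/2 turns each string into a list of strings.
parts = Series.split(s, " ")

# lengths/1 counts the members of each list.
sizes = Series.lengths(parts)

# member?/2 checks, per row, whether the value is in the list.
has_c = Series.member?(parts, "c")

# join/2 glues each list back into a single string.
joined = Series.join(parts, "-")

IO.inspect(Series.to_list(parts), label: "split")
IO.inspect(Series.to_list(sizes), label: "lengths")
IO.inspect(Series.to_list(has_c), label: "member?")
IO.inspect(Series.to_list(joined), label: "join")
```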

Changed

  • Remove restriction on pivot_wider dtypes.
    In the early days, Polars only supported numeric dtypes for the "first"
    aggregation. This is not true anymore, and we can lift this restriction.

  • Change :float dtype to be represented as {:f, 64}. It's still possible
    to use the atom :float to create float series, but now Explorer.Series.dtype/1
    returns {:f, 64} for 64-bit float series.

Fixed

  • Add missing implementation of Explorer.Series.replace/3 for lazy series.

  • Fix inspection of DFs and series when limit: :infinity is used.

Removed

  • Drop support for the riscv64gc-unknown-linux-gnu target.

    We decided to stop precompiling to this target because it's been hard to maintain it.
    Ideally we should support it again in the future.

Full Changelog: v0.7.1...v0.7.2
Official Changelog: https://github.com/elixir-explorer/explorer/blob/main/CHANGELOG.md

Checksums

The list below is the SHA256 checksums of the precompiled artifacts.

363e9c8ecd92f2d7ff19cc977ab8fafbed8f5b5f4a9c483d98bb7441b469c5c2  explorer-v0.7.2-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
abb960b51f56e76d594554c1f7cd082de195a64a04f5795c75ced6126c4b66a5  explorer-v0.7.2-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
d9c5e084f22dc2fc3a4ef808e840c192109e4cd919054f0111f7f1e4f52b97b3  explorer-v0.7.2-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
96a59887aff5e62b4838fb8d7189ac21b7a39bf6caae11985d8e2daea2d99f15  explorer-v0.7.2-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
d5e384c292fca48941cd5400cf900fa39117f9cc187bfbc1d252d3a72798cd07  libexplorer-v0.7.2-nif-2.15-aarch64-apple-darwin.so.tar.gz
52c4455faec0c12789ecf2fb287f89b4f8350728092fa9de39ca24d1203d3daa  libexplorer-v0.7.2-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
3524ebf3c73246eff8d3fb556786f30d063402cc212a83e3abaceae9f2ff86c5  libexplorer-v0.7.2-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
429ceebb5b7f465c66a6becc1e454ef7383fc58f6cfe081bf71fdd94eb55655b  libexplorer-v0.7.2-nif-2.15-x86_64-apple-darwin.so.tar.gz
029bb5fc6e102449260b706655428cd3ef02b36bd74246375e17897c1a26815d  libexplorer-v0.7.2-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
d2075a364a23fc7911b3099716505d8d0df69af1aab944e527c169a96fa2ef58  libexplorer-v0.7.2-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
86b9e9b671c46cb90d0d4bf7b0f2998e59f8715d6f13fde29f8070ad756da648  libexplorer-v0.7.2-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
fd4e095fafa0055619a49383bd8680abd6639e3d5fc114dd246e0192bcccb5e8  libexplorer-v0.7.2-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
ad3779850c36bf0ff3ca568124da8966cca25ce84912642dc13462cb9ca5a9dd  libexplorer-v0.7.2-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

v0.7.1

25 Sep 16:28
96a108d

Added

  • Add more temporal arithmetic operations. This change makes it possible
    to mix some datatypes, like date, duration and scalar types like
    integers and floats.

    The following operations are possible now:

    • date - date
    • date + duration
    • date - duration
    • duration + date
    • duration * integer
    • duration * float
    • duration / integer
    • duration / float
    • integer * duration
    • float * duration
  • Support lazy dataframes on Explorer.DataFrame.print/2.

  • Add support for strings as the "indexes" of Explorer.Series.categorise/2.
    This makes it possible to categorise a string series with a categories series.

  • Introduce cond/1 support in queries, which enables multi-clause conditions.
    Example of usage:

        iex> df = DF.new(a: [10, 4, 6])
        iex> DF.mutate(df,
        ...>   b:
        ...>     cond do
        ...>       a > 9 -> "Exceptional"
        ...>       a > 5 -> "Passed"
        ...>       true -> "Failed"
        ...>     end
        ...> )
        #Explorer.DataFrame<
          Polars[3 x 2]
          a integer [10, 4, 6]
          b string ["Exceptional", "Failed", "Passed"]
        >
    
  • Similar to cond/1, this version also introduces support for the if/2
    and unless/2 macros inside queries.

  • Allow the usage of scalar booleans inside queries.

  • Add Explorer.Series.replace/3 for string series.
    This enables the replacement of patterns inside string series.
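
Two of the additions above, date arithmetic and replace/3, can be sketched as follows. The dates and strings are illustrative, and the sketch assumes that "date - date" maps to Explorer.Series.subtract/2:

```elixir
Mix.install([{:explorer, "~> 0.7.1"}])

alias Explorer.Series

# Hypothetical data for illustration.
later = Series.from_list([~D[2023-09-25]])
earlier = Series.from_list([~D[2023-08-28]])

# date - date: the result should be a duration series.
diff = Series.subtract(later, earlier)

# replace/3 substitutes a pattern inside each string of the series.
amounts = Series.from_list(["1,200", "3,400"])
cleaned = Series.replace(amounts, ",", "")

IO.inspect(Series.dtype(diff), label: "date - date dtype")
IO.inspect(Series.to_list(cleaned), label: "replace")
```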

Deprecated

  • Deprecate Explorer.DataFrame.to_lazy/1 in favor of just lazy/1.

Fixed

  • Fix the Explorer.Series.in/2 function to work with series of the
    :category dtype.

    Now, if both series share the same categories, we can compare them.
    To make sure that a categorical series shares the same categories from
    another series, you must create that series using the
    Explorer.Series.categorise/2 function.

  • Display the dtype of duration series correctly in Explorer.DataFrame.print/2.

Full Changelog: v0.7.0...v0.7.1
Official Changelog: https://github.com/elixir-explorer/explorer/blob/main/CHANGELOG.md

SHA256 of compiled artifacts

c723d2185d3d908004d1a88c4a565a2c339598ad61388265415a55a54e1f786a  explorer-v0.7.1-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
9d5a87706ab5334d13325f8c5ec753dddf700a46a78a42678f596663cc3b4aca  explorer-v0.7.1-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
e95fa8e787161aab1e3080c59a8fcb3b8e11e8346c4311a64126eb9acb869c5a  libexplorer-v0.7.1-nif-2.15-aarch64-apple-darwin.so.tar.gz
3a7d40cabee2f8036ef8bf2ed85b39479fe20d77bb05962f490c979e156671b1  libexplorer-v0.7.1-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
4481e50238cbe251ce5d5a89da50a8b6cb05d772098a9404a5067c605a8d49bb  libexplorer-v0.7.1-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
b4760dac6ab2d57f906957de0188bdff2b35595d8a095236f292119a61a85db1  libexplorer-v0.7.1-nif-2.15-riscv64gc-unknown-linux-gnu.so.tar.gz
b37e8ba2fe028dc1c34f54006f467b0493d38d52d2166f0438f29eb7d094b180  libexplorer-v0.7.1-nif-2.15-x86_64-apple-darwin.so.tar.gz
bab7426c2acd61604a793e950b55e8823077bfb6b3a609f3ee6025332170f87f  libexplorer-v0.7.1-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
b3e3c8092da7cbdaf0078f556f61f3e6c68ebfb04a2f3f284a99b99f57310024  libexplorer-v0.7.1-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
b0cfae969888be5a34079084c2a9b58c056578834526c3c8db41d33fec6a8407  libexplorer-v0.7.1-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

v0.7.0

28 Aug 13:34
6ee60de

Added

  • Enable reads and writes of dataframes from/to external file systems.

    It supports HTTP(s) URLs or AWS S3 locations.

    This feature introduces the FSS abstraction,
    which is also going to be present in newer versions of Kino. This is going to make the integration
    of Livebook files with Explorer much easier.

    The implementation differs depending on the file format and on whether the operation
    is a read or a write. All writes to AWS S3 are done on the Rust side, using an abstraction
    called CloudWriter. Most of the readers are implemented in Elixir: they download
    the files and then load the dataframe from them. The only exception is the reading of
    parquet files, which is done in Rust, using Polars' scan_parquet with streaming.

    We want to give a special thanks to Qqwy / Marten for the
    CloudWriter implementation!

  • Add ADBC: Arrow Database Connectivity.

    Continuing with improvements in the IO area, we added support for reading dataframes from
    databases using ADBC, which is similar in idea to ODBC, but integrates much better with
    Apache Arrow, which is the backbone of Polars, our backend today.

    The function Explorer.DataFrame.from_query/3 is the entrypoint for this feature, and it
    allows querying databases like PostgreSQL, SQLite and Snowflake.

    Check the Elixir ADBC bindings docs for more information.

    For this feature, we had a fundamental contribution from Cocoa
    in the ADBC bindings, so we want to say a special thanks to her!

    We also want to thank the people who joined José in his live streams on Twitch
    and helped to build this feature!

  • Add the following functions to Explorer.Series:

  • Add duration dtypes. This adds the following dtypes:

    • {:duration, :nanosecond}
    • {:duration, :microsecond}
    • {:duration, :millisecond}

    This feature was a great contribution from Billy Lanchantin,
    and we want to thank him for this!

Changed

  • Return exception structs instead of strings for all IO operation errors, and for anything
    that returns an error from the NIF integration.

    This change makes it easier to define which type of error we want to raise.

  • Update Polars to v0.32.

    With that we made some minor API changes, like changing some options for cut/qcut operations
    in the Explorer.Series module.

  • Use nil_values instead of null_character for IO operations.

  • Never expect nil for CSV IO dtypes.

  • Rename Explorer.DataFrame.table/2 to Explorer.DataFrame.print/2.

  • Change the :datetime dtype to be {:datetime, time_unit}, where time_unit can be
    one of the following:

    • :millisecond
    • :microsecond
    • :nanosecond

  • Rename the following Series functions:

    • trim/1 to strip/2
    • trim_leading/1 to lstrip/2
    • trim_trailing/1 to rstrip/2

    These functions now support a string argument.
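
As a brief illustration of the renamed string functions, the sketch below strips a chosen character from both ends, from the left only, and from the right only. The sample values and the Mix.install/1 version pin are assumptions made for this example.

```elixir
# Assumption: running as a standalone script, so Explorer is fetched here.
Mix.install([{:explorer, "~> 0.7.0"}])

alias Explorer.Series

amounts = Series.from_list(["$12", "12$", "$12$"])

# strip/2 removes the given character from both ends of each string.
both = Series.strip(amounts, "$")

# lstrip/2 only touches the start, rstrip/2 only the end.
left = Series.lstrip(amounts, "$")
right = Series.rstrip(amounts, "$")

IO.inspect(Series.to_list(both), label: "strip")
IO.inspect(Series.to_list(left), label: "lstrip")
IO.inspect(Series.to_list(right), label: "rstrip")
```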

Fixed

  • Fix warnings for the upcoming Elixir v1.16.

  • Fix Explorer.Series.abs/1 type specs.

  • Allow comparison of strings with categories.

  • Fix Explorer.Series.is_nan/1 inside the context of Explorer.Query.
    The NIF function was not being exported.

Full Diff: v0.6.1...v0.7.0
Changelog: https://github.com/elixir-explorer/explorer/blob/main/CHANGELOG.md

SHA256 checksums

a4629f950187fd20f4b0efa0164e8e9e20b5799312688e4ce7d82c46e28dfbaa  explorer-v0.7.0-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
9028f61dcde0e3d95ca886463d78cdaf0d749a319f0c721a11321947f86017d7  explorer-v0.7.0-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
5a244343ad99310267531c7848c9f261944064770e90748c22c0dd17fa12867a  libexplorer-v0.7.0-nif-2.15-aarch64-apple-darwin.so.tar.gz
a72dae3b58b11d73a47f3a92137b53494e52b1454c413e9ec3877d4eb7d9e406  libexplorer-v0.7.0-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
c87661de40d2447d90c7962dd61da98481f9cf9fa75a923e8699a216d0786a18  libexplorer-v0.7.0-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
0c5e602fb5e2680b916c07cb6615812e15e6f9fe28b361866341f8b4fd5cffe7  libexplorer-v0.7.0-nif-2.15-riscv64gc-unknown-linux-gnu.so.tar.gz
02f1638a7309133a72f8079560b06223ffd2d266dece3ddfe25ec0646ddfa23d  libexplorer-v0.7.0-nif-2.15-x86_64-apple-darwin.so.tar.gz
64bfae13b65e18b29a891820930ebbfac56abba01cf5b515b1fe11cf8019820b  libexplorer-v0.7.0-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
f0296c139c68fa818f61bbc37c5cf293a4fc09bbd4ccf1d942e8566e89146c51  libexplorer-v0.7.0-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
d2105e54fa5ab677b9b2e4313371108a9a2df355af659a73c486a55111fd29b4  libexplorer-v0.7.0-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz