[0.5.2] - 2023-09-01

Support Apache Arrow 13.0.0. This version is compatible with Arrow 12.0.0.

  • Breaking change

  • Bug fixes

  • Fix bundle install issue by installing libyaml-devel (#280)

  • Fix ownership in devcontainer ci (#280)

  • New features and improvements

  • Support Arrow 13.0.0 (#280)

  • Documentation and Example

  • Add dataframe_comparison_ja (#281)

[0.5.1] - 2023-08-18

Docker environment is replaced by Dev Container, and Jupyter Notebooks will be created from qmd files.

  • Breaking change

  • Bug fixes

  • Fix timestamp test to set TZ locally (#249)

  • Fix regexp for beginning of String (#251)

  • Fix loading bin/Gemfile locally in bin/jupyter script (#261)

  • New features and improvements

  • Support sort and null_placement options in Vector#rank (#265)

  • Add Vector#find_substring method (#270)

  • Add Group#one method (#274)

  • Add Group#all and #any method (#274)

  • Add Group#median method (#274)

  • Add Group#count_uniq method (#274)

  • Introduce Dev Container environment

    • Introduce Devcontainer environment (#253)

    • Change lifecycle script from postCreate to onCreate (#253)

    • Move example to bin (#253)

    • Fix Python and Ruby versions in Dev Container (#254)

    • Add locale and timezone settings (#256)

    • Add quarto from devcontainer feature (#259)

    • Install HaranoAjiFonts as default Tex font (#259)

  • Refactoring

  • Rename boolean methods in VectorStringFunction (#263)

  • Refine Vector#inspect to show whether chunked or not (#267)

  • Add an alias Group#count_all for #group_count (#274)

  • Improve in tests/CI

  • Create rake commands for Notebook convert/test (#269)

  • Fix rubocop warning of forwarding arguments in assign_update (#269)

  • Use rake to start example script (#269)

  • Add test in Vector#rank to cover illegal rank option error (#271)

  • Add bundle install to Rakefile (#276)

  • Use Dockerfile to create dev container (#276)

  • Save image to ghcr in ci (#276)

  • Documentation and Example

  • YARD

    • Update Docker Environment (#245)

    • Refine jupyter notebook environment (#253)

    • Refine yard in Group aggregations (#274)

    • Fix yard of Vector#rank (#269)

    • Fix yard of Group (#269)

  • Notebook

    • Start source management for jupyter notebook by qmd (#259)

    • Don't create ipynb if it exists (#261)

    • Add Group methods (125 in total) (#269)

    • Add ArrowFunction (126 in total) (#269)

    • Add DataFrame#auto_cast (127 in total) (#269)

    • Update required version in examples notebook (#269)

    • Update examples_of_red_amber (#269)

    • Update red-amber.qmd (#269)

  • GitHub site

  • Fix broken link in README/README.ja by Viktorius Suwandi (#262)

  • Change description in gemspec (#254)

  • Add documents for Dev Container (#254)

  • Thanks

  • Viktorius Suwandi

[0.5.0] - 2023-05-24

  • Breaking change

  • Use non keyword argument in #sub_by_value (#219)

  • Upgrade dependency to Arrow 12.0.0 (#238)

    • right_join will output columns in the same order as Red Arrow.

    • DataFrame#join will not force the ordering of the original columns by default.

    • Joins with a type, such as full_join, sort after joining by default.

  • Bug fixes

  • Use truncate in Vector#sample(float) (#229)

  • Support options in DataFrame#tdra (#231)

  • Fix printing table with non-ascii strings (#233)

  • Fix join for Arrow 12.0.0

  • New features and improvements

  • Add a singleton method Vector.[] (#218)

  • Add an alias #sub_group (#219)

  • Accept a Hash in Group#summarize to rename aggregated columns (#219)

  • Add Group#group_frame (#219)

  • Add Vector#cast (#224)

  • Add Vector#fill_nil(value) (#226)

  • Add Vector#one (#227)

  • Add Vector#mode (#228)

  • Add DataFrame#propagate (#235)

  • Add DataFrame#sample (#237)

  • Add DataFrame#shuffle (#237)

  • Support RankOptions in Vector#rank (#239)

  • Introduce MatchSubstringOptions family in Vector (#241)

    • Introduce Vector#match_substring?

    • Add Vector#end_with?, #start_with? method

    • Add Vector#match_like?

    • Add Vector#count_substring method

  • Refactoring

  • Refine Group and SubFrames function (#219)

    • Refine Group#group_count

    • Use Acero in Group#filters

    • Refine Group#filters, not using Acero

    • Refine Group#summarize(array)

  • Use Acero for renaming columns in join (#238)

  • Use index kernel with IndexOptions introduced in 12.0.0 (#240)

  • Improve in tests/CI

  • Use Fedora 39 Rawhide in CI (#238)

  • Documentation and Example

  • Add missing yard documents for SubFrames::Selectors (#219)

  • Update docker/example (#219)

  • Update Gemfile in docker (#219)

  • Add README.ja.md (#242)

  • GitHub site

  • Update link of Red Data Tools Chat to matrix (#242)

  • Thanks

[0.4.2] - 2023-04-02

  • Breaking change

  • Bug fixes

  • Fix Vector#modulo, #fdiv, #remainder (#203)

  • New features and improvements

  • Update SubFrames#take to return SubFrames (#212)

  • Refactoring

  • Refine SubFrames to support partial retrieval (#207)

  • Upgrade SubFrames#frames and promote to public (#207)

  • Use faster count in Group#inspect (#207)

  • Improve in tests/CI

  • Documentation and Example

  • Introduce minimum docker environment (#205)

  • Move example REPL to docker (#205)

  • Add readme.md in docker (#205)

  • Add example_of_red_amber.ipynb (#205)

  • Use smaller dataset in irb example

  • Fix docker/example

  • Updated link to red-data-tools (#213)

    • Thanks to Soumya Kushwaha

  • GitHub site

  • Migrated to Red Data Tools

    • Thanks to Sutou Kouhei

  • Thanks

  • Sutou Kouhei

  • Soumya Kushwaha

[0.4.1] - 2023-03-11

  • Breaking change

  • Remove Vector.aggregate? method (#200)

  • Bug fixes

  • Return self in DataFrame#drop when dropper is empty (reverts 746ac263) (#193)

  • Return self in DataFrame#rename when renaming to same name (#193)

  • Return self in DataFrame#pick when pick itself (#199)

  • Fix column width for non-ascii elements in DataFrame#to_s (#193)

    • This change uses String#width.

  • Fix DataFrame#to_iruby when data is date32 type (#193)

  • Fix DataFrame#shorthand to show temporal type data simply (#193)

  • Fix Vector#rank when data is ChunkedArray (#198)

  • Fix Vector element-wise functions with nil as scalar (#198)

  • Support :force_order for all methods of join family (#199)

    • Supports the :force_order option to force sorting after join for the whole #join family.

    • This will be valuable in some cases, such as with large dataframes.

  • Ensure baseframe's schema for SubFrames (#200)

  • New features and improvements

  • Add Vector#first, #last method (#198)

    • This method will be used in SubFrames feature.

  • Add Vector#modulo method (#198)

    • The divmod function in Arrow C++ is still in draft state, so this method was created by combining existing functions.

  • Add Vector#quotient method (#198)

  • Add aliases #div, #mod, #mul, #pow, #quo and #sub for Vector (#198)

  • Add Vector#*_checked functions (#198)

    • These functions check for numeric range overflow.

  • Add 'tdra' and 'plain' in display mode (#193)

    • The plain mode and default inspect will show up to 128 rows and 128 columns.

  • Add String#width method in refinements (#193)

    • This will be used to update DataFrame#to_s.

  • Introduce pre-loaded REPL environment (#199)

    • This commit adds bin/example, which starts an irb environment with commonly used datasets such as penguins, diamonds, etc. already loaded.

  • Upgrade SubFrames#aggregate to accept block (#200)

  • Refactoring

  • Use symbolized keys in refinements of Table#keys, #key? (#193)

    • This allows Tables and DataFrames to be treated in the same manner.

  • Use key_name.succ in suffix of DataFrame#join (#193)

    • This makes it simple to get a candidate name.

  • Use ||= to memoize instance variables (#193)

  • Refine vector projection to use #variables (#193)

    • #variables is the fastest way to pick Vectors.

  • Refine Vector#is_in to avoid #pack (#198)

  • Refine Vector#index (#198)

  • Improve in tests/CI

  • Tests

    • Update benchmarks to test from older version (#193)

    • Refine test of Vector function with scalar (#198)

    • Refine test subframes and test_vector_selectable (#200)

  • Cops

  • CI

  • Documentation

  • Update documents(small fix) (#201)

  • GitHub site

  • Thanks
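
Note on Vector#modulo (#198) in this release: the entry says the method was built by combining existing functions because divmod was not yet available. The plain-Ruby sketch below illustrates one way such a composition can work (divide, floor, multiply, subtract). The helper name floored_modulo is hypothetical and operates on plain Arrays; it is not RedAmber's implementation, which works on Arrow data.

```ruby
# Hypothetical helper for illustration only (not part of RedAmber).
# Composes a floored modulo from four simpler element-wise steps.
def floored_modulo(dividends, divisor)
  dividends.map do |x|
    quotient = (x.to_f / divisor).floor # divide, then floor
    x - divisor * quotient              # multiply, then subtract
  end
end

result = floored_modulo([7, -7, 8], 3)
```

With a floored quotient the remainder takes the sign of the divisor, which is the same convention as Ruby's Integer#%.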

[0.4.0] - 2023-02-25

  • Breaking change

  • Upgrade dependency to Arrow 11.0.0 (#188)

  • Bug fixes

  • Add :force_order option for DataFrame#join (#174)

  • Return error for empty DataFrame in DataFrame#filter (#172)

  • Accept ChunkedArray in DataFrame#filter (#172)

  • Fix Vector#replace to accept Arrow::Array as a replacer (#179)

  • Fix Vector#round_to_multiple to accept Float or Integer (#180)

  • Change Vector atan2 to a class method (#180)

  • Fix Vector#shift when boolean Vector (#184)

  • Fix processing empty SubFrames (#183)

  • Do not check object id in DataFrame#rename, #drop for self (#188)

  • New features and improvements

  • Accept a block in DataFrame#filter (#172)

  • Add Vector.aggregate? method (#175)

  • Introduce Vector#propagate method (#175)

  • Add Vector#rank methods (#176)

  • Add Vector#sample method (#176)

  • Add Vector#sort method (#176)

  • Promote DataFrame#shape_str to public (#184)

  • Introduce Vector#concatenate (#184)

  • Add #numeric? in refinements of Array (#184)

  • Add Vector#cumulative_sum_checked and #cumsum (#184)

  • Add Vector#resolve method (#184)

  • Add DataFrame#tdra method (#184)

  • Add #expand as an alias for Vector#propagate (#184)

  • Add #glimpse as an alias for DataFrame#tdr (#184)

  • New class SubFrames (#183)

    • Introduce class SubFrames

    • Memoize dataframes in SubFrames

    • Add @frames to memoize sub DataFrames

    • Accept filters in SubFrames.new

    • Accept block in SubFrames.new

    • Add SubFrames.by_filter

    • Introduce methods creating SubFrames from DataFrame

    • Introduce SubFrames#each method

    • Add SubFrames#to_s method

    • Add SubFrames#concatenate method

    • Add SubFrames#offset_indices method

    • SubFrames#aggregate method

    • Redefine SubFrames#map to return SubFrames

    • Define SubFrame#map dynamically

    • Add SubFrames#assign method

    • Redefine SubFrames#select to return SubFrames

    • Add SubFrames#reject method

    • Add SubFrames#filter_map method

    • Refine DataFrame#indices by memoizing @indices

    • Rename SubFrames#universal_frame as #baseframe

    • Set Group iteration feature to @api private

  • Refactoring

  • Generate Vector functions in class method (#177)

  • Set Constant visibility to private (#179)

  • Separate test_vector_function (#179)

  • Relocate methods in DataFrameIndexable (#179)

  • Rename Array refinements to the same name as Vector (#184)

  • Improve in tests/CI

  • Tests

    • Update benchmarks to set 0.3.0 as a reference (#167)

    • Move test of Vector#logb to proper location (#180)

  • Cops

    • Update .rubocop.yml to align with latest cops (#174)

    • Unify style of MethodCallIndentation as relative to receiver (#184)

  • CI

    • Fix setting up Arrow by homebrew in CI (#167)

    • Fix CI error on homebrew deleting python link (#167)

    • Set cache-version to get new C extensions in CI (#173)

    • Thanks to @kou for suggestion.

  • Documentation

  • Update DataFrame.md about loading csv without headers (#165)

    • Thanks to kojix2

  • Update YARD in DataFrame combinable (#168)

  • Update comment for Ruby 2.7 support in README.md

  • Update license year

  • Update README (#172)

  • Update Vector.md and yardoc in #propagate (#175)

  • Use customized style sheet for YARD (#179)

  • Add examples for the doc of #pick and #drop (#179)

  • Add examples to YARD in DataFrame reshaping methods (#179)

  • Update documents in DataFrameDisplayable (#179)

  • Update documents in DataFrameVariableOperation (#179)

  • Update document for dynamically generated methods (#179)

  • Unify style in document (#179)

  • Update documents in DataFrameSelectable (#179)

  • Update documents of basic Vector methods (#179)

  • Update document in VectorUpdatable (#179)

  • Update document of Group (#179)

  • Update document of DataFrameLoadSave (#180)

  • Add examples for document of ArrowFunction (#180)

  • Update document of Vector_unary_aggregation (#180)

  • Update document of Vector_unary_element_wise (#180)

  • Update document of Vector_binary_element_wise (#180)

  • Add documentation to give comparison of dataframes (#169)

    • Thanks to Benson Muite

  • Update documents for consistency of method indentation (#189)

  • Update CHANGELOG (#189)

  • Update README for 0.4.0 (#189)

  • GitHub site

  • Thanks

  • kojix2

  • Benson Muite

[0.3.0] - 2022-12-18

  • Breaking change

  • Supported Ruby version has changed from 2.7 to 3.0

    • Upgrade minimum supported/required version of Ruby from 2.7 to 3.0 (#159, #160)

  • Bug fixes

  • Add check with #key? in DataFrame#method_missing (#140)

  • Delete unnecessary backslash to suppress warning in unary functions (#140)

  • Fix syntax in code_climate.yml (#144)

  • Temporarily disable simplecov test report (#149)

  • Change Vector#[] to return Array or scalar (#148)

  • Add missing simplecov HTML formatter (#148)

  • Change return value of DataFrame#save to self (#160)

    • Originally reported by kojix2.

  • New features and improvements

  • Update Vector#take to accept block (#148)

  • Add properties of list Vectors (#148)

  • Add Vector#split, #split_to_column, #split_to_row (#148)

  • Add Vector#merge (#148)

  • Refactoring

  • Refactor code (#140)

    • Add DataFrame.create as a faster constructor

    • Refactor DataFrame.new using refinements and duck typing

    • Refactor Vector.new using refinements and duck typing

    • Add Vector.create as a faster constructor

    • Refactor Group

    • Refactor DataFrame#pick/#drop by refining Array

    • Refactor DataFrame#pick/#drop

    • Refactor nil treatment in pick/drop

    • Refactor DataFrame#pick/#drop using new parser

    • Refactor DataFrame#[]

    • Refactor Vector#[], #take, #filter by updating parser

    • Add for_keys option to parse_args

    • Refactor Vector properties by refinements for Arrow::Array

    • Refactor DataFrame selectable using Arrow::Array refinements instead of Vector methods

    • Refactor DataFrame#assign

  • Refine error message in DataFrame#to_long/to_wide (#143)

  • Refactor Vector#take/filter returns arrow array (#148)

  • Change LineLength in cop from 120 to 90 (#152)

  • Refine DataFrame combinable (join) operations (#159)

    • Refine DataFrame#join effectively using outputs options

    • Simplify DataFrame set operations

  • Improve in tests/CI

  • Tests

    • Update benchmark using 0.2.3 (#138)

    • Update benchmark basic#02/pick by [] (#140)

    • Update benchmark contexts and loop_count (#140)

    • Add benchmark for vector (#140)

    • Add tests for refinements (#140)

    • Add benchmark for the series of DataFrame operations (#140)

    • Add missing test for tdr and dictionary (#140)

    • Add missing test for group#method with foreign key (#152)

    • Add missing test for set operations and natural join (#152)

    • Add missing test for DataFrame#[] with selecting by Array of illegal type (#152)

    • Add missing test for DataFrame#assign when the assigner size is mismatched (#152)

    • Accept Hash as join keys in DataFrame join methods (#159)

  • Cops

    • Refactor/clean rubocop.yml (#138)

  • CI

    • Support Ruby 3.2 in CI test (#141)

    • Send test coverage report to Code Climate (#144)

    • Add test on Fedora (#151)

    • Thanks to Benson Muite.

    • Add workflow to generate document (#153)

    • Thanks to kojix2.

    • Support Code Climate test coverage report in CI (#155)

  • Documentation

  • Add YARD in data_frame.rb (#140)

  • Fix YARD document in the code (#140)

  • Add Code Climate badges of maintainability and coverage (#144)

  • Add installation for Fedora in README (#147)

    • Thanks to Benson Muite.

  • Add Vector#split/merge in Vector.md (#148)

  • Fix codeclimate badges in README (#155)

  • Update YARD in DataFrame join methods (#159)

  • Update jupyter notebook '89 examples of Redamber' (#160)

  • Thanks

  • Benson Muite

  • kojix2

[0.2.3] - 2022-11-16

  • Bug fixes

  • Fix DataFrame#to_s when DataFrame.size == 0 (#125)

  • Remove unused lines in funcs (#128)

  • Remove unused methods in helper (#128)

  • Add test for invalid arg in DataFrame.new (#128)

  • Add test for Vector#shift(0) (#128)

  • Fix bugs for DataFrame#[], #pick and #drop with Range of Symbols and Symbol (#135)

  • New features and improvements

  • Upgrade dependency to Arrow 10.0.0 (#132)

    Since 0.2.3, it is possible to initialize from objects that respond to to_arrow. Arrays in Numo::NArray respond to to_arrow with Red Arrow Numo::NArray 0.0.6. This feature was proposed by the Red Data Tools member @kojix2 and implemented by @kou. I also made Vector respond to to_arrow and to_arrow_array. It becomes a member of ducks ('quack quack'). Thanks!

    • Change dev dependency to red-dataset-arrow (#117)

    • Add dev dependency for red-arrow-numo-narray (#132)

    • Support Numo::NArray in Vector.new (#132)

    • Support Vector#to_arrow_array (#132)

  • Update group (#118)

    • Introduce new DataFrame group support (experimental)

    This additional API will treat a grouped DataFrame as a list of DataFrames. I think this API has pros such as:

      • The API is easy to understand and flexible.

      • It has good compatibility with Ruby's primitive Enumerables.

      • We can only use non hash-ed aggregation functions.

      • There is no need for a grouped DataFrame state, nor an #ungroup method.

      • It may be useful for concurrent operations.

    This feature is implemented in Ruby, so it is pretty slow and experimental. Use the original Group API for practical purposes.

    • Include Enumerable in Group (experimental)

    • Add Group#each, #inspect

    • Refactor Group to align with Arrow

  • Introduce DataFrame combining methods (#125)

    • Introduce DataFrame#concatenate method

    • Add DataFrame#merge method

    • Add DataFrame#inner_join method

    • Add DataFrame#full_join method

    • Add DataFrame#left_join method

    • Add DataFrame#right_join method

    • Add DataFrame#semi_join method

    • Add DataFrame#anti_join method

    • Add DataFrame#intersect method

    • Add DataFrame#union method

    • Add DataFrame#setdiff method

    • Rename #setdiff to #difference

    • Support natural join in DataFrame#join

    • Support partial join_key and renaming

    • Fix DataFrame#join to merge key columns

    • Add DataFrame#set_operable? method

    • Add join/set/bind image to DataFrame.md

    • Fix DataFrame#join, #right_semi, #right_anti (#128)

  • Miscellaneous

    • Return Vector in DataFrame#indices (#118)

  • Improve tests/ci

  • Improve CI

    • Add CI test on macOS (#133)

    • Enable bundler-cache on macOS (#128)

    • Install gobject introspection prior to glib in CI (#133). This will stabilize CI system installation, especially with cache.

    • Rename workflows/test.yml to ci.yml (#133)

    • Fix link in CI badge of README.md (#118)

    • Add github action for coverage (#128)

  • Add benchmark

    • Add benchmarks with Rover (#118)

    • Introduce benchmark suite (#134)

    • Add benchmark for combining operations (#134)

  • Measuring test coverage

    • Add test coverage measurement (#128)

  • Refactoring

  • Remove redundant string escape in test_vector_function (#132)

  • Refine tests to use assert_equal_array (#128)

  • Rewrite Vector#replace (#128)

  • Documentation

  • Update README.md for installation (#126)

  • Add clause that keys must be unique in doc. (#126)

  • Rows should be called as 'records' (#126)

  • Update Jupyter Notebook 83 examples of RedAmber (#135)

  • GitHub site

    • Update Jupyter notebooks in Binder

    • Change default branch name from 'master' to 'main' (#127)

  • Thanks

  • Ruby Association Grant committee: It is a great honor that RedAmber was selected as a project of the Ruby Association Grant 2022.
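
Note on the experimental group support (#118) in this release: the entry describes treating a grouped DataFrame as a list of DataFrames that cooperates with Ruby's Enumerable. The plain-Ruby sketch below shows the general idea with a hypothetical TinyGroup class over an Array of row Hashes. The class, its constructor and the sample data are assumptions for illustration; RedAmber's Group is built on Arrow tables and has a different interface.

```ruby
# Hypothetical stand-in for a grouped table (not RedAmber's Group class).
class TinyGroup
  include Enumerable

  def initialize(rows, key)
    @rows = rows
    @key = key
  end

  # Yield one sub-table (an Array of row Hashes) per distinct key value.
  # Returns an Enumerator when no block is given.
  def each(&block)
    return enum_for(:each) unless block

    @rows.group_by { |row| row[@key] }.each_value(&block)
    self
  end
end

rows = [
  { species: 'a', mass: 1 },
  { species: 'b', mass: 5 },
  { species: 'a', mass: 3 }
]
group = TinyGroup.new(rows, :species)

# Any Enumerable method now works per sub-table, with no #ungroup step.
totals = group.map { |sub| sub.sum { |row| row[:mass] } }
```

Because only #each is defined, methods such as map, select and count come from Enumerable for free, which is the compatibility benefit the entry mentions.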

[0.2.2] - 2022-10-04

  • Bug fixes

  • Return self when no replacement happens in Vector#replace. (#92)

  • Limit n-digits in to_iruby. (#111)

  • Fix displaying space in to_iruby. (#111)

  • Raise error if key is duplicated. (#113)

  • Fix DataFrame#pick/#drop with endless Range. (#113)

  • Change type from dictionary to string in DataFrame reshaping methods. (#113)

  • Fix arguments parser to accept Enumerator. (#114)

  • New features and improvements

  • Support to make a data frame from a to_arrow-responsible object. (#106) [Patch by Kenta Murata]

  • Introduce DataFrame#auto_cast (experimental feature) (#105)

  • Change default name in DataFrame#transpose, #to_long, #to_wide. (#110)

  • Add Vector#dictionary? method. (#113)

  • Add display mode 'Plain' and 'Minimum'. (#113)

  • Refactor code

    • Refine test_vector_selectable. (#92)

    • Refine test_vector_updatable. (#92)

    • Refine Vector.new. (#113)

    • Refine DataFrame#pick, #drop. (#113)

  • Documents

    • Update images. (#90, #105, #113)

    • Update README to use simpler examples. (#112)

    • Update README with a new screenshot example. (#113)

  • GitHub site

    • Update Jupyter notebooks in Binder (#88, #115)

    • Move binder support to heronshoes/docker-stacks repository.

    • Update README notebook on binder.

    • Add examples_of_RedAmber notebook on binder.

    • Start to use discussions.

  • Thanks

  • Kenta Murata

[0.2.1] - 2022-09-07

  • Bug fixes

  • Fix Vector#each with block (#66). Vector#each will return the value of each element when given a block.

  • Fix table format at size == 9 (#67)

  • Fix to support Vector in DataFrame#assign (#77)

  • Add assert_delta functionality for assert_with_NaN (#78)

  • Fix Vector#is_in when self is chunked (#79)

  • Fix Array type error (uint/int) (#79)

  • New features and improvements

  • Refine DataFrame#indices method (#67)

  • Update DataFrame reshaping methods (#73)

    • Change default option value of DataFrame reshaping

    • Change the order of import_cars example

  • Add DataFrame#method_missing to get column vector by method (#75)

    • Add DataFrame#method_missing to get column (#75)

  • Accept both args and block in DataFrame#assign (#75)

  • Accept indices in DataFrame#pick and DataFrame#drop (#76)

  • Add DataFrame#slice_by method (#77)

  • Add new Vector functions (#78)

    • Add inverse trigonometric function for Vector

    • acos

    • asin

    • Add logarithmic function for Vector

    • ln

    • log10

    • log1p

    • log2

    • Add binary function Vector#logb

  • Docker image and Jupyter Notebook [Thanks to Kenta Murata]

    • Add link to RubyData in README

    • Add link to interactive README by Binder

  • Update Jupyter Notebook 71 examples of RedAmber

  • Thanks

  • Kenta Murata

[0.2.0] - 2022-08-15

  • Bump version up to 0.2.0

  • Bug fixes

  • Fix order of multiple group keys (#55). Only 1 group key comes to the left. Other keys remain on the right.

  • Remove optional require for rover (#55). Fix DataFrame.new for argument with Rover::DataFrame.

  • Fix occasional failure in CI (#59). Sometimes the CI test fails. I added the -dev dependency to the Arrow install by apt instead of doing it in bundler.

  • Fix calling :take in V#[] (#56). Fixed to call the Arrow function :take instead of :array_take in Vector#take_by_vector. This prevents an error when called with Arrow::ChunkedArray.

  • Raise error when renaming a non-existing key (#61). Add an error when the specified key does not exist.

  • Fix DataFrame#rename #assign by array (#65)

  • New features and improvements

  • Support Arrow 9.0.0

    • Upgrade to Arrow 9.0.0 (#59)

    • Add Vector#quantile method (#59). Arrow::QuantileOptions is supported in Arrow GLib 9.0.0 (ARROW-16623, Thanks!)

    • Add Vector#quantiles (#62)

    • Add DataFrame#each_row (#56)

    • Returns Enumerator if block is not given.

    • Change DataFrame#each_row to return a Hash => row (#63)

  • Refactor to use pattern match in overloaded parameter parsing (#61)

    • Refine DataFrame.new to use pattern match

    • Use pattern match in DataFrame#assign

    • Use pattern match in DataFrame#rename

  • Accept Array for renamer/assigner in #rename/#assign (#61)

    • Accept assigner by Arrays in DataFrame#assign

    • Accept renamer pairs by Arrays in DataFrame#rename

    • Add DataFrame#assign_left method

  • Add summary/describe (#62)

    • Introduce DataFrame#summary(#describe)

  • Introduce reshaping methods for DataFrame (#64)

    • Introduce DataFrame#transpose method

    • Introduce DataFrame#to_long method

    • Introduce DataFrame#to_wide method

  • Others

    • Add alias sort_index for array_sort_indices (#59)

    • Enable :width option in DataFrame#to_s (#62)

    • Add options to DataFrame#format_table (#62)

  • Update Documents

    • Add Yard doc for some methods

    • Update Jupyter notebook '61 Examples of Red Amber' (#65)
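
Note on the pattern-match refactoring (#61) in this release: the entries describe using pattern matching to parse overloaded parameters, for example a renamer given as a Hash, as Arrays of pairs, or as two names. The sketch below shows the general technique with Ruby's case/in (Ruby 3.0 or later). The method name parse_renamer and the exact set of accepted shapes are assumptions for illustration, not RedAmber's actual parser.

```ruby
# Hypothetical parser for illustration only (not RedAmber's code).
# Normalizes several argument shapes into one Hash of { from => to }.
def parse_renamer(*args)
  case args
  in [Hash => pairs]
    # rename({ from: :to })
    pairs
  in [Array => pairs] if pairs.all? { |pair| pair.is_a?(Array) && pair.size == 2 }
    # rename([[:from, :to], [:from2, :to2]])
    pairs.to_h
  in [Symbol | String => from, Symbol | String => to]
    # rename(:from, :to)
    { from => to }
  else
    raise ArgumentError, "unsupported renamer: #{args.inspect}"
  end
end

from_hash  = parse_renamer({ a: :b })
from_pairs = parse_renamer([[:a, :b], [:c, :d]])
from_names = parse_renamer(:a, :b)
```

Each in branch states the accepted shape declaratively, so adding a new overload means adding a branch rather than extending a chain of is_a? checks.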

[0.1.8] - 2022-08-04 (experimental)

  • Bug fixes

  • Fix unnamed column in table formatter (#52)

  • Fix DataFrame#key?, DataFrame#key_index when @keys.nil? (#52)

  • Align order of replacer in Vector#replace (#53, resolved #38)

  • New features and improvements

  • Refine DataFrame.new for empty arguments (#50)

    • Delete .rubocop_todo.yml so as not to use Yoda conditions (#50)

  • Refine Group (#52, resolved #28)

    • Refine Group methods creation

    • Make group key at first(left)

    • Show only one group count when same counts

    • Add block acceptability for group

    • Rename empty key to :unnamed in DataFrame.new

    • Rename Group#aggregated_by to #summarize (#54)

  • Add Vector#shift (#51)

  • Vector#[] accepts Range as an argument (#51)

  • Update documents

  • Add support for yard (#54)

  • Renew jupyter notebook '53 examples' (#54)

  • Add more examples and images in README (#52)

  • Add document of group manipulations in README (#52)

  • Renew DF#group document in DataFrame.md (#52)

[0.1.7] - 2022-07-15 (experimental)

  • Bug fixes

  • Remove development dependency for red-dataset-arrow (#47)

    • To avoid irregular fails in CI test

    • Add red-datasets to development dependency instead (#49)

  • Suppress useless log in tests (#46). Suppress log of Webrick and iruby.

  • New features and improvements

  • Use Table mode as default preview mode in inspect/to_s (#40)

    • Show examples in documents in Table

    • Use the word rows/columns

    • Update images of data processing in Table style

  • Introduce a new Table formatter (#47)

    • Migrate from the Arrow's formatter

    • Do not use TAB, format by spaces only.

    • Align column width with head rows and tail rows.

    • Show nils.

    • Show data types.

    • Refine documents to use new formatter output

  • Simplify options of Vector functions (#46). In the previous code, Vector functions with options used an optional argument opt.

  • Add #float?, #integer? to Vector (#46)

  • Add #each to Vector (#47)

  • Introduce class Group (#48)

    • Refine DataFrame#group to use class Group

    • Add methods to Group

  • Move parquet and rover to development dependency (#49)

  • Refine text in DataFrame#to_iruby (#40)

  • Add badges in Github site

    • Gitter badge for Red Data Tools (#42)

    • Gem version and CI status badge (#45)

  • Exchange containers in red-amber.rb and red_amber.rb (#47)

    • Mainly use red_amber for consistency with the folder name

  • Add Jupyter notebook '47 Examples of Red Amber' (#49)

[0.1.6] - 2022-06-26 (experimental)

  • Bug fixes

  • Fix mime-type of empty DataFrame in #to_iruby (#31)

  • Fix mime setting in DataFrame#to_iruby (#36)

  • Fix unmatched return val in Selectable (#34)

  • Fix to return same error as #[] in DataFrame#slice (#34)

  • New features and improvements

  • Introduce Jupyter support (#29, #30, #31, #32)

    • Add DataFrame#to_html (changed to use #to_iruby)

    • Add feature to show nil in to_iruby

    • nil is expressed as (nil)

    • empty string ("") is ""

    • blank spaces are " "

  • Enable to change DataFrame display mode by ENV (#36)

    • Support ENV to change display mode in #inspect and #to_iruby

    • ENV = ‘table’ # => Table mode

    • ENV = nil or other than ‘table’ # => TDR mode

  • Support require 'red-amber', as well (#34)

  • Refine Vector slicing methods (#31)

    • Introduce Vector#take method

    • Introduce Vector#filter method

    • Improve Vector#[] to overload take and filter

    • Introduce Vector#drop_nil method

    • Introduce Vector#if_else method

    • Introduce Vector#is_in method

    • Add alias Vector#all?, #any? methods (#32)

    • Add Vector#has_nil? method (#32)

    • Add Vector#empty? method

    • Add Vector#primitive_invert method

    • Refactor Vector#take, #filter

    • Move Vector#if_else from function to Updatable

    • Move if_else test to updatable

    • Rename updatable in test

    • Remove method Vector#take_out_element_wise

    • Rename inner method name

  • Refine DataFrame slicing methods (#31)

    • Introduce DataFrame#take method

    • take is implemented as vector calculation by #if_else

    • Introduce DataFrame#filter method

    • Change DataFrame#[] to use take and filter

    • Float indices are acceptable (#10)

    • Negative index (like Array) is also acceptable

  • Further refinement in DataFrame slicing methods (#34)

  • Improve DataFrame#[], #slice, #remove by a new engine

    • It parses arguments to Vector internally.

    • Used Kernel#Array to simplify code (#16).

    • Move DataFrame#slice, #remove to Selectable

    • Refine DataFrame#take, #filter (undocumented)

  • Introduce coerce in Vector (#35)

    • Introduce Vector#coerce

    • Now we can -1 * Vector.new([1, 2, 3])

    • Add Vector#to_ary method

    • Now we can [1, 2] + Vector.new([3, 4, 5])

  • Other new feature or refinements

    • Common

    • Refactor helper as common for DataFrame and Vector (#35)

    • Change name row/col to obs/var (#34)

    • Rename internal function name (#34)

    • Delete unused methods (#34)

    • DataFrame

    • Change to return instance variable in #to_arrow, #keys and #key_index (#34)

    • Change to return an Array in DataFrame#indices (#35)

    • Vector

    • Introduce Vector#replace method

    • Accept Range and expanded Array in Vector#new

    • Add Vector#indices method (#35)

    • Add Vector#index method (#35)

    • Rename VectorCompensable to *Updatable (#33)

  • Documentation

    • Fix typo in DataFrame.md

  • Github site

    • Add gem and status badges in README. (#42) [Patch by kojix2]

  • Thanks

  • kojix2
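
Note on the coerce support (#35) in this release: the entries say that -1 * Vector.new([1, 2, 3]) and [1, 2] + Vector.new([3, 4, 5]) became possible through Vector#coerce and Vector#to_ary. The plain-Ruby sketch below shows how these two standard Ruby hooks produce that behavior. TinyVector is a hypothetical stand-in that wraps a plain Array; RedAmber's Vector wraps Arrow arrays and its internals may differ.

```ruby
# Hypothetical stand-in for illustration only (not RedAmber's Vector).
class TinyVector
  attr_reader :values

  def initialize(values)
    @values = values
  end

  # Element-wise multiplication by a scalar.
  def *(other)
    TinyVector.new(@values.map { |v| v * other })
  end

  # Integer#* calls this when its right operand is not a Numeric.
  # Returning [self, number] swaps the operands so TinyVector#* runs.
  # Swapping is only safe for commutative operations such as * and +.
  def coerce(number)
    [self, number]
  end

  # Implicit conversion hook used by Array#+ and similar methods.
  def to_ary
    @values
  end
end

scaled = -1 * TinyVector.new([1, 2, 3])
joined = [1, 2] + TinyVector.new([3, 4, 5])
```

Here scaled is a TinyVector and joined is a plain Array, since Array#+ converts its argument with to_ary before concatenating.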

[0.1.5] - 2022-06-12 (experimental)

  • Bug fixes

  • Fix DataFrame#tdr to display timestamp type (#19)

  • Add TZ setting in CI test to pass temporal tests (#19)

  • Fix example in document of #load(csv_from_URI) (#23)

  • New features and improvements

  • Improve usability of DataFrame manipulating block (#19)

    • Add DataFrame#v to select a Vector

    • Add DataFrame#variables method

    • Add DataFrame#to_arrow

    • Add instance variables in DataFrame with lazy initialization

    • Add Vector#key to get key name

    • Add Vector#temporal? to check if temporal type

    • Refine around DataFrame#variables

    • Refine init of instance variables

    • Refine DataFrame#type_classes, Vector#type_class

    • Refine DataFrame#tdr to shorten temporal data

  • Add support for filling in missing values (#20)

    • Add VectorArgumentError

    • Add Vector#replace_with

    • Add helper function to assert with NaN

    • To assert NaN == NaN

    • Add Vector#fill_nil_backward, Vector#fill_nil_forward

    • Add DataFrame#remove_nil method

    • Change to accept nil as replacement in Vector#replace_with

  • Introduce index related methods (#22)

    • Add Vector#sort_indexes method

    • Add Vector#uniq method

    • Add Vector#tally and Vector#value_counts methods

    • Add DataFrame#sort method

    • Add DataFrame#group method

    • Change to use DataFrame#map_indices in #[]

  • Add rounding functions with opts (#21)

    • With options :mode and :n_digits

    • :n_digits also can be specified with :multiple option in Vector#round_to_multiple

    • Vector#round

    • Vector#ceil

    • Vector#floor

    • Vector#trunc

  • Documentation

    • Update TDR, TDR_ja documents to latest (#18)

    • Refinement and small fix in DataFrame.md (#18)

    • Update README to use more effective example (#18)

    • Delete expired TDR_operations.pdf (#23)

    • Update README and dataframe_model image (#23)

    • Update description about rover-df in README (#23)

    • Add installation of Arrow in README (#23)

  • Others

    • Tried but could not use the Bundler cache in CI tests (#17)

    • Bump up requirements to Arrow 8.0.0 (#25)

    • Arrow 7.0.0 with Ubuntu 21.04 causes a fatal error in replace_with_mask function.

    • Update the description of gem (#23)

    • Add benchmark tests (#26)
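
The NaN assertion helper listed above exists because NaN never compares equal to itself under ==. A minimal sketch of such a comparison, assuming a hypothetical helper name equal_with_nan? (not the helper actually used in the test suite):

```ruby
# Hypothetical helper for illustration: compares two arrays element by
# element, treating NaN in the same position as a match.
def equal_with_nan?(expected, actual)
  return false unless expected.size == actual.size

  expected.zip(actual).all? do |e, a|
    if e.is_a?(Float) && e.nan?
      a.is_a?(Float) && a.nan?
    else
      e == a
    end
  end
end

nan = Float::NAN
p nan == nan                                  # plain == never matches NaN
p equal_with_nan?([1.0, nan, 3.0], [1.0, nan, 3.0])
p equal_with_nan?([1.0, nan], [1.0, 2.0])
```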

[0.1.4] - 2022-05-29 (experimental)

  • Bug fixes

  • Fix missing support for scalar argument (#1)

  • Fix type name of boolean in DataFrame#types to be same as Vector#type (#6, #7)

  • Fix zero picking to return empty DataFrame (#8)

  • Fix handling when both arguments and a block are given (#8)

  • New features and improvements

  • DataFrame

    • Refine module name Displayable

    • Rename nrow/ncol methods to size/n_keys to align with TDR concept (#4)

    • Keep n_row/n_col for compatibility

    • Rename ls method to tdr (#4)

    • Add limit option to tdr

    • Shorten option name (#11)

    • Introduce pick method to create sub DataFrame (#8)

    • Add boolean support (#8)

    • Refactor pick (#9)

    • Introduce drop method to create sub DataFrame (#8)

    • Add boolean support (#8)

    • Refactor drop (#9)

    • Add boolean array support for [] (#9)

    • Add indexes/indices to use with selecting observations (#9)

    • Introduce slice method to create sub DataFrame (#8)

    • Refactor slice (#9)

    • Introduce remove method to create sub DataFrame (#9)

    • Introduce rename method to create sub DataFrame (#14)

    • Introduce assign method to create sub DataFrame (#14)

    • Improve to call block by instance_eval (#13)

  • Vector

    • Refine find(function)

    • Add min_max method (#2)

    • Add std/sd method (ddof=0 version: stddev) (#2)

    • Add var method (ddof=0 version: variance) (#2)

    • Add VectorFunctions.arrow_doc(func_name) (temporarily)

  • Documentation

    • Show code in README

    • Change row/column names for TDR concept (#4)

    • Add documents about TDR concept (#4)

    • Add example about TDR (#4)

    • Separate README to create DataFrame and Vector documents (#12)

    • Add DataFrame model concept image to README (#12)

  • GitHub site

    • Switched to use merge on GitHub (not to push merged master) (#1)

    • Create lifetime issue #3 to show the goal of this project (#3)

[0.1.3] - 2022-05-15 (experimental)

  • Bug fixes

  • Fix boolean functions in Vector to align with Ruby's behavior

    • & == and_kleene

    • | == or_kleene

  • Quote strings of data-preview in DataFrame#inspect

  • Quote empty and blank keys in DataFrame#inspect

  • Respond to error for a wrong key in DataFrame#[]

  • New features and improvements

  • DataFrame

    • Display nil elements in inspect

    • Show NaN and nil counts in inspect

    • Refactor inspect

    • Add method key and key_index

    • Add how to load/save Parquet to README

  • Vector

    • Add categorization functions

    This is an important step to support slice method and NA treatment features.

    • is_finite

    • is_inf

    • is_na (RedAmber original)

    • is_nan

    • is_nil, is_null

    • is_valid

    • Show a reduced representation for long arrays in inspect

    • Support options in aggregation functions

    • Return values as non-Arrow objects for scalar aggregation functions
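
The and_kleene / or_kleene names in the bug fix above refer to Kleene three-valued logic, where a missing value means "unknown". A minimal plain-Ruby sketch of those truth tables, with nil standing in for a missing value (kleene_and and kleene_or are hypothetical helper names for illustration, not part of RedAmber or Arrow):

```ruby
# Kleene AND: false dominates; otherwise any unknown (nil) yields unknown.
def kleene_and(a, b)
  return false if a == false || b == false
  return nil if a.nil? || b.nil?

  true
end

# Kleene OR: true dominates; otherwise any unknown (nil) yields unknown.
def kleene_or(a, b)
  return true if a == true || b == true
  return nil if a.nil? || b.nil?

  false
end

left  = [true, true, false, false, nil]
right = [true, nil,  nil,   false, nil]

p left.zip(right).map { |a, b| kleene_and(a, b) }
p left.zip(right).map { |a, b| kleene_or(a, b) }
```

The point of the Kleene variants is that a definite operand can still decide the result: false AND unknown is false, and true OR unknown is true.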

[0.1.2] - 2022-05-08 (experimental)

  • Bug fixes

  • DataFrame

    • Fix bug in #[] with endless Range

  • New features and improvements

  • Add support for Arrow 8.0.0

  • DataFrame

    • types and data_types

    • Range is usable to specify columns in #[]

  • Vector

    • type and data_type

[0.1.1] - 2022-05-06 (experimental)

  • Release on rubygems.org

  • Introduce class DataFrame

  • New from Hash, schema/rows, Arrow::Table, Rover::DataFrame

  • Load from file, string, URI

  • Save to file, string, URI

  • Methods for basic properties

  • Rich inspect method

  • Basic selecting by #[]

  • Introduce class Vector

  • New from a column in a DataFrame

  • New from Arrow::Array, Arrow::ChunkedArray, Array

  • Methods for basic properties

  • Function support

    • Unary aggregations

    • Unary element-wises

    • Binary element-wises

    • Some operators defined

[0.1.0] - 2022-04-15 (unreleased)

  • Initial version