Class: RedAmber::DataFrame

Inherits:
Object
Includes:
DataFrameCombinable, DataFrameDisplayable, DataFrameIndexable, DataFrameLoadSave, DataFrameReshaping, DataFrameSelectable, DataFrameVariableOperation, Helper
Defined in:
lib/red_amber/data_frame.rb

Overview

Class to represent a data frame. Variable @table holds an Arrow::Table object.


Methods included from DataFrameVariableOperation

#assign, #assign_left, #drop, #pick, #rename

Methods included from DataFrameSelectable

#[], #filter, #first, #head, #last, #remove, #remove_nil, #sample, #shuffle, #slice, #slice_by, #tail, #take, #v

Methods included from DataFrameReshaping

#to_long, #to_wide, #transpose

Methods included from DataFrameLoadSave

#auto_cast, included, #save

Methods included from DataFrameIndexable

#indices, #sort, #sort_indices

Methods included from DataFrameDisplayable

#inspect, #shape_str, #summary, #tdr, #tdr_str, #tdra, #to_iruby, #to_s

Methods included from DataFrameCombinable

#anti_join, #concatenate, #difference, #full_join, #inner_join, #intersect, #join, #left_join, #merge, #right_join, #semi_join, #set_operable?, #union

Constructor Details

#initialize(hash) ⇒ DataFrame
#initialize(table) ⇒ DataFrame
#initialize(schema, row_oriented_array) ⇒ DataFrame
#initialize(arrowable) ⇒ DataFrame
#initialize(rover_like) ⇒ DataFrame
#initialize ⇒ DataFrame
#initialize(empty) ⇒ DataFrame

Creates a new DataFrame.

Overloads:

  • #initialize(hash) ⇒ DataFrame

    Initialize a DataFrame by a Hash.

    Examples:

    Initialize by a Hash

    hash = { x: [1, 2, 3], y: %w[A B C] }
    DataFrame.new(hash)

    Initialize by Hash-like keyword arguments.

    DataFrame.new(x: [1, 2, 3], y: %w[A B C])

    Initialize from objects that respond to #to_arrow_array.

    # An array-like object that responds to #to_arrow_array is also accepted.
    require 'arrow-numo-narray'
    DataFrame.new(numo: Numo::DFloat.new(3).rand)

    Parameters:

    • hash (Hash<key => <Array, Arrow::Array, #to_arrow_array>>)

      a Hash that maps each `key` to an array-like of column values. Each `key` is a Symbol or a String.

  • #initialize(table) ⇒ DataFrame

    Initialize a DataFrame from an `Arrow::Table`.

    Examples:

    Initialize by a Table

    table = Arrow::Table.new(x: [1, 2, 3], y: %w[A B C])
    DataFrame.new(table)

    Parameters:

    • table (Arrow::Table)

      a table to have in the DataFrame.

  • #initialize(schema, row_oriented_array) ⇒ DataFrame

    Initialize a DataFrame by schema and row_oriented_array.

    Examples:

    Initialize by a schema and a row_oriented_array.

    schema = { x: :uint8, y: :string }
    row_oriented_array = [[1, 'A'], [2, 'B'], [3, 'C']]
    DataFrame.new(schema, row_oriented_array)

    Parameters:

    • schema (Hash<key => type>)

      a schema of key and data type.

    • row_oriented_array (Array)

      an Array of rows.

  • #initialize(arrowable) ⇒ DataFrame
    Note:

    `RedAmber::DataFrame` itself can be read this way.

    Note:

    Hash is refined to respond to `#to_arrow` in this class.

    Initialize a DataFrame from an object that responds to `#to_arrow`.

    Examples:

    Initialize by Red Dataset object.

    require 'datasets-arrow'
    dataset = Datasets::Penguins.new
    penguins = DataFrame.new(dataset)

    Parameters:

    • arrowable (#to_arrow)

      Any object which responds to `#to_arrow`. `#to_arrow` must return an `Arrow::Table`.

    Since:

    • 0.2.2

  • #initialize(rover_like) ⇒ DataFrame
    Note:

    `Rover::DataFrame` can be read this way.

    Initialize a DataFrame from a `Rover::DataFrame`-like object that responds to `#to_h`.

    Parameters:

    • rover_like (#to_h)

      Any object which responds to `#to_h`. `#to_h` must return a Hash which is convertible by `Arrow::Table.new`.

  • #initialize ⇒ DataFrame

    Creates an empty DataFrame.

    Examples:

    DataFrame.new
  • #initialize(empty) ⇒ DataFrame

    Creates an empty DataFrame from an empty argument.

    Examples:

    Return empty DataFrame.

    DataFrame.new([])
    DataFrame.new({})
    DataFrame.new(nil)

    Parameters:

    • empty (nil, [], {})


# File 'lib/red_amber/data_frame.rb', line 134

def initialize(*args)
  case args
  in nil | [nil] | [] | {} | [[]] | [{}]
    @table = Arrow::Table.new({}, [])
  in [Arrow::Table => table]
    @table = table
  in [arrowable] if arrowable.respond_to?(:to_arrow)
    table = arrowable.to_arrow
    unless table.is_a?(Arrow::Table)
      raise DataFrameTypeError,
            "to_arrow must return an Arrow::Table but #{table.class}: #{arrowable}"
    end
    @table = table
  in [rover_like] if rover_like.respond_to?(:to_h)
    begin
      # Accepts Rover::DataFrame
      @table = Arrow::Table.new(rover_like.to_h)
    rescue StandardError
      raise DataFrameTypeError, "to_h must return Arrowable object: #{rover_like}"
    end
  else
    begin
      @table = Arrow::Table.new(*args)
    rescue StandardError
      raise DataFrameTypeError, "invalid argument to create Arrow::Table: #{args}"
    end
  end

  name_unnamed_keys
  check_duplicate_keys(keys)
end
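
The dispatch above relies on Ruby pattern matching over the splatted `args`. As a minimal sketch in plain Ruby (no Arrow involved), the helper below only reports which branch a call would take. `classify_args` and `FakeArrowable` are hypothetical names used for illustration, not part of RedAmber, and unlike the real class a plain Hash is not refined here to respond to `#to_arrow`.

```ruby
# Hypothetical helper (not part of RedAmber): mirrors the order of the
# branches in a `case args` dispatch and returns a label for the branch taken.
def classify_args(*args)
  case args
  in [] | [nil] | [[]] | [{}]
    :empty
  in [source] if source.respond_to?(:to_arrow)
    :arrowable
  in [source] if source.respond_to?(:to_h)
    :hash_like
  else
    :other
  end
end

# Hypothetical stand-in for any object that exposes #to_arrow.
class FakeArrowable
  def to_arrow
    :pretend_table
  end
end

no_args   = classify_args                               # no arguments
nil_arg   = classify_args(nil)                          # a single nil
arrowable = classify_args(FakeArrowable.new)            # responds to #to_arrow
hash_like = classify_args({ x: [1, 2, 3] })             # plain Hash responds to #to_h
two_args  = classify_args({ x: :uint8 }, [[1], [2]])    # schema plus rows
```

The order of the `in` clauses matters: the empty cases are tested first, so `nil` (which also responds to `#to_h`) never reaches the hash-like branch.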

Dynamic Method Handling

This class handles dynamic methods through the method_missing method.

#method_missing(name, *args, &block) ⇒ Object

Catch variable (column) key as method name.



# File 'lib/red_amber/data_frame.rb', line 775

def method_missing(name, *args, &block)
  return variables[name] if args.empty? && key?(name)

  super
end
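
The same idea can be tried without Arrow. The sketch below is a minimal stand-in, assuming a hypothetical `TinyFrame` class that stores columns in a Hash; it pairs `method_missing` with `respond_to_missing?` so that `respond_to?` agrees with the dynamic accessors.

```ruby
# Hypothetical stand-in for a data frame; only the dynamic-accessor pattern is shown.
class TinyFrame
  def initialize(columns)
    @columns = columns.transform_keys(&:to_sym)
  end

  def key?(name)
    @columns.key?(name.to_sym)
  end

  # A column key called with no arguments returns that column;
  # everything else falls through to the default NoMethodError.
  def method_missing(name, *args, &block)
    return @columns[name] if args.empty? && key?(name)

    super
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    key?(name) || super
  end
end

frame = TinyFrame.new({ x: [1, 2, 3], 'y' => %w[A B C] })
x_column = frame.x               # the :x column
knows_y  = frame.respond_to?(:y) # true, because :y is a key
knows_z  = frame.respond_to?(:z) # false, :z is not a key
```

Calling `frame.z` raises `NoMethodError`, because the call reaches `super`.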

Instance Attribute Details

#table ⇒ Arrow::Table (readonly) Also known as: to_arrow

Returns the Arrow::Table held by this DataFrame.

Returns:

  • (Arrow::Table)

    the table within.



# File 'lib/red_amber/data_frame.rb', line 171

def table
  @table
end

Class Method Details

.create(table) ⇒ DataFrame

Note:

This method allocates the instance and sets the table directly, bypassing #initialize. It is meant to be used inside other methods.

Note:

`table` must have unique keys.

Quicker DataFrame constructor from an `Arrow::Table`.

Parameters:

  • table (Arrow::Table)

    A table to have in the DataFrame.

Returns:

  • (DataFrame)



# File 'lib/red_amber/data_frame.rb', line 31

def create(table)
  instance = allocate
  instance.instance_variable_set(:@table, table)
  instance
end
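
Because `allocate` creates an instance without running `#initialize`, any validation done there is skipped. The sketch below illustrates that trade-off with a hypothetical `CheckedFrame` class (not part of RedAmber) whose constructor rejects duplicate keys, which is why the caller of a `create`-style method has to guarantee unique keys.

```ruby
# Hypothetical class: the constructor validates, the fast path does not.
class CheckedFrame
  attr_reader :pairs

  def initialize(pairs)
    keys = pairs.map(&:first)
    raise ArgumentError, "duplicate keys: #{keys}" unless keys.uniq.size == keys.size

    @pairs = pairs
  end

  # Fast constructor: allocate skips #initialize, so no check is performed.
  def self.create(pairs)
    instance = allocate
    instance.instance_variable_set(:@pairs, pairs)
    instance
  end
end

duplicated = [[:x, [1, 2]], [:x, [3, 4]]]

unchecked = CheckedFrame.create(duplicated) # succeeds, nothing is validated

rejected =
  begin
    CheckedFrame.new(duplicated)
    false
  rescue ArgumentError
    true
  end
```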

.new_dataframe_with_schema(dataframe_for_schema, dataframe_for_value) ⇒ DataFrame

Return new DataFrame for specified schema and value.

Parameters:

  • dataframe_for_schema (DataFrame)

    the schema of this DataFrame will be used.

  • dataframe_for_value (DataFrame)

    the column values of this DataFrame will be used.

Returns:

  • (DataFrame)

Since:

  • 0.4.1



# File 'lib/red_amber/data_frame.rb', line 47

def new_dataframe_with_schema(dataframe_for_schema, dataframe_for_value)
  DataFrame.create(
    Arrow::Table.new(dataframe_for_schema.table.schema,
                     dataframe_for_value.table.columns)
  )
end

Instance Method Details

#==(other) ⇒ true, false

Compare DataFrames.

Returns:

  • (true, false)

    true if other is a DataFrame and its table is equal to the table of self. Otherwise false.



# File 'lib/red_amber/data_frame.rb', line 323

def ==(other)
  other.is_a?(DataFrame) && @table == other.table
end

#build_subframes(subset_specifier) ⇒ SubFrames
#build_subframes {|self| ... } ⇒ Object

Generic builder of sub-dataframes from self.

Experimental feature

This method may be removed or changed in the future.

Overloads:

  • #build_subframes(subset_specifier) ⇒ SubFrames

    Create a new SubFrames object.

    Examples:

    df.build_subframes([[0, 2, 4], [1, 3, 5]])
    
    # =>
    #<RedAmber::SubFrames : 0x000000000000fe9c>
    @baseframe=#<RedAmber::DataFrame : 6 x 3 Vectors, 0x000000000000fba4>
    2 SubFrames: [3, 3] in sizes.
    ---
    #<RedAmber::DataFrame : 3 x 3 Vectors, 0x000000000000feb0>
            x y        z
      <uint8> <string> <boolean>
    0       1 A        false
    1       3 B        false
    2       5 B        true
    ---
    #<RedAmber::DataFrame : 3 x 3 Vectors, 0x000000000000fec4>
            x y        z
      <uint8> <string> <boolean>
    0       2 A        true
    1       4 B        (nil)
    2       6 C        false

    Parameters:

    • subset_specifier (Array<Vector>, Array<array-like>)

      an Array of numeric indices or boolean filters to create subsets of DataFrame.

    Returns:

    • (SubFrames)

  • #build_subframes {|self| ... } ⇒ Object

    Create a new SubFrames object by block.

    Examples:

    dataframe.build_subframes do
      even = indices.map(&:even?)
      [even, !even]
    end
    
    # =>
    #<RedAmber::SubFrames : 0x000000000000fe60>
    @baseframe=#<RedAmber::DataFrame : 6 x 3 Vectors, 0x000000000000fba4>
    2 SubFrames: [3, 3] in sizes.
    ---
    #<RedAmber::DataFrame : 3 x 3 Vectors, 0x000000000000fe74>
            x y        z
      <uint8> <string> <boolean>
    0       1 A        false
    1       3 B        false
    2       5 B        true
    ---
    #<RedAmber::DataFrame : 3 x 3 Vectors, 0x000000000000fe88>
            x y        z
      <uint8> <string> <boolean>
    0       2 A        true
    1       4 B        (nil)
    2       6 C        false

    Yields:

    • (self)

      the block is called within the context of self (it is called by instance_eval(&block)).

    Yield Returns:

    • (Array<numeric_array_like>, Array<boolean_array_like>)

      an Array of index or boolean array-likes to create subsets of DataFrame. All array-likes must respond to #numeric? or #boolean?.

Since:

  • 0.4.0



# File 'lib/red_amber/data_frame.rb', line 693

def build_subframes(subset_specifier = nil, &block)
  if block
    SubFrames.new(self, instance_eval(&block))
  else
    SubFrames.new(self, subset_specifier)
  end
end

#each_row ⇒ Enumerator
#each_row {|key_row_pairs| ... } ⇒ Integer

Enumerate for each row.

Overloads:

  • #each_row ⇒ Enumerator

    Returns Enumerator when no block given.

    Returns:

    • (Enumerator)

      enumerator over the rows.

  • #each_row {|key_row_pairs| ... } ⇒ Integer

    Yields with key and row pairs.

    Yield Parameters:

    • key_row_pairs (Hash)

      key and row pairs.

    Yield Returns:

    • (Integer)

      size of the DataFrame.

    Returns:

    • (Integer)

      returns size.



# File 'lib/red_amber/data_frame.rb', line 354

def each_row
  return enum_for(:each_row) unless block_given?

  size.times do |i|
    key_row_pairs =
      vectors.each_with_object({}) do |v, h|
        h[v.key] = v.data[i]
      end
    yield key_row_pairs
  end
end
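
The row assembly can be reproduced on plain Ruby data. The sketch below assumes a hypothetical `ColumnStore` class that keeps columns in a Hash of Arrays; it follows the same shape as the method above (return an Enumerator without a block, otherwise yield one Hash per row) but does not use Arrow or RedAmber's Vector.

```ruby
# Hypothetical column-oriented container used only for illustration.
class ColumnStore
  def initialize(columns)
    @columns = columns
  end

  # Number of rows, taken from the first column (0 when there are no columns).
  def size
    first = @columns.values.first
    first ? first.size : 0
  end

  def each_row
    return enum_for(:each_row) unless block_given?

    # Integer#times returns its receiver, so the method returns the row count.
    size.times do |i|
      yield @columns.transform_values { |values| values[i] }
    end
  end
end

store = ColumnStore.new({ x: [1, 2, 3], y: %w[A B C] })

rows     = store.each_row.to_a            # one Hash per row
returned = store.each_row { |row| row }   # the row count
```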

#empty? ⇒ true, false

Check if it is an empty DataFrame.

Returns:

  • (true, false)

    true if it has no columns.



332
333
334
# File 'lib/red_amber/data_frame.rb', line 332

def empty?
  variables.empty?
end

#group(*group_keys) ⇒ Group
#group(*group_keys) {|group| ... } ⇒ DataFrame

Create a Group object. Or create a Group and summarize it.

Overloads:

  • #group(*group_keys) ⇒ Group

    Create a Group object.

    Examples:

    Create a Group

    penguins.group(:species)
    
    # =>
    #<RedAmber::Group : 0x000000000000c3c8>
      species   group_count
      <string>      <uint8>
    0 Adelie            152
    1 Chinstrap          68
    2 Gentoo            124

    Parameters:

    • group_keys (Array<Symbol, String>)

      keys for grouping.

    Returns:

    • (Group)

      Group object.

  • #group(*group_keys) {|group| ... } ⇒ DataFrame

    Create a Group and summarize it by aggregation functions from the block.

    Examples:

    Create a group and summarize it.

    penguins.group(:species) { mean(:bill_length_mm) }
    
    # =>
    #<RedAmber::DataFrame : 3 x 2 Vectors, 0x000000000000f3fc>
      species   mean(bill_length_mm)
      <string>              <double>
    0 Adelie                   38.79
    1 Chinstrap                48.83
    2 Gentoo                    47.5

    Yield Parameters:

    • group (Group)

      passes Group object.

    Yield Returns:

    • (DataFrame, Array<DataFrame>)

      an aggregated DataFrame or an array of aggregated DataFrames.

    Returns:

    • (DataFrame)



# File 'lib/red_amber/data_frame.rb', line 416

def group(*group_keys, &block)
  g = Group.new(self, group_keys)
  g = g.summarize(&block) if block
  g
end

#key?(key) ⇒ Boolean Also known as: has_key?

Returns true if self has a specified key in the argument.

Parameters:

  • key (Symbol, String)

    key to test.

Returns:

  • (Boolean)

    true if self has the key. The key is converted to a Symbol before the lookup.



# File 'lib/red_amber/data_frame.rb', line 236

def key?(key)
  keys.include?(key.to_sym)
end

#key_index(key) ⇒ Integer Also known as: find_index, index

Returns index of specified key in the Array keys.

Parameters:

  • key (Symbol, String)

    key to look up.

Returns:

  • (Integer)

    index of key in the Array keys.



# File 'lib/red_amber/data_frame.rb', line 248

def key_index(key)
  keys.find_index(key.to_sym)
end

#keys ⇒ Array Also known as: column_names, var_names

Returns an Array of keys.

Returns:

  • (Array)

    keys in an Array.



# File 'lib/red_amber/data_frame.rb', line 223

def keys
  @keys ||= init_instance_vars(:keys)
end

#n_keys ⇒ Integer Also known as: n_variables, n_vars, n_cols

Returns the number of variables (columns).

Returns:

  • (Integer)

    number of variables (columns).



# File 'lib/red_amber/data_frame.rb', line 191

def n_keys
  @table.n_columns
end

#propagate(scalar) ⇒ Vector
#propagate {|self| ... } ⇒ Vector

Returns a Vector in which every element has the value `scalar` and which has the same size as self.

Overloads:

  • #propagate(scalar) ⇒ Vector

    Specifies the scalar as an argument.

    Examples:

    propagate a value

    df
    # =>
    #<RedAmber::DataFrame : 6 x 3 Vectors, 0x00000000000849a4>
            x y        z
      <uint8> <string> <boolean>
    0       1 A        false
    1       2 A        true
    2       3 B        false
    3       4 B        (nil)
    4       5 B        true
    5       6 C        false
    
    df.assign(:sum_x) { propagate(x.sum) }
    # =>
    #<RedAmber::DataFrame : 6 x 4 Vectors, 0x000000000007bd04>
            x y        z           sum_x
      <uint8> <string> <boolean> <uint8>
    0       1 A        false          21
    1       2 A        true           21
    2       3 B        false          21
    3       4 B        (nil)          21
    4       5 B        true           21
    5       6 C        false          21
    
    # Using `Vector#propagate` like below has same result as above.
    df.assign(:sum_x) { x.propagate(:sum) }
    
    # Also it is same as creating column from an Array.
    df.assign(:sum_x) { [x.sum] * size }

    Parameters:

    • scalar (scalar)

      a value to propagate in Vector.

    Returns:

    • (Vector)

      created Vector.

  • #propagate {|self| ... } ⇒ Vector

    Returns created Vector.

    Examples:

    propagate the value from the block

    df.assign(:range) { propagate { x.max - x.min } }
    # =>
    #<RedAmber::DataFrame : 6 x 4 Vectors, 0x00000000000e603c>
            x y        z           range
      <uint8> <string> <boolean> <uint8>
    0       1 A        false           5
    1       2 A        true            5
    2       3 B        false           5
    3       4 B        (nil)           5
    4       5 B        true            5
    5       6 C        false           5

    Yield Parameters:

    • self (DataFrame)

      gives self to the block.

    Yield Returns:

    • (scalar)

      a value to propagate in Vector

    Returns:

    • (Vector)

      created Vector.

Since:

  • 0.5.0



# File 'lib/red_amber/data_frame.rb', line 765

def propagate(scalar = nil, &block)
  if block
    raise VectorArgumentError, "can't specify both function and block" if scalar

    scalar = instance_eval(&block)
  end
  Vector.new([scalar] * size)
end
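
The core of the method is the `[scalar] * size` expression. The sketch below mirrors that logic on plain Arrays; `propagate_value` is a hypothetical helper, not a RedAmber API, and it calls the block with `yield` instead of `instance_eval`, so the block has to refer to its data explicitly.

```ruby
# Hypothetical helper: repeats one value `size` times, taking the value
# either from the argument or from the block, but never from both.
def propagate_value(size, scalar = nil)
  if block_given?
    raise ArgumentError, "can't specify both a scalar and a block" unless scalar.nil?

    scalar = yield
  end
  [scalar] * size
end

x = [1, 2, 3, 4, 5, 6]

sum_x = propagate_value(x.size, x.sum)             # the sum repeated for every row
range = propagate_value(x.size) { x.max - x.min }  # the range repeated for every row
```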

#respond_to_missing?(name, include_private) ⇒ Boolean

Catch variable (column) key as method name.

Returns:

  • (Boolean)


# File 'lib/red_amber/data_frame.rb', line 782

def respond_to_missing?(name, include_private)
  return true if key?(name)

  super
end

#schema ⇒ Hash

Returns column name and data type in a Hash.

Examples:

RedAmber::DataFrame.new(x: [1, 2, 3], y: %w[A B C]).schema
# => {:x=>:uint8, :y=>:string}

Returns:

  • (Hash)

    column name and data type.



# File 'lib/red_amber/data_frame.rb', line 313

def schema
  keys.zip(types).to_h
end

#shape ⇒ Array

Returns the numbers of rows and columns.

Returns:

  • (Array)

    number of rows and number of columns in an array. Same as [size, n_keys].



# File 'lib/red_amber/data_frame.rb', line 204

def shape
  [size, n_keys]
end

#size ⇒ Integer Also known as: n_records, n_obs, n_rows

Returns the number of records (rows).

Returns:

  • (Integer)

    number of records (rows).



# File 'lib/red_amber/data_frame.rb', line 179

def size
  @table.n_rows
end

#sub_by_enum(enumerator_method, *args) ⇒ SubFrames Also known as: subframes_by_enum

Create SubFrames by grouping or windowing by position, using an enumerator method.

This method will process the indices of self by enumerator.

Experimental feature

This method may be removed or changed in the future.

Examples:

Create a SubFrames object sliced by 3 rows.

df.sub_by_enum(:each_slice, 3)

# =>
#<RedAmber::SubFrames : 0x000000000000fd20>
@baseframe=#<RedAmber::DataFrame : 6 x 3 Vectors, 0x000000000000fba4>
2 SubFrames: [3, 3] in sizes.
---
#<RedAmber::DataFrame : 3 x 3 Vectors, 0x000000000000fd34>
        x y        z
  <uint8> <string> <boolean>
0       1 A        false
1       2 A        true
2       3 B        false
---
#<RedAmber::DataFrame : 3 x 3 Vectors, 0x000000000000fd48>
        x y        z
  <uint8> <string> <boolean>
0       4 B        (nil)
1       5 B        true
2       6 C        false

Create a SubFrames object for each 4 consecutive rows.

df.sub_by_enum(:each_cons, 4)

# =>
#<RedAmber::SubFrames : 0x000000000000fd98>
@baseframe=#<RedAmber::DataFrame : 6 x 3 Vectors, 0x000000000000fba4>
3 SubFrames: [4, 4, 4] in sizes.
---
#<RedAmber::DataFrame : 4 x 3 Vectors, 0x000000000000fdac>
        x y        z
  <uint8> <string> <boolean>
0       1 A        false
1       2 A        true
2       3 B        false
3       4 B        (nil)
---
#<RedAmber::DataFrame : 4 x 3 Vectors, 0x000000000000fdc0>
        x y        z
  <uint8> <string> <boolean>
0       2 A        true
1       3 B        false
2       4 B        (nil)
3       5 B        true
---
#<RedAmber::DataFrame : 4 x 3 Vectors, 0x000000000000fdd4>
        x y        z
  <uint8> <string> <boolean>
0       3 B        false
1       4 B        (nil)
2       5 B        true
3       6 C        false

Parameters:

  • enumerator_method (Symbol)

    name of the enumerator method.

  • args (<Object>)

    arguments for the enumerator method.

Returns:

  • (SubFrames)

Since:

  • 0.4.0



# File 'lib/red_amber/data_frame.rb', line 575

def sub_by_enum(enumerator_method, *args)
  SubFrames.new(self, indices.send(enumerator_method, *args).to_a)
end
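
What the enumerator method does to the row indices can be seen with a plain Array. The sketch below assumes six rows, as in the examples above, and uses an ordinary Array of indices in place of the library's `indices`.

```ruby
# Row positions of a 6-row frame, as a plain Array (stand-in for `indices`).
indices = (0...6).to_a

# Non-overlapping groups of 3 rows.
sliced = indices.send(:each_slice, 3).to_a

# Every run of 4 consecutive rows, moving one row at a time.
consecutive = indices.send(:each_cons, 4).to_a

# The last slice is shorter when the size is not a multiple of the slice length.
uneven = indices.send(:each_slice, 4).to_a
```

Each inner Array is the list of row positions that would make up one sub-dataframe.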

#sub_by_kernel(kernel, step: 1) ⇒ SubFrames Also known as: subframes_by_kernel

Create SubFrames by windowing with a kernel (i.e. masked window) and step.

Experimental feature

This method may be removed or changed in the future.

Examples:

kernel = [true, false, false, true]
df.sub_by_kernel(kernel, step: 2)

# =>
#<RedAmber::SubFrames : 0x000000000000fde8>
@baseframe=#<RedAmber::DataFrame : 6 x 3 Vectors, 0x000000000000fba4>
2 SubFrames: [2, 2] in sizes.
---
#<RedAmber::DataFrame : 2 x 3 Vectors, 0x000000000000fdfc>
        x y        z
  <uint8> <string> <boolean>
0       1 A        false
1       4 B        (nil)
---
#<RedAmber::DataFrame : 2 x 3 Vectors, 0x000000000000fe10>
        x y        z
  <uint8> <string> <boolean>
0       3 B        false
1       6 C        false

Parameters:

  • kernel (Array<true, false>, Vector)

    boolean array-like to pick records in the window. Kernel is a boolean Array and it behaves like a masked window.

  • step (Integer) (defaults to: 1)

    moving step of window.

Returns:

  • (SubFrames)

Since:

  • 0.4.0



# File 'lib/red_amber/data_frame.rb', line 613

def sub_by_kernel(kernel, step: 1)
  limit_size = size - kernel.size
  kernel_vector = Vector.new(kernel.concat([nil] * limit_size))
  SubFrames.new(self) do
    0.step(by: step, to: limit_size).map do |i|
      kernel_vector.shift(i)
    end
  end
end
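
A masked window can be sketched as index arithmetic. The helper below is a hypothetical plain-Ruby equivalent that returns row positions instead of shifted boolean Vectors, so it illustrates the selection rather than the library's implementation.

```ruby
# Hypothetical helper: for each window offset, keep the positions where the
# kernel is true. `size` is the number of rows in the frame.
def kernel_windows(size, kernel, step: 1)
  limit = size - kernel.size
  picked = kernel.each_index.select { |k| kernel[k] }
  0.step(by: step, to: limit).map do |offset|
    picked.map { |k| offset + k }
  end
end

kernel = [true, false, false, true]

# Offsets 0 and 2 fit into 6 rows; each window keeps its first and fourth row.
windows = kernel_windows(6, kernel, step: 2)

# A kernel longer than the frame produces no windows.
too_long = kernel_windows(3, kernel)
```

With the six-row `df` from the example above, positions `[0, 3]` and `[2, 5]` correspond to the rows with x = 1, 4 and x = 3, 6.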

#sub_by_value(*keys) ⇒ SubFrames Also known as: subframes_by_value, sub_group

Create SubFrames by value grouping.

Experimental feature

This method may be removed or changed in the future.

Examples:

df.sub_by_value(:y)

# =>
#<RedAmber::SubFrames : 0x000000000000fc08>
@baseframe=#<RedAmber::DataFrame : 6 x 3 Vectors, 0x000000000000fba4>
3 SubFrames: [2, 3, 1] in sizes.
---
#<RedAmber::DataFrame : 2 x 3 Vectors, 0x000000000000fc1c>
        x y        z
  <uint8> <string> <boolean>
0       1 A        false
1       2 A        true
---
#<RedAmber::DataFrame : 3 x 3 Vectors, 0x000000000000fc30>
        x y        z
  <uint8> <string> <boolean>
0       3 B        false
1       4 B        (nil)
2       5 B        true
---
#<RedAmber::DataFrame : 1 x 3 Vectors, 0x000000000000fc44>
        x y        z
  <uint8> <string> <boolean>
0       6 C        false

Parameters:

  • keys (List<Symbol, String>, Array<Symbol, String>)

    grouping keys.

Returns:

  • (SubFrames)

    a created SubFrames grouped by column values on `keys`.

Since:

  • 0.4.0



# File 'lib/red_amber/data_frame.rb', line 457

def sub_by_value(*keys)
  SubFrames.new(self, group(keys.flatten).filters)
end
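
Grouping by value amounts to collecting the row positions that share a value. The sketch below does this on a plain Array that stands in for column `y`; note that it produces index lists, whereas the method above passes boolean filters from the Group object.

```ruby
# Stand-in for column y of the example frame.
y = %w[A A B B B C]

# Map each distinct value to the positions where it occurs,
# in order of first appearance.
positions_by_value = y.each_index.group_by { |i| y[i] }

groups = positions_by_value.values   # row positions of each group
sizes  = groups.map(&:size)          # group sizes
```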

#sub_by_window(from: 0, size: nil, step: 1) ⇒ SubFrames Also known as: subframes_by_window

Create SubFrames by windowing with `from`, `size` and `step`.

Experimental feature

This method may be removed or changed in the future.

Examples:

df.sub_by_window(size: 4, step: 2)

# =>
#<RedAmber::SubFrames : 0x000000000000fc58>
@baseframe=#<RedAmber::DataFrame : 6 x 3 Vectors, 0x000000000000fba4>
2 SubFrames: [4, 4] in sizes.
---
#<RedAmber::DataFrame : 4 x 3 Vectors, 0x000000000000fc6c>
        x y        z
  <uint8> <string> <boolean>
0       1 A        false
1       2 A        true
2       3 B        false
3       4 B        (nil)
---
#<RedAmber::DataFrame : 4 x 3 Vectors, 0x000000000000fc80>
        x y        z
  <uint8> <string> <boolean>
0       3 B        false
1       4 B        (nil)
2       5 B        true
3       6 C        false

Parameters:

  • from (Integer) (defaults to: 0)

    start position of window.

  • size (Integer) (defaults to: nil)

    window size.

  • step (Integer) (defaults to: 1)

    moving step of window.

Returns:

  • (SubFrames)

Since:

  • 0.4.0



# File 'lib/red_amber/data_frame.rb', line 500

def sub_by_window(from: 0, size: nil, step: 1)
  SubFrames.new(self) do
    from.step(by: step, to: (size() - size)).map do |i| # rubocop:disable Style/MethodCallWithoutArgsParentheses
      [*i...(i + size)]
    end
  end
end
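
The window positions come from a simple arithmetic sequence. The sketch below repeats that computation outside the class; `window_indices` is a hypothetical helper and `total` stands for the number of rows in the frame.

```ruby
# Hypothetical helper: start positions run from `from` up to the last
# position where a full window still fits, in increments of `step`.
def window_indices(total, size:, from: 0, step: 1)
  from.step(by: step, to: total - size).map do |start|
    [*start...(start + size)]
  end
end

# Same parameters as the example above, for a 6-row frame.
windows = window_indices(6, size: 4, step: 2)

# Starting later leaves room for fewer windows.
shifted = window_indices(6, size: 4, from: 1, step: 2)
```

Only complete windows are produced, so trailing rows that cannot fill a window are left out.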

#to_a ⇒ Array Also known as: raw_records

Note:

If you need a column-oriented array, use `.to_h.to_a`.

Returns a row-oriented array without header.

Returns:

  • (Array)

    row-oriented data without header.



# File 'lib/red_amber/data_frame.rb', line 299

def to_a
  @table.raw_records
end
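
The difference between the two orientations can be shown with a plain Hash of columns. This is only an illustration of the shapes involved, assuming the same data as the constructor examples; it does not call Arrow.

```ruby
# Column-oriented data, in the shape that a column Hash would have.
columns = { x: [1, 2, 3], y: %w[A B C] }

# Row-oriented, no header: one inner Array per record.
row_oriented = columns.values.transpose

# Column-oriented: one [key, values] pair per column.
column_oriented = columns.to_a
```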

#to_h ⇒ Hash

Returns column-oriented data in a Hash.

Returns:

  • (Hash)

    a Hash of `key => column_in_an_array`.



# File 'lib/red_amber/data_frame.rb', line 288

def to_h
  variables.transform_values(&:to_a)
end

#to_rover ⇒ Rover::DataFrame

Returns self as a `Rover::DataFrame`.

Returns:

  • (Rover::DataFrame)

    a `Rover::DataFrame`.



# File 'lib/red_amber/data_frame.rb', line 371

def to_rover
  require 'rover'
  Rover::DataFrame.new(to_h)
end

#type_classes ⇒ Array

Returns an Array of Classes of data type.

Returns:

  • (Array)

    an Array of Red Arrow data type Classes.



# File 'lib/red_amber/data_frame.rb', line 270

def type_classes
  @type_classes ||= @table.columns.map { |column| column.data_type.class }
end

#types ⇒ Array

Returns abbreviated type names in an Array.

Returns:

  • (Array)

    abbreviated Red Arrow data type names.



# File 'lib/red_amber/data_frame.rb', line 259

def types
  @types ||= @table.columns.map do |column|
    column.data.value_type.nick.to_sym
  end
end

#variables ⇒ Hash Also known as: vars

Returns a Hash of key and Vector pairs in the columns.

Returns:

  • (Hash)

    `key => Vector` pairs for each column.



# File 'lib/red_amber/data_frame.rb', line 213

def variables
  @variables ||= init_instance_vars(:variables)
end

#vectors ⇒ Array

Returns Vectors in an Array.

Returns:

  • (Array)

    an Array of Vector.



# File 'lib/red_amber/data_frame.rb', line 279

def vectors
  @vectors ||= init_instance_vars(:vectors)
end