Module: RedAmber::DataFrameLoadSave

Included in:
DataFrame
Defined in:
lib/red_amber/data_frame_loadsave.rb

Overview

Mix-in for the class DataFrame

Defined Under Namespace

Modules: ClassMethods

Class Method Details

.included(klass) ⇒ Object

Enables `self.load` as a class method of DataFrame.



# File 'lib/red_amber/data_frame_loadsave.rb', line 7

def self.included(klass)
  klass.extend ClassMethods
end
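
The hook above follows a common Ruby mix-in idiom: when the module is included, the including class is also extended with a nested ClassMethods module, so one `include` supplies both instance and class methods. A minimal sketch of that idiom, using the hypothetical names Loadable and Frame (not part of RedAmber):

```ruby
# Loadable and Frame are hypothetical names used only for illustration.
module Loadable
  # Methods defined here become class methods of the including class.
  module ClassMethods
    def load(source)
      new(source)
    end
  end

  # Ruby calls this hook each time the module is included in a class.
  def self.included(klass)
    klass.extend ClassMethods
  end

  # Ordinary methods of the mix-in become instance methods.
  def describe
    "loaded from #{source}"
  end
end

class Frame
  include Loadable

  attr_reader :source

  def initialize(source)
    @source = source
  end
end

frame = Frame.load("file.csv") # class method supplied by ClassMethods
puts frame.describe            # instance method supplied by Loadable
```

This is presumably why DataFrame.load is callable on the class while #save and #auto_cast are called on instances.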

Instance Method Details

#auto_cast(format: :tsv) ⇒ DataFrame

Note:

experimental feature

Saves the DataFrame to a temporary buffer and reloads it, so that column types are cast automatically on reload. TSV is the default intermediate format.

Parameters:

  • format (Symbol) (defaults to: :tsv)

    format specifier.

Returns:

  • (DataFrame)

    self if the DataFrame is empty, otherwise the reloaded DataFrame.



# File 'lib/red_amber/data_frame_loadsave.rb', line 99

def auto_cast(format: :tsv)
  return self if empty?

  buffer = Arrow::ResizableBuffer.new(1024)
  save(buffer, format: format)
  DataFrame.load(buffer, format: format)
end
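
The idea behind the round trip can be sketched without RedAmber: values are written out as text and the reader infers a type for each one as it reads them back. The following is a plain-Ruby illustration only. The helpers to_tsv, cast_value and from_tsv are hypothetical, and the real type inference is done by the loader behind DataFrame.load, whose rules may differ.

```ruby
# Illustrative only: a plain-Ruby round trip through TSV text.
# to_tsv, cast_value and from_tsv are hypothetical helpers, not RedAmber API.

# Serialize a Hash of column name => Array of values into TSV text.
def to_tsv(columns)
  rows = columns.values.transpose
  ([columns.keys] + rows).map { |row| row.join("\t") }.join("\n")
end

# Try Integer first, then Float, and fall back to the original String.
def cast_value(text)
  Integer(text, exception: false) || Float(text, exception: false) || text
end

# Parse TSV text back into a Hash of columns, casting each cell.
def from_tsv(text)
  header, *rows = text.split("\n").map { |line| line.split("\t") }
  values = rows.map { |row| row.map { |cell| cast_value(cell) } }
  header.map(&:to_sym).zip(values.transpose).to_h
end

# Every column starts out as strings.
raw = { id: %w[1 2 3], score: %w[1.5 2.0 3.25], name: %w[a b c] }

casted = from_tsv(to_tsv(raw))
p casted
```

After the round trip the id column holds Integers, score holds Floats, and name stays as Strings, which is the kind of result #auto_cast aims for on a DataFrame.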

#save(output, format: nil, compression: nil, schema: nil, skip_lines: nil) ⇒ DataFrame

Saves the DataFrame to a file or buffer.

Unless format is specified, the format is detected automatically from the file extension.

Examples:

Save a csv file

dataframe.save("file.csv")

Save a csv.gz file

dataframe.save("file.csv.gz")

Save an arrow file

dataframe.save("file.arrow")
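
Extension-based detection can be pictured with the sketch below. It is an assumption-laden illustration, not the detection code used by RedAmber or Red Arrow: detect_format is a hypothetical helper, and the only rules it applies are "a trailing .gz means gzip" and "the remaining extension names the format".

```ruby
# detect_format is a hypothetical helper written for illustration;
# the actual detection rules of the underlying library may differ.
def detect_format(path)
  base = File.basename(path.to_s)
  compression = nil

  # Treat a trailing ".gz" as gzip compression and strip it.
  if base.end_with?(".gz")
    compression = :gzip
    base = base.delete_suffix(".gz")
  end

  # Use what is left of the extension as the format name.
  ext = File.extname(base).delete_prefix(".")
  { format: ext.empty? ? nil : ext.to_sym, compression: compression }
end

p detect_format("file.csv")
p detect_format("file.csv.gz")
p detect_format("file.arrow")
```

Passing format: or compression: explicitly, as listed in the parameters, would make such guessing unnecessary.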

Parameters:

  • output (path)

    output path. A buffer such as Arrow::ResizableBuffer can also be passed, as #auto_cast does.

  • format (:arrow_file, :batch, :arrows, :arrow_stream, :stream, :csv, :tsv) (defaults to: nil)

    format specifier.

  • compression (:gzip, nil) (defaults to: nil)

    compression type.

  • schema (Arrow::Schema) (defaults to: nil)

    schema of table.

  • skip_lines (Regexp) (defaults to: nil)

    pattern of rows to skip.

Returns:

  • (DataFrame)

    self.



# File 'lib/red_amber/data_frame_loadsave.rb', line 85

def save(output, **options)
  @table.save(output, options)
  self
end
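
Two details of the method body are worth a sketch: keyword arguments are collected into a Hash and handed on as a single positional argument, and returning self lets calls be chained. The classes FakeTable and ChainableFrame below are hypothetical stand-ins used only to make the example runnable without Red Arrow.

```ruby
# FakeTable and ChainableFrame are hypothetical stand-ins,
# not RedAmber or Red Arrow classes.
class FakeTable
  attr_reader :calls

  def initialize
    @calls = []
  end

  # Receives the options as one positional Hash and records the call.
  def save(output, options = {})
    @calls << [output, options]
  end
end

class ChainableFrame
  attr_reader :table

  def initialize(table)
    @table = table
  end

  # Same shape as the method above: gather keywords, forward, return self.
  def save(output, **options)
    @table.save(output, options)
    self
  end
end

frame = ChainableFrame.new(FakeTable.new)

# Because save returns self, a second save can be chained onto the first.
result = frame.save("a.csv").save("b.tsv.gz", format: :tsv, compression: :gzip)

p frame.table.calls
```

With this shape, one DataFrame can be written to several outputs in a single expression, and any option not understood by the wrapper is left for the underlying table to handle.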