Module: RedAmber::VectorStringFunction

Included in:
Vector
Defined in:
lib/red_amber/vector_string_function.rb

Overview

Mix-in for class Vector

Methods for string-related functions.

Instance Method Details

#count_substring(string, ignore_case: nil) ⇒ Vector #count_substring(regexp, ignore_case: nil) ⇒ Vector

For each string in self, count occurrences of the given pattern.

Overloads:

  • #count_substring(string, ignore_case: nil) ⇒ Vector

    Count occurrences of `string`.

    Examples:

    Count with string.

    vector2 = Vector.new('amber', 'Amazon', 'banana', nil)
    vector2.count_substring('an')
    # =>
    #<RedAmber::Vector(:int32, size=4):0x000000000003db30>
    [0, 0, 2, nil]

    Parameters:

    • string (String)

      string pattern to count.

    • ignore_case (boolean) (defaults to: nil)

      switch whether to ignore case. Ignore case if true.

    Returns:

    • (Vector)

      int32 or int64 Vector showing the counts for the given pattern. nil inputs emit nil.

  • #count_substring(regexp, ignore_case: nil) ⇒ Vector

    Count substrings matching `regexp`. It calls the Arrow compute function `count_substring_regex`, which uses the re2 library.

    Examples:

    Count with regexp with case ignored.

    vector2.count_substring(/a[mn]/i)
    # =>
    #<RedAmber::Vector(:int32, size=4):0x0000000000051298>
    [1, 1, 2, nil]
    # same result as `vector2.count_substring(/a[mn]/, ignore_case: true)`

    Parameters:

    • regexp (Regexp)

      regular expression pattern to count. A Ruby Regexp is given, and its source is passed to Arrow's kernel.

    • ignore_case (boolean) (defaults to: nil)

      switch whether to ignore case. Ignore case if true. When `ignore_case` is false, the casefolding option in the regexp is prioritized.

    Returns:

    • (Vector)

      int32 or int64 Vector showing the counts for the given pattern. nil inputs emit nil.

Since:

  • 0.5.0



# File 'lib/red_amber/vector_string_function.rb', line 195

def count_substring(pattern, ignore_case: nil)
  options = Arrow::MatchSubstringOptions.new
  datum =
    case pattern
    when String
      options.ignore_case = (ignore_case || false)
      options.pattern = pattern
      find(:count_substring).execute([data], options)
    when Regexp
      options.ignore_case = (pattern.casefold? || ignore_case || false)
      options.pattern = pattern.source
      find(:count_substring_regex).execute([data], options)
    else
      message =
        "pattern must be either String or Regexp: #{pattern.inspect}"
      raise VectorArgumentError, message
    end
  Vector.create(datum.value)
end
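
As a rough illustration of the behavior documented above, the following pure-Ruby sketch models `count_substring` on a plain Array rather than on Arrow data. The method name `count_substring_model` is hypothetical, `String#scan` with Ruby's Regexp stands in for the Arrow kernels (which use re2, whose syntax differs in places), and counting non-overlapping matches is an assumption of this sketch.

```ruby
# Pure-Ruby model of count_substring on a plain Array.
# This is a sketch for illustration, not the Arrow kernel.
def count_substring_model(strings, pattern, ignore_case: nil)
  regexp =
    case pattern
    when String
      # Literal pattern: escape it so metacharacters match themselves.
      Regexp.new(Regexp.escape(pattern), ignore_case ? Regexp::IGNORECASE : 0)
    when Regexp
      # Same rule as the listing: /i or ignore_case: true enables case folding.
      fold = pattern.casefold? || ignore_case || false
      Regexp.new(pattern.source, fold ? Regexp::IGNORECASE : 0)
    else
      raise ArgumentError,
            "pattern must be either String or Regexp: #{pattern.inspect}"
    end
  # nil elements stay nil; other elements get the number of matches.
  strings.map { |s| s&.scan(regexp)&.size }
end

words = ['amber', 'Amazon', 'banana', nil]
p count_substring_model(words, 'an')                        # => [0, 0, 2, nil]
p count_substring_model(words, /a[mn]/i)                    # => [1, 1, 2, nil]
p count_substring_model(words, /a[mn]/, ignore_case: true)  # => [1, 1, 2, nil]
```

The three calls reproduce the values shown in the examples above, including the nil element passing through unchanged.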

#end_with(string, ignore_case: nil) ⇒ Vector Also known as: end_with?

Check if elements in self end with a literal pattern.

Examples:

Check with end_with?.

vector = Vector.new('array', 'Arrow', 'carrot', nil, 'window')
vector.end_with('ow')
# =>
#<RedAmber::Vector(:boolean, size=5):0x00000000000108ec>
[false, true, false, nil, true]

Parameters:

  • string (String)

    string pattern to match.

  • ignore_case (boolean) (defaults to: nil)

    switch whether to ignore case. Ignore case if true.

Returns:

  • (Vector)

    boolean Vector to show if elements end with a given pattern. nil inputs emit nil.

Since:

  • 0.5.0



# File 'lib/red_amber/vector_string_function.rb', line 89

def end_with(string, ignore_case: nil)
  options = Arrow::MatchSubstringOptions.new
  options.ignore_case = (ignore_case || false)
  options.pattern = string
  datum = find(:ends_with).execute([data], options)
  Vector.create(datum.value)
end
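
The suffix check can be pictured with a small pure-Ruby sketch over a plain Array. The name `end_with_model` is hypothetical, and using `String#downcase` to approximate Arrow's case-insensitive matching is an assumption of this sketch.

```ruby
# Pure-Ruby model of end_with on a plain Array (illustrative only).
def end_with_model(strings, suffix, ignore_case: nil)
  strings.map do |s|
    next nil if s.nil? # nil inputs emit nil

    if ignore_case
      # Approximate case-insensitive matching by lowercasing both sides.
      s.downcase.end_with?(suffix.downcase)
    else
      s.end_with?(suffix)
    end
  end
end

words = ['array', 'Arrow', 'carrot', nil, 'window']
p end_with_model(words, 'ow')                     # => [false, true, false, nil, true]
p end_with_model(words, 'OW')                     # => [false, false, false, nil, false]
p end_with_model(words, 'OW', ignore_case: true)  # => [false, true, false, nil, true]
```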

#find_substring(string, ignore_case: nil) ⇒ Vector #find_substring(regexp, ignore_case: nil) ⇒ Vector

Find first occurrence of substring in string Vector.

Overloads:

  • #find_substring(string, ignore_case: nil) ⇒ Vector

    Emit the index in bytes of the first occurrence of the given literal pattern, or -1 if not found.

    Examples:

    Match with string.

    vector = Vector['array', 'Arrow', 'carrot', nil, 'window']
    vector.find_substring('arr')
    # =>
    #<RedAmber::Vector(:int32, size=5):0x00000000000161e8>
    [0, -1, 1, nil, -1]

    Parameters:

    • string (String)

      string pattern to match.

    • ignore_case (boolean) (defaults to: nil)

      switch whether to ignore case. Ignore case if true.

    Returns:

    • (Vector)

      index Vector of occurrences. nil inputs emit nil.

  • #find_substring(regexp, ignore_case: nil) ⇒ Vector

    Emit the index in bytes of the first occurrence of the given regexp pattern, or -1 if not found.

    It calls the Arrow compute function `find_substring_regex`, which uses the re2 library.

    Examples:

    Match with regexp.

    vector.find_substring(/arr/i)
    # or vector.find_substring(/arr/, ignore_case: true)
    # =>
    #<RedAmber::Vector(:int32, size=5):0x000000000001b74c>
    [0, 0, 1, nil, -1]

    Parameters:

    • regexp (Regexp)

      regular expression pattern to match. A Ruby Regexp is given, and its source is passed to Arrow's kernel.

    • ignore_case (boolean) (defaults to: nil)

      switch whether to ignore case. Ignore case if true. When `ignore_case` is false, the casefolding option in the regexp is prioritized.

    Returns:

    • (Vector)

      index Vector of occurrences. nil inputs emit nil.

Since:

  • 0.5.1



# File 'lib/red_amber/vector_string_function.rb', line 259

def find_substring(pattern, ignore_case: nil)
  options = Arrow::MatchSubstringOptions.new
  datum =
    case pattern
    when String
      options.ignore_case = (ignore_case || false)
      options.pattern = pattern
      find(:find_substring).execute([data], options)
    when Regexp
      options.ignore_case = (pattern.casefold? || ignore_case || false)
      options.pattern = pattern.source
      find(:find_substring_regex).execute([data], options)
    else
      message =
        "pattern must be either String or Regexp: #{pattern.inspect}"
      raise VectorArgumentError, message
    end
  Vector.create(datum.value)
end
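
Because the result is an index in bytes rather than in characters, multibyte text shifts the reported position. The sketch below models this for literal patterns on a plain Array; `byte_index_model` is a hypothetical name, and the conversion from character offset to byte offset is done with plain Ruby String methods, not with Arrow.

```ruby
# Pure-Ruby model of the byte index reported by find_substring
# for a literal pattern (illustrative only).
def byte_index_model(strings, literal)
  strings.map do |s|
    next nil if s.nil? # nil inputs emit nil

    char_index = s.index(literal)
    # Convert the character offset to a byte offset; -1 means "not found".
    char_index.nil? ? -1 : s[0, char_index].bytesize
  end
end

p byte_index_model(['array', 'Arrow', 'carrot', nil, 'window'], 'arr')
# => [0, -1, 1, nil, -1]

# "caf\u00e9 arr" has 'arr' at character 5, but the accented letter
# occupies two bytes in UTF-8, so the byte index is 6.
p byte_index_model(["caf\u00e9 arr"], 'arr')
# => [6]
```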

#match_like(string, ignore_case: nil) ⇒ Vector Also known as: match_like?

Match elements of self against SQL-style LIKE pattern.

For each element, emit true if it matches the given pattern.
'%' will match any number of characters,
'_' will match exactly one character,
and any other character matches itself.
To match a literal '%', '_', or '\', precede the character with a backslash.

Examples:

Check with match_like?.

vector = Vector.new('array', 'Arrow', 'carrot', nil, 'window')
vector.match_like('_rr%')
# =>

Parameters:

  • string (String)

    string pattern to match.

  • ignore_case (boolean) (defaults to: nil)

    switch whether to ignore case. Ignore case if true.

Returns:

  • (Vector)

    boolean Vector to show if elements match a given pattern. nil inputs emit nil.

Since:

  • 0.5.0



# File 'lib/red_amber/vector_string_function.rb', line 144

def match_like(string, ignore_case: nil)
  options = Arrow::MatchSubstringOptions.new
  options.ignore_case = (ignore_case || false)
  options.pattern = string
  datum = find(:match_like).execute([data], options)
  Vector.create(datum.value)
end
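
The wildcard rules described above can be sketched by translating a LIKE pattern into a Ruby Regexp. This is a minimal model, not the Arrow implementation: the names `like_to_regexp` and `match_like_model` are hypothetical, and anchoring the pattern to the whole element follows the usual SQL LIKE semantics, which is an assumption here.

```ruby
# Translate a SQL-style LIKE pattern into an anchored Ruby Regexp.
# Illustrative sketch only; whole-element anchoring is assumed.
def like_to_regexp(pattern, ignore_case: nil)
  body = +''
  escaped = false
  pattern.each_char do |c|
    if escaped
      body << Regexp.escape(c) # character after a backslash is literal
      escaped = false
    elsif c == '\\'
      escaped = true
    elsif c == '%'
      body << '.*'             # any number of characters
    elsif c == '_'
      body << '.'              # exactly one character
    else
      body << Regexp.escape(c) # any other character matches itself
    end
  end
  body << Regexp.escape('\\') if escaped # trailing lone backslash
  flags = Regexp::MULTILINE | (ignore_case ? Regexp::IGNORECASE : 0)
  Regexp.new("\\A#{body}\\z", flags)
end

def match_like_model(strings, pattern, ignore_case: nil)
  regexp = like_to_regexp(pattern, ignore_case: ignore_case)
  strings.map { |s| s.nil? ? nil : regexp.match?(s) }
end

words = ['array', 'Arrow', 'carrot', nil, 'window']
p match_like_model(words, '_rr%')               # => [true, true, false, nil, false]
p match_like_model(['100%', '100x'], '100\\%')  # => [true, false]
```

In this model `'_rr%'` requires exactly one character before `rr`, so `'carrot'` does not match even though it contains `rr`.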

#match_substring(string, ignore_case: nil) ⇒ Vector #match_substring(regexp, ignore_case: nil) ⇒ Vector Also known as: match_substring?

For each string in self, emit true if it contains a given pattern.

Overloads:

  • #match_substring(string, ignore_case: nil) ⇒ Vector

    Emit true if it contains `string`.

    Examples:

    Match with string.

    vector = Vector.new('array', 'Arrow', 'carrot', nil, 'window')
    vector.match_substring('arr')
    # =>
    #<RedAmber::Vector(:boolean, size=5):0x000000000005a208>
    [true, false, true, nil, false]

    Parameters:

    • string (String)

      string pattern to match.

    • ignore_case (boolean) (defaults to: nil)

      switch whether to ignore case. Ignore case if true.

    Returns:

    • (Vector)

      boolean Vector to show if elements contain a given pattern. nil inputs emit nil.

  • #match_substring(regexp, ignore_case: nil) ⇒ Vector

    Emit true if it contains a substring matching `regexp`. It calls the Arrow compute function `match_substring_regex`, which uses the re2 library.

    Examples:

    Match with regexp.

    vector.match_substring(/arr/)
    # =>
    #<RedAmber::Vector(:boolean, size=5):0x0000000000014b68>
    [true, false, true, nil, false]

    Parameters:

    • regexp (Regexp)

      regular expression pattern to match. A Ruby Regexp is given, and its source is passed to Arrow's kernel.

    • ignore_case (boolean) (defaults to: nil)

      switch whether to ignore case. Ignore case if true. When `ignore_case` is false, the casefolding option in the regexp is prioritized.

    Returns:

    • (Vector)

      boolean Vector to show if elements contain a given pattern. nil inputs emit nil.

Since:

  • 0.5.0



# File 'lib/red_amber/vector_string_function.rb', line 51

def match_substring(pattern, ignore_case: nil)
  options = Arrow::MatchSubstringOptions.new
  datum =
    case pattern
    when String
      options.ignore_case = (ignore_case || false)
      options.pattern = pattern
      find(:match_substring).execute([data], options)
    when Regexp
      options.ignore_case = (pattern.casefold? || ignore_case || false)
      options.pattern = pattern.source
      find(:match_substring_regex).execute([data], options)
    else
      message =
        "pattern must be either String or Regexp: #{pattern.inspect}"
      raise VectorArgumentError, message
    end
  Vector.create(datum.value)
end
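
The way the listing combines the regexp's own `/i` flag with the `ignore_case:` keyword can be checked in isolation. The helper name `effective_ignore_case` is hypothetical; its body copies the two expressions assigned to `options.ignore_case` in the listing above.

```ruby
# Reproduce how the listing resolves the effective ignore_case value.
# Hypothetical helper for illustration; not part of RedAmber.
def effective_ignore_case(pattern, ignore_case)
  case pattern
  when String then ignore_case || false
  when Regexp then pattern.casefold? || ignore_case || false
  end
end

[nil, false, true].each do |flag|
  puts format('ignore_case: %-5s  /arr/ => %-5s  /arr/i => %s',
              flag.inspect,
              effective_ignore_case(/arr/, flag),
              effective_ignore_case(/arr/i, flag))
end
# ignore_case: nil    /arr/ => false  /arr/i => true
# ignore_case: false  /arr/ => false  /arr/i => true
# ignore_case: true   /arr/ => true   /arr/i => true
```

The table shows that an `/i` flag on the regexp always wins, so `ignore_case: false` cannot turn case folding off for such a pattern.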

#start_with(string, ignore_case: nil) ⇒ Vector Also known as: start_with?

Check if elements in self start with a literal pattern.

Examples:

Check with start_with?.

vector = Vector.new('array', 'Arrow', 'carrot', nil, 'window')
vector.start_with('arr')
# =>
#<RedAmber::Vector(:boolean, size=5):0x00000000000193fc>
[true, false, false, nil, false]

Parameters:

  • string (String)

    string pattern to match.

  • ignore_case (boolean) (defaults to: nil)

    switch whether to ignore case. Ignore case if true.

Returns:

  • (Vector)

    boolean Vector to show if elements start with a given pattern. nil inputs emit nil.

Since:

  • 0.5.0



# File 'lib/red_amber/vector_string_function.rb', line 115

def start_with(string, ignore_case: nil)
  options = Arrow::MatchSubstringOptions.new
  options.ignore_case = (ignore_case || false)
  options.pattern = string
  datum = find(:starts_with).execute([data], options)
  Vector.create(datum.value)
end