datamancer/column


Types

ColKind = enum
  colNone, colFloat, colInt, colBool, colString, colObject, colConstant
Column = ref object
  len*: int
  case kind*: ColKind
  of colFloat:
    fCol*: Tensor[float]
  of colInt:
    iCol*: Tensor[int]
  of colBool:
    bCol*: Tensor[bool]
  of colString:
    sCol*: Tensor[string]
  of colObject:
    oCol*: Tensor[Value]
  of colConstant:
    cCol*: Value
  of colNone:
    nil
  
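A minimal sketch of how the kind field relates to the stored data, assuming import datamancer and the toColumn and [] procs listed below:

import datamancer

let c = toColumn @[1.5, 2.5]   # float input selects colFloat, data lives in fCol
doAssert c.kind == colFloat
doAssert c.len == 2
echo c[1, float]               # 2.5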

Procs

proc `[]`(c: Column; slice: Slice[int]): Column {....raises: [], tags: [].}
proc `[]`[T](c: Column; idx: int; dtype: typedesc[T]): T
proc `[]=`(c: var Column; slice: Slice[int]; col: Column) {.
    ...raises: [Exception, ValueError, KeyError], tags: [RootEffect].}
proc `[]=`[T](c: var Column; idx: int; val: T)
Assigns val to column c at index idx. If the types match, it just calls []= on the underlying tensor. If they are compatible, val is converted to c's type. If they are incompatible, c will be rewritten to an object column.
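A minimal sketch of this behaviour, assuming import datamancer and the toColumn proc listed below:

import datamancer

var c = toColumn @[1, 2, 3]   # colInt column
c[0] = 5                      # same type: assigned directly on the underlying tensor
c[1] = "two"                  # incompatible type: c is rewritten to an object column
doAssert c.kind == colObject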
proc `[]=`[T](c: var Column; slice: Slice[int]; t: Tensor[T])

Assigns the tensor t to the slice slice. The slice length must match the tensor length exactly and must be smaller than the column length.

If the type of t does not match the column kind, we reallocate to an object column.

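A minimal sketch, additionally assuming arraymancer's toTensor for seqs:

import datamancer, arraymancer

var c = toColumn @[10, 20, 30, 40]
c[1 .. 2] = @[21, 31].toTensor   # slice length (2) matches the tensor length
doAssert c[2, int] == 31         # same dtype, so the column kind is unchanged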
proc add(c1, c2: Column): Column {....raises: [ValueError, KeyError], tags: [].}
Adds column c2 to c1. Uses concat internally.
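A minimal sketch:

import datamancer

let a = toColumn @[1, 2]
let b = toColumn @[3, 4]
let ab = add(a, b)         # concatenation of both columns
doAssert ab.len == 4
doAssert ab[3, int] == 4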
proc asValue[T](t: Tensor[T]): Tensor[Value] {.noinit.}
Applies the type conversion over the whole tensor.
proc clone(c: Column): Column {....raises: [ValueError], tags: [].}
Clones the given column by cloning the underlying Tensor.
proc combinedColKind(c: seq[Column]): ColKind {....raises: [], tags: [].}
proc compatibleColumns(c1, c2: Column): bool {.inline, ...raises: [], tags: [].}
proc constantColumn[T](val: T; len: int): Column
Creates a constant column based on val and its type.
proc constantToFull(c: Column): Column {....raises: [], tags: [].}
Creates a real (full tensor) column from a constant column.
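A minimal sketch of both procs; that the materialized column ends up as colFloat is an assumption based on the wrapped float value:

import datamancer

let c = constantColumn(1.5, 3)   # length 3, the value 1.5 is stored only once
doAssert c.isConstant
doAssert c.len == 3
let full = constantToFull(c)     # materializes a full tensor column
doAssert full.kind == colFloat   # assumed: native kind of the constant float value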
proc contains[T: float | string | int | bool | Value](c: Column; val: T): bool
proc equal(c1: Column; idx1: int; c2: Column; idx2: int): bool {.
    ...raises: [ValueError, Exception, KeyError], tags: [RootEffect].}
Checks if the value in c1 at idx1 is equal to the value in c2 at idx2.
func high(c: Column): int {....raises: [], tags: [].}
func isConstant(c: Column): bool {....raises: [], tags: [].}
proc lag(c: Column; n = 1): Column {....raises: [ValueError, Exception],
                                     tags: [RootEffect].}
Overload of the Tensor version of lag below for columns.
proc lag[T](t: Tensor[T]; n = 1; fill: T = default(T)): Tensor[T] {.noinit.}

Lags the input tensor by n, i.e. returns a shifted tensor such that it lags behind t:

lag([1, 2, 3], 1) => [null, 1, 2]

NOTE: The null value is filled with default(T) by default! Use the fill argument to change the value that is inserted.

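A minimal sketch for the Tensor overload, assuming arraymancer's toTensor for seqs:

import datamancer, arraymancer

let t = @[1, 2, 3].toTensor
echo lag(t)                 # values 0, 1, 2: the "null" slot is default(int) = 0
echo lag(t, 1, fill = -1)   # values -1, 1, 2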
proc lead(c: Column; n = 1): Column {....raises: [ValueError, Exception],
                                      tags: [RootEffect].}
Overload of the Tensor version of lead below for columns.
proc lead[T](t: Tensor[T]; n = 1; fill: T = default(T)): Tensor[T] {.noinit.}

Leads the input tensor by n, i.e. returns a shifted tensor such that it leads ahead of t:

lead([1, 2, 3], 1) => [2, 3, null]

NOTE: The null value is filled with default(T) by default! Use the fill argument to change the value that is inserted.

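A minimal sketch for the Tensor overload:

import datamancer, arraymancer

let t = @[1, 2, 3].toTensor
echo lead(t)                 # values 2, 3, 0: the "null" slot is default(int) = 0
echo lead(t, 1, fill = -1)   # values 2, 3, -1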
proc map[T; U](c: Column; fn: (T -> U)): Column

Maps the given column to a new column using fn. Because Column is a variant type, an untyped mapping function won't compile.

See the map_inline template below, which attempts to work around this limitation by compiling all map function bodies that are valid for c.

c.map((x: int) => x * 5)

Using this is not really recommended. Use df["x", int].map(x => x * 5) instead!

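A minimal sketch, using std/sugar for the => lambda syntax:

import datamancer, std/sugar

let c = toColumn @[1, 2, 3]
let scaled = c.map((x: int) => x * 5)   # typed lambda, so the result type is known
doAssert scaled[0, int] == 5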
proc max(c: Column): Value {....raises: [ValueError, Exception, KeyError],
                             tags: [RootEffect].}
proc nativeColKind(col: Column): ColKind {....raises: [], tags: [].}
Returns the native column kind, i.e. the column kind of the native data stored in the column, including for constant columns (hence the native kind is not necessarily equal to the kind field of the column!).
proc newColumn(kind = colNone; length = 0): Column {....raises: [], tags: [].}
proc nullColumn(num: int): Column {....raises: [], tags: [].}
Returns an object Column with num values, which are all VNull.
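A minimal sketch:

import datamancer

var c = nullColumn(2)   # object column holding 2 VNull values
doAssert c.kind == colObject
doAssert c.len == 2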
proc pretty(c: Column): string {....raises: [ValueError], tags: [].}
Pretty prints a Column.
proc toColKind(vKind: ValueKind): ColKind {....raises: [], tags: [].}
proc toColKind[T](dtype: typedesc[T]): ColKind
proc toColumn[T: SomeFloat | SomeInteger | string | bool | Value](
    s: openArray[T]): Column
proc toColumn[T: SomeFloat | SomeInteger | string | bool | Value](t: Tensor[T]): Column
proc toColumn[T: SomeFloat | SomeInteger | string | bool | Value](x: T): Column
proc toNativeColumn(c: Column; failIfImpossible: static bool = true): Column

Attempts to convert the given column from colObject to its native type, if possible. This is mainly useful after removal of null values. If it fails (e.g. floats and strings in one column), the result stays a colObject.

In the default case, failIfImpossible = true, this procedure will fail with an AssertionDefect if the column contains multiple data types. This can be disabled so that, at worst, the input is returned unchanged as an object column.

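A minimal sketch, assuming that assigning plain values to an object column stores them as Values (see []= above):

import datamancer

var c = nullColumn(3)            # colObject, all VNull
for i in 0 ..< 3:
  c[i] = i                       # replaces every VNull with an integer Value
let native = c.toNativeColumn()
doAssert native.kind == colInt   # assumed: all values now share the int type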
proc toNativeColumn(s: openArray[Value]): Column {.
    ...raises: [Exception, ValueError], tags: [RootEffect].}
Given the input as Values, attempts to return a column of the native data type. NOTE: this is unsafe and assumes the values are indeed all of one type!
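A minimal sketch, assuming the %~ converter to Value from datamancer's value module:

import datamancer

let vals = @[%~ 1.0, %~ 2.5]   # assumed: %~ wraps the floats as Values
let c = toNativeColumn(vals)
doAssert c.kind == colFloat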
proc toNimType(colKind: ColKind): string {....raises: [], tags: [].}
Returns the string name of the underlying data type of the column kind.
proc toObject(c: Column): Column {.inline, ...raises: [ValueError], tags: [].}
proc toObjectColumn(c: Column): Column {....raises: [ValueError], tags: [].}
Returns c as an object column. XXX: can't we somehow convert slices of a tensor?
proc toTensor[T](c: Column; dtype: typedesc[T]; dropNulls: static bool = false): Tensor[
    T]
dropNulls only has an effect on colObject columns. It allows dropping null values to get (hopefully) a valid raw Tensor.
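A minimal sketch; t.size is arraymancer's total element count:

import datamancer, arraymancer

var c = nullColumn(3)
c[0] = 1.5
c[2] = 2.5
let t = c.toTensor(float, dropNulls = true)   # the VNull at index 1 is dropped
doAssert t.size == 2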
proc toTensor[T](c: Column; slice: Slice[int]; dtype: typedesc[T]): Tensor[T]
proc toValueKind(col: ColKind): ValueKind {....deprecated: "This version of `toValueKind` has been deprecated in favor of a `toValueKind` taking a `Column` object. This way a conversion of `colConstant` can be done to the underlying type of the `Value` object.",
    raises: [], tags: [].}
Deprecated: This version of `toValueKind` has been deprecated in favor of a `toValueKind` taking a `Column` object. This way a conversion of `colConstant` can be done to the underlying type of the `Value` object.
proc toValueKind(col: Column): ValueKind {....raises: [], tags: [].}
proc valueTo[T](t: Tensor[Value]; dtype: typedesc[T];
                dropNulls: static bool = false): Tensor[T]

Templates

template `$`(c: Column): string
template `%~`(v: Value): Value
template liftScalarToColumn(name: untyped): untyped
template map_inline(c: Column; body: untyped): Column
This is a helper template which attempts to work around the limitation described for map above by compiling all map function bodies that are valid for c. However, be careful: by using this template you throw out possible compile-time checking and replace it with possible runtime exceptions in your code!
c.map_inline(x * 5)

This example will throw a runtime exception if * 5 is invalid for the column type that c actually has at runtime! Using this is not really recommended. Use df["x", int].map_inline(x * 5) instead!

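A minimal sketch:

import datamancer

let c = toColumn @[1, 2, 3]
let doubled = c.map_inline(x * 2)   # valid for the colInt kind c has at runtime
doAssert doubled[0, int] == 2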
template toColumn(c: Column): Column
template withDtypeByColKind(colKind: ColKind; body: untyped): untyped
template withNative(c: Column; idx: int; valName: untyped; body: untyped): untyped
template withNative2(c1, c2: Column; idx1, idx2: int;
                     valName1, valName2: untyped; body: untyped): untyped
template withNativeDtype(c: Column; body: untyped): untyped
template withNativeTensor(c: Column; valName: untyped; body: untyped): untyped