datamancer/formula


Types

FormulaMismatchError = object of CatchableError
FormulaNode = object
  name*: string
  case kind*: FormulaKind
  of fkVariable:
    val*: Value
  of fkAssign:
    lhs*: string
    rhs*: Value
  of fkVector:
    colName*: string
    resType*: ColKind
    fnV*: proc (df: DataFrame): Column
  of fkScalar:
    valName*: string
    valKind*: ValueKind
    fnS*: proc (c: DataFrame): Value
  of fkNone:
    nil

Procs

proc `$`(node: FormulaNode): string {.raises: [ValueError], tags: [].}
Converts node to its string representation.
proc hash(fn: FormulaNode): Hash {.raises: [Exception], tags: [RootEffect].}
proc raw(node: FormulaNode): string {.raises: [], tags: [].}
Prints the raw stringification of node.
proc toUgly(result: var string; node: FormulaNode) {.raises: [ValueError],
    tags: [].}
This is the formula stringification, which can be used to access the column of a DF that corresponds to the formula.

Macros

macro compileFormulaImpl(rawName: static string; funcKind: static FormulaKind): untyped

Second stage of the formula macro: the typed stage. It is important that the macro is typed; otherwise, inside a generic procedure, it may be evaluated before the addSymbols calls, which leaves the compile-time (CT) tables empty. Making it typed (which is possible because all problematic AST is stored in the CT tables) forces evaluation in the same compilation stage as addSymbols.

Extracts the typed symbols from the TypedSymbols CT table and uses them to determine the possible types for column references.

macro fn(x: untyped): untyped
macro formula(n: untyped): untyped
macro `{}`(x: untyped{ident}; y: untyped): untyped
TODO: add a way to explicitly create formulas of different kinds more easily, i.e. force the type without a check, to avoid having to rely on heuristics. Use:
  • <- for assignment
  • << for reduce operations, i.e. scalar proc?
  • ~ for vector like proc
  • formula without any of the above will be considered:
    • fkVariable if no column involved
    • fkVector else
  • <type>: <actualFormula>: simple type hint for tensors in closure
  • <type> -> <resType>: <actualFormula>: full type for closure. <type> is the dtype used for tensors, <resType> the resulting type
  • df[<someIdent/Sym>]: to access columns using identifiers / symbols defined in the scope
  • idx: can be used to access the loop iteration index
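The operators listed above can be sketched in use as follows. This is a minimal illustration, not taken from the module itself: it assumes the `{}` macro is invoked as `f{...}` and that `toDf`, `mutate`, and `getKeys` are available after `import datamancer`. The data and the column names `x`, `y`, `sumXY`, and `xNew` are hypothetical.

```nim
import datamancer

# Hypothetical example data; the column names are placeholders.
let df = toDf({"x": @[1.0, 2.0, 3.0], "y": @[10.0, 20.0, 30.0]})

# `~` requests a vector formula (one result per row).
# The leading `float:` is the simple type hint for the tensors in the closure.
let fVec = f{float: "sumXY" ~ `x` + `y`}

# `<-` requests an assignment formula, e.g. for renaming a column.
let fAssign = f{"xNew" <- "x"}

# The `kind` field of the resulting FormulaNode reflects the operator used.
echo fVec.kind
echo fAssign.kind

# Applying the vector formula adds the column named on its left-hand side.
let res = df.mutate(fVec)
echo res
```

Stating the operator and the type hint explicitly avoids relying on the heuristics mentioned in the TODO, at the cost of a slightly more verbose formula.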