DataFrame addon for J

Experimenting with a DataFrame-like structure in J

A J DataFrame stores its data as an inverted table and adds a header row that names the fields (columns). The result is a 2-row by n-column table of boxes, where n is the number of fields: the first row holds the column labels and the second row holds the column data.
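The layout described above can be sketched by hand with J primitives. This is only an illustration of the structure, not the addon's own constructor; the names `hdr`, `ivt` and `df` are made up for the example.

```j
   hdr =: 'Id';'Name'                                  NB. boxed column labels
   ivt =: 3 6 5 1 ; > 'Jerry';'Jan';'Frieda';'Alex'    NB. inverted table: one box per column
   df  =: hdr ,: ivt                                   NB. laminate the labels onto the data
   $ df                                                NB. 2 rows by n columns (n is 2 here)
2 2
   # &> {: df                                          NB. every column has the same number of rows
4 4
```

Because each column lives in its own box, a numeric column and a character column can sit side by side without being forced into a common type.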

The addon lets verbs from the 'general/misc/inverted' script (which are designed for inverted tables) work with a J DataFrame via the dfPipe adverb.
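One way such an adverb could work is to apply the verb to the data row only and then reattach the header. The sketch below is a hypothetical stand-in named `pipeSketch`; it is not the addon's definition of `dfPipe`, which may differ (for example in how it validates its argument). The verbs `rev` and `pick` are toy inverted-table verbs written for this example.

```j
   NB. Hypothetical stand-in for dfPipe, assuming a 2-row DataFrame as described above.
   pipeSketch =: 1 : 0
({. y) ,: u {: y        NB. monadic: transform the data row, keep the header row
:
({. y) ,: x u {: y      NB. dyadic: pass the left argument through to the verb
)
   df   =: ('Id';'Name') ,: 3 6 5 1 ; > 'Jerry';'Jan';'Frieda';'Alex'
   rev  =: |.&.>              NB. toy verb: reverse the rows of every column
   pick =: <@[ {&.> ]         NB. toy verb: select rows x from every column
   > {. {: rev pipeSketch df
1 5 6 3
   > {. {: 0 2 pick pipeSketch df
3 5
```

A wrapper like this keeps the inverted-table verbs unaware of the header, so they can be reused unchanged.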

Install

To install the addon from a J session:

   install 'github:tikkanz/jdataframe'

Usage

   load 'tables/dataframe'
   coinsert 'pdataframe'
   require 'tables/csv'
   ]B=: fixcsv noun define
Id,Name,Job,Status
3,Jerry,Unemployed,Married
6,Jan,CEO,Married
5,Frieda,student,Single
1,Alex,Waiter,Separated
)
┌──┬──────┬──────────┬─────────┐
│Id│Name  │Job       │Status   │
├──┼──────┼──────────┼─────────┤
│3 │Jerry │Unemployed│Married  │
├──┼──────┼──────────┼─────────┤
│6 │Jan   │CEO       │Married  │
├──┼──────┼──────────┼─────────┤
│5 │Frieda│student   │Single   │
├──┼──────┼──────────┼─────────┤
│1 │Alex  │Waiter    │Separated│
└──┴──────┴──────────┴─────────┘

   NB. A DataFrame is a list of boxed column labels, laminated to an inverted table.
   ]Bdf=: 1 makeDataFrame B
┌──┬──────┬──────────┬─────────┐
│Id│Name  │Job       │Status   │
├──┼──────┼──────────┼─────────┤
│3 │Jerry │Unemployed│Married  │
│6 │Jan   │CEO       │Married  │
│5 │Frieda│student   │Single   │
│1 │Alex  │Waiter    │Separated│
└──┴──────┴──────────┴─────────┘

   NB. `dfPipe` is an adverb that applies inverted table verbs to a DataFrame.
   tsort dfPipe Bdf
┌──┬──────┬──────────┬─────────┐
│Id│Name  │Job       │Status   │
├──┼──────┼──────────┼─────────┤
│1 │Alex  │Waiter    │Separated│
│3 │Jerry │Unemployed│Married  │
│5 │Frieda│student   │Single   │
│6 │Jan   │CEO       │Married  │
└──┴──────┴──────────┴─────────┘

   NB. `dfp` is an alias for `dfPipe`
   tmakenumcol dfp tsort dfp Bdf
┌───────┬──────┬──────────┬─────────┐
│Id     │Name  │Job       │Status   │
├───────┼──────┼──────────┼─────────┤
│1 3 5 6│Alex  │Waiter    │Separated│
│       │Jerry │Unemployed│Married  │
│       │Frieda│student   │Single   │
│       │Jan   │CEO       │Married  │
└───────┴──────┴──────────┴─────────┘
   0 2 3 tfrom dfp ('Id';'Job') dfSelect tmakenumcol dfp tsort dfp Bdf
┌─────┬──────────┐
│Id   │Job       │
├─────┼──────────┤
│1 5 6│Waiter    │
│     │student   │
│     │CEO       │
└─────┴──────────┘

NB. Working with bigger tables
   $Ivt=. (<1e6 + ?~1e5), ifa 1e5 5 ?@$ 0   NB. create 6 column numeric Inverted table
6
   NB. `tshow` is a verb for formatting inverted tables to display a sample
   tshow Ivt
┌───────┬─────────┬────────┬─────────┬─────────┬────────┐
│integer│floating │floating│floating │floating │floating│
│---    │---      │---     │---      │---      │---     │
│1075046│0.577324 │0.326908│0.513408 │0.0404037│0.796914│
│1049892│0.0993946│0.998484│0.64112  │0.534353 │0.853303│
│1071469│0.152675 │0.46638 │0.961524 │0.69655  │0.887527│
│1097164│0.430858 │0.12542 │0.688066 │0.605572 │0.27933 │
│1097134│0.513834 │0.984058│0.661339 │0.736824 │0.575513│
│...    │...      │...     │...      │...      │...     │
│1081526│0.116691 │0.778937│0.610106 │0.921476 │0.440309│
│1003592│0.421378 │0.922601│0.73872  │0.448878 │0.876693│
│1014189│0.19164  │0.31856 │0.0756206│0.755682 │0.456555│
│1098643│0.272687 │0.183623│0.95511  │0.0746902│0.996458│
│1020349│0.67446  │0.64529 │0.600252 │0.180631 │0.922642│
└───────┴─────────┴────────┴─────────┴─────────┴────────┘

   NB. `makeDataFrame` adds default header names to an inverted table, creating a DataFrame
   tshow dfp makeDataFrame Ivt
┌────────┬─────────┬────────┬─────────┬─────────┬────────┐
│column_1│column_2 │column_3│column_4 │column_5 │column_6│
├────────┼─────────┼────────┼─────────┼─────────┼────────┤
│integer │floating │floating│floating │floating │floating│
│---     │---      │---     │---      │---      │---     │
│1075046 │0.577324 │0.326908│0.513408 │0.0404037│0.796914│
│1049892 │0.0993946│0.998484│0.64112  │0.534353 │0.853303│
│1071469 │0.152675 │0.46638 │0.961524 │0.69655  │0.887527│
│1097164 │0.430858 │0.12542 │0.688066 │0.605572 │0.27933 │
│1097134 │0.513834 │0.984058│0.661339 │0.736824 │0.575513│
│...     │...      │...     │...      │...      │...     │
│1081526 │0.116691 │0.778937│0.610106 │0.921476 │0.440309│
│1003592 │0.421378 │0.922601│0.73872  │0.448878 │0.876693│
│1014189 │0.19164  │0.31856 │0.0756206│0.755682 │0.456555│
│1098643 │0.272687 │0.183623│0.95511  │0.0746902│0.996458│
│1020349 │0.67446  │0.64529 │0.600252 │0.180631 │0.922642│
└────────┴─────────┴────────┴─────────┴─────────┴────────┘