A {K}otlin g{ra}mmar for data {vis}ualization

Overview

kravis - A {k}otlin {gra}mmar for data {vis}ualization

Visualizing tabular and relational data is at the core of data science. kravis implements a grammar to create a wide range of plots using a standardized set of verbs.

The grammar implemented by kravis is inspired by ggplot2. In fact, all it provides is a more typesafe wrapper around it. Internally, ggplot2 is used as the rendering engine. The kravis API is similar enough that you can even reuse the excellent ggplot2 cheatsheet.

R is required to use ggplot2. However, kravis works with various integration backends, such as docker or remote webservices.


This is an experimental API and is subject to breaking changes until the first major release.


Jupyter

An easy way to get started with kravis is Jupyter: you simply need to install the kotlin-jupyter kernel.

See here for a notebook example.

Setup

Add the following artifact to your build.gradle:

compile "com.github.holgerbrandl:kravis:0.8.1"

You can also use JitPack with Maven or Gradle to build the latest snapshot as a dependency in your project.

repositories {
    maven { url 'https://jitpack.io' }
}
dependencies {
    compile 'com.github.holgerbrandl:kravis:-SNAPSHOT'
}
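
The snippets above use the Groovy DSL and the `compile` configuration, which was removed in Gradle 7. For projects that use the Gradle Kotlin DSL, the JitPack setup might look roughly like the following sketch. It is a direct translation of the block above and assumes a `build.gradle.kts` file; the coordinates are unchanged.

```kotlin
// build.gradle.kts (assumed file name for the Gradle Kotlin DSL)
repositories {
    // JitPack builds the artifact from the GitHub repository on demand
    maven("https://jitpack.io")
}

dependencies {
    // `implementation` replaces the removed `compile` configuration
    implementation("com.github.holgerbrandl:kravis:-SNAPSHOT")
}
```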

To build and install it into your local maven cache, simply clone the repo and run

./gradlew install

First Example

Let's start by analyzing mammalian sleep patterns:

import krangl.*
import kravis.*

sleepData
    .addColumn("rem_proportion") { it["sleep_rem"] / it["sleep_total"] }
    // Analyze correlation
    .plot(x = "sleep_total", y = "rem_proportion", color = "vore", size = "brainwt")
    .geomPoint(alpha = 0.7)
    .guides(size = LegendType.none)
    .title("Correlation between dream and total sleep time")

Find more examples in our gallery (coming soon).

The Grammar of Graphics

ggplot2 and thus kravis implement a grammar for graphics to build plots with

aesthetics + layers + coordinate system + transformations + facets

This reads as: map variables from data space to visual space + add one or more layers + configure the coordinate system + optionally apply statistical transformations + optionally add facets. That's the way!
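
To make the composition concrete, here is a minimal sketch of how such a grammar could be modelled and turned into a ggplot2 expression. The `PlotSpec` type and the `toR` function are hypothetical and are not part of kravis. Statistical transformations are left out for brevity, since in ggplot2 they travel with the layers.

```kotlin
// Hypothetical model of a plot spec; these names are illustrative and not kravis API.
data class PlotSpec(
    val aesthetics: Map<String, String>,     // visual channel -> data column
    val layers: List<String> = emptyList(),  // e.g. "geom_point()"
    val coord: String? = null,               // e.g. "coord_flip()"
    val facet: String? = null                // e.g. "facet_wrap(~vore)"
)

// Render the spec as a ggplot2 expression by joining its components with '+'.
fun PlotSpec.toR(dataName: String = "data"): String {
    val aes = aesthetics.entries.joinToString(", ") { (channel, column) -> "${channel}=`${column}`" }
    val parts = listOf("ggplot($dataName, aes($aes))") + layers + listOfNotNull(coord, facet)
    return parts.joinToString(" + ")
}

val spec = PlotSpec(
    aesthetics = mapOf("x" to "sleep_total", "y" to "sleep_rem"),
    layers = listOf("geom_point()"),
    facet = "facet_wrap(~vore)"
)
println(spec.toR("sleep"))
```

Each part of the grammar ends up as one term of the `+` chain, which is why layers, coordinates and facets can be added independently of each other.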

Module Architecture

Supported Data Input Formats

Iterators

Every Iterable<T> is a valid data source for kravis, which allows creating plots with a type-safe builder DSL. Essentially, kravis first digests it into a table and then uses that table as the data source for visualization. Here's an example:

// deparse records using property references (which allows inferring variable names via reflection)
val basePlot = sleepPatterns.plot(
    x = SleepPattern::sleep_rem,
    y = SleepPattern::sleep_total,
    color = SleepPattern::vore,
    size = SleepPattern::brainwt
)
            
basePlot
    .geomPoint()
    .title("Correlation of total sleep and rem sleep by food preference")
    .show()

The previous example used property references. kravis also supports an extractor lambda syntax, which allows for on-the-fly data transformations when deparsing an Iterable<T>. The (not yet solved) disadvantage is that axis labels need to be assigned manually:

sleepPatterns
    .plot(x = { sleep_total/60 })
    .geomHistogram()
    .xLabel("sleep[h]")
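
The difference between the two styles can be illustrated without kravis. The sketch below is not the library's implementation; `columnOf` and the reduced `SleepPattern` class are hypothetical. It shows why a property reference yields a column name for free, while a lambda needs an explicit label.

```kotlin
import kotlin.reflect.KProperty1

// Hypothetical record type, reduced to two fields for illustration.
data class SleepPattern(val sleep_total: Double, val sleep_rem: Double)

// A property reference carries its own name, so the column label can be derived from it.
fun <T> columnOf(records: Iterable<T>, prop: KProperty1<T, Any?>): Pair<String, List<Any?>> =
    prop.name to records.map { prop.get(it) }

// A lambda is opaque: values can be computed, but the caller has to supply the label.
fun <T> columnOf(records: Iterable<T>, label: String, extract: T.() -> Any?): Pair<String, List<Any?>> =
    label to records.map { it.extract() }

val sleepPatterns = listOf(SleepPattern(600.0, 120.0), SleepPattern(480.0, 90.0))

println(columnOf(sleepPatterns, SleepPattern::sleep_total))
println(columnOf(sleepPatterns, "sleep[h]") { sleep_total / 60 })
```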

And here's another example using a custom data class:

enum class Gender { male, female }

data class Person(val name: String, val gender: Gender, val heightCm: Int, val weightKg: Double)

// define some persons
val persons = listOf(
    Person("Max", Gender.male, 192, 80.3),
    Person("Anna", Gender.female, 162, 56.3),
    Person("Maria", Gender.female, 172, 66.3)
)

// visualize sizes by gender
persons.plot(x = { name }, y = { weightKg }, fill = { gender.toString() })
    .geomCol()
    .xLabel("name")
    .yLabel("weight [kg]")
    .title("Body Size Distribution")

Tables

kravis can handle any kind of tabular data via krangl data-frames:

import kravis.*
import krangl.irisData

irisData.plot(x = "Species", y = "Petal.Length")
    .geomBoxplot()
    .geomPoint(position = PositionJitter(width = 0.1), alpha = 0.3)
    .title("Petal Length by Species")

Output Devices

kravis auto-detects the environment and tries to pick the most reasonable output device to show your plots. The following output devices are available:

  1. A Swing graphics device for rendering in interactive mode.
  2. A JavaFX graphics device for rendering in interactive mode.
  3. A file device that renders directly into files.
  4. A Jupyter device that renders directly into Jupyter notebooks.

By default, kravis renders PNG on all devices, but it also supports vector rendering with SVG as output format.

The preferred output device can be configured using the SessionPrefs object:

SessionPrefs.OUTPUT_DEVICE = SwingPlottingDevice()
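
How kravis detects its environment is not described here. As a rough illustration of the idea only, a window-based device makes sense only when a display is available, which the JDK exposes through `GraphicsEnvironment.isHeadless()`. The `DeviceKind` enum and `chooseDevice` function below are hypothetical and do not use kravis classes.

```kotlin
import java.awt.GraphicsEnvironment

// Hypothetical device categories for illustration; not the kravis device classes.
enum class DeviceKind { WINDOW, FILE }

// Without a display, opening a window throws java.awt.HeadlessException,
// so a headless environment falls back to file output.
fun chooseDevice(headless: Boolean = GraphicsEnvironment.isHeadless()): DeviceKind =
    if (headless) DeviceKind.FILE else DeviceKind.WINDOW

println(chooseDevice())
```

On a server or in a container the JVM usually runs headless, so setting the output device explicitly, or saving to a file, is the safer choice there.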

Rendering

Currently kravis provides three different options to bind an R engine, which is required to render plots.

(1) Local R

This is the default mode. It can also be set explicitly with:

SessionPrefs.RENDER_BACKEND = LocalR()

(2) Dockerized R

SessionPrefs.RENDER_BACKEND = Docker()

By default, this pulls and uses the container rocker/tidyverse:3.5.1, but it can be configured to use custom images as needed.

(3) Rserve

An (optionally remote) backend based on Rserve.

Simply install the corresponding R package and start the daemon with:

R -e "install.packages('Rserve',,'http://rforge.net/',type='source')"
R CMD Rserve

For configuration details see https://www.rforge.net/Rserve/doc.html

Alternatively, in case you don't have or don't want a local R installation, you can also run it dockerized, locally or remotely, with:

# docker run -p <public_port>:<private_port> -d <image>  
docker run -dp 6311:6311 holgerbrandl/kravis_rserve 

See Dockerfile for the spec of this image.

To use the Rserve backend, configure the kravis SessionPrefs accordingly by pointing to the correct host and port.

SessionPrefs.RENDER_BACKEND = RserveEngine(host="localhost", port=6311)
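
Before pointing kravis at an Rserve instance, it can help to check that the daemon is actually reachable. The following is a small sketch that uses only the JDK; `isPortOpen` is a hypothetical helper and not part of kravis. Port 6311 is the one published by the docker command above.

```kotlin
import java.io.IOException
import java.net.InetSocketAddress
import java.net.Socket

// Returns true if a TCP connection to host:port can be opened within the timeout.
fun isPortOpen(host: String, port: Int, timeoutMs: Int = 1000): Boolean =
    try {
        Socket().use { it.connect(InetSocketAddress(host, port), timeoutMs) }
        true
    } catch (e: IOException) {
        // Covers refused connections, timeouts and unresolvable hosts
        false
    }

println(isPortOpen("localhost", 6311))
```

This only confirms that something is listening on the port. It does not verify that the listener is Rserve or that ggplot2 is installed on the remote side.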

Plot Immutability

Plots are -- similar to krangl data-frames -- immutable. Each call on a plot returns a new plot and leaves the original unchanged.

val basePlot = mpgData.plot("displ" to x, "hwy" to y).geomPoint()

// create one version with adjusted axis text size
basePlot.theme(axisText = ElementText(size = 20.0, color = RColor.red))

// create another version with unchanged axis labels but using a log scale instead
basePlot.scaleXLog10()
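
The pattern behind this can be sketched with a plain Kotlin data class. The `Plot` class below is a hypothetical stand-in, not the kravis plot class. Every builder method returns a modified copy, so a base plot can be branched safely into several variants.

```kotlin
// Hypothetical stand-in for a plot; not the kravis GGPlot class.
data class Plot(
    val layers: List<String> = emptyList(),
    val scales: List<String> = emptyList()
) {
    // Each builder call returns a modified copy and leaves the receiver untouched.
    fun geomPoint() = copy(layers = layers + "geom_point()")
    fun scaleXLog10() = copy(scales = scales + "scale_x_log10()")
}

val basePlot = Plot().geomPoint()
val logPlot = basePlot.scaleXLog10()

println(basePlot.scales)  // basePlot is unaffected by the call above
println(logPlot.scales)
```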

API Coverage

Currently we just map a subset of the ggplot2 API.

  • Checks - implemented already
  • Crosses - Planned but not yet done

Feel welcome to submit a ticket or PR if some important use case is missing.

How to use missing API elements from ggplot2?

Since kravis mimics only some parts of ggplot2, and because users may want to create more custom plots, kravis supports preambles (e.g. to define new geoms) and custom layer specs.

Example

irisData.plot(x = "Species", y = "Sepal.Length", fill = "Species")
    .addPreamble("""devtools::source_url("https://git.io/fAiQN")""")
    .addCustom("""geom_flat_violin(scale = "count", trim = FALSE)""")
    .geomDotplot(binaxis = "y", dotsize = 0.5, stackdir = "down", binwidth = 0.1, position = PositionNudge(-0.025))
    .theme(legendPosition = "none")
    .labs(x = "Species", y = "Sepal length (cm)")
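
Conceptually, a preamble is R code that has to run before the plot expression, while a custom layer is spliced into the `+` chain. The sketch below shows one way such a script could be assembled. `RScript` and its `render` function are hypothetical and do not reflect how kravis generates its scripts internally.

```kotlin
// Hypothetical script builder; names are illustrative and not taken from kravis.
data class RScript(
    val preambles: List<String> = emptyList(),
    val layers: List<String> = emptyList()
) {
    fun addPreamble(code: String) = copy(preambles = preambles + code)
    fun addCustom(layer: String) = copy(layers = layers + layer)

    // Preambles are emitted before the plot expression, so anything they define
    // (such as a new geom) exists by the time the layers are evaluated.
    fun render(base: String): String =
        (preambles + (listOf(base) + layers).joinToString(" +\n  ")).joinToString("\n")
}

val script = RScript()
    .addPreamble("""devtools::source_url("https://git.io/fAiQN")""")
    .addCustom("""geom_flat_violin(scale = "count", trim = FALSE)""")
    .render("ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species))")

println(script)
```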

References

You don't like it? Here are some other projects which may better suit your purpose. Before you leave, consider dropping us a ticket with some comments about what's missing, badly designed, or simply broken in kravis.

GGplot Wrappers

  • gg4clj Another ggplot2 wrapper written in Clojure

Other JVM visualization libraries ordered by -- personally biased -- usefulness

  • SmilePlot provides data visualization tools such as plots and maps for researchers to understand information more easily and quickly.
  • XChart is a light-weight Java library for plotting data
  • data2viz is a multi platform data visualization library with comprehensive DSL
  • Kubed is a Kotlin library for manipulating the JavaFX scenegraph based on data.
  • TornadoFX provides some Kotlin wrappers around JavaFX
  • plotly-scala which provides scala bindings for plotly.js and works within jupyter
  • breeze-viz which is a visualization library backed by Breeze and JFreeChart
  • grafana is an open platform for beautiful analytics and monitoring
  • Jzy3d is an open source java library that makes it easy to draw 3d scientific data: surfaces, scatter plots, bar charts

Other

Vega-lite based

  • Vegas aims to be the missing MatPlotLib for Scala + Spark
  • altair provides a declarative statistical visualization library for Python
  • vega-embed allows publishing Vega visualizations as embedded web components with interactive parameters.
  • hrbrmstr/vegalite provides R ggplot2 "bindings" for Vega-Lite

Acknowledgements

Thanks to the vega-lite team for making this project possible.

Thanks to the ggplot2 team for providing the best data vis API to date.

Comments
  • Problems rendering kotlin jupyter sample

    I tried to follow the tutorial

    but I keep getting:

    	at org.jetbrains.kotlinx.jupyter.config.LoggingKt.catchAll(logging.kt:41)
    	at org.jetbrains.kotlinx.jupyter.config.LoggingKt.catchAll$default(logging.kt:40)
    	at org.jetbrains.kotlinx.jupyter.ReplForJupyterImpl$evalEx$1.invoke(repl.kt:404)
    	at org.jetbrains.kotlinx.jupyter.ReplForJupyterImpl$evalEx$1.invoke(repl.kt:383)
    	at org.jetbrains.kotlinx.jupyter.ReplForJupyterImpl.withEvalContext(repl.kt:347)
    	at org.jetbrains.kotlinx.jupyter.ReplForJupyterImpl.evalEx(repl.kt:383)
    	at org.jetbrains.kotlinx.jupyter.ReplForJupyterImpl.eval(repl.kt:434)
    	at org.jetbrains.kotlinx.jupyter.ProtocolKt$shellMessagesHandler$res$1.invoke(protocol.kt:296)
    	at org.jetbrains.kotlinx.jupyter.ProtocolKt$shellMessagesHandler$res$1.invoke(protocol.kt:295)
    	at org.jetbrains.kotlinx.jupyter.JupyterConnection$runExecution$execThread$1.invoke(connection.kt:162)
    	at org.jetbrains.kotlinx.jupyter.JupyterConnection$runExecution$execThread$1.invoke(connection.kt:160)
    	at kotlin.concurrent.ThreadsKt$thread$thread$1.run(Thread.kt:30)
    Caused by: java.lang.RuntimeException: java.io.IOException: Cannot run program "R" (in directory "."): error=2, No such file or directory
    	at kravis.render.RUtils.evalCmd(RenderEngine.kt:137)
    	at kravis.render.RUtils.evalCmd$default(RenderEngine.kt:118)
    	at kravis.render.RUtils.runRScript(RenderEngine.kt:103)
    	at kravis.render.LocalR.render$kravis(LocalR.kt:28)
    	at kravis.GGPlot.save(GGPlot2.kt:170)
    	at kravis.device.JupyterDevice.show(JupyterDevice.kt:26)
    	at kravis.device.JupyterDevice.show(JupyterDevice.kt:21)
    	at kravis.GGPlot.show(GGPlot2.kt:175)
    	at kravis.jupyter.Integration$onLoaded$1.invoke(Integration.kt:13)
    	at kravis.jupyter.Integration$onLoaded$1.invoke(Integration.kt:13)
    	at kravis.jupyter.Integration$onLoaded$$inlined$render$1.invoke(JupyterIntegration.kt:88)
    	at kravis.jupyter.Integration$onLoaded$$inlined$render$1.invoke(JupyterIntegration.kt:34)
    	at kravis.jupyter.Integration$onLoaded$$inlined$render$2.execute(JupyterIntegration.kt:95)
    	at org.jetbrains.kotlinx.jupyter.codegen.RenderersProcessorImpl$renderResult$newField$1.invoke(RenderersProcessorImpl.kt:24)
    	at org.jetbrains.kotlinx.jupyter.codegen.RenderersProcessorImpl$renderResult$newField$1.invoke(RenderersProcessorImpl.kt:23)
    	at org.jetbrains.kotlinx.jupyter.exceptions.ReplLibraryExceptionKt.rethrowAsLibraryException(ReplLibraryException.kt:24)
    	... 14 more
    Caused by: java.io.IOException: Cannot run program "R" (in directory "."): error=2, No such file or directory
    	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1128)
    	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1071)
    	at kravis.render.RUtils.evalCmd(RenderEngine.kt:124)
    	... 29 more
    Caused by: java.io.IOException: error=2, No such file or directory
    	at java.base/java.lang.ProcessImpl.forkAndExec(Native Method)
    	at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:340)
    	at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:271)
    	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1107)
    	... 31 more
    opened by elect86 4
  • Erroneous classpath/dependencies directory in krangl dependency

    Apparently, the de.mpicbg krangl artifact is not available solely through jitpack. For me it worked after adding the jcenter() repo to the project. I'm sending a PR with the proposed README update.

    opened by stefanches7 2
  • Fix ci

    I can't run the test cases correctly on my local machine. I suggest using Docker for the CI and the local development environment, since the generated SVG files differ slightly between versions of the R libraries.

    opened by tokuhirom 1
  • Some missing parameters in ElementText

    https://ggplot2.tidyverse.org/reference/element.html https://github.com/tidyverse/ggplot2/blob/HEAD/R/theme-elements.r

    I want to pass hjust, vjust and angle to the ElementText.

    enhancement help wanted 
    opened by tokuhirom 1
  • Exceptions when calling .show() or .save()

    I would like to be able to show a popup window with the chart like I do in Python.

    I pasted the data class example into my spring boot application and when I run I get:

    Caused by: java.awt.HeadlessException: null

    If I comment out the .show() and call .save() I get:

    java.lang.IllegalArgumentException: No enum constant kravis.render.PlotFormat.

    Please tell me if I'm missing something.

    Thanks, Michael

    opened by fiddleatwork 1
  • Why R ggplot2 over Vega-Lite or XChart

    Hi

    Thank you for your work on the data science tooling in Kotlin!

    I see you implemented experimental backends for Vega-Lite and XCharts, too, but decided to stay with calling R and ggplot2. I'm curious what was the reason for this, because, at first sight, this seems the least optimal due to introducing the dependency on R and missing out on the interactiveness of Vega?

    opened by ValdarT 1
  • Library depends on 0.7 of krangl, can't retrieve dependency from jitpack

    The current build relies on 0.7 of krangl which isn't on jcenter, so the dependency fails and the package never builds on jitpack. Here's one of the recent log files.

    https://jitpack.io/com/github/holgerbrandl/kravis/afd77dea54/build.log

    Fix is to either downgrade jitpack dep to 0.6 or upload the 0.7 build to jcenter. Currently one has to have krangl 0.7 built from source and in their local maven for things to work correctly.

    opened by jwill 1
  • support for half violin plots with dot overlay

    # half violin plot with raw data ------------------------------------------
    
    ## create a violin plot of Sepal.Length per species
    ## using the custom function geom_flat_violin()
    
    ggplot(data = iris, 
           mapping = aes(x = Species, y = Sepal.Length, fill = Species)) + 
      geom_flat_violin(scale = "count", trim = FALSE) + 
      stat_summary(fun.data = mean_sdl, fun.args = list(mult = 1), 
                   geom = "pointrange", position = position_nudge(0.05)) + 
      geom_dotplot(binaxis = "y", dotsize = 0.5, stackdir = "down", binwidth = 0.1, 
                   position = position_nudge(-0.025)) + 
      theme(legend.position = "none") + 
      labs(x = "Species", y = "Sepal length (cm)")
    

    See https://helenajambor.wordpress.com/2018/08/28/pick-n-mix-plots/

    opened by holgerbrandl 0
  • Can not plot grouped krangl data

    When a dataframe is grouped, plotting fails with

    java.lang.UnsupportedOperationException
    krangl.Extensions.rowData(Extensions.kt:902)
    krangl.TableIOKt.writeCSV(TableIO.kt:297)
    krangl.TableIOKt.writeTSV(TableIO.kt:274)
    krangl.TableIOKt.writeTSV$default(TableIO.kt:273)
    kravis.render.LocalR.render$kravis(LocalR.kt:21)
    kravis.GGPlot.save(GGPlot2.kt:164)
    kravis.device.JupyterDevice.show(JupyterDevice.kt:25)
    kravis.device.JupyterDevice.show(JupyterDevice.kt:20)
    kravis.GGPlot.show(GGPlot2.kt:169)
    Line_101_jupyter.<init>(Line_101.jupyter.kts:3)
    sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    
    opened by holgerbrandl 0
  • Missing value where TRUE/FALSE needed

    Encountered this error when plotting with geomLine:

    Exception in thread "main" Script:
    library(ggplot2)
    library(dplyr)
    library(readr)
    library(scales)
    library(forcats)
    
    
    
    data01 = read_tsv("/tmp/.txt9217350101246720089.tmp")
    
    set.seed(2009)
    
    gg = ggplot(mapping=aes(x=`x`,y=`y`,color=`Function`), data=data01) + 
    	geom_line(stat='identity', position=position_identity(), na.rm=FALSE, inherit.aes=TRUE, size=1.0) + 
    	ggtitle("Derivatives of y=(x^(x))")
    
    ggsave(filename="/home/breandan/IdeaProjects/kotlingrad/src/main/resources/plot.png", plot=gg)
    
    Attaching package: ‘dplyr’
    
    The following objects are masked from ‘package:stats’:
    
        filter, lag
    
    The following objects are masked from ‘package:base’:
    
        intersect, setdiff, setequal, union
    
    
    Attaching package: ‘scales’
    
    The following object is masked from ‘package:readr’:
    
        col_factor
    
    Parsed with column specification:
    cols(
      x = col_double(),
      y = col_double(),
      Function = col_character()
    )
    Saving 7 x 7 in image
    Error in if (score > best$score && (!only.loose || (lmin <= dmin && lmax >=  : 
      missing value where TRUE/FALSE needed
    Calls: ggsave ... f -> <Anonymous> -> f -> <Anonymous> -> <Anonymous>
    Execution halted
    	at kravis.render.LocalR.render$kravis(LocalR.kt:23)
    	at kravis.GGPlot.save(GGPlot2.kt:164)
    	at kravis.GGPlot.save$default(GGPlot2.kt:162)
    	at edu.umontreal.kotlingrad.samples.TestPlotKt.main(TestPlot.kt:46)
    

    Code follows:

        val xs = 0.0..5.0 step 0.09
        val ys = (xs.map { listOf(it, y(it), "y") }
                + xs.map { listOf(it, dy_dx(it), "dy/dx") }
                + xs.map { listOf(it, `d²y_dx²`(it), "d²y/x²") }
                + xs.map { listOf(it, `d³y_dx³`(it), "d³y/dx³") }
                + xs.map { listOf(it, `d⁴y_dx⁴`(it), "d⁴y/dx⁴") }
                + xs.map { listOf(it, `d⁵y_dx⁵`(it), "d⁵y/dx⁵") }
          ).flatten()
    
        dataFrameOf("x", "y", "Function")(ys)
          .plot(x = "x", y = "y", color = "Function")
          .geomLine(size = 1.0)
          .title("Derivatives of y=$y")
          .save(File("src/main/resources/plot.png"))
    
    opened by breandan 2
  • Use bytecode scanner to infer axis labels when using extractor functions

    Example

    sleepPatterns
        .ggplot(
    x = { sleep_total/60 },
    y = { sleep_rem }
    ).geomPoint()
    

    We should get the java class of the lambda and then use a bytecode analyzing library to look for an invocation bytecode. See https://asm.ow2.io/

    opened by holgerbrandl 0
Releases(v0.8.6)
Owner
Holger Brandl
machine learning & data science enthusiast.