xls2json - Read in Excel file (.xls, .xlsx, .xlsm) and output JSON

Related tags

JSON kotlin cli json xlsx xls
Overview

CI

xls2json

Read in Excel file (.xls, .xlsx, .xlsm) and output JSON. Evaluates formulas where possible. Preserve type information from Excel via JSON types.

❯ xls2json --help
Open an xls(x|m) file and transform to json.

Usage: xls2json [-hlmsvV] [-D=<dtfmt>] [-T=<tfmt>] [-t=<tables>]... [<files>...]

  If no `--table`s are provided, then all
  tables will be extracted.

  All files will be processed one by one,
  each outputting on line of json.

      [<files>...]           xls(x|m)-file(s) to transform
  -h, --help                 Show this help message and exit.
  -V, --version              Print version information and exit.
  -m, --memory               Show memory usage information.
  -v, --verbose              Show more information.
      --pretty               Pretty print the JSON.
  -l, --list-tables          List all tables.
  -t, --table=<tables>       Specify the tables to transform
  -p, --password=<password>  Password for opening the input file(s).
  -s, --strip                Strip empty columns and empty rows.
  -D, --datetime-format=<dtfmt>
                             The datetime format.
                             [default: 'yyyy-MM-dd'T'HH:mm:ss.SSS']
  -T, --time-format=<tfmt>   The time format.
                             [default: 'HH:mm:ss.SSS']

By Tammo Ippen <[email protected]>
Issues: https://github.com/tammoippen/xls2json/issues

Usage

list of rows "Sheet1": [ # each row is a list of cell-values ["empty"], ["String", "hello"], ["StringNumber", "14.8"], ["Int", 1234], ["bool", true], ["bool", false], ["float", 23.12345], # format for datetime and time can be specified using # the -D and -T options. The format-string documentation: # https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/time/format/DateTimeFormatter.html # Please be aware, that Excel dates have no timezone attached, # so if you provide a format with timezone, your locally configured # timezone will be used. ["datetime", "2021-05-18T21:19:53.040"], ["time", "21:19:32.000"], # formulars are evaluated ["formulars"], ["string", "hello"], ["float", -0.34678748622465627], ["int", 5], ["datetime", "2021-06-03T16:48:26.710"], ["time", "13:37:00.000"] ] } # (format is not true to the jq-output for brevity) # XLSX works the same ❯ xls2json -s src/test/resources/sample.xlsx {"Sheet1":[["empty"],["String","hello"],["StringNumber","14.8"],["Int",1234],["bool",true],["bool",false],["float",23.12345],["datetime","2021-05-18T21:19:53.040"],["time","21:19:32.000"],["formulars"],["string","hello"],["float",-0.34678748622465627],["int",5],["datetime","2021-06-03T16:59:06.466"],["time","13:37:00.000"]]} # works on multiple files as well ❯ xls2json -s src/test/resources/sample.xls* {"Sheet1":[["empty"],["String","hello"],["StringNumber","14.8"],["Int",1234],["bool",true],["bool",false],["float",23.12345],["datetime","2021-05-18T21:19:53.040"],["time","21:19:32.000"],["formulars"],["string","hello"],["float",-0.34678748622465627],["int",5],["datetime","2021-06-03T16:59:06.466"],["time","13:37:00.000"]]} {"Sheet1":[["empty"],["String","hello"],["StringNumber","14.8"],["Int",1234],["bool",true],["bool",false],["float",23.12345],["datetime","2021-05-18T21:19:53.040"],["time","21:19:32.000"],["formulars"],["string","hello"],["float",-0.34678748622465627],["int",5],["datetime","2021-06-03T16:47:22.789"],["time","13:37:00.000"]]} # list the available tables ❯ xls2json -l src/test/resources/sampleTwoSheets.xls ["Sheet1","Sheet2"] # only output some table(s) ❯ xls2json -t Sheet2 -s src/test/resources/sampleTwoSheets.xls | jq { "Sheet2": [ ["empty"], ["String", "hello"], ["StringNumber", "14.8"], ["Int", 1234], ["bool", true], ["bool", false], ["float", 23.12345], ["datetime", "2021-05-18T21:19:53.040"], ["time", "21:19:32.000"], ["formulars"], ["string", "hello"], ["float", -0.34678748622465627], ["int", 5], ["datetime", "2021-06-03T16:48:26.710"], ["time", "13:37:00.000"] ] } # if you are running the native-image executable, you can configure the # garbage collector with the usual options ❯ xls2json -Xmx1k --help Exception in thread "main" java.lang.OutOfMemoryError: Garbage-collected heap size exceeded. # 1kb is a bit low # if you are running with the installable distribution, you can set # the environment variable XLS2JSON_OPTS ❯ XLS2JSON_OPTS="-Xmx1k" xls2json --help Error occurred during initialization of VM Too small maximum heap # still 1kb is too small :D ">
# read a XLS file
❯ ./dist/xls2json-amd64 src/test/resources/empty.xls
{"Sheet1":[]}

# read a XLS file with content
❯ xls2json src/test/resources/sample.xls
{"Sheet1":[["empty",null],["String","hello",null],["StringNumber","14.8",null],["Int",1234,null,null,null,null],["bool",true,null,null,null,null],["bool",false,null,null,null,null],["float",23.12345,null,null,null,null],["datetime","2021-05-18T21:19:53.040",null,null,null,null],["time","21:19:32.000",null],["formulars",null],["string","hello",null],["float",-0.34678748622465627,null],["int",5,null],["datetime","2021-06-03T16:45:56.709",null],["time","13:37:00.000",null],[],[],[],[null,null,null],[null,null,null],[null,null,null]]}

# strip empty cells from from the end of columns and empty rows from the bottom
❯ xls2json -s src/test/resources/sample.xls
{"Sheet1":[["empty"],["String","hello"],["StringNumber","14.8"],["Int",1234],["bool",true],["bool",false],["float",23.12345],["datetime","2021-05-18T21:19:53.040"],["time","21:19:32.000"],["formulars"],["string","hello"],["float",-0.34678748622465627],["int",5],["datetime","2021-06-03T16:47:22.789"],["time","13:37:00.000"]]}

# pretty print the output
❯ xls2json -s --pretty src/test/resources/empty.xls
{
  "Sheet1": []
}

# use jq (https://stedolan.github.io/jq/)
# or gojq (https://github.com/itchyny/gojq)
# to get some nice output and / or to process the json
❯ xls2json -s src/test/resources/sample.xls | jq
{
  # we have a dict sheetname -> list of rows
  "Sheet1": [
    # each row is a list of cell-values
    ["empty"],
    ["String", "hello"],
    ["StringNumber", "14.8"],
    ["Int", 1234],
    ["bool", true],
    ["bool", false],
    ["float", 23.12345],
    # format for datetime and time can be specified using
    # the -D and -T options. The format-string documentation:
    # https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/time/format/DateTimeFormatter.html
    # Please be aware, that Excel dates have no timezone attached,
    # so if you provide a format with timezone, your locally configured
    # timezone will be used.
    ["datetime", "2021-05-18T21:19:53.040"],
    ["time", "21:19:32.000"],
    # formulars are evaluated
    ["formulars"],
    ["string", "hello"],
    ["float", -0.34678748622465627],
    ["int", 5],
    ["datetime", "2021-06-03T16:48:26.710"],
    ["time", "13:37:00.000"]
  ]
}
# (format is not true to the jq-output for brevity)

# XLSX works the same
❯ xls2json -s src/test/resources/sample.xlsx
{"Sheet1":[["empty"],["String","hello"],["StringNumber","14.8"],["Int",1234],["bool",true],["bool",false],["float",23.12345],["datetime","2021-05-18T21:19:53.040"],["time","21:19:32.000"],["formulars"],["string","hello"],["float",-0.34678748622465627],["int",5],["datetime","2021-06-03T16:59:06.466"],["time","13:37:00.000"]]}

# works on multiple files as well
❯ xls2json -s src/test/resources/sample.xls*
{"Sheet1":[["empty"],["String","hello"],["StringNumber","14.8"],["Int",1234],["bool",true],["bool",false],["float",23.12345],["datetime","2021-05-18T21:19:53.040"],["time","21:19:32.000"],["formulars"],["string","hello"],["float",-0.34678748622465627],["int",5],["datetime","2021-06-03T16:59:06.466"],["time","13:37:00.000"]]}
{"Sheet1":[["empty"],["String","hello"],["StringNumber","14.8"],["Int",1234],["bool",true],["bool",false],["float",23.12345],["datetime","2021-05-18T21:19:53.040"],["time","21:19:32.000"],["formulars"],["string","hello"],["float",-0.34678748622465627],["int",5],["datetime","2021-06-03T16:47:22.789"],["time","13:37:00.000"]]}

# list the available tables
❯ xls2json -l src/test/resources/sampleTwoSheets.xls
["Sheet1","Sheet2"]

# only output some table(s)
❯ xls2json -t Sheet2 -s src/test/resources/sampleTwoSheets.xls | jq
{
  "Sheet2": [
    ["empty"],
    ["String", "hello"],
    ["StringNumber", "14.8"],
    ["Int", 1234],
    ["bool", true],
    ["bool", false],
    ["float", 23.12345],
    ["datetime", "2021-05-18T21:19:53.040"],
    ["time", "21:19:32.000"],
    ["formulars"],
    ["string", "hello"],
    ["float", -0.34678748622465627],
    ["int", 5],
    ["datetime", "2021-06-03T16:48:26.710"],
    ["time", "13:37:00.000"]
  ]
}

# if you are running the native-image executable, you can configure the
# garbage collector with the usual options
❯ xls2json -Xmx1k --help
Exception in thread "main" java.lang.OutOfMemoryError: Garbage-collected heap size exceeded.
# 1kb is a bit low

# if you are running with the installable distribution, you can set
# the environment variable XLS2JSON_OPTS
❯ XLS2JSON_OPTS="-Xmx1k" xls2json --help
Error occurred during initialization of VM
Too small maximum heap
# still 1kb is too small :D

Output JsonSchema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "minProperties": 1,
  "patternProperties": {
    ".*": {
      "type": "array",
      "items": {
        "type": "array",
        "items": {
          "type": ["number", "string", "integer", "boolean", "null"]
        }
      }
    }
  }
}

Installation

The Releases contain various formats for installation:

  • An executable fat jar (build with the shadowJar plugin). Put the jar wherever you like and call it like:

    ❯ java -jar path/to/xls2json-1.0.0-all.jar --help
  • An installable distribution as .zip / .tar.gz. Extract it, put it wherever you like and update your PATH, e.g.:

    ❯ tar xf xls2json-1.0.0.tar
    ❯ mv xls2json-1.0.0 "$HOME/.local/share"export PATH="$PATH:$HOME/.local/share/xls2json-1.0.0/bin"
    ❯ xls2json --help
  • Native executable build with native-image from GraalVM using the native-image plugin from mike_neck. Put the executable for your operating system into your path and call it like: xls2json --help.

    No JAVA installation needed.

Another Excel Reader?

There are already some great Excel readers out there, but most do not satisfy my requirements:

  • commandline program with (in the best case) no external dependencies
  • XLS (2003 and earlier) and XLSX (2007 and later) support
  • evaluate formulas within the spreadsheet
  • keep type information from Excel
  • open source
  • (fast and multi-platform)

The existing programs / libraries I found and why I think they do not satisfy my needs:

  • pyexcel-cli: can do much more in and output formats, but lacks formula evaluation. Same for the corresponding pyexcel library.
  • unoconv and libreoffice: can do much more in and output formats, can do formula evaluation, but requires a LibreOffice installation.
  • ssconvert: requires a gnumeric installation, I am not sure about the capabilities.
  • Spreadsheet::XLSX: no cli afaik, xlsx only. Also perl is not a language I am eager to learn.
  • PhpSpreadsheet: no cli afaik, but from the docs looks as if you can evaluate formulas. But again, PHP is not a language I am eager to learn.

I am already using Apache POI for quite some time and found it quite complete for my needs. We still have the Java VM dependency, but with GraalVM native-image we can build self-contained executables 🎉 .

Standing on the shoulders of giants

  • Apache POI: the Java API for Microsoft Documents
  • Picocli: a mighty tiny command line interface
  • gson: A Java serialization/deserialization library to convert Java Objects into JSON and back
  • Gradle Build Tool
  • kotlin: A modern programming language that makes developers happier.
  • GraalVM native-image: ahead-of-time compile Java code to a standalone executable

Some linkes I found invaluable for this project:

Building and Development

Prerequisits:

  • Make sure you have a Java (11+) SDK in your path.
  • If you want to build the native binary, use the GraalVM as your Java SDK and install native-image: gu install native-image

Building:

❯ git clone https://github.com/tammoippen/xls2json.git
❯ cd xls2json
# build all java related targets
# => build/distributions/xls2json-{version}.[tar|zip]
❯ ./gradlew build
# build the native executable for your system
# => build/executable/xls2json
❯ ./gradlew nativeImage
# build the fat jar only (already in build)
# => build/libs/xls2json-{version}-all.jar
❯ ./gradlew shadowJar
# tests and run
❯ ./gradlew test
❯ ./gradlew run --args="--help"

Issues with the native-image executables

Apache POI uses many resources and reflections when working with the excel files and the native-image executable needs to be configured during building to include those resources and reflections.

When issues arise reading an Excel file, please try to use the fat-jar or the installable distribution and run the Excel file with that. If it works, please open an issue with the failing Excel file (or a minimal reproducing Excel file). I will generate the configuration then. If it still does not work, please again open an issue including the Excel file, stack trace, xls2json and java version.

If you want to generate the configuration, consider running:

❯ ./gradlew shadowJar
❯ $(JAVA_HOME)/bin/java -agentlib:native-image-agent=config-merge-dir=native-image-config \
    -jar build/libs/xls2json-1.0.0-all.jar \
    the-problematic-excel-file.xls(x)
You might also like...
A Shopping cart library for Android that allows you add to add items to cart and retrieve at ease using JSONObjects.

Carteasy A Shopping cart library for Android that allows you add to add items to cart and retrieve at ease using JSONObjects. Quick Start Add the foll

An easy way to customize your log in Android,including output to console, writing log to file in high performance way and so on

EasyLog An easy way to customize your log in Android,including output to console, writing log to file in high performance way and so on. 1. Initializa

An android app which allows users to display the data of any excel sheet in rows and columns
An android app which allows users to display the data of any excel sheet in rows and columns

ExcelReader App description An android app which allows users to display the data of any excel sheet in rows and columns. Features Display data of an

API que converte uma planilha do excel para o banco de dados, utilizando Kotlin + Spring.

API Excel2DB 📈 Uma API que tem como objetivo facilitar a carga de dados nos formatos .xlsx ou .xls para um banco de dados. Nessa situação, como o foc

This project created just for help developer who want to and ability of read VISA, UNION PAY, HUMO, ATTO and some other cards data read.

If you enjoy my content, please consider supporting what I do. Thank you. By me a Coffee To get a Git project into your build: Step 1. Add the JitPack

Self hosted read and to-read list book tracker
Self hosted read and to-read list book tracker

JELU Official documentation Like Jelu or find it useful ? Offer me a coffee ☕ Purpose This app main purpose is to track what you have read, what you a

Another-read-more-lib - Another read more library for android

another-read-more-lib 📖 Another read more library. Add it in your root build.gr

A library that extends the existing JDBC API so that data objects can be used as input (to set parameters) and output (from ResultSet's rows).

SqlObjectMapper This is a library that extends the existing JDBC API so that data objects can be used as input (to set parameters) and output (from Re

This program will read from your android application string.xml file and generate translated strings.xml files in your preferred languages using google sheet.
This program will read from your android application string.xml file and generate translated strings.xml files in your preferred languages using google sheet.

Localize your application content This program will read from your application string.xml file and generate translated strings.xml files in your prefe

A simple utility to remove unused resources in your Android app to lower the size of the APK. It's based on the Android lint tool output.

android-resource-remover android-resource-remover is utility that removes unused resources reported by Android Lint from your project. The goal is to

Use an expanded notification as a makeshift Android log output console
Use an expanded notification as a makeshift Android log output console

NotificationLog Use an expanded notification as a makeshift Android log output console. https://vine.co/v/bFxhUAVjvrY The design goals of this project

An intelligent diff tool for the output of Gradle's dependencies task

Dependency Tree Diff An intelligent diff tool for the output of Gradle's dependencies task which always shows the path to the root dependency. +--- c

Generate inspections from output of executables

tool-to-inspection Template ToDo list Create a new IntelliJ Platform Plugin Template project. Get known with the template documentation. Verify the pl

🚀Plugin for Android Studio And IntelliJ Idea to generate Kotlin data class code from JSON text ( Json to Kotlin )
🚀Plugin for Android Studio And IntelliJ Idea to generate Kotlin data class code from JSON text ( Json to Kotlin )

JsonToKotlinClass Hi, Welcome! This is a plugin to generate Kotlin data class from JSON string, in another word, a plugin that converts JSON string to

MMDUtils is a library for read/write mmd related file in java

MMDUtils MMDUtils is a library for read/write mmd related file in java Features Read/Write VMD(Vocaloid Motion Data) file Read/Write PMX(Polygon Model

Android Material Json Form Wizard is a library for creating beautiful form based wizards within your app just by defining json in a particular format.
Android Material Json Form Wizard is a library for creating beautiful form based wizards within your app just by defining json in a particular format.

Android Json Wizard Android Json Wizard is a library for creating beautiful form based wizards within your app just by defining json in a particular f

 Images grid JSON | Сетка изображений JSON
Images grid JSON | Сетка изображений JSON

Images grid JSON | Сетка изображений JSON Задача Разработать приложение: Приложение должно получать JSON-список ссылок на изображения с сервера по адр

Dynamic-UI-From-JSON - A Sample Android app to show dynamic UI generation from Json
Dynamic-UI-From-JSON - A Sample Android app to show dynamic UI generation from Json

Dynamic UI from JSON Functionality The app's functionality includes: The app gen

Pulls notifications from a remote JSON file and shows them in your app.
Pulls notifications from a remote JSON file and shows them in your app.

android-remote-notifications A Google GCM/Amazon SNS alternative using pull instead of push. Main features Independent user notifications (no Google G

Releases(v1.2.2)
Owner
Tammo Ippen
Tammo Ippen
A Java serialization/deserialization library to convert Java Objects into JSON and back

Gson Gson is a Java library that can be used to convert Java Objects into their JSON representation. It can also be used to convert a JSON string to a

Google 21.7k Jan 5, 2023
A modern JSON library for Kotlin and Java.

Moshi Moshi is a modern JSON library for Android and Java. It makes it easy to parse JSON into Java objects: String json = ...; Moshi moshi = new Mos

Square 8.7k Jan 2, 2023
Simple CLI app to convert XML retrieved from a configurable URI to JSON and back

XmlToJsonUtility Simple CLI app written in Kotlin (1.5.31) on Java 11, using Spring Boot. Queries a URI (default) as an XML source. Attempts to valida

Darius Washington 2 Oct 20, 2021
Manager of progress using Lottie JSON, compatible for Java and Kotlin.

Progress Lottie IGB Manager of progress using Lottie JSON, compatible for Java and Kotlin. Int Top In Bottom Important Info: To create a raw folder: R

Isaac G. Banda 21 Sep 16, 2022
Customizable JSON Schema-based forms for Kotlin and Compose

Kotlin JSON Forms Customizable JSON Schema-based forms for Kotlin and Compose This project aims to reimplement JSON Forms in Kotlin for use in Compose

Copper Leaf 3 Oct 1, 2022
Fast JSON parser for java projects

ig-json-parser Fast JSON parser for java projects. Getting started The easiest way to get started is to look at maven-example. For more comprehensive

Instagram 1.3k Dec 29, 2022
A lightweight Kotlin DSL to render DSL style Json to String.

A lightweight Kotlin DSL to render DSL style Json to String.

null 4 May 5, 2021
Modern JSON processor with readable Kotlin syntax.

Kq Modern cross-platform JSON processor with readable Kotlin syntax. cat ~/Desktop/bdb.ndjson | kq '.filter{it.bool("muted")}.sortedBy{it.long("size")

Daniel Demidko 6 Jan 25, 2022
ktlint JSON Lines reporter

ktlint JSON Lines reporter Usage Download the jar and run: ktlint --reporter=jsonlines,artifact=ktlint-jsonlines-reporter.jar Download Either downloa

Anton Musichin 1 Nov 24, 2021
Generate a JSON bookmarks document from a GitHub user

Github to bookmarks This little webapp will generate a JSON bookmarks document from a GitHub user. This is intended to be used with bbt. An instance i

Benoit Lubek 2 Nov 8, 2021