Training a dense neural network using both KotlinDL and Python3 on the mnist dataset.

Overview

KotlinDL-mnist

KotlinDL: Deep Learning Framework written in Kotlin

This project trains a dense neural network using both KotlinDL and Python3 on the mnist dataset.

The test result is intriguing: KotlinDL on a i5 CPU (without AVX) is 3x faster than Python on a RTX2080 GPU.

Quick-Start

Kotkin

  1. Import the project into Intellij Idea IDE

  2. Run main.kt or build a JAR file:

$ ./gradlew shadowJar 
$ java -jar build/libs/hello-kotlindl-1.0-SNAPSHOT-all.jar 

Outputs:

Extracting 60000 images of 28x28 from /home/wuhanstudio/kotlndl-mnist/cache/datasets/mnist/train-images-idx3-ubyte.gz
Extracting 60000 labels from /home/wuhanstudio/kotlndl-mnist/cache/datasets/mnist/train-labels-idx1-ubyte.gz
Extracting 10000 images of 28x28 from /home/wuhanstudio/kotlndl-mnist/cache/datasets/mnist/t10k-images-idx3-ubyte.gz
Extracting 10000 labels from /home/wuhanstudio/kotlndl-mnist/cache/datasets/mnist/t10k-labels-idx1-ubyte.gz
2021-10-18 11:36:13.974410: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-10-18 11:36:13.991707: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1190500000 Hz
2021-10-18 11:36:13.992207: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f1d7d234db0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-10-18 11:36:13.992244: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - ===========================================================================
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - Model: Sequential
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - ___________________________________________________________________________
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - Layer (type)                           Output Shape              Param #   
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - ===========================================================================
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - input_1(Input)                         [None, 28, 28, 1]         0
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - ___________________________________________________________________________
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - flatten_2(Flatten)                     [None, 784]               0
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - ___________________________________________________________________________
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - dense_3(Dense)                         [None, 256]               200960
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - ___________________________________________________________________________
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - dense_4(Dense)                         [None, 128]               32896
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - ___________________________________________________________________________
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - dense_5(Dense)                         [None, 10]                1290
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - ___________________________________________________________________________
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - ===========================================================================
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - Total trainable params: 235146
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - Total frozen params: 0
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - Total params: 235146
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - ===========================================================================
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - epochs: 1 loss: 1.5531934 metric: 0.91681665
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - epochs: 2 loss: 1.5114808 metric: 0.95426667
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - epochs: 3 loss: 1.5006521 metric: 0.96405
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - epochs: 4 loss: 1.4932657 metric: 0.97105
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - epochs: 5 loss: 1.4879782 metric: 0.97581667
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - epochs: 6 loss: 1.4843937 metric: 0.97915
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - epochs: 7 loss: 1.4815989 metric: 0.98175
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - epochs: 8 loss: 1.479125 metric: 0.98373336
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - epochs: 9 loss: 1.4773757 metric: 0.9853
[main] INFO org.jetbrains.kotlinx.dl.api.core.GraphTrainableModel - epochs: 10 loss: 1.4757954 metric: 0.9867333
EvaluationResult(lossValue=1.4855552911758423, metrics={ACCURACY=0.97607421875})

Python

$ python python/mnist.py

Outputs:

2021-10-18 10:58:21.579529: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-10-18 10:58:21.585691: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:1a:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2021-10-18 10:58:21.586119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 1 with properties: 
pciBusID: 0000:68:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2021-10-18 10:58:21.586272: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-10-18 10:58:21.587325: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-10-18 10:58:21.588445: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-10-18 10:58:21.588666: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-10-18 10:58:21.589871: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-10-18 10:58:21.590520: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-10-18 10:58:21.593104: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-10-18 10:58:21.594470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0, 1
2021-10-18 10:58:21.594647: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-10-18 10:58:21.599731: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3699850000 Hz
2021-10-18 10:58:21.600265: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fb000000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-10-18 10:58:21.600279: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-10-18 10:58:21.895407: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56097c3007e0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-10-18 10:58:21.895430: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
2021-10-18 10:58:21.895435: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (1): GeForce RTX 2080 Ti, Compute Capability 7.5
2021-10-18 10:58:21.895962: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:1a:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2021-10-18 10:58:21.896369: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 1 with properties: 
pciBusID: 0000:68:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2021-10-18 10:58:21.896406: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-10-18 10:58:21.896415: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-10-18 10:58:21.896426: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-10-18 10:58:21.896436: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-10-18 10:58:21.896444: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-10-18 10:58:21.896452: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-10-18 10:58:21.896460: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-10-18 10:58:21.897642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0, 1
2021-10-18 10:58:21.897671: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-10-18 10:58:21.898686: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-18 10:58:21.898696: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 1 
2021-10-18 10:58:21.898702: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N N 
2021-10-18 10:58:21.898706: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 1:   N N 
2021-10-18 10:58:21.899649: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 148 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:1a:00.0, compute capability: 7.5)
2021-10-18 10:58:21.900536: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10184 MB memory) -> physical GPU (device: 1, name: GeForce RTX 2080 Ti, pci bus id: 0000:68:00.0, compute capability: 7.5)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten (Flatten)            (None, 784)               0         
_________________________________________________________________
dense (Dense)                (None, 256)               200960    
_________________________________________________________________
dense_1 (Dense)              (None, 128)               32896     
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1290      
=================================================================
Total params: 235,146
Trainable params: 235,146
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
2021-10-18 10:58:22.667553: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
1875/1875 [==============================] - 6s 3ms/step - loss: 1.5839 - accuracy: 0.8797 - val_loss: 1.5560 - val_accuracy: 0.9050
Epoch 2/10
1875/1875 [==============================] - 6s 3ms/step - loss: 1.5712 - accuracy: 0.8896 - val_loss: 1.5731 - val_accuracy: 0.8872
Epoch 3/10
1875/1875 [==============================] - 6s 3ms/step - loss: 1.5696 - accuracy: 0.8910 - val_loss: 1.5977 - val_accuracy: 0.8635
Epoch 4/10
1875/1875 [==============================] - 6s 3ms/step - loss: 1.5702 - accuracy: 0.8905 - val_loss: 1.5692 - val_accuracy: 0.8915
Epoch 5/10
1875/1875 [==============================] - 6s 3ms/step - loss: 1.5738 - accuracy: 0.8872 - val_loss: 1.5686 - val_accuracy: 0.8921
Epoch 6/10
1875/1875 [==============================] - 6s 3ms/step - loss: 1.5667 - accuracy: 0.8943 - val_loss: 1.5547 - val_accuracy: 0.9061
Epoch 7/10
1875/1875 [==============================] - 5s 3ms/step - loss: 1.5681 - accuracy: 0.8929 - val_loss: 1.5700 - val_accuracy: 0.8907
Epoch 8/10
1875/1875 [==============================] - 6s 3ms/step - loss: 1.5694 - accuracy: 0.8915 - val_loss: 1.5717 - val_accuracy: 0.8891
Epoch 9/10
1875/1875 [==============================] - 5s 3ms/step - loss: 1.5708 - accuracy: 0.8902 - val_loss: 1.6057 - val_accuracy: 0.8553
Epoch 10/10
1875/1875 [==============================] - 6s 3ms/step - loss: 1.5722 - accuracy: 0.8887 - val_loss: 1.5660 - val_accuracy: 0.8951
313/313 - 1s - loss: 1.5660 - accuracy: 0.8951
0.8950999975204468
You might also like...
Kotlin microservices with REST, and gRPC using BFF pattern. This repository contains backend services. Everything is dockerized and ready to
Kotlin microservices with REST, and gRPC using BFF pattern. This repository contains backend services. Everything is dockerized and ready to "Go" actually "Kotlin" :-)

Microservices Kotlin gRPC Deployed in EC2, Check it out! This repo contains microservices written in Kotlin with BFF pattern for performing CRUD opera

Clean MVVM with eliminating the usage of context from view models by introducing hilt for DI and sealed classes for displaying Errors in views using shared flows (one time event), and Stateflow for data

Clean ViewModel with Sealed Classes Following are the purposes of this repo Showing how you can remove the need of context in ViewModels. I. By using

Microservices-demo - Microservices demo project using Spring, Kotlin, RabbitMQ, PostgreSQL and Gradle and deployed to Azure Kubernetes
A POC for spring app using testng, cucumber, findbugs, and jacoco framework with failsafe and surefire plugins.

A POC for spring app using testng, cucumber, findbugs, and jacoco framework with failsafe and surefire plugins.

Login-and-Signup - Simple Login-and-Signup with authentication using Firebase API
Login-and-Signup - Simple Login-and-Signup with authentication using Firebase API

Simple Login-and-Signup with authentication using Firebase API. Log in Sign Up

đź“š  Sample Android Components Architecture on a modular word focused on the scalability, testability and maintainability written in Kotlin, following best practices using Jetpack.
đź“š Sample Android Components Architecture on a modular word focused on the scalability, testability and maintainability written in Kotlin, following best practices using Jetpack.

Android Components Architecture in a Modular Word Android Components Architecture in a Modular Word is a sample project that presents modern, 2020 app

Integration Testing Kotlin Multiplatform Kata for Kotlin Developers. The main goal is to practice integration testing using Ktor and Ktor Client Mock
Integration Testing Kotlin Multiplatform Kata for Kotlin Developers. The main goal is to practice integration testing using Ktor and Ktor Client Mock

This kata is a Kotlin multiplatform version of the kata KataTODOApiClientKotlin of Karumi. We are here to practice integration testing using HTTP stub

:cyclone: A Pokedex app using ViewModel, LiveData, Room and Navigation
:cyclone: A Pokedex app using ViewModel, LiveData, Room and Navigation

Pokedex app built with Kotlin Download Go to the releases page to download the latest available apk. Screenshots Development Roadmap Kotlin LiveData N

A clean architecture example. Using Kotlin Flow, Retrofit and Dagger Hilt, etc.
A clean architecture example. Using Kotlin Flow, Retrofit and Dagger Hilt, etc.

android-clean-architecture A clean architecture example. Using Kotlin Flow, Retrofit and Dagger Hilt, etc. Intro Architecture means the overall design

Owner
Wu Han
Ph.D. Student at the University of Exeter in the U.K. for Autonomous System Security. Prior research experience at RT-Thread, LAIX, Xilinx.
Wu Han
A counter down timer for android which supports both dark and light mode and Persian text and digit.

FlipTimerView A counter down timer for android which supports both dark and light mode and Persian text and digit. English Perisan Getting started Ste

Arezoo Nazer 7 Jul 17, 2022
🔥The Android Startup library provides a straightforward, performant way to initialize components at the application startup. Both library developers and app developers can use Android Startup to streamline startup sequences and explicitly set the order of initialization.

??The Android Startup library provides a straightforward, performant way to initialize components at the application startup. Both library developers and app developers can use Android Startup to streamline startup sequences and explicitly set the order of initialization.

Rouse 1.3k Dec 30, 2022
🍓CookHelper - food social network. The Api and Websocket are based on Ktor framework. Dependency injection with Koin library.

CookHelper [ ?? Work in Progress ?? ] CookHelper is a cross-platform application that will allow you to cook a delicious dish from an existing recipe

Arthur 11 Nov 9, 2022
Simple Kotlin application that displays the currently available network interfaces on your machine

Network-Interface-Checker Simple Kotlin application that displays the currently available network interfaces on your machine. An executable jar can be

Joshua Soberg 3 Jun 10, 2022
KTor-Client---Android - The essence of KTor Client for network calls

KTor Client - Android This project encompasses the essence of KTor Client for ne

Mansoor Nisar 2 Jan 18, 2022
This is a sample app to demonstrate the power of using EventSourced models and the ease with which these can be modelled using Kotlin.

Lego 4 Rent This is a sample app to demonstrate the power of using EventSourced models and the ease with which these can be modelled using Kotlin. To

Nico Krijnen 4 Jul 28, 2022
Screencast using Minecraft blocks using Minestom

BlockScreen ??️ Usage Note: This can only be used locally, servers generally don't have capturable screens First, download the latest jar, then to sta

emortal 3 May 4, 2022
A complete Kotlin application built to demonstrate the use of Modern development tools with best practices implementation using multi-module architecture developed using SOLID principles

This repository serves as template and demo for building android applications for scale. It is suited for large teams where individuals can work independently on feature wise and layer wise reducing the dependency on each other.

Devrath 11 Oct 21, 2022
This Kotlin Multiplatform library is for accessing the TMDB API to get movie and TV show content. Using for Android, iOS, and JS projects.

Website | Forum | Documentation | TMDb 3 API Get movie and TV show content from TMDb in a fast and simple way. TMDb API This library gives access to T

Moviebase 37 Dec 29, 2022