Commit Graph

7 Commits (9f3813a83abe26d9a0da9f3402ae0a233b100597)

Author SHA1 Message Date
Timo Bryant cc727c681a adding core api 2023-12-27 16:11:12 +01:00
Timo Bryant 1ef987f611 rework build logic 2023-12-22 00:18:53 +01:00
Timo Bryant 4cafac4583 refactoring into parallelUnordered method 2023-12-18 22:55:29 +01:00
Timo Bryant d995b26459 rewriting IDF stuff 2023-12-17 17:46:51 +01:00
Timo Bryant ca51b50306 Refactor code and add functionality for term frequency calculation
The major changes in this commit are code refactoring and new functionality to calculate term frequency (TF). TF is now computed as a separate step from the TF-IDF calculation, which improves the modularity and maintainability of the code. Additionally, an unnecessary test file (MessageUtilsTest.kt) has been removed, and various dependencies have been updated or removed as needed. A few changes were also made to improve the readability and usability of the code.
2023-12-15 21:14:36 +01:00
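
The commit above separates the TF step from the TF-IDF calculation. A minimal sketch of what that separation could look like follows. All function names here are hypothetical and not taken from the repository, and the IDF variant used (natural log of document count over document frequency, no smoothing) is an assumption; the commit does not say which formula the library uses.

```kotlin
import kotlin.math.ln

// Hypothetical names throughout; this is an illustration, not the repository's code.

// Step 1: term frequency of each word within a single document.
fun termFrequency(words: List<String>): Map<String, Double> {
    if (words.isEmpty()) return emptyMap()
    val total = words.size.toDouble()
    return words.groupingBy { it }.eachCount()
        .mapValues { (_, count) -> count / total }
}

// Step 2: inverse document frequency across the corpus.
// Assumed variant: ln(N / df), where df is the number of documents containing the term.
fun inverseDocumentFrequency(documents: List<List<String>>): Map<String, Double> {
    val n = documents.size.toDouble()
    val documentCounts = documents.flatMap { it.toSet() }.groupingBy { it }.eachCount()
    return documentCounts.mapValues { (_, df) -> ln(n / df) }
}

// Step 3: TF-IDF reuses the two independent steps above.
fun tfIdf(documents: List<List<String>>): List<Map<String, Double>> {
    val idf = inverseDocumentFrequency(documents)
    return documents.map { doc ->
        termFrequency(doc).mapValues { (term, tf) -> tf * idf.getValue(term) }
    }
}
```

With TF as its own function, it can be tested or reused without building a corpus. For example, calling `termFrequency(listOf("kotlin", "build", "kotlin"))` needs no other documents, while `tfIdf` takes the whole list of tokenized documents.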
Timo Bryant 67d65cee93 Add text processing and tfidf libraries
This commit introduces two new libraries: textprocessing and tfidf. The textprocessing library provides classes to read words from a text file, generate a histogram from the words, and store the histogram in a CSV file. The tfidf library adds support for term frequency–inverse document frequency (tf-idf) computation using the functionality provided by the textprocessing library.
2023-12-15 17:17:27 +01:00
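
The textprocessing pipeline described in the commit above (read words, build a histogram, store it as CSV) might be sketched as follows. The function names, the tokenization rule, and the CSV column layout are all assumptions made for illustration; the commit message does not specify them.

```kotlin
import java.io.File

// Hypothetical helpers; the actual classes in the textprocessing library may differ.

// Read a text file and split it into lowercase words.
// Assumed tokenization: any run of characters that are not letters or digits separates words.
fun readWords(file: File): List<String> =
    file.readText()
        .lowercase()
        .split(Regex("[^\\p{L}\\p{Nd}]+"))
        .filter { it.isNotEmpty() }

// Count how often each word occurs.
fun histogram(words: List<String>): Map<String, Int> =
    words.groupingBy { it }.eachCount()

// Write the histogram as CSV, most frequent words first, ties broken alphabetically.
// Words contain only letters and digits here, so no CSV quoting is needed.
fun writeHistogramCsv(histogram: Map<String, Int>, target: File) {
    val rows = histogram.entries
        .sortedWith(compareByDescending<Map.Entry<String, Int>> { it.value }.thenBy { it.key })
        .joinToString("\n") { "${it.key},${it.value}" }
    target.writeText("word,count\n$rows\n")
}
```

A caller would chain the three steps, for instance `writeHistogramCsv(histogram(readWords(File("input.txt"))), File("histogram.csv"))`, where both file names are placeholders.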
Timo Bryant 1259dc8764 Add build.gradle.kts file in tfidf library
Added a new build.gradle.kts file in the tfidf library. This file includes the "docthor.kotlin-library-conventions" plugin. This is the initial setup for the build configuration of the tfidf library.
2023-12-15 15:40:24 +01:00
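
For orientation, a build.gradle.kts that only applies the convention plugin named in the commit above would plausibly look like the fragment below. The plugin id comes from the commit message; where the plugin itself is defined (for example in `buildSrc` or an included build) is not stated in the log and is assumed here.

```kotlin
// Sketch of a minimal build.gradle.kts for the tfidf library.
// Assumes "docthor.kotlin-library-conventions" is a precompiled convention plugin
// provided elsewhere in the same build.
plugins {
    id("docthor.kotlin-library-conventions")
}
```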