The goal of this project was to create a tool that lets users analyze data in various ways by applying machine learning clustering and other methods to specific datasets. Target users range from people who have some knowledge of clustering techniques but do not code to ML engineers. You can input your data, analyze it, and output a file with insights.
Project Details
To use the web application, users follow an established workflow. First, the user uploads an input dataset in CSV or XLSX format. Then the so-called record name column is selected; it is treated as the unique identifier column. After that, users can normalize the data and select the X, Y and Z lenses, i.e. three dimensions for further analysis. Next, the user chooses one of three available clustering algorithms (DBSCAN, MeanShift or MST) and sets its hyperparameters. The output is an interactive, D3-powered network graph. https://observablehq.com/@d3/force-directed-graph
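The normalization, lens selection and clustering steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the tool's actual code: it uses min-max scaling as the normalization, a single-linkage reading of MST clustering (cut every spanning-tree edge longer than a threshold), and hypothetical names throughout (`min_max_normalize`, `mst_clusters`, `cut_distance`, and the column names `height`, `width`, `depth`).

```python
import math
from itertools import combinations


def min_max_normalize(values):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def mst_clusters(points, cut_distance):
    """Label points by cutting MST edges longer than cut_distance.

    Kruskal's algorithm adds edges in increasing length order, so stopping
    at the first edge longer than cut_distance yields the same components
    as building the full tree and then removing its long edges.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[a], points[b]), a, b)
        for a, b in combinations(range(len(points)), 2)
    )
    for length, a, b in edges:
        if length > cut_distance:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Relabel roots as consecutive cluster ids in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(points))]


# Hypothetical rows: a record name column plus three numeric columns.
rows = [
    {"name": "a", "height": 1.0, "width": 10.0, "depth": 5.0},
    {"name": "b", "height": 1.2, "width": 11.0, "depth": 5.5},
    {"name": "c", "height": 9.0, "width": 90.0, "depth": 50.0},
    {"name": "d", "height": 9.5, "width": 95.0, "depth": 48.0},
]
lenses = ("height", "width", "depth")  # the X, Y and Z lenses

columns = [min_max_normalize([r[k] for r in rows]) for k in lenses]
points = list(zip(*columns))
labels = mst_clusters(points, cut_distance=0.5)
print(dict(zip((r["name"] for r in rows), labels)))
```

Here records `a` and `b` end up in one cluster and `c` and `d` in another. The all-pairs edge list is quadratic in the number of records, which is acceptable for a sketch but would likely need a spatial index or a library implementation on larger datasets.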
Finally, users can select the nodes that interest them, which are then colored based on the chosen dimensions. The user then picks CSV or Excel as the output format; the exported file contains new columns that indicate cluster ids and other useful information for future analysis.
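As a rough illustration of the export step, the sketch below appends two columns to each record and writes CSV text using only the standard library. The column names `cluster_id` and `selected`, the function name `export_with_clusters` and the sample rows are assumptions made for this example; the actual tool may add different columns.

```python
import csv
import io


def export_with_clusters(rows, labels, selected_names, name_column="name"):
    """Return CSV text with `cluster_id` and `selected` columns appended."""
    if not rows:
        raise ValueError("rows must not be empty")
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")
    fieldnames = list(rows[0].keys()) + ["cluster_id", "selected"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row, label in zip(rows, labels):
        writer.writerow({
            **row,
            "cluster_id": label,
            # True when the user picked this record's node in the graph.
            "selected": row[name_column] in selected_names,
        })
    return buffer.getvalue()


# Hypothetical records, cluster labels and node selection.
rows = [
    {"name": "a", "height": 1.0},
    {"name": "b", "height": 1.2},
    {"name": "c", "height": 9.0},
]
csv_text = export_with_clusters(rows, labels=[0, 0, 1], selected_names={"a", "c"})
print(csv_text)
```

Keying the selection on the record name column relies on that column being unique, as the workflow assumes.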