Clustering & k-NN Analysis Results — HKTVMall Contact Lens Reviews

Author: Composer (Cursor AI Agent)
Date: 2026-06-26
Script: cluster_reviews.py
Outputs: cluster_output/review_clusters.parquet, cluster_output/cluster_report.json


Sources

# Source Role
1 cl_reviews.parquet Base reviews (deduped on review_id, keep last → 16,835 rows)
2 6091e_cl_reviews_model_enriched.parquet LLM dimensions: sentiment, score, topics, complaints, praise
3 initial_assessment_analytics.md Prior exploratory analytics informing feature design

Methodology

Feature engineering

Each review is encoded as a fixed-length vector, StandardScaler-normalized:

Block Pass A (26 features) Pass B (20 features)
Topics 14 canonical multi-hot flags 14 canonical multi-hot flags
Sentiment 5 one-hot + ordinal score Excluded
Behavior ★ rating, LLM m_score, log(comment len), log(#complaints), log(#praise), #topics Same

Canonical topics: comfort, quality, price, packaging, delivery, repurchase, prescription, brand, appearance, customer_service, eye_health, expiry, fit_sizing, other

Algorithms

Method Purpose
KMeans Primary segmentation; k selected by silhouette score (k = 4…12)
k-NN coherence (k=15) Validation — fraction of 15 nearest neighbors sharing same cluster label
HDBSCAN Sanity check on Pass A vectors (density-based; reported separately)

Two-pass design

Pass Name Features Answers
A Full VoC Topics + sentiment + behavior How does the customer feel and express it?
B Topic archetypes Topics + behavior only What experience path are they on, regardless of sentiment label?

Pass A — Full VoC Segmentation

Best k: 5 (silhouette 0.469)
k-NN coherence (k=15): mean 0.997, median 1.000, 99.5% of reviews have ≥80% same-cluster neighbors

k-selection table

k Silhouette Calinski-Harabasz Davies-Bouldin
4 0.4658 8514 1.191
5 0.4691 7532 0.960
6 0.4126 6990 1.327
7 0.4163 6994 1.142
8 0.4255 6678 1.159
9 0.4036 6112 1.403
10 0.3646 5839 1.283
11 0.3560 5356 1.527
12 0.3967 5093 1.534

Cluster profiles

Cluster n Share Avg ★ Avg LLM score Avg comment len Archetype
0 6,759 40.1% 4.84 4.82 5.3 Silent promoters
1 7,054 41.9% 4.82 4.82 12.9 Articulate advocates
2 1,751 10.4% 1.78 1.76 45.9 Vocal detractors
3 1,263 7.5% 3.62 3.63 31.1 Ambivalent critics
4 8 0.05% 3.75 0.0 60.2 LLM parse failures (ignore)

Sentiment mix by cluster

Cluster Sentiment breakdown
0 positive 87%, neutral 7%, mixed 4%, negative 2%
1 positive 100%
2 negative 99%, neutral 1%
3 mixed 98%, neutral 2%
4 unknown 100%

Top topics by cluster

Cluster Top 5 topics
0 eye_health, fit_sizing, comfort, expiry, brand
1 comfort, fit_sizing, appearance, quality, packaging
2 fit_sizing, eye_health, quality, packaging, comfort
3 fit_sizing, comfort, quality, appearance, packaging

Top brands by cluster

Cluster Top brands (review count)
0 ACUVUE (2,158), 博士倫 (1,357), Bausch + Lomb (828)
1 ACUVUE (2,126), 博士倫 (1,231), OLENS (949)
2 ACUVUE (541), 博士倫 (360), OLENS (236)
3 ACUVUE (349), 博士倫 (230), OLENS (220)

Centroid signatures (top z-scored features)

Cluster Defining features (z-score)
0 log_n_praise (+1.19), topic_other (+1.17), topic_prescription (+1.12), topic_expiry (+1.11), topic_brand (+1.06)
1 topic_expiry (−0.82), topic_other (−0.81), log_n_praise (−0.81), topic_eye_health (−0.78), topic_prescription (−0.77)
2 sent_negative (+2.80), n_topics (−2.29), sentiment_ord (−2.28), rating (−2.27), sent_positive (−1.81)
3 sent_mixed (+3.12), sent_positive (−1.81), log_comment_len (+1.40), n_topics (−1.36), m_score (+0.92)
4 sent_unknown (+45.86) — artifact

Prototype reviews (nearest to centroid)

Cluster 0 — Silent promoters

Brand Comment
博士倫 5 GOOD
博士倫 5 回購。扺

Cluster 1 — Articulate advocates

Brand Comment
OLENS 5 唔乾,超舒服
Candymagic 5 透氣度高,非常舒服

Cluster 2 — Vocal detractors

Brand Comment
ACUVUE 2 一盒Con裏面有一半都係吉眼嘅
Bausch + Lomb 1 買左兩盒 有一盒戴左上去矇茶茶 已經換左幾隻戴都是一樣

Cluster 3 — Ambivalent critics

Brand Comment
博士倫 4 送既隨機顏色隱形眼鏡係無色,唔係大眼仔
OLENS 4 幾靚,但未算好舒服戴得

Pass A interpretation

The market splits into ~82% promoters (clusters 0 + 1) with two behavioral modes:

  • Silent (40%) — ultra-short 5★ reviews, corrective-lens topic bias
  • Articulate (42%) — comfort/appearance-led praise, OLENS-heavy

~10% hard detractors (cluster 2) write long fit + eye_health complaints at ★1.8.
~8% ambivalent (cluster 3) are structurally mixed — trade-off language at ★3.6.

Clusters 0 and 1 share high stars but differ in verbosity and topic emphasis, not sentiment.


Pass B — Topic Archetypes (no sentiment features)

Best k: 4 (silhouette 0.414)
k-NN coherence (k=15): mean 0.973, median 1.000, 95.6% with ≥80% same-cluster neighbors

k-selection table

k Silhouette Calinski-Harabasz Davies-Bouldin
4 0.4139 11104 1.556
5 0.4059 9055 1.798
6 0.3190 7858 1.666
7 0.3056 6847 1.934
8 0.3014 6230 1.849
9 0.2944 5623 2.012
10 0.2880 5232 1.996
11 0.3179 5009 1.862
12 0.3206 4721 1.825

Cluster profiles

Cluster n Share Avg ★ Avg LLM score Avg comment len Archetype
1 6,785 40.3% 4.82 4.81 5.5 Corrective lens buyers
0 4,502 26.7% 4.81 4.81 14.4 Value loyalists
3 3,486 20.7% 4.73 4.74 13.4 Cosmetic enthusiasts
2 2,062 12.2% 1.70 1.69 47.7 Failure bundle

Sentiment mix by cluster (within topic archetypes)

Cluster Sentiment breakdown
1 (corrective) positive 86%, neutral 7%, mixed 4%, negative 2%
0 (value) positive 87%, mixed 9%, negative 3%, neutral 1%
3 (cosmetic) positive 89%, mixed 10%, negative 1%
2 (failure) negative 74%, mixed 23%, positive 2%

Top topics by cluster

Cluster Top 5 topics
1 (corrective) fit_sizing, eye_health, expiry, delivery, packaging
0 (value) packaging, price, fit_sizing, delivery, repurchase
3 (cosmetic) comfort, appearance, quality, fit_sizing, customer_service
2 (failure) fit_sizing, packaging, quality, eye_health, comfort

Top brands by cluster

Cluster Top brands (review count)
1 (corrective) ACUVUE (2,169), 博士倫 (1,361), Bausch + Lomb (828)
0 (value) ACUVUE (1,507), 博士倫 (895), Bausch + Lomb (523)
3 (cosmetic) ACUVUE (863), OLENS (625), 博士倫 (511)
2 (failure) ACUVUE (640), 博士倫 (412), OLENS (296)

Centroid signatures

Cluster Defining features (z-score)
0 (value) topic_comfort (−1.14), topic_quality (−1.02), topic_appearance (−1.00), n_topics (−0.95) — defined by absence of cosmetic topics
1 (corrective) n_topics (+1.19), topic_other (+1.17), topic_prescription (+1.12), topic_expiry (+1.11), topic_brand (+1.06)
2 (failure) rating (−2.34), m_score (−2.33), log_n_complaints (+1.64), log_n_praise (−1.43), log_comment_len (+1.32)
3 (cosmetic) topic_packaging (−1.02), topic_price (−0.92), topic_delivery (−0.92), topic_repurchase (−0.84) — defined by absence of ops/value topics

Prototype reviews

Cluster 0 — Value loyalists

Brand Comment
Bausch + Lomb 5 訂購方便
Johnson & Johnson 5 買一盒都有贈品,抵!

Cluster 1 — Corrective lens buyers

Brand Comment
博士倫 5 抵,回購
博士倫 5 完好無缺

Cluster 2 — Failure bundle

Brand Comment
TICON 1 買了兩盒,7.00度的是否也一起退回?
Delight 2 為什麼二盒打開不一樣的?跟新包裝也不一樣

Cluster 3 — Cosmetic enthusiasts

Brand Comment
CooperVision 5 無得彈,非常滿意
Candymagic 5 平靚正 日常使用

Pass B interpretation

Three distinct happy paths converge on ~4.8★ for different reasons:

  1. Corrective (40%) — fit, prescription, expiry, brand trust
  2. Value (27%) — price, repurchase, packaging, delivery
  3. Cosmetic (21%) — comfort, appearance, quality; OLENS over-indexed

Failure bundle (12%) is topic-structural: fit + packaging + delivery co-occur with low ratings and high complaint volume. It captures 74% negative AND 23% mixed — the mixed reviews here are ops/fit problems expressed ambivalently, not price trade-offs.


Cross-tab: Pass A (tone) × Pass B (experience)

Rows = kmeans_cluster (Pass A), columns = topic_cluster (Pass B):

Pass A ↓ / Pass B → B1 Corrective B2 Failure B3 Cosmetic B0 Value
A0 Silent promo 6,750 1 8 0
A1 Articulate adv 0 34 3,079 3,941
A2 Detractors 26 1,531 43 151
A3 Ambivalent 9 488 356 410
A4 Parse fail 0 8 0 0

Cross-tab reads

Pattern Finding
Silent promoters (A0) 99.9% corrective archetype — terse repurchase/fit reviews
Articulate advocates (A1) Split 56% value / 44% cosmetic — two praise vocabularies
Detractors (A2) 87% failure bundle (1,531 / 1,751)
Ambivalent (A3) Spread: 39% failure, 33% value, 28% cosmetic — true trade-off reviews

k-NN validation summary

Metric Pass A (full VoC) Pass B (topic archetypes)
Neighbors (k) 15 15
Mean same-cluster share 0.997 0.973
Median 1.000 1.000
Reviews with ≥80% coherent neighbors 99.5% 95.6%

Pass A coherence is near-perfect because sentiment one-hot features dominate separation. Pass B's slightly lower coherence (97.3%) reflects softer boundaries between value and cosmetic happy paths — expected and healthy.


HDBSCAN sanity check (Pass A vectors)

Metric Value
Clusters found 22
Noise points 3,088 (18.3%)
Silhouette (excl. noise) 0.454

HDBSCAN over-segments for executive use. KMeans with k=4–5 yields interpretable archetypes; HDBSCAN useful for micro-SKU anomaly detection later.


Strategic actions by segment

Segment Pass A Pass B Recommended action
Silent promoters A0 B1 Low-touch retention; repurchase nudges
Articulate advocates A1 B0 or B3 Loyalty programs (value) or cosmetic upsell (OLENS)
Vocal detractors A2 B2 Fit guides, QC on packaging/expiry, eye health response
Ambivalent critics A3 B2/B0/B3 Price-match policy; expectation setting on color lenses
Failure bundle any B2 Cross-functional ops — delivery, packaging, fit, returns

Reproducing

cd /home/sidmishra/ontologer/subprojects/review_sequencing_sentiment
source venv/bin/activate
python cluster_reviews.py

Outputs:

  • cluster_output/review_clusters.parquet — all reviews with kmeans_cluster, topic_cluster, hdbscan_cluster
  • cluster_output/cluster_report.json — full numeric results and prototypes

Join key: review_id → base cl_reviews.parquet or enrichment file.


End of clustering results.

Updated: 2026-06-26 13:55 via generate_clustering_results.py

Report block convention

Each analytical section uses up to three executable layers:

Layer Tag Role
Results [result:…]…[/result] Markdown table — findings
Data SQL [babysql:polars]…[/babysql] Polars SQL on parquet — reproduce inputs/aggregates
ML / stats [ml:…]…[/ml] Python estimator spec (sklearn, statsmodels, scipy)

Prose How we found this (math) gives the formulas. Cached artifacts live in cluster_output/.

Pass A — Full VoC Segmentation (auto-generated blocks)

[result:review_cluster_k_selection] | k | Silhouette | | --- | --- | | 4 | 0.4658 | | 5 | 0.4691 | | 6 | 0.4126 | | 7 | 0.4163 | | 8 | 0.4255 | | 9 | 0.4036 | | 10 | 0.3646 | | 11 | 0.3560 | | 12 | 0.3967 |

Cluster N % Avg ★ Top topics
1 7054 41.9% 4.82 comfort, fit_sizing, appearance
0 6759 40.1% 4.84 fit_sizing, eye_health, comfort
2 1751 10.4% 1.78 fit_sizing, eye_health, quality
3 1263 7.5% 3.62 fit_sizing, comfort, quality
4 8 0.0% 3.75

k-NN coherence (k=15): 0.997 [/result]

[babysql:polars] SELECT k, silhouette, calinski_harabasz, davies_bouldin FROM read_csv('/home/sidmishra/ontologer/subprojects/review_sequencing_sentiment/cluster_output/review_k_selection.csv') ORDER BY k [/babysql]

[babysql:polars] SELECT kmeans_cluster AS cluster, COUNT() AS n, ROUND(AVG(rating), 2) AS avg_rating, ROUND(100.0 * SUM(CASE WHEN m_sentiment = 'positive' THEN 1 ELSE 0 END) / COUNT(), 1) AS pos_pct FROM '/home/sidmishra/ontologer/subprojects/review_sequencing_sentiment/cluster_output/review_clusters.parquet' GROUP BY kmeans_cluster ORDER BY cluster [/babysql]

[ml:sklearn-kmeans] library: scikit-learn script: cluster_reviews.py :: run_clustering unit: review (one row per review_id) features: 15 topic flags + sentiment one-hot + rating/m_score/verbosity preprocess: StandardScaler model: KMeans(n_init=10, random_state=42) selection: k in 4..12, maximize silhouette validation: knn_coherence(k=15) artifact: cluster_output/review_clusters.parquet [/ml]

How we found this (math): KMeans on z-scored features; silhouette s=(b−a)/max(a,b).

Pass B — Topic archetypes (auto-generated blocks)

[result:topic_cluster_k_selection] | k | Silhouette | | --- | --- | | 4 | 0.4139 | | 5 | 0.4059 | | 6 | 0.3190 | | 7 | 0.3056 | | 8 | 0.3014 | | 9 | 0.2944 | | 10 | 0.2880 | | 11 | 0.3179 | | 12 | 0.3206 |

Best k=4; k-NN coherence 0.973 [/result]

[ml:sklearn-kmeans] library: scikit-learn script: cluster_reviews.py :: run_clustering(include_sentiment=False) features: topics + behavior only (no sentiment one-hot) [/ml]