- User data improves their technology. Examples: Midjourney, TikTok, search/ads, deployed self driving (Tesla FSD, comma), v0.
- User data doesn’t improve their technology. Must pay for human labelers.
However, type-2 companies can bootstrap valuable data generation. Small amounts of initial human-labeled data can incrementally train better models that then speed up human labeling. Segment Anything is a golden example of this (new?) paradigm of data synthesis.