v2

Discovery Domain

Purpose

The Discovery domain powers content surfacing through personalized timelines, search, trending feeds, and creator discovery. It selects candidate posts with SQL queries, ranks them in memory with weighted signals, suppresses already-seen posts via Redis, and runs scheduled jobs to refresh the signal and affinity caches.

Key MVP Principle: No Elasticsearch, no external recommender system — everything runs on PostgreSQL + Redis with carefully tuned SQL queries and in-memory scoring.

Core Responsibilities

  • Timeline Generation — Home feed (personalized), Explore feed (public), Trending feed
  • Search — Keyword search across posts and creators (SQL ILIKE substring matching, not a full-text index)
  • Ranking — Multi-signal scoring system for relevance and personalization
  • Interaction Tracking — Record likes, shares, views, follows, and comments for affinity building
  • Signal Refresh — Background jobs to keep ranking weights and affinity fresh

Owned Entities

| Entity | Purpose |
| --- | --- |
| PostSignalCache | Pre-computed engagement metrics per post (likes, shares, views, engagement_rate) |
| UserAffinityCache | User preferences aggregated from interactions (niches, categories, creators) |
| Interaction | Raw event log (like, share, view, follow, comment) for analytics |

Endpoint Matrix

| Method | Path | Auth | Scope | Purpose |
| --- | --- | --- | --- | --- |
| GET | /v1/timeline/home | Required | User token | Personalized feed |
| GET | /v1/timeline/explore | Public | - | Public discovery feed |
| GET | /v1/timeline/trending | Public | - | Viral content feed |
| GET | /v1/discovery/search | Public | - | Search posts by keyword |
| GET | /v1/discovery/creators | Public | - | Discover creators |
| POST | /v1/discovery/interactions | Required | User token | Track interaction event |

Key Workflows

Home Timeline (Personalized)

User requests /v1/timeline/home
  │
  ├─→ TimelineService::home($user, $limit, $cursor)
  │
  ├─→ Load UserAffinityCache for $user
  │    ├─→ favoriteNiches: [1, 5, 12]
  │    ├─→ favoriteCategories: [23, 45, 67]
  │    └─→ favoriteCreators: [101, 202, 303]
  │
  ├─→ Query published posts with:
  │    ├─→ Eager load: creator, niche, category, media, poll, tags, accessRules
  │    ├─→ Exclude: posts user can't view (gated posts via IAccessService)
  │    └─→ Exclude: posts already seen (SeenPostStore → Redis)
  │
  ├─→ RankingService::rank($posts, $user, $keyword)
  │    ├─→ Load PostSignalCache for all candidates
  │    ├─→ For each post:
  │    │    ├─→ RecencySignal → 3.0 weight
  │    │    ├─→ EngagementRateSignal → 2.5 weight
  │    │    ├─→ CreatorAffinitySignal → 4.0 weight
  │    │    ├─→ NicheAffinitySignal → 2.0 weight
  │    │    ├─→ CategoryAffinitySignal → 1.5 weight
  │    │    ├─→ AlreadySeenPenalty → -10.0 penalty
  │    │    └─→ Sum weighted scores → discovery_score
  │    ├─→ Sort by discovery_score DESC
  │    └─→ Diversify: max 2 posts per creator in top 20
  │
  ├─→ Paginate with cursor (opaque offset)
  │
  └─→ Mark posts as seen in SeenPostStore (Redis)
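
The "opaque offset" cursor in the pagination step could be as simple as an encoded numeric offset. The following is a minimal sketch under that assumption; the helper names (encodeCursor, decodeCursor, paginate) and the JSON payload shape are hypothetical and not taken from the codebase.

```php
<?php
declare(strict_types=1);

// Hypothetical helpers: names and payload shape are assumptions for illustration.

/** Encode a numeric offset as a URL-safe opaque string. */
function encodeCursor(int $offset): string
{
    $json = json_encode(['o' => $offset], JSON_THROW_ON_ERROR);
    return strtr(base64_encode($json), '+/', '-_');
}

/** Decode a cursor; a missing or malformed cursor falls back to offset 0. */
function decodeCursor(?string $cursor): int
{
    if ($cursor === null || $cursor === '') {
        return 0;
    }
    $json = base64_decode(strtr($cursor, '-_', '+/'), true);
    if ($json === false) {
        return 0;
    }
    $data = json_decode($json, true);
    $offset = is_array($data) ? ($data['o'] ?? 0) : 0;
    return is_int($offset) && $offset >= 0 ? $offset : 0;
}

/** Slice one page out of a ranked list and return it with the next cursor. */
function paginate(array $ranked, int $limit, ?string $cursor): array
{
    $offset = decodeCursor($cursor);
    $page = array_slice($ranked, $offset, $limit);
    $next = $offset + $limit < count($ranked) ? encodeCursor($offset + $limit) : null;
    return ['items' => $page, 'next_cursor' => $next];
}
```

Clients treat the cursor as a black box and pass it back unchanged; a null next_cursor signals the last page.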

Search Flow

User searches for "fitness tips"
  │
  ├─→ SearchService::posts($keyword, $limit, $cursor)
  │
  ├─→ Query posts where any term matches (all conditions are OR'd):
  │    ├─→ title ILIKE '%fitness%' OR title ILIKE '%tips%'
  │    ├─→ OR description ILIKE '%fitness%' OR description ILIKE '%tips%'
  │    └─→ OR tags.name ILIKE '%fitness%' OR tags.name ILIKE '%tips%'
  │
  ├─→ RankingService::rank($posts, $user, "fitness tips")
  │    ├─→ TextMatchSignal → 5.0 weight (keyword relevance)
  │    ├─→ RecencySignal → 2.0 weight
  │    ├─→ EngagementRateSignal → 3.0 weight
  │    └─→ Sort by weighted score
  │
  └─→ Return ranked results
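
The search flow splits the phrase into terms and matches each one as a substring. A minimal sketch of that term-to-pattern step follows; the function name buildIlikePatterns is hypothetical, and escaping of the LIKE wildcards % and _ is an added assumption (the source does not say whether user input is escaped).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: split a search phrase into ILIKE patterns.
 * LIKE wildcards typed by the user (% and _) are escaped with a backslash,
 * which is PostgreSQL's default LIKE escape character.
 */
function buildIlikePatterns(string $keyword): array
{
    $terms = preg_split('/\s+/', trim($keyword), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $patterns = [];
    foreach (array_unique($terms) as $term) {
        // Prefix %, _ and \ with a backslash so they match literally
        $escaped = addcslashes($term, '%_\\');
        $patterns[] = '%' . $escaped . '%';
    }
    return $patterns;
}
```

Each pattern would then be bound as a query parameter to an ILIKE condition rather than concatenated into the SQL string.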

Ranking System

Signal Architecture

Every signal implements ISignalContributor:

interface ISignalContributor
{
    public function identifier(): string;  // e.g., "recency", "engagement_rate"
    
    public function contribute(
        Post $post,
        RankingContextData $context,
        ?PostSignalCache $signals
    ): float;
}
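
To illustrate the contract, here is a sketch of how a recency signal might implement it. The stand-in classes exist only so the example runs on its own, and the exponential decay with a 24-hour half-life is an assumption; the source only says "decay curve".

```php
<?php
declare(strict_types=1);

// Minimal stand-ins so the sketch is self-contained; the real classes are richer.
final class Post
{
    public function __construct(public int $id, public DateTimeImmutable $publishedAt) {}
}

final class RankingContextData
{
    public function __construct(public DateTimeImmutable $now) {}
}

final class PostSignalCache {}

interface ISignalContributor
{
    public function identifier(): string;

    public function contribute(Post $post, RankingContextData $context, ?PostSignalCache $signals): float;
}

/** Exponential decay: 1.0 for a brand-new post, 0.5 after one half-life. */
final class RecencySignal implements ISignalContributor
{
    // The 24-hour half-life is an assumed value, not taken from the source.
    public function __construct(private float $halfLifeHours = 24.0) {}

    public function identifier(): string
    {
        return 'recency';
    }

    public function contribute(Post $post, RankingContextData $context, ?PostSignalCache $signals): float
    {
        // Clamp to zero so posts with a future timestamp do not score above 1.0
        $ageSeconds = max(0, $context->now->getTimestamp() - $post->publishedAt->getTimestamp());
        $ageHours = $ageSeconds / 3600;

        return 0.5 ** ($ageHours / $this->halfLifeHours);
    }
}
```

Returning a value in the 0 to 1 range keeps the configured weight as the only knob that controls how much a signal can move the final score.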

Active Signals (11 Total)

| Signal | Weight | Purpose |
| --- | --- | --- |
| RecencySignal | 3.0 | Boost recent posts (decay curve) |
| EngagementRateSignal | 2.5 | Reward high engagement ((likes + shares) / views) |
| CreatorAffinitySignal | 4.0 | Boost posts from creators the user follows or likes |
| NicheAffinitySignal | 2.0 | Boost posts in the user's favorite niches |
| CategoryAffinitySignal | 1.5 | Boost posts in the user's favorite categories |
| TagOverlapSignal | 1.0 | Boost posts with tags the user has interacted with |
| TrendingBoostSignal | 2.0 | Amplify viral content (sudden engagement spike) |
| TextMatchSignal | 5.0 | Search keyword relevance (used in search only) |
| PremiumCreatorSignal | 1.5 | Slightly favor verified/premium creators |
| AlreadySeenPenalty | -10.0 | Suppress posts the user has already viewed |
| ColdStartBoostSignal | 2.0 | Randomize for new users with no affinity |
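
The final discovery_score is the weighted sum of the signal contributions. A minimal sketch, assuming each signal returns a raw value in the 0 to 1 range and that weights are keyed by the bare signal identifier; discoveryScore is a hypothetical name.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical scoring helper: multiply each signal's raw contribution by its
 * weight and add the results. Signals without a configured weight count as 0.
 *
 * @param array<string, float> $contributions raw values keyed by signal identifier
 * @param array<string, float> $weights       weights keyed by the same identifiers
 */
function discoveryScore(array $contributions, array $weights): float
{
    $score = 0.0;
    foreach ($contributions as $identifier => $value) {
        $score += ($weights[$identifier] ?? 0.0) * $value;
    }
    return $score;
}

// Illustrative values only: the same post before and after the user has seen it.
$weights = ['recency' => 3.0, 'creator_affinity' => 4.0, 'seen_penalty' => -10.0];
$fresh = discoveryScore(['recency' => 0.5, 'creator_affinity' => 1.0, 'seen_penalty' => 0.0], $weights);
$seen  = discoveryScore(['recency' => 0.5, 'creator_affinity' => 1.0, 'seen_penalty' => 1.0], $weights);
// $fresh is 5.5 and $seen is -4.5
```

With these numbers the -10.0 penalty outweighs the other two signals combined, which is what pushes a seen post to the bottom of the list.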

Weight Configuration

Weights are tunable via PlatformSettings:

$weights = [
    'w_recency' => 3.0,
    'w_engagement_rate' => 2.5,
    'w_creator_affinity' => 4.0,
    'w_niche_affinity' => 2.0,
    'w_category_affinity' => 1.5,
    'w_tag_overlap' => 1.0,
    'w_trending' => 2.0,
    'w_text_match' => 5.0,
    'w_premium_creator' => 1.5,
    'w_seen_penalty' => -10.0,
    'w_cold_start' => 2.0,
    'max_per_creator_in_top_20' => 2,
];

Admins can adjust via Filament PlatformSettingsResource.
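
The max_per_creator_in_top_20 setting drives the diversification step that runs after sorting. One way that pass could work is sketched below; the function name, the array-based post shape, and the choice to push capped posts directly behind the window are all assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical diversification pass over posts already sorted by score.
 * Each post is an array with at least a 'creator_id' key (assumed shape).
 * Posts that would exceed the per-creator cap inside the window are moved
 * behind it, keeping their relative (score) order.
 */
function diversify(array $ranked, int $maxPerCreator = 2, int $window = 20): array
{
    $top = [];        // posts that made it into the capped window
    $deferred = [];   // posts pushed behind the window
    $perCreator = []; // creator_id => posts already placed in the window

    foreach ($ranked as $post) {
        $creatorId = $post['creator_id'];
        $count = $perCreator[$creatorId] ?? 0;

        if (count($top) < $window && $count < $maxPerCreator) {
            $top[] = $post;
            $perCreator[$creatorId] = $count + 1;
        } else {
            $deferred[] = $post;
        }
    }

    return array_merge($top, $deferred);
}
```

If too few creators are available to fill the window, deferred posts end up inside the first 20 positions anyway, so the cap is best-effort in this sketch.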

Operational Jobs

RefreshPostSignalsJob

Schedule: Every 10 minutes (routes/console.php)

Purpose: Update PostSignalCache with fresh engagement metrics

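// withCount() computes all three counts in one aggregated query per chunk
// instead of three COUNT queries per post. The chunk size of 500 is an assumption.
Post::query()
    ->withCount(['likes', 'shares', 'views'])
    ->chunkById(500, function ($posts) {
        foreach ($posts as $post) {
            $cache = PostSignalCache::firstOrNew(['post_id' => $post->id]);

            $cache->likes_count = $post->likes_count;
            $cache->shares_count = $post->shares_count;
            $cache->views_count = $post->views_count;
            // max(..., 1) avoids division by zero for posts with no views yet
            $cache->engagement_rate = ($cache->likes_count + $cache->shares_count) / max($cache->views_count, 1);

            $cache->save();
        }
    });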

RefreshUserAffinityJob

Schedule: Daily at 02:30 UTC, plus an on-demand run each time a user's interaction count reaches a multiple of 100 (dispatched by TrackInteractionJob)

Purpose: Rebuild UserAffinityCache from Interaction log

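// Eager-load posts so the pluck() calls below do not run one query per interaction
$interactions = Interaction::with('post')
    ->where('user_id', $user->id)
    ->where('created_at', '>=', now()->subDays(30)) // 30-day lookback window
    ->get();

// Count interactions per niche, category, and creator.
// filter() drops nulls, e.g. when the related post no longer exists.
$niches = $interactions->pluck('post.niche_id')->filter()->countBy();
$categories = $interactions->pluck('post.category_id')->filter()->countBy();
$creators = $interactions->pluck('post.creator_id')->filter()->countBy();

// Keep only the most frequent IDs: top 5 niches, 10 categories, 20 creators
UserAffinityCache::updateOrCreate(['user_id' => $user->id], [
    'favorite_niches' => $niches->sortDesc()->take(5)->keys()->all(),
    'favorite_categories' => $categories->sortDesc()->take(10)->keys()->all(),
    'favorite_creators' => $creators->sortDesc()->take(20)->keys()->all(),
]);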

TrackInteractionJob

Trigger: POST /v1/discovery/interactions

Purpose: Log interaction event to interactions table

Interaction::create([
    'user_id' => $user->id,
    'post_id' => $data->postId,
    'creator_id' => $post->creator_id,
    'type' => InteractionType::Like, // example value; the actual type comes from the request
]);

// Queue affinity refresh after batch of interactions
if ($user->interactions()->count() % 100 === 0) {
    RefreshUserAffinityJob::dispatch($user->id);
}

Seen Post Suppression

Uses Redis to track viewed posts per user (24-hour TTL):

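// SeenPostStore
public function mark(User $user, array $postIds): void
{
    if ($postIds === []) {
        return; // SADD requires at least one member
    }

    $key = "seen_posts:{$user->id}";
    Redis::sadd($key, ...$postIds);
    Redis::expire($key, 86400); // 24 hours, restarted on every call
}

public function idsFor(?User $user): array
{
    if (!$user) {
        return []; // guests have no seen history
    }

    $key = "seen_posts:{$user->id}";
    // Redis returns set members as strings; cast them back to integer post IDs
    return array_map('intval', Redis::smembers($key));
}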

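Why 24 hours? It balances freshness (a post can resurface after a day) against memory (seen sets are not kept forever). Note that the TTL applies to the whole set and is reset on every mark() call, so the set expires 24 hours after the user's last feed request, not 24 hours after each individual post was seen.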
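
Once the seen IDs are loaded, excluding them from a candidate list is a set difference. A minimal in-memory sketch follows, assuming integer post IDs; excludeSeen is a hypothetical name, and the real timeline query may apply the same filter in SQL instead.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical filter: drop candidates whose ID is in the seen set.
 * Redis returns set members as strings, so IDs are normalised to int first.
 */
function excludeSeen(array $candidateIds, array $seenIds): array
{
    // Flip to id => position so each membership check is a hash lookup
    $seen = array_flip(array_map('intval', $seenIds));

    return array_values(array_filter(
        $candidateIds,
        static fn (int $id): bool => !isset($seen[$id])
    ));
}
```

The flipped lookup table keeps the filter linear in the number of candidates, which matters when the home feed considers up to 1000 posts per request.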

Performance Characteristics

Query Complexity

| Feed | Candidates | Filter Ops | Rank Ops | Total Time |
| --- | --- | --- | --- | --- |
| Home | 500-1000 posts | Access check, seen filter | 11 signals × N posts | ~100-300ms |
| Explore | 200-500 posts | Access check | 8 signals × N posts | ~50-150ms |
| Search | 100-300 posts | Text match | 5 signals × N posts | ~30-100ms |

Scaling Limits

Current MVP capacity: ~100K DAU, ~1M posts

Bottlenecks to watch:

  1. PostSignalCache refresh — Grows with post count (mitigated by batching)
  2. Affinity cache size — One row per active user (fine up to millions)
  3. Seen post Redis memory — ~1KB per user (100K users = 100MB)

Scale-out plan (future):

  • Move to Elasticsearch for search (current SQL ILIKE is fine for MVP)
  • Add read replicas for timeline queries
  • Shard Redis by user ID range

Configuration

| Setting | Default | Purpose |
| --- | --- | --- |
| discovery.default_timeline_limit | 20 | Posts per page |
| discovery.max_timeline_limit | 100 | Max posts client can request |
| discovery.seen_post_ttl | 86400 | Seconds to remember seen posts |
| discovery.signal_refresh_interval | 600 | Seconds between PostSignalCache updates |

Dependencies

  • Upstream: Identity (user auth), Content (posts, media, taxonomy)
  • Downstream: None (top of the stack)

Testing

  • Unit Tests: tests/Unit/Discovery/
  • Feature Tests: tests/Feature/Discovery/

Monitoring

Key metrics to track in production:

| Metric | Target | Alert Threshold |
| --- | --- | --- |
| Timeline latency (p95) | <200ms | >500ms |
| Search latency (p95) | <150ms | >300ms |
| Signal refresh duration | <30s | >60s |
| Affinity refresh duration | <5s/user | >10s/user |
| Redis memory usage | <500MB | >1GB |