# LinkedIn Analytics — Project Instructions

## Purpose

Track your LinkedIn growth over time. Maintain a persistent HTML dashboard with sub-pages that updates whenever new data is added to the inbox.

## Folder structure

```
projects/LinkedInAnalytics/
  inbox/          ← drop new LinkedIn xlsx downloads here (trigger for update)
  data/           ← processed/merged JSON files (source of truth for dashboard)
  pages/          ← sub-page HTML files linked from dashboard
  dashboard.html  ← main dashboard
  CLAUDE.md       ← this file
```

## Data sources

LinkedIn allows two CSV/xlsx exports from https://www.linkedin.com/analytics/:

- **Audience analytics** — follower growth by day, demographics (company, location, seniority, job title)
- **Content analytics** — daily impressions and engagements, top posts by impressions and by engagement

Maximum export window is 365 days. Download both when refreshing data.

**Single-post analytics** can be downloaded from LinkedIn for individual posts (three-dot menu → Analytics). These give richer per-post metrics not available in the aggregate export: members reached, saves, reposts, sends, link clicks, profile viewers, followers gained from post, and per-post viewer demographics including Industry. Filename format: `SinglePostAnalytics_[Your Name]_ACTIVITYID.xlsx`. Only worth downloading for viral posts (see threshold below).

## Workflow: adding new data

1. Drop new xlsx files into `inbox/`
2. Detect file types by filename pattern:
   - `AggregateAnalytics_*_Audience.xlsx` → follower + demographics data
   - `AggregateAnalytics_*_Content.xlsx` → impressions + top posts data
   - `SinglePostAnalytics_*.xlsx` → per-post enriched data (see single-post workflow below)
3. Merge aggregate data with existing JSON in `data/` (deduplicate by date — overlapping date ranges are expected)
4. Write updated JSON to `data/`
5. Regenerate `dashboard.html` and all `pages/` HTML files
6. Archive processed inbox files to `data/raw/YYYY-MM-DD/` so inbox stays clean
7. **Viral post advisory** (ALWAYS do this after processing new aggregate data): scan `posts.json` for posts with impressions ≥ 5,000 that do NOT have an entry in `single_post_details.json`. List them and say: "These posts crossed the viral threshold — consider downloading single-post analytics for them from LinkedIn (three-dot menu on the post → Analytics, then export)."
8. Report what changed (date range extended, new posts added, etc.)

### Single-post workflow

When a `SinglePostAnalytics_*.xlsx` file is in the inbox:

1. Extract the activity ID from the filename (last numeric segment before `.xlsx`)
2. Parse the file with openpyxl — it is a key-value layout, not tabular.
   - Fields: Post URL, Post Date, Post Publish Time, Impressions, Members reached, Profile viewers from this post, Followers gained from this post, Social engagements, Reactions, Comments, Reposts, Saves, Sends on LinkedIn, Link engagements.
   - Demographics section has rows with (Category, Value, %) columns for: Job title, Location, Seniority, Company, Company size, Industry.
3. Write to `data/single_post_details.json` keyed by activity ID (merge, don't overwrite existing entries)
4. Regenerate `pages/top-posts.html` so the enriched badge and detail panel appear for this post

## Data files in data/

- `followers.json` — daily new followers + cumulative total
- `impressions.json` — daily impressions + engagements
- `posts.json` — top posts from LinkedIn aggregate export (url, date, engagements, impressions, title)
- `single_post_details.json` — enriched per-post data keyed by LinkedIn activity ID; fields: url, date, publish_time, impressions, members_reached, profile_viewers, followers_gained, social_engagements, reactions, comments, reposts, saves, sends, link_engagements, demographics (job_title, location, seniority, company, company_size, industry — each an array of {value, pct})
- `demographics.json` — latest audience demographics snapshot (company, location, seniority, job title)
- `meta.json` — date range covered, last updated, total followers at last update

Note: if you have a historical post export from Taplio (CSV), you can add a `taplio_posts.json` file with the full post history and reference it in the viral post advisory step. This is optional — the setup works without it.

### Viral threshold

A post is considered viral at ≥ 5,000 impressions. Posts above this threshold should be flagged for single-post analytics download. Posts above 20,000 impressions are exceptional — always flag these prominently.
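
The viral post advisory and the two thresholds above could be sketched roughly as follows. This is a minimal illustration, not a fixed implementation. It assumes that `posts.json` has been loaded as a list of dicts with `url` and `impressions` keys, that `single_post_details.json` has been loaded as a dict keyed by activity ID, and that each post URL embeds the activity ID as `activity:<digits>` or `activity-<digits>`. The function names, the `tier` label, and the sample data are hypothetical.

```python
import re

VIRAL_THRESHOLD = 5_000         # viral at >= 5,000 impressions
EXCEPTIONAL_THRESHOLD = 20_000  # exceptional above 20,000 impressions


def activity_id_from_url(url):
    """Pull the numeric activity ID out of a post URL, or return None.

    Assumption: the URL contains 'activity:<digits>' or 'activity-<digits>'.
    """
    match = re.search(r"activity[:-](\d+)", url or "")
    return match.group(1) if match else None


def viral_advisory(posts, single_post_details):
    """Return viral posts that have no single-post entry yet, largest first."""
    flagged = []
    for post in posts:
        impressions = post.get("impressions", 0)
        if impressions < VIRAL_THRESHOLD:
            continue
        activity_id = activity_id_from_url(post.get("url"))
        if activity_id is not None and activity_id in single_post_details:
            continue  # already enriched, nothing to advise
        tier = "exceptional" if impressions > EXCEPTIONAL_THRESHOLD else "viral"
        flagged.append({**post, "activity_id": activity_id, "tier": tier})
    return sorted(flagged, key=lambda p: p["impressions"], reverse=True)


# Made-up sample data; the IDs and impression counts are illustrative only.
posts = [
    {"url": "https://www.linkedin.com/feed/update/urn:li:activity:111", "impressions": 4999},
    {"url": "https://www.linkedin.com/feed/update/urn:li:activity:222", "impressions": 5000},
    {"url": "https://www.linkedin.com/feed/update/urn:li:activity:333", "impressions": 25000},
    {"url": "https://www.linkedin.com/feed/update/urn:li:activity:444", "impressions": 9000},
]
details = {"444": {}}

for post in viral_advisory(posts, details):
    print(post["tier"], post["impressions"], post["url"])
```

Posts whose URL yields no activity ID are still flagged, on the reasoning that a false reminder is cheaper than a missed viral post.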

## Dashboard structure

`dashboard.html` — main page, shows:

- KPI cards: total followers, new followers (last 30d / last 90d), impressions (last 30d), engagement rate
- Follower growth chart (full history, cumulative line)
- Monthly new followers bar chart
- Links to sub-pages

Sub-pages in `pages/`:

- `followers.html` — detailed follower growth, daily view, monthly breakdown
- `content.html` — impressions and engagement over time, daily chart
- `top-posts.html` — table of top posts sorted by impressions and by engagements
- `demographics.html` — audience breakdown: seniority, company, location, job title

## Design rules

- Self-contained HTML files — no external dependencies except Chart.js from cdnjs.cloudflare.com
- Navigation bar on every page linking back to dashboard and between sub-pages
- Dark mode compatible using CSS variables where possible; fallback to hardcoded hex for Chart.js
- Data is embedded inline in each HTML file (no separate fetch — files need to open locally)
- Consistent color scheme: blue (#378ADD) for primary data, coral (#D85A30) for highlights/spikes
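
One possible way to follow the inline-data rule is to substitute serialized JSON into a placeholder inside a `<script>` tag when regenerating each page. The sketch below is only an illustration: the `embed_json` helper, the `__FOLLOWERS_JSON__` placeholder, and the `new_followers` field name are assumptions, not part of the existing setup.

```python
import json


def embed_json(html_template, placeholder, data):
    """Return the template with `placeholder` replaced by inline JSON."""
    payload = json.dumps(data, separators=(",", ":"))
    # Escape "</" so text such as "</script>" inside a post title
    # cannot close the surrounding script tag early.
    payload = payload.replace("</", "<\\/")
    return html_template.replace(placeholder, payload)


# Hypothetical template fragment and sample row.
template = "<script>const FOLLOWERS = __FOLLOWERS_JSON__;</script>"
rows = [{"date": "2024-01-01", "new_followers": 3}]
page = embed_json(template, "__FOLLOWERS_JSON__", rows)
print(page)
```

Because the data ends up as a JavaScript literal in the page itself, the file opens from disk without a local server or a fetch call.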