docs: SEC/OTC pilot results - viable (domain free from EDGAR filings, 100%)
Ran the email-findability pilot we should have run for CLIA. SEC/OTC is viable: ~940 US-domestic OTC issuers, domain recoverable from the 10-K/8-K filing itself at ~100% (free, no scrape), email via site scrape ~25-50%, phone 100%. High per-deal value (reincorporation/RA/foreign-qual/franchise tax). Documented the build plan.
parent 1465690832
commit 591e387513
1 changed files with 34 additions and 0 deletions
@@ -200,3 +200,37 @@ For a *reincorporation* pitch (a 4-5 figure decision), a tighter, partly direct-
- SEC Fair-Access policy: `https://www.sec.gov/os/accessing-edgar-data` ("no more than 10 requests per second", declare User-Agent)
- EDGAR full-text search: `https://efts.sec.gov/LATEST/search-index` (Texas reincorporation filing counts)
- Texas Business Organizations Code Ch. 10 Subch. C (conversion/domestication); Texas Business Court (eff. 2024-09-01); Texas Stock Exchange (TXSE), 2024-2025.

---

## PILOT RESULTS (2026-06-13) -- SEC/OTC is VIABLE (better than CLIA)

Ran the email-findability pilot (the make-or-break test we skipped on CLIA):

| Metric | Result | How |
|---|---|---|
| OTC/None SEC issuers (universe) | 2,771 | `company_tickers_exchange.json` |
| US-domestic (reincorporation-eligible) | ~34% = **~940** | `stateOfIncorporation` in the per-CIK submissions JSON |
| DE/NV (prime reincorp/foreign-qual) | ~22% = **~610** | same |
| **Website/domain recoverable** | **~100%** | extracted directly from the company's recent 10-K/8-K filing HTML on EDGAR -- FREE, bulk-OK, NO scraping/proxy needed (4/4 in test: fortitudegold.com, mobivity.com, good-gaming.com, fzmd.com) |
| Email via basic home/contact scrape | ~25% (1/4: info@fortitudegold.com) | many use contact forms / JS mailto -> improvable with deeper scrape |
| Phone + business address | **100%** | submissions JSON `phone` + `addresses.business` |

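
The first three rows of the table can be sketched as two small filters. This is a minimal sketch, not the pilot's actual code: the function names are hypothetical, the `fields`/`data` layout of `company_tickers_exchange.json` and the `"OTC"`/null exchange values are assumptions about that file, and "US-domestic" is assumed to mean a `stateOfIncorporation` that is a US state or DC postal code.

```python
from typing import Optional

# Assumption: US-domestic = stateOfIncorporation is one of these postal codes.
US_STATES = set(
    "AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD "
    "MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC "
    "SD TN TX UT VT VA WA WV WI WY DC".split()
)
PRIME_STATES = {"DE", "NV"}  # prime reincorp/foreign-qual targets per the table


def otc_issuers(tickers_exchange: dict) -> list:
    """Return rows whose exchange is "OTC" or missing (the OTC/None universe).

    Assumes the file looks like {"fields": [...], "data": [[...], ...]}.
    """
    fields = tickers_exchange["fields"]
    rows = (dict(zip(fields, row)) for row in tickers_exchange["data"])
    return [r for r in rows if r.get("exchange") in ("OTC", None)]


def contact_record(submission: dict) -> Optional[dict]:
    """Build a contact record from one per-CIK submissions JSON document.

    Returns None for issuers that are not US-domestic.
    """
    state = (submission.get("stateOfIncorporation") or "").upper()
    if state not in US_STATES:
        return None
    return {
        "cik": submission.get("cik"),
        "name": submission.get("name"),
        "state": state,
        "phone": submission.get("phone"),
        "business_address": submission.get("addresses", {}).get("business"),
        "prime_target": state in PRIME_STATES,
    }
```

Both functions take already-parsed JSON, so they can be checked against saved files before any request is sent to EDGAR.
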
**Why this beats CLIA:** the domain (the thing CLIA lacked) comes FREE from the
filing itself. Email yield is ~25-50% (1/4 on the basic scrape, improvable with a
deeper one) and phone is 100%. The universe is small (~940 issuers) but per-deal
value is high (reincorporation, registered agent, foreign-qualification,
franchise tax, annual report). EDGAR is free and explicitly bulk-OK (10 req/s,
declare a User-Agent).

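
Staying inside the EDGAR limit mentioned above (10 req/s, declared User-Agent) could look roughly like this. It is a sketch under assumptions: `Throttle`, `edgar_request`, and `fetch` are hypothetical names, the User-Agent string is a placeholder to replace with a real company name and contact address, and any safety margin below 10 req/s is the caller's choice rather than an SEC requirement.

```python
import time
import urllib.request


class Throttle:
    """Space calls at least 1/max_per_second seconds apart."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock      # injectable so the pacing can be tested
        self._sleep = sleep
        self._next_ok = 0.0      # earliest time the next call may proceed

    def wait(self):
        now = self._clock()
        if now < self._next_ok:
            self._sleep(self._next_ok - now)
            now = max(self._clock(), self._next_ok)
        self._next_ok = now + self.min_interval


def edgar_request(url, user_agent):
    """Build a request that declares who is calling, as the policy asks."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, throttle, user_agent):
    """Fetch one URL after waiting for the throttle; returns raw bytes."""
    throttle.wait()
    with urllib.request.urlopen(edgar_request(url, user_agent), timeout=30) as resp:
        return resp.read()
```

A single shared `Throttle(10)` (or a lower rate for headroom) passed to every `fetch` call keeps the whole harvest under the limit, provided requests are made from one process.
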
### Build plan
1. `harvest_otc_issuers.py`: pull master list -> filter exchange OTC/None ->
   per-CIK submissions JSON -> keep US-domestic -> record name, ticker, CIK,
   stateOfIncorporation, phone, business address, and the **domain extracted from
   the latest 10-K/8-K** (regex the filing HTML, drop sec.gov/filing-agent noise).
2. Scrape domain -> contact/IR email (home + /contact + /investors +
   /investor-relations; gzip+HTML-only; ~25-50% yield). Phone is the fallback channel.
3. Verify emails (existing verifier, .72).
4. Offer/segment: lead with the reincorporate-to-Texas hook (Business Court +
   TXSE, real trend in filings) for DE/NV issuers; cross-sell RA / foreign-qual /
   annual-report / franchise tax. CAN-SPAM-compliant B2B email: full postal
   address + unsubscribe.
5. Channel split: email the ~25-50% we get addresses for; the rest are a clean
   PHONE list (100% have phone) -- corporate/IR lines, real businesses.
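
Steps 2 and 5 of the build plan (email scrape, then channel split) could be sketched as follows. All names here are hypothetical, the email regex is deliberately simple, and keeping only addresses on the issuer's own domain is an assumption about what counts as a usable contact. Fetching the pages is left out; the functions work on HTML that has already been downloaded.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Paths named in step 2 of the build plan.
CONTACT_PATHS = ("/", "/contact", "/investors", "/investor-relations")


def candidate_urls(domain):
    """Pages to try for a given issuer domain."""
    return ["https://" + domain + path for path in CONTACT_PATHS]


def emails_for_domain(page_html, domain):
    """Return sorted, lower-cased emails on the issuer's domain or a subdomain."""
    found = {m.lower() for m in EMAIL_RE.findall(page_html)}
    return sorted(
        e for e in found
        if e.endswith("@" + domain) or e.split("@", 1)[1].endswith("." + domain)
    )


def split_channels(issuers):
    """Split issuer records into (email_list, phone_list).

    Records with at least one email go to the email list; the rest go to
    the phone list if they have a phone number.
    """
    email_list = [i for i in issuers if i.get("emails")]
    phone_list = [i for i in issuers if not i.get("emails") and i.get("phone")]
    return email_list, phone_list
```

A regex over static HTML will miss contact forms and JavaScript-built `mailto:` links, which is the gap the pilot table attributes the ~25% basic yield to.
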