PRM Gateway

Project Rocket Man

Project Rocket Man is a rights-controlled, human-authored lyrical language corpus and benchmark environment built for public-safe aggregate analysis, private technical review, and potential AI/data licensing.

PRM measures a protected solo-authored corpus through entropy, lexical diversity, tokenizer behavior, repetition control, phonetic/rhyme structure, and reproducible aggregate outputs while keeping the protected source text private.
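As a rough illustration of two of the measures named above, the sketch below computes word-level Shannon entropy and a type-token ratio (one common lexical-diversity measure) for a token list. It is a minimal sketch under stated assumptions, not PRM's actual method: the regex tokenizer, the function names, and the sample sentence are all hypothetical, and the sample is placeholder text rather than anything from the protected corpus.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Hypothetical stand-in tokenizer: lowercase alphabetic word tokens.
    # PRM's real tokenizer is not described on the public page.
    return re.findall(r"[a-z']+", text.lower())


def shannon_entropy(tokens: list[str]) -> float:
    # Entropy in bits of the empirical token distribution.
    total = len(tokens)
    if total == 0:
        return 0.0
    counts = Counter(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def type_token_ratio(tokens: list[str]) -> float:
    # Distinct tokens divided by total tokens; sensitive to text length.
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Placeholder sentence, not corpus text.
tokens = tokenize("The cat sat on the mat")
summary = {
    "token_count": len(tokens),
    "entropy_bits": shannon_entropy(tokens),
    "type_token_ratio": type_token_ratio(tokens),
}
print(summary)
```

Only the aggregate numbers in `summary` would be candidates for a public page; the token list itself stays on the private side. Because type-token ratio falls as texts get longer, comparisons are usually made on slices of equal size.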

Public pages show aggregate evidence, methods, charts, and safeguards. Protected text, source mappings, reconstructable manifests, and audit-level materials are reserved for controlled review.

Technical / Licensing Review

Role

Corpus doorway

Frames PRM as a rights-controlled, human-authored language project, not a raw text archive.

Headline Evidence

Aggregate only

Keeps charts and claims inspectable while protected text and mappings stay private.

Review

Inquiry-ready

Positions AI evaluation, linguistic benchmarking, provenance review, and dataset licensing inquiry as controlled next steps.

What PRM is

Project Rocket Man is the public-safe research map for a protected solo-authored corpus. It tracks structure, vocabulary behavior, repetition pressure, tokenizer effects, and evidence quality through aggregate outputs intended for AI evaluation, linguistic benchmarking, provenance review, and controlled technical review.

What The Sequel is

The PRM Dataset is the protected solo-authored corpus behind the measurement run. Public pages describe it only through aggregate segments and comparison-safe labels. When the dataset is split into two broad stages, PRM Half 1 serves as the early-stage baseline and PRM Half 2 as the late-stage test. That second half is The Sequel. The question is whether the late stage merely continues the baseline or separates from it in complexity, lexical discipline, and repetition control.
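The baseline-versus-late-stage comparison can be sketched as follows. This is an illustrative sketch only: it assumes the corpus is available as an ordered list of already-tokenized segments, splits that list at its midpoint, and reports per-half aggregates plus their differences. The function names, the repeated-bigram measure, and the toy segments are assumptions made for the example; they are not PRM's definitions of complexity, lexical discipline, or repetition control.

```python
import math
from collections import Counter


def entropy_bits(tokens):
    # Shannon entropy (bits) of the empirical token distribution.
    total = len(tokens)
    if total == 0:
        return 0.0
    counts = Counter(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def repeated_bigram_rate(tokens):
    # Share of adjacent token pairs that duplicate another pair in the same half.
    bigrams = list(zip(tokens, tokens[1:]))
    if not bigrams:
        return 0.0
    return 1 - len(set(bigrams)) / len(bigrams)


def summarize(tokens):
    return {
        "entropy_bits": entropy_bits(tokens),
        "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
        "repeated_bigram_rate": repeated_bigram_rate(tokens),
    }


def compare_halves(segments):
    # Split the ordered segments at the midpoint and flatten each side.
    # Flattening also forms bigrams across segment boundaries.
    mid = len(segments) // 2
    first = [tok for seg in segments[:mid] for tok in seg]
    second = [tok for seg in segments[mid:] for tok in seg]
    s1, s2 = summarize(first), summarize(second)
    delta = {key: s2[key] - s1[key] for key in s1}
    return s1, s2, delta


# Toy segments standing in for protected material.
segments = [
    ["a", "b", "a", "b"],
    ["a", "b", "a", "b"],
    ["a", "b", "c", "d"],
    ["e", "f", "g", "h"],
]
half1, half2, delta = compare_halves(segments)
print(delta)
```

In this toy input the second half has higher entropy, a higher type-token ratio, and a lower repeated-bigram rate than the first, which is the kind of separation the question above asks about. A positive or negative delta on its own says nothing about significance; that would need slice-level variation, which this sketch does not model.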

Start Here

The Sequel

Begin with the headline finding and the charts that make it public-safe.

Methods & Metrics

Compare entropy, lexical behavior, repetition control, EURE, LDI, and RACS without blending their meanings.

Dataset & Integrity

See how protected corpus families, slices, transforms, and hygiene rules stay separated.

Pipeline & Reproducibility

Trace how protected inputs become tokenized, measured, summarized, charted, and reviewed.

Controlled Review

Start here for the public-safe review pathway, disclosure boundary, glossary, and controlled-access inquiry route.

PRM Glossary

Decode PRM terms such as protected corpus, tokenizer, slice size, EURE, LDI, and RACS.

Disclosure Boundary

Read what the public site can show and what remains reserved for controlled review.

Public-safe boundary

Public pages show aggregate evidence, methods, charts, safeguards, metric behavior, and method provenance. Protected text, source mappings, reconstructable manifests, source titles, private identities, and audit-level materials stay out of view.
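One way a boundary like this could be enforced in code is an allowlist applied before anything is published. The sketch below is an assumption about how such a filter might look, not a description of PRM's tooling: the field names, the `to_public_record` helper, and the record values are hypothetical placeholders.

```python
# Hypothetical allowlist of aggregate fields considered safe to publish.
PUBLIC_FIELDS = frozenset(
    {"segment_label", "token_count", "entropy_bits", "type_token_ratio"}
)


def to_public_record(record: dict) -> dict:
    # Keep only allowlisted keys; anything unrecognized is dropped by default.
    return {key: value for key, value in record.items() if key in PUBLIC_FIELDS}


# Placeholder internal record; the values are invented for illustration.
internal = {
    "segment_label": "PRM Half 2",
    "token_count": 1000,
    "entropy_bits": 8.0,
    "source_title": "<withheld>",
    "source_mapping": "<withheld>",
    "text": "<withheld>",
}

public = to_public_record(internal)
print(public)
```

An allowlist fails closed: a newly added internal field stays private until someone deliberately marks it public, whereas a blocklist would expose it by default.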