Glossary¶

The terms you'll see all over the docs and the agent output, in plain language.

ASTRA¶

Agentic Schema for Transparent Research Analysis. The schema lightcone-cli is built around. ASTRA's job is to capture an analysis's inputs, outputs, and methodological decisions in a single file (astra.yaml); lightcone-cli's job is to execute that spec reproducibly. ASTRA ships separately as the astra-tools package; the astra CLI handles the spec itself (validation, paper management, evidence verification).

astra.yaml¶

Your project's spec file. The single source of truth — every input, output, recipe, and decision is declared here. Sub-analyses can be nested via analyses: references.

Recipe¶

A short shell or Python command that produces an output. Lives inside an output's recipe: block in astra.yaml. Recipes can declare which sibling outputs they depend on:

outputs:
  - id: r2
    recipe:
      command: python scripts/fit.py --output {output[0]}
  - id: fit_plot
    recipe:
      command: python scripts/plot.py --r2_dir {input.r2} --output {output[0]}
      inputs: [r2]
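
To get a feel for what the {input.r2} and {output[0]} placeholders turn into, here is a minimal sketch that expands them with Python's built-in str.format, which supports both attribute and index lookups. The render helper and the concrete paths are hypothetical; the real substitution happens inside lc run and may differ in detail.

```python
from types import SimpleNamespace


def render(command: str, inputs: dict, outputs: list) -> str:
    """Expand {input.<id>} and {output[<n>]} placeholders in a recipe command."""
    # SimpleNamespace gives attribute access ({input.r2}); a list gives indexing ({output[0]}).
    return command.format(input=SimpleNamespace(**inputs), output=outputs)


cmd = render(
    "python scripts/plot.py --r2_dir {input.r2} --output {output[0]}",
    inputs={"r2": "results/baseline/r2"},      # hypothetical upstream output directory
    outputs=["results/baseline/fit_plot"],     # hypothetical directory for this output
)
print(cmd)
```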

Decision¶

A methodological choice with multiple defensible options (e.g. "standardize features?", "what outlier threshold?"). Decisions live in the decisions: section of astra.yaml along with their default, their options, and their rationale.

Universe¶

One specific selection of decision values. Universes live as YAML files in universes/ (e.g. universes/baseline.yaml, universes/permissive.yaml). Each universe materializes its results to its own directory: results/<universe>/<output_id>/.

If your spec has no universes, lc run materializes against a universe called "default" with all decisions at their declared defaults.
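
As a rough illustration of how a universe relates to decision defaults, the sketch below starts from every decision's declared default and layers a universe's selections on top. The dictionary layout, the decision names, and both helper functions are assumptions made for the example, not the actual astra.yaml or universe file schema.

```python
from pathlib import Path


def resolve_universe(decisions: dict, overrides: dict) -> dict:
    """Start from each decision's default, then apply the universe's selections."""
    unknown = set(overrides) - set(decisions)
    if unknown:
        raise KeyError(f"universe sets undeclared decisions: {sorted(unknown)}")
    resolved = {name: spec["default"] for name, spec in decisions.items()}
    resolved.update(overrides)
    return resolved


def output_dir(universe: str, output_id: str) -> Path:
    """Mirror the results/<universe>/<output_id>/ layout."""
    return Path("results") / universe / output_id


# Hypothetical decisions, loosely based on the examples in the Decision entry.
decisions = {
    "standardize_features": {"default": True, "options": [True, False]},
    "outlier_threshold": {"default": 3.0, "options": [2.5, 3.0, 4.0]},
}

default = resolve_universe(decisions, {})                            # all defaults
permissive = resolve_universe(decisions, {"outlier_threshold": 4.0})  # one override
```

With no overrides, the result is the "default" universe described above: every decision sits at its declared default.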

Sub-analysis¶

A nested ASTRA analysis with its own inputs, outputs, and decisions, referenced from a parent's analyses: section. The full tree shares one set of universes; sub-analyses can reference parent decisions with from: references. Sub-analyses are useful when an analysis has genuinely different stages (training vs. inference, fit vs. evaluate); keep things in one analysis when they share the same product.

Manifest¶

The per-output sidecar JSON file (<output_dir>/.lightcone-manifest.json) that records what produced the output and what's inside it. Fields include code_version, data_version, container_image, recipe, decisions, input_versions, git_sha, host, lc_version, and a few more. Manifests are written atomically by lc run and read by lc status and lc verify.
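
"Written atomically" usually means readers never see a half-written file. One common way to get that guarantee is to write to a temporary file in the same directory and then rename it into place. The sketch below shows that general pattern; it is not taken from lightcone-cli's source, and the function name is hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path

MANIFEST_NAME = ".lightcone-manifest.json"


def write_manifest_atomically(output_dir: Path, manifest: dict) -> Path:
    """Write the manifest to a temp file, then rename it over the final path."""
    target = output_dir / MANIFEST_NAME
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=output_dir, prefix=".manifest-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(manifest, handle, indent=2, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, target)   # atomic swap: old or new file, never a mix
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)        # clean up the temp file on failure
        raise
    return target
```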

code_version¶

A SHA-256 over (recipe + container_image + decisions). The fingerprint of "what does this rule do?" When it drifts, downstream outputs go stale in lc status.
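
This entry doesn't say how the three ingredients are serialized before hashing, so the sketch below assumes canonical JSON with sorted keys. Treat the function and payload layout as illustrative, not as lightcone-cli's actual implementation.

```python
import hashlib
import json
from typing import Optional


def code_version(recipe: dict, container_image: Optional[str], decisions: dict) -> str:
    """Fingerprint 'what does this rule do?' as a SHA-256 hex digest."""
    # Assumption: canonical JSON (sorted keys, fixed separators) so that
    # reordering keys in the spec does not change the hash.
    payload = json.dumps(
        {"recipe": recipe, "container_image": container_image, "decisions": decisions},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Changing any decision value, the recipe command, or the image yields a different digest, which is the drift the status check looks for.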

data_version¶

A SHA-256 over the contents of an output directory (excluding the manifest itself). The fingerprint of "what bytes were produced?" lc verify recomputes this and compares it to the recorded value to catch tampering.
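
A directory hash like this can be sketched by walking the files in a stable order and feeding each relative path and file digest into one running SHA-256. The traversal order and the way paths are mixed in are assumptions for illustration; the real data_version may be computed differently.

```python
import hashlib
from pathlib import Path

MANIFEST_NAME = ".lightcone-manifest.json"


def data_version(output_dir: Path) -> str:
    """Hash every file under output_dir except the manifest sidecar."""
    digest = hashlib.sha256()
    files = [
        p for p in output_dir.rglob("*")
        if p.is_file() and p.name != MANIFEST_NAME
    ]
    # Sort by relative POSIX path so the result does not depend on listing order.
    for path in sorted(files, key=lambda p: p.relative_to(output_dir).as_posix()):
        rel = path.relative_to(output_dir).as_posix()
        digest.update(rel.encode("utf-8") + b"\0")                  # file name
        digest.update(hashlib.sha256(path.read_bytes()).digest())   # file contents
    return digest.hexdigest()
```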

input_versions¶

Inside a manifest, a dict mapping each declared input id to its version: the upstream output's data_version when the input is another materialized output, or an mtime-size/sha256 fingerprint when the input is an external file. This is the chain lc verify walks back through.

Container¶

A Docker / Podman / podman-hpc image used to execute a recipe in isolation. Declared at the analysis level (container: Containerfile) or per-recipe (recipe: { container: python:3.12-slim }). Recipe-level overrides win.

Containerfile¶

A Dockerfile by another name (the syntax is identical). lightcone-cli calls them Containerfiles to make clear they work with podman as well as docker.

Image tag¶

The string the runtime uses to identify a built image. lightcone-cli generates content-addressed tags for Containerfile builds: lc-<project>-<sha256[:12]>. The hash covers the Containerfile and your dependency files, so tags only change when the inputs to the build change.
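
A content-addressed tag of this shape can be sketched in a few lines. Exactly which dependency files are covered and how they are combined is not specified here, so the inputs and the hashing order below are assumptions.

```python
import hashlib


def image_tag(project: str, containerfile: bytes, dependency_files: dict) -> str:
    """Build an lc-<project>-<sha256[:12]> tag from the build inputs."""
    digest = hashlib.sha256()
    digest.update(containerfile)
    # Hash dependency files in name order so the tag is stable across runs.
    for name in sorted(dependency_files):
        digest.update(name.encode("utf-8") + b"\0")
        digest.update(dependency_files[name])
    return f"lc-{project}-{digest.hexdigest()[:12]}"


# Hypothetical project name and file contents.
tag = image_tag(
    "demo",
    b"FROM python:3.12-slim\n",
    {"requirements.txt": b"numpy\n"},
)
```

Rebuilding with identical inputs gives the same tag, so an existing image can be reused; editing the Containerfile or a dependency file produces a new one.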

Runtime¶

The OCI tool that actually executes containers: docker, podman, or podman-hpc. Set in ~/.lightcone/config.yaml under container.runtime. auto picks the first usable runtime; none opts out (runs recipes directly on the host).
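
The selection logic can be pictured roughly as follows. The probe order, the "found on PATH" test for usability, and the error raised when nothing is found are all assumptions for this sketch; lightcone-cli may check more than that.

```python
import shutil
from typing import Callable, Optional

# Assumption: candidates are probed in the order the entry lists them.
CANDIDATES = ("docker", "podman", "podman-hpc")


def pick_runtime(
    setting: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Resolve the container.runtime setting to a runtime name, or None for host runs."""
    if setting == "none":
        return None                      # run recipes directly on the host
    if setting == "auto":
        for name in CANDIDATES:
            if which(name):              # treat "on PATH" as "usable" (a simplification)
                return name
        raise RuntimeError("no container runtime found on PATH")
    return setting                       # an explicit choice is passed through
```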

Snakemake¶

The workflow engine lc run shells out to. You don't need to learn Snakemake to use lightcone-cli — the Snakefile at .lightcone/Snakefile is auto-generated from your astra.yaml. If you're curious, peek at it; just don't edit it (your changes will get overwritten on the next lc run).

Dask¶

The distributed scheduler lc run dispatches jobs through. On a laptop it's a LocalCluster sized to your machine; inside a SLURM allocation it's an in-process scheduler with one Dask worker per node launched via srun.

Skill¶

A Claude Code slash command bundled with the lightcone-cli plugin (/lc-new, /lc-build, /lc-verify, /lc-migrate, /lc-feedback). Each one is a structured prompt that drives the agent through a specific phased workflow.

Subagent¶

A Claude Code agent invoked by another agent via the Task tool. The lc-extractor subagent reads PDFs and pulls verifiable quotes; it's spawned by /lc-new during the literature deep-dive phase. Subagents have isolated context, which is why /lc-new uses one per paper — PDFs are big.

Prior insight¶

A piece of evidence from the literature that informs a decision. Lives in the prior_insights: section of astra.yaml. Each insight has a claim, one or more evidence entries with verbatim quotes, and a list of decision options it supports. Quotes are machine-verified against the source PDF.

Finding¶

A conclusion drawn from the analysis (as opposed to a prior insight, which comes into the analysis). Findings live in the findings: section, can cite specific outputs as evidence, and act as the bridge between materialized results and the eventual paper.

Status (`ok`, `stale`, `missing`, `alias`)¶

The four labels lc status produces:

ok — manifest present, recomputed code_version matches.
stale — manifest present but code_version drifted.
missing — no manifest at the expected output directory.
alias — output declared without a recipe; just a reference to another output.
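
The four labels can be read as a small decision procedure. The sketch below is a simplified reading of the definitions above: the function and its arguments are hypothetical, and it does not model how staleness propagates to downstream outputs.

```python
from typing import Optional


def classify(has_recipe: bool, manifest: Optional[dict], current_code_version: str) -> str:
    """Map one output's state to ok / stale / missing / alias."""
    if not has_recipe:
        return "alias"      # declared without a recipe
    if manifest is None:
        return "missing"    # no manifest at the expected output directory
    if manifest.get("code_version") != current_code_version:
        return "stale"      # recorded code_version no longer matches
    return "ok"
```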

Failure kinds (`tampered_data`, `broken_chain`, `missing_manifest`)¶

The three labels lc verify produces when something's wrong:

tampered_data — bytes on disk no longer match recorded data_version.
broken_chain — recorded input_versions references an upstream whose data_version drifted.
missing_manifest — output directory exists but the manifest is missing or unparseable.
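
Put together, the three failure kinds amount to the checks sketched below for a single output. The order of the checks and the function signature are assumptions; external-file inputs (fingerprinted by mtime-size or sha256) are simply skipped here.

```python
from typing import Optional


def failure_kind(
    manifest: Optional[dict],
    actual_data_version: str,
    upstream_data_versions: dict,
) -> Optional[str]:
    """Return a failure label for one output, or None if it verifies cleanly."""
    if manifest is None:
        return "missing_manifest"   # manifest missing or unparseable
    if manifest.get("data_version") != actual_data_version:
        return "tampered_data"      # bytes on disk differ from the record
    for input_id, recorded in manifest.get("input_versions", {}).items():
        current = upstream_data_versions.get(input_id)
        # Only inputs that are other materialized outputs are compared here.
        if current is not None and current != recorded:
            return "broken_chain"   # upstream data_version drifted
    return None
```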

Ralph loop¶

The autonomous build loop driven by /lc-build. Each iteration: survey state, decide what to do next, write/run code, commit, exit. The Claude Code stop hook re-injects the loop prompt until the agent emits BUILD_COMPLETE or hits its iteration limit. State persists across crashes in .claude/ralph-loop.local.md. Cancel with /cancel-ralph.

Permission tier¶

The set of tools and bash patterns Claude Code is allowed to use in your project. Three tiers ship: yolo (everything), recommended (the default: full access minus dangerous patterns), and minimal (read-only). Selected at lc init time and stored in .claude/settings.json.