Self-Healing Triage Dataset

Complete audit of all triage decisions made by the sisakuintel-agent self-healing system, published for independent verification and community challenge.

Data Sources
sisakulint Issues/PRs by sisakuintel-agent — False positive reports and fix PRs (45 items)
sisakuintel-worker Scan Reports — Full triage results including true positives (~220 reports, #300–#555)
Last Updated: 2026-03-06

How to Read This Dataset

The self-healing architecture routes findings through two repositories:

sisakuintel-worker: The scan-orchestrator creates [Scan Report] issues for each scanned repository. A triage comment classifies every finding as TP or FP with reasoning.
sisakulint: When the triage agent identifies a false positive, it creates a bug report Issue and a corresponding fix PR in the scanner’s own repository. Only FP-related items appear here.

The “45 triaged items” figure referenced in the evaluation metrics comprises 27 finding Issues and 18 fix PRs created by sisakuintel-agent[bot] in the sisakulint repository. TP findings are documented in sisakuintel-worker Scan Report comments.
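The 27 + 18 split can be reproduced from the per-period tables in Part A. The sketch below is a minimal tally, assuming the issue and PR numbers are transcribed by hand from those tables. The dictionary and function names are illustrative, and the row for #378 is left out because the tables list it without an item number.

```python
# Issue and PR numbers transcribed by hand from the Part A tables, grouped by period.
# The names below are illustrative; they are not part of any sisakulint tooling.
ISSUES = {
    "Week 1": [234, 240, 242, 245, 246, 249, 251],
    "Week 3-4": [307, 308, 309, 310],
    "Week 5-6": [324, 328, 333, 335, 337],
    "Week 6-7": [339, 341, 344, 346, 348, 349],
    "Week 8-9": [370, 371, 373, 375, 377],
}
FIX_PRS = {
    "Week 1": [235, 241, 247, 250, 252],
    "Week 3-4": [311, 312],
    "Week 5-6": [329, 334, 336, 338],
    "Week 6-7": [340, 342, 347, 350],
    "Week 8-9": [372, 374, 376],  # #378 is listed without an item number
}


def tally(groups):
    """Return (items per period, overall total) for one table column."""
    per_period = {period: len(numbers) for period, numbers in groups.items()}
    return per_period, sum(per_period.values())


issue_counts, issue_total = tally(ISSUES)
pr_counts, pr_total = tally(FIX_PRS)
print(issue_total, pr_total, issue_total + pr_total)  # 27 18 45
```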

Classification Criteria

A finding is classified as:

True Positive (TP): The flagged pattern is exploitable given the workflow’s trigger type, permission scope, and step dependency graph.
False Positive (FP): At least one contextual factor (trigger restrictions, permission scope, step dependencies, or safe data types) eliminates exploitability.

Ground truth was determined by the author (Atsushi Sada). This dataset publishes the complete classification with per-finding rationale to enable independent re-classification by any reviewer.
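The criteria above amount to a conjunction: a finding is TP only if no contextual factor rules out exploitation. The following is a minimal sketch of that decision rule, not the triage agent's implementation. `FindingContext` and its field names are hypothetical, chosen to mirror the four factors listed in the FP definition.

```python
from dataclasses import dataclass


@dataclass
class FindingContext:
    """Hypothetical summary of what a reviewer checks for one finding."""
    trigger_allows_untrusted_input: bool  # trigger type lets an attacker reach the job
    permissions_allow_impact: bool        # token or secret scope is broad enough to matter
    steps_reach_flagged_code: bool        # step dependency graph reaches the flagged pattern
    data_is_attacker_controlled: bool     # False for safe data types such as a commit SHA


def classify(ctx: FindingContext) -> str:
    """Return "TP" only when every factor leaves the pattern exploitable."""
    factors = (
        ctx.trigger_allows_untrusted_input,
        ctx.permissions_allow_impact,
        ctx.steps_reach_flagged_code,
        ctx.data_is_attacker_controlled,
    )
    return "TP" if all(factors) else "FP"
```

A reviewer re-classifying a finding could fill in the four fields from the workflow file and compare the result with the published determination.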

Part A: sisakulint Repository (FP Reports + Fix PRs, 45 Items)

Legend

Symbol	Meaning
FP	False Positive
Bug	Scanner bug (not a TP/FP classification)
Merged	Fix PR merged after human review
Open	Fix PR pending
N/A	No bot-generated fix PR

Week 1 (2026-01-06 – 2026-01-11): 12 items

#	Issue/PR	Type	Classification	Rule	Target Repo	Rationale
1	#234	Issue	FP	syntax (parser)	anthropics/claude-code	`description` is a valid workflow key. Missing from allowed keys list in parse_main.go
2	#235	PR (HikaruEgashira)	Fix → Merged	syntax	—	Fix for #234
3	#240	Issue	FP	cache-poisoning	—	Checking out `base.ref` (target branch) is safe; does not execute PR code
4	#241	PR	Fix → Merged	cache-poisoning	—	Fix for #240
5	#242	Issue	FP (2 findings)	cond, expression	ophub/fnnas	(a) Multi-expression conditions `${{ A }} == ${{ B }}` flagged as “always true”; (b) `cancelled()` treated as undefined function
6	#245	Issue	FP	cond	—	Same root cause as #242
7	#246	Issue	Bug	cond, expression	—	Auto-fix functionality broken for these rules
8	#247	PR	Fix → Merged	cond	—	Fix for #242/#245/#246
9	#249	Issue	FP	permissions	—	`permissions: read-all` is valid (used by OpenSSF Scorecard). Missing from switch statement
10	#250	PR	Fix → Merged	permissions	—	Fix for #249
11	#251	Issue	FP	artifact-poisoning	—	`/tmp` is outside workspace; cannot overwrite source code
12	#252	PR	Fix → Merged	artifact-poisoning	—	Fix for #251

Week 3–4 (2026-01-24 – 2026-02-01): 6 items

#	Issue/PR	Type	Classification	Rule	Target Repo	Rationale
13	#307	Issue	FP	impostor-commit	github/copilot-sdk	Valid commit on `releases/v5` branch flagged as impostor
14	#308	Issue	FP (3 findings)	untrusted-checkout, cache-poisoning, code-injection	google/langextract	Job-level `if: github.event_name == 'pull_request'` restricts execution to safe triggers, but rules only check workflow-level triggers
15	#309	Issue	FP (2 findings)	permissions, impostor-commit	nexmoe/VidBee	(a) Job-level permissions defined but workflow-level flagged as missing; (b) valid main branch commit flagged as impostor
16	#310	Issue	FP	artifact-poisoning	OpenBMB/VoxCPM	No checkout step in job — artifact download cannot overwrite source code
17	#311	PR	Fix → Merged	permissions	—	Fix for #309
18	#312	PR	Fix → Merged	artifact-poisoning	—	Fix for #310

Note: #308 was fixed by #315, a PR by ultra-supara that introduced `JobTriggerAnalyzer`. It is not a bot PR and is not counted in the 45 items.
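The root cause in #308 is that a job-level `if:` can narrow the events a job actually runs on, even when the workflow-level `on:` lists riskier triggers. The sketch below illustrates that idea only. It is not the `JobTriggerAnalyzer` from #315, whose internals are not described here. `effective_triggers` is a hypothetical helper that understands only simple `github.event_name == '...'` comparisons and otherwise stays conservative.

```python
import re

# Matches comparisons such as: github.event_name == 'pull_request'
_EVENT_EQ = re.compile(r"github\.event_name\s*==\s*'([^']+)'")


def effective_triggers(workflow_triggers, job_if=None):
    """Return the events a job can run on, given a simple job-level condition.

    Only positive equality checks are understood. Negations and other
    expressions are ignored, so the result falls back to the full
    workflow-level trigger set (conservative).
    """
    triggers = set(workflow_triggers)
    if not job_if:
        return triggers
    allowed = set(_EVENT_EQ.findall(job_if))
    if not allowed:
        return triggers  # condition not understood; do not narrow
    return triggers & allowed
```

With workflow triggers `pull_request` and `pull_request_target` and the job condition from #308, only `pull_request` remains, which is why trigger-sensitive rules should evaluate the narrowed set.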

Week 5–6 (2026-02-04 – 2026-02-17): 9 items

#	Issue/PR	Type	Classification	Rule	Target Repo	Rationale
19	#324	Issue	FP	impostor-commit	j178/prek	Annotated tag object SHA differs from commit SHA; valid commit flagged
20	#328	Issue	FP	artipacked	—	Auto-fix unconditionally adds `persist-credentials: false`, breaking workflows that need git credentials for `git push`
21	#329	PR	Fix → Merged	artipacked	—	Fix for #328: guard condition checking for upload-artifact
22	#333	Issue	FP (2 findings)	impostor-commit, cache-poisoning	koala73/worldmonitor	(a) Annotated tag handling; (b) `swatinem/rust-cache` false alert
23	#334	PR	Fix → Merged	impostor-commit, cache-poisoning	—	Fix for #324/#333
24	#335	Issue	FP	parser	nearai/ironclaw	YAML anchors (`&name`) and aliases (`*name`) flagged as syntax errors
25	#336	PR	Fix → Merged	parser	—	Fix for #335: added `dereferenceAlias()` helper
26	#337	Issue	FP	parser	—	`dependabot.yml` validated as workflow file, reporting missing `on:`/`jobs:`
27	#338	PR	Fix → Merged	parser	—	Fix for #337
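Items #324 and #333 both involve annotated tags. In Git, an annotated tag is a separate object with its own ID, and its body names the tagged commit on an `object` line, so the tag's ID and the commit's ID differ. The sketch below illustrates this with Git's standard object hashing, assuming a SHA-1 repository. The tag body and commit ID are made up, and `peel_tag` is a hypothetical helper rather than sisakulint's code.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Hash an object the way Git does: sha1("<type> <size>\\0" + body)."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def peel_tag(tag_body: bytes) -> str:
    """Return the ID named on the first line ("object <id>") of a tag body."""
    first_line = tag_body.split(b"\n", 1)[0]
    if not first_line.startswith(b"object "):
        raise ValueError("not an annotated tag body")
    return first_line[len(b"object "):].decode()


# Made-up example data for illustration only.
COMMIT_ID = "0123456789abcdef0123456789abcdef01234567"
TAG_BODY = (
    f"object {COMMIT_ID}\n"
    "type commit\n"
    "tag v1.0.0\n"
    "tagger Example <example@example.com> 0 +0000\n"
    "\n"
    "release\n"
).encode()

tag_id = git_object_id("tag", TAG_BODY)
print(tag_id != COMMIT_ID)             # the tag object has its own ID
print(peel_tag(TAG_BODY) == COMMIT_ID)  # peeling recovers the commit ID
```

A reachability check that compares the tag object's ID against commit IDs will therefore miss a valid pin unless the tag is peeled first.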

Week 6–7 (2026-02-18 – 2026-02-27): 10 items

#	Issue/PR	Type	Classification	Rule	Target Repo	Rationale
28	#339	Issue	FP	dependabot-github-actions	—	Remote scan mode uses `os.Stat()` on local filesystem
29	#340	PR	Fix → Merged	dependabot-github-actions	—	Fix for #339
30	#341	Issue	FP	toctou	google/langextract	`JobTriggerAnalyzer` (PR #315) was not applied to TOCTOU rule variant
31	#342	PR	Fix → Merged	toctou	—	Fix for #341
32	#344	Issue	FP	parser	—	#338 fix not deployed to API server; dependabot.yml FP recurrence
33	#346	Issue	FP	commit-sha	—	Local actions (`./my-action`) are part of the same repo; not subject to supply chain attacks
34	#347	PR	Fix → Merged	commit-sha	—	Fix for #346
35	#348	Issue	Bug	infra	—	Lambda deployment outdated
36	#349	Issue	FP	impostor-commit	—	API rate-limiting causes `getTags()` to return empty; all fallback checks fail, falling through to `isImpostor: true`
37	#350	PR	Fix → Merged	impostor-commit	—	Fix for #349: fail-open on API errors
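The #349/#350 pair describes a fail-open policy: when the lookup that would prove a commit legitimate fails, the rule should not conclude that the commit is an impostor. A minimal sketch of that policy follows, assuming the failure surfaces as an exception. `LookupFailed`, `impostor_verdict`, and the `list_known_shas` callable are all hypothetical names, not the scanner's API.

```python
from typing import Callable, Iterable


class LookupFailed(Exception):
    """Hypothetical error for a rate-limited or otherwise failed API call."""


def impostor_verdict(sha: str, list_known_shas: Callable[[], Iterable[str]]) -> str:
    """Return "valid", "impostor", or "unknown" for a pinned commit SHA."""
    try:
        known = set(list_known_shas())
    except LookupFailed:
        # Fail open: missing evidence is not evidence of an impostor commit.
        return "unknown"
    return "valid" if sha in known else "impostor"
```

The design choice is to separate "could not verify" from "verified as unreachable", so that rate limiting degrades coverage instead of producing false alarms.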

Week 8–9 (2026-03-01 – 2026-03-03): 8 items

#	Issue/PR	Type	Classification	Rule	Target Repo	Rationale
38	#370	Issue	FP	impostor-commit	gsd-build/get-shit-done	Official tagged commits (v4.3.1, v4.4.0) flagged as impostor
39	#371	Issue	FP	code-injection	—	`head.sha` is always 40-char hex; cannot contain shell metacharacters
40	#372	PR	Fix → Merged	code-injection	—	Fix for #371
41	#373	Issue	FP	impostor-commit	dtolnay/rust-toolchain	Non-default branches (`stable`, `nightly`) used as version identifiers; reachability check only compared against default branch
42	#374	PR	Fix → Merged	impostor-commit	—	Fix for #373
43	#375	Issue	FP	secret-exfiltration	—	Webhook URL used as curl destination misidentified as data exfiltration
44	#376	PR	Fix → Open	secret-exfiltration	—	Fix for #375 (superseded by #378)
45	#377	Issue	FP (2 findings)	secret-exfiltration	—	(a) Shell line continuation (`\`) breaks `matchesLegitPattern`; (b) secret used as curl URL positional arg misidentified as data payload
—	#378	PR	Fix → Merged	secret-exfiltration	—	Fix for #375/#377
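Two of the rationales in this period are easy to check mechanically: a full commit SHA consists of 40 hexadecimal characters and so cannot carry shell metacharacters (#371), and a backslash followed by a newline continues a shell command onto the next line (#377a). The sketch below shows both checks under those assumptions. The function names are hypothetical, and the internals of `matchesLegitPattern` are not reproduced here.

```python
import re

_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def looks_like_commit_sha(value: str) -> bool:
    """True if the value is exactly 40 lowercase hex characters."""
    return _FULL_SHA.fullmatch(value) is not None


def join_line_continuations(script: str) -> str:
    """Remove backslash-newline pairs so a multi-line command becomes one line."""
    return script.replace("\\\n", "")


command = "curl -X POST \\\n  -d payload https://example.invalid/hook"
print(join_line_continuations(command).split())
```

Running pattern matching on the joined command rather than on individual lines avoids treating the second half of a continued `curl` invocation as a separate statement.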

Visualizations

Scan Report Classification Distribution

pie title Scan Report Classification (n≈220)
    "All TP" : 130
    "Mixed (TP+FP)" : 50
    "All/Majority FP" : 20
    "No triage / Non-scan" : 20

FP Corrections by Rule Category

xychart-beta
    title "False Positive Corrections by Rule Category"
    x-axis ["impostor-commit", "parser", "cond/expr/code-inj", "credential", "artifact-poison", "cache-poison", "permissions", "toctou", "commit-sha"]
    y-axis "Number of Corrections" 0 --> 7
    bar [6, 4, 3, 3, 2, 2, 2, 1, 1]

Self-Healing Timeline (Cumulative FP Corrections)

xychart-beta
    title "Cumulative FP Corrections Over Time"
    x-axis ["W1 (Jan 6)", "W2", "W3-4 (Jan 24)", "W5-6 (Feb 4)", "W6-7 (Feb 18)", "W8-9 (Mar 1)"]
    y-axis "Cumulative Corrections" 0 --> 27
    line [8, 8, 13, 19, 24, 27]
    bar [8, 0, 5, 6, 5, 3]

Bar = new FP findings per period. Line = cumulative total.
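The line series is the running sum of the bar series. A short check, using the per-period values from the chart above:

```python
from itertools import accumulate

# New FP findings per period, taken from the bar series of the chart.
new_per_period = [8, 0, 5, 6, 5, 3]

cumulative = list(accumulate(new_per_period))
print(cumulative)  # [8, 8, 13, 19, 24, 27]
```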

Self-Healing Pipeline Flow

flowchart LR
    A["sisakuintel-worker\nscans trending repo"] --> B{"Triage\ncomment"}
    B -->|All TP| C["TP findings\ndocumented in\nScan Report"]
    B -->|FP detected| D["sisakuintel-agent\ncreates Issue\nin sisakulint"]
    D --> E["sisakuintel-agent\ncreates Fix PR"]
    E --> F{"Human review"}
    F -->|Approve| G["Merged\n(15 bot PRs)"]
    F -->|Reject/Revise| H["Manual fix\n(2 human PRs)"]
    G --> I["Scanner improved\nFP eliminated"]
    H --> I
    I -.->|"Next scan"| A

    style A fill:#4a90d9,color:#fff
    style B fill:#f5a623,color:#fff
    style C fill:#7ed321,color:#fff
    style D fill:#d0021b,color:#fff
    style E fill:#d0021b,color:#fff
    style G fill:#7ed321,color:#fff
    style I fill:#7ed321,color:#fff
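The routing in the flowchart can be sketched as a small function: every report gets a triage comment, and each rule with at least one FP additionally produces an Issue and a fix PR that waits for human review. This is an illustrative model of the diagram, not the agent's code. The `route` function and the action strings are hypothetical.

```python
def route(findings):
    """Map one Scan Report's triaged findings to follow-up actions.

    `findings` is a list of (rule, classification) pairs, where the
    classification is "TP" or "FP".
    """
    actions = ["post triage comment on Scan Report"]
    fp_rules = sorted({rule for rule, verdict in findings if verdict == "FP"})
    for rule in fp_rules:
        actions.append(f"create sisakulint Issue for {rule}")
        actions.append(f"create Fix PR for {rule} (pending human review)")
    return actions


print(route([("commit-sha", "TP"), ("impostor-commit", "FP")]))
```

An all-TP report yields only the triage comment, which matches the "All TP" branch of the diagram.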

Fix PR Outcome Distribution

pie title Fix PR Outcomes (18 total)
    "Bot PR Merged" : 15
    "Bot PR Open" : 1
    "Human PR (outside count)" : 2

Part B: sisakuintel-worker Repository (Scan Report Triage, TP + FP)

Each [Scan Report] issue in sisakuintel-worker contains a triage comment with per-finding TP/FP classification and reasoning. This section summarizes all triaged reports from #300 to #555.

Aggregate Classification (#300–#555, ~220 Scan Reports)

Classification	Count	Percentage
All TP (every finding confirmed valid)	~130	~60%
Mixed (TP + FP in same report)	~50	~23%
All/Majority FP (scanner bug)	~20	~9%
No triage comment / Non-scan issues	~20	~8%
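The percentages follow from the approximate counts. The sketch below recomputes the shares from the table; because the counts themselves are rounded estimates, the computed values need not match the rounded percentages exactly.

```python
# Approximate counts from the table above.
counts = {
    "All TP": 130,
    "Mixed (TP + FP)": 50,
    "All/Majority FP": 20,
    "No triage comment / Non-scan issues": 20,
}

total = sum(counts.values())
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share}%")
```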

Representative HIGH/CRITICAL Severity TP Findings

Worker #	Repository	Detection	Severity
#513	ZhuLinsen/daily_stock_analysis	`pull_request_target` + PR head checkout → external contributor code runs with access to GEMINI_API_KEY, OPENAI_API_KEY, GITHUB_TOKEN	CRITICAL
#542	router-for-me/CLIProxyAPI	Additional HIGH severity vulnerability discovered during triage	HIGH
#469	Veirt/weathr	`code-injection-critical` in homebrew.yml	HIGH
#484	stan-smith/FossFLOW	impostor-commit + dangerous-triggers-critical	HIGH

Representative All-TP Scan Reports

Worker #	Repository	Findings	Key Rules
#554	rtk-ai/rtk	105	commit-sha, secrets:inherit, dependabot, latest tag
#553	ruvnet/RuView	280	permissions, dependabot, commit-sha
#552	openai/symphony	21	permissions, dependabot, commit-sha
#534	superset-sh/superset	406	commit-sha, artifact-poisoning, cache-poisoning, artipacked
#527	alibaba/OpenSandbox	316	commit-sha, artipacked, dependabot, self-hosted-runner
#509	ruvnet/ruvector	1269	commit-sha, artipacked, permissions
#506	clockworklabs/SpacetimeDB	499	permissions, dependabot, commit-sha, artipacked
#492	D4Vinci/Scrapling	71	all TP
#491	cloudflare/agents	29	all TP
#452	anthropics/claude-quickstarts	68	all TP

Representative Mixed (TP + FP) Scan Reports

Reports where the triage correctly separated TP from FP within the same scan:

Worker #	Repository	TP Count	FP Count	FP Details	Resulting Fix
#543	mengxi-ream/read-frog	86	2	`secret-exfiltration` FP	→ sisakulint #378
#538	block/goose	many	many	`impostor-commit` mass FP; `reusable-workflow-taint` TP	→ sisakulint #370
#531	gsd-build/get-shit-done	5	2	`impostor-commit` on actions/checkout@v4.3.1	→ sisakulint #370
#536	koala73/worldmonitor	many	few	`impostor-commit` non-default branch	→ sisakulint #374
#520	ruvnet/claude-flow	376	6	6 FP out of 382 total	—
#508	vercel/chat	many	7	dependabot.yml misidentified as workflow	→ sisakulint #338
#465	google/langextract	many	few	`toctou/critical` FP	→ sisakulint #342
#358	google/langextract	40	3+	untrusted-checkout/cache-poisoning/code-injection FP	→ sisakulint #315
#325	anthropics/claude-code	14	5	`description` syntax FP	→ sisakulint #235

FP-to-Fix Traceability

Complete chain from FP discovery in worker to scanner fix:

Worker #	FP Discovered	→ sisakulint Issue	→ Fix PR	Status
#325	anthropics/claude-code `description` syntax	#234	#235	Merged
#331	marcelscruz/public-apis cache-poisoning `base.ref`	#240	#241	Merged
#336	ophub/fnnas cond/expression	#242	#247	Merged
#371	nexmoe/VidBee permissions + impostor-commit	#309	#311	Merged
#361	OpenBMB/VoxCPM artifact-poisoning without checkout	#310	#312	Merged
#392	j178/prek impostor-commit annotated tag	#324	#334	Merged
#416	koala73/worldmonitor impostor-commit	#333	#334	Merged
#423	nearai/ironclaw YAML anchors	#335	#336	Merged
#508	vercel/chat dependabot.yml	#337	#338	Merged
#358	google/langextract job-level triggers	#308	#315	Merged
#465	google/langextract TOCTOU	#341	#342	Merged
#482	qwibitai/nanoclaw local action commit-sha	#346	#347	Merged
#536	koala73/worldmonitor non-default branch	#373	#374	Merged
#531	gsd-build/get-shit-done official tag	#370	—	Open
#543	mengxi-ream/read-frog secret-exfiltration	#377	#378	Merged
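For scripted verification, the chain above can be held as a simple lookup structure. The sketch below transcribes the table rows by hand; the `TRACE` list and both helper functions are illustrative names, and `None` stands for the row without a fix PR.

```python
# (worker report, sisakulint issue, fix PR) transcribed from the table above.
TRACE = [
    (325, 234, 235), (331, 240, 241), (336, 242, 247), (371, 309, 311),
    (361, 310, 312), (392, 324, 334), (416, 333, 334), (423, 335, 336),
    (508, 337, 338), (358, 308, 315), (465, 341, 342), (482, 346, 347),
    (536, 373, 374), (531, 370, None), (543, 377, 378),
]


def fix_for_worker(worker):
    """Return (issue, fix PR) for a worker Scan Report, or None if not listed."""
    for w, issue, pr in TRACE:
        if w == worker:
            return issue, pr
    return None


def issues_fixed_by(pr):
    """Return every sisakulint issue that the given fix PR resolves."""
    return sorted(issue for _, issue, p in TRACE if p == pr)


print(fix_for_worker(543))   # (377, 378)
print(issues_fixed_by(334))  # [324, 333]
```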

Statistics

Fix PR Status

Status	Count	PR Numbers
Bot PR Merged	15	#241, #247, #250, #252, #311, #312, #329, #334, #336, #338, #340, #342, #347, #350, #372, #374, #378
Bot PR Open	1	#376
Human PR Merged (outside 45 count)	2	#235 (HikaruEgashira), #315 (ultra-supara)
Unfixed	1	#370 (open)

Rule Category Breakdown

Category	Corrections	sisakulint Issues
Impostor-commit	6	#307, #309, #324, #333, #349, #370, #373
Artifact Poisoning	2	#251, #310
Access Control	2	#249, #309
Credential Protection (artipacked, secret-exfiltration)	3	#328, #375, #377
Code Injection (cond, expression, code-injection)	3	#242, #245, #371
Parser/Validation	4	#234, #335, #337, #339
TOCTOU	1	#341
Third Party (commit-sha)	1	#346
Cache Poisoning	2	#240, #333

Verification Guide

To independently verify any triage decision:

1. Pick a Scan Report from sisakuintel-worker issues (e.g., #543).
2. Read the triage comment, which contains the full reasoning for each finding.
3. Clone the target repository listed in the Scan Report.
4. Inspect the workflow file at the specified path and line number.
5. Assess exploitability using the classification criteria above.
6. Compare your classification with the published determination.
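The final comparison step can be sketched as follows, assuming a reviewer records each verdict as "TP" or "FP" keyed by a finding identifier of their own choosing. The `compare` function and the example identifiers are hypothetical; the dataset does not prescribe an identifier format.

```python
def compare(published, reviewer):
    """Compare two {finding_id: "TP" | "FP"} mappings.

    Returns (agreement rate over shared findings, list of disagreements).
    The rate is None when the two mappings share no findings.
    """
    shared = sorted(published.keys() & reviewer.keys())
    disagreements = [
        (fid, published[fid], reviewer[fid])
        for fid in shared
        if published[fid] != reviewer[fid]
    ]
    if not shared:
        return None, disagreements
    return (len(shared) - len(disagreements)) / len(shared), disagreements


# Made-up identifiers and verdicts, for illustration only.
published = {"report-1:finding-1": "TP", "report-1:finding-2": "FP"}
reviewer = {"report-1:finding-1": "TP", "report-1:finding-2": "TP"}
print(compare(published, reviewer))
```

Each disagreement tuple is a natural starting point for the issue a reviewer would open against the published determination.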

For FP findings that led to scanner fixes:

1. Read the sisakulint Issue (e.g., #377) for the FP report with root cause analysis.
2. Read the Fix PR (e.g., #378) for the code change and regression tests.
3. Verify the fix does not reduce detection of true positives by checking CI results.

If you disagree with any TP/FP determination, open an issue in the sisakulint repository with your analysis.

Limitations

Single annotator: Ground truth was classified by the author. Per-finding rationale and triage reasoning are published for independent re-classification. GHSA/GHSL advisory outcomes provide partial external validation for the advisory-confirmed subset, though advisories confirm vulnerability existence at the repository level, not individual rule-level TP/FP status.
FP-only in sisakulint: The sisakulint repository contains only false positive reports and fixes because the architecture routes only FPs to the scanner for correction. TP evidence is in sisakuintel-worker Scan Report comments.
Evolving dataset: The self-healing system continues to operate. New findings, triage decisions, and fixes are added continuously and are publicly visible in real time.