Why clean datasets matter
Training quality lives or dies on data hygiene. Consistent folders, sensible filenames, and clear captions help Kohya learn the right patterns quickly, with fewer steps and less VRAM.
Folder structure & naming
Keep one subject/style per subfolder. Avoid spaces/odd symbols in names.
/your-dataset/
  character_a/
    0001.jpg
    0002.jpg
    0003.jpg
    0001.txt   # optional caption sidecars
  character_b/
    0001.jpg
    0002.jpg
    ...
- Filenames: stick to a-z, 0-9, -, _. Avoid spaces.
- Images: keep them reasonably sharp; remove near-duplicates.
- Resolution: aim for a consistent long side (e.g., 640–1024 px) before training.
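The filename rule above can be checked automatically before training. The following is a minimal sketch, not part of Training Center: the helper names `is_safe_name` and `find_unsafe_names` are hypothetical, and it assumes a single lowercase extension such as `.jpg` or `.txt` is acceptable. It applies the rule literally, so uppercase letters are flagged too.

```python
import re
from pathlib import Path

# Allowed: a-z, 0-9, '-' and '_', optionally followed by one lowercase extension.
# Allowing the dot and extension is an assumption; the rule itself only lists the characters.
SAFE_NAME = re.compile(r"[a-z0-9_-]+(\.[a-z0-9]+)?")


def is_safe_name(name: str) -> bool:
    """Return True if a file or folder name follows the naming rule."""
    return SAFE_NAME.fullmatch(name) is not None


def find_unsafe_names(dataset_root: str) -> list[str]:
    """List paths (relative to dataset_root) whose last component breaks the rule."""
    root = Path(dataset_root)
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")
        if not is_safe_name(p.name)
    )
```

Calling `find_unsafe_names("/your-dataset")` returns the offending paths so you can rename them by hand; an empty list means every name passed.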
Auto path-fix
If you moved or renamed folders, Training Center attempts an auto path-fix when you choose the dataset root.
Captions: manual or auto
You can provide .txt sidecars or let the app auto-caption.
Auto-captioning requires a running A1111 instance: start A1111 from the Launcher, then click Auto Caption. Stop A1111 after captioning to free VRAM.
Caption tips
- Style: concise nouns/adjectives, avoid story prose.
- Content: describe pose/clothes/background if relevant.
- Consistency: use similar wording across the same subject.
Caption Example:
sks_waifu, 1girl, bangs, bare legs, bare shoulders, barefoot, beach, blonde hair, blue dress, blue eyes, blue sky, boat, branch, breasts, cherry blossoms, cloud, cloudy sky, day, dress, falling petals, flower, gradient sky, horizon, knees up, lake, landscape, long hair, looking at viewer, mount fuji, mountain, mountainous horizon, ocean, outdoors, petals, petals on liquid, pink flower, planet, pond, river, rock, sand, scenery, seashell, shell, shore, sitting, sky, sleeveless, smile, solo, spring \(season\), starfish, sunrise, sunset, tree, water, watercraft, waves
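Comma-separated captions like the one above are easy to check by script. The sketch below is illustrative only and is not how the app validates captions: the function names are hypothetical, and the idea that the trigger token (here `sks_waifu`) should come first is an assumption based on the example. It flags a missing leading trigger, repeated tags, and tags that appear in very few captions, which are often typos or wording variants.

```python
from collections import Counter


def parse_tags(caption: str) -> list[str]:
    """Split a comma-separated caption into stripped, non-empty tags."""
    return [t.strip() for t in caption.split(",") if t.strip()]


def caption_issues(caption: str, trigger: str) -> list[str]:
    """Return human-readable problems found in a single caption."""
    tags = parse_tags(caption)
    issues = []
    # Assumption: the trigger token is expected to be the first tag.
    if not tags or tags[0] != trigger:
        issues.append(f"trigger '{trigger}' is not the first tag")
    dupes = sorted(t for t, n in Counter(tags).items() if n > 1)
    if dupes:
        issues.append("duplicate tags: " + ", ".join(dupes))
    return issues


def rare_tags(captions: list[str], max_count: int = 1) -> list[str]:
    """Tags that appear in at most max_count captions across the set."""
    counts = Counter(tag for c in captions for tag in set(parse_tags(c)))
    return sorted(t for t, n in counts.items() if n <= max_count)
```

For example, if most captions say `blonde hair` and one says `blond hair`, `rare_tags` lists the odd one out so you can align the wording.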
Quick quality checks
- Scan thumbnails — remove obvious blurs/dupes.
- Ensure per-folder consistency (one subject/style each).
- Verify captions exist (if you use them) and aren’t contradictory.
- Normalize long side (e.g., 768) to stabilize training.
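Two of the checks above lend themselves to a quick script. This is a rough sketch under stated assumptions, not a feature of the app: the function names and the `IMAGE_EXTS` set are hypothetical, captions are assumed to be same-named `.txt` sidecars, and the duplicate check only catches byte-identical files. Near-duplicates (resized or re-encoded copies) still need a visual pass.

```python
import hashlib
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def _images(folder: str) -> list[Path]:
    """Image files directly inside folder, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )


def images_missing_captions(folder: str) -> list[str]:
    """Images with no same-named .txt sidecar, or with an empty one."""
    missing = []
    for img in _images(folder):
        sidecar = img.with_suffix(".txt")
        if not sidecar.is_file() or not sidecar.read_text(encoding="utf-8").strip():
            missing.append(img.name)
    return missing


def exact_duplicates(folder: str) -> list[list[str]]:
    """Groups of images whose bytes are identical (same SHA-256 digest)."""
    by_hash: dict[str, list[str]] = {}
    for img in _images(folder):
        digest = hashlib.sha256(img.read_bytes()).hexdigest()
        by_hash.setdefault(digest, []).append(img.name)
    return [names for names in by_hash.values() if len(names) > 1]
```

Run both per subject folder, for example on `/your-dataset/character_a`, and fix whatever they report before you start training.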
System checks (before training)
Open the Launcher → System panel to confirm VRAM headroom and disk space before you hit Start. If you are close to the limit, lower the resolution or batch size first.
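If you want a scripted check alongside the System panel, free disk space is easy to read with the Python standard library. The sketch below is an illustration only: the helper names and the threshold you pass in are hypothetical, and it does not cover VRAM, which depends on GPU-specific tooling.

```python
import shutil


def free_disk_gib(path: str = ".") -> float:
    """Free space on the drive that holds path, in GiB."""
    return shutil.disk_usage(path).free / 1024**3


def has_disk_headroom(path: str, needed_gib: float) -> bool:
    """True if at least needed_gib GiB are free at path."""
    return free_disk_gib(path) >= needed_gib
```

Point `path` at the drive where your outputs and checkpoints are written, and choose `needed_gib` from the size of the models you expect to save.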
Starter sizes (good defaults)
- Portrait LoRA: long side ~640–768
- Full-body LoRA: taller canvas, ~832×1216 class
- DreamBooth: match the family (SD 1.5 vs XL); keep sizes moderate first
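To see what a given long-side target does to your images, the arithmetic can be sketched as follows. This is a hedged example rather than the app's resizing logic: `fit_long_side` is a hypothetical helper, and snapping both edges to a multiple of 8 is an assumption (pass `multiple=64` if you prefer coarser steps). It only computes the target size; it does not resize any files.

```python
def fit_long_side(width: int, height: int, long_side: int, multiple: int = 8) -> tuple[int, int]:
    """Scale (width, height) so the longer edge matches long_side, keeping aspect ratio.

    Both edges are rounded to the nearest multiple of `multiple`
    (assumed default: 8), with a minimum of one multiple.
    """
    if min(width, height, long_side, multiple) <= 0:
        raise ValueError("all arguments must be positive")
    scale = long_side / max(width, height)

    def snap(value: float) -> int:
        # Round half up to the nearest multiple, never below one multiple.
        return max(multiple, int(value / multiple + 0.5) * multiple)

    return snap(width * scale), snap(height * scale)
```

For instance, `fit_long_side(3000, 2000, 768)` gives `(768, 512)`, and `fit_long_side(1664, 2432, 1216, multiple=64)` gives `(832, 1216)`, which matches the full-body size class above.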