Last updated: April 28, 2026
YAML in Python: A Practical Guide
Read, write, and manipulate YAML in Python with PyYAML and ruamel.yaml. Includes the safe_load vs load distinction, common errors, and when to pick each library.
Written by Mohan Raj Kolavi.
Quick answer: read YAML in Python
Install PyYAML with pip install pyyaml, then call yaml.safe_load(stream) where stream is a file object or a string. The result is a regular Python dict, list, or scalar. Always prefer safe_load over load on untrusted input.
Installing PyYAML
The Python YAML library lives on PyPI as pyyaml but imports as yaml — a naming mismatch that trips up most newcomers:
# PyYAML is the de-facto YAML library for Python
pip install pyyaml
# Verify
python -c "import yaml; print(yaml.__version__)"
If import yaml fails after install, see our No module named 'yaml' troubleshooting guide.
Reading a YAML file
import yaml
with open("config.yaml", "r") as f:
    config = yaml.safe_load(f)
print(config["database"]["host"])
The return value of yaml.safe_load is a regular Python data structure — typically a dict at the top level, but could be a list or scalar depending on the YAML.
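As a small illustration of that point, the sketch below parses three inline strings (invented for this example) and shows that the top-level type follows the shape of the document:

```python
import yaml

# The top-level type depends on the document: mapping, sequence, or scalar.
as_mapping = yaml.safe_load("host: localhost\nport: 5432")
as_sequence = yaml.safe_load("- search\n- billing")
as_scalar = yaml.safe_load("42")

print(type(as_mapping).__name__, as_mapping)
print(type(as_sequence).__name__, as_sequence)
print(type(as_scalar).__name__, as_scalar)
```

If your code expects a mapping, an isinstance check right after loading is a cheap way to fail early on the other two cases.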
Parsing a YAML string
safe_load accepts both file objects and strings:
import yaml

raw = """
name: my-app
version: 1.4.2
features:
  - search
  - billing
"""
data = yaml.safe_load(raw)
print(data["features"])  # ['search', 'billing']
safe_load vs load vs full_load
PyYAML offers multiple loaders. They differ only in which Python objects they will construct from YAML tags:
import yaml

# SAFE - constructs only basic Python types; recommended default
data = yaml.safe_load(stream)

# NOT for untrusted input - accepts more Python-specific tags than safe_load
data = yaml.full_load(stream)  # equivalent to yaml.load(stream, Loader=yaml.FullLoader)

# UNSAFE - can construct arbitrary Python objects; never use on untrusted YAML
data = yaml.load(stream, Loader=yaml.UnsafeLoader)
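Rule of thumb: always use safe_load unless you have a specific reason and the YAML source is fully trusted. Before PyYAML 5.1, yaml.load defaulted to a loader that could execute arbitrary Python via crafted YAML tags, a known CVE vector. Since PyYAML 6.0, yaml.load requires an explicit Loader argument, so the choice is always visible in the code.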
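To see the rejection in practice, here is a minimal sketch. The payload is a made-up example that uses a Python-specific tag with a harmless function (os.getcwd); safe_load should refuse to construct it and raise an error instead:

```python
import yaml

# A document that asks the loader to call a Python function.
# os.getcwd is harmless; a real attack would name something destructive.
payload = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(payload)
    rejected = False
except yaml.YAMLError as exc:
    # safe_load has no constructor registered for python/* tags.
    rejected = True
    print(f"rejected by safe_load: {exc}")
```

The same payload passed to yaml.load with Loader=yaml.UnsafeLoader would be executed, which is why that loader belongs only on input you control.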
Writing YAML from Python
import yaml
config = {
    "name": "my-app",
    "version": "1.4.2",
    "features": ["search", "analytics"],
    "database": {"host": "db.example.com", "port": 5432},
}
# Pretty-printed YAML
print(yaml.safe_dump(config, sort_keys=False, default_flow_style=False))
Pass default_flow_style=False for human-readable block style and sort_keys=False to preserve insertion order (PyYAML alphabetizes by default).
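The effect of sort_keys is easiest to see side by side. The sketch below uses a throwaway dict (the key names are arbitrary) and assumes PyYAML 5.1 or later, where the sort_keys argument is available:

```python
import yaml

settings = {"zebra": 1, "alpha": 2}

# Default behavior: keys are written in alphabetical order.
sorted_output = yaml.safe_dump(settings)

# sort_keys=False keeps the dict's insertion order.
ordered_output = yaml.safe_dump(settings, sort_keys=False)

print(sorted_output)   # alpha comes first
print(ordered_output)  # zebra comes first
```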
Multiple documents in one file
Kubernetes manifests and Compose-style files often pack multiple YAML documents in a single file separated by ---. Use safe_load_all:
import yaml
# YAML files can contain multiple documents separated by ---
documents = """
---
kind: Deployment
metadata:
  name: api
---
kind: Service
metadata:
  name: api
"""
for doc in yaml.safe_load_all(documents):
    print(doc["kind"], doc["metadata"]["name"])
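The write direction works the same way. As a sketch, yaml.safe_dump_all takes an iterable of documents and emits them as one multi-document stream. The manifests list here is illustrative, and the None filter is one possible way to skip empty documents:

```python
import yaml

manifests = [
    {"kind": "Deployment", "metadata": {"name": "api"}},
    {"kind": "Service", "metadata": {"name": "api"}},
]

# safe_dump_all writes one YAML document per item, separated by ---
text = yaml.safe_dump_all(manifests, sort_keys=False)
print(text)

# safe_load_all is lazy, so build a list to parse everything up front.
# Empty documents load as None and are skipped here.
loaded = [doc for doc in yaml.safe_load_all(text) if doc is not None]
```

Because safe_load_all returns a generator, consume it before the underlying file is closed when reading from disk.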
ruamel.yaml: round-tripping with comments
PyYAML drops comments and reorders keys on round-trip. If you need to load, edit, and save a file while preserving its formatting and comments, switch to ruamel.yaml:
# ruamel.yaml preserves comments and round-trips cleanly
pip install ruamel.yaml
# In Python:
from ruamel.yaml import YAML
yaml = YAML() # round-trip mode by default
with open("config.yaml") as f:
    data = yaml.load(f)
# Mutate in place
data["replicas"] = 5
with open("config.yaml", "w") as f:
    yaml.dump(data, f)
# Comments survive!
Handling YAML errors with line numbers
When YAML fails to parse, the exception carries a problem_mark with line and column info:
import yaml

text = "key: [unclosed"  # sample input with a syntax error
try:
    data = yaml.safe_load(text)
except yaml.YAMLError as e:
    if hasattr(e, "problem_mark"):
        mark = e.problem_mark
        print(f"YAML error at line {mark.line + 1}, column {mark.column + 1}: {e.problem}")
    else:
        print(f"YAML error: {e}")
PyYAML vs ruamel.yaml at a glance
| Feature | PyYAML | ruamel.yaml |
|---|---|---|
| Spec compliance | YAML 1.1 (mostly) | YAML 1.2 full |
| Preserve comments | No | Yes (round-trip mode) |
| Preserve key order | Optional via sort_keys=False | Default |
| Speed | Faster (with libyaml) | Slower in round-trip mode |
| Maturity | De-facto default | Newer, actively maintained |
| Best for | Read-only or write-only | Edit-in-place workflows |
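The spec-compliance row has a practical consequence worth a quick sketch: under PyYAML's YAML 1.1 rules, unquoted words such as no and on resolve to booleans. The keys below are invented for illustration:

```python
import yaml

# Unquoted no/on follow YAML 1.1 boolean rules in PyYAML.
unquoted = yaml.safe_load("country: no\nfeature: on")

# Quoting the values keeps them as strings.
quoted = yaml.safe_load('country: "no"\nfeature: "on"')

print(unquoted)
print(quoted)
```

Quoting is the simplest defense when a value such as a country code must stay a string.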
Best practices
- Always use safe_load on untrusted input. yaml.load with an unsafe loader is a remote-code-execution vector.
- Set sort_keys=False on dump. Default alphabetization reorders human-edited config and produces noisy diffs.
- Pin the version in requirements.txt. Behavior changed between PyYAML 5.x and 6.x; for example, 6.0 made the Loader argument to yaml.load mandatory.
- Use ruamel.yaml when comments matter. Don't fight PyYAML on round-trips — it will lose comments every time.
- Validate before deploy. Drop generated YAML into our YAML validator to catch issues before kubectl or docker compose runs.
Related YAML guides
- Fix: ModuleNotFoundError: No module named 'yaml'
- YAML syntax reference
- YAML anchors and aliases — important when parsing Compose-style YAML in Python.
- YAML validator — sanity-check files before Python parses them.
Frequently Asked Questions
How do I read a YAML file in Python?
Install PyYAML with 'pip install pyyaml', then open the file and call yaml.safe_load(). Example: with open('config.yaml') as f: data = yaml.safe_load(f). The result is a regular Python dict, list, or scalar.
What is the difference between yaml.load and yaml.safe_load?
yaml.safe_load only constructs basic Python types (such as dict, list, str, int, float, bool, and None) and is safe on untrusted input. yaml.load with yaml.UnsafeLoader can construct arbitrary Python objects, and yaml.full_load accepts more Python-specific tags than safe_load; use either only on trusted YAML. Default to safe_load.
How do I write a Python dict to YAML?
Use yaml.safe_dump(data). For human-readable output, pass default_flow_style=False (block style) and sort_keys=False to preserve insertion order. Example: yaml.safe_dump(data, default_flow_style=False, sort_keys=False).
What is the Python YAML package called?
The PyPI package is called 'pyyaml' but the import name is 'yaml'. Install it with 'pip install pyyaml', then use 'import yaml' in your code. The mismatch is a common source of confusion.
How do I handle multiple YAML documents in one file?
Use yaml.safe_load_all() instead of safe_load. It returns a generator that yields each document separated by '---'. This is the standard pattern for Kubernetes manifests and multi-document config files.
Should I use PyYAML or ruamel.yaml?
Use PyYAML for simple read/write cases where comments and ordering do not matter. Use ruamel.yaml when you need to round-trip files (load, edit, save) while preserving comments, key order, and original formatting.
How do I get line and column numbers from YAML errors in Python?
Catch yaml.YAMLError and check hasattr(e, 'problem_mark'). The mark has 'line' and 'column' attributes (zero-indexed). Add 1 to display them as one-indexed for human readability.
Why does yaml.safe_load return None for empty input?
An empty string or whitespace-only YAML is valid and represents the YAML null value, which Python maps to None. Always handle the None case in your code, especially when loading user-provided files that might be empty.
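One way to handle the None case is a small wrapper. The load_config name and its error message are hypothetical, not part of PyYAML; this is a sketch that assumes your application expects a mapping at the top level:

```python
import yaml


def load_config(text):
    """Hypothetical helper: parse YAML text and treat empty input as {}."""
    data = yaml.safe_load(text)
    if data is None:
        # Empty or whitespace-only input parses to None.
        return {}
    if not isinstance(data, dict):
        raise ValueError(
            f"expected a mapping at the top level, got {type(data).__name__}"
        )
    return data


print(load_config(""))
print(load_config("name: app"))
```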
How do I parse YAML strings (not files) in Python?
yaml.safe_load() accepts both file objects and strings. Just pass the string directly: data = yaml.safe_load('name: app\nversion: 1.0'). Behavior is identical to loading from a file.
Does PyYAML support YAML 1.2?
PyYAML 6.x implements YAML 1.1. The merge key '<<' is a YAML 1.1 feature and is supported, but boolean parsing follows 1.1 rules, so unquoted values such as yes, no, on, and off load as booleans. For YAML 1.2 behavior, use ruamel.yaml, which targets YAML 1.2 by default.
How do I install PyYAML in a virtual environment?
Activate the venv first ('source .venv/bin/activate' on macOS / Linux, '.venv\Scripts\activate' on Windows), then run 'pip install pyyaml'. Verify with 'python -c "import yaml"'. If installation seems successful but import fails, see our 'No module named yaml' troubleshooting guide.
Can I edit a YAML file and preserve its comments in Python?
Yes, with ruamel.yaml. PyYAML drops all comments on load. Use 'from ruamel.yaml import YAML' and the default round-trip mode preserves comments, key order, and formatting through load -> mutate -> dump.