Data Support
Orbitron’s io/pipelines crate provides a unified loader (load_scene, load_trajectory, load_frequency_data) that automatically selects the correct parser. The table summarises formats tested in production:
| Format | Extensions | Highlights |
|---|---|---|
| XYZ | .xyz | Multi-frame trajectories, streamed via canonical helpers, attaches final-frame structure plus optional positions attachment |
| Gaussian LOG/FCHK/CUBE | .log, .fchk, .cube, .gjf, .com | Optimization trajectories, vibrational modes, orbital/NBO metadata, stage summaries + boundary loaders, canonical bundles capture MO and vibrational attachments |
| NWChem OUT/NW + companions | .out, .nw, .movecs, .hess, .cube | Multi-task support (OPT/FREQ/SP), byte-ranged loaders, canonical extras summarise tasks and attach MO coefficients/trajectory shards. Sources tab also loads .movecs (Fortran-binary MO coefficients) for full nbf × nmo coverage and .hess (Cartesian Hessian) for follow-up frequency synthesis when the .out doesn’t already carry the freq block. NWChem .out extracts the basis-set definition into a GaussianBasisSet, so .out + .movecs together render molecular orbitals as 3D isosurfaces in the Surfaces tab. |
| Molpro OUT (+ XML) | .out, optional XML sidecars | Task summaries (SCF/OPT/FREQ/CASPT2/CI/MRCC/MP2), correlated-energy capture, canonical extras seeded from XML manifests |
| Molcas/OpenMolcas OUT + companions | .out, .molden | Canonical documents expose module tasks (extras.molcas.tasks), optimisation energy profiles, frequency mode counts, RASSCF/CASPT2 diagnostics. Sources tab loads .molden siblings (SCF / RASSCF / Guess / MP2 flavors) and renders MOs as 3D isosurfaces in the Surfaces tab; the MOLDEN file ships geometry, basis, and coefficients in one file so no extra companion is needed. Sibling matching auto-detects Molcas-style task suffixes (acrolein.scf.molden matches acrolein.out). |
| DIRAC | .out (+ HDF5 checkpoint) | Relativistic QC output: geometry and run/energy summary; MO coefficients load from the companion HDF5 checkpoint via Analysis → Sources for orbital isosurfaces. Content-detected (no fixed extension). |
| Quantum ESPRESSO | .out, .in, .xml, .xsf, .dos, .pdos_*, .bands, .UPF | Periodic structures from .out text output, .in input decks, structured data-file-schema.xml (eigenvalues, occupations, Fermi level), Cube-formatted .xsf densities (pp.x output_format=6) and proper XSF, DOS/PDOS/bands plot summaries under extras.qe, relax energy profiles, inline band samples |
| VASP | vasprun.xml, POSCAR, CONTCAR, OUTCAR, INCAR, KPOINTS, POTCAR, DOSCAR, EIGENVAL, PROCAR, CHGCAR/CHG/PARCHG, XDATCAR, DYNMAT, ACF.dat | Open any canonical filename (no extension required); auto-switches to Analysis → Sources showing the rest of the directory. Periodic geometry + DOS/band/forces/magmom/Bader. CHGCAR exposes total + spin-density isosurfaces. XDATCAR loads as multi-frame trajectory. POSCAR ↔ CONTCAR Compare overlay shows relaxation. Sources retired the old .zip/.tar archive-import flow. |
| CIF | .cif | Symmetry + periodic unit cells with provenance tags |
| PDB | .pdb | Biomolecular chains/residues and inferred bonds |
| SDF/MOL | .sdf, .mol | Bond orders, connection tables, canonical structure + raw-source attachments |
| Standalone NBO summaries | .nbo + optional .47 (FILE47) | Natural population tables, optional FILE47 geometry/basis metadata for NBO7, canonical documents capture population extras and raw logs |
| Volumetric CUBE files | .cube | Single grids with metadata + volumetric attachments |
| Volumetric directories | directory of .cube files | Ensures all cubes share a geometry, emits per-grid attachments and dataset metadata |
Loaders apply provenance tags, infer covalent bonds (SceneBuilder::infer_covalent_bonds), and attach task metadata. Remote loading uses the same pipeline over SSH/SFTP (see §6).
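To make the "automatically selects the correct parser" idea concrete, here is a minimal sketch of name-based dispatch. The `Format` enum and `detect_format` function are hypothetical names invented for this illustration and are not Orbitron's API; the sketch only covers a handful of formats from the table.

```rust
use std::path::Path;

/// Hypothetical format tag; Orbitron's real type names may differ.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Format {
    Xyz,
    Cif,
    Pdb,
    Cube,
    VaspPoscar,
    Unknown,
}

/// Pick a parser from the file name alone (illustrative only).
pub fn detect_format(path: &Path) -> Format {
    // VASP structure files are recognised by canonical file name, not extension.
    let name = path.file_name().and_then(|n| n.to_str()).unwrap_or("");
    if matches!(name, "POSCAR" | "CONTCAR") {
        return Format::VaspPoscar;
    }
    // Everything else falls back to a case-insensitive extension match.
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match ext.as_deref() {
        Some("xyz") => Format::Xyz,
        Some("cif") => Format::Cif,
        Some("pdb") => Format::Pdb,
        Some("cube") => Format::Cube,
        // `.out` is shared by several packages, so the name alone cannot decide.
        _ => Format::Unknown,
    }
}
```

The `Unknown` arm is where content detection would have to take over: `.out` is used by NWChem, Molpro, Molcas, DIRAC, and Quantum ESPRESSO alike, so a real loader needs to inspect the file body for those.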
Interested in adding a new format? See the checklist in the Developer Guide (§5.1 Adding a New Chemistry Format) for a turnkey parser skeleton, naming conventions, and test scaffolding.
2.1 Quantum ESPRESSO support
Orbitron’s QE ingestion handles the full file zoo a typical QE project carries:
- SCF / relax / nscf outputs (`*.out`): periodic structures + unit cells, SCF energetics, relax profiles, and task summaries (`extras.qe`).
- Input decks (`*.in`): parses `&CONTROL`, `&SYSTEM`, `ATOMIC_SPECIES`, `ATOMIC_POSITIONS`, and `CELL_PARAMETERS` sections; derives the lattice from `ibrav` (0–14, including centred orthorhombic / monoclinic / triclinic variants) + `celldm(1..6)`, or from explicit `CELL_PARAMETERS` for `ibrav=0`. Atom positions handled in alat / bohr / angstrom / crystal units. Open a `.in` directly to view the structure before running the calculation.
- Structured XML output (`<prefix>.xml` / `<prefix>.save/data-file-schema.xml`): parsed via `roxmltree`. Surfaces atomic_structure, per-k-point eigenvalues + occupations, Fermi level, total energy, convergence status, exit status, and spin / SOC flags. More reliable than scraping `.out` because the schema is version-stable.
- XSF volumetric (`*.xsf`): handles both proper XSF (CRYSTAL / MOLECULE / ATOMS) and QE’s “Cube-as-xsf” flavour (`pp.x` `output_format=6` writes Gaussian Cube content with an `.xsf` extension). Cube-formatted XSF routes through the existing cube parser so `pp.x` densities, individual orbitals, and transition densities flow through the orbital-dataset registry alongside native `.cube` files.
- DOS / PDOS / bands (`*.dos`, `*.pdos_tot`, `*.pdos_atm#*`, `*.bands`, `*.bands.gnu`): parsed into canonical-document `extras.qe` blocks with metadata for atom / WFC / orbital labels. The Sources panel surfaces each file as its own role.
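As an illustration of deriving a lattice from `ibrav` + `celldm(1)`, the sketch below builds the primitive vectors for the three cubic cases using the conventions from the Quantum ESPRESSO `pw.x` input description. The function names are made up for this example and say nothing about how Orbitron structures its own code; the remaining `ibrav` values are deliberately left out.

```rust
/// Primitive lattice vectors (one per row, in bohr) for the cubic `ibrav`
/// values. `alat` is celldm(1). Only ibrav = 1, 2, 3 are covered here.
pub fn cubic_lattice(ibrav: i32, alat: f64) -> Option<[[f64; 3]; 3]> {
    let h = alat / 2.0;
    match ibrav {
        // Simple cubic.
        1 => Some([[alat, 0.0, 0.0], [0.0, alat, 0.0], [0.0, 0.0, alat]]),
        // Face-centred cubic.
        2 => Some([[-h, 0.0, h], [0.0, h, h], [-h, h, 0.0]]),
        // Body-centred cubic.
        3 => Some([[h, h, h], [-h, h, h], [-h, -h, h]]),
        _ => None,
    }
}

/// Cell volume as the absolute scalar triple product |a1 · (a2 × a3)|.
pub fn cell_volume(m: &[[f64; 3]; 3]) -> f64 {
    let [a, b, c] = *m;
    let cross = [
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    ];
    (a[0] * cross[0] + a[1] * cross[1] + a[2] * cross[2]).abs()
}
```

The volume check is a cheap sanity test for any lattice derivation: the fcc primitive cell should come out at a³/4 and the bcc cell at a³/2.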
Current limitations:
- A dedicated QE Spectra panel that plots DOS / PDOS / band structure (mirroring the VASP equivalent) is future work — the parsers and Sources Load are in place but the plotting UI is not yet wired.
- `.UPF` pseudopotentials are surfaced informationally in Sources but not parsed (no viewer feature consumes them today).
- `atomic_proj.xml` (projwfc.x output) and the `<prefix>.save/` binary checkpoint hierarchy are not parsed.
- `ibrav` values are derived to a primitive cell; the `-12 / ±13` monoclinic centred variants are treated as primitive with a warning that the centring isn’t expanded.
2.2 NBO FILE47 support
Orbitron can enrich standalone .nbo summaries with optional FILE47 (.47) sidecars. When a .47 file is present next to the .nbo log, Orbitron uses it to reconstruct the NBO7 basis/geometry payload so orbital and population views align with the original analysis. If the .47 file is missing or malformed, Orbitron still loads the .nbo summary and continues with population-only data.
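The fallback behaviour can be pictured as follows. This is a sketch under stated assumptions: `NboPayload`, `file47_sibling`, and `resolve_payload` are hypothetical names, and the `parse_ok` argument stands in for whatever the real FILE47 parser returns.

```rust
use std::path::{Path, PathBuf};

/// What the loader ends up with (hypothetical type, for illustration).
#[derive(Debug, PartialEq)]
pub enum NboPayload {
    /// Population tables plus basis/geometry recovered from FILE47.
    WithFile47 { sidecar: PathBuf },
    /// Population tables only; the sidecar was missing or unusable.
    PopulationOnly,
}

/// Expected sidecar location: same directory and stem, `.47` extension.
pub fn file47_sibling(nbo_path: &Path) -> PathBuf {
    nbo_path.with_extension("47")
}

/// Decide which payload to build. `parse_ok` is a stand-in for the sidecar
/// parse result: `None` = file absent, `Some(false)` = malformed.
pub fn resolve_payload(nbo_path: &Path, parse_ok: Option<bool>) -> NboPayload {
    match parse_ok {
        Some(true) => NboPayload::WithFile47 {
            sidecar: file47_sibling(nbo_path),
        },
        // Missing or malformed sidecars degrade to population-only data
        // instead of failing the whole load.
        Some(false) | None => NboPayload::PopulationOnly,
    }
}
```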
2.3 Streaming defaults
Orbitron now prefers streaming parsers whenever possible. Gaussian and NWChem log readers walk the file once to build a run summary, then seek straight to the requested stage/task when you load individual geometries, trajectories, or frequency sets. CLI/TUI/GUI all call the same boundary helpers:
- `gaussian_stage_scene_by_boundary` / `gaussian_stage_trajectory_by_boundary`
- `gaussian_stage_frequency_by_boundary`
- `nwchem_task_scene_by_boundary` / `nwchem_task_trajectory_by_boundary`
If you script against the Rust API, use these helpers to avoid re-reading the full file.
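The signatures of the boundary helpers are not reproduced here, so the following is only a generic sketch of the "walk once, then seek" pattern they are described as using. `Boundary`, `index_stages`, `read_stage`, and the marker-prefix rule are all assumptions made for illustration.

```rust
use std::io::{self, BufRead, BufReader, Read, Seek, SeekFrom};

/// Byte range of one stage inside the log (hypothetical structure).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Boundary {
    pub start: u64,
    pub end: u64,
}

/// Single pass: record where each stage begins. `marker` is an assumed
/// line prefix that opens a stage; real logs need format-specific rules.
pub fn index_stages<R: Read>(reader: R, marker: &str) -> io::Result<Vec<Boundary>> {
    let mut reader = BufReader::new(reader);
    let mut starts = Vec::new();
    let mut offset = 0u64;
    let mut line = String::new();
    loop {
        line.clear();
        let n = reader.read_line(&mut line)?;
        if n == 0 {
            break; // end of file
        }
        if line.starts_with(marker) {
            starts.push(offset);
        }
        offset += n as u64;
    }
    // Each stage ends where the next begins; the last one ends at EOF.
    let mut out = Vec::with_capacity(starts.len());
    for (i, &start) in starts.iter().enumerate() {
        let end = starts.get(i + 1).copied().unwrap_or(offset);
        out.push(Boundary { start, end });
    }
    Ok(out)
}

/// Later loads seek straight to the stage instead of re-reading the file.
pub fn read_stage<R: Read + Seek>(src: &mut R, b: Boundary) -> io::Result<String> {
    src.seek(SeekFrom::Start(b.start))?;
    let mut buf = vec![0u8; (b.end - b.start) as usize];
    src.read_exact(&mut buf)?;
    String::from_utf8(buf).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

The index is small (two integers per stage), so it can be cached alongside the run summary and reused for every subsequent geometry, trajectory, or frequency request.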
Note: The `orbitron_services::streaming` trait is still a roadmap item; current services load whole trajectories before playback. Background loaders keep the UI responsive, but true frame-by-frame streaming is pending future work.
2.4 Parser Reliability Improvements (2026)
Orbitron’s parsing infrastructure was reworked in early 2026 around a shared set of parsing utilities, so that every supported format handles the same edge cases in the same way:
Key improvements:
- Consistent error handling: All parsers now use shared utilities (`parsing_utils.rs`) that handle edge cases uniformly: scientific notation (including Fortran D-format like `1.23D+05`), malformed data, whitespace variations, and boundary conditions are handled consistently across Gaussian, NWChem, Molpro, Molcas, VASP, QE, and structural formats.
- Battle-tested reliability: Parsing utilities are verified by 116 unit tests covering common parsing patterns and edge cases. This ensures that format detection, tokenization, float parsing, unit conversions, and coordinate extraction work reliably even with unusual or malformed input files.
- Better diagnostics: When parsing fails, error messages are more actionable and consistent. Failed float parsing, missing delimiters, and malformed coordinates return structured errors instead of panicking.
- Zero-copy performance: Many operations now use string slices (`&str`) instead of allocations, improving performance on large files (multi-GB logs, long trajectories, extensive VASP outputs) without sacrificing safety or error handling.
What this means for users:
- Format detection is more accurate and handles files with unusual formatting or missing headers
- Parsing is more forgiving of whitespace variations and non-standard numeric formats
- Error messages clearly indicate what went wrong during parsing (e.g., “expected float after ‘=’ delimiter on line 42” instead of generic parse failures)
- Large files load faster due to reduced memory allocations during tokenization
These improvements affect all file formats supported by Orbitron, making the entire I/O pipeline more robust and maintainable. For technical details about the parsing utilities and their adoption, see the Developer Guide (§6.4 Parsing Utilities Reference).
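For readers unfamiliar with Fortran D-format, the sketch below shows one way such a token can be parsed while returning a structured error rather than panicking. It is not taken from `parsing_utils.rs`; `FloatError` and `parse_fortran_float` are hypothetical names. Tokens without a `D` exponent are parsed from the borrowed slice with no allocation.

```rust
use std::borrow::Cow;

/// Structured parse failure (hypothetical; not Orbitron's actual error type).
#[derive(Debug, PartialEq)]
pub struct FloatError {
    pub token: String,
    pub line: usize,
}

/// Parse a float that may use a Fortran `D`/`d` exponent, e.g. `1.23D+05`.
pub fn parse_fortran_float(token: &str, line: usize) -> Result<f64, FloatError> {
    let trimmed = token.trim();
    // Allocate only when a D exponent actually needs rewriting to E.
    let normalised: Cow<str> = if trimmed.contains(['D', 'd']) {
        Cow::Owned(trimmed.replace(['D', 'd'], "E"))
    } else {
        Cow::Borrowed(trimmed)
    };
    normalised.parse::<f64>().map_err(|_| FloatError {
        token: trimmed.to_string(),
        line,
    })
}
```

Carrying the offending token and line number in the error is what makes messages of the "expected float ... on line 42" kind possible further up the call stack.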
2.5 Per-atom population coverage
The 2026 redesign of the Analysis panel (§3.6) made per-atom charges first-class: each format in the table below surfaces them through the same UI (the “Charges” tab + the Atom Coloring halo overlay). Coverage by format:
| Format | Mulliken | Löwdin | Natural (NBO) | APT |
|---|---|---|---|---|
| NWChem (.out) | ✓ (modern + property-module headers) | ✓ | — | — |
| Gaussian (.log) | ✓ | — | ✓ (when pop=full / NBO) | ✓ (frequency jobs) |
| Gaussian (.fchk) | ✓ | — | — | — |
| Molpro (.out) | ✓ | — | — | — |
| Molcas / OpenMolcas (.out) | ✓ (column-oriented blocks) | — | — | — |
For optimization trajectories (NWChem, Gaussian), Orbitron backfills earlier frames with the converged populations so the Charges tab and halos stay visible whichever step you are parked on. Without this, navigating to step 0 of an opt job used to drop the charges entirely.
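A minimal sketch of that backfill step, assuming a hypothetical `Frame` type in which frames without a printed population table carry `None`:

```rust
/// One optimisation step; `charges` is `None` when the log printed no
/// population table for that step. Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Frame {
    pub step: usize,
    pub charges: Option<Vec<f64>>,
}

/// Copy the last available (converged) charges into every frame that has none.
pub fn backfill_charges(frames: &mut [Frame]) {
    // Search from the end so the converged table wins over earlier ones.
    let converged = frames.iter().rev().find_map(|f| f.charges.clone());
    if let Some(charges) = converged {
        for frame in frames.iter_mut().filter(|f| f.charges.is_none()) {
            frame.charges = Some(charges.clone());
        }
    }
}
```

Frames that already carry their own charges are left untouched, and a trajectory with no population data at all stays unchanged.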
For property-module runs (e.g. NWChem task scf property with the mulliken keyword) the parser returns the last Mulliken table emitted in the file — the converged value, not the preliminary atomic-guess Mulliken that some packages print early in the run.
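The "last table wins" rule can be sketched as below. The block layout assumed here (a header line followed by `index symbol charge` rows until a blank or non-numeric line) is a deliberate simplification and does not reproduce NWChem's actual output; `last_charge_table` is a hypothetical name.

```rust
/// Return the charges from the last non-empty block introduced by `header`.
/// Assumed layout: header line, then `index symbol charge` rows.
pub fn last_charge_table(text: &str, header: &str) -> Option<Vec<f64>> {
    let mut tables: Vec<Vec<f64>> = Vec::new();
    let mut in_block = false;
    for line in text.lines() {
        if line.contains(header) {
            tables.push(Vec::new()); // every header opens a fresh table
            in_block = true;
            continue;
        }
        if !in_block {
            continue;
        }
        // The charge is assumed to be the third whitespace-separated column.
        let charge = line
            .split_whitespace()
            .nth(2)
            .and_then(|t| t.parse::<f64>().ok());
        match (charge, tables.last_mut()) {
            (Some(q), Some(table)) => table.push(q),
            _ => in_block = false, // blank or non-numeric row closes the block
        }
    }
    // Later tables supersede earlier (preliminary) ones.
    tables.into_iter().rev().find(|t| !t.is_empty())
}
```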