4. CALYPSO Parameters

Note

This documentation is up-to-date with v10.1.x!

Inputs and outputs.

4.1. CALYPSO Inputs —— toml

The main input file, named input.toml, contains all the parameters needed for structure prediction. This file consists of input tags that can be given in any order, or omitted, in which case the default values are used. Below is a quick overview of the tag syntax:

  1. the general syntax is consistent with TOML; more information about this file format can be found here.

  2. the labels are case-insensitive.

  3. all text following the “#” character is taken as comment.

  4. logical values can be given as t (or true), or f (or false).

  5. null is allowed.

Below are brief descriptions of the input parameters.

4.1.1. Common parameters in CALYPSO block

4.1.1.1. Systemname

SystemName = string

A descriptive string for the target system (max. 40 characters).

Default: CALYPSO

4.1.1.2. Seed

seed = integer

A positive integer sets the random seed for reproducibility; a negative value leaves the seed unset.

Default: -1

4.1.1.3. IType

IType = int or string

Control the type of structures to be generated.

IType int    IType string    Module
1            CRYSTAL         Crystal structure prediction
2            CLUSTER         Cluster structure prediction
3            MOLECULAR       Molecular crystal structure prediction
4            LAYER           Layer (including film) structure prediction
5            SURFACE         Surface or adsorption structure prediction

One can use int or string to specify the type of structure prediction. But if string is used, it must be uppercase.

Default: 1

4.1.1.4. ICode

ICode = integer or string

Defines which code is used for local structure optimization during the structure prediction.

1:

VASP

3:

GULP

4:

PWSCF

9:

LAMMPS

15:

MLP

16:

MlpVasp # prerelax with MLP and then VASP

Default: 1

4.1.1.5. IAlgo

IAlgo = integer or string

Defines which algorithm (PSO or ABC) is adopted in the simulation.

1:

global PSO algorithm

2:

local PSO algorithm

3:

ABC algorithm with symmetry

Default: 2

4.1.1.6. IDisp

 IDisp = integer or string
1:

ORCH

The built-in task dispatcher provided by CALYPSO. Support for third-party libraries will be implemented.

Default: 1

4.1.1.7. IFit

IFit = integer or string

Defines the fitness used to guide the evolution of the population.

1:

ENTHALPY

2:

HARDNESS

3:

GIBBS

Default: 1

4.1.1.8. IRunner

IRunner = int

Defines how CALYPSO is run.

1:

automatically run

2:

manually run each step (split mode)

Default: 1

4.1.1.9. ISim

ISim = int or string

Defines the structure descriptor, which is used to determine whether two structures are similar.

0:

NAN

1:

BCM

2:

CCF

BCM is faster than CCF, so we suggest using BCM in most cases. If a similarity warning appears while generating structures, decrease the value of SimThreshold or turn off the similarity comparison by setting ISim = 0.

Default: 1

4.1.1.10. BlockMode

BlockMode = bool

Defines how the evolution is performed.

true:

evolution will be performed after each generation is done.

false:

evolution will be performed as soon as each structure's local optimization is done.

Warning

Currently only BlockMode = true is supported.

Default: true

4.1.1.11. PickUp

PickUp = bool

Whether to pick up (resume) a calculation. CALYPSO now supports pickup at any stage; just turn this option on.

Pickup can resume not only an aborted CALYPSO task but also a finished one with a new MaxStep, which lets you keep the existing evolution information and continue the run.

true:

pick up the old calculation.

false:

start a new calculation.

Default: false

4.1.2. Parameters for evolution in CALYPSO.EVO block

4.1.2.1. NBest

NBest = int

Defines how many parts the potential energy surface (PES) is divided into; PSO moves toward the closest one to generate the next structure.

In global PSO, NBest is equal to 1.

Default: 4

4.1.2.2. PsoRatio

PsoRatio = float

Defines what percentage of the structures per generation should be produced by PSO.

The rest of structures will then be randomly generated with symmetry constraints.

Default: 0.6

4.1.2.3. SabcRatio

SabcRatio = list of float

Defines the fractions of scouts, employees, and onlookers, where:

  • scouts choose a different space group

  • onlookers choose a different combination of Wyckoff positions

  • employees choose different atomic coordinates for the Wyckoff positions

Make sure the three values sum to 1.0.

Default: [0.3, 0.2, 0.5]
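
The sum rule above can be checked before a run is launched. The following is a minimal sketch, not CALYPSO code: the function name is hypothetical, and the mapping of list positions to scouts, employees, and onlookers is assumed from the order in which the description names them.

```python
import math


def check_sabc_ratio(ratios, tol=1e-9):
    """Validate a SabcRatio-style list and label its three entries.

    Assumption: the list order is (scouts, employees, onlookers),
    as suggested by the wording of the parameter description.
    """
    if len(ratios) != 3:
        raise ValueError("SabcRatio needs exactly three values")
    if any(r < 0 for r in ratios):
        raise ValueError("SabcRatio values must be non-negative")
    if not math.isclose(sum(ratios), 1.0, abs_tol=tol):
        raise ValueError(f"SabcRatio must sum to 1.0, got {sum(ratios)}")
    scouts, employees, onlookers = ratios
    return {"scouts": scouts, "employees": employees, "onlookers": onlookers}


if __name__ == "__main__":
    print(check_sabc_ratio([0.3, 0.2, 0.5]))
```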

4.1.2.4. PopSize

PopSize = integer

The population size, i.e., the total number of structures per generation.

Normally, a larger population size is needed for a larger system. A very large population size should be used for simulations with automatic variation of chemical compositions.

Default: 10

4.1.2.5. MaxStep

MaxStep = integer

The maximum number of generations to be executed for the entire structure prediction simulation.

Typically, a larger number of generations is needed for a larger system.

Default: 2

4.1.2.6. Temperature

Temperature = 300

The temperature value when considering Gibbs free energy (IFit = 3). The algorithm can be found here.

The unit is Kelvin.

Default: 300

4.1.3. Parameters for generator in CALYPSO.GENERATOR block

4.1.3.1. basic parameters for each type of crystal structure prediction

4.1.3.2. FormulaUnit

FormulaUnit = list of string

For example, setting FormulaUnit = ['(LiH4)1-2(NH3)3-4'] means that we want to predict LiH4-NH3 structures in which the number of LiH4 units ranges from 1 to 2 and the number of NH3 units ranges from 3 to 4.

In Crystal Structure prediction, the length of FormulaUnit is 1. But for layer structure prediction, the length of FormulaUnit is equal to the number of layers.

There is no default value. You must define it.

4.1.3.3. MaxNumAtom

MaxNumAtom = integer

The maximal number of atoms allowed in the simulation cell.

Default: 100

4.1.3.4. VolumeUnit

VolumeUnit = dict of string and int

Custom volume for each unit. Setting a value to 0 or leaving it empty means the volume is calculated from the covalent radius r as 1.3*(4/3)πr^3 (only available for a single element).

For example, VolumeUnit = {Li=10, H=10, N=10} means that the volumes of the Li, H, and N atoms are each equal to 10.

Warning

In TOML, dict keys that are strings do not need to be quoted.

Default: {} <=> (1.3*(4/3)π(covalent radii)^3)
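
The default volume formula quoted above, 1.3*(4/3)πr^3, can be evaluated directly. This is a minimal sketch: the function name is hypothetical, and the radius passed in is a placeholder rather than a tabulated covalent radius.

```python
import math


def default_volume(covalent_radius):
    """Return 1.3 * (4/3) * pi * r^3 for a given covalent radius r.

    The result is in the cube of whatever length unit r is given in.
    """
    return 1.3 * (4.0 / 3.0) * math.pi * covalent_radius ** 3


if __name__ == "__main__":
    # r = 1.0 is only a placeholder value used to show the scaling.
    print(default_volume(1.0))
```

Because the volume scales with r^3, doubling the radius increases the default volume by a factor of eight.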

4.1.3.5. DistanceOfIon

DistanceOfIon = list or dict

Minimum interatomic distances (in angstroms), given either as an (n+1)x(n+1) matrix or as a dict.

For example, DistanceOfIon = [["X", "Li", "H", "N"], ["Li", 1.0, 1.0, 1.0], ["H", 1.0, 1.0, 1.0], ["N", 1.0, 1.0, 1.0]] is equivalent to DistanceOfIon = {Li = 0.5, N = 0.5, H = 0.5}.

Default: {} <=> covalent radii
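
The equivalence in the example above suggests that the dict form holds one value per element and that each pair distance is the sum of the two per-element values (0.5 + 0.5 = 1.0). The sketch below expands a dict into the matrix form under that assumption; it is an inference from the example, not confirmed CALYPSO behaviour, and the function name is hypothetical.

```python
def radii_to_distance_matrix(radii):
    """Expand per-element values into an (n+1)x(n+1) distance matrix.

    Assumption: the minimum distance for a pair (A, B) is
    radii[A] + radii[B], as implied by the documented example.
    """
    elements = list(radii)
    header = ["X"] + elements
    rows = [[a] + [radii[a] + radii[b] for b in elements] for a in elements]
    return [header] + rows


if __name__ == "__main__":
    for row in radii_to_distance_matrix({"Li": 0.5, "H": 0.5, "N": 0.5}):
        print(row)
```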

4.1.3.6. SpaceGroup

SpaceGroup = list of int and string

Defines the range of space groups to be considered.

The rules for specifying space groups are:

  1. one single integer means a single space group number

  2. “int1-int2” means space group number ranging from int1 to int2

  3. “int1:int2:int3” means space group numbers ranging from int1 to int2 with step size int3, over the half-open interval [int1, int2)

Note

The allowed range differs depending on the structure generation method.

  • crystal (IType = 1): SpaceGroup ranging from 1 to 230

  • cluster (IType = 2): SpaceGroup ranging from 1 to 31

  • molecular crystal (IType = 3): SpaceGroup ranging from 1 to 230

  • layer (IType = 4): SpaceGroup ranging from 1 to 17 for multi-layer, ranging from 1-230 for single layer.

Default: [1, “2-210”, “211:231:1”]
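
The three forms of a SpaceGroup entry can be expanded into an explicit list of numbers. The sketch below is illustrative only (the function name is hypothetical). It assumes that the “int1-int2” form includes both end points, which is consistent with the default value covering space groups 1 to 230, while the “int1:int2:int3” form excludes int2 as stated above.

```python
def expand_space_groups(spec):
    """Expand a SpaceGroup-style list into explicit space group numbers."""
    groups = []
    for item in spec:
        if isinstance(item, int):
            # Rule 1: a single integer is a single space group.
            groups.append(item)
        elif ":" in item:
            # Rule 3: "start:stop:step" over the half-open range [start, stop).
            start, stop, step = (int(part) for part in item.split(":"))
            groups.extend(range(start, stop, step))
        elif "-" in item:
            # Rule 2: "start-stop", assumed to include both end points.
            start, stop = (int(part) for part in item.split("-"))
            groups.extend(range(start, stop + 1))
        else:
            groups.append(int(item))
    return groups


if __name__ == "__main__":
    default = [1, "2-210", "211:231:1"]
    expanded = expand_space_groups(default)
    print(expanded[0], expanded[-1], len(expanded))
```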

4.1.3.7. PrototypePath

PrototypePath = list of string

The paths that contain the prototype structures (file names ending with .vasp).

For example, PrototypePath = ["path/to/vasp/poscar"]. At the very beginning, the code parses the structures under the provided path and saves them into ~/.cache/calypso/prototype in files named {number of atoms}.csv, so that all structures with the same number of atoms are saved in the same file.

There is no default value. You must supply this variable if you want to use it.

4.1.3.8. PrototypeRatio

PrototypeRatio = float

The fraction of the randomly generated structures that are generated from prototypes.

Default: 0.0

4.1.3.9. bulk detail parameters

4.1.3.10. LengthMaxRatio

LengthMaxRatio = float

The max ratio of the length of a, b, c.

Default: 5.0

4.1.3.11. LengthMinRatio

LengthMinRatio = float

The min ratio of the length of a, b, c.

Default: 1.0

4.1.3.12. Extra Parameters for layer structure prediction

4.1.3.13. Thicknesses

Thicknesses = list of float

The thicknesses of thin films (in unit of angstrom).

The length of Thicknesses is equal to the length of FormulaUnit.

There is no default value. You must supply this variable if IType = 4.

4.1.3.14. Area

Area = float

The area (in unit of angstrom^2) per formula unit.

If you cannot provide a good estimation on the area, please use the default value. The program will automatically generate an estimated area by using the ionic radii of given atoms.

There is no default value. You must supply this variable if IType = 4.

4.1.3.15. Gaps

Gaps = list of float

The gap between two layers, i.e., the interlayer distance (in angstroms). The length of Gaps should be equal to the length of FormulaUnit, and the last value of Gaps is always the vacancy value.

For example, with FormulaUnit = ["MoS2", "CrI3"], the gaps can be set as Gaps = [2, 10], which means the distance between two “MoS2” layers is 2 angstroms, and the vacancy is 10 angstroms.

There is no default value. You must supply this variable if IType = 4.

4.1.3.16. Extra Parameters for cluster structure prediction

4.1.3.17. Vacancy

Vacancy = list of float

The isolated cluster is placed into an orthorhombic box where the periodic boundary condition is applied.

This variable defines the separations (in unit of angstrom) between the studied cluster and its nearest-neighboring periodic images. It should be large enough to ensure that interactions between the studied cluster and its nearest-neighboring images are negligible.

For cluster structure prediction, we do not recommend using VASP for the structure optimization of large systems, since VASP calculations are computationally very expensive.

Default: [10.0, 10.0, 10.0]

4.1.3.18. cluster_type

ClusterType = string 
normal:

the core-shell type cluster

cage:

the cage cluster

plane:

the plane cluster

Default: normal

4.1.3.19. Extra Parameters for molecule structure prediction

4.1.3.20. MoleculesPath

MoleculesPath = dict of string

The paths of the molecule files. The molecule names in FormulaUnit are resolved using the keys of this dict.

For example, if we have FormulaUnit = ["{Water}4"], then MoleculesPath = {'Water'='./H2O.xyz'}, so that Water will be parsed as H2O.

Default: {}

4.1.3.21. Extra Parameters for surface structure prediction

4.1.3.22. AdsorptionStyle

AdsorptionStyle = integer

Determines which method should be adopted for generation of adsorption structures in the simulation cell.

AdsorptionStyle int    AdsorptionStyle string    Method
1                      UAS                       Unfixed adsorption sites. Random generation of structures.
2                      FAS                       Fixed adsorption sites. Structures are generated with fixed adatom positions.

Default: 1

4.1.3.23. AdsorptionSymmetry

AdsorptionSymmetry = bool

A Boolean parameter that governs the activation of the symmetry search in structure generation.

If True, the 2D space group is randomly chosen from the crystal system associated with the lattice of the provided substrate.

Default: True

4.1.3.24. SubstratePath

SubstratePath = string

The file path for the substrate file.

The selective dynamics feature is also supported, allowing control over which coordinates of the substrate atoms are allowed to change during the ionic relaxation.

Warning

Only the VASP format is accepted, with .vasp suffix.

There is no default value. You must supply the substrate file and its path if you want to use the surface module.

4.1.3.25. Supercell

Supercell = list of list of integer

This 2x2 matrix is used to define the substrate, whose lattice vectors are obtained by multiplying this matrix by the ideal lattice vectors.

Default: [[1, 0], [0, 1]]
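
As an illustration of the matrix product described for Supercell, the sketch below multiplies a 2x2 integer matrix by two in-plane lattice vectors. The row convention (each row of the input is one lattice vector) and the function name are assumptions made for this example; the sample vectors are the in-plane part of the lattice shown in the POINT.vasp example of this section.

```python
def substrate_lattice(supercell, ideal_vectors):
    """Return supercell x ideal_vectors, treating each row as a lattice vector."""
    n_components = len(ideal_vectors[0])
    return [
        [
            sum(supercell[i][k] * ideal_vectors[k][j] for k in range(2))
            for j in range(n_components)
        ]
        for i in range(2)
    ]


if __name__ == "__main__":
    ideal = [[2.4593999386, 0.0], [-1.2296999693, 2.1299028249]]
    # Doubling the first lattice vector while keeping the second one.
    print(substrate_lattice([[2, 0], [0, 1]], ideal))
```

With the default [[1, 0], [0, 1]] the ideal lattice vectors are returned unchanged.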

4.1.3.26. RangeOfZAxis

RangeOfZAxis = list of float

Defines the range of distances between the 2D layer and the adatoms. The two values of this parameter specify the maximal and minimal distance, respectively.

Default: [1.7, 1.2]

4.1.3.27. PointPath

PointPath = string

Path for the fixed adsorption sites file.

Site coordinates are based on the substrate’s lattice and supercell parameter.

The Z-coordinate of points are excluded; only the XY coordinates are recorded.

There is no default value. You must supply the point file and its path if you are using AdsorptionStyle = 2.

Two file formats are supported:

  • VASP-like format

  • JSON format

Both allow for coordinates to be specified in either Direct or Cartesian format.

Warning

The lattice parameters must be identical to those of the substrate.

VASP-like format (POINT.vasp)

points
1.0
        2.4593999386         0.0000000000         0.0000000000
       -1.2296999693         2.1299028249         0.0000000000
        0.0000000000         0.0000000000        15.0000000000
  H O OH
  1 1 1
Direct
     0.666666687         0.333333343         0.500000000
     0.333333313         0.666666627         0.500000000
     0.666666687         0.333333343         0.500000000

  • Only a scaling factor of 1.0 is supported.

  • Species names can be atomic species, functional groups, or molecules.

  • The selective dynamics format of POSCAR is not supported.

JSON format (POINT.json)

{
    "coordinate system": "Direct",
        "point":
        {
            "H": [[0.666666687, 0.333333343, 0.500000000]],
            "O": [[0.333333313, 0.666666627, 0.500000000]],
            "OH": [[0.666666687, 0.333333343, 0.500000000]]
        }
}

  • Do not provide the substrate lattice in this format.
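
A quick way to sanity-check a POINT.json file before a run is to load it and confirm the fields shown above. The sketch below is not part of CALYPSO; the function name is hypothetical. It only relies on the keys in the documented example and on the statement that Direct and Cartesian coordinates are accepted, and it keeps just the XY part of each site, in line with the note that Z-coordinates are not recorded.

```python
import json

# Example content copied from the POINT.json listing in this section.
POINT_JSON = """
{
    "coordinate system": "Direct",
    "point": {
        "H": [[0.666666687, 0.333333343, 0.500000000]],
        "O": [[0.333333313, 0.666666627, 0.500000000]],
        "OH": [[0.666666687, 0.333333343, 0.500000000]]
    }
}
"""


def load_points(text):
    """Parse POINT.json-style text and return (coordinate system, XY sites)."""
    data = json.loads(text)
    system = data["coordinate system"]
    if system not in ("Direct", "Cartesian"):
        raise ValueError(f"unexpected coordinate system: {system!r}")
    sites = {}
    for species, coords in data["point"].items():
        for xyz in coords:
            if len(xyz) != 3:
                raise ValueError(f"{species}: each site needs three numbers")
        # Keep only X and Y; the Z value is dropped.
        sites[species] = [(x, y) for x, y, _z in coords]
    return system, sites


if __name__ == "__main__":
    print(load_points(POINT_JSON))
```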

4.1.3.28. MoleculesPath

MoleculesPath = dict of string

The paths of the molecule files. The molecule names in FormulaUnit are resolved using the keys of this dict.

For example, if we have FormulaUnit = ["(H)2{OH}2{H2O}4"], then MoleculesPath = {'OH' = './OH.xyz', 'H2O' = './H2O.xyz'}, so that the H2O molecule is read from H2O.xyz and the OH functional group is read from OH.xyz.

Warning

Only the XYZ format is accepted, with .xyz suffix.

There is no default value. You must supply the molecule files and their paths when the system contains molecules or functional groups within the FormulaUnit.

4.1.3.29. FixSitesC

FixSitesC = bool

FixSitesC    Method
True         The Z-coordinate of the adatom is fixed to the mean value of RangeOfZAxis.
False        The Z-coordinate of the adatom is randomly sampled from the RangeOfZAxis range.

Default: False
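
The two FixSitesC modes can be pictured with a small sketch. It is illustrative only: the function name is hypothetical, and the use of a uniform distribution for the random case is an assumption, since the sampling distribution is not specified here. The [maximal, minimal] order follows the description of RangeOfZAxis.

```python
import random


def adatom_z(range_of_z_axis, fix_sites_c, rng=random):
    """Pick a Z-coordinate for an adatom from a RangeOfZAxis-style pair."""
    z_max, z_min = range_of_z_axis  # documented order: maximal, then minimal
    if fix_sites_c:
        # FixSitesC = True: use the mean value of the range.
        return (z_max + z_min) / 2.0
    # FixSitesC = False: sample within the range (uniform sampling assumed).
    return rng.uniform(z_min, z_max)


if __name__ == "__main__":
    print(adatom_z([1.7, 1.2], True))
    print(adatom_z([1.7, 1.2], False, random.Random(0)))
```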

4.1.3.30. TranslateAtomPosition

Whether to apply a random translation to the symmetric sites when generating the adsorption sites.

Has no effect if AdsorptionSymmetry = False.

TranslateAtomPosition = bool

Default: True

4.1.3.31. RotateAtomPosition

Whether to rotate the adatom sites about the Z-axis by a random angle and the molecule/functional group about its first atom by a random angle.

Has no effect if AdsorptionSymmetry = False.

RotateAtomPosition = bool

Default: True

4.1.4. Parameters for optimization in CALYPSO.OPT block

4.1.4.1. DFTInputPath

DFTInputPath = string

The path that contains the input files for the DFT code.

If an MLP with a model file is used, the model file should also be saved here.

Default: “./”

4.1.4.2. JobFlow

JobFlow = list of string

Defines the sequence of calculations to be conducted. The number of input files should be equal to the length of JobFlow.

Default: [“opt”, “opt”, “opt”]

4.1.4.3. PpMap

PpMap = dict of string

Defines the paths of the pseudopotential files and their mapping to elements. Only works for VASP for now.

For example, PpMap = {Li = "POTCAR_Li", Mg = "mmm"}

There is no default value. One must set it manually.

4.1.4.4. ShareFiles

ShareFiles = list of string

The absolute paths of model files or other files that need to be copied into the actual calculation directory.

For example, if VASP is used as the calculator with a vdW functional, which requires the vdw_kernel.bindat file, the path of the kernel file can be put in ShareFiles to make sure the kernel is available in each structure optimization.

Another example is that one can put the path of model here if using mlp as calculator.

Default: []

4.1.4.5. CustomizableScript

CustomizableScript: str

The CustomizableScript value must be an absolute path that points to a script.

While this lets you run your own code, the output must conform to the format and file names of the ICode you select. For instance, an ICode of 1 (VASP) requires VASP-compatible (name and format) output. Documentation with more details will be provided soon.

Default: None

4.1.4.6. RunCustomizableScriptCMD

RunCustomizableScriptCMD: str

Default: None

Provide this only when CustomizableScript is not None.

CAUTION: RunCustomizableScriptCMD must invoke the script given in CustomizableScript by its file name, not by the full path. For instance, if CustomizableScript is /path/to/customizable_run_flow.sh, RunCustomizableScriptCMD can be bash customizable_run_flow.sh $CALYPSO_MLP_CMD. It cannot be bash /path/to/customizable_run_flow.sh $CALYPSO_MLP_CMD, because during the calculation CALYPSO may move the script to another location or even to a different machine, so its path is not stable while its name is.
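
The name-versus-path rule can be made concrete with a small helper that derives the command string from the absolute script path. This is only a sketch: the function and its parameters are hypothetical, and the bash interpreter and the $CALYPSO_MLP_CMD argument are taken from the example in the text.

```python
import posixpath


def build_run_cmd(script_path, interpreter="bash", extra_args="$CALYPSO_MLP_CMD"):
    """Build a RunCustomizableScriptCMD-style string from a script path.

    Only the file name of the script is used in the command, because
    the directory part of the path may change during the calculation.
    """
    if not posixpath.isabs(script_path):
        raise ValueError("CustomizableScript must be an absolute path")
    script_name = posixpath.basename(script_path)
    return f"{interpreter} {script_name} {extra_args}"


if __name__ == "__main__":
    print(build_run_cmd("/path/to/customizable_run_flow.sh"))
```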

4.1.4.7. Extra Parameters for MLP calculator

4.1.4.8. MLPType

MLPType = "dp"
dp:

deep potential

deepmd:

deep potential

dpa:

deep potential

dpa2:

deep potential

m3gnet:

chgnet:

mace_mp:

mace_off:

gulp:

emt:

lj:

morse:

Choose which type of mlp will be used.

There is no default value. One must set it manually.

4.1.4.9. MLPParams

MLPParams = {"model"="M3GNet-MP-2021.2.8-PES"}

The parameters for MLP initialization. For example:

  • chgnet: {"model"="0.3.0", "check_cuda_mem"=true, "on_isolated_atoms"="warn"}

  • dp: {"model"="path/to/model"}

Default: {}

4.1.4.10. OptAlgo

OptAlgo = string

The algorithm of optimization.

LBFGS:

FIRE:

BFGS:

Default: “LBFGS”

4.1.4.11. OptStep

OptStep = int

The maximum number of optimization steps.

Default: 1000

4.1.4.12. TrajFile

TrajFile = string

The filename of optimization trajectory.

Default: traj.traj

4.1.4.13. Pstress

Pstress = float

The pressure (in GPa) applied when performing MLP structure optimization.

Default: 0.0

4.1.4.14. Fmax

Fmax = float

The convergence condition. The optimization stops when the force on every atom is smaller than Fmax.

Default: 0.1
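
The stopping rule can be written as a short check. The sketch below assumes that "the force on an atom" means the Euclidean norm of its force vector, which is a common convention but is not stated explicitly here; the function name is hypothetical.

```python
import math


def is_converged(forces, fmax=0.1):
    """Return True if the force norm on every atom is below fmax.

    forces is a list of (fx, fy, fz) tuples, one per atom.
    """
    return all(
        math.sqrt(fx * fx + fy * fy + fz * fz) < fmax
        for fx, fy, fz in forces
    )


if __name__ == "__main__":
    print(is_converged([(0.01, 0.02, 0.0), (0.0, 0.05, 0.03)]))
    print(is_converged([(0.01, 0.02, 0.0), (0.2, 0.0, 0.0)]))
```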

4.1.4.15. MLPKeepSym

MLPKeepSym = bool

Whether to keep the symmetry when using an MLP to perform the optimization.

Default: false

4.1.4.16. UcfMask

None or a list of booleans, indicating which of the six independent components of the strain are relaxed.

See ase.filters.UnitCellFilter mask parameter.

UcfMask = list of bool

Default: None

4.1.5. Parameters for dispatcher in CALYPSO.DISPATCHER block

4.1.5.1. MachineList

MachineList = list of string

This parameter defines the available computational resources. For example, if you are using a cluster with two usable queues, you can set up at most two machine.json files to perform structure optimizations in parallel.

MachineList = ["./machine-1.json", "./machine-2.json"]

There is no default value for MachineList. One must set it manually.

4.1.5.2. TimeInterval

TimeInterval = int

How often the dispatcher will check the status of the jobs.

Default: 10

4.1.5.3. TmpPath

TmpPath = string

The path to save the log file of Orchestrator (dispatcher).

Default: “BackStage”

4.1.6. Parameters for descriptor in CALYPSO.DESCRIPTOR block

4.1.6.1. SimThreshold

SimThreshold = float

Defines the similarity threshold between two structures. If the distance between two structures is less than the threshold, they are considered to be the same structure.

Default: 0.01
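
To show how the blocks described in this section might fit together, the sketch below renders a small input.toml from a Python dict. The table names ([CALYPSO], [CALYPSO.EVO], and so on) are assumed from the block names in the section headings, the values are placeholders (including the POTCAR file names), and the helper functions are hypothetical. The output has not been validated against CALYPSO itself.

```python
def toml_value(value):
    """Render a Python value as a TOML literal."""
    if isinstance(value, bool):  # checked first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return '"' + value + '"'
    if isinstance(value, list):
        return "[" + ", ".join(toml_value(v) for v in value) + "]"
    if isinstance(value, dict):  # rendered as a TOML inline table
        items = ", ".join(f"{k} = {toml_value(v)}" for k, v in value.items())
        return "{" + items + "}"
    raise TypeError(f"unsupported value: {value!r}")


def render_input(blocks):
    """Render {table name: {key: value}} as TOML text."""
    lines = []
    for table, params in blocks.items():
        lines.append(f"[{table}]")
        for key, value in params.items():
            lines.append(f"{key} = {toml_value(value)}")
        lines.append("")
    return "\n".join(lines)


# Placeholder values; table names are assumed from the section headings.
EXAMPLE_BLOCKS = {
    "CALYPSO": {"SystemName": "LiH4-NH3", "IType": 1, "ICode": 1, "PickUp": False},
    "CALYPSO.EVO": {"PopSize": 10, "MaxStep": 2, "PsoRatio": 0.6},
    "CALYPSO.GENERATOR": {
        "FormulaUnit": ["(LiH4)1-2(NH3)3-4"],
        "MaxNumAtom": 100,
        "SpaceGroup": [1, "2-210", "211:231:1"],
    },
    "CALYPSO.OPT": {
        "DFTInputPath": "./",
        "JobFlow": ["opt", "opt", "opt"],
        "PpMap": {"Li": "POTCAR_Li", "H": "POTCAR_H", "N": "POTCAR_N"},
    },
    "CALYPSO.DISPATCHER": {"MachineList": ["./machine-1.json"]},
}

if __name__ == "__main__":
    print(render_input(EXAMPLE_BLOCKS))
```

Parameters that are left out, such as IAlgo or SimThreshold, would fall back to the defaults listed above.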

4.2. CALYPSO Outputs

All major output files are located in the “results” folder:

File Name              Description
Analysis_Output.csv    The results file for the predicted structures.
database.db            Contains the intermediate parameters of CALYPSO.
descriptor.pkl         Contains the descriptor information for each structure.
ini.json               Contains the initial structure information.
opt.json               Contains the optimized structure information and the corresponding energy, forces, and so on.
opt_task               Folder in which all structure optimizations are saved.

4.3. Analysis of Results

CALYPSO calculations often generate a large number of structures. Therefore, it is essential to have a versatile tool for efficient data analysis.

Introducing the CALYPSO ANALYSIS KIT (CAK), a tool designed for automatic structure analysis.

Once CALYPSO is installed, the pycak tool will be available for use from the command line.

> cd path-to-calculation/results
> pycak --help
usage: pycak [-h] [-d DIR] [--refene REFENE_FILE] [-m TOL [TOL ...]] [-a] [--reduce-sim] [--energy-threshold ENERGY_THRESHOLD] [--split-by-formula | --no-split-by-formula]
             [--out-root OUT_ROOT] [--pcell] [--ucell] [--vasp] [--synth] [--synth-model-dir SYNTHESISABILITY_MODEL_DIR] [--spap] [--spap-symprec SPAP_SYMPREC]
             [--spap-threshold SPAP_THRESHOLD] [--spap-r-cutoff SPAP_R_CUTOFF] [--spap-ilat {0,1,2}] [--spap-no-compare] [--spap-no-db] [--spap-cif] [--spap-poscar] [--skip-analyze]

CALYPSO Analysis Toolkits
-------------------------
Use `analyze` (default) to analyze CALYPSO results, or/and `spap` to run SPAP symmetry/similarity.

Examples:
    - pycak -m 0.1 0.01 -a --ucell --vasp
    - pycak -m 0.1 --ucell --vasp --split-by-formula --out-root by-formula
    - pycak -m 0.1 --ucell --vasp --refene ../ref_ene.txt
    - pycak --spap
    - pycak -m 0.01 --reduce-sim -a --spap --spap-poscar
    - pycak -m 0.1 0.01 0.3 --reduce-sim --spap --spap-poscar
    - pycak -m 0.1 0.01 0.3 --reduce-sim --vasp --split-by-formula --spap --spap-poscar --skip-analyze

Optional: analysis synthesisability
    This requires pytorch etc. being installed. See detailed instruction in
    <https://iccms-calypso.github.io/CALYPSO-Python/posts/_installation.html#installation>

Then download and decompress the model archive into the default cache directory:
    MODEL_ARCHIVE_URL=https://github.com/ICCMS-CALYPSO/open-resources/releases/download/CALYPSO-v10.0.0-alpha.1/synth-ckpt-v1.0.0.tar.gz
    PROJECT_CACHEDIR=${HOME}/.cache/calypso
    curl -L $MODEL_ARCHIVE_URL | tar -C $PROJECT_CACHEDIR -zxf -

options:
  -h, --help            show this help message and exit
  -d DIR, --results-dir DIR
                        path to the results directory (default: .)
  --refene REFENE_FILE  reference energy (enthalpy) for energy above hull (default: ../ref_ene.txt)
  -m TOL [TOL ...], --multi-tolerance TOL [TOL ...]
                        tolerances for analysising symmetry;
                        multiple values are acceptable; some useful
                        values: 1.0, 0.5, 0.1, 0.01, 0.001; (default: 0.1)
  -a, --all             analysis all structures; by default only the
                        50 lowest energy structures are considered
  --reduce-sim          reduce similarity using energy threshold
  --energy-threshold ENERGY_THRESHOLD
                        energy threshold (eV) of reducing similarity; below which
                        two structures are considered duplicates (default: 1e-3)
  --split-by-formula, --no-split-by-formula
                        also emit per-formula outputs in <out-root>/<formula>/ (default: False)
  --out-root OUT_ROOT   root directory to hold per-formula outputs (default: by_formula)

output format:
  --pcell               write primcell cell
  --ucell               write unit cell; If neither pcell nor ucell are specified,
                        ucell is switched on
  --vasp                write structure in vasp format

analysis synthesisability:
  --synth               whether to analyse synthesisability with machine learning model
  --synth-model-dir SYNTHESISABILITY_MODEL_DIR
                        directory to model parameters for synthesisability model
                        (default: /home/wangzy/.cache/calypso/synth-ckpt-v1.0.0)

SPAP (post-analysis):
  --spap                after analysis, run SPAP under each caly_structs.<tol>/spap_run/
  --spap-symprec SPAP_SYMPREC
                        this precision is used to analyze symmetry of atomic structures (default: 0.1)
  --spap-threshold SPAP_THRESHOLD
                        threshold for similar/dissimilar boundary (default: None)
  --spap-r-cutoff SPAP_R_CUTOFF
                        inter-atomic distances within this cut off radius will contribute to CCF (default: None)
  --spap-ilat {0,1,2}   this parameter controls which method will be used to deal with lattice for comparing structural similarity
                        0 do not change lattice;
                        1 equal particle number density;
                        2 try equal particle number density and equal lattice
  --spap-no-compare     not to compare similarity of structures  (default: False)
  --spap-no-db          not to write structures into ase (https://wiki.fysik.dtu.dk/ase/) database file (default: False)
  --spap-cif            write structures into cif files (default: False)
  --spap-poscar         write structures into files in VASP format (default: False)
  --skip-analyze        only run SPAP (assumes caly_structs.<tol>/ already exist) (default: False)

> pycak

An output file named “Analysis_Output.csv” will be generated. The ref_ene.txt file should also be taken into account, because CALYPSO uses it to calculate the energy/enthalpy above the hull.

The ref_ene.txt file should have the following format:

> cat ../ref_ene.txt 
formula enthalpy_per_atom label
La 13.06301 element_La
H  -0.55342 element_H
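
The three-column layout shown above is easy to read programmatically, for instance to check a reference file before running pycak. The sketch below is not part of pycak; the function name and the returned structure are hypothetical, and it assumes whitespace-separated columns exactly as in the example.

```python
# Sample content copied from the ref_ene.txt listing above.
REF_ENE_TEXT = """formula enthalpy_per_atom label
La 13.06301 element_La
H  -0.55342 element_H
"""


def parse_ref_ene(text):
    """Parse ref_ene.txt-style text into {formula: {enthalpy_per_atom, label}}."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, entries = rows[0], rows[1:]
    if header != ["formula", "enthalpy_per_atom", "label"]:
        raise ValueError(f"unexpected header: {header}")
    return {
        formula: {"enthalpy_per_atom": float(enthalpy), "label": label}
        for formula, enthalpy, label in entries
    }


if __name__ == "__main__":
    for formula, info in parse_ref_ene(REF_ENE_TEXT).items():
        print(formula, info)
```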

The results generated by pycak will appear as follows:

> head -n 5 Analysis_Output.csv
 idx   caly_name      formula       enth_per_atom   enth_above_hull       fitness       volume_per_atom   density   min_dis  spg(0.1)  spgnum(0.1) natom(0.1)
  0    caly_4403        H2La            3.322            0.000             3.322             6.206        12.568     1.622   P6/mmm        191         3     
  1    caly_4463        H4La            1.569            0.000             1.569             4.529        10.481     1.488   I4/mmm        139         10  
  2    caly_6283       H10La            0.330            0.000             0.330             2.992         7.516     1.131   Fm-3m         225         44    
  3    caly_2934        H4La            1.571            0.002             1.571             4.506        10.535     1.455   I4/mmm        139         10 

However, when the reference energy is set to 0, the enth_per_atom and enth_above_hull columns in Analysis_Output.csv will have identical values.

Duplicated structures can be eliminated using the --reduce-sim option. Structures can be saved in VASP format using the --vasp option and unit cells can be written with --ucell. Structures with different compositions will be separated into different directories by using the --split-by-formula option. Additionally, the structure prototype can be analyzed using the --spap option. After running this command, the structure prototype will be saved in different subdirectories. When using --spap, the options --vasp and --split-by-formula will also be activated.

The directory structure will look like this:

> pycak -m 0.1 0.01 0.3 --reduce-sim --vasp --split-by-formula --spap --spap-poscar
> pwd
/calypso/results

> ls
Analysis_Output.csv  by_formula/  caly_structs.0.01/  caly_structs.0.1/  caly_structs.0.3/  ini.json  opt.json

> ls by_formula 
H10La/  H2La/  H4La/  H6La/

> ls by_formula/H10La/
Analysis_Output.csv  caly_structs.0.01/  caly_structs.0.1/  caly_structs.0.3/

You can also skip the normal analysis process and directly analyze the structure prototype:

> pycak -m 0.1 0.01 0.3 --reduce-sim --vasp --split-by-formula --spap --spap-poscar --skip-analyze

For more information about the SPAP package, please refer to the official documentation here.

4.4. Orchestrator —— CALYPSO task dispatcher

To make CALYPSO more flexible, we developed a task dispatcher that lets users submit CALYPSO jobs in more ways.

Orchestrator mainly depends on an input file: machine.json, which defines how to reach the computational resources, and how to run calculation in these resources.

Here are the parameters of machine.json:

4.4.1. common parameters

4.4.1.1. name

name = string

Name of this computational resource, useful when you have multiple computational resources.

Default: “Machine”