fireflow aims to support the FCS standards as written as best as possible.
However, there are a few instances where this library deviates from the
standards in order to achieve practicality.
The following is a general overview of how fireflow adheres to the standards
and where it deviates. Users are encouraged to read the standards themselves as
this documentation assumes a general level of familiarity. References to the
standards will be given as sectional citations (ie §X.Y.Z, referring to headers)
when necessary and applicable.
By default, fireflow will only read fully-compliant files as outlined
here. fireflow has many ways to deal with "non-compliant" errors commonly
found in FCS files which are covered elsewhere; this is meant to provide an
overview of how fireflow implements the standards themselves.
fireflow will only write fully-compliant files which follow the standards
and any exceptions outlined here. Any deviation from this is a bug.
The first 58 bytes of HEADER are supported as-is. OTHER segments are also supported.
FCS 3.2 is the only version that specifies that the OTHER segment offsets must
be 8 bytes long (§3.2). By default, fireflow will use 8 bytes for the width of
each OTHER offset when reading and writing (both are configurable).
All standard keywords (those that start with $) for each FCS version are
generally supported in fireflow. For many keywords this means that that key
and value will be read exactly as specified in each standard. There are some
important exceptions outlined below.
All offset keywords ($NEXTDATA, $BEGINDATA, $ENDDATA, $BEGINSTEXT,
$ENDSTEXT, $BEGINANALYSIS, $ENDANALYSIS) are restricted to be 20 digits
long. This reflects fireflow's internal limitation of 64-bit offsets, which
should be more than enough for all use cases.
Furthermore, fireflow will write new FCS files with each of these left-padded
with zeros up to 20 characters. This makes the length of TEXT predictable,
which in turn allows computing the value of each offset.
FCS 2.0 and 3.0 only allow ASCII (vs UTF-8) characters in TEXT. For 3.0, this can be overridden by $UNICODE which will permit UTF-8 on a keyword-basis.
fireflow is written in a (decenly modern) language that supports UTF-8 out of
the box. As such there is little reason to penalize 2.0/3.0 files for having
UTF-8 but non-ASCII characters. Standard keys (in all versions) are ASCII-only,
so this restriction in theory only ever applies to keyword values and to
non-standard keys. Furthermore, any downstream application which consumes the
string values produced by fireflow will also likely have UTF-8 support.
Finally, any standardized values (ie strings parsed to an integer) will fail if
the characters cannot be interpreted (UTF-8 or not).
Therefore, there is almost nothing to gain by enforcing this.
For all versions, fireflow will ensure the measurement name in this keyword
matches a $PnN.
For FCS 3.2, fireflow will ensure all measurement names in this keyword
match a $PnN.
fireflow will check that the "start time" occurs before the "end time".
For the keywords $BTIM, $ETIM and $DATE, this requires all three to be present since it is possible for a run to start on one day and end at an earlier wall clock time on a subsequent day.
For $BEGINDATETIME and $ENDDATETIME this comparison is performed directly if both are present.
No version explicitly mandates this check, but it makes sense to do.
For $DFCnTOm and $COMP fireflow will only accept matrices that are 2x2 or
larger despite not being written into the standard. This restriction is
specified for $SPILLOVER starting in FCS 3.1.
Furthermore, fireflow will ensure that each matrix "matches" the measurements
specified in TEXT. For $DFCnTOM and $COMP this means that the length/width
of the matrix should match $PAR and each measurement name in $SPILLOVER must
match a $PnN.
fireflow will parse $CSMODE, $CSVBITS, $CSTOT, and $CSVnFLAG as-is.
No other processing will be done with these despite FCS 3.0 and FCS 3.1 (see
§3.4 in both) outlining their use in parsing the ANALYSIS segment.
As of FCS 3.2, this keyword is "recommended" to be multiples of 8 (§3.3.38) and anything otherwise is deprecated.
fireflow outright forbids anything that is not a multiple of 8 for any version
for numeric data. Specifically, $PnB must be a multiple of 8 between 8 and
64 (inclusive) for numeric data, and an integer between 1 and 20 for ASCII data
(inclusive).
In reality, non-multiples of 8 are rarely seen (if ever) and implementing
support for these would make the internal logic of fireflow much more complex.
Furthermore, the $BYTEORD keyword in FCS 2.0 and 3.0 effectively forbids $PnB to be anything other than a multiple of 8 since its length is measured in bytes rather than bits. Even when $BYTEORD was changed to merely describe endianness in FCS 3.1, it is still not clear how one would interpret a binary string that is anything other than a multiple of 8 bits long since "endianness" refers to the order of bytes rather than bits.
In all FCS versions, this keywords is supposed to be an unsigned integer.
However, it makes practical sense to allow this number to match the type of the
measurement it describes; if $DATATYPE is a F then $PnR should be a 32-bit
float.
This also applies to $GnR.
fireflow implements this by parsing each $PnR as a decimal (ie an integer
with precision describing the decimal point, not a float to avoid rounding
errors). For cases where $PnR is needed to process data, it will be converted
to the appropriate type (and the user will be alerted if this fails). As of FCS
version 3.2, the only case where this is done is for unsigned integer
measurements ($DATATYPE or $PnDATATYPE are I) where $PnR is used to
make the bitmask that will be applied to data when read/written.
As of FCS 3.2, $PnG shall not be used when $PnE is set to log scaling (§3.3.46).
fireflow slightly loosens this restriction to allow $PnG to be 1.0 in the
case of log scaling. Since $PnG is simply a multiplier, setting it to 1.0
equates to a no-op, and thus no harm should come from this while also allowing
many machines that "erroneously" set $PnG to 1.0.
In all other cases, the standard is enforced as written, including the
restriction of log scaling to integer data only. Note that ASCII values are not
counted as integer values here despite being processed as 64-bit integers in
fireflow.
For linear displays, the lower bound should be less than the upper bound. For
logarithmic displays, both decades and offsets should be positive floats. The
standards do not enforce this (although these restrictions are implied);
however, fireflow will enforce these.
These keywords are specified to be written as integers in some/all FCS versions.
However, fireflow will interpret these as 32-bit floats. This generally should
not be a problem unless users wish to use gigantic integers that will lose
precision if stored in a 32-bit float (which is probably inappropriate for these
keywords anyways).
Additionally, $PnL and $PnV may only use positive floats and $PnO may only use non-negative floats.
Specifically, $PnL is supposed to be an integer in all versions (although FCS 3.1 allowed this to include multiple wavelengths), PnO was specified as an integer through FCS 3.1, and $PnV was specified to be an integer through FCS 3.0. Note that FCS 2.0 does not list the type of these but integer is implied via the examples given.
Everything that applies to $PnV above also applies to $GnV where applicable.
fireflow restricts these to be one of Area, Width or Height. It is not
clear if FCS 3.2 (§3.3.45) allows other other values to be used; however, these
make sense when specifying measurements that are optical in nature which have a
signal vs time curve.
fireflow allows any string to be used for this keyword. It is not clear
if only the values given in FCS 3.2 (§3.3.55) are allowed. However, it can be
assumed that many other "types" of measurements not in this list might be used,
especially as new machines are developed.
fireflow makes a clear distinction between "temporal" and "optical"
measurements. "Temporal" measurements describe the channel often labeled "Time".
"Optical" describe everything else and includes actual optical measurements with
lasers, filters, and detectors but could include other types which as of FCS 3.2
are not well-described.
As of FCS 3.2, the restrictions applied to temporal measurements are:
- shall only have linear scaling (§3.3.43)
- shall not have $PnG set (§3.3.46)
- $PnN shall be set to "Time" (§3.3.48)
- if provided, $TIMESTEP should also be present (§3.3.64)
fireflow enforces (1) and (2) according to the logic outlined
previously. (3) is enforced by default but in a
case-insensitive manner as outlined in (§3.3.64). (4) is enforced for all
versions except FCS 2.0 where $TIMESTEP did not exist.
Additionally, fireflow adds the following restrictions to temporal
measurements:
- there can only be zero or one temporal measurement per dataset
- the temporal measurement cannot be referenced by $SPILLOVER, $TR, or $UNSTAINEDCENTERS as each of these pertain to optical data
- PnTYPE must be
Timefor FCS 3.2 - $TIMESTEP must be greater than zero
- Only $PnN, $PnS, $PnD, $PnTYPE, $PKn, and PKNn are allowed since all other keywords describe some aspect of an optical measurement
For (1), some vendors store data in 32-bit but will "hack" a 64-bit time
measurement by spreading the 64-bit values across two 32-bit measurements.
fireflow can read these without issue, but only one will be considered the
temporal measurement (assuming all other restrictions are met as described
above).
The internal encoding of the DATA segment is controlled by $DATATYPE, $BYTEORD, $PnB, $PnR (integers and ASCII only), $PnDATATYPE (FCS 3.2 only), and $TOT (delimited ASCII only). Only certain combinations of these keywords are allowed and their implications are clarified here.
All possible data layouts in fireflow (in terms of their keywords) are:
| Name | $DATATYPE | $BYTEORD | $PnB | $PnR | $PnDATATYPE | Versions |
|---|---|---|---|---|---|---|
| Ascii (delimited) | A |
Note 1 | * |
x<2^64 | n/a | all |
| Ascii (fixed) | A |
Note 1 | 0<x<21 | x<2^64 | n/a | all |
| Int8 (mixed) | I |
1 |
8 |
x<2^8 | n/a | 2.0/3.0 |
| Int16 (mixed) | I |
{1..2} | 16 |
x<2^16 | n/a | 2.0/3.0 |
| Int24 (mixed) | I |
{1..3} | 24 |
x<2^24 | n/a | 2.0/3.0 |
| Int32 (mixed) | I |
{1..4} | 32 |
x<2^32 | n/a | 2.0/3.0 |
| Int40 (mixed) | I |
{1..5} | 40 |
x<2^40 | n/a | 2.0/3.0 |
| Int48 (mixed) | I |
{1..6} | 48 |
x<2^48 | n/a | 2.0/3.0 |
| Int56 (mixed) | I |
{1..7} | 56 |
x<2^56 | n/a | 2.0/3.0 |
| Int64 (mixed) | I |
{1..8} | 64 |
x<2^64 | n/a | 2.0/3.0 |
| F32 (mixed) | F |
{1..4} | 32 |
f32 | n/a | 2.0/3.0 |
| F64 (mixed) | D |
{1..8} | 32 |
f64 | n/a | 2.0/3.0 |
| F32 | F |
endian | 32 |
f32 | n/a | 3.1/3.2 |
| F64 | D |
endian | 64 |
f64 | n/a | 3.1/3.2 |
| Int (any width) | I |
endian | octet | Note 2 | n/a | 3.1/3.2 |
| Mixed-Type | A/I/F/D |
endian | Note 3 | Note 3 | I/F/D |
3.2 |
Legend:
X: literal value- {X..Y}: comma-separated integers between X and Y (inclusive) in any order
- endian:
1,2,3,4or4,3,2,1 - octet: any multiple of 8 between 8 and 64 (inclusive)
- f32: any f32 value except +/- infinity and NaN
- f64: any f64 value except +/- infinity and NaN
Notes:
- For ASCII, $BYTEORD does not matter.
fireflowwill still read it to ensure validity; beyond this its value has no impact on the layout. - These must be less than 2^
$PnB. - Must match whatever is specified by $DATATYPE and $PnDATATYPE. $PnR
must be within [1, 20] and can never be
*if $DATATYPE isAand $PnDATATYPE is not set (the only way to specify an ASCII column in this layout). If the measurement is an integer, $PnR must be less than 2^$PnB. If measurement is a float, $PnR must be a valid representation of a float (either 32 or 64 bit depending on datatype).
For all but delimited ASCII, the $TOT keyword is overspecified since the
number of events can be computed using the length of DATA and the sum of all
$PnB. fireflow will check if $TOT matches this precomputed event number
and alert the user for a discrepancy.
For delimited ASCII, $TOT is necessary since there is no fixed event width.
For FCS 2.0, $TOT is optional, and if not given fireflow will simply read
until the end of DATA.
One of the implications of changing $BYTEORD to only mean "big or little endian" in FCS 3.1 was that it became possible to make $PnB different across measurements; this is because previously $PnB was implied in $BYTEORD.
As a consequence, integer layouts can express any width from 8 to 64 bits (in
multiples of 8) starting in FCS 3.1. It wasn't clear if this was intended, but
fireflow nonetheless supports this layout.
In practice, many vendors still seem to use a single width for integer data in FCS 3.1+ despite the added flexibility.
fireflow treats ASCII values as unsigned 64-bit integers. It isn't clear from
FCS 3.2 §3.3.14 if signed integers are allowed).
fireflow will read ANALYSIS and OTHER segments as an unprocessed
bytestream. Nothing else will be done. It is up to the user to process these
as needed.
FCS 3.0 and 3.1 provide guidance for parsing ANALYSIS in §3.4 but this is not
implemented in fireflow. Furthermore, the $CS* keywords are available from
fireflow's TEXT API which may be used in this process as required.
Each segment as specified in HEADER and/or TEXT should not overlap with any other segment. Additionally, each offset must be contained within the file.
Together, these checks will flag many common offset issues (off-by-one) as well as any truncated files that were not fully written.
When writing, fireflow will arrange segments in the following order:
- HEADER
- TEXT
- OTHER (if they exist, in the order listed in HEADER)
- supplemental TEXT
- DATA
- ANALYSIS
TEXT must come directly after HEADER to make it obvious when to stop parsing OTHER offsets. OTHER comes after TEXT to maximize likelihood these will fit within the first 99,999,999 bytes. Everything else can go in any order since they can be addressed using "large" offsets.
- cyclic redundancy check (CRC)