Aird is a new format for mass spectrometry data storage. It is an open-source,
computation-oriented format with controllable precision, flexible indexing strategies, and a high
compression rate for m/z, intensity, and ion mobility data. Aird provides a novel compressor called
ComboComp for m/z data compression, which achieves a high compression rate: compared with Zlib,
m/z data stored in Aird is about 65% smaller on average. Aird is also computation-friendly: through
SIMD optimization, its decoding speed is much higher than that of Zlib.
Aird SDK is a developer toolkit available in Java, C#, and Python. It lets developers read
the spectrum data in an Aird file quickly. With its high read performance and excellent
compression rate, developers can build many applications on top of Aird for data visualization
and analysis.
Aird Index File Suffix: .json
Aird Data File Suffix: .aird
The Aird index file and the Aird data file should be stored in the same directory with the same file name.
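The pairing rule above can be sketched in a few lines of Java. This is only an illustration of the naming convention, not part of the SDK: the class `AirdPaths` and the method `dataFileFor` are hypothetical names, and the sketch assumes the data file differs from the index file only by its suffix.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not an AirdSDK class.
final class AirdPaths {

    /** Derives the expected .aird data file that sits next to a .json index file. */
    static Path dataFileFor(Path indexFile) {
        String name = indexFile.getFileName().toString();
        if (!name.endsWith(".json")) {
            throw new IllegalArgumentException("Not an Aird index file: " + indexFile);
        }
        // Same directory, same base name, different suffix.
        String baseName = name.substring(0, name.length() - ".json".length());
        return indexFile.resolveSibling(baseName + ".aird");
    }

    public static void main(String[] args) {
        Path index = Paths.get("data", "sample.json");
        System.out.println(index + " -> " + dataFileFor(index));
    }
}
```

A check like this can be run before opening a parser to report a missing data file early.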
Use the AirdPro client to convert vendor files into the Aird format.
You can download AirdPro from GitHub:
https://github.com/CSi-Studio/AirdPro/releases/
After downloading, unzip the file and run AirdPro.exe to start the application. AirdPro is
written in C# and is also an open-source project. It provides a simple UI for converting
vendor files to Aird files quickly.

Supported acquisition types:
- DIA/SWATH
- DDA
- PRM
- DIA_PASEF
- DDA_PASEF
Demo code: see SampleCode.java in the project or the "How to use" chapter.
- Lu, M., An, S., Wang, R. et al. Aird: a computation-oriented mass spectrometry data format enables a higher compression ratio and less decoding time. BMC Bioinformatics 23, 35 (2022).
- Wang, J. et al. StackZDPD: a novel encoding scheme for mass spectrometry data optimized for speed and compression ratio. Scientific Reports 12, 5384 (2022).

Java (Maven):

<dependency>
    <groupId>net.csibio.aird</groupId>
    <artifactId>aird-sdk</artifactId>
    <version>2.5.1.1</version>
</dependency>

C# (NuGet): search "AirdSDK" in the NuGet Package Manager.

Python (pip): pip install AirdSDK

AirdInfo fields:

| Name | Type | Required | Description |
|---|---|---|---|
| version | String | True | Aird format version |
| versionCode | Integer | True | Aird format version code |
| engine | Integer | True | Compression engine type (0: Row Compression, 1: Column Compression) |
| compressors | List | True | The compression strategies for m/z, intensity and mobility array |
| instruments | List | True | General information about the MS instrument |
| dataProcessings | List | False | Description of any manipulation (from the first conversion to Aird format until the creation of the current Aird instance document) applied to the data |
| softwares | List | False | Software used to convert the data. If the data has been processed (e.g. profile > centroid) by any additional programs, these should be added too |
| parentFiles | List | False | Path to all the ancestor files (up to the native acquisition file) used to generate the current Aird document |
| rangeList | List | False | The precursor m/z window ranges which have been adjusted with experiment overlap. This field is targeted for DIA and PRM type format |
| indexList | List | True | The index for mass spectrometry data |
| indexStartPtr | Long | False | Start position of compressed binary index data (version code >=7) |
| indexEndPtr | Long | False | End position of compressed binary index data (version code >=7) |
| chromatogramIndex | ChromatogramIndex | False | Chromatogram information for MRM acquisition mode |
| type | String | True | Aird Type. Supported types: DIA, DDA, PRM, DIA_PASEF, DDA_PASEF, MRM, MSI_MALDI, COMMON |
| fileSize | Long | True | The file size of the Aird file and JSON file |
| totalCount | Long | True | Total spectrum count |
| airdPath | String | False | The .aird file path |
| activator | String | False | Activation method: CID, HCD, ETD, ECD |
| energy | Float | False | Collision energy |
| msType | String | True | Mass spectrum type: PROFILE, CENTROIDED |
| rtUnit | String | True | Retention time unit, always seconds |
| polarity | String | True | Polarity type: POSITIVE, NEGATIVE, NEUTRAL |
| filterString | String | False | Filter string for spectrum selection |
| ignoreZeroIntensityPoint | Boolean | True | Whether to ignore points whose intensity is 0 |
| mobiInfo | MobiInfo | False | Ion mobility information |
| msiInfo | MsiInfo | False | MSI (Mass Spectrometry Imaging) information |
| creator | String | False | The file creator; this field can be set in AirdPro |
| createDate | String | False | The creation date of the Aird file |
| features | String | False | Other features stored in "key:value;key:value" format |
| startTimeStamp | String | False | Experiment start timestamp |
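
The `features` field packs extra attributes into a single "key:value;key:value" string. A minimal sketch of how such a string could be split into a map follows. `FeatureString` and `parse` are hypothetical names rather than SDK API, and the sketch assumes that keys never contain ':' and that neither keys nor values contain ';'.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not an AirdSDK class.
final class FeatureString {

    /** Splits "key:value;key:value" into an insertion-ordered map. */
    static Map<String, String> parse(String features) {
        Map<String, String> result = new LinkedHashMap<>();
        if (features == null || features.isEmpty()) {
            return result;
        }
        for (String pair : features.split(";")) {
            int separator = pair.indexOf(':');
            if (separator <= 0) {
                continue; // skip empty or malformed entries
            }
            String key = pair.substring(0, separator).trim();
            String value = pair.substring(separator + 1).trim();
            result.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        // The keys below are made up for illustration.
        System.out.println(parse("operator:alice;batch:7"));
    }
}
```

Only the first ':' in each pair is treated as the separator, so values that themselves contain ':' (timestamps, for example) are kept whole.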

Fields of each entry in `compressors`:

| Name | Type | Required | Description |
|---|---|---|---|
| target | String | True | Compression target: mz, intensity, mobility, rt |
| methods | List | True | Compression methods in order, e.g. ["VB","Zstd"] |
| precision | Integer | True | Precision multiplier: 1000=3dp, 10000=4dp, etc. |
| digit | Integer | False | Used by the StackZDPD algorithm; 2^digit = number of layers (Python SDK only) |
| byteOrder | String | False | Byte order: LITTLE_ENDIAN (default), BIG_ENDIAN |
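
To make the `precision` and `methods` fields more concrete, here is a small self-contained sketch. It turns m/z values into integers with a precision multiplier (10000 keeps 4 decimal places), then applies delta encoding followed by a generic variable-byte encoding. This is not ComboComp and not the SDK's own "VB" implementation; `MzCodecSketch` and its methods are hypothetical and only illustrate why sorted, fixed-point m/z values compress well.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: a generic delta + variable-byte scheme, not the Aird codec.
final class MzCodecSketch {

    /** Applies the precision multiplier, e.g. 10000 keeps 4 decimal places. */
    static int[] toFixedPoint(double[] mzs, int precision) {
        int[] fixed = new int[mzs.length];
        for (int i = 0; i < mzs.length; i++) {
            fixed[i] = (int) Math.round(mzs[i] * precision);
        }
        return fixed;
    }

    /** Reverses the precision multiplier. */
    static double[] fromFixedPoint(int[] fixed, int precision) {
        double[] mzs = new double[fixed.length];
        for (int i = 0; i < fixed.length; i++) {
            mzs[i] = (double) fixed[i] / precision;
        }
        return mzs;
    }

    /** Stores each value as the difference to its predecessor, 7 bits per byte. */
    static byte[] encode(int[] ascending) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int value : ascending) {
            int delta = value - previous;
            previous = value;
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80); // high bit set: more bytes follow
                delta >>>= 7;
            }
            out.write(delta);
        }
        return out.toByteArray();
    }

    /** Reads back `count` values written by encode. */
    static int[] decode(byte[] bytes, int count) {
        int[] values = new int[count];
        int position = 0;
        int previous = 0;
        for (int i = 0; i < count; i++) {
            int delta = 0;
            int shift = 0;
            byte current;
            do {
                current = bytes[position++];
                delta |= (current & 0x7F) << shift;
                shift += 7;
            } while ((current & 0x80) != 0);
            previous += delta;
            values[i] = previous;
        }
        return values;
    }
}
```

Because m/z values within a spectrum are ascending, the deltas are small and most of them fit into one or two bytes. A general-purpose compressor such as Zstd, listed second in the `methods` example, would then run on the resulting byte stream.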

Fields of each entry in `rangeList`:

| Name | Type | Required | Description |
|---|---|---|---|
| start | Double | True | Precursor m/z start |
| end | Double | True | Precursor m/z end |
| mz | Double | True | Precursor m/z |
| charge | Integer | False | Precursor charge, 0 when empty |
| features | String | False | Other features stored in "key:value;key:value" format |
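
As an illustration of how the `start` and `end` fields of a precursor window might be used, the sketch below finds the window that covers a given precursor m/z. The `Window` class is a hypothetical stand-in for one `rangeList` entry, not an SDK type, and treating both bounds as inclusive is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, not AirdSDK code.
final class WindowLookupSketch {

    /** Stand-in for one rangeList entry (start, end, mz). */
    static final class Window {
        final double start;
        final double end;
        final double mz;

        Window(double start, double end, double mz) {
            this.start = start;
            this.end = end;
            this.mz = mz;
        }
    }

    /** Returns the first window whose [start, end] range contains the precursor m/z. */
    static Optional<Window> findWindow(List<Window> windows, double precursorMz) {
        return windows.stream()
                .filter(w -> precursorMz >= w.start && precursorMz <= w.end)
                .findFirst();
    }

    public static void main(String[] args) {
        List<Window> windows = Arrays.asList(
                new Window(400.0, 425.0, 412.5),
                new Window(425.0, 450.0, 437.5));
        findWindow(windows, 430.2)
                .ifPresent(w -> System.out.println("window centre: " + w.mz));
    }
}
```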

Fields of each entry in `indexList` (BlockIndex):

| Name | Type | Required | Description |
|---|---|---|---|
| level | Integer | True | 1: MS1, 2: MS2 |
| startPtr | Long | True | Start position of the block |
| endPtr | Long | True | End position of the block |
| num | Integer | False | The scan number in the vendor file. If a block holds a list of MS2 spectra, this field is the related MS1's scan number |
| rangeList | List | False | The precursor m/z window ranges which have been adjusted with experiment overlap. This field is targeted for DIA and PRM type format |
| nums | List | False | Scan numbers in the block |
| rts | List | True | Retention times of all spectra in the block |
| tics | List | False | Total intensity of each spectrum in the block |
| injectionTimes | List | False | Injection time of each spectrum in the block (C# and Java SDK only) |
| basePeakIntensities | List | True | Base peak intensity of each spectrum in the block |
| basePeakMzs | List | True | Base peak m/z of each spectrum in the block |
| filterStrings | List | False | Filter string of each spectrum in the block |
| activators | List | False | Activator of each spectrum in the block |
| energies | List | False | Energy of each spectrum in the block |
| polarities | List | False | Polarity of each spectrum in the block |
| msTypes | List | False | msType of each spectrum in the block |
| tags | List | False | Used in StackZDPD: the original layer of every m/z point (Python SDK only) |
| mzs | List | True | Byte size of the m/z data of each spectrum |
| ints | List | True | Byte size of the intensity data of each spectrum |
| mobilities | List | False | Byte size of the ion mobility data of each spectrum |
| cvList | List<List> | False | PSI Controlled Vocabulary (Python SDK only) |
| features | String | False | Other features stored in "key:value;key:value" format |
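
The `startPtr`, `mzs` and `ints` fields together are enough to locate one spectrum inside a block without decoding its neighbours. The sketch below shows the arithmetic under a stated assumption: within a block, each spectrum's m/z bytes are followed directly by its intensity bytes, with no mobility data. That layout is an assumption made for illustration, and `BlockLayoutSketch` is a hypothetical name, so check the SDK source before relying on it.

```java
// Hypothetical sketch, not AirdSDK code. Assumed layout inside one block:
// [mz bytes of spectrum 0][intensity bytes of spectrum 0][mz bytes of spectrum 1]...
final class BlockLayoutSketch {

    /**
     * Returns {mzStart, mzLength, intensityStart, intensityLength} for the
     * spectrum at the given position in the block, as absolute file offsets.
     */
    static long[] locate(long startPtr, int[] mzSizes, int[] intSizes, int position) {
        long offset = startPtr;
        for (int i = 0; i < position; i++) {
            // Widen to long before adding to avoid int overflow on large blocks.
            offset += (long) mzSizes[i] + intSizes[i];
        }
        long mzStart = offset;
        long intensityStart = mzStart + mzSizes[position];
        return new long[]{mzStart, mzSizes[position], intensityStart, intSizes[position]};
    }

    public static void main(String[] args) {
        long[] location = locate(100L, new int[]{10, 20, 30}, new int[]{5, 6, 7}, 2);
        System.out.println("m/z bytes start at " + location[0]
                + ", intensity bytes start at " + location[2]);
    }
}
```

With offsets like these, a reader could seek directly to a single spectrum (for example with `java.io.RandomAccessFile`) instead of loading the whole block.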

Fields of each entry in `instruments`:

| Name | Type | Required | Description |
|---|---|---|---|
| manufacturer | String | False | Instrument manufacturer: "ABSciex","Thermo Fisher" |
| ionisation | String | False | Ionisation method |
| resolution | String | False | Resolution |
| model | String | False | Instrument model |
| source | List | False | Source: "electrospray ionization", "electrospray inlet" |
| analyzer | List | False | Analyzer: "quadrupole", "orbitrap" |
| detector | List | False | Detector: "inductive detector" |

Fields of each entry in `dataProcessings`:

| Name | Type | Required | Description |
|---|---|---|---|
| processingOperations | List | False | Any additional manipulation not included elsewhere in the dataProcessing element |

Fields of each entry in `softwares`:

| Name | Type | Required | Description |
|---|---|---|---|
| name | String | True | The software name |
| version | String | False | The software version |
| type | String | False | The software function type, like "acquisition" |

Fields of each entry in `parentFiles`:

| Name | Type | Required | Description |
|---|---|---|---|
| name | String | True | The filename |
| location | String | False | The file location |
| type | String | False | The file type |

Fields of `mobiInfo` (MobiInfo):

| Name | Type | Required | Description |
|---|---|---|---|
| dictStart | Long | True | Start position of the mobility array in the Aird file |
| dictEnd | Long | True | End position of the mobility array in the Aird file |
| unit | String | False | Ion mobility unit |
| type | String | False | Ion mobility type, see MobilityType |

AirdSDK provides the following core Parser classes for different mass spectrometry data acquisition modes:
- BaseParser: Abstract base class providing common spectrum reading functionality
- DDAParser: DDA (Data-Dependent Acquisition) mode parser
- DIAParser: DIA (Data-Independent Acquisition) mode parser
- PRMParser: PRM (Parallel Reaction Monitoring) mode parser (inherits from DIAParser)
- MRMParser: MRM/SRM (Multiple/Selected Reaction Monitoring) mode parser
- MSIMaldiParser: MSI MALDI (Mass Spectrometry Imaging) mode parser
// Load DIA data
DIAParser diaParser = new DIAParser("/FilePath/file.json");
// Load DDA data
DDAParser ddaParser = new DDAParser("/FilePath/file.json");
// Load PRM data
PRMParser prmParser = new PRMParser("/FilePath/file.json");
// Load MRM data
MRMParser mrmParser = new MRMParser("/FilePath/file.json");
// Load MSI MALDI data
MSIMaldiParser msiParser = new MSIMaldiParser("/FilePath/file.json");
DDAParser parser = new DDAParser(YOUR_AIRD_INDEX_FILE_PATH);
AirdInfo airdInfo = parser.getAirdInfo();
// Pick a block to read from; the first entry of the index list is used here as an example
BlockIndex blockIndex = airdInfo.getIndexList().get(0);
// Use BlockIndex and retention time to read a single spectrum
double rt = 12.3456;
Spectrum spectrumByRt = parser.getSpectrumByRt(blockIndex, rt);
// Use the multi-parameter version to read a single spectrum
Spectrum spectrumByRtRaw = parser.getSpectrumByRt(startPtr, rtList, mzOffsets, intOffsets, rt);
// Read a spectrum by sequence number
int index = 12;
Spectrum spectrumBySequence = parser.getSpectrum(index);
// Read a spectrum by BlockIndex and its index within the block
Spectrum spectrumInBlock = parser.getSpectrumByIndex(blockIndex, index);
// Read all spectra from the specified BlockIndex
TreeMap<Double, Spectrum> blockSpectra = parser.getSpectra(blockIndex);
// Read spectra within specified retention time range
TreeMap<Double, Spectrum> rangeSpectra = parser.getSpectra(start, end, rtList, mzOffsets, intOffsets);
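
The spectra maps returned above are keyed by retention time, so standard `TreeMap` navigation can be used on them after loading. The following sketch uses plain JDK calls only. `RtLookupSketch` is a hypothetical helper, and the map values are left generic so the example runs without the SDK on the classpath.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical helper that works on any map keyed by retention time.
final class RtLookupSketch {

    /** Returns the stored retention time closest to rt, or null for an empty map. */
    static <V> Double nearestRt(NavigableMap<Double, V> byRt, double rt) {
        Double lower = byRt.floorKey(rt);
        Double upper = byRt.ceilingKey(rt);
        if (lower == null) {
            return upper;
        }
        if (upper == null) {
            return lower;
        }
        return (rt - lower <= upper - rt) ? lower : upper;
    }

    /** Returns a view of all entries with start <= rt <= end. */
    static <V> NavigableMap<Double, V> within(NavigableMap<Double, V> byRt, double start, double end) {
        return byRt.subMap(start, true, end, true);
    }

    public static void main(String[] args) {
        TreeMap<Double, String> byRt = new TreeMap<>();
        byRt.put(10.0, "spectrum A");
        byRt.put(12.0, "spectrum B");
        byRt.put(15.0, "spectrum C");
        System.out.println(byRt.get(nearestRt(byRt, 12.3456)));
        System.out.println(within(byRt, 11.0, 15.0).keySet());
    }
}
```

A nearest-key lookup like this is useful when the requested retention time does not match a stored value exactly.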
// Get MS1 spectrum index
BlockIndex ms1Index = ddaParser.getMs1Index();
// Get all MS2 spectrum indexes
List<BlockIndex> ms2Indexes = ddaParser.getAllMs2Index();
// Read all DDA data into memory (recommended for small files <200MB)
List<DDAMs> cycleList = ddaParser.readAllToMemory();
// Get MS1 spectrum mapping
TreeMap<Double, Spectrum> ms1Map = ddaParser.getMs1SpectraMap();
DIAParser diaParser = new DIAParser("/FilePath/file.json");
AirdInfo airdInfo = diaParser.getAirdInfo();
// Read DIA window blocks one by one
airdInfo.getIndexList().forEach(blockIndex -> {
TreeMap<Double, Spectrum> map = diaParser.getSpectra(blockIndex); // key is retention time
});
MRMParser mrmParser = new MRMParser("/FilePath/file.json");
// Get chromatogram index
ChromatogramIndex chromaIndex = mrmParser.getChromatogramIndex();
// Get all MRM ion pairs
List<MrmPair> mrmPairs = mrmParser.getAllMrmPairs();
// Batch get chromatogram data
HashMap<String, Xic> chromatograms = mrmParser.getChromatograms(start, end, keyList, rtOffsets, intOffsets);
// Get chromatogram data for specified retention time range
double[] rtData = mrmParser.getRts4Chroma(bytes, offset, length);
double[] intensityData = mrmParser.getInts4Chroma(bytes, start, length);
MSIMaldiParser msiParser = new MSIMaldiParser("/FilePath/file.json");
// Get MS1 index for MSI data
BlockIndex ms1Index = msiParser.getMs1Index();
// Read all MSI spectra into memory
List<Spectrum> spectra = msiParser.readAllToMemory();
// Get image data
List<ImageData> imageData = msiParser.getImageDataList(mz, tolerance);
// Decompress m/z data
double[] mzValues = parser.getMzs(compressedBytes);
double[] mzSlice = parser.getMzs(compressedBytes, offset, length);
int[] mzIntegerValues = parser.getMzsAsInteger(compressedBytes);
// Decompress intensity data
double[] intensities = parser.getInts(compressedBytes);
double[] intensitySlice = parser.getInts(compressedBytes, start, length);
// Decompress mobility data
double[] mobilities = parser.getMobilities(compressedBytes, start, length);
// Calculate extracted ion chromatogram
Xic xic = parser.calcXic(spectraMap, mzStart, mzEnd);
// Close resources when done
parser.close();
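
For readers unfamiliar with extracted ion chromatograms, the sketch below shows one common way to compute one: for every spectrum, sum the intensities of all peaks whose m/z falls inside the target window, and key the result by retention time. This is not the SDK's `calcXic` implementation. `XicSketch` and `Peaks` are hypothetical stand-ins, and the inclusive window bounds are an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not the AirdSDK implementation of calcXic.
final class XicSketch {

    /** Stand-in for a decoded spectrum: parallel m/z and intensity arrays. */
    static final class Peaks {
        final double[] mzs;
        final double[] intensities;

        Peaks(double[] mzs, double[] intensities) {
            this.mzs = mzs;
            this.intensities = intensities;
        }
    }

    /** Sums, per retention time, the intensities of peaks with mzStart <= m/z <= mzEnd. */
    static TreeMap<Double, Double> calcXic(TreeMap<Double, Peaks> spectraByRt,
                                           double mzStart, double mzEnd) {
        TreeMap<Double, Double> xic = new TreeMap<>();
        for (Map.Entry<Double, Peaks> entry : spectraByRt.entrySet()) {
            Peaks peaks = entry.getValue();
            double sum = 0.0;
            for (int i = 0; i < peaks.mzs.length; i++) {
                if (peaks.mzs[i] >= mzStart && peaks.mzs[i] <= mzEnd) {
                    sum += peaks.intensities[i];
                }
            }
            xic.put(entry.getKey(), sum);
        }
        return xic;
    }

    public static void main(String[] args) {
        TreeMap<Double, Peaks> spectra = new TreeMap<>();
        spectra.put(1.0, new Peaks(new double[]{100, 200, 300}, new double[]{1, 2, 3}));
        spectra.put(2.0, new Peaks(new double[]{100, 250}, new double[]{4, 8}));
        System.out.println(calcXic(spectra, 150.0, 350.0));
    }
}
```

Since m/z arrays are sorted within a spectrum, a production version could use binary search to find the window bounds instead of scanning every peak.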
Aird-SDK/
├── CSharpSDK/ # C# SDK Source Code
├── JavaSDK/ # Java SDK Source Code
├── PyAirdSDK/ # Python SDK Source Code
├── docs/ # Documentation Directory
│ ├── Java/ # Java SDK Documentation
│ ├── CSharp/ # C# SDK Documentation
│ └── Python/ # Python SDK Documentation
└── README.md # Project Overview
All SDKs support the following core Parser classes:
- BaseParser - Base class for all Parser classes, providing common functionality
- DDAParser - Data-Dependent Acquisition (DDA) mode
- DIAParser - Data-Independent Acquisition (DIA) mode
- MRMParser - Multiple Reaction Monitoring (MRM) mode
- PRMParser - Parallel Reaction Monitoring (PRM) mode
- DDAPasefParser - DDA-PASEF mode (with ion mobility)
- DIAPasefParser - DIA-PASEF mode (with ion mobility)
- MSIMaldiParser - MALDI imaging
- ColumnParser - Column data parsing
For detailed sample code, see net.csibio.aird.sample.SampleCode.