1 What is Aird?

1.1 Abstract

Aird is a new format for mass spectrometry data storage. It is an open-source, computation-oriented format with controllable precision, flexible indexing strategies, and a high compression rate for m/z, intensity and ion mobility pairs. Aird provides a novel compressor called ComboComp for m/z data, which achieves a high compression rate: compared with Zlib, the compressed m/z data in Aird is about 65% smaller on average. Aird is also computation-friendly: through SIMD optimization, its decoding speed is much higher than that of Zlib.
Aird SDK is a developer toolkit available in Java, C# and Python. It lets developers read the spectrum data in an Aird file quickly. With its fast reads and high compression rate, developers can build data visualization and analysis applications on top of Aird.

Aird Index File Suffix: .json
Aird Data File Suffix: .aird
The Aird index file and the Aird data file should be stored in the same directory with the same file name.
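
The naming convention above can be sketched in Java as follows. This is a minimal illustration and not part of the SDK: the class `AirdPaths` and the method `dataFileFor` are hypothetical names, and the sketch assumes the data file differs from the index file only by its suffix.

```java
import java.nio.file.Path;

// Hypothetical helper (not part of the Aird SDK): derives the expected
// .aird data file location from the .json index file location.
final class AirdPaths {
    private AirdPaths() {}

    static Path dataFileFor(Path indexFile) {
        String name = indexFile.getFileName().toString();
        if (!name.endsWith(".json")) {
            throw new IllegalArgumentException("Not an Aird index file: " + name);
        }
        String stem = name.substring(0, name.length() - ".json".length());
        // Same directory, same base name, different suffix.
        return indexFile.resolveSibling(stem + ".aird");
    }
}
```

For example, `AirdPaths.dataFileFor(Paths.get("data", "sample.json"))` yields the path `data/sample.aird`.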

1.2 AirdPro: Conversion Client for Vendor Files

Use the AirdPro client to convert vendor files into the Aird format.
You can download AirdPro from GitHub:
https://github.com/CSi-Studio/AirdPro/releases/
After downloading, unzip the file and run AirdPro.exe to start the application. AirdPro is written in C# and is also an open-source project. It provides a simple UI for quickly converting vendor files to Aird files.

1.3 Supported Acquisition Methods

DIA/SWATH
DDA
PRM
DIA_PASEF
DDA_PASEF

Demo code: see SampleCode.java in the project or in the "How to use" chapter

1.4 Citation

Lu, M., An, S., Wang, R. et al. Aird: a computation-oriented mass spectrometry data format enables a higher compression ratio and less decoding time. BMC Bioinformatics 23, 35 (2022).
Wang, J. et al. StackZDPD: a novel encoding scheme for mass spectrometry data optimized for speed and compression ratio. Scientific Reports 12, 5384 (2022).

2. How to import (Java, C#, Python)

2.1 Maven for Java SDK

<dependency>
    <groupId>net.csibio.aird</groupId>
    <artifactId>aird-sdk</artifactId>
    <version>2.5.1.1</version>
</dependency>

2.2 Nuget for C# SDK

Search for "AirdSDK" in the NuGet Package Manager.

2.3 PyPI for Python SDK

pip install AirdSDK

3 Domain Definition

3.1 AirdInfo

Name	Type	Required	Description
version	String	True	Aird format version
versionCode	Integer	True	Aird format version code
engine	Integer	True	Compression engine type (0: Row Compression, 1: Column Compression)
compressors	List	True	The compression strategies for m/z, intensity and mobility array
instruments	List	True	General information about the MS instrument
dataProcessings	List	False	Description of any manipulation (from the first conversion to Aird format until the creation of the current Aird instance document) applied to the data
softwares	List	False	Software used to convert the data. If the data has been processed (e.g. profile > centroid) by any additional programs, these should be added too
parentFiles	List	False	Path to all the ancestor files (up to the native acquisition file) used to generate the current Aird document
rangeList	List	False	The precursor m/z window ranges which have been adjusted with experiment overlap. This field is targeted for DIA and PRM type format
indexList	List	True	The index for mass spectrometry data
indexStartPtr	Long	False	Start position of compressed binary index data (version code >=7)
indexEndPtr	Long	False	End position of compressed binary index data (version code >=7)
chromatogramIndex	ChromatogramIndex	False	Chromatogram information for MRM acquisition mode
type	String	True	Aird Type. Supported types: DIA, DDA, PRM, DIA_PASEF, DDA_PASEF, MRM, MSI_MALDI, COMMON
fileSize	Long	True	The file size for Aird file and JSON file
totalCount	Long	True	Total spectrum count
airdPath	String	False	The .aird file path
activator	String	False	Activation method: CID, HCD, ETD, ECD
energy	Float	False	Collision Energy
msType	String	True	Mass Spectrum Type, PROFILE, CENTROIDED
rtUnit	String	True	Retention time unit, always second
polarity	String	True	Polarity type, POSITIVE, NEGATIVE, NEUTRAL
filterString	String	False	Filter string for spectrum selection
ignoreZeroIntensityPoint	Boolean	True	Whether to ignore points whose intensity is 0
mobiInfo	MobiInfo	False	ion mobility information
msiInfo	MsiInfo	False	MSI (Mass Spectrometry Imaging) information
creator	String	False	The file creator, this field can be set up in the AirdPro
createDate	String	False	The create date for the aird file
features	String	False	Some other features stored with "key:value;key:value" format
startTimeStamp	String	False	Experiment start timestamp
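
The `features` field above stores extra attributes as "key:value;key:value". A minimal Java sketch of reading such a string into a map is shown here. `FeaturesParser` is a hypothetical helper, not an SDK class, and the sketch assumes that keys and values contain neither `;` nor a leading `:`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the Aird SDK): splits a
// "key:value;key:value" features string into an ordered map.
final class FeaturesParser {
    private FeaturesParser() {}

    static Map<String, String> parse(String features) {
        Map<String, String> result = new LinkedHashMap<>();
        if (features == null || features.isEmpty()) {
            return result;
        }
        for (String pair : features.split(";")) {
            int sep = pair.indexOf(':');
            if (sep <= 0) {
                continue; // skip fragments without a key
            }
            // Only the first ':' separates key from value.
            result.put(pair.substring(0, sep).trim(), pair.substring(sep + 1).trim());
        }
        return result;
    }
}
```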

3.2 Compressor

Name	Type	Required	Description
target	String	True	Compression target: mz, intensity, mobility, rt
methods	List	True	Compression methods in order, e.g. ["VB","Zstd"]
precision	Integer	True	Precision multiplier: 1000=3dp, 10000=4dp, etc.
digit	Integer	False	Used by the StackZDPD algorithm, 2^digit = number of layers (Python SDK only)
byteOrder	String	False	Byte order: LITTLE_ENDIAN(default), BIG_ENDIAN
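
To illustrate the `precision` multiplier in the table above, the sketch below scales a floating-point value to an integer and back. It assumes values are rounded to the nearest integer after scaling, which the table does not state. `PrecisionSketch` and its methods are hypothetical names, not SDK API.

```java
// Hypothetical illustration (not part of the Aird SDK) of a precision
// multiplier: 1000 keeps 3 decimal places, 10000 keeps 4.
final class PrecisionSketch {
    private PrecisionSketch() {}

    // Scale and round to the nearest integer (rounding mode is an assumption).
    static int toStored(double value, int precision) {
        return (int) Math.round(value * precision);
    }

    // Recover the value, now limited to the chosen number of decimals.
    static double fromStored(int stored, int precision) {
        return (double) stored / precision;
    }
}
```

With `precision = 1000`, the m/z value 400.5678 is stored as 400568 and read back as 400.568. With `precision = 10000`, it is stored as 4005678.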

3.3 WindowRange

Name	Type	Required	Description
start	Double	True	Precursor m/z start
end	Double	True	Precursor m/z end
mz	Double	True	Precursor m/z
charge	Integer	False	Precursor charge, 0 when empty
features	String	False	Some other features stored with "key:value;key:value" format
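
As a small example of how `start` and `end` might be used, the sketch below finds the window that contains a given precursor m/z. The `Window` class is a simplified stand-in for WindowRange, and treating `start` as inclusive and `end` as exclusive is an assumption made for this sketch only.

```java
import java.util.List;

// Hypothetical illustration (not part of the Aird SDK).
final class WindowLookup {
    private WindowLookup() {}

    // Simplified stand-in for WindowRange, holding only start and end.
    static final class Window {
        final double start;
        final double end;

        Window(double start, double end) {
            this.start = start;
            this.end = end;
        }
    }

    // Returns the position of the first window containing precursorMz, or -1.
    // Assumes start is inclusive and end is exclusive.
    static int indexOfWindow(List<Window> windows, double precursorMz) {
        for (int i = 0; i < windows.size(); i++) {
            Window w = windows.get(i);
            if (precursorMz >= w.start && precursorMz < w.end) {
                return i;
            }
        }
        return -1;
    }
}
```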

3.4 BlockIndex

Name	Type	Required	Description
level	Integer	True	1:MS1, 2:MS2
startPtr	Long	True	The start point for the block
endPtr	Long	True	The endpoint for the block
num	Integer	False	The scan number in the vendor file. If a block has a list of MS2, this field is the related MS1's number
rangeList	List	False	The precursor m/z window ranges which have been adjusted with experiment overlap. This field is targeted for DIA and PRM type format
nums	List	False	Scan numbers in the block
rts	List	True	All the retention times in the block
tics	List	False	Every Spectrum's total intensity in the block
injectionTimes	List	False	Every Spectrum's injection time in the block (C# and Java SDK only)
basePeakIntensities	List	True	Every Spectrum's base peak intensity in the block
basePeakMzs	List	True	Every Spectrum's base peak m/z in the block
filterStrings	List	False	Every Spectrum's filter string in the block
activators	List	False	Every Spectrum's activator in the block
energies	List	False	Every Spectrum's energy in the block
polarities	List	False	Every Spectrum's polarity in the block
msTypes	List	False	Every Spectrum's msType in the block
tags	List	False	Used in StackZDPD, the original layers of every mz point (Python SDK only)
mzs	List	True	Byte size of every m/z array in the block
ints	List	True	Byte size of every intensity array in the block
mobilities	List	False	Byte size of every ion mobility array in the block
cvList	List<List>	False	PSI Controlled Vocabulary (Python SDK only)
features	String	False	Some other features stored with "key:value;key:value" format
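
The `startPtr`, `mzs` and `ints` fields together allow a reader to locate one spectrum inside a block. The sketch below shows one way this could work, under the assumption that each spectrum's m/z bytes are immediately followed by its intensity bytes and that no mobility bytes are present. The source does not spell out this layout, so treat it as an illustration. `BlockOffsets` is a hypothetical name.

```java
// Hypothetical illustration (not part of the Aird SDK): computes where the
// spectrum at a given position starts inside the data file.
final class BlockOffsets {
    private BlockOffsets() {}

    // Assumed layout: [mz bytes][intensity bytes] repeated per spectrum,
    // beginning at startPtr.
    static long spectrumStart(long startPtr, int[] mzSizes, int[] intSizes, int position) {
        if (position < 0 || position >= mzSizes.length || mzSizes.length != intSizes.length) {
            throw new IllegalArgumentException("Invalid position or size lists");
        }
        long offset = startPtr;
        for (int i = 0; i < position; i++) {
            // Skip every spectrum stored before the requested one.
            offset += (long) mzSizes[i] + intSizes[i];
        }
        return offset;
    }
}
```

For a block with `startPtr = 100`, m/z sizes `{10, 20, 30}` and intensity sizes `{5, 5, 5}`, the third spectrum would start at byte 140 under these assumptions.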

3.5 Instrument

Name	Type	Required	Description
manufacturer	String	False	Instrument manufacturer: "ABSciex","Thermo Fisher"
ionisation	String	False	Ionisation method
resolution	String	False	Resolution
model	String	False	Instrument model
source	List	False	Source: "electrospray ionization", "electrospray inlet"
analyzer	List	False	Analyzer: "quadrupole", "orbitrap"
detector	List	False	Detector: "inductive detector"

3.6 DataProcessing

Name	Type	Required	Description
processingOperations	List	False	Any additional manipulation not included elsewhere in the dataProcessing element

3.7 Software

Name	Type	Required	Description
name	String	True	The software name
version	String	False	The software version
type	String	False	The software function type, like "acquisition"

3.8 ParentFile

Name	Type	Required	Description
name	String	True	The filename
location	String	False	The file location
type	String	False	The file type

3.9 MobiInfo

Name	Type	Required	Description
dictStart	long	True	Start position of the mobility array in the .aird file
dictEnd	long	True	End position of the mobility array in the .aird file
unit	String	False	ion mobility unit
type	String	False	ion mobility type, see MobilityType

4 API Document

4.1 Parser Classes Overview

AirdSDK provides the following core Parser classes for different mass spectrometry data acquisition modes:

BaseParser: Abstract base class providing common spectrum reading functionality
DDAParser: DDA (Data-Dependent Acquisition) mode parser
DIAParser: DIA (Data-Independent Acquisition) mode parser
PRMParser: PRM (Parallel Reaction Monitoring) mode parser (inherits from DIAParser)
MRMParser: MRM/SRM (Multiple/Selected Reaction Monitoring) mode parser
MSIMaldiParser: MSI MALDI (Mass Spectrometry Imaging) mode parser

4.2 Load Aird Info into memory

    // Load DIA data
    DIAParser diaParser = new DIAParser("/FilePath/file.json");
    
    // Load DDA data
    DDAParser ddaParser = new DDAParser("/FilePath/file.json");
    
    // Load PRM data
    PRMParser prmParser = new PRMParser("/FilePath/file.json");
    
    // Load MRM data
    MRMParser mrmParser = new MRMParser("/FilePath/file.json");
    
    // Load MSI MALDI data
    MSIMaldiParser msiParser = new MSIMaldiParser("/FilePath/file.json");

4.3 Read AirdInfo

    DDAParser parser = new DDAParser(YOUR_AIRD_INDEX_FILE_PATH);
    AirdInfo airdInfo = parser.getAirdInfo();

4.4 Read Spectrum by Retention Time

    // Use BlockIndex and retention time to read a single spectrum
    double rt = 12.3456;
    Spectrum spectrum = parser.getSpectrumByRt(blockIndex, rt);
    
    // Use the multi-parameter version to read a single spectrum
    Spectrum spectrumByOffsets = parser.getSpectrumByRt(startPtr, rtList, mzOffsets, intOffsets, rt);
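
When the exact retention time is not known in advance, a caller may first want to pick the closest value from a block's `rts` list and then pass that value to `getSpectrumByRt`. The sketch below shows one way to do this with a binary search. It assumes the list is sorted in ascending order. `RtLookup` is a hypothetical helper and does not describe how the SDK matches retention times internally.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of the Aird SDK): finds the position of the
// retention time closest to a target in an ascending list.
final class RtLookup {
    private RtLookup() {}

    static int nearestIndex(List<Double> rts, double rt) {
        if (rts.isEmpty()) {
            throw new IllegalArgumentException("rts must not be empty");
        }
        int pos = Collections.binarySearch(rts, Double.valueOf(rt));
        if (pos >= 0) {
            return pos; // exact match
        }
        int insertion = -pos - 1; // index of the first element greater than rt
        if (insertion == 0) {
            return 0;
        }
        if (insertion == rts.size()) {
            return rts.size() - 1;
        }
        double before = rts.get(insertion - 1);
        double after = rts.get(insertion);
        // Ties go to the earlier retention time.
        return (rt - before <= after - rt) ? insertion - 1 : insertion;
    }
}
```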

4.5 Read Spectrum by Index

    // Read spectrum by sequence number
    int index = 12;
    Spectrum spectrum = parser.getSpectrum(index);
    
    // Read spectrum by BlockIndex and position within the block
    Spectrum spectrumInBlock = parser.getSpectrumByIndex(blockIndex, index);

4.6 Read Multiple Spectra

    // Read all spectra from the specified BlockIndex
    TreeMap<Double, Spectrum> spectraMap = parser.getSpectra(blockIndex);
    
    // Read spectra within the specified retention time range
    TreeMap<Double, Spectrum> spectraInRange = parser.getSpectra(start, end, rtList, mzOffsets, intOffsets);

4.7 DDA-Specific Operations

    // Get MS1 spectrum index
    BlockIndex ms1Index = ddaParser.getMs1Index();
    
    // Get all MS2 spectrum indexes
    List<BlockIndex> ms2Indexes = ddaParser.getAllMs2Index();
    
    // Read all DDA data into memory (recommended for small files <200MB)
    List<DDAMs> cycleList = ddaParser.readAllToMemory();
    
    // Get MS1 spectrum mapping
    TreeMap<Double, Spectrum> ms1Map = ddaParser.getMs1SpectraMap();

4.8 DIA/SWATH Operations

    DIAParser diaParser = new DIAParser("/FilePath/file.json");
    AirdInfo airdInfo = diaParser.getAirdInfo();
    
    // Read DIA window blocks one by one
    airdInfo.getIndexList().forEach(blockIndex -> {
        TreeMap<Double, Spectrum> map = diaParser.getSpectra(blockIndex); // key is retention time
    });

4.9 MRM-Specific Operations

    MRMParser mrmParser = new MRMParser("/FilePath/file.json");
    
    // Get chromatogram index
    ChromatogramIndex chromaIndex = mrmParser.getChromatogramIndex();
    
    // Get all MRM ion pairs
    List<MrmPair> mrmPairs = mrmParser.getAllMrmPairs();
    
    // Batch get chromatogram data
    HashMap<String, Xic> chromatograms = mrmParser.getChromatograms(start, end, keyList, rtOffsets, intOffsets);
    
    // Get chromatogram data for specified retention time range
    double[] rtData = mrmParser.getRts4Chroma(bytes, offset, length);
    double[] intensityData = mrmParser.getInts4Chroma(bytes, start, length);

4.10 MSI MALDI Operations

    MSIMaldiParser msiParser = new MSIMaldiParser("/FilePath/file.json");
    
    // Get MS1 index for MSI data
    BlockIndex ms1Index = msiParser.getMs1Index();
    
    // Read all MSI spectra into memory
    List<Spectrum> spectra = msiParser.readAllToMemory();
    
    // Get image data
    List<ImageData> imageData = msiParser.getImageDataList(mz, tolerance);

4.11 Data Processing Functions

    // Decompress m/z data
    double[] mzValues = parser.getMzs(compressedBytes);
    double[] mzSlice = parser.getMzs(compressedBytes, offset, length);
    int[] mzIntegerValues = parser.getMzsAsInteger(compressedBytes);
    
    // Decompress intensity data
    double[] intensities = parser.getInts(compressedBytes);
    double[] intensitySlice = parser.getInts(compressedBytes, start, length);
    
    // Decompress mobility data
    double[] mobilities = parser.getMobilities(compressedBytes, start, length);
    
    // Calculate extracted ion chromatogram
    Xic xic = parser.calcXic(spectraMap, mzStart, mzEnd);
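
For readers unfamiliar with extracted ion chromatograms, the sketch below shows the general idea: for each retention time, sum the intensities of all peaks whose m/z lies in a window. It is a stand-alone illustration with hypothetical names (`XicSketch`, `Peaks`, `extract`) and does not claim to reproduce the behaviour of `calcXic` in the SDK. The inclusive window bounds are an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration (not part of the Aird SDK).
final class XicSketch {
    private XicSketch() {}

    // Simplified stand-in for a spectrum: parallel m/z and intensity arrays.
    static final class Peaks {
        final double[] mzs;
        final double[] intensities;

        Peaks(double[] mzs, double[] intensities) {
            this.mzs = mzs;
            this.intensities = intensities;
        }
    }

    // Returns retention time -> summed intensity inside [mzStart, mzEnd].
    static TreeMap<Double, Double> extract(Map<Double, Peaks> spectraByRt,
                                           double mzStart, double mzEnd) {
        TreeMap<Double, Double> xic = new TreeMap<>();
        for (Map.Entry<Double, Peaks> entry : spectraByRt.entrySet()) {
            Peaks peaks = entry.getValue();
            double sum = 0.0;
            for (int i = 0; i < peaks.mzs.length; i++) {
                if (peaks.mzs[i] >= mzStart && peaks.mzs[i] <= mzEnd) {
                    sum += peaks.intensities[i];
                }
            }
            xic.put(entry.getKey(), sum);
        }
        return xic;
    }
}
```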

4.12 Resource Management

    // Close resources when done
    parser.close();

5 Detailed Documentation

5.1 Multi-language SDK Documentation

Java SDK Documentation

C# SDK Documentation

Python SDK Documentation

5.2 Project Structure

Aird-SDK/
├── CSharpSDK/          # C# SDK Source Code
├── JavaSDK/            # Java SDK Source Code
├── PyAirdSDK/          # Python SDK Source Code
├── docs/               # Documentation Directory
│   ├── Java/           # Java SDK Documentation
│   ├── CSharp/         # C# SDK Documentation
│   └── Python/         # Python SDK Documentation
└── README.md           # Project Overview

5.3 Supported Parser Classes

All SDKs support the following core Parser classes:

Base Parsers

BaseParser - Base class for all Parser classes, providing common functionality

Data Acquisition Mode Parsers

DDAParser - Data-Dependent Acquisition (DDA) mode
DIAParser - Data-Independent Acquisition (DIA) mode
MRMParser - Multiple Reaction Monitoring (MRM) mode
PRMParser - Parallel Reaction Monitoring (PRM) mode

Advanced Feature Parsers

DDAPasefParser - DDA-PASEF mode (with ion mobility)
DIAPasefParser - DIA-PASEF mode (with ion mobility)
MSIMaldiParser - MALDI imaging
ColumnParser - Column data parsing
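
An application that handles several acquisition modes needs to decide which parser to use for a given file. The sketch below maps the `type` string from AirdInfo to a parser class name from the lists above. The pairing is inferred from the names and is an assumption. `ParserSelector` is a hypothetical helper, and the COMMON type is left unmapped because the lists above do not name a parser for it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of the Aird SDK): maps an Aird type string
// to the name of the parser class that appears to handle it.
final class ParserSelector {
    private static final Map<String, String> PARSER_BY_TYPE = new HashMap<>();

    static {
        // Pairings inferred from the class names; treat them as assumptions.
        PARSER_BY_TYPE.put("DIA", "DIAParser");
        PARSER_BY_TYPE.put("DDA", "DDAParser");
        PARSER_BY_TYPE.put("PRM", "PRMParser");
        PARSER_BY_TYPE.put("MRM", "MRMParser");
        PARSER_BY_TYPE.put("DIA_PASEF", "DIAPasefParser");
        PARSER_BY_TYPE.put("DDA_PASEF", "DDAPasefParser");
        PARSER_BY_TYPE.put("MSI_MALDI", "MSIMaldiParser");
    }

    private ParserSelector() {}

    static String parserClassFor(String airdType) {
        String name = PARSER_BY_TYPE.get(airdType);
        if (name == null) {
            throw new IllegalArgumentException("No parser listed for type: " + airdType);
        }
        return name;
    }
}
```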

Sample Code

For detailed sample code, see net.csibio.aird.sample.SampleCode.
