[FeatureRequest]: FileInfoParser is inefficient #966

Closed
AndyEveritt opened this issue Mar 11, 2024 · 3 comments

Is your feature request related to a problem? Please describe.

Currently, FileInfoParser is inefficient at retrieving metadata about a G-code file.

It takes ~1 second to retrieve the metadata on a Duet 2 WiFi and a Duet 3 Mini 5+ in standalone mode. Some analysis of the code has been done, and the main areas that appear to cause the slow performance are:

  • Reading and processing the footer of the file:

```cpp
int nbytes = fileBeingParsed->Read(buf, sizeToRead);
if (nbytes != (int)sizeToRead)
{
    reprap.GetPlatform().MessageF(WarningMessage, "Failed to read footer from G-Code file \"%s\"\n", filePath);
    parseState = notParsing;
    fileBeingParsed->Close();
    info = parsedFileInfo;
    return GCodeResult::warning;
}
buf[sizeToScan] = 0;

// Record performance data
uint32_t now = millis();
accumulatedReadTime += now - startTime;
startTime = now;

bool footerInfoComplete = true;

// Search for filament used
if (parsedFileInfo.numFilaments == 0)
{
    parsedFileInfo.numFilaments = FindFilamentUsed(buf);
    if (parsedFileInfo.numFilaments == 0)
    {
        footerInfoComplete = false;
    }
}

// Search for layer height
if (parsedFileInfo.layerHeight == 0.0)
{
    if (!FindLayerHeight(buf))
    {
        footerInfoComplete = false;
    }
}

// Search for object height
if (parsedFileInfo.objectHeight == 0.0)
{
    if (!FindHeight(buf, sizeToScan))
    {
        footerInfoComplete = false;
    }
}

// Search for number of layers
if (parsedFileInfo.numLayers == 0)
{
    // Number of layers should come before the object height
    (void)FindNumLayers(buf, sizeToScan);
}

// Look for print time
if (parsedFileInfo.printTime == 0)
{
    if (!FindPrintTime(buf) && fileBeingParsed->Length() - nextSeekPos <= GcodeFooterPrintTimeSearchSize)
    {
        footerInfoComplete = false;
    }
}

// Look for simulated print time. It will always be right at the end of the file, so don't look too far back
if (parsedFileInfo.simulatedTime == 0)
{
    if (!FindSimulatedTime(buf) && fileBeingParsed->Length() - nextSeekPos <= GcodeFooterPrintTimeSearchSize)
    {
        footerInfoComplete = false;
    }
}
```
  • Processing the header of the file:

```cpp
if (parsedFileInfo.numFilaments == 0)
{
    parsedFileInfo.numFilaments = FindFilamentUsed(buf);
    headerInfoComplete &= (parsedFileInfo.numFilaments != 0);
}

// Look for layer height
if (parsedFileInfo.layerHeight == 0.0)
{
    headerInfoComplete &= FindLayerHeight(buf);
}

// Look for slicer program
if (parsedFileInfo.generatedBy.IsEmpty())
{
    headerInfoComplete &= FindSlicerInfo(buf);
}

// Look for print time
if (parsedFileInfo.printTime == 0)
{
    headerInfoComplete &= FindPrintTime(buf);
}
```

With the existing implementation, these are the API call times:

[timing screenshot]

Removing the footer read-and-process code gives a small improvement:

[timing screenshot]

Removing the header processing code (so that it only looks for thumbnails in the header):

[timing screenshot]

Removing both the footer and header processing code (so that it only looks for thumbnails):

[timing screenshot]

Most of the slowdown appears to be in processing the header, with the next most time-consuming section being reading and processing the footer.

Describe the solution you propose.

Instead of searching the file for the metadata on each request, cache the metadata for each file on the SD card in a hidden folder, e.g. .meta. The last-modified time of the G-code file should also be saved in its meta file.

When asked for metadata (M36/rr_fileinfo), compare the file's last-modified time with the one stored in the cache; if they differ, scan the file again, otherwise return the data from the meta file.

When retrieving the file list of a directory (M20/rr_filelist), a similar check can be done for each file to make sure its metadata is up to date, and any file that has been added or deleted also has its metadata added or deleted.
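For illustration, the proposed freshness check might look like the following minimal sketch in portable C++ (RRF itself uses its own FatFs-based file classes; the `.meta` layout and every name here are hypothetical):

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical sketch of the proposed cache lookup, not actual RRF code.
// Returns the cached metadata (here just raw JSON text) if the cache entry
// is at least as new as the G-code file; otherwise the caller must rescan
// the file and rewrite the cache entry.
std::optional<std::string> GetCachedMeta(const fs::path& gcodePath)
{
    const fs::path cachePath = gcodePath.parent_path() / ".meta"
                             / (gcodePath.filename().string() + ".json");

    std::error_code ec;
    const auto fileTime = fs::last_write_time(gcodePath, ec);
    if (ec) { return std::nullopt; }                          // source file missing

    const auto cacheTime = fs::last_write_time(cachePath, ec);
    if (ec || cacheTime < fileTime) { return std::nullopt; }  // stale or absent

    std::ifstream in(cachePath);
    if (!in) { return std::nullopt; }
    return std::string(std::istreambuf_iterator<char>(in), {});
}
```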

@dc42

I'm reluctant to cache the metadata in a separate file on the SD card. It means a whole load of new code to keep it in step with the GCode file. We could consider caching it in the file itself, as we do for the simulated print time; but the new code needed probably wouldn't fit on Duet 2.

Describe alternatives you've considered


Provide any additional context or information.


@AndyEveritt added the enhancement label Mar 11, 2024
dc42 commented Mar 15, 2024

I think the place to start is to speed up scanning of the file for metadata. Currently we read a block of the file and repeatedly scan the block for comment lines we are interested in. It would be quicker to find a comment line, then check whether that line is of interest; then find the next comment line and repeat. This would probably need about the same amount of code as the present mechanism. If this still isn't fast enough then we can cache the metadata on the screen (like DWC does already) or at the end of the job file (where we already store the simulated time).
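A sketch of that comment-to-comment scan, assuming a NUL-terminated buffer and `;`-prefixed comment lines (the keyword list is only an example, not RRF's actual table):

```cpp
#include <cstring>

// Jump from one comment to the next and only then test whether the line is
// one we care about, instead of scanning the whole block once per keyword.
void ScanComments(const char *buf)
{
    static const char *const keywords[] =   // example entries only
    {
        ";LAYER_HEIGHT:", ";TIME:", ";Filament used:"
    };

    const char *p = buf;
    while ((p = strchr(p, ';')) != nullptr)
    {
        for (const char *kw : keywords)
        {
            if (strncmp(p, kw, strlen(kw)) == 0)
            {
                // ...parse the value that follows the keyword...
                break;
            }
        }
        const char *eol = strchr(p, '\n');  // move on to the next line
        if (eol == nullptr) { break; }
        p = eol + 1;
    }
}
```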

@dc42 modified the milestones: 3.5.2, 3.6.0 Apr 21, 2024
@dc42 assigned dc42 and unassigned x0rtrunks Sep 27, 2024
dc42 commented Sep 29, 2024

This has been implemented in 3.6-dev post beta.1. Refreshing the file data on my delta (Duet 3 Mini) takes:

Old code: 4 min 20 sec, Duet 2 binary size 520192
New code: 3 min 15 sec, Duet 2 binary size 519792

However, most of the time is taken by DWC between the end of one request and the start of the next. Typical times taken to parse an individual large file (in seconds) are:

Old code: header read time 0.076, header parse time 0.591, footer read time 0.022, footer parse time 0.048, seek time 0.003
New code: Preparation time 0.074, read time 0.114, parse time 0.089, seek time 0.001

There is scope to reduce the parse time further. Currently the parser finds the start and end of each line and, if it is a comment line, compares the start of the comment string against each of ~45 keywords. It would be quicker to use a trie, or 26 tables indexed by the first letter (although that would reduce readability). However, the parse time is currently 50% to 80% of the read time, so on this machine the speed improvement would be limited.
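For illustration, the "26 tables indexed by first letter" variant could be sketched like this (bucket contents are invented examples, not the real keyword list):

```cpp
#include <cctype>
#include <cstddef>

struct Keyword { const char *text; };

// One short table per initial letter, so a comment line is compared against
// a handful of candidates rather than all ~45 keywords.
static const Keyword bucketF[] = { { "filament used" } };
static const Keyword bucketL[] = { { "layer_height" }, { "layer count" } };

struct Bucket { const Keyword *list; size_t n; };

static const Bucket byFirstLetter[26] =
{
    /* a-e */ {}, {}, {}, {}, {},
    /* f   */ { bucketF, 1 },
    /* g-k */ {}, {}, {}, {}, {},
    /* l   */ { bucketL, 2 },
    /* m-z */ {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}
};

// Look up the bucket for the first letter of a comment keyword.
inline const Bucket *BucketFor(char c)
{
    return std::isalpha((unsigned char)c)
        ? &byFirstLetter[std::tolower((unsigned char)c) - 'a']
        : nullptr;
}
```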

@dc42 closed this as completed Sep 29, 2024
dc42 commented Oct 4, 2024

Further improved the speed by doing a binary search on the first letter of the comment string. However, when timing the result by pressing Refresh on the Jobs page of DWC, most of the time is taken by DWC, not RRF, because there are often significant delays between the requests received by RRF.
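A minimal sketch of that binary search, over a keyword table kept sorted by first character (the table contents and structure here are assumptions, not the actual commit):

```cpp
#include <cstddef>

struct Keyword { const char *text; };

// Must stay sorted by first character for the binary search below.
static const Keyword keywords[] =
{
    { "estimated printing time" },
    { "filament used" },
    { "layer_height" },
    { "simulated print time" },
};

constexpr size_t NumKeywords = sizeof(keywords) / sizeof(keywords[0]);

// Binary-search for the first keyword whose initial letter is 'c', then
// report how many consecutive entries share that letter. The caller does a
// short linear comparison within that range only.
const Keyword *FindCandidates(char c, size_t& count)
{
    size_t lo = 0, hi = NumKeywords;
    while (lo < hi)                         // classic lower-bound loop
    {
        const size_t mid = (lo + hi) / 2;
        if (keywords[mid].text[0] < c) { lo = mid + 1; } else { hi = mid; }
    }
    size_t end = lo;
    while (end < NumKeywords && keywords[end].text[0] == c) { ++end; }
    count = end - lo;
    return (count != 0) ? &keywords[lo] : nullptr;
}
```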
