[FeatureRequest]: FileInfoParser is inefficient #966

Closed
AndyEveritt opened this issue Mar 11, 2024 · 3 comments

Is your feature request related to a problem? Please describe.

Currently, FileInfoParser is inefficient at retrieving metadata about a G-code file.

It takes ~1 second to retrieve the metadata on a Duet 2 WiFi and a Duet 3 Mini 5+ in standalone mode. Some analysis of the code has been done, and the main areas that appear to cause the slow performance are:

  • Reading and processing the footer of the file:

```cpp
int nbytes = fileBeingParsed->Read(buf, sizeToRead);
if (nbytes != (int)sizeToRead)
{
    reprap.GetPlatform().MessageF(WarningMessage, "Failed to read footer from G-Code file \"%s\"\n", filePath);
    parseState = notParsing;
    fileBeingParsed->Close();
    info = parsedFileInfo;
    return GCodeResult::warning;
}
buf[sizeToScan] = 0;

// Record performance data
uint32_t now = millis();
accumulatedReadTime += now - startTime;
startTime = now;

bool footerInfoComplete = true;

// Search for filament used
if (parsedFileInfo.numFilaments == 0)
{
    parsedFileInfo.numFilaments = FindFilamentUsed(buf);
    if (parsedFileInfo.numFilaments == 0)
    {
        footerInfoComplete = false;
    }
}

// Search for layer height
if (parsedFileInfo.layerHeight == 0.0)
{
    if (!FindLayerHeight(buf))
    {
        footerInfoComplete = false;
    }
}

// Search for object height
if (parsedFileInfo.objectHeight == 0.0)
{
    if (!FindHeight(buf, sizeToScan))
    {
        footerInfoComplete = false;
    }
}

// Search for number of layers
if (parsedFileInfo.numLayers == 0)
{
    // Number of layers should come before the object height
    (void)FindNumLayers(buf, sizeToScan);
}

// Look for print time
if (parsedFileInfo.printTime == 0)
{
    if (!FindPrintTime(buf) && fileBeingParsed->Length() - nextSeekPos <= GcodeFooterPrintTimeSearchSize)
    {
        footerInfoComplete = false;
    }
}

// Look for simulated print time. It will always be right at the end of the file, so don't look too far back
if (parsedFileInfo.simulatedTime == 0)
{
    if (!FindSimulatedTime(buf) && fileBeingParsed->Length() - nextSeekPos <= GcodeFooterPrintTimeSearchSize)
    {
        footerInfoComplete = false;
    }
}
```
  • Processing the header of the file:

```cpp
if (parsedFileInfo.numFilaments == 0)
{
    parsedFileInfo.numFilaments = FindFilamentUsed(buf);
    headerInfoComplete &= (parsedFileInfo.numFilaments != 0);
}

// Look for layer height
if (parsedFileInfo.layerHeight == 0.0)
{
    headerInfoComplete &= FindLayerHeight(buf);
}

// Look for slicer program
if (parsedFileInfo.generatedBy.IsEmpty())
{
    headerInfoComplete &= FindSlicerInfo(buf);
}

// Look for print time
if (parsedFileInfo.printTime == 0)
{
    headerInfoComplete &= FindPrintTime(buf);
}
```

With the existing implementation, these are the API call times:

[timing screenshot]

Removing the footer read-and-process code gives a small improvement:

[timing screenshot]

Removing the header processing code (so that it only looks for thumbnails in the header):

[timing screenshot]

Removing both the footer and header processing code (so that it only looks for thumbnails):

[timing screenshot]

Most of the slowdown appears to be in processing the header, with the next most time-consuming section being reading and processing the footer.

Describe the solution you propose.

Instead of searching the file for the metadata on each request, cache the metadata for each file on the SD card in a hidden folder, e.g. .meta. The last-modified time of the G-code file should also be saved in its meta file.

When asked for metadata (M36/rr_fileinfo), compare the file's last-modified time with the one stored in the cache; if they differ, scan the file again, otherwise return the data from the meta file.

When retrieving the file list of a directory (M20/rr_filelist), a similar check can be done for each file to make sure its metadata is up to date, and any file that has been added or deleted also has its metadata added or deleted.
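For illustration, the proposed freshness check might look like the following minimal sketch in portable C++ (RRF itself uses its own FatFs-based file classes; the `.meta` layout and every name here are hypothetical):

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical sketch of the proposed cache lookup, not actual RRF code.
// Returns the cached metadata (here just raw JSON text) if the cache entry
// is at least as new as the G-code file; otherwise the caller must rescan
// the file and rewrite the cache entry.
std::optional<std::string> GetCachedMeta(const fs::path& gcodePath)
{
    const fs::path cachePath = gcodePath.parent_path() / ".meta"
                             / (gcodePath.filename().string() + ".json");

    std::error_code ec;
    const auto fileTime = fs::last_write_time(gcodePath, ec);
    if (ec) { return std::nullopt; }                          // source file missing

    const auto cacheTime = fs::last_write_time(cachePath, ec);
    if (ec || cacheTime < fileTime) { return std::nullopt; }  // stale or absent

    std::ifstream in(cachePath);
    if (!in) { return std::nullopt; }
    return std::string(std::istreambuf_iterator<char>(in), {});
}
```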

@dc42

I'm reluctant to cache the metadata in a separate file on the SD card. It means a whole load of new code to keep it in step with the GCode file. We could consider caching it in the file itself, as we do for the simulated print time; but the new code needed probably wouldn't fit on Duet 2.

Describe alternatives you've considered


Provide any additional context or information.


@AndyEveritt added the enhancement label Mar 11, 2024
dc42 commented Mar 15, 2024

I think the place to start is to speed up scanning of the file for metadata. Currently we read a block of the file and repeatedly scan the block for comment lines we are interested in. It would be quicker to find a comment line, then check whether that line is of interest; then find the next comment line and repeat. This would probably need about the same amount of code as the present mechanism. If this still isn't fast enough then we can cache the metadata on the screen (like DWC does already) or at the end of the job file (where we already store the simulated time).
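A sketch of that comment-to-comment scan, assuming a NUL-terminated buffer and `;`-prefixed comment lines (the keyword list is only an example, not RRF's actual table):

```cpp
#include <cstring>

// Jump from one comment to the next and only then test whether the line is
// one we care about, instead of scanning the whole block once per keyword.
void ScanComments(const char *buf)
{
    static const char *const keywords[] =   // example entries only
    {
        ";LAYER_HEIGHT:", ";TIME:", ";Filament used:"
    };

    const char *p = buf;
    while ((p = strchr(p, ';')) != nullptr)
    {
        for (const char *kw : keywords)
        {
            if (strncmp(p, kw, strlen(kw)) == 0)
            {
                // ...parse the value that follows the keyword...
                break;
            }
        }
        const char *eol = strchr(p, '\n');  // move on to the next line
        if (eol == nullptr) { break; }
        p = eol + 1;
    }
}
```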

@dc42 modified the milestones: 3.5.2, 3.6.0 Apr 21, 2024
@dc42 assigned dc42 and unassigned x0rtrunks Sep 27, 2024
dc42 commented Sep 29, 2024

This has been implemented in 3.6-dev post beta.1. Refreshing the file data on my delta (Duet 3 Mini) takes:

Old code: 4 min 20 sec, Duet 2 binary size 520192
New code: 3 min 15 sec, Duet 2 binary size 519792

However, most of the time is taken by DWC between the end of one request and the start of the next. Typical times taken to parse an individual large file (in seconds) are:

Old code: header read time 0.076, header parse time 0.591, footer read time 0.022, footer parse time 0.048, seek time 0.003
New code: Preparation time 0.074, read time 0.114, parse time 0.089, seek time 0.001

There is scope to reduce the parse time further. Currently the parser finds the start and end of each line and, if it is a comment line, compares the start of the comment string against each of ~45 keywords. It would be quicker to use a trie, or 26 tables indexed by the first letter (although that would reduce readability). However, the parse time is currently 50% to 80% of the read time, so on this machine the speed improvement would be limited.
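For illustration, the "26 tables indexed by first letter" variant could be sketched like this (bucket contents are invented examples, not the real keyword list):

```cpp
#include <cctype>
#include <cstddef>

struct Keyword { const char *text; };

// One short table per initial letter, so a comment line is compared against
// a handful of candidates rather than all ~45 keywords.
static const Keyword bucketF[] = { { "filament used" } };
static const Keyword bucketL[] = { { "layer_height" }, { "layer count" } };

struct Bucket { const Keyword *list; size_t n; };

static const Bucket byFirstLetter[26] =
{
    /* a-e */ {}, {}, {}, {}, {},
    /* f   */ { bucketF, 1 },
    /* g-k */ {}, {}, {}, {}, {},
    /* l   */ { bucketL, 2 },
    /* m-z */ {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}
};

// Look up the bucket for the first letter of a comment keyword.
inline const Bucket *BucketFor(char c)
{
    return std::isalpha((unsigned char)c)
        ? &byFirstLetter[std::tolower((unsigned char)c) - 'a']
        : nullptr;
}
```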

@dc42 closed this as completed Sep 29, 2024
dc42 commented Oct 4, 2024

Further improved the speed by doing a binary search on the first letter of the comment string. However, when timing the result by pressing Refresh on the Jobs page of DWC, most of the time is taken by DWC, not RRF, because there are often significant delays between the requests received by RRF.
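A minimal sketch of that binary search, over a keyword table kept sorted by first character (the table contents and structure here are assumptions, not the actual commit):

```cpp
#include <cstddef>

struct Keyword { const char *text; };

// Must stay sorted by first character for the binary search below.
static const Keyword keywords[] =
{
    { "estimated printing time" },
    { "filament used" },
    { "layer_height" },
    { "simulated print time" },
};

constexpr size_t NumKeywords = sizeof(keywords) / sizeof(keywords[0]);

// Binary-search for the first keyword whose initial letter is 'c', then
// report how many consecutive entries share that letter. The caller does a
// short linear comparison within that range only.
const Keyword *FindCandidates(char c, size_t& count)
{
    size_t lo = 0, hi = NumKeywords;
    while (lo < hi)                         // classic lower-bound loop
    {
        const size_t mid = (lo + hi) / 2;
        if (keywords[mid].text[0] < c) { lo = mid + 1; } else { hi = mid; }
    }
    size_t end = lo;
    while (end < NumKeywords && keywords[end].text[0] == c) { ++end; }
    count = end - lo;
    return (count != 0) ? &keywords[lo] : nullptr;
}
```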
