Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with bput #64

Open
jnimoth opened this issue Dec 16, 2024 · 4 comments
Open

Issues with bput #64

jnimoth opened this issue Dec 16, 2024 · 4 comments

Comments

@jnimoth
Copy link

jnimoth commented Dec 16, 2024

A user of ours experiences issues when trying to upload with gocmd bput. I tried myself today and also got a lot of errors when trying bput command.

He a simple example:

$ ll bput_test
total 3.4G
-rw-rw-r-- 1 jelte jelte 954M Dec 16 14:33 1GB_Testfile
-rw-rw-r-- 1 jelte jelte 1.9G Dec 16 14:33 2GB_Testfile
-rw-rw-r-- 1 jelte jelte 477M Dec 16 14:33 500MB_Testfile
-rw-rw-r-- 1 jelte jelte  48M Dec 16 14:33 50MB_Testfile

Trying to upload via gocmd bput -d --min_file_num 1 bput_test/ gives an error (see attached log bput.log).

I also end up with a non-good replica in .gocmd_staging as well as parts of the files being transferred:

$ ils -L bput_test
/testrug/home/[email protected]/bput_test:
  rdms-testers      0 rootResc;mnt_nfsirods    500000000 2024-12-16.15:02 & 500MB_Testfile
        generic    /mnt/nfsirods/home/[email protected]/bput_test/500MB_Testfile
  rdms-testers      0 rootResc;mnt_nfsirods     50000000 2024-12-16.15:02 & 50MB_Testfile
        generic    /mnt/nfsirods/home/[email protected]/bput_test/50MB_Testfile
$ ils -L .gocmd_staging
/testrug/home/[email protected]/.gocmd_staging:
  rdms-testers      0 rootResc;mnt_nfsirods            0 2024-12-16.15:02 ? bundle_e7752d5110e9db4b6ce5ade52e2c6da4.tar
        generic    /mnt/nfsirods/home/[email protected]/.gocmd_staging/bundle_e7752d5110e9db4b6ce5ade52e2c6da4.tar

If I remove the 1GB and 2GB files from the folder, the bput transfer succeeds with the same parameters.

I also tried the original folder once more, but specified a bigger --max_file_size than the default (precisely: gocmd bput -d --min_file_num 1 --max_file_size "21474836480" bput_test/. This also resulted in an error, but a different one (see attached log bput_again.log). In this case, the upload location is completely empty and I also end up with a non-good replica in .gocmd_staging.

$ ils -L .gocmd_staging
/testrug/home/[email protected]/.gocmd_staging:
  rdms-testers      0 rootResc;mnt_nfsirods            0 2024-12-16.15:15 ? bundle_b5a4a757b693c4bfb0c96b46fa6b0ad7.tar
        generic    /mnt/nfsirods/home/[email protected]/.gocmd_staging/bundle_b5a4a757b693c4bfb0c96b46fa6b0ad7.tar

I also saw that when I use the --icat flag, the same upload seems to succeed, but I do not understand why so far.

So

$ gocmd bput --icat -d --min_file_num 1 bput_test/

results in all files being upload. No errors or non-good replicas in this case (see attached log bput_icat_default_size.log).

We are using iRODS 4.3.2 + PAM in the background. It is for me currently not clear if this is an issue with bput, our system, or wrong use of the command.

bput.log
bput_again.log
bput_icat_default_size.log

@iychoi
Copy link
Member

iychoi commented Dec 18, 2024

I believe there is a bug in redirect-to-resource feature which implements direct access to resource server.
Uploading a large bundle file failed due to the issue, therefore, you are seeing non-good replicas.

It is also possible that firewalls in the network disconnect the iRODS connection for some reasons. We have the firewall issue in CyVerse and we are waiting for patching the university firewall.

Anyway, the issue is not limited to bput, so can you try uploading large files using put with or without --icat? If you see the same issue with put without --icat, it's clearly redirect-to-resource bug. If put with --icat also fails, it's something else, like firewall, this is what we are facing in CyVerse.

@jnimoth
Copy link
Author

jnimoth commented Dec 19, 2024

I tested gocmd put with a 10GB file.

Normal put without any extra flags give an error as well for this file:

./gocmd put --progress /media/jelte/data_drive/10GB_Testfile 
/media/jelte/data_drive/10GB_Testfile                    ... done! [10.00GB in 1m27.29s]
Unexpected error!
Error Trace:
  - failed to perform parallel jobs:
    github.com/cyverse/gocommands/cmd/subcmd.(*PutCommand).Process
        /github/workspace/cmd/subcmd/put.go:213
  - failed to upload "/media/jelte/data_drive/10GB_Testfile" to "/rug/home/[email protected]/10GB_Testfile":
    github.com/cyverse/gocommands/cmd/subcmd.(*PutCommand).schedulePut.func1
        /github/workspace/cmd/subcmd/put.go:412
  - failed to upload data object chunk "/media/jelte/data_drive/10GB_Testfile" to resource server:
    github.com/cyverse/go-irodsclient/irods/fs.UploadDataObjectToResourceServer.func1
        /go/pkg/mod/github.com/cyverse/[email protected]/irods/fs/data_object_resource_server.go:781
  - failed to read transfer header from resource server:
    github.com/cyverse/go-irodsclient/irods/fs.uploadDataObjectChunkToResourceServer
        /go/pkg/mod/github.com/cyverse/[email protected]/irods/fs/data_object_resource_server.go:438
  - failed to receive data:
    github.com/cyverse/go-irodsclient/irods/connection.(*IRODSResourceServerConnection).RecvWithTrackerCallBack
        /go/pkg/mod/github.com/cyverse/[email protected]/irods/connection/resource_server_connection.go:362
  - failed to read from socket:
    github.com/cyverse/go-irodsclient/irods/util.ReadBytesWithTrackerCallBack
        /go/pkg/mod/github.com/cyverse/[email protected]/irods/util/io.go:28
  - EOF

And the file arrives as expected in a non-good state in the end:

ils -L 10GB_Testfile
  j.p.nimoth@r      0 rootResc;randy;pt0;mnt_nfsirods0            0 2024-12-19.15:16 ? 10GB_Testfile
        generic    /mnt/nfsirods/home/[email protected]/10GB_Testfile

Trying the same with --icat uploads the file without any errors!

Additionally, I also tried iput with a smaller files:
Interestingly, a 1GB Testfile did not give the error with put, even without the --icat flag. Using a 2GB file gave the error as shown above. Also tried 3GB which gave me an error as well. To my surprise, a 5GB file was uploaded in the same way without an issue (tried two times, two times without error).

@iychoi
Copy link
Member

iychoi commented Jan 7, 2025

Can you try v0.10.9 to check if the issue still exists?

@jnimoth
Copy link
Author

jnimoth commented Jan 13, 2025

I re-tried today the same things as before with the current v0.10.9 version. For me, the issue seems to be still there.
So again using this source folder:

$ ll bput_test/
total 3.4G
-rw-rw-r-- 1 jelte jelte 954M Jan 13 10:10 1GB_Testfile
-rw-rw-r-- 1 jelte jelte 1.9G Jan 13 10:10 2GB_Testfile
-rw-rw-r-- 1 jelte jelte 477M Jan 13 10:10 500MB_Testfile
-rw-rw-r-- 1 jelte jelte  48M Jan 13 10:09 50MB_Testfile

First, running ./gocmd bput -d --min_file_num 1 ./bput_test/ results in:

  • Incomplete transfer (50 + 500 MB are uploaded)
  • non-good replica in .gocmd_staging.
  • bput_min_1_max_size_default.log for the log file.

bput_min_1_max_size_default.log

Content of .gocmd_staging and dest. directory:

$ ils -L .gocmd_staging bput_test/ 
/rug/home/[email protected]/.gocmd_staging:
  j.p.nimoth@r      0 rootResc;randy;pt0;mnt_nfsirods0            0 2025-01-13.10:22 ? bundle_4399661cc3b529a04b2b753e4dada537.tar
        generic    /mnt/nfsirods/home/[email protected]/.gocmd_staging/bundle_4399661cc3b529a04b2b753e4dada537.tar
/rug/home/[email protected]/bput_test:
  j.p.nimoth@r      0 rootResc;randy;pt0;mnt_nfsirods0    500000000 2025-01-13.10:22 & 500MB_Testfile
        generic    /mnt/nfsirods/home/[email protected]/bput_test/500MB_Testfile
  j.p.nimoth@r      0 rootResc;randy;pt0;mnt_nfsirods0     50000000 2025-01-13.10:22 & 50MB_Testfile
        generic    /mnt/nfsirods/home/[email protected]/bput_test/50MB_Testfile

Second, running with the ./gocmd bput -d --min_file_num 1 --max_file_size "21474836480" ./bput_test/:

  • Upload dest. completely empty
  • a single, non-good replica at .gocmd_staging
  • Loogs in bput_min_1_max_size_21474836480.log

bput_min_1_max_size_21474836480.log

Third, running with --icat does result in it to work. All files uploaded to dest and no non-good replicas in .gocmd_staging

Fourth, I also re-tried uploading a single, bigger file using gocmd put with our without --icat flag, so ./gocmd put -d (--icat) /media/jelte/data_drive/10GB_Testfile.
Running without --icat gives again an error (see attached 10GB_put_no_icat.log) and results in a non-good replica:

$ ils -L 10GB_Testfile
  j.p.nimoth@r      0 rootResc;randy;pt0;mnt_nfsirods0            0 2025-01-13.10:39 ? 10GB_Testfile
        generic    /mnt/nfsirods/home/[email protected]/10GB_Testfile

10GB_put_no_icat.log

The same with --icat does work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants