Hey,
right now I'm trying to get a simple I/O scenario running on our PBS cluster. It fails on the write operation for the generated PBS submission script:
> java.io.IOException: Error write /home/user/.openmole/.tmp/ssh/openmole-6a9e2713-dbc6-45d8-970a-ea195c88e179/tmp/1508932279100/a7d83199-68b6-4af0-873f-7c4739b31108/aacbcbb2-48d0-4a6d-93a0-aa73cd92ce57.pbs on org.openmole.plugin.environment.pbs.PBSJobService$$anon$1@18773638
> at fr.iscpif.gridscale.storage.Storage$class.errorWrapping(Storage.scala:72)…
The workflow looks like this:
val proto1 = Val[Int]
val inputFile = Val[File]
val outputFile = Val[File]

val explo = ExplorationTask(proto1 in (0 to 9))

// Defines the Scala task as a launcher of the hello executable
val javaTask =
  ScalaTask("val outputFile = newFile(); hello.Hello.run(proto1, inputFile, outputFile)") set (
    libraries += workDirectory / "Hello.jar",
    inputs += (proto1, inputFile),
    outputs += (proto1, outputFile),
    inputFile := workDirectory / "input.txt"
  )

// Save the output file locally
val copyHook =
  CopyFileHook(
    outputFile,
    workDirectory / "out-${proto1}.txt"
  )

val env = PBSEnvironment(
  "user",
  "11.11.11.11")

explo -< (javaTask on env hook (copyHook))
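For context, hello.Hello.run takes the parameter and the input file and writes a result to outputFile. If anyone wants to reproduce this without the jar, the task can be approximated with a plain file copy (a minimal stand-in, not my actual code; inlineTask is a made-up name):

val inlineTask =
  ScalaTask("""
    val outputFile = newFile()
    // stand-in for hello.Hello.run: copy the input into the freshly created output file
    java.nio.file.Files.copy(inputFile.toPath, outputFile.toPath, java.nio.file.StandardCopyOption.REPLACE_EXISTING)
  """) set (
    inputs += (proto1, inputFile),
    outputs += (proto1, outputFile),
    inputFile := workDirectory / "input.txt"
  )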
Since OpenMOLE does create several folders on the cluster, SSH access itself seems to work. Still, the execution tab in the GUI shows only failed jobs on the environment side. If I run the program purely locally, it works fine.
If I instead run the workflow on a plain SSH environment (declared roughly as sketched below the trace), I get this error:
org.openmole.core.exception.InternalProcessingError: Error for context values in org.openmole.core.workflow.tools.InputOutputCheck$@34fb8371 {inputFile=/home/user/.openmole/.tmp/ssh/openmole-6a9e2713-dbc6-45d8-970a-ea195c88e179/tmp/1508935593043/109689a8-f51b-4863-9d8d-7287835a5ff7/ba5e8bcf-081d-40d0-bfec-09983a28afcb/.tmp/89ecd9d8-ea00-4f83-ad83-0f1652555b27/filed8e6b3f9-de83-4b9d-9ad6-42b506148a9a.bin, openmole$seed=145807262833502681, proto1=3}
at org.openmole.core.workflow.tools.InputOutputCheck$$anonfun$perform$2.apply(InputOutputCheck.scala:90)
at org.openmole.core.workflow.tools.InputOutputCheck$$anonfun$perform$2.apply(InputOutputCheck.scala:80)
Caused by: java.io.FileNotFoundException: /home/user/.openmole/.tmp/ssh/openmole-6a9e2713-dbc6-45d8-970a-ea195c88e179/tmp/1508935593043/109689a8-f51b-4863-9d8d-7287835a5ff7/ba5e8bcf-081d-40d0-bfec-09983a28afcb/.tmp/89ecd9d8-ea00-4f83-ad83-0f1652555b27/runtime26b19081-6155-42cc-84e1-695ab789e796/file6ffe3afd-57ea-4bd3-adde-d568a1f1a91d.bin (No such file or directory)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
at hello.Hello.run(Hello.java:10)
at $line3.$read$$iw$$iw$$anonfun$1.apply(<console>:25)
… 31 more
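For reference, the SSH run used an environment declared roughly like this (a sketch; sshEnv is a made-up name and the slot count is a placeholder):

val sshEnv = SSHEnvironment(
  "user",
  "11.11.11.11",
  4) // number of concurrent slots on the remote machine (placeholder value)

explo -< (javaTask on sshEnv hook (copyHook))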
In the SSH case, the jobs on the environment are marked as finished; however, the global job counter never increases.
On the Torque side, tracejob produces output like this for each execution:
user@master:~> tracejob 1161260
/var/spool/torque/server_priv/accounting/20171026: Permission denied
/var/spool/torque/mom_logs/20171026: No such file or directory
Job: 1161260.master.cluster
10/26/2017 10:02:08 S enqueuing into batch, state 1 hop 1
10/26/2017 10:02:31 S Job Modified at request of Scheduler@master.cluster
10/26/2017 10:02:31 L Job Run
10/26/2017 10:02:31 S Job Run at request of Scheduler@master.cluster
10/26/2017 10:02:31 S Not sending email: User does not want mail of this type.
10/26/2017 10:02:39 S Not sending email: User does not want mail of this type.
10/26/2017 10:02:39 S Exit_status=0 resources_used.cput=00:00:13 resources_used.mem=0kb resources_used.vmem=0kb resources_used.walltime=00:00:08
10/26/2017 10:02:39 S on_job_exit valid pjob: 1161260.master.cluster (substate=50)
10/26/2017 10:02:40 S dequeuing from batch, state COMPLETE
10/26/2017 10:03:11 S Unknown Job Id Error
Note that Torque reports Exit_status=0, so the job script itself seems to finish; the trailing "Unknown Job Id Error" presumably just comes from a status poll after the completed job was purged from the queue. Any ideas?