OpenMOLE workflow not running on EGI


#21

Yes, it is import _file_.RModel._ now. However, I think there is a bug in the dev version. I am using it right now for a big experiment and my job states got stuck after a few hours. I am working on it. Tell me if it does the same for you.


#22

@helenea I found and fixed the bug stalling the jobs. Your OpenMOLE instance has been updated. Tell me if it works.


(Hélène) #23

Ok thanks!
I’ve re-started my jobs, they’re running for now.
I still get a bunch of errors I never encountered before, but it’s still running.

Here are the kinds of errors I get:

HTTP error: java.io.IOException: https://grid27.lal.in2p3.fr:443/dpmpart/part1/vo.complex-systems.eu/2017-12-06/job_5bd48e53-33c2-42af-a451-2f23fca2e39d.in.302611205.0?sfn=%2Fdpm%2Flal.in2p3.fr%2Fhome%2Fvo.complex-systems.eu%2F%2Fopenmole-d63801d8-fd85-4c21-bf3c-0a9b7b4f907c%2Ftmp%2F1512552444330%2F5f28f868-49dc-4b09-aedf-2d429f2708f1%2Fjob_5bd48e53-33c2-42af-a451-2f23fca2e39d.in&dpmtoken=43f5c2b6-efd3-4c16-bf65-1bc9ddcedfeb&token=NnWmOOX7kgtGdLO7uSmhbcsZys4%3D%401512554143%401 responded with an error: 404 Not Found
    	at gridscale.http.package$HTTP$.wrapError(package.scala:117)
    	at gridscale.http.package$HTTP.content(package.scala:184)
    	at gridscale.http.package$.read(package.scala:249)
    	at gridscale.webdav.package$.writeStream(package.scala:57)
    	at org.openmole.plugin.environment.egi.EGIEnvironment$$anon$2.$anonfun$upload$1(EGIEnvironment.scala:273)
    	at org.openmole.plugin.environment.egi.EGIEnvironment$$anon$2.$anonfun$upload$1$adapted(EGIEnvironment.scala:273)
    	at org.openmole.plugin.environment.batch.storage.StorageInterface$.upload(StorageInterface.scala:61)
    	at org.openmole.plugin.environment.egi.EGIEnvironment$$anon$2.upload(EGIEnvironment.scala:273)
    	at org.openmole.plugin.environment.egi.EGIEnvironment$$anon$2.upload(EGIEnvironment.scala:257)
    	at org.openmole.plugin.environment.batch.storage.StorageService.$anonfun$upload$2(StorageService.scala:224)
    	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
    	at org.openmole.plugin.environment.batch.storage.QualityControl.$anonfun$apply$1(QualityControl.scala:37)
    	at org.openmole.plugin.environment.batch.storage.QualityControl.timed(QualityControl.scala:50)
    	at org.openmole.plugin.environment.batch.storage.QualityControl.apply(QualityControl.scala:36)
    	at org.openmole.plugin.environment.batch.storage.StorageService.$anonfun$upload$1(StorageService.scala:224)
    	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
    	at org.openmole.plugin.environment.batch.environment.AccessToken.access(UsageControl.scala:7)
    	at org.openmole.plugin.environment.batch.storage.StorageService.upload(StorageService.scala:224)
    	at org.openmole.plugin.environment.batch.refresh.UploadActor$.$anonfun$initCommunication$3(UploadActor.scala:95)
    	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
    	at org.openmole.plugin.environment.batch.environment.BatchEnvironment$.signalUpload(BatchEnvironment.scala:74)
    	at org.openmole.plugin.environment.batch.refresh.UploadActor$.$anonfun$initCommunication$2(UploadActor.scala:95)
    	at org.openmole.plugin.environment.batch.refresh.UploadActor$.$anonfun$initCommunication$2$adapted(UploadActor.scala:93)
    	at org.openmole.core.workspace.NewFile.withTmpFile(NewFile.scala:21)
    	at org.openmole.plugin.environment.batch.refresh.UploadActor$.$anonfun$initCommunication$1(UploadActor.scala:93)
    	at org.openmole.core.workspace.NewFile.withTmpFile(NewFile.scala:21)
    	at org.openmole.plugin.environment.batch.refresh.UploadActor$.initCommunication(UploadActor.scala:68)
    	at org.openmole.plugin.environment.batch.refresh.UploadActor$.receive(UploadActor.scala:47)
    	at org.openmole.plugin.environment.batch.refresh.JobManager$DispatcherActor$.receive(JobManager.scala:48)
    	at org.openmole.plugin.environment.batch.refresh.JobManager$.$anonfun$dispatch$1(JobManager.scala:60)
    	at org.openmole.core.threadprovider.ThreadProvider$RunClosure.run(ThreadProvider.scala:23)
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:748)
    Caused by: java.io.IOException: https://grid27.lal.in2p3.fr:443/dpmpart/part1/vo.complex-systems.eu/2017-12-06/job_5bd48e53-33c2-42af-a451-2f23fca2e39d.in.302611205.0?sfn=%2Fdpm%2Flal.in2p3.fr%2Fhome%2Fvo.complex-systems.eu%2F%2Fopenmole-d63801d8-fd85-4c21-bf3c-0a9b7b4f907c%2Ftmp%2F1512552444330%2F5f28f868-49dc-4b09-aedf-2d429f2708f1%2Fjob_5bd48e53-33c2-42af-a451-2f23fca2e39d.in&dpmtoken=43f5c2b6-efd3-4c16-bf65-1bc9ddcedfeb&token=NnWmOOX7kgtGdLO7uSmhbcsZys4%3D%401512554143%401 responded with an error: 404 Not Found
    	at gridscale.http.package$HTTP.testResponse$1(package.scala:161)
    	at gridscale.http.package$HTTP.withInputStream(package.scala:171)
    	at gridscale.http.package$HTTP.$anonfun$content$1(package.scala:187)
    	at gridscale.http.package$HTTP$.wrapError(package.scala:113)
    	... 35 more

Another one:

HTTP error: java.io.IOException: https://grid20.lal.in2p3.fr:443/dpmpart/part3/vo.complex-systems.eu/2017-12-06/job_f411b43e-1cb0-4b77-b311-b076153f2df9.in.302610934.0?sfn=%2Fdpm%2Flal.in2p3.fr%2Fhome%2Fvo.complex-systems.eu%2F%2Fopenmole-d63801d8-fd85-4c21-bf3c-0a9b7b4f907c%2Ftmp%2F1512552444330%2Fb85e8e7a-533a-46da-93e5-28e9bd59425b%2Fjob_f411b43e-1cb0-4b77-b311-b076153f2df9.in&dpmtoken=861911fb-86f7-49cc-b082-f0fd7e3dc9bd&token=x%2F5sO6rTa0%2BPAbV2u9CJ5wbfcAE%3D%401512554065%401 responded with an error: 403 Forbidden
	at gridscale.http.package$HTTP$.wrapError(package.scala:117)
	at gridscale.http.package$HTTP.content(package.scala:184)
	at gridscale.http.package$.read(package.scala:249)
	at gridscale.webdav.package$.writeStream(package.scala:57)
	at org.openmole.plugin.environment.egi.EGIEnvironment$$anon$2.$anonfun$upload$1(EGIEnvironment.scala:273)
	at org.openmole.plugin.environment.egi.EGIEnvironment$$anon$2.$anonfun$upload$1$adapted(EGIEnvironment.scala:273)
	at org.openmole.plugin.environment.batch.storage.StorageInterface$.upload(StorageInterface.scala:61)
	at org.openmole.plugin.environment.egi.EGIEnvironment$$anon$2.upload(EGIEnvironment.scala:273)
	at org.openmole.plugin.environment.egi.EGIEnvironment$$anon$2.upload(EGIEnvironment.scala:257)
	at org.openmole.plugin.environment.batch.storage.StorageService.$anonfun$upload$2(StorageService.scala:224)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
	at org.openmole.plugin.environment.batch.storage.QualityControl.$anonfun$apply$1(QualityControl.scala:37)
	at org.openmole.plugin.environment.batch.storage.QualityControl.timed(QualityControl.scala:50)
	at org.openmole.plugin.environment.batch.storage.QualityControl.apply(QualityControl.scala:36)
	at org.openmole.plugin.environment.batch.storage.StorageService.$anonfun$upload$1(StorageService.scala:224)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
	at org.openmole.plugin.environment.batch.environment.AccessToken.access(UsageControl.scala:7)
	at org.openmole.plugin.environment.batch.storage.StorageService.upload(StorageService.scala:224)
	at org.openmole.plugin.environment.batch.refresh.UploadActor$.$anonfun$initCommunication$3(UploadActor.scala:95)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
	at org.openmole.plugin.environment.batch.environment.BatchEnvironment$.signalUpload(BatchEnvironment.scala:74)
	at org.openmole.plugin.environment.batch.refresh.UploadActor$.$anonfun$initCommunication$2(UploadActor.scala:95)
	at org.openmole.plugin.environment.batch.refresh.UploadActor$.$anonfun$initCommunication$2$adapted(UploadActor.scala:93)
	at org.openmole.core.workspace.NewFile.withTmpFile(NewFile.scala:21)
	at org.openmole.plugin.environment.batch.refresh.UploadActor$.$anonfun$initCommunication$1(UploadActor.scala:93)
	at org.openmole.core.workspace.NewFile.withTmpFile(NewFile.scala:21)
	at org.openmole.plugin.environment.batch.refresh.UploadActor$.initCommunication(UploadActor.scala:68)
	at org.openmole.plugin.environment.batch.refresh.UploadActor$.receive(UploadActor.scala:47)
	at org.openmole.plugin.environment.batch.refresh.JobManager$DispatcherActor$.receive(JobManager.scala:48)
	at org.openmole.plugin.environment.batch.refresh.JobManager$.$anonfun$dispatch$1(JobManager.scala:60)
	at org.openmole.core.threadprovider.ThreadProvider$RunClosure.run(ThreadProvider.scala:23)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: https://grid20.lal.in2p3.fr:443/dpmpart/part3/vo.complex-systems.eu/2017-12-06/job_f411b43e-1cb0-4b77-b311-b076153f2df9.in.302610934.0?sfn=%2Fdpm%2Flal.in2p3.fr%2Fhome%2Fvo.complex-systems.eu%2F%2Fopenmole-d63801d8-fd85-4c21-bf3c-0a9b7b4f907c%2Ftmp%2F1512552444330%2Fb85e8e7a-533a-46da-93e5-28e9bd59425b%2Fjob_f411b43e-1cb0-4b77-b311-b076153f2df9.in&dpmtoken=861911fb-86f7-49cc-b082-f0fd7e3dc9bd&token=x%2F5sO6rTa0%2BPAbV2u9CJ5wbfcAE%3D%401512554065%401 responded with an error: 403 Forbidden
	at gridscale.http.package$HTTP.testResponse$1(package.scala:161)
	at gridscale.http.package$HTTP.withInputStream(package.scala:171)
	at gridscale.http.package$HTTP.$anonfun$content$1(package.scala:187)
	at gridscale.http.package$HTTP$.wrapError(package.scala:113)
	... 35 more

#24

Great

The storage at LAL was under heavy load. I reported that to the sysadmins, and they are looking into it.


(Hélène) #25

The jobs are still running, but a lot of them fail: almost twice as many as the successful ones, which is unusual.