Error when loading bundles on SLURM

local
libraries
slurm

(Sebastien Rey) #1

Hi guys,

We are on commit 3e777e8ae224aa01e11b30c29106b2671ee58717 of OpenMOLE.

We have a problem with OpenMOLE and SLURM on the latest version. We tried to investigate it directly on the cluster during task execution, and we get the warning/error output below. It seems that OpenMOLE sometimes runs into duplicate OSGi bundles at initialisation, which is strange:

    WARNING: Error loading bundle /gpfs1/home/2017016/thurau01/.openmole/.tmp/ssh/openmole-723f06ff-45a7-468c-9930-37d2f37a923b/tmp/1499860868108/7351da26-109b-4cca-9098-ac9723b83ab6/1cf2730e-e7cd-4bd0-b427-8de7c0191b8d/.tmp/ec8515f8-59f1-4059-ae6d-1c0bd6bab02d/file10cc7d8d-6953-4192-9f73-b337618d1e4a.bin
    org.openmole.core.exception.InternalProcessingError: Installing bundle /gpfs1/home/2017016/thurau01/.openmole/.tmp/ssh/openmole-723f06ff-45a7-468c-9930-37d2f37a923b/tmp/1499860868108/7351da26-109b-4cca-9098-ac9723b83ab6/1cf2730e-e7cd-4bd0-b427-8de7c0191b8d/.tmp/ec8515f8-59f1-4059-ae6d-1c0bd6bab02d/file10cc7d8d-6953-4192-9f73-b337618d1e4a.bin
            at org.openmole.core.pluginmanager.PluginManager$.installBundle(PluginManager.scala:176)
            at org.openmole.core.pluginmanager.PluginManager$.$anonfun$tryLoad$3(PluginManager.scala:113)
            at scala.util.Try$.apply(Try.scala:209)
            at org.openmole.core.pluginmanager.PluginManager$.$anonfun$tryLoad$2(PluginManager.scala:113)
            at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234)
            at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
            at scala.collection.TraversableLike.map(TraversableLike.scala:234)
            at scala.collection.TraversableLike.map$(TraversableLike.scala:227)
            at scala.collection.AbstractTraversable.map(Traversable.scala:104)
            at org.openmole.core.pluginmanager.PluginManager$.tryLoad(PluginManager.scala:113)
            at org.openmole.runtime.Runtime.apply(Runtime.scala:115)
            at org.openmole.runtime.SimExplorer$.$anonfun$run$2(SimExplorer.scala:99)
            at scala.Option.foreach(Option.scala:257)
            at org.openmole.runtime.SimExplorer$.run(SimExplorer.scala:81)
            at org.openmole.runtime.SimExplorer.run(SimExplorer.scala)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at org.openmole.launcher.Launcher.main(Launcher.java:129)
    Caused by: org.osgi.framework.BundleException: A bundle is already installed with the name "org.objectweb.asm" and version "5.1.0"
            at org.eclipse.osgi.container.ModuleContainer.install(ModuleContainer.java:254)
            at org.eclipse.osgi.storage.Storage.install(Storage.java:513)
            at org.eclipse.osgi.internal.framework.BundleContextImpl.installBundle(BundleContextImpl.java:146)
            at org.eclipse.osgi.internal.framework.BundleContextImpl.installBundle(BundleContextImpl.java:139)
            at org.openmole.core.pluginmanager.PluginManager$.installBundle(PluginManager.scala:171)
            ... 19 more
    Jul 12, 2017 2:06:36 PM org.openmole.runtime.Runtime $anonfun$apply$6
    WARNING: Error loading bundle /gpfs1/home/2017016/thurau01/.openmole/.tmp/ssh/openmole-723f06ff-45a7-468c-9930-37d2f37a923b/tmp/1499860868108/7351da26-109b-4cca-9098-ac9723b83ab6/1cf2730e-e7cd-4bd0-b427-8de7c0191b8d/.tmp/ec8515f8-59f1-4059-ae6d-1c0bd6bab02d/file350dfdd4-457e-4527-8f92-e3ff83d892e4.bin
    org.openmole.core.exception.InternalProcessingError: Installing bundle /gpfs1/home/2017016/thurau01/.openmole/.tmp/ssh/openmole-723f06ff-45a7-468c-9930-37d2f37a923b/tmp/1499860868108/7351da26-109b-4cca-9098-ac9723b83ab6/1cf2730e-e7cd-4bd0-b427-8de7c0191b8d/.tmp/ec8515f8-59f1-4059-ae6d-1c0bd6bab02d/file350dfdd4-457e-4527-8f92-e3ff83d892e4.bin
            at org.openmole.core.pluginmanager.PluginManager$.installBundle(PluginManager.scala:176)
            at org.openmole.core.pluginmanager.PluginManager$.$anonfun$tryLoad$3(PluginManager.scala:113)
            at scala.util.Try$.apply(Try.scala:209)
            at org.openmole.core.pluginmanager.PluginManager$.$anonfun$tryLoad$2(PluginManager.scala:113)
            at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234)
            at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
            at scala.collection.TraversableLike.map(TraversableLike.scala:234)
            at scala.collection.TraversableLike.map$(TraversableLike.scala:227)
            at scala.collection.AbstractTraversable.map(Traversable.scala:104)
            at org.openmole.core.pluginmanager.PluginManager$.tryLoad(PluginManager.scala:113)
            at org.openmole.runtime.Runtime.apply(Runtime.scala:115)
            at org.openmole.runtime.SimExplorer$.$anonfun$run$2(SimExplorer.scala:99)
            at scala.Option.foreach(Option.scala:257)
            at org.openmole.runtime.SimExplorer$.run(SimExplorer.scala:81)
            at org.openmole.runtime.SimExplorer.run(SimExplorer.scala)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at org.openmole.launcher.Launcher.main(Launcher.java:129)
    Caused by: org.osgi.framework.BundleException: A bundle is already installed with the name "com.google.guava" and version "19.0.0"
            at org.eclipse.osgi.container.ModuleContainer.install(ModuleContainer.java:254)
            at org.eclipse.osgi.storage.Storage.install(Storage.java:513)
            at org.eclipse.osgi.internal.framework.BundleContextImpl.installBundle(BundleContextImpl.java:146)
            at org.eclipse.osgi.internal.framework.BundleContextImpl.installBundle(BundleContextImpl.java:139)
            at org.openmole.core.pluginmanager.PluginManager$.installBundle(PluginManager.scala:171)
            ... 19 more

We are still searching for why the error appears, because a simple oms script and a simple GAMA task finished without problems.


#2

It is a bug that the runtime tries to load these two libraries twice, but that is not what would make the runtime crash.
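
To check which bundles the runtime tried to install twice, the `Caused by` lines can be pulled out of the job output. This is only a minimal sketch, assuming the output was saved to a file; the function name `list_duplicate_bundles` and the file name `runtime.log` are hypothetical and not part of OpenMOLE.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenMOLE): print each "name version" pair
# that the OSGi framework reported as already installed, once per pair.
# Usage: list_duplicate_bundles runtime.log
list_duplicate_bundles() {
  # Keep only the BundleException messages about duplicates,
  # extract the quoted name and version, then de-duplicate.
  grep 'A bundle is already installed with the name' "$1" \
    | sed -E 's/.*with the name "([^"]+)" and version "([^"]+)".*/\1 \2/' \
    | LC_ALL=C sort -u
}
```

Calling `list_duplicate_bundles runtime.log` on the output posted above should list one line per duplicated bundle, which makes it easier to see whether only these two libraries are affected.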


(Sebastien Rey) #3

OK,
the model just finished running locally (35 min), so this is not a problem with the GAMA task. The same model is still running on SLURM and has not finished after 1 h :-/

Some new info from Thomas on another execution attempt:

    INFO: Warp/affine reduction enabled: true
    run.sh: line 50:  1018 Bus error               java -Djava.io.tmpdir="${TMPDIR}" -Dfile.encoding=UTF-8 -Xss2M -Xms64m -Xmx${MEMORY} -Dosgi.locking=none -Dosgi.configuration.area="${CONFIGDIR}" $FLAG -XX:ReservedCodeCacheSize=128m -XX:MaxMetaspaceSize=128m -XX:CompressedClassSpaceSize=128m -XX:+UseG1GC -XX:ParallelGCThreads=1 -XX:CICompilerCount=2 -XX:ConcGCThreads=1 -XX:G1ConcRefinementThreads=1 -cp "${LOCATION}/launcher/*" org.openmole.launcher.Launcher --plugins "${LOCATION}/plugins/" --priority "logging" --run org.openmole.runtime.SimExplorer --osgi-directory "${CONFIGDIR}" -- --workspace "${TMPDIR}" $@
    slurmstepd: error: Exceeded step memory limit at some point.

(Sebastien Rey) #4

Hmm, we found a bug in GAMA headless, which runs two concurrent simulations in the same run: https://github.com/gama-platform/gama/issues/2207


OpenMOLE removes the runtime directory before the end of the execution :(

(Jonathan Passerat Palmbach) #5

The slurmstepd message tells you that Slurm killed your job because it exceeded its memory limit at some point.
We use 2 GB by default in OpenMOLE.
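
To see how far over the limit the job went, Slurm's accounting can be queried once the job has ended. The sketch below assumes that accounting is enabled on the cluster and that you know the job ID; the wrapper name `job_memory_report` is hypothetical, while `sacct` and the listed format fields are standard Slurm.

```shell
#!/bin/sh
# Hypothetical wrapper: show peak resident memory (MaxRSS) next to the
# requested memory (ReqMem) for each step of a finished Slurm job.
# Usage: job_memory_report <jobid>
job_memory_report() {
  sacct -j "$1" --format=JobID,JobName,MaxRSS,ReqMem,State,Elapsed
}
```

If MaxRSS is close to or above ReqMem for the failing step, that supports the memory-limit explanation rather than a problem in bundle loading.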