Karaf Corrupt Component Cache

bobshort
Hi,

We are using Karaf on plug computers in the field to collect data. These plug computers are sometimes hot power cycled due to power outages, etc. Occasionally, when they are hot power cycled, Karaf will not come back up cleanly. We've tried both Felix and Equinox and had issues with both.

I noticed this from the Karaf docs:

    Launching Karaf can result in a deadlock in Felix during module dependency resolution.

    This is often a result of sending a SIGINT (control-C) to the process when it will not cleanly exit.
    This can corrupt the caches and cause startup problems in the very next launch. It is fixed by emptying the component cache:

    rm -rf data/cache/*


Unfortunately cleaning up the component cache effectively un-deploys all the features and bundles we've installed. In this case we have to get remote access to the plug computer and completely re-install all our features. This is a major pain for us and for our customers.

Any suggestions on workaround for this issue that does not involve re-installing everything?

Re: Karaf Corrupt Component Cache

David Jencks
I think this is an indication of why relying on mvn: URLs is a bad idea in Karaf.

I'm not sure how practical it is in any version of Karaf, but you might try installing all the bundles you use in the Karaf repository and modifying your feature descriptors to use reference: file URLs.  This will mean the cache doesn't contain the actual bundles, just a little metadata basically pointing to the actual file system location.  Since there's still only one copy of the actual bundle, the disk space should be the same. When you do a clean start or remove your cache forcibly, everything you need will still be in the Karaf repo.
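
A rough sketch of the first half of this suggestion, not a tested recipe: the shell function below copies a bundle jar into Karaf's "system" directory using the standard Maven repository layout (group ID as nested directories, then artifact ID and version). The function name, the /opt/karaf location and the example coordinates are all assumptions made up for illustration.

```shell
#!/bin/sh
# Sketch only: copy a bundle jar into Karaf's system repository, which uses
# the Maven layout: group/as/dirs/artifact/version/artifact-version.jar
# All arguments are hypothetical examples; adjust them to your installation.
install_to_system_repo() {
    karaf_home=$1   # e.g. /opt/karaf (assumed install location)
    group_id=$2     # e.g. com.example.demo
    artifact_id=$3  # e.g. demo-bundle
    version=$4      # e.g. 1.0.0
    jar=$5          # path to the bundle jar to install

    # Maven layout: dots in the group ID become directory separators.
    group_path=$(printf '%s' "$group_id" | tr '.' '/')
    target_dir="$karaf_home/system/$group_path/$artifact_id/$version"

    mkdir -p "$target_dir" || return 1
    cp "$jar" "$target_dir/$artifact_id-$version.jar" || return 1

    # Print the installed path so it can be referenced from a descriptor.
    printf '%s\n' "$target_dir/$artifact_id-$version.jar"
}
```

For example, "install_to_system_repo /opt/karaf com.example.demo demo-bundle 1.0.0 ./demo.jar" prints the resulting path under system/. How a feature descriptor then points at that file with a reference: URL depends on the Karaf version and is deliberately not shown here.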

A while back in Karaf 3 you could sort of automate this by making a custom Karaf assembly and including all your features as startup features.  This will basically hardcode which bundles are running, but they'll all be installed using reference: URLs and actually be present in the system repo.  I haven't tried this recently to make sure it still works, and I can't seriously recommend using unreleased Karaf 3 in production.

david jencks

On Apr 12, 2012, at 10:35 AM, bobshort wrote:

> --
> View this message in context: http://karaf.922171.n3.nabble.com/Karaf-Corrupt-Component-Cache-tp3905992p3905992.html
> Sent from the Karaf - User mailing list archive at Nabble.com.

Re: Karaf Corrupt Component Cache

Achim Nierbeck
In the "good old times" I used to create my custom
Karaf just by using the maven-assembly-plugin combined with the
features plugin. This way I just added all required bundles to the
system repo, added those bundles to the startup.properties and,
et voilà, had a nice Karaf runtime with all needed bundles :)

Just another working solution that already worked for Karaf 1.x to 2.x :)

Just my $0.02

regards, Achim


On 12.04.2012 19:56, David Jencks wrote:



--
- Apache Karaf <http://karaf.apache.org/> Committer & PMC
- OPS4J Pax Web <http://wiki.ops4j.org/display/paxweb/Pax+Web/> Committer & Project Lead
- Blog <http://notizblog.nierbeck.de/>

Re: Karaf Corrupt Component Cache

bobshort
Thanks for the suggestions.

I'm not sure if the local repo approach will work for us or not. Our system is end-user configurable. Basically we have a core set of bundles, and then customers can install various bundles to integrate with a wide variety of end hardware (ZigBee, Modbus, Z-Wave, etc.). We've also got customer-specific modules to integrate with proprietary systems. Ultimately we don't know what modules are going to be installed when the system ships, so we can't really add them to the local repo and startup config.

Once the devices are in the field and they get corrupted, it is ugly to get them fixed. It is possible that Karaf may not be the best choice for a remote device that needs to come back from a power cycle reliably every time, which is unfortunate because, other than this issue, Karaf has been great.

Re: Karaf Corrupt Component Cache

Achim Nierbeck
You still have the deploy folder for custom bundles.
If your additional third-party bundles need to be started reliably, put
them into the deploy folder. The file installer service will
pick them up every time your server restarts, even
if the cache was dumped.
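
A minimal sketch of dropping a bundle into the deploy folder, with power loss in mind: the file is first copied to a scratch directory and then renamed into deploy/, so the deploy folder should never hold a half-written jar. The function name and the "deploy-staging" directory are assumptions made up for this example, not Karaf conventions; the scratch directory is placed under the Karaf home so the rename stays on one file system.

```shell
#!/bin/sh
# Sketch only: place a bundle in Karaf's deploy folder so that it is picked
# up again after the bundle cache has been cleared.
# Usage: deploy_bundle <bundle-file> <karaf-home>   (hypothetical paths)
deploy_bundle() {
    src=$1
    karaf_home=$2
    name=$(basename "$src")
    staging="$karaf_home/deploy-staging"   # assumed scratch dir, not a Karaf default

    mkdir -p "$staging" "$karaf_home/deploy" || return 1

    # Copy outside deploy/ first, so a power cut during the copy cannot
    # leave a truncated jar where the deployer would find it.
    cp "$src" "$staging/$name" || return 1
    sync

    # A rename within one file system replaces the target in a single step.
    mv "$staging/$name" "$karaf_home/deploy/$name"
}
```

Calling "deploy_bundle ./my-driver.jar /opt/karaf" (both names are placeholders) leaves the jar in /opt/karaf/deploy, where it survives an "rm -rf data/cache/*".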

regards, Achim

On 12.04.2012 23:26, bobshort wrote:

> --
> View this message in context: http://karaf.922171.n3.nabble.com/Karaf-Corrupt-Component-Cache-tp3905992p3906534.html
> Sent from the Karaf - User mailing list archive at Nabble.com.



Re: Karaf Corrupt Component Cache

David Jencks
In reply to this post by Achim Nierbeck
As I said, I haven't tried this for a while, but the idea behind the Karaf assembly packaging was to make this pretty much automated, so you didn't need to write an assembly descriptor or write the startup.properties by hand.  It looked a lot simpler to me :-/

thanks
david jencks


On Apr 12, 2012, at 2:05 PM, Achim Nierbeck wrote:


Re: Karaf Corrupt Component Cache

David Jencks
In reply to this post by bobshort
How have you been installing the configured set of bundles on the Karaf machines?

As Achim said, File Install will keep all the bundles there, but I think it copies the bundles rather than using reference: URLs, so (if I'm right) it will use more disk space.  There might be other solutions... keep talking :-)
thanks
david jencks

On Apr 12, 2012, at 2:26 PM, bobshort wrote:


Re: Karaf Corrupt Component Cache

Achim Nierbeck
In reply to this post by David Jencks
Hi David,

I know. For the Karaf 3.0 line I also consider this to be the
friendlier way :)
For all prior versions, though, a simple assembly XML file is quite
sufficient :)

regards, Achim

2012/4/13 David Jencks <[hidden email]>:




Re: Karaf Corrupt Component Cache

bobshort
In reply to this post by David Jencks
David Jencks wrote:
> How have you been installing the configured set of bundles on the karaf machines?

I've been using a features repo hosted on an HTTP server. The repo's URL is added to the "org.apache.karaf.features.cfg" config file. After that, features are installed using "features:install featurename".

Using KAR files dumped into the deploy directory might be worth a try. That would make recovery a little easier, although I would still need to gain remote access to the plug computer to blow away the cache directory when it fails. I wouldn't want to blow away the cache on every reboot because that would make startup very slow on machines with limited resources (It already takes 2 minutes).

Does anyone know what the root problem is that causes this occasional cache corruption?

Regards,

B



Re: Karaf Corrupt Component Cache

sully6768
I have had similar issues in the past with a dynamic deployment.  I ended up creating a bundle that would snapshot the cache at a given interval, and then modified the start script to restore from that snapshot if there was an unclean shutdown, where "unclean" meant that a breadcrumb had not been removed from the file system.

This allowed for a clean restart regardless.
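
The approach described above could be sketched roughly as follows. This is not the original implementation: here the snapshot is taken by a shell function (for instance run periodically from cron) rather than by a bundle inside Karaf, and every path and name (cache-snapshot, running.crumb, the function names) is an assumption made up for illustration. Copying the cache while Karaf is running may capture an inconsistent state, so treat this as a starting point only.

```shell
#!/bin/sh
# Sketch only: breadcrumb plus cache snapshot around a Karaf start.
# Assumed layout: <karaf-home>/data/cache is the bundle cache.

# Copy the live cache aside, then swap the copy in with renames so that a
# complete snapshot (or its .old predecessor) exists at every moment.
snapshot_cache() {
    home=$1
    cache="$home/data/cache"
    snap="$home/data/cache-snapshot"     # hypothetical snapshot location
    # Recover from an earlier swap that was interrupted half way.
    if [ ! -d "$snap" ] && [ -d "$snap.old" ]; then mv "$snap.old" "$snap"; fi
    rm -rf "$snap.new" "$snap.old"
    cp -R "$cache" "$snap.new" || return 1
    sync
    if [ -d "$snap" ]; then mv "$snap" "$snap.old" || return 1; fi
    mv "$snap.new" "$snap" && rm -rf "$snap.old"
}

# If the breadcrumb survived, the last run did not end cleanly:
# replace the cache with the most recent snapshot, if there is one.
restore_if_unclean() {
    home=$1
    cache="$home/data/cache"
    snap="$home/data/cache-snapshot"
    [ -f "$home/data/running.crumb" ] || return 0   # clean shutdown
    if [ ! -d "$snap" ] && [ -d "$snap.old" ]; then mv "$snap.old" "$snap"; fi
    [ -d "$snap" ] || return 0                      # nothing to restore from
    rm -rf "$cache"
    cp -R "$snap" "$cache"
}

# Usage: start_with_recovery <karaf-home> <command to start Karaf...>
start_with_recovery() {
    home=$1
    shift
    restore_if_unclean "$home" || return 1
    mkdir -p "$home/data" || return 1
    touch "$home/data/running.crumb"    # dropped before Karaf starts
    status=0
    "$@" || status=$?                   # runs Karaf in the foreground
    # Only a normal exit removes the breadcrumb.
    if [ "$status" -eq 0 ]; then rm -f "$home/data/running.crumb"; fi
    return "$status"
}
```

A start script might then call "start_with_recovery /opt/karaf /opt/karaf/bin/karaf" (an assumed install path), while a scheduled job calls "snapshot_cache /opt/karaf". Restoring only when the breadcrumb is present keeps normal reboots from paying for a cache copy.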

Best Regards,
Scott ES
 
Scott England-Sullivan
Principal Consultant
FuseSource
Phone: (217) 390-3058
Web: http://www.fusesource.com
Blog: http://sully6768.blogspot.com
Twitter: sully6768