Spring AMQP StatefulRetryOperationsInterceptor + cluster = RetryCacheCapacityExceededException

Over the last few months I had a chance to develop a quite complex distributed system, driven by push notifications from Google Calendar. I decided to use Spring AMQP backed by a RabbitMQ broker. Due to scaling and high availability requirements, the main backend service and the broker were deployed to multiple servers.

Spring AMQP is a very nice library which facilitates building message-driven applications. As with most distributed systems, I had to deal with transient issues: DNS resolution problems, network timeouts, transactions failing due to optimistic locking and so on. If you make your processing idempotent, retries are a very efficient way of dealing with this kind of problem.

When processing fails, the message is rejected and by default it goes back to the AMQP queue. To handle this properly, Spring AMQP uses Spring Retry to track how many retries were performed and then passes failed messages to DLQs. You can configure various aspects of the retry policy. StatefulRetryOperationsInterceptor helps to build a workflow in which non-transient exceptions cause the message to go immediately to the DLQ, while potentially transient exceptions are retried a given number of times.
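
For reference, this is roughly how such an interceptor can be assembled with spring-rabbit's RetryInterceptorBuilder (available since the 1.3 line, as far as I can tell). A minimal sketch: the exchange and routing key passed to the recoverer are made-up names and the backoff values are just an example.

import org.springframework.amqp.core.AmqpTemplate;
import org.springframework.amqp.rabbit.config.RetryInterceptorBuilder;
import org.springframework.amqp.rabbit.retry.RepublishMessageRecoverer;
import org.springframework.retry.interceptor.StatefulRetryOperationsInterceptor;

public class RetryConfiguration {

    // Retry a failed delivery up to 3 times with exponential backoff,
    // then republish it to a DLQ exchange (names are illustrative only).
    public StatefulRetryOperationsInterceptor retryInterceptor(AmqpTemplate amqpTemplate) {
        return RetryInterceptorBuilder.stateful()
                .maxAttempts(3)
                .backOffOptions(1000, 2.0, 10000)
                .recoverer(new RepublishMessageRecoverer(amqpTemplate, "dlq.exchange", "dlq.routing.key"))
                .build();
    }
}

The resulting interceptor is then placed in the listener container's advice chain (setAdviceChain on SimpleMessageListenerContainer).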

There is, however, one very dangerous aspect of using Spring Retry for this purpose. The AMQP protocol allows storing headers for each message, which seems to be the perfect place to keep a retry counter. Spring Retry is built for a more generic use case, so it has its own storage for retry context which, by default, is local. If you dig into the documentation/code you can easily see that for each failed message the library computes a key and then stores a RetryContext object in a RetryContextCache.
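
The key is produced by a MessageKeyGenerator. By default the message id is used, which is why stateful retry requires messages to carry a unique messageId, but you can also plug in your own implementation. A sketch (it assumes your publisher sets a unique messageId on every message):

import org.springframework.amqp.core.Message;
import org.springframework.amqp.rabbit.retry.MessageKeyGenerator;

public class MessageIdKeyGenerator implements MessageKeyGenerator {

    // The RetryContext of a failed delivery is stored in the RetryContextCache
    // under the value returned here, so it must be stable across redeliveries.
    @Override
    public Object getKey(Message message) {
        return message.getMessageProperties().getMessageId();
    }
}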

This seems to be a quite reasonable solution for a generic retry library. I started thinking: what would happen in a clustered environment? If the retry context is not clustered, but each server keeps its own copy, then the number of retries in the worst-case scenario would be:

 (MAX_RETRY_NUMBER - 1) * NUMBER_OF_NODES + 1

So for a deployment of two machines and max retries equal to 3, in the worst-case scenario a message would be rejected after 5 retries instead of 3. I thought for a while about implementing a custom clustered RetryContextCache, but it did not seem justifiable. Exceptions were quite rare and this small difference in the maximum number of retries seemed acceptable. Even the Spring Retry documentation states that a clustered RetryContextCache is overkill in most situations:

The default implementation of the RetryContextCache is in memory, using a simple Map. Advanced usage with multiple processes in a clustered environment might also consider implementing the RetryContextCache with a cluster cache of some sort (though, even in a clustered environment this might be overkill).

It turned out you actually MUST implement a custom RetryContextCache. It does not have to be clustered, but the default Map-based cache will sooner or later blow up in production. The problem with the default implementation is that it can hold at most 4096 keys and there is no expiry. If you have transient issues, it might happen that processing fails on server A (so A stores a retry context for the message) and then server B consumes the redelivered message successfully, which means the entry on server A is never removed. One slot has leaked and you are effectively down to a limit of 4095... When you reach 0, Spring Retry starts throwing RetryCacheCapacityExceededException and no messages ever end up in the DLQ.

2014-11-23 01:00:00,017 ERROR [com.XXX.calendarservice.exception.ServiceErrorHandler] (SimpleAsyncTaskExecutor-11539489) c416b403-e2b5-477f-af5f-293627c5837d INTERNAL_ERROR: class org.springframework.retry.TerminatedRetryException: Could not register throwable; nested exception is org.springframework.retry.policy.RetryCacheCapacityExceededException: Retry cache capacity limit breached. Do you need to re-consider the implementation of the key generator, or the equals and hashCode of the items that failed?: org.springframework.retry.TerminatedRetryException: Could not register throwable; nested exception is org.springframework.retry.policy.RetryCacheCapacityExceededException: Retry cache capacity limit breached. Do you need to re-consider the implementation of the key generator, or the equals and hashCode of the items that failed?
        at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:266) [spring-retry-1.0.3.RELEASE.jar:]
        at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:188) [spring-retry-1.0.3.RELEASE.jar:]
        at org.springframework.retry.interceptor.StatefulRetryOperationsInterceptor.invoke(StatefulRetryOperationsInterceptor.java:145) [spring-retry-1.0.3.RELEASE.jar:]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) [spring-aop-4.0.4.RELEASE.jar:4.0.4.RELEASE]
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:207) [spring-aop-4.0.4.RELEASE.jar:4.0.4.RELEASE]
        at sun.proxy.$Proxy117.invokeListener(Unknown Source)
        at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.invokeListener(SimpleMessageListenerContainer.java:1111) [spring-rabbit-1.3.2.RELEASE.jar:]
        at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListener(AbstractMessageListenerContainer.java:559) [spring-rabbit-1.3.2.RELEASE.jar:]
        at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.doReceiveAndExecute(SimpleMessageListenerContainer.java:904) [spring-rabbit-1.3.2.RELEASE.jar:]
        at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.receiveAndExecute(SimpleMessageListenerContainer.java:888) [spring-rabbit-1.3.2.RELEASE.jar:]
        at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.access$500(SimpleMessageListenerContainer.java:75) [spring-rabbit-1.3.2.RELEASE.jar:]
        at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:989) [spring-rabbit-1.3.2.RELEASE.jar:]
        at java.lang.Thread.run(Thread.java:722) [rt.jar:1.7.0_09-icedtea]
Caused by: org.springframework.retry.policy.RetryCacheCapacityExceededException: Retry cache capacity limit breached. Do you need to re-consider the implementation of the key generator, or the equals and hashCode of the items that failed?
        at org.springframework.retry.policy.MapRetryContextCache.put(MapRetryContextCache.java:82) [spring-retry-1.0.3.RELEASE.jar:]
        at org.springframework.retry.support.RetryTemplate.registerThrowable(RetryTemplate.java:362) [spring-retry-1.0.3.RELEASE.jar:]
        at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:264) [spring-retry-1.0.3.RELEASE.jar:]
        ... 12 more


If you take a look at the dashboard below, you can immediately see when the retry state storage stopped working correctly:

Grafana graph


Lessons learned: do not always believe the documentation, read the implementation as well. I ended up implementing a simple RetryContextCache backed by a Guava cache with expiration.
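
For completeness, the idea looks roughly like this (a sketch, not the exact class from the project; the expiry time and maximum size are arbitrary):

import java.util.concurrent.TimeUnit;

import org.springframework.retry.RetryContext;
import org.springframework.retry.policy.RetryContextCache;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

public class ExpiringRetryContextCache implements RetryContextCache {

    // Entries that are never cleaned up on this node (e.g. because the
    // redelivered message was consumed successfully on another node)
    // expire instead of occupying a slot forever.
    private final Cache<Object, RetryContext> cache = CacheBuilder.newBuilder()
            .expireAfterWrite(30, TimeUnit.MINUTES)
            .maximumSize(4096)
            .build();

    @Override
    public RetryContext get(Object key) {
        return cache.getIfPresent(key);
    }

    @Override
    public void put(Object key, RetryContext context) {
        cache.put(key, context);
    }

    @Override
    public void remove(Object key) {
        cache.invalidate(key);
    }

    @Override
    public boolean containsKey(Object key) {
        return cache.getIfPresent(key) != null;
    }
}

Such a cache can be wired in by creating a RetryTemplate, calling setRetryContextCache on it and passing the template to the interceptor builder via its retryOperations method.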

GraphHopper Docker container

In my most recent project, the travel planning tool tribear.com, we had to optimize routes for independent travelers. After evaluating several alternatives we decided to go with the open source routing engine GraphHopper.

GraphHopper can be used in several ways: as a webapp, an Android lib or a regular Java lib. We decided to use the webapp deployment and call its REST API to calculate routes between points. Moreover, we wanted to run a separate GraphHopper instance for each city/region. This way we do not need one huge GraphHopper instance for the whole world (it needs over 16 GB of memory), but separate small ones.

After playing a bit with the Docker project, it seemed to be a perfect fit for automating this configuration. I wanted to have a simple image with Java and the GraphHopper runtime, with the data directory mounted from the host. You can find my Dockerfile in the github repo.

To run the GraphHopper web app you need to provide map data by passing a .pbf file in the program args. In my Docker image, the file with the .pbf extension is picked up from the directory mounted from the host. So the first step is creating a directory on your host and downloading a .pbf file for a particular region.

$ mkdir -p ~/private/graphhopper-data/berlin/
$ cd ~/private/graphhopper-data/berlin/
$ wget http://download.geofabrik.de/europe/germany/berlin-latest.osm.pbf

Take a look at config.properties. You can customize your GraphHopper instance here (for instance, change routing from the default car to bike). When you are ready, build the image and run a new Docker container:

$ sudo docker build -t sogorkis/graphhopper .
$ sudo docker run \
      -d \
      --name=graphhopper-berlin \
      -v /home/stanislaw/private/graphhopper-data/berlin/:/data \
      -p 8990:8989 \
      sogorkis/graphhopper \
      /graphhopper/start.sh
$ sudo docker logs -f graphhopper-berlin
...
2014-10-04 11:21:30,110 [main] INFO  graphhopper.http.DefaultModule - loaded graph at:/data/berlin-latest.osm-gh, source:/data/berlin-latest.osm.pbf, acceptWay:car, class:LevelGraphStorage
2014-10-04 11:21:30,611 [main] INFO  graphhopper.http.GHServer - Started server at HTTP 8989

Please note that the first run might take some time as GraphHopper needs to process the .pbf file and create additional work files. Tail the logs until you see that the server has started, then open the following URL in your browser:

http://localhost:8990/?point=52.516534%2C13.381568&point=52.519877%2C13.40493&locale=en-US
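
The same instance can of course be queried from backend code instead of the browser. A minimal sketch (the JSON endpoint path and parameter names differ between GraphHopper versions, so treat /route and vehicle=car below as assumptions to verify against the version you deploy):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class GraphHopperClient {

    public static void main(String[] args) throws Exception {
        // Same two Berlin points as in the browser URL above.
        URL url = new URL("http://localhost:8990/route"
                + "?point=52.516534,13.381568&point=52.519877,13.40493&vehicle=car");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");

        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
            StringBuilder response = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                response.append(line);
            }
            System.out.println(response); // raw JSON with distance, time and route points
        } finally {
            connection.disconnect();
        }
    }
}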



The last thing which was needed is the ability to override JAVA_OPTS, especially memory settings. To do that, just create an env.sh file in the mounted directory. For instance, to reduce the max heap size for the graphhopper-berlin container, create the following file:

$ cat /home/stanislaw/private/graphhopper-data/berlin/env.sh
JAVA_OPTS="-Xms128m -Xmx128m -XX:MaxPermSize=64m -Djava.net.preferIPv4Stack=true -server -Djava.awt.headless=true -Xconcurrentio"

Setting up Ubuntu 12.04 for Java development

In this post I will describe the steps I followed after a fresh install of the new Ubuntu 12.04 LTS. Linux is a great environment for Java development and I have been using it for several years now. In my last job I had to maintain projects running on different versions of Java. This instruction gives some hints on how to deal with this situation effectively. Moreover, I describe how to make the Eclipse GUI more compact using GTK configuration files.

JDK

I prefer installing Java manually. As I mentioned before, I need to have several versions of Java, including 1.5, 1.6 and 1.7. The first step is to download the required files from the Oracle site:
http://www.oracle.com/technetwork/java/javase/downloads

Choose Linux files for your OS architecture (i586 for the 32-bit version of Ubuntu and x64 for the 64-bit version). You should download files with the .bin extension (for the 1.5 and 1.6 JDKs) or .tar.gz (1.7 JDK). After downloading, you should have the following files in the Downloads directory:
stanislaw@latitude:~/Downloads$ ls -l
total 212888
-rw-rw-r-- 1 stanislaw stanislaw 49760685 Jun  8 14:42 jdk-1_5_0_22-linux-i586.bin
-rw-rw-r-- 1 stanislaw stanislaw 85292206 Jun  8 14:43 jdk-6u31-linux-i586.bin
-rw-rw-r-- 1 stanislaw stanislaw 82927766 Jun  8 14:37 jdk-7u4-linux-i586.tar.gz
We will install all JDKs in the /opt/java directory. First create the directory:
sudo mkdir /opt/java
Now extract the JDK 7 file and simply move it to the /opt/java directory:
stanislaw@latitude:~/Downloads$ tar zxvf jdk-7u4-linux-i586.tar.gz
stanislaw@latitude:~/Downloads$ sudo mv jdk1.7.0_04/ /opt/java
To install JDK 1.5 and 1.6, first make the .bin files executable with the chmod command:
stanislaw@latitude:~/Downloads$ chmod +x jdk-1_5_0_22-linux-i586.bin jdk-6u31-linux-i586.bin
Now run these executables (you will be shown a license agreement; press q and confirm it by typing yes):
stanislaw@latitude:~/Downloads$ ./jdk-1_5_0_22-linux-i586.bin
...
stanislaw@latitude:~/Downloads$ ./jdk-6u31-linux-i586.bin
...
Right now the 1.5 and 1.6 JDKs are in the Downloads directory. Move them to /opt/java:
stanislaw@latitude:~/Downloads$ sudo mv jdk1.5.0_22/ /opt/java/
stanislaw@latitude:~/Downloads$ sudo mv jdk1.6.0_31/ /opt/java/
Create symbolic links in the /opt/java directory. This will reduce future updates to changing a particular symbolic link.
stanislaw@latitude:~/Downloads$ cd /opt/java/
stanislaw@latitude:/opt/java$ sudo ln -s jdk1.5.0_22 jdk-1.5
stanislaw@latitude:/opt/java$ sudo ln -s jdk1.6.0_31/ jdk-1.6
stanislaw@latitude:/opt/java$ sudo ln -s jdk1.7.0_04/ jdk-1.7
Now it is time to set up java, javac and the Java browser plugin. This is best done with the update-alternatives utility. More information about this setup can be found at https://sites.google.com/site/easylinuxtipsproject/java. Please note the usage of paths to the symbolic links created in the previous step.
stanislaw@latitude:/opt/java$ sudo update-alternatives --install "/usr/bin/java" "java" "/opt/java/jdk-1.5/bin/java" 2
stanislaw@latitude:/opt/java$ sudo update-alternatives --install "/usr/bin/java" "java" "/opt/java/jdk-1.6/bin/java" 1
stanislaw@latitude:/opt/java$ sudo update-alternatives --install "/usr/bin/java" "java" "/opt/java/jdk-1.7/bin/java" 3

stanislaw@latitude:/opt/java$ sudo update-alternatives --install "/usr/bin/javac" "javac" "/opt/java/jdk-1.7/bin/javac" 3
update-alternatives: using /opt/java/jdk-1.7/bin/javac to provide /usr/bin/javac (javac) in auto mode.
stanislaw@latitude:/opt/java$ sudo update-alternatives --install "/usr/bin/javac" "javac" "/opt/java/jdk-1.6/bin/javac" 2
stanislaw@latitude:/opt/java$ sudo update-alternatives --install "/usr/bin/javac" "javac" "/opt/java/jdk-1.5/bin/javac" 1

sudo update-alternatives --install "/usr/lib/mozilla/plugins/libjavaplugin.so" "mozilla-javaplugin.so" "/opt/java/jdk-1.6/jre/lib//i386/libnpjp2.so" 1
The last step is setting the JAVA_HOME environment variable in .bashrc. Simply add this line at the end of the file to keep Java 1.6 as the default. Please note that we do not need to set the PATH variable, because we used the update-alternatives utility to set up the /usr/bin/java symbolic link. When I need to work with a different version of Java, I simply override the default JAVA_HOME value for the specific application, e.g. JBoss AS.
export JAVA_HOME=/opt/java/jdk-1.6

Eclipse

Download the Linux version, extract it, and move it to the /opt directory:
stanislaw@latitude:~/Downloads$ tar zxvf eclipse-jee-indigo-SR2-linux-gtk.tar.gz
stanislaw@latitude:~/Downloads$ sudo mv eclipse /opt/eclipse
The application should be ready to run with the default executable /opt/eclipse/eclipse. In a GTK environment the spacing between GUI components seems to be too big. This can hopefully be fixed with a customized GTK configuration based on this blog post. One thing I have added is tooltip background and foreground configuration, to make tooltips visible (see bug report).
stanislaw@latitude:/opt/eclipse$ cat eclipse-gtkrc
style "gtkcompact" {
font_name="Sans 9"
GtkButton::default_border={0,0,0,0}
GtkButton::default_outside_border={0,0,0,0}
GtkButtonBox::child_min_width=0
GtkButtonBox::child_min_height=0
GtkButtonBox::child_internal_pad_x=0
GtkButtonBox::child_internal_pad_y=0
GtkMenu::vertical-padding=1
GtkMenuBar::internal_padding=0
GtkMenuItem::horizontal_padding=4
GtkToolbar::internal-padding=0
GtkToolbar::space-size=0
GtkOptionMenu::indicator_size=0
GtkOptionMenu::indicator_spacing=0
GtkPaned::handle_size=4
GtkRange::trough_border=0
GtkRange::stepper_spacing=0
GtkScale::value_spacing=0
GtkScrolledWindow::scrollbar_spacing=0
GtkExpander::expander_size=10
GtkExpander::expander_spacing=0
GtkTreeView::vertical-separator=0
GtkTreeView::horizontal-separator=0
GtkTreeView::expander-size=8
GtkTreeView::fixed-height-mode=TRUE
GtkWidget::focus_padding=0
}
class "GtkWidget" style "gtkcompact"

style "gtkcompactextra" {
xthickness=0
ythickness=0
}
class "GtkButton" style "gtkcompactextra"
class "GtkToolbar" style "gtkcompactextra"
class "GtkPaned" style "gtkcompactextra"

style "tooltip" { bg[NORMAL] = "#FFFFFF" fg[NORMAL] = "#000000" }
widget "gtk-tooltips" style "tooltip"
In order to override the GTK configuration for Eclipse only, it is most convenient to prepare an additional script, eclipse.sh, with the GTK2_RC_FILES environment variable set.
#!/bin/sh

export GTK2_RC_FILES=/opt/eclipse/eclipse-gtkrc

/opt/eclipse/eclipse
The last step is to add a launch icon. This can be done in the several ways described here. I prefer the method with manual creation of a .desktop file. It can be easily used to add a launch icon for any other application installed this way.
stanislaw@latitude:~$ cat /usr/share/applications/eclipse.desktop 
[Desktop Entry]
Name=Eclipse
Exec=/opt/eclipse/eclipse.sh
Icon=/opt/eclipse/icon.xpm
Type=Application
StartupNotify=true
Categories=Development

Other

Installing other Java development software like Maven or Ant is simply done by extracting the downloaded archives (I personally always use the /opt directory) and setting specific environment variables like PATH and ANT_HOME. During my fresh install I additionally installed the following packages:
# package libsvn-java is required by subclipse plugin
sudo apt-get install libsvn-java
# version control
sudo apt-get install git subversion
# my favorite terminal editor
sudo apt-get install vim
# MS fonts
sudo apt-get install ttf-mscorefonts-installer
# testing frontend configuration for production environment with apache
sudo apt-get install apache2
# ... and many more

Hibernate Second Level Cache, JBoss Cache and JBoss AS 5.1 in default configuration

I've recently tried to set up JBoss Cache as the Hibernate second level cache provider in a Seam 2 application deployed in the default configuration of JBoss AS 5.1. I also wanted to use the Seam CacheProvider functionality. I had already used JBoss Cache with success in a clustered environment, which used a copy of the all configuration.

However, this time the copy-paste technique didn't work well. In the all configuration I just had to add the following properties to persistence.xml:
<property name="hibernate.cache.region.factory_class" value="org.hibernate.cache.jbc2.JndiMultiplexedJBossCacheRegionFactory" />
<property name="hibernate.cache.region.jbc2.cachefactory" value="java:CacheManager" />
<property name="hibernate.cache.region.jbc2.cfg.entity" value="mvcc-entity" />
<property name="hibernate.cache.region.jbc2.cfg.collection" value="mvcc-entity" />

If we do this in the default configuration, we then get the following exception:
java.lang.ClassNotFoundException: org.hibernate.cache.jbc2.JndiMultiplexedJBossCacheRegionFactory.

It turns out that in the default configuration we have to copy three jars from the all configuration's lib directory: hibernate-jbosscache2.jar, jbosscache-core.jar and jgroups.jar. If you are working with a project generated by seam-gen, you can add these jars to deployed-jars-ear.list. There is also no instance of org.jboss.ha.cachemanager.CacheManager bound to the java:CacheManager JNDI name. There are two ways of solving this problem. The first is to copy the necessary pieces from the all configuration (i.e. the cluster directory where the CacheManager is initialized). The second approach is to use another region_factory in the Hibernate config. I chose the second option, because the application I'm working on will be deployed on a single server. Looking into the documentation leaves two options for a non-JNDI region_factory:
  • org.hibernate.cache.jbc2.SharedJBossCacheRegionFactory - one cache instance for entities, collections, queries, timestamps
  • org.hibernate.cache.jbc2.MultiplexedJBossCacheRegionFactory - separate cache instances for entities, collections, queries, timestamps

I used SharedJBossCacheRegionFactory for simplicity. We have to provide the path to a cache configuration file (the hibernate.cache.region.jbc2.cfg.shared property). Samples of such files are available in the JBoss Cache distribution. Working persistence.xml properties in the default configuration:
<property name="hibernate.cache.use_second_level_cache" value="true"/>
<property name="hibernate.cache.use_query_cache" value="true"/>
<property name="hibernate.cache.region_prefix" value="hibernate-cache"/>
<property name="hibernate.cache.region.factory_class" value="org.hibernate.cache.jbc2.SharedJBossCacheRegionFactory" />
<property name="hibernate.cache.region.jbc2.cfg.shared" value="META-INF/hibernate-jboss-cache.xml"/>
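
Note that enabling the second level cache does not cache any entities by itself; each entity still has to be marked as cacheable. A sketch (the entity is made up for illustration; the concurrency strategy has to match your cache setup, with TRANSACTIONAL being the usual choice for JBoss Cache):

import javax.persistence.Entity;
import javax.persistence.Id;

import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

@Entity
@Cache(usage = CacheConcurrencyStrategy.TRANSACTIONAL)
public class SimpleDictionary {

    @Id
    private Long id;

    private String name;
}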

The last thing is setting up the cache provider for Seam in components.xml:
<cache:jboss-cache2-provider name="defaultCache" auto-create="true" configuration="META-INF/dcache.xml" />

Sample usage of the cache provider in such a configuration:
@In(value = "defaultCache")
protected CacheProvider cacheProvider;

private Object getSimpleDictionariesByType(SimpleDictionaryType type) {
    String cacheKey = "SimpleDictionaryType." + type;
    Object value = cacheProvider.get("dictionaryCache", cacheKey);

    if (value == null) {
        ...