The following Quartz cron expression triggers every minute.
0 * * * * ?
Most of the time, we actually want Quartz to trigger every minute, but we do not care whether it triggers at second zero. Triggering at second zero can even be detrimental. We recently had a problem on a production system of 20 servers: they all had to execute a short but database-intensive job every minute, and so every minute, at second zero, they assaulted the database with similar requests. The database did not like it very much.
To alleviate the problem, we first spread out the jobs so that they were scheduled 3 seconds apart and therefore used the whole minute to execute. We had two problems with this approach. Firstly, it works well only if the number of servers is known in advance. Secondly, it requires a different configuration for each server; if you have dozens of jobs, this can become challenging.
In Jenkins, when scheduling a build job, it is possible to use the symbol H instead of a fixed number in a cron expression to let the system compute the number from a hash value of the name of the job. Applied to Quartz, an expression such as
H * * * * ?
would execute every minute at a second determined by the name of the job.
This is a very good idea for our server situation: the hash will spread the scheduling of our servers without having to know their number in advance and the configuration is the same for each server.
Let’s call these kinds of cron expressions ‘fuzzy cron expressions’ and let’s try to (re)implement this idea. We actually need three kinds of fuzzy cron expressions. (To avoid confusion with hours, we use the hash symbol # instead of H.)
#, which will be replaced by a number within the range of the corresponding expression part; for instance, # for seconds will be replaced by a number between 0 and 59.
#\[(?<from>\d*)-(?<to>\d*)\], which will restrict the range of the corresponding expression part to the interval given by the endpoints from and to; for instance, #[10-40] for seconds will be replaced by a number between 10 and 40.
#/(?<divisor>\d*), which will divide the full range of the corresponding expression part by the given divisor; for instance, #/10 for seconds will be replaced by i-59/10, where i is a number between 0 and 9. This expression will trigger every ten seconds starting with second i, e.g. if i is 4, we will have the following trigger sequence: 4, 14, 24, 34, 44, 54.
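The trigger sequence of the third form can be checked with a couple of lines of Clojure. This is only an illustration: trigger-seconds is a hypothetical helper name, and it merely lists the seconds at which a seconds part of the form i-59/divisor would fire.

```clojure
;; Hypothetical helper, for illustration only:
;; the seconds at which the cron part "i-59/divisor" fires.
(defn trigger-seconds [i divisor]
  (range i 60 divisor))

(trigger-seconds 4 10) ;; => (4 14 24 34 44 54)
```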
To simplify the presentation, we will just transform a fuzzy expression for the seconds: the range of seconds is 0-59. Assume we have a function get-in-range that takes two integers as arguments and returns an integer in the range defined by those arguments (lower bound inclusive, upper bound exclusive).
In Clojure, the first fuzzy expression # can be transformed into
(str (get-in-range 0 60))
The second fuzzy expression #[from-to] can be transformed into
(str (get-in-range from (inc to)))
Finally, the third fuzzy expression #/divisor can be transformed into
(format "%d-59/%d" (get-in-range 0 divisor) divisor)
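The three transformations can be combined into a single function for the seconds part. The following is a minimal sketch, not the original implementation: transform-seconds is a hypothetical name, get-in-range is passed in as an argument, and positional regex groups are used instead of named ones because re-matches returns the groups by position. The function lowest is a stand-in for get-in-range that always picks the lower bound, so that the results are predictable.

```clojure
;; Hypothetical helper: turns a fuzzy seconds part into a plain Quartz seconds part.
;; get-in-range is any function of [from to-exclusive] returning an integer.
(defn transform-seconds [get-in-range part]
  (if-let [[_ from to] (re-matches #"#\[(\d+)-(\d+)\]" part)]
    ;; #[from-to]: pick a number between from and to, both inclusive
    (str (get-in-range (Long/parseLong from) (inc (Long/parseLong to))))
    (if-let [[_ divisor] (re-matches #"#/(\d+)" part)]
      ;; #/divisor: pick the start offset, then fire every divisor seconds
      (let [d (Long/parseLong divisor)]
        (format "%d-59/%d" (get-in-range 0 d) d))
      (if (= part "#")
        ;; #: pick any second of the minute
        (str (get-in-range 0 60))
        ;; anything else is not fuzzy and is returned unchanged
        part))))

;; Stand-in for get-in-range that always returns the lower bound.
(defn lowest [from _to-exclusive] from)

(transform-seconds lowest "#")        ;; => "0"
(transform-seconds lowest "#[10-40]") ;; => "10"
(transform-seconds lowest "#/10")     ;; => "0-59/10"
(transform-seconds lowest "30")       ;; => "30"
```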
Now that we can transform the fuzzy expressions into cron expressions, we need to implement the get-in-range function. Actually, it is easier to write a get-in-range function generator.
(defn get-in-range-builder [text]
  (let [hash-code (hash text)]
    (fn [from to-exclusive]
      (let [bound (- to-exclusive from)]
        ;; mod is non-negative for a positive bound, even if hash-code is negative
        (+ (mod hash-code bound) from)))))
To solve our server situation, we use the DNS name of the server together with the name of the scheduled job as argument to the get-in-range-builder function. In the above example, we use the standard Java String hash method. It can be useful to experiment with other hashing functions that would spread better. The Hashing class of the Google Guava library gives easy access to several hashing algorithms. For our purposes, we actually settled on the sha256 algorithm, which gave better spreading than the other algorithms.
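As an illustration of a SHA-256 based variant, here is a minimal sketch. It is not the code we used: to stay free of external dependencies it relies on the JDK's java.security.MessageDigest instead of Guava's Hashing class. The name sha256-in-range-builder and the server and job names are made up for the example.

```clojure
(import 'java.security.MessageDigest)

;; Hypothetical variant of get-in-range-builder based on a SHA-256 digest.
(defn sha256-in-range-builder [^String text]
  (let [digest (.digest (MessageDigest/getInstance "SHA-256")
                        (.getBytes text "UTF-8"))
        ;; Read the 32 digest bytes as one non-negative number.
        n (BigInteger. 1 digest)]
    (fn [from to-exclusive]
      (let [bound (- to-exclusive from)]
        (+ (.intValue (.mod n (BigInteger/valueOf bound))) from)))))

;; Made-up DNS name and job name, combined into one key.
(def get-in-range
  (sha256-in-range-builder "server-01.example.com/cleanup-job"))

(get-in-range 0 60)  ; a second between 0 and 59, identical on every call
(get-in-range 10 41) ; a second between 10 and 40
```

Because the key contains both the server name and the job name, two jobs on the same server, or the same job on two servers, will usually end up on different seconds, although collisions remain possible.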
The transformation of a complete fuzzy cron expression consists of the same kind of transformation for the other parts of the cron expression (minutes and hours in the examples). You can find the complete Clojure and Java examples at the following links.
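To give an idea of what the complete transformation might look like, here is a minimal sketch that is independent of the linked examples. It assumes that only the first three Quartz fields (seconds, minutes, hours) may be fuzzy and that the remaining fields are copied unchanged. The names field-bounds, transform-part and transform-expression are hypothetical.

```clojure
(require '[clojure.string :as string])

;; Exclusive upper bounds of the first three Quartz fields: seconds, minutes, hours.
(def field-bounds [60 60 24])

;; Transforms one fuzzy part whose full range is 0 to (dec bound).
(defn transform-part [get-in-range bound part]
  (if-let [[_ from to] (re-matches #"#\[(\d+)-(\d+)\]" part)]
    (str (get-in-range (Long/parseLong from) (inc (Long/parseLong to))))
    (if-let [[_ divisor] (re-matches #"#/(\d+)" part)]
      (let [d (Long/parseLong divisor)]
        (format "%d-%d/%d" (get-in-range 0 d) (dec bound) d))
      (if (= part "#")
        (str (get-in-range 0 bound))
        part))))

;; Transforms the first three fields and keeps the others as they are.
(defn transform-expression [get-in-range expression]
  (let [parts (string/split (string/trim expression) #"\s+")
        fuzzy (map (partial transform-part get-in-range) field-bounds parts)]
    (string/join " " (concat fuzzy (drop (count field-bounds) parts)))))

;; With a stand-in get-in-range that always returns the lower bound:
(transform-expression (fn [from _] from) "# #/15 #[2-4] * * ?")
;; => "0 0-59/15 2 * * ?"
```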