I was looking to learn how to write a non-blocking IO server in Java, but couldn't find one that suited my needs online. I found this example, but it didn't handle my situation. There is also Apache MINA, but it was a little complex for my simple needs. So armed with these examples and a couple of tutorials (here and here), I created my own.
Results tagged “Technical” from Cordinc Blog
Some time ago I wrote a post describing a Java Multi-channel Asynchronous Throttler I had written. At the time, I stated it would preserve the order of calls, but as Asa commented on that blog post, this was not always the case. Here is a new version that does preserve order, and passes Asa's test. As part of this work I also extracted common code into new classes and created a ChannelThrottler interface. It works by placing incoming tasks on an internal queue. All the code detailed in this post (and the other throttler post) is available here.
In another of my recent series of generic utility Java classes, I present a multi-channel asynchronous throttler. My project connected to an external webservice which imposed draconian restrictions if usage went above a certain level - defined as more than X service calls in Y seconds. I was determined to stay under this level. Furthermore, some types of service calls were more expensive than others and had their own limits in addition to the overall limit. That is, for certain types of service call I had to stay under two limits - a call-specific limit and the overall limit that applied to all calls. I refer to this as multi-channel throttling. Also, I wanted the throttler to be asynchronous, that is, I did not want to stop processing while waiting for the webservice to respond to my call. Looking on the web I found a number of throttlers, but none that matched my multi-channel, asynchronous requirements.
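To make the "X calls in Y seconds" idea concrete, here is a minimal sketch of how start times could be planned against an overall limit plus an optional per-channel limit. It is not the throttler from the post: the names ThrottlePlanner, Limit and reserve are hypothetical, time is passed in as plain milliseconds, and each limit is assumed to allow at least one call.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: plans call start times so that no limit is exceeded. */
final class ThrottlePlanner {

    /** At most maxCalls calls in any window of windowMillis (maxCalls must be >= 1). */
    static final class Limit {
        private final int maxCalls;
        private final long windowMillis;
        // Start times of the most recent maxCalls calls, oldest first.
        private final Deque<Long> recent = new ArrayDeque<>();

        Limit(int maxCalls, long windowMillis) {
            this.maxCalls = maxCalls;
            this.windowMillis = windowMillis;
        }

        /** Earliest time, no sooner than 'from', at which one more call fits. */
        long earliest(long from) {
            if (recent.size() < maxCalls) {
                return from;
            }
            return Math.max(from, recent.peekFirst() + windowMillis);
        }

        void record(long time) {
            recent.addLast(time);
            if (recent.size() > maxCalls) {
                recent.removeFirst();
            }
        }
    }

    private final Limit overall;
    private final Map<String, Limit> channels = new HashMap<>();
    private long lastSlot = Long.MIN_VALUE;

    ThrottlePlanner(Limit overall) {
        this.overall = overall;
    }

    void addChannel(String name, Limit limit) {
        channels.put(name, limit);
    }

    /** Reserves and returns the start time for a call submitted at 'now'. */
    synchronized long reserve(String channel, long now) {
        // Never start before an earlier submission, so call order is preserved.
        long slot = Math.max(now, lastSlot);
        Limit specific = channels.get(channel);
        slot = overall.earliest(slot);
        if (specific != null) {
            slot = specific.earliest(slot);
        }
        overall.record(slot);
        if (specific != null) {
            specific.record(slot);
        }
        lastSlot = slot;
        return slot;
    }
}
```

To run calls asynchronously, the delay between now and the reserved slot could be handed to ScheduledExecutorService.schedule, so the caller never blocks while a call waits its turn.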
Java provides a number of ways to run code at a scheduled future time, the implementations of ScheduledExecutorService generally being the best option at the moment. ScheduledExecutorService also defines methods to run code at periodic intervals (scheduleAtFixedRate & scheduleWithFixedDelay). However, what if you have classes that need to be notified at regular discrete intervals and need to know which interval they are in? What is required is a timer that, with a set periodicity, fires an event on any listeners and passes the number of times the period has elapsed. This breaks up continuous "realtime" into discrete blocks or quanta of time. For example, if the period is 10 seconds, at the start the time quantum is 0 until 10 seconds have elapsed and an event is fired denoting the start of the first time quantum. Ten seconds later another event fires as the 2nd time quantum starts, and so on.
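A timer like this can be sketched on top of scheduleAtFixedRate. The code below is only an illustration under my own assumptions, not the class from the post: QuantumTimer and QuantumListener are hypothetical names, and the quantum counter starts at 0 and is incremented just before each event fires.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: fires an event at the start of each time quantum. */
final class QuantumTimer {

    /** Receives the number of the quantum that has just started. */
    interface QuantumListener {
        void quantumStarted(long quantum);
    }

    private final List<QuantumListener> listeners = new CopyOnWriteArrayList<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicLong quantum = new AtomicLong(0);

    void addListener(QuantumListener listener) {
        listeners.add(listener);
    }

    /** Starts counting: the first event fires after one full period. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(this::tick, period, period, unit);
    }

    long currentQuantum() {
        return quantum.get();
    }

    void stop() {
        scheduler.shutdownNow();
    }

    private void tick() {
        long current = quantum.incrementAndGet();
        for (QuantumListener listener : listeners) {
            try {
                listener.quantumStarted(current);
            } catch (RuntimeException e) {
                // An exception escaping the task would cancel all future runs,
                // so a misbehaving listener is ignored here.
            }
        }
    }
}
```

A listener registered with addListener then sees 1, 2, 3 and so on, one value per elapsed period.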
This is a fairly simple bit of Java code. Any experienced Java developer has probably already worked out a good solution, but for me it has come up a couple of times in the last few months so I thought I'd write it down.
This is a quick post about a coding problem that took longer than expected to solve. I have put an example solution here as a reference for myself or others, hopefully speeding up the task. I needed to create a dialog to connect to a local or remote JMX process. Essentially this would be very similar to the JConsole "New Connection" dialog. Initially I thought this would just be a standard component, but instead it took some trial and error (as well as digging through Sun documentation) to get the required functionality working.
Recently I wanted to have a clickable button in a table. Searching on Google for JButton in JTable I found a couple of suggestions, most notably this DevX article and this Esus article. There was also a StackOverflow question that just referenced other solutions. None really satisfied me. So borrowing their ideas I created my own solution.
Ah, the Singleton pattern. It is used to create a single, globally accessible instance of a class, preventing any further instantiations (so only one object of the class can ever exist). When I first learnt about Singletons soon after leaving university, I thought they were wonderful and used them often. Now I think that they are often a very bad idea. Not because of security concerns, as one person once suggested to me, but because I don't want static methods/classes hanging round (increasing global state) or I later realise more than one is needed!
Once when creating a consumer website for a bank I needed access to various pieces of bank information at numerous points in the code. "There is only one bank", I reasoned and made the class containing this information a singleton. Then the bank acquired another two banks so my design was broken and refactoring had to take place right before launch. However, despite learning repeatedly from experience, I still find myself creating more. I think this is due to laziness on my part. If I'm short of time or feel that it is not particularly important, I fall back on habit. Later, while refactoring I realise a better result could have been achieved with some thought. For instance, if the global access of singletons is a problem, perhaps using an IoC container (like Spring) to wire everything up could help.
I hereby resolve not to use Singletons without thinking very hard first about whether there is a better way.
What is the accepted way to test generated source code? A recent work project required me to generate some Java source code from small definition files as part of a larger framework. The plan was for other project teams to take the framework and use the generator to create source which would become part of their project. I wanted to write unit tests for the source code. The problem was how to write a test for code that didn't exist at the time the test was written because it hadn't been generated yet! I came up with a few options:
- Do a text comparison between generated code and the previously created known text output of the generation process. If they are exactly the same, the test passes. This tests that the generation process is correct but doesn't test the logic of the generated code. If the generated code has syntactic errors, compilers or IDEs will pick this up quite fast and the developers in the other project will complain quickly, but subtle logic errors could go unchecked for ages.
- Write a test that generates a class file from known inputs, compiles it programmatically with the Java Compiler API, loads the class into the JVM and runs tests against it. This would be better than the previous idea, as it tests the functionality of the result. This was my first choice solution. However, getting the Java Compiler API to work can be a bit hard. There was a paucity of documentation on this and in the end I couldn't get it to work within time constraints.
- Generate the test at the same time as the generated code and pass them both to the other project. The other project would then test the generated code as if it was their code. This felt wrong to me. No other generation systems do this - it is like passing the buck. If there is an error in the generated code then the generator needs to be fixed, not the generated class!
- A 2-step process - test the functionality and generation process separately. First, create a class that should be the same as a generated class given known inputs (probably by generating it!) and write tests that work against that class. Thus the problem of testing not yet existing code is solved by using some code generated earlier. This tests the generated code's functionality as long as the generating process doesn't change. So the second part is to test the generation process hasn't changed using the first test in this list. Add another test to generate the code and check that it is exactly the same text as the class the rest of the test is written against. If the generation output changes then you have to modify the pre-generated class, and thus change the functionality test if required. Easy.
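For reference, the second option (compiling with the Java Compiler API, then loading and calling the result) can be sketched roughly as below. This is not code from the project: GeneratedCodeCheck and compileAndInvoke are hypothetical names, the generated class is assumed to be public, in the default package, with a public static no-argument method, and a full JDK (not just a JRE) is assumed to be available.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

/** Hypothetical sketch: compile generated source, load it and call a method. */
final class GeneratedCodeCheck {

    /**
     * Compiles 'source' as class 'className' in a temporary directory, then
     * invokes the named public static no-argument method and returns its result.
     */
    static Object compileAndInvoke(String className, String source, String staticMethod) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler found: a JDK is required");
        }
        try {
            Path dir = Files.createTempDirectory("generated");
            Path file = dir.resolve(className + ".java");
            Files.write(file, source.getBytes(StandardCharsets.UTF_8));

            // A return value of 0 means the compilation succeeded.
            int status = compiler.run(null, null, null, "-d", dir.toString(), file.toString());
            if (status != 0) {
                throw new IllegalStateException("Generated source did not compile");
            }

            // Load the freshly compiled class from the temporary directory.
            try (URLClassLoader loader =
                    URLClassLoader.newInstance(new URL[] {dir.toUri().toURL()})) {
                Class<?> type = loader.loadClass(className);
                return type.getMethod(staticMethod).invoke(null);
            }
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("Could not test generated class " + className, e);
        }
    }
}
```

A unit test would pass the generator's output as the source string and assert on the returned value, which checks the behaviour of the generated code rather than its text.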
I have been thinking that software development is all about models and events. By models I mean domain models, and by events I'm referring to an event-driven architecture. Not quite the models and bottles of Investment Banking.
The domain model is where the application's data lives. It is the model part of the MVC pattern. For example, in a system that manipulates songs, a song would be part of the domain model, together with the song's name, artist, album etc (in Object-oriented systems there would be a song class containing these attributes). The more time I spend coding, the more I find that the future course of a project is directly related to how its domain model was conceived. Using MVC, changing the model later can involve a great deal of work if the model touches everything else. Thus usually some time is spent on working out the model and refactoring it if necessary. Indeed, I remember once describing a less than satisfying job as drawing pictures all day, meaning UML mainly for the domain (I would have preferred to be coding).
I have a few tips for domain models. Follow the ideal of MVC, that is keep view and controller logic out of the model. Also, try to have fat models and skinny controllers (this is a Ruby On Rails idea, but I find it works well in general). Keep model classes immutable if possible. Lastly, write your own containers (containing and hiding actual SDK-based containers) for easy aggregation functions (and parallelising). None of this is particularly controversial, few people I have worked with would disagree.
What I have recently found surprising is that people do disagree with using an event based architecture where I feel it would work well. Event-driven systems are ones where changes to domain state are packaged as events and passed to listeners of those events through some form of callback. This helps promote a loosely-coupled design, something else I would strongly encourage. Also recommended is keeping the events at the lowest level appropriate in the model (a rule I've broken a few times). For instance, if you have an album which contains songs, a "songAdded" event would go on the album listeners, but a "songUpdated" event should fire on the song's listeners. Other people seem to prefer a system of loops polling for changes (or something similar). I find their code far more complicated. Event-driven systems are simple to write, simple to understand and you only need to handle the events you are interested in. I would be interested to hear what other experienced programmers have found works.
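The album and song example above might look something like the following sketch. The class and listener names are my own illustration, and a real model would carry more attributes and events.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical domain class: fires "songUpdated" on its own listeners. */
final class Song {
    interface Listener {
        void songUpdated(Song song);
    }

    private final List<Listener> listeners = new CopyOnWriteArrayList<>();
    private String name;

    Song(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    void rename(String newName) {
        name = newName;
        for (Listener listener : listeners) {
            listener.songUpdated(this);
        }
    }
}

/** Hypothetical domain class: fires "songAdded" on the album's listeners. */
final class Album {
    interface Listener {
        void songAdded(Album album, Song song);
    }

    private final List<Listener> listeners = new CopyOnWriteArrayList<>();
    private final List<Song> songs = new ArrayList<>();

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    void add(Song song) {
        songs.add(song);
        for (Listener listener : listeners) {
            listener.songAdded(this, song);
        }
    }
}
```

Code that only cares about new songs listens to the album, while code that displays a single song listens to that song, and neither needs a polling loop.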
In a recent coding project I fell foul of Java enumerations. They seem like a very good idea, and indeed they are when used properly. However, there is sometimes a tendency to overuse them - something unfortunately learnt from personal experience.
Initially Java had no inbuilt concept of enumerated types. Developers often found themselves inadequately simulating them by creating a bunch of integer constants. Proper enumerations were added to Java in JDK1.5 with the enum keyword and developers rejoiced (or at least I did). The following example from the Java Language Guide shows how they are used:
    public enum Rank { DEUCE, THREE, FOUR, FIVE, SIX,
        SEVEN, EIGHT, NINE, TEN, JACK, QUEEN, KING, ACE }
    public enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }
Actually, the above is just the basic usage pattern. Java enums can have data and behavior, as shown in the Planet example below (from the same guide). They can also implement abstract methods directly at the definition (see the guide) but they cannot be subclassed.
    public enum Planet {
        MERCURY (3.303e+23, 2.4397e6),
        VENUS   (4.869e+24, 6.0518e6),
        EARTH   (5.976e+24, 6.37814e6),
        MARS    (6.421e+23, 3.3972e6),
        JUPITER (1.9e+27,   7.1492e7),
        SATURN  (5.688e+26, 6.0268e7),
        URANUS  (8.686e+25, 2.5559e7),
        NEPTUNE (1.024e+26, 2.4746e7),
        PLUTO   (1.27e+22,  1.137e6);
        private final double mass;   // in kilograms
        private final double radius; // in meters
        Planet(double mass, double radius) {
            this.mass = mass;
            this.radius = radius;
        }
        public double mass()   { return mass; }
        public double radius() { return radius; }
        // universal gravitational constant (m3 kg-1 s-2)
        public static final double G = 6.67300E-11;
        public double surfaceGravity() {
            return G * mass / (radius * radius);
        }
        public double surfaceWeight(double otherMass) {
            return otherMass * surfaceGravity();
        }
    }
I think the first (Rank & Suit) example is correct usage of enums - essentially treating them as related constant values. In my opinion, the second example is problematic. Here the Planet enum is acting similarly to a class with each instance being like an object, with data and associated behaviour. At the level of the example this may seem acceptable - not much is happening here. However, code rarely stays static. Requirements change and thus the code changes to handle different functionality. It is not hard to imagine more data being required for each Planet. Object-oriented classes are designed to help handle this change. Enumerations are not: there is no inheritance, nor is it possible to pass non-static constructor arguments. As a result, as the functionality of an enum becomes more "object-like", the code supporting it becomes increasingly messy.
The general rule is that linked constants should be enumerations, but anything more than that should probably be a class.
I have broken this rule myself and lived to regret it. A project I recently worked on aggregated a number of different financial markets and presented a single interface for downstream clients. This project used quite a few enumerations. For example, there were enums to define the market-defined type of a financial instrument and another for its pricing method. These worked fine: they were mutually exclusive values read from the markets essentially as constants. There was also an enum for the markets themselves. At first this was fine, as each market was treated the same. However, as the system expanded to new markets and extra functionality, differences between markets began to appear. More and more logic began to hang off the market enum. Some was data stored on the market enum. Most was determined by the environment and was set by configuration settings. This logic ended up in various special classes constructed by Factory classes switching on market. Indeed there was quite a bit of logic in the system that switched code path depending on market. The code would definitely have worked better had there been a Market class rather than an enumeration. Refactoring all the code to use a class rather than an enumeration was a big job in an important working system.
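As a rough idea of what a Market class buys you, the sketch below builds each market from configuration at runtime, which an enum constant cannot do. Everything here is hypothetical (the class shape, the property names and the priceDecimals setting) and is not taken from the real system.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical sketch: a market as an ordinary class built from configuration. */
final class Market {
    private final String name;
    private final int priceDecimals;

    private Market(String name, int priceDecimals) {
        this.name = name;
        this.priceDecimals = priceDecimals;
    }

    /** Reads market-specific settings, e.g. "LSE.priceDecimals" (assumed key format). */
    static Market fromConfig(String name, Properties config) {
        int decimals = Integer.parseInt(config.getProperty(name + ".priceDecimals", "2"));
        return new Market(name, decimals);
    }

    String name() {
        return name;
    }

    /** Market-specific behaviour lives on the class instead of in a switch. */
    String formatPrice(double price) {
        return String.format(Locale.ROOT, "%." + priceDecimals + "f", price);
    }
}
```

Differences between markets then become data or subclasses of Market, rather than factories switching on an enum value.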
Recently I have been thinking hard about threading in a Java project of mine, and it definitely requires a great deal of hard thinking. When I started working with Java there weren't any native threads, so it was just easier to make everything single threaded as the performance gain was minimal. However, as time has passed more and more of my code both at work and home has needed to work multi-threaded. As I learn more about Java threading, I realise how wrong I was doing it before - a cycle repeated a few times in the last 12 years. It was only after reading the great book Java Concurrency In Practice a few years ago that I realised that in Java "locking is not just about mutual exclusion; it is also about memory visibility." I can only hope that this is the last such cycle, but realise it's probably not.
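The memory visibility point is easiest to see with a stop flag shared between two threads. In this minimal sketch (StoppableWorker is a name made up for illustration) both the read and the write of the flag go through synchronized methods. Without that, or without declaring the flag volatile, the worker thread would not be guaranteed to ever see the update.

```java
/** Hypothetical sketch: a worker that another thread can ask to stop. */
final class StoppableWorker implements Runnable {
    // Guarded by 'this': both the read and the write happen under the same lock,
    // so a write by one thread is visible to a later read by another.
    private boolean stopRequested;

    synchronized void requestStop() {
        stopRequested = true;
    }

    private synchronized boolean isStopRequested() {
        return stopRequested;
    }

    @Override
    public void run() {
        while (!isStopRequested()) {
            // Placeholder for real work; yield so the loop does not hog a core.
            Thread.yield();
        }
    }
}
```

Here the lock is not protecting a compound operation at all. It is there purely so that the change made in requestStop is published to the thread running the loop.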
Talking to other developers there seems to be a standard progression in Java multi-threaded programming skill:
- Just write single threaded programs
- Write multi-threaded programs by marking everything synchronized
- Use wait() and notifyAll() to build up "thread-safe" collection classes
- Realise your "thread-safe" collection classes aren't particularly thread-safe and try using immutable domain objects with stateless controllers to work around the problem
- Trust in the Java API concurrent utilities, with its collections, locks and executors
- Realise that these classes don't solve all the problems and start thinking hard about all the issues in your code
- Become very scared about Java threading
I think the next step may be using another language that doesn't allow shared state. Scala uses an Actor model to asynchronously pass messages between threads. There is no shared memory so no threading problems. This sounds like a good way to go, but I still work in Java for most of my programming (not many Scala or Erlang jobs around!).
This whole situation reminds me of memory management in C++. There were simple rules to follow to reduce memory leakage, but the problem was hard to completely solve (especially to a newbie programmer) and normally your program still leaked. I remember leaning heavily on Purify. Then Java came along with its automatic memory management and the problem largely disappeared. Developers could focus on the business logic of their code rather than housekeeping. Hopefully a similar thing will happen with Java threading.
For a recent project I had to connect to two different read-only web services. Both were from commercial vendors providing very similar information (which I aggregated). One API was REST based the other used SOAP. The REST interface took me a few hours to write, the SOAP one took days!
For the REST interface I had to send requests to a URL that somewhat indicated what I was attempting to do - eg http://example.site/contracts/123 to get the details of the contract with id 123. An XML document was returned, from which I constructed my internal objects using a SAX parser. In total it took 340 lines of my code to handle the entire web service. Easy!
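To give a flavour of the SAX side, here is a small sketch that turns a contract document into an object. The XML layout (a contract element with an id attribute and name and lastPrice children) and the class names are invented for illustration, since the vendor's real format is not shown here, and the HTTP request itself is left out.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: builds a Contract from an assumed XML layout using SAX. */
final class ContractParser {

    static final class Contract {
        final String id;
        final String name;
        final double lastPrice;

        Contract(String id, String name, double lastPrice) {
            this.id = id;
            this.name = name;
            this.lastPrice = lastPrice;
        }
    }

    private static final class Handler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        String id;
        String name;
        double lastPrice;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            text.setLength(0); // start collecting text for this element
            if ("contract".equals(qName)) {
                id = attrs.getValue("id");
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("name".equals(qName)) {
                name = text.toString().trim();
            } else if ("lastPrice".equals(qName)) {
                lastPrice = Double.parseDouble(text.toString().trim());
            }
        }
    }

    static Contract parse(String xml) {
        Handler handler = new Handler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse contract XML", e);
        }
        return new Contract(handler.id, handler.name, handler.lastPrice);
    }
}
```

In practice the input would be the response stream from the REST URL rather than a string.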
Using the SOAP interface required me to download their WSDL and then construct a model of their domain objects - I used Apache Axis2 (not Apache Axis though, that is an old and unmaintained project that does the same thing - a confusion that cost me some time). Unfortunately the WSDL was wrong and the model didn't generate properly. I had to dive into the 2000 line WSDL file to find and fix the error - and it was near unreadable! Try looking at a commercial WSDL file some time. After some trial and error the model was right and, when surrounded by copious error handling code, requests could be sent and data received from the service. It took 810 lines of my code, plus hundreds more in the generated model. Urgh. At this point one might suggest it was just a problem caused by a dodgy service provider. However, I have since added a third read-only web service to the project. It is also SOAP based and was just as difficult to get running properly - apart from not having to modify the WSDL, since theirs was correct from the start.
I also have experience from the other side, providing a web service. Using Ruby on Rails I added a REST interface to my existing website by just providing XML views of the data. While working at a bank, I was required to add a SOAP interface to an existing legacy system. It took a great deal of fiddling, and trust in our IBM development tools, to get it working after a couple of days.
The experience reminds me of an interview on The Register where Tim O'Reilly suggests that SOAP is a standards first specification created without reference to how it would be used in the real world (because at the time the spec was created there weren't any people using it). SOAP definitely feels like an overengineered solution. The kind of thing I would have loved earlier in my career when suffering from Second System Syndrome. Over time I am becoming increasingly appreciative of simplicity in code.
For me, there is now no comparison. SOAP just doesn't seem to work well in any place I've seen it tried. My first choice for web service API will have to be REST.
Update: The weather_report project and the weather archive is no more. The effort required to maintain it was too great, for more information see here.
Today I have released my first Ruby gem: weather_report, version 0.0.1. It connects to the BBC Backstage weather API and gets weather observations or forecasts for thousands of cities worldwide. Don't be fooled by the low version number, it is usable. However, it can only handle the BBC weather feed, requires pre-knowledge of BBC weather location ids and is metric only (all things I plan to fix).
With weather_report, you can do things like:
    require 'weather_report'
    # 8 is the BBC Backstage weather code for London, UK
    londonWeather = WeatherReport.new(8)
    # to get the current temperature
    londonWeather.observation.temperature
    # to get tomorrow's max temperature
    londonWeather.forecast.for_tomorrow.max_temperature
    # and much more!
To install it:
    sudo gem install weather_report
There is also a website with documentation here.
I would like to say thanks to the authors of newgem and this rubyforge tutorial, both of which greatly aided in weather_report's creation.
A while ago I wrote a blog post asking is there dumb money on Intrade? The short answer was probably not once the implied interest rate was considered, unless a chain of nearly expired "easy money" trades could be constructed. When I heard that the Google App Engine supported Java I was keen to try it out. So I ported my manual Intrade interest rate calculation to Java and had it read the Intrade XML feed to check for any contracts finishing soon with high implied interest rates. Unfortunately, the code fell foul of Google's time limit on reading data from other websites. Rather than see the code go to waste, I decided to clean it up and release it. It is a command line Java app with all the code available under the MIT license. It's probably best to just load it up in Eclipse and run the launch file to see it working. Enjoy!
You can download the whole eclipse project here.
I recently needed to write a Combinatorial Iterator for a project I'm working on. That is, a class which, when given a size and a collection of items (normally a set), returns all the possible unique combinations of elements from the collection of the given size. For instance the combinations of size 3 of the set [1,2,3,4] are [1,2,3], [1,2,4], [1,3,4] and [2,3,4]. There is a nice discussion of this problem on StackOverflow. The number of possible combinations can explode in size quickly: there are 2,598,960 combinations of five cards from a standard pack of 52 cards. Thus I wanted to handle each combination in turn as required, rather than calculate all combinations upfront. Having recently started learning Scala and Ruby, I decided to implement my solution in those languages, plus Java (which I use every day at work).
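For a feel of what such an iterator involves, here is one possible Java sketch. It is not the implementation from the post: the class name and the index-advancing approach are my own choices. It keeps an array of ascending positions into the source list and only works out the next combination when asked.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch: lazily iterates over all combinations of a given size. */
final class CombinationIterator<T> implements Iterator<List<T>> {
    private final List<T> items;
    private final int[] indices; // positions of the next combination, ascending
    private boolean hasNext;

    CombinationIterator(Collection<T> source, int size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must not be negative");
        }
        this.items = new ArrayList<>(source);
        this.indices = new int[size];
        for (int i = 0; i < size; i++) {
            indices[i] = i;
        }
        this.hasNext = size <= items.size();
    }

    @Override
    public boolean hasNext() {
        return hasNext;
    }

    @Override
    public List<T> next() {
        if (!hasNext) {
            throw new NoSuchElementException();
        }
        List<T> result = new ArrayList<>(indices.length);
        for (int index : indices) {
            result.add(items.get(index));
        }
        advance();
        return result;
    }

    /** Moves the index array on to the next combination, if there is one. */
    private void advance() {
        int n = items.size();
        int k = indices.length;
        // Find the rightmost position that is not yet at its maximum value.
        int i = k - 1;
        while (i >= 0 && indices[i] == n - k + i) {
            i--;
        }
        if (i < 0) {
            hasNext = false;
            return;
        }
        indices[i]++;
        // Reset everything to the right to the smallest ascending values.
        for (int j = i + 1; j < k; j++) {
            indices[j] = indices[j - 1] + 1;
        }
    }
}
```

Only one combination is held in memory at a time, so iterating over the five-card hands from a 52-card pack never requires storing all of them.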
I wanted a project to allow attachments to be added to tasks. My requirements were that: users should be able to add multiple attachments to a task up to some maximum limit; there should be validation on the size of the attachments (combined with the number limit this should ensure I don't run out of disk space too fast); and that there should be some security on downloading a task's attachments. Paperclip seems to be the popular Rails attachment plugin at the moment and is still under active development, so I hoped to use that. However, it does not handle multiple attachments for a model. There is a plugin PaperclipPolymorph which adds multiple attachments to Paperclip, but I just couldn't get it to meet my validation and security requirements. In the end I wrote my own solution and this article details it.
A few years ago I had the idea of creating a fairly standard online spaceship combat game (like EVE). The difference to other such games being that the players would not control their ships directly, but instead through code they submitted to the game - a more complicated and extended Robocode or CoreWar. At the time I stopped because I couldn't work out a way of preventing Denial of service attacks in the submitted code. I had hoped to use Java as it had an extensive security model, but I found there was no way of stopping untrusted code from starting too many threads or allocating too much memory.
Fast forward many years, and I thought I'd check out if the situation has improved. A search on Google returned all the same webpages I read 9 years ago - there hasn't been much movement in the area. The Javadocs suggested that the thread issue had been fixed, but no word about memory. I checked Robocode and it seems to just ignore the memory allocation problem. This is because it is not fixed!
WTF! Sun has gone on about the Java Security Model for some time now. Touting the safety of its sandbox. However, if untrusted code can crash the system it is broken - nothing more to say. I don't care about the rest. This has been a known issue for years.
Anyway, rant over. On a more positive side, I used StackOverflow for the first time in researching this issue and it is quite cool. My question is here. The answer states that there is a movement to fix the memory DoS attack, but that it is still in the requirements stage and probably will not be part of Java7.
I run a couple of Rails websites (queuesaurus & past weather forecasts), both on a small 256MB virtual host at Slicehost. As such I don't have a great deal of computing resources available to support two resource hogging apps. If you have visited either of those sites, you will know that the first page view can take seemingly ages, but that after that it speeds up. As I use Phusion Passenger to serve the apps through Apache, I investigated some of the config options and how they affected performance.
I have been asked what time I get the forecasts for a location in my historical weather forecast system. The obvious problem being that if I read all the forecasts at nearly the same time, different cities could be in different days due to differing timezones.
The answer is, I read the forecasts at roughly midday in each of the locations. That is, every hour I read the forecasts for the locations where it is roughly midday. This is done by approximating the various timezones using the location's longitude. This isn't exact - it doesn't account for daylight savings or places like China, which is all one timezone despite being wide enough to be split into many. Generally it is accurate to within a couple of hours - fine for the purpose. The code is below.
    time_band = -Time.now.gmtime.hour + 12
    locs = Location.find_by_sql ["select * from locations where longitude >= ? and longitude < ?", (time_band-1)*15, time_band*15]
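As a worked example of the arithmetic: each hour of time difference corresponds to 15 degrees of longitude, so at 09:00 UTC the time band is 12 - 9 = 3 and the query selects longitudes from 30 up to (but not including) 45 degrees east, where local solar time is between roughly 11:00 and 12:00. The same calculation is restated in Java below, purely as an illustration. MiddayBand and forUtcHour are hypothetical names and are not part of the system.

```java
/** Hypothetical sketch: the longitude band where it is roughly midday. */
final class MiddayBand {

    /**
     * Returns {minLongitude, maxLongitude} in degrees for a UTC hour of 0 to 23.
     * The band covers longitudes >= min and < max, as in the SQL query.
     */
    static int[] forUtcHour(int utcHour) {
        int timeBand = 12 - utcHour;
        return new int[] {(timeBand - 1) * 15, timeBand * 15};
    }
}
```

Running this for every hour from 0 to 23 covers the bands from 165 to 180 degrees east round to 180 to 165 degrees west, so each location is read once a day.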
I am doing some work with the Rails text_field_with_auto_complete method to provide a dropdown list of completed options as a user types, like Google Suggest. However, I needed the item displayed in the dropdown to be different to the item displayed when it is selected. I couldn't find any help online. So at first I thought I would need to override the AJAX code, but when I looked I saw text_field_with_auto_complete already had this feature built in.
Recently for a little project I wanted to get weather reports, and being in London my first thought was to use BBC weather. Doing a little searching, the BBC provides a number of RSS feeds for its data (news, weather, etc), as part of the Backstage project. Details of the weather feeds are here. This post gives some of the tips and tricks I discovered using these feeds.
Update: based on the work I detail here I have created my own Javascript tooltip library based on Prototype.
Over a decade ago Borland had an IDE called Intrabuilder, which allowed you to create websites in a Visual Basic like manner using Javascript on the server. After leaving postgrad study I worked at a now defunct dot.com called Electrolley for 9 months using this tool to create an online grocery store. Serverside javascript was an idea well ahead of its time. Indeed, javascript on the client side in 1997 was crap too. Now javascript is cool and moving forward (backwards?). With my new project, I need to return to Javascript and the last couple of days have been a hard reintroduction.
I am using Prototype and Scriptaculous to make my website as easy to use as possible. I also found some random code on the web to do tooltips - it almost worked! Applying tooltips to Scriptaculous Sortable elements leads to many problems, and I have just finished largely rewriting the library.
- After moving some sortable elements the tooltips of the moved elements would be displayed underneath the unmoved elements. This is because of the way z-index stacking contexts are created in browsers. In moving an element, Scriptaculous clones it and moves the clone, inserting a new element when the element is dropped. This causes the dropped item to be in a different z-index stacking context than the other items. To get around this, the tooltips need to be drawn outside the moved element's context. This can be done by cloning the tooltip and then inserting it into the document at a point outside the sortable elements' container. I have explicitly put in the ids of the elements to which the tooltip is attached; they should probably be passed in (left as an exercise for the reader).
    this._clone = this.tool_tip.cloneNode(true);
    $("container").insertBefore(this._clone, $("main_content"));
When the tooltip needs to be hidden, just destroy the clone:
    Element.remove(this._clone);
    this._clone = null;
- This also means the tooltip needs to be positioned absolutely, so ensure you are using absolute screen x & y coordinates.
- Dragging a sortable element over other elements results in a number of mouse events firing as the elements are redrawn. Many of these events are consumed by Scriptaculous and not passed on. Thus, the tooltip's hide and show functions may be called in any order, resulting in spurious and permanent copies (since it is cloned) of the tooltip all over the screen. To get around this just ensure there is only ever one tooltip on the screen at a time, so at the top of the show method put
    if (this._clone) { hideTooltip(event) }
It seems simple now I've fixed the problems, but it took a couple of days to remember my old Javascript knowledge.
So you have decided to add full-text searching to your Ruby on Rails application. After some investigation (see here and here) Sphinx with the Ultrasphinx plugin looks like the best solution for your needs. Just one problem, despite your deployment environment being Linux, for various reasons your current development environment is Windows. While Sphinx runs on Windows, it explicitly states it prefers to run on other systems and there are a few little gotchas. Here is the step by step guide to getting it working:
1. Download the Windows binaries rather than the source code.
2. Extract the contents of the bin/release folder in the zip file you just downloaded into a location of your choice (eg. C:\Sphinx).
3. Add the location of the folder in step 2 into your system path.
4. Install chronic - gem install chronic
5. Install Ultrasphinx - svn export svn://rubyforge.org/var/svn/fauna/ultrasphinx/trunk vendor/plugins/ultrasphinx
6. Copy the example default.base to the app config folder and edit it. The path variable is the path to the location of the index (created in step 10), not the Sphinx binaries. Also, find the "set_rotate" option and set it to 0. This is not supported on Windows at the time of writing. You probably want to have this set differently in production.
7. Run rake ultrasphinx:configure - you should now have a Sphinx conf file.
8. Add indexing to your model.
9. Edit the Ultrasphinx.rake (in plugins/ultrasphinx/tasks) and where you see '#{Ultrasphinx::CONF_PATH}' remove the apostrophes, otherwise the rake task below won't be able to find the configuration file.
10. Run rake ultrasphinx:index - you should now have an index in the location specified in the conf path.
11. Run rake ultrasphinx:daemon:start - the sphinx daemon should start up.
12. You are done and should be able to search. Good luck.
Update: In case you are not reading the comments, Naomi Novik has pointed out that Sphinx/Ultrasphinx doesn't seem to want to work if the Rails project is in a directory with spaces in the path.
Update: Some people have had problems beyond those I mentioned above. Take a look at this and this if you have extra problems.