Inside Paulo Abrantes' head
Monday, 24. March 2008

Java Programming: Bytecode Injection

As probably all Java developers know, when compiling Java source code the compiler doesn't generate machine code like, for example, a C compiler does when compiling C source code, but rather an intermediate code. That intermediate code, called bytecode, is what is understood by the Java Virtual Machine (JVM).

Knowing how bytecode works isn't strictly necessary for Java developers, but knowing it, or at least knowing how to interact with it, opens up new possibilities in the development process.

Bytecode injection is one of those possibilities. It consists of changing existing bytecode, that is, modifying a compiled Java resource (a class file). This post explains how it can be done by creating a simple injection framework, discusses the advantages and disadvantages of different approaches and, for the code junkies, provides plenty of source code.
While reading, some might think that the examples resemble Aspect Oriented Programming (AOP). That thought isn't wrong, since this subject can be the basis for AOP.

Bytecode Manipulation

As has been said, Java bytecode is intermediate code that is run within the JVM. This code is generated by the Java compiler, which means that the class files produced by a Java compilation are in fact files that contain bytecode. Each opcode is one byte long (some instructions carry additional operand bytes) and the result might resemble assembly language code.
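To make the "class files contain bytecode" point a bit more concrete, here is a minimal sketch that reads the first four bytes of a compiled class. Every class file starts with the magic number 0xCAFEBABE. The class name ClassFileHeader and the helper readMagic are only illustrative, and the sketch assumes java/lang/Object.class can be read as a resource through the system class loader.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeader {

    // Reads the first four bytes (the magic number) of a class file on the class path.
    static int readMagic(String resourcePath) {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new IllegalArgumentException("No such resource: " + resourcePath);
            }
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int magic = readMagic("java/lang/Object.class");
        System.out.println(Integer.toHexString(magic));
    }
}
```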

There are various libraries available for bytecode manipulation, each with its own goals. To mention a few, there are ASM, BCEL and Javassist.

Javassist is a bit different from the other two because, instead of making the developer actually write bytecode, it provides an abstraction that allows writing normal Java code (with some restrictions) and converts it automatically to bytecode. Because of this feature, Javassist is the library used in the examples provided.

Until now it might not be clear why bytecode manipulation is an interesting concept. I find it interesting because it allows the developer to create a separation of concerns. For example, access control and persistence can be added to a working application using bytecode injection.

The Framework

Since the idea is to create a simple general purpose injection framework, it first has to be understood how to design it. In my opinion, there are four main topics regarding this subject:

  1. How to identify an injection
  2. Where can injections be done
  3. What is the injected code context
  4. What code to inject

1. Identifying an Injection

An injection should be identified in a simple and concise way. In my opinion, the best way to achieve this is by using Java annotations. More about annotations and how to write them can be found in Java Programming: Doing your own annotations.

2. Where can injections be done

Since injections are identified by Java annotations, there is a limited set of places where they can go. In this case, only methods and class properties (fields) are allowed to carry injections. A property injection is only a shortcut for defining injections on the property's getter and setter.

Two different annotations can be created, one for methods and another for properties. They must hold at least a marker for the injection call (in this particular case, the class name) and whether the code should be injected before or after the existing method code is executed.

The code of each annotation is presented below.

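InjectMethodCall source:

import static java.lang.annotation.ElementType.METHOD;

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target({ METHOD })
public @interface InjectMethodCall {

    // Name of the class that implements the injection call
    public String injectionCall() default "";

    // Whether the call runs before or after the existing method code
    public InjectLocator location() default InjectLocator.BEFORE;

    public enum InjectLocator { BEFORE, AFTER; }
}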

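InjectProperty source:

import static java.lang.annotation.ElementType.FIELD;

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// InjectLocator is the enum nested in InjectMethodCall and has to be imported from there

@Retention(RetentionPolicy.RUNTIME)
@Target({ FIELD })
public @interface InjectProperty {

    // Names of the injection call classes for the getter and the setter
    public String injectGetter() default "";
    public String injectSetter() default "";

    // Where the code goes in the getter and in the setter
    public InjectLocator getterLocation() default InjectLocator.BEFORE;
    public InjectLocator setterLocation() default InjectLocator.BEFORE;
}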
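As a rough illustration of how such an annotation is used and later found, the sketch below marks one method of a made-up Account class and lists the annotated methods through reflection. The annotation is a simplified copy so the example compiles on its own. Account, AccessControlCall and describeInjections are hypothetical names, not part of the framework.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class AnnotationUsageSketch {

    enum InjectLocator { BEFORE, AFTER }

    // Simplified copy of InjectMethodCall so the sketch is self-contained.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface InjectMethodCall {
        String injectionCall() default "";
        InjectLocator location() default InjectLocator.BEFORE;
    }

    // Hypothetical business class with one method marked for injection.
    static class Account {
        @InjectMethodCall(injectionCall = "business.injectionCall.AccessControlCall",
                          location = InjectLocator.BEFORE)
        public void withdraw(int amount) { }

        public int balance() { return 0; }
    }

    // Lists "method -> injector (location)" for every annotated declared method.
    static List<String> describeInjections(Class<?> clazz) {
        List<String> result = new ArrayList<>();
        for (Method method : clazz.getDeclaredMethods()) {
            InjectMethodCall annotation = method.getAnnotation(InjectMethodCall.class);
            if (annotation != null) {
                result.add(method.getName() + " -> " + annotation.injectionCall()
                        + " (" + annotation.location() + ")");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        describeInjections(Account.class).forEach(System.out::println);
    }
}
```

The RUNTIME retention is what makes the annotation visible to this kind of reflective lookup.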

3. What is the injected code context

In order to treat any type of injection code the same way, an interface must be defined. This interface also defines what is available to the injected code. I think three things should be available:

3.1 The object where the method is being called
3.2 The name of the method
3.3 The arguments received in the method

With all this information, the following interface, named InjectionCall, can be written:

package business.injectionCall;

public interface InjectionCall {

    // object: the instance the method is called on; args: the method's arguments
    public void run(Object object, String methodName, Object[] args);
}

Creating a new injection call consists of creating a class that implements the InjectionCall interface. With the provided context, an implementation is able to do all sorts of things, from checking the object's internal state to performing access control or persisting values.
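A minimal sketch of what an implementation could look like follows. The interface is repeated so the example compiles alone, and AuditCall is a made-up injection call that just records what it sees.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InjectionCallSketch {

    // Same shape as the InjectionCall interface, repeated to keep the sketch self-contained.
    interface InjectionCall {
        void run(Object object, String methodName, Object[] args);
    }

    // Hypothetical injection call that records every invocation it receives.
    static class AuditCall implements InjectionCall {
        static final List<String> LOG = new ArrayList<>();

        @Override
        public void run(Object object, String methodName, Object[] args) {
            LOG.add(object.getClass().getSimpleName() + "." + methodName + Arrays.toString(args));
        }
    }

    public static void main(String[] args) {
        // Simulates what the injected code would do when setName("Paulo") is called.
        new AuditCall().run("receiver", "setName", new Object[] { "Paulo" });
        System.out.println(AuditCall.LOG);
    }
}
```

An implementation is looked up by class name and instantiated reflectively, so in practice it needs an accessible no-argument constructor.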

4. What code to inject

The injection interface has been presented, but the code that actually calls it, that is, the code injected into the methods, hasn't been shown yet. That code is quite simple and is generated by a method called generateCodeWithInjector (in InjectionUtils). The code follows:

private static String generateCodeWithInjector(String injectorName, String methodName) {
    // Builds the Java source that javassist compiles and inserts into the method
    return "{ try { Class clazz = Class.forName(\"" + injectorName + "\");"
            + "business.injectionCall.InjectionCall injectionCall ="
            + "(business.injectionCall.InjectionCall) clazz.newInstance();"
            + "injectionCall.run(this,\"" + methodName + "\",$args); }"
            + "catch(Exception e) { e.printStackTrace(); } }";
}

First the injector's class, which implements the InjectionCall interface, is loaded and then instantiated. Finally its run method is called with the object itself, the method name and a special argument, $args. This is a special variable that Javassist understands while compiling the code and replaces with an Object array containing the arguments of the current method.
More about Javassist's special variables can be read, along with examples, in the Javassist documentation.
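To see what the generated source actually looks like, the sketch below repeats the string-building logic and prints the result for one concrete case. The injector name business.AuditCall and the method name setName are hypothetical values chosen for illustration.

```java
class GeneratedCodeSketch {

    // Same string-building logic as generateCodeWithInjector, made package-visible for the demo.
    static String generateCodeWithInjector(String injectorName, String methodName) {
        return "{ try { Class clazz = Class.forName(\"" + injectorName + "\");"
                + "business.injectionCall.InjectionCall injectionCall ="
                + "(business.injectionCall.InjectionCall) clazz.newInstance();"
                + "injectionCall.run(this,\"" + methodName + "\",$args); }"
                + "catch(Exception e) { e.printStackTrace(); } }";
    }

    public static void main(String[] args) {
        // Prints the block of Java source that would be handed to javassist.
        System.out.println(generateCodeWithInjector("business.AuditCall", "setName"));
    }
}
```

The output is plain Java source in a single block. The only thing that isn't ordinary Java is $args, which is left in the string for Javassist to replace.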

The four main concerns regarding the framework were presented, but there's still an important question left.
How is that simple snippet of code just shown injected into the bytecode? That leads us to the injection process.

Injecting the code

This is the most interesting part: how the injection can actually be done. There are various ways of doing it, and two different ways will be presented:

  1. Injecting during the build process
  2. Injecting at runtime using a custom class loader

Even though there are obvious differences between the two methods, the actual injection process for a class is performed in the same way. The algorithm that performs the injection is described in the flowchart presented in the figure on the right.

When a class is being inspected for possible injections, all its declared methods (not only the public ones, but all of them) are visited first, looking for an injection annotation. When that annotation is found, the information needed, such as the injection call that should be used, is retrieved from the annotation and the bytecode for that method is modified.

After visiting all the declared methods, the algorithm visits all declared fields (once again declared, not only public, hence all fields) looking for the property injection annotation. When the annotation is found, the property is injected. "Injecting the property" should be understood as injecting that property's getter and setter methods.
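Going from an annotated field to its getter and setter requires some naming rule. The sketch below assumes the usual JavaBeans convention (getX/setX, with isX for primitive booleans). That convention is my assumption here, and AccessorNames is a hypothetical helper, not code from the framework.

```java
class AccessorNames {

    // Capitalises the first letter: "name" -> "Name".
    private static String capitalise(String fieldName) {
        return Character.toUpperCase(fieldName.charAt(0)) + fieldName.substring(1);
    }

    // Getter name under the JavaBeans convention; primitive booleans use "is".
    static String getterName(String fieldName, Class<?> fieldType) {
        String prefix = (fieldType == boolean.class) ? "is" : "get";
        return prefix + capitalise(fieldName);
    }

    // Setter name under the JavaBeans convention.
    static String setterName(String fieldName) {
        return "set" + capitalise(fieldName);
    }

    public static void main(String[] args) {
        System.out.println(getterName("name", String.class));
        System.out.println(setterName("name"));
    }
}
```

With the names in hand, the property injection reduces to two method injections, one per accessor.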

Below is the code that implements the first part of the flowchart: visiting all declared methods and injecting code into them when needed.

Snippet of the injection process:

// code snippet in the generation
CtClass cc = pool.get(clazz.getName());
for (Method method : clazz.getDeclaredMethods()) {
    if (method.isAnnotationPresent(InjectMethodCall.class)) {
        InjectMethodCall annotation = method.getAnnotation(InjectMethodCall.class);
        // Note: the lookup is by name only, so overloaded methods are not distinguished
        CtMethod ctMethod = cc.getDeclaredMethod(method.getName());
        injectMethod(ctMethod, annotation.injectionCall(), annotation.location());
    }
}
// more code

public static void injectMethod(CtMethod ctMethod, String injectionCall,
        InjectLocator locator) throws CannotCompileException {

    String code = generateCodeWithInjector(injectionCall, ctMethod.getName());
    switch (locator) {
    case BEFORE:
        ctMethod.insertBefore(code);
        break;
    case AFTER:
        ctMethod.insertAfter(code);
        break;
    }
}

1. Build Process

Doing the injection during the build process consists of having an application that reads the class files, performs the injection process on each class (meaning it manipulates the class bytecode) and finally writes the new class files over the old ones.

Everyone using this option should be aware that this isn't an idempotent action: if for some reason the injector is run twice in the build process, the code will be injected twice.

The main code of the injector application is presented below.
The code is quite simple, but two classes are worth mentioning. ClassPool is the Javassist pool of class representations: each time a class is requested from the pool, its representation is returned if it already exists and created otherwise. CtClass is the Javassist representation of a Class object.

There are other CtXXX objects, each representing an element of the class structure, such as CtMethod or CtField. Each of these objects provides an interface to inject code before, after or around it, among other operations.

List<Class> classesToScan = getClassesInDirectory(args[0]);
ClassPool pool = ClassPool.getDefault();
for (Class clazz : classesToScan) {
    CtClass classRepresentation = pool.get(clazz.getName());
    performInjection(classRepresentation);
    // Write the modified class file to the build directory
    classRepresentation.writeFile(InjectionUtils.BUILD_DIR);
}

2. Injecting at runtime using a custom class loader

This solution uses a custom made ClassLoader. Anyone interested in knowing more about ClassLoaders should read Java Programming: First steps with ClassLoaders and Java Programming: Hot Deploy.

The idea is simple: instead of processing all classes at build time, each class is injected on demand at runtime, when the class loader is asked to load it. The injection is only done in memory. This makes the operation idempotent, since the class is loaded into memory only once and the class code is then cached.

This solution also allows better flexibility: just by switching the class loader, the classes can be used in the application with or without the injected code.

But there are disadvantages. Class loading takes more time, since every class has to be processed and injected if needed. Also, the application code needs some modification to use such a custom class loader.
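The general shape of such a class loader can be sketched with the standard library alone. This is not the InjectionClassLoader from the code example. TransformingClassLoader, its prefix filter and the empty transform hook are assumptions made for illustration, and a real implementation would rewrite the bytes inside transform (for instance with Javassist).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class TransformingClassLoader extends ClassLoader {

    private final String classNamePrefix;
    private int transformedClasses = 0;

    TransformingClassLoader(ClassLoader parent, String classNamePrefix) {
        super(parent);
        this.classNamePrefix = classNamePrefix;
    }

    int transformedClasses() { return transformedClasses; }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Classes already defined by this loader come from its cache.
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null && name.startsWith(classNamePrefix)) {
                byte[] bytecode = readBytecode(name);
                if (bytecode != null) {
                    byte[] transformed = transform(name, bytecode);
                    loaded = defineClass(name, transformed, 0, transformed.length);
                    transformedClasses++;
                }
            }
            if (loaded == null) {
                // JDK classes and anything not found as a resource go to the parent.
                return super.loadClass(name, resolve);
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    // Placeholder for the injection step; this sketch returns the bytes unchanged.
    protected byte[] transform(String className, byte[] bytecode) {
        return bytecode;
    }

    // Reads the class file bytes as a resource, or returns null if unavailable.
    private byte[] readBytecode(String className) {
        String resource = className.replace('.', '/') + ".class";
        try (InputStream in = getResourceAsStream(resource)) {
            if (in == null) {
                return null;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        } catch (IOException e) {
            return null;
        }
    }

    // Convenience wrapper for callers that prefer unchecked exceptions.
    Class<?> load(String name) {
        try {
            return loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Asking this loader for the same class twice returns the same Class object and runs transform at most once, which is the caching behaviour that makes the runtime approach idempotent.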

The code snippet below shows how a class (the TestSample class in the business package) could be loaded with such a custom class loader:

InjectionClassLoader classLoader =
        new InjectionClassLoader(TestInjectionClassLoader.class.getClassLoader());
Class clazz = classLoader.loadClass("business.TestSample");
TestSample testSample = (TestSample) clazz.newInstance();

Since this option introduces an overhead at runtime, a benchmark was done (the benchmark code is also included in the available code). It loaded a Java class containing injection code and another that didn't, in both cases with the InjectionClassLoader and with the standard JVM class loader. The values presented below are the average of five benchmark runs:

                       | Class with Injection | Class without Injection
Custom class loader    | 406ms                | 7ms
Standard class loader  | 0ms                  | 0ms

The time to load a class without injection through the custom class loader is linear in the number of methods and fields it contains. This time could be optimised by creating a class annotation that tells the class loader that the class needs to be parsed in order to find code injections. The biggest problem isn't that, though, but the actual loading of a class with injection, which took almost half a second.

After seeing such results, the following advantages/disadvantages table regarding both methods can be presented:

             | Advantages | Disadvantages
During Build | Runtime performance isn't affected, transparency on the application code | Not idempotent, needs modification of the build process
ClassLoader  | Flexibility, idempotent operation | Performance, code needs modification to use the custom class loader

Conclusions

Using the proper tools, bytecode injection isn't complicated to perform and can easily provide separation of concerns. Two different ways of performing the actual injection were presented. Both have advantages and disadvantages, and choosing between them depends on what the developers want.

Anyone interested in seeing these concepts working in actual code can download my code example, which contains the Injection Generator, the Injection Class Loader, the class loader benchmark, some example objects that implement InjectionCall and an object that uses them. An ant build file is also provided in order to make things easier.

Hope the article was interesting.


Monday, 10. March 2008

Intermission: Sorry For Downtime

Between 6pm and 8pm this blog was down due to a power failure. The UPSs supporting the system couldn't provide power for that long, resulting in downtime.

My apologies and thank you for your understanding.


Friday, 29. February 2008

Software Developing: Studying The Bliki Domain Model

The bliki concept has always been interesting. Merging the capabilities of a wiki platform with a blogging platform is a breakthrough because it allows a user to create various content, which may or may not be a blog entry, in a rather simple way. This isn't a new concept: according to Martin Fowler it has been around since 2003 (information from Wikipedia), although it's not a widely used kind of software. Nowadays it's very common to use blogging software or wiki software, but somehow the fusion of those two types of software seems to have been forgotten by the general user.

When I started using SnipSnap, and later developing code for it, not only did my interest in blikis grow, but it also allowed me to understand and put into practice the concepts that exist in this kind of software. SnipSnap isn't complete or perfect, but learning how it works was a pleasant and instructive experience.

Lately, I've been looking into other bliki software in order to understand what the business logic and the underlying model really are. This post sums up the ideas and the model I've come up with.

This time no example code is provided, because it doesn't exist. The ideas below are purely conceptual and there is currently no source code implementing them. In my opinion it seems a good model, but different people have different views of the problem, so the objective of this post is to create discussion regarding the topic.

I've broken down the model into three parts: the Entry Model, the Meta Data Model and the Access Control Model. I'll address each one of them separately.
Readers should note that some relations are only interesting in one of the models, so in order to save visual space those relations aren't represented in the other two models. That doesn't mean they aren't there.

The models below were designed with a domain driven design philosophy, anyone interested in learning more about such kind of architecture may read the following posts:

Let's now start with the entry model.

The Entry Model

This part of the model is the bliki's actual core. The Entry entity represents nothing more than information inserted by the User. In this kind of software three different types of entries were identified:

  • Wiki Page
  • Post
  • Comment

The names are self explanatory, but they don't explain the logic that each of these kinds of entries contains.
For example, in my opinion, a Wiki Page has to support sub pages (not to mention history). The sub page concept should be understood as child wiki pages that create a page hierarchy. Also, I believe that any type of entry must allow comments (if permitted by the access control, of course).
Some readers might be asking why Blog isn't listed as an entry. That is because a blog is nothing more than a container of Posts. It's not information inserted by the user but rather something that knows how to display some specific information.

But let's look in more detail at the domain of the Entry Model, which can be seen in the figure on the right.

The Entry entity has three sub-classes, which are the types previously mentioned. Entries can contain other entries, which is the way to preserve the history of entries. During implementation, certain checks need to be part of the business logic so that only historic entries can be added to this relation. There's also another interesting relation in the Entry entity: any type of Entry allows comments. Notice that this allows comments to contain comments. At first this might sound confusing, but it's actually the representation of a message thread.

Besides that, an Entry has an owner, which is a registered User within the bliki, and an entity that represents all its meta data, called Meta Entry. We'll talk more about this last entity soon.

Regarding the Blog entity, as has been said, it's not an Entry but a container of Posts. It also contains a list of editors, which are registered Users.

Finally there's the Wiki Page entity. This entity also uses the Composite design pattern, allowing a hierarchy of pages. This entity can also be more specific, such as a Wiki User Page, which is directly connected to the User, or a Groovy Script Page.
A Groovy Script Page would contain Groovy code (more about Groovy in Java Programming: Getting your code spicy with Groovy) that would be executed at runtime. Although it sounds like a shiny feature, it's probably useless and doesn't really need to exist in the model, which is why it is represented with dashed lines.

The main ideas used when creating this part of model were the following:

  1. There are three different kinds of entries;
  2. Every entry has an owner;
  3. Every entry can be commented;
  4. It's good to have message threads support in the model;
  5. Every entry has history;
  6. Every entry has meta data;
  7. Wiki Pages have to allow hierarchy;
  8. Wiki Pages may be extended for more specific behaviour such as pages for bliki's users.
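Since no implementation exists, the following is only one possible way the ideas above could be put into Java classes. Every name here is illustrative, and the Meta Entry relation is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

class EntryModelSketch {

    static class User {
        final String name;
        User(String name) { this.name = name; }
    }

    // Base entity: every entry has an owner, comments and a history of older versions.
    static abstract class Entry {
        final User owner;
        final String content;
        final List<Comment> comments = new ArrayList<>();
        final List<Entry> history = new ArrayList<>();

        Entry(User owner, String content) {
            this.owner = owner;
            this.content = content;
        }

        // Any entry can be commented; commenting a comment builds a message thread.
        Comment comment(User author, String text) {
            Comment comment = new Comment(author, text);
            comments.add(comment);
            return comment;
        }
    }

    static class Post extends Entry {
        Post(User owner, String content) { super(owner, content); }
    }

    static class Comment extends Entry {
        Comment(User owner, String content) { super(owner, content); }
    }

    // Composite: a wiki page can hold child pages, forming a hierarchy.
    static class WikiPage extends Entry {
        final List<WikiPage> subPages = new ArrayList<>();

        WikiPage(User owner, String content) { super(owner, content); }

        WikiPage addSubPage(WikiPage page) {
            subPages.add(page);
            return page;
        }
    }

    // A blog is not an Entry, only a container of posts with a list of editors.
    static class Blog {
        final List<Post> posts = new ArrayList<>();
        final List<User> editors = new ArrayList<>();
    }
}
```

The check that only historic entries end up in the history relation is not modelled here; it would belong to the business logic.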

Meta Data Model

Of the three, this is probably the simplest model. The idea of this model is to understand what the Entry's meta data is and how it should be placed in the model. The Meta Data entities identified were the following (although there may be more):

The Meta Entry entity exists because, if these meta data entities were directly connected to the Entry, the meta information would have to be copied whenever a new version was created (not actually copied, since only references would be moved around). The problem with that, besides information not being centralised, would be the duplication of meta information, creating strange scenarios.

For example, imagine a certain Tag T1 connected to an Entry E. After editing E, there would be E and E'. If the Tag was directly connected to the Entry, both E and E' would be tagged with T1. That means T1 would have two entries that are actually the same entry in different versions!

Also, when the Access Control Model is presented, we'll see that once again there is the need for an entity that represents not only the current version of the Entry but also all its history.

Even though meta data entities have been presented, it hasn't yet been explained what a meta data entity is. Besides the information it holds, an Entry has a limited set of explicit (directly inserted into the system) and implicit (inferred by the system) information that is about the entry without being the information the entry holds, hence it is called meta information or meta data.

Other examples of meta data are the number of views a certain entry had and the number of edits, among other parameters that can be part of the Meta Entry itself and do not need to be represented by an entity.

Regarding the guidelines used when creating this part, there was only one: the Meta Model must be transversal to the Entry history.
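The T1 example can be played out in a few lines. This is only a sketch of the idea with made-up names: versions are reduced to plain strings, and Tag is used as the meta data entity since it is the one from the example above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MetaEntrySketch {

    static class Tag {
        final String name;
        final Set<MetaEntry> entries = new LinkedHashSet<>();
        Tag(String name) { this.name = name; }
    }

    // One MetaEntry per logical entry, shared by all versions of that entry.
    static class MetaEntry {
        final List<String> versions = new ArrayList<>(); // version contents, oldest first
        final Set<Tag> tags = new LinkedHashSet<>();
        int views = 0; // simple counters can live directly on the Meta Entry

        MetaEntry(String firstVersion) {
            versions.add(firstVersion);
        }

        // Editing adds a version; the tags stay attached to the MetaEntry.
        void edit(String newContent) {
            versions.add(newContent);
        }

        void tag(Tag tag) {
            tags.add(tag);
            tag.entries.add(this);
        }

        String current() {
            return versions.get(versions.size() - 1);
        }
    }
}
```

After tagging and then editing, the tag still points at a single entry, while the entry itself now has two versions.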

Access Control Model

This part of the model defines the access control for the bliki's CRUD operations. The access control is defined through rules, which have been limited to three different types:

  • User Rules
  • Group Rules
  • Role Rules

Each of these kinds of rules contains a pair: one element is the operation, the other is the User, Group or Role, according to the access rule type.

As can be seen in the model on the right, a Group is nothing more than an aggregation of Users. Also, since a User should be able to be in more than one group, the relation between both entities is many to many.

Another concept introduced with the Access Control Model is the Role. Every User has a set of roles. Examples of roles in this sort of application are admin, editor and blogger, among others.
Defining access control using roles can be seen as defining access control for groups where the groups are defined by the system instead of the User.

The current model does not support composition of rules but that may be an interesting feature to add.

The main concerns when creating this part of the model were:

  1. The entry access control is transversal to the history
  2. It should be possible to define access control for a given user, group of users or role
  3. The operations that are submitted to access control can be reduced to CRUD operations
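As a sketch of how the three rule types could look in code, assuming a deny-by-default policy (which the model itself doesn't specify) and with all names being illustrative:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AccessControlSketch {

    enum Operation { CREATE, READ, UPDATE, DELETE }

    static class User {
        final String name;
        final Set<String> groups = new HashSet<>();
        final Set<String> roles = new HashSet<>();
        User(String name) { this.name = name; }
    }

    // A rule pairs an operation with the subject (user, group or role) it is granted to.
    interface Rule {
        boolean allows(User user, Operation operation);
    }

    static Rule userRule(Operation op, String userName) {
        return (user, operation) -> operation == op && user.name.equals(userName);
    }

    static Rule groupRule(Operation op, String group) {
        return (user, operation) -> operation == op && user.groups.contains(group);
    }

    static Rule roleRule(Operation op, String role) {
        return (user, operation) -> operation == op && user.roles.contains(role);
    }

    // Rules attached to one entry; in the model they cover the entry's whole history.
    static class EntryAccessControl {
        final List<Rule> rules = new ArrayList<>();

        // Assumption: access is denied unless at least one rule grants it.
        boolean isAllowed(User user, Operation operation) {
            for (Rule rule : rules) {
                if (rule.allows(user, operation)) {
                    return true;
                }
            }
            return false;
        }
    }
}
```

Composition of rules, which the current model doesn't support, would be a natural extension of the Rule interface.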

Final Thoughts

By no means is this model being presented as the model. It is only one answer to the question "what does a bliki domain model look like?". It's not perfect, it may be flawed or have things missing. It should actually be seen as a first draft, and it's totally open to discussion.

Even though, while writing this post, there was an effort to look at all the blikis that exist in order to sum up the knowledge that each piece of software contributes, it's impossible to be sure that everything was seen (most likely it wasn't) and that nothing was overlooked (most likely some things were). Once again, discussion about the subject is probably the best thing to be done.

I'm also providing the complete model image for anyone interested.

Finally, I suggest reading the Wikipedia blikis page.

Hope it was an interesting read.


Monday, 28. January 2008

SnipSnap Developing: Trying to settle a roadmap

Even though I've been away, mostly due to the amount of work (and also some compulsive geocaching), I've been working on some of the improvements I planned for SnipSnap. In my opinion I haven't yet reached a development milestone, but things are getting close. In my last SnipSnap developing spree I finished some of the things that had been on the todo list for a long time.

While adding the latest features to my_snipsnap_fork_feature_list I noticed that, besides what is presented in that list, I've never written anything else regarding the improvements. The problem with that list is that it's just a list, and some of the items listed there need configuration to work… So I decided to list some of the features and explain their configuration. The last item is the newest one, and probably also one of the most exciting, and has never been documented before.
I'm aware that this won't be interesting for everyone, but I hope some might find a use for it.

Features Description

  • Social Web configuration: Maybe the name isn't the best, and any suggestion is always welcome. A Social Web Link is an icon in the top right corner of the post that links to something. It can be digg, delicious or it can be the RSS feed for the comments.

    Property In Config Snip | Description
    app.configuration.socialWeb | contains the identifiers used to generate the links. Values are comma separated
    app.configuration.NAME.link | NAME is one of the identifiers listed in the previous property. The Snip is available in the context as a bean, so if the link needs the snip's name or url they can simply be referred to as ${name} and ${url}
    app.configuration.NAME.image | the icon that will be displayed

  • Integration with feedburner: this feature allows the user to integrate SnipSnap with an existing feedburner account. The property app.feed needs to be configured with the feedburner's feed.
  • Integration with Google's WebMasters Account: The verification with meta-data has to be selected and then, after receiving the keyword given by Google, the property app.google.webmasters.verification.code has to be configured with it.
  • Integration with Google Analytics: After registering in Google Analytics, the user code has to be configured in the app.google.analytics.account property.
  • Integration with SnapPreview: Some love SnapPreview, others hate it. Just in case someone using SnipSnap wants to use it, there is now support. To use it, simply register at the service and then set the app.feature.snapPreview.userKey property to the user key.
  • Snip permissions: Finally I've implemented a basic configurable permission system per snip. When accessing a Snip, a user who is an administrator or the owner of the snip is able to configure the snip's permissions. The syntax is simple: the value is one or more "Permission_Type:Role" pairs, comma separated.

    Type | Values
    Permission | View, Edit, Attach, Post
    Role | Owner, Admin, Authenticated, Guest
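To illustrate the "Permission_Type:Role" syntax, here is a small parser sketch. It is not the code used in SnipSnap. SnipPermissionParser, the enum names and the case-insensitive matching are assumptions made for the example.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class SnipPermissionParser {

    enum Permission { VIEW, EDIT, ATTACH, POST }

    enum Role { OWNER, ADMIN, AUTHENTICATED, GUEST }

    // Parses a value such as "View:Guest,Edit:Owner" into permission -> allowed roles.
    static Map<Permission, Set<Role>> parse(String value) {
        Map<Permission, Set<Role>> result = new EnumMap<>(Permission.class);
        for (String pair : value.split(",")) {
            String[] parts = pair.trim().split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected Permission_Type:Role but got: " + pair);
            }
            // Assumption: names are matched case-insensitively; unknown names throw.
            Permission permission = Permission.valueOf(parts[0].trim().toUpperCase(Locale.ROOT));
            Role role = Role.valueOf(parts[1].trim().toUpperCase(Locale.ROOT));
            result.computeIfAbsent(permission, p -> EnumSet.noneOf(Role.class)).add(role);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("View:Guest, Edit:Owner, Edit:Admin"));
    }
}
```

A value like "View:Guest, Edit:Owner, Edit:Admin" ends up as one entry for View and one entry for Edit holding both roles.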

But the permission setup isn't the only new feature. Other new features are the user experience improvement in the search macro, a fix for the annoying bug that prevented snips from 31 December from showing up, the configuration of the snip owner's comments through CSS, a pull quote macro (which can be seen in action below), a new version of the CollapsingBlock macro that now keeps state via cookies, a shoutbox system and better support for multipart requests, which were causing some trouble in browsers such as Safari 3.0.3.

The RoadMap

Establishing a roadmap is a bit difficult for me since SnipSnap is a side project, and sometimes I may go months without actually creating something… Still, I like to have objectives to pursue.

The top priority features are the following:

  • Improved Access Control, in order to define users and group of users instead of only roles.
  • Allowing anonymous posts, after all almost every blog platform nowadays allows that, I don't see why SnipSnap shouldn't allow it.
  • Creating a good support for integration with AJAX components.
  • Create interfaces for configuring the features listed in features description.

My fork will continue, until I feel there is no use for it.

In my opinion the best thing would be to have the complete SnipSnap development team working on this. However, I haven't heard anything from them since I got an email from Angelo about a month ago… It's just bad when there is no communication at all.

I don't know how SnipSnap will evolve, and I can't guarantee its longevity or development since I'm not the project administrator, but I can guarantee that for my fork, and that's what I'm doing right now. My fork will continue, until I feel there is no use for it.

I really hope things go well with the reactivation of SnipSnap. The project has plenty of potential and I think good work can be done. For that to happen, though, everyone needs to want to improve SnipSnap and, even more important, to believe that something can be done to help it.


Monday, 07. January 2008

System Administration: Load Balancing with Apache

This year starts with a topic that doesn't show up that often on the blog, mostly because it is not one of my main areas of interest: system administration.

Since last month, this blog is running on a new application server.
With the addition of a new server some configuration had to be made on the network, giving me the chance to do some of the things that were on my to-do list.

One of those things was to create a decent failover system that is automatically activated when the application server is being deployed or is down due to some sort of maintenance that requires application server downtime.

The current system configuration is a single application server running Jetty, front-ended by the Apache httpd server using mod_proxy. Although this is the configuration being used, the only requirement for the solution presented here is the Apache httpd server. The application server can be something other than Jetty, such as Tomcat, JBoss, etc.

A simple but not effective solution

A very simple solution for handling application server errors is to specify an ErrorDocument for the 503 status (the service unavailable code) in the Apache configuration file.

The problem with this solution is that, when the error or errors specified in the configuration are received from the application server, Apache sends the client a redirect to the url provided in the configuration.
This means that the client is sent to the error page, and refreshing it won't request the page that was initially requested from the application server.

In my opinion this is a problem, especially because I already knew that it is possible to serve an unavailable page at any url without redirecting. It was Jorge Matias, one of the sysadmins at work, who pointed me to the solution. Thanks Jorge!

A good solution (in my opinion)

Since Apache httpd 2.2 a load balancing module is available. The idea is that the Apache server behaves as a balancer that distributes the traffic among the application servers. Since 2.2.4 there's an option that allows a server to be marked as a hot standby (pay special attention: the documentation mentions this feature as of 2.2.0, but it is only available since 2.2.4). An application server marked as a hot standby will only be reached when no other application server is available.

As soon as the application servers are available again, the traffic goes back to them.

This is a typical scenario for application deployment: the application servers are unavailable, but there is another server that replies to any request with the same page saying that the service is unavailable.

This way, if a client requests /something/something2/abc and the application server isn't available, the client won't be redirected but will instead get a virtual page served by the hot standby server.
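The selection rule described above can be written down in a few lines of Java. This is only a model of the behaviour, not how Apache implements it. HotStandbySketch and its names are made up, and real balancing concerns such as load factors are ignored.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class HotStandbySketch {

    static class Member {
        final String url;
        final boolean hotStandby;
        boolean available = true;

        Member(String url, boolean hotStandby) {
            this.url = url;
            this.hotStandby = hotStandby;
        }
    }

    // Prefers any available regular member; a standby is used only when none is left.
    static Optional<Member> pick(List<Member> members) {
        for (Member member : members) {
            if (!member.hotStandby && member.available) {
                return Optional.of(member);
            }
        }
        for (Member member : members) {
            if (member.hotStandby && member.available) {
                return Optional.of(member);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Member app = new Member("http://theOnlyServer:8668/", false);
        Member standby = new Member("http://someOtherServer/", true);
        List<Member> cluster = Arrays.asList(app, standby);

        app.available = false; // e.g. the application is being deployed
        System.out.println(pick(cluster).get().url);
    }
}
```

Once the regular member is marked available again, pick returns it, which mirrors traffic going back to the application server.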

Below is an explanation of how to create such an environment.

Configuration

For this configuration example there will be two different instances of Apache running - the balancer and the hot standby - and one application server instance.

Let's first examine the balancer configuration.

<VirtualHost *:80>
    ProxyRequests off
    ServerName www.example.com

    ProxyPass / balancer://cluster/

    ProxyPassReverse / http://theOnlyServer:8668/
    ProxyPassReverse / http://theOnlyServer/

    <Proxy balancer://cluster>
        BalancerMember http://theOnlyServer:8668/
        # The below is the hot standby
        BalancerMember http://someOtherServer/ status=+H
    </Proxy>

    # Other configuration regarding the virtual
    # host that is not important for this example.
</VirtualHost>

The Proxy balancer://cluster block defines a new balancer cluster called cluster. It contains two members, defined between the Proxy tags: http://theOnlyServer:8668/ and http://someOtherServer/. The last one carries status=+H, which marks it as a hot standby server.
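To make the hot standby rule concrete, here is a minimal sketch of the selection logic in Java. It is not how mod_proxy_balancer is implemented (the real module also applies load factors among the regular members); the class and method names are made up for illustration and the sketch simply picks the first available member.

```java
import java.util.ArrayList;
import java.util.List;

public class HotStandbyDemo {

    // One BalancerMember: its URL, whether it carries status=+H, and whether it is up.
    // (Hypothetical model class, not part of Apache.)
    public static class Member {
        public final String url;
        public final boolean hotStandby;
        public boolean available;

        public Member(String url, boolean hotStandby, boolean available) {
            this.url = url;
            this.hotStandby = hotStandby;
            this.available = available;
        }
    }

    // Returns the URL that should serve the next request, or null if nothing is up.
    public static String pick(List<Member> members) {
        // Regular members always win while at least one of them is available.
        for (Member m : members) {
            if (!m.hotStandby && m.available) {
                return m.url;
            }
        }
        // Only when no regular member is available does the standby get traffic.
        for (Member m : members) {
            if (m.hotStandby && m.available) {
                return m.url;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Member app = new Member("http://theOnlyServer:8668/", false, true);
        Member standby = new Member("http://someOtherServer/", true, true);
        List<Member> cluster = new ArrayList<Member>();
        cluster.add(app);
        cluster.add(standby);

        System.out.println(pick(cluster)); // the application server answers
        app.available = false;             // e.g. a deployment is in progress
        System.out.println(pick(cluster)); // the hot standby takes over
    }
}
```

As soon as the regular member is flagged available again, pick goes back to returning it, which mirrors the behaviour described for the balancer.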

Another common pitfall is defining a single ProxyPassReverse for a cluster's ProxyPass. If a ProxyPass is defined for the cluster, then a ProxyPassReverse has to be defined for each BalancerMember.

Although there are various options that allow defining load factors, the maximum number of clients per BalancerMember, among other things, the configuration can be as simple as presented above. To read more about the various options in the balancer module, please refer to the >>Apache Documentation.

On the hot standby server the configuration is also very simple. Apache's rewrite engine is used to rewrite any requested url into the url of the "not available" page. Let's then examine the hot standby Apache's configuration (the one running on the someOtherServer host):

RewriteEngine on
RewriteRule /(.*) /unavailableServer.html

The previous configuration enables the rewrite engine on Apache and rewrites any url to /unavailableServer.html. This is, once again, the simplest scenario. More complex configurations can be created, for example with specific rules so that css files and images can still be served.

Conclusions

With minimum effort a hot standby configuration can be created, improving the user experience when the application server is being deployed. At least, using this configuration the user gets a personalised screen rather than a huge sloppy "503 Service Unavailable".


Monday, 31. December 2007

Blogging: Two years have passed

Two years ago, on the 31st of December 2005, I started this blog. It doesn't feel at all as if two years have passed. Last year, when the blog turned one, I presented the blog's status and my experience with it; this year I'll be doing the same.

Status

Last year, when I announced this blog's first anniversary, I was very satisfied with the results. Looking at this year's visitor report and the feedback received through comments and email, I can only say that I'm even more enthusiastic.
Even though this year I didn't post as often as I did in 2006, the number of visitors and the feedback on each post were much higher.
Actually, the traffic for this blog has been increasing during this year, as can be seen in the table below.
Anyone interested in numbers can compare the table below with the one in >>Blogging: One year has passed:

Month | Unique visitors | Number of visits | Pages  | Hits   | Bandwidth
Jan   | 953             | 1902             | 10734  | 30180  | 273.03 MB
Feb   | 687             | 1725             | 10719  | 21566  | 241.27 MB
Mar   | 740             | 2001             | 12944  | 31381  | 328.15 MB
Apr   | 735             | 2087             | 12363  | 45526  | 346.72 MB
May   | 821             | 1775             | 9131   | 24660  | 194.02 MB
Jun   | 973             | 2380             | 11507  | 31872  | 247.20 MB
Jul   | 3046            | 4880             | 17396  | 132083 | 728.17 MB
Aug   | 1736            | 3119             | 12541  | 56045  | 423.31 MB
Sep   | 1749            | 3862             | 16430  | 61775  | 486.17 MB
Oct   | 1478            | 4495             | 13624  | 68576  | 475.04 MB
Nov   | 1667            | 4828             | 16424  | 59579  | 467.19 MB
Dec   | 1906            | 5431             | 18879  | 62965  | 545.45 MB
Total | 16491           | 38485            | 162692 | 626208 | 4.64 GB

This year there are various posts that I think are worth mentioning. From the beginning of the year there's a three post series about Domain Driven Design, which sadly I haven't continued yet:

Another series that has been created is the Java Programming series, which had interesting posts such as:

Also worth mentioning is the announcement of a SnipSnap fork, as can be read in >>SnipSnap Developing: Planning a fork, and the fact that a few weeks ago I became part of the new SnipSnap developing team, as mentioned in >>Software Developing: The SnipSnap Saga.

Finally, I want to thank all the readers for reading, commenting and emailing me. I blog for the pleasure of writing and sharing my ideas but, without the readers this blog wouldn't be complete.
Special thanks go to jpmsi and m4ktub for all the technical criticism they've been giving in the Java Programming and Domain Driven Design series, and also to AdaHsu and derjohn for all the feedback and suggestions regarding the SnipSnap developing.

2008 RoadMap

In my opinion, during 2007 I managed to improve content quality and the writing itself; for the next year I plan to keep improving both. As a preview, more articles regarding Java and Domain Driven Design can be expected, and also analysis of other languages that can run on the JVM, such as Scala and Groovy. Some other topics (and new formats of presenting information) are under study but I won't reveal them yet, otherwise there would be no surprise.

An excellent 2008 for everyone!


Monday, 17. December 2007

Software Developing: The SnipSnap Saga

As probably some might know I suggested, not long ago, merging my SnipSnap fork with the new born project SnipSnip. Anyone interested in this topic can find more information in >>Software Developing: SnipSnap, SnipIt and SnipSnip.

I had already chatted with Angelo - SnipSnip's owner - by email and we agreed that a merge would be a good option. However, besides those emails no further action was taken regarding a possible merge.
This weekend, I received excellent news from him.

It seems that admin access to the >>SnipSnap project in sourceforge was granted to Angelo. That means we're now developing for SnipSnap!

I'm already listed as a developer of SnipSnap - which leaves me with a big smile on my face, really - and it seems my fork will soon be merged into the SnipSnap code base. A small team will take it from there and keep working to improve its features.

Definitely, the wonders of open source software!


Monday, 10. December 2007

Java Programming: Getting your code spicy with Groovy

Groovy isn't a new language; it has been around since 2003. Hence, some of the readers might already know it, although others have probably never heard of it. And of course, there will be the ones - like me - who know of the language but have never explored it.

Well, a few weeks ago I had some time to read part of >>Groovy's Documentation and started creating some test code and exploring Groovy's potential.

For the ones who don't know, Groovy is a dynamic language influenced by languages such as Ruby - which currently has a huge hype around it - Perl and Java.
A big attraction of Groovy for Java developers is that it can be compiled into bytecode which runs directly on the JVM. But there are various other ways to run Groovy:

  • Running Groovy code as a script using a standalone interpreter
  • Running Groovy within Java Code through a Groovy Java Shell
  • Integrating Groovy into Java code using a Script engine
The possibilities of integration with Java with very low effort made me put Groovy in my "should check" list.

For anyone with a Java background - or probably a background in any OO language - starting with Groovy isn't that hard: a few syntax modifications and that's it. Of course there are some advanced issues that may take their time but, from what I've seen, I don't think there are that many. Also, since Groovy supports >>closures the code style can change a bit, as it becomes possible to use a functional style (some might like it, some might not).

Since I'm talking about a new language, I'll follow tradition and present the Hello World source code:

public class Hello {
    def hello(String name) {
        println "Hello " + name;
    }
}

Hello hello = new Hello();
hello.hello("world");

Or in another way:

println "Hello world";

I think both ways are self explanatory and there's no need to comment any of the previous code.

Revisiting an example

While exploring what could be done with Groovy I remembered what I had written a few months ago regarding Java Hot Deploy, in >>Java Programming: Hot Deploy.

The example presented in the Hot Deploy post was a simple command line application, where the commands - implemented via the Command Pattern - could be added, removed or modified at runtime.

To achieve that solution, I had to write a custom class loader that would allow the JVM to load the modified classes into memory and, a file system watcher to know when a certain class was changed.
Even though the example presented in the post was a simple proof of concept, there was lots of code involved. More code means more work, the possibility of more bugs and a bigger effort to maintain.
Groovy can help with this problem by reducing the amount of code.

Since Groovy code can be dynamic, it's interesting to integrate new Groovy code with existing Java code through a script engine. In order to implement this solution it's important to understand what should be implemented as Java code and what should be implemented as Groovy.
This problem is quite simple to solve: what is intended to be dynamic are the commands, hence the commands should be implemented as Groovy code.

The figure below represents the modifications in the application's design (the top design is explained in detail in the Hot Deploy post). As can be seen, the design becomes simpler.

[Figure: refactoring]

Regarding the GroovyScriptEngine there are a few things that deserve to be mentioned:

  • The constructor receives an array of Strings that specifies the directory or directories where groovy scripts can be found (which I called the groovy script repository).
  • To execute a given script from the script repository, the run(String name, Binding binding) method should be called.
  • The Binding object is the bridge between Java and Groovy: it allows passing variables into Groovy scripts and retrieving values from them.
CoreEngine source:

package net.pabrantes.groovyTests;

import groovy.lang.Binding;
import groovy.util.GroovyScriptEngine;
import groovy.util.ResourceException;
import groovy.util.ScriptException;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;

public class CoreEngine {

    private GroovyScriptEngine scriptEngine;

    // roots: the directories that make up the groovy script repository
    public CoreEngine(String[] roots) throws IOException {
        scriptEngine = new GroovyScriptEngine(roots);
    }

    public void start() throws ResourceException, IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String input = "";
        Boolean leave = Boolean.FALSE;
        Binding binding = new Binding();
        binding.setVariable("leave", leave);
        while (!Boolean.TRUE.equals(leave)) {
            System.out.print("$ ");

            input = in.readLine();
            if (input != null) {
                // First token is the command name, the rest are its arguments
                String[] commands = input.split(" ");
                String[] arguments = getArguments(commands);
                binding.setVariable("args", arguments);
                try {
                    long time1 = System.currentTimeMillis();
                    scriptEngine.run(commands[0] + ".groovy", binding);
                    long time2 = System.currentTimeMillis();
                    int time = (int) (time2 - time1);
                    System.out.println(commands[0] + " took " + time + " ms");
                } catch (ScriptException e) {
                    // instanceof is null-safe, unlike e.getCause().getClass()
                    if (e.getCause() instanceof FileNotFoundException) {
                        System.out.println(commands[0] + " command not found!");
                    } else {
                        System.err.println(e.getMessage());
                    }
                }
            }
            // Read the flag back from the binding after each command
            leave = (Boolean) binding.getVariable("leave");
        }
    }

    public static void main(String[] args) throws IOException, ResourceException {
        String[] roots = new String[] { "groovyCommands" };
        CoreEngine commandLine = new CoreEngine(roots);
        commandLine.start();
    }

    // Everything after the command name is passed to the script as "args"
    private String[] getArguments(String[] commands) {
        String[] arguments = commands.length > 1
                ? new String[commands.length - 1]
                : new String[] {};
        for (int i = 0; i < commands.length - 1; i++) {
            arguments[i] = commands[i + 1];
        }
        return arguments;
    }
}

ListDir Groovy script:

// This is a closure
// "it" refers to the argument passed to the closure.
def getDir = { it.length == 0 ? "/" : it[0] };

File dir = new File(getDir(args));
File[] files = dir.listFiles();

files.length.times { println files[it]; }

The amount of code was reduced both in the engine itself and in the commands. However, not everything is good: using Groovy scripts has a negative impact on the application's performance. To understand that impact, a simple benchmark was done using both implementations.

The benchmark consisted of running a given command five times and, on the 4th time, changing the command being run in order to trigger a reload (which obviously affects the execution time). The benchmark was run three times and the table below presents the average times.

Time table for Java + Groovy versus Java only (average of 3 test runs, in ms)

Execution    | Java + Groovy: ls / | Java + Groovy: echo string | Java only: ls / | Java only: echo string
1st time     | 2665                | 426                        | 26              | 13
2nd time     | 190                 | 59                         | 3               | 0
3rd time     | 100                 | 44                         | 74              | 0
4th time (*) | 1042                | 641                        | 8               | 2
5th time     | 103                 | 10                         | 2               | 1

(*) - commands were modified, causing command reload.
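The benchmark procedure described above can be sketched roughly as follows. This is not the harness that produced the numbers in the table; the Command interface and the method names are hypothetical, and the modification of the command before the 4th execution is left out. It uses System.nanoTime instead of the System.currentTimeMillis call seen in CoreEngine, only because the former is monotonic.

```java
public class CommandBenchmark {

    // Hypothetical stand-in for one command of the example application.
    public interface Command {
        void run();
    }

    // Runs the command `times` times and returns each elapsed time in ms.
    public static long[] time(Command command, int times) {
        long[] elapsed = new long[times];
        for (int i = 0; i < times; i++) {
            long start = System.nanoTime();
            command.run();
            elapsed[i] = (System.nanoTime() - start) / 1000000L;
        }
        return elapsed;
    }

    // Averages the n-th execution across several benchmark runs
    // (integer division, so the result is truncated to whole ms).
    public static long[] average(long[][] runs) {
        long[] avg = new long[runs[0].length];
        for (int i = 0; i < avg.length; i++) {
            long sum = 0;
            for (long[] run : runs) {
                sum += run[i];
            }
            avg[i] = sum / runs.length;
        }
        return avg;
    }

    public static void main(String[] args) {
        Command listRoot = new Command() {
            public void run() {
                new java.io.File("/").listFiles();
            }
        };
        // Three benchmark runs of five executions each, as in the post
        long[][] runs = new long[3][];
        for (int r = 0; r < runs.length; r++) {
            runs[r] = time(listRoot, 5);
        }
        long[] avg = average(runs);
        for (int i = 0; i < avg.length; i++) {
            System.out.println("execution " + (i + 1) + ": " + avg[i] + " ms");
        }
    }
}
```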

Anyone interested can also >>download the full source code which includes Java and Groovy source code.

The code for the Java only implementation can be found in the Hot Deploy post.

Conclusions

As can be seen in the table above, the delay introduced by Groovy is quite big. In this particular case, where the output goes to a user, the delay isn't very relevant since, most of the time, it's still under a second. But in other cases this delay won't be acceptable.

Like any other solution it has trade-offs that should be well thought through. The example showed that integrating Java with Groovy is quite simple and has great potential, but it also has its share of problems.

Also, it should be kept in mind that Groovy is not all about integration with Java; the language by itself contains really great features such as the >>native support for markup languages and a >>set of features that make it very easy to create Domain Specific Languages.

In my opinion, Groovy really has lots of potential which I'm definitely interested in exploring.


Wednesday, 14. November 2007

Software Developing: Fluent Interfaces

Lately I've seen various blogs mentioning Fluent Interfaces. Being a "fan" of >>Martin Fowler, I had already read about the concept. As expected, using it implies decisions and trade-offs. This post will not only explain what a fluent interface is but also present the trade-offs and suggest possible implementations.

I'll start by introducing the concept.

Fluent Interface: The concept

The idea is quite simple: an object's interface is said to be fluent when any method invocation available in the interface allows the developer to chain other method invocations, making the code "flow".
This property is sometimes called chainability (like in >>jQuery), although >>Martin Fowler and >>Eric Evans coined the term fluent interface.

A good way to clarify the concept is by using an example.
Let's imagine the following scenario: there's an object - called QueryObject - that allows search terms (criteria) to be added and a query to be executed.

A possible implementation using a non-fluent interface could be:

import java.util.ArrayList;
import java.util.List;

public class QueryObject {
    private List<String> criterias;

    public QueryObject() {
        criterias = new ArrayList<String>();
    }

    // Command method: changes state and returns nothing
    public void addCriteria(String criteria) {
        criterias.add(criteria);
    }

    public List<QueryObjectResult> execute() {
        //.... some code here
    }
}

Using such object results in the following:

QueryObject query = new QueryObject();
query.addCriteria("criteria 1");
query.addCriteria("criteria 2");
query.addCriteria("criteria 3");
List<QueryObjectResult> results = query.execute();

On the other hand a fluent interface for query object would look like:

import java.util.ArrayList;
import java.util.List;

public class QueryObject {
    private List<String> criterias;

    public QueryObject() {
        criterias = new ArrayList<String>();
    }

    // Returns the object itself so that calls can be chained
    public QueryObject withCriteria(String criteria) {
        criterias.add(criteria);
        return this;
    }

    public List<QueryObjectResult> execute() {
        //.... some code here
    }
}

It should be noticed that the method addCriteria was renamed to withCriteria and that, instead of returning void, it returns the object itself. The first modification (the name) is irrelevant, but the second one (the return type) is what makes the object's interface fluent.

Using this object would result in the following code:

QueryObject query = new QueryObject();
List<QueryObjectResult> results = query.withCriteria("criteria 1")
        .withCriteria("criteria 2")
        .withCriteria("criteria 3")
        .execute();

With the previous example, fluent interfaces may look interesting but not that useful.
In my opinion, fluent interfaces are truly helpful when designing custom >>Domain Specific Languages (DSL). Allowing the DSL keywords to chain with each other can, when well designed, create instructions really close to natural language.
A good example of a well-executed DSL with a fluent interface is >>JMock's test definitions. Here's an example taken out of >>jMock: Yoga for Your Unit Tests:

// [..snipped code..]
logger.expects(once()).method("setLoggingLevel").with(eq(Logger.WARNING))
    .id("warning level set");
logger.expects(once()).method("warn").with(warningMessage)
    .after("warning level set");
// [..snipped code..]

The previous code, in my opinion, has excellent readability which improves the code quality.

But, like I said before, there are trade-offs. One of the obvious trade-offs is the violation of the Command Query Separation principle.
The Command Query Separation principle states that object methods should be divided into two categories: queries, where a result is returned and the object state isn't changed, and commands, where no result is returned and the object state is changed.
Having a setter or an adder (hence, command methods) that returns the object itself goes against the principle. This also means that fluent interfaces aren't compatible with the JavaBeans setter/getter convention.
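The difference between the three kinds of methods can be seen side by side in a small sketch. The Basket class and its method names are hypothetical and only serve to illustrate the principle.

```java
import java.util.ArrayList;
import java.util.List;

public class CqsDemo {

    public static class Basket {
        private final List<String> items = new ArrayList<String>();

        // Command: changes the state and returns nothing.
        public void add(String item) {
            items.add(item);
        }

        // Query: returns a result and leaves the state untouched.
        public int size() {
            return items.size();
        }

        // Fluent variant of add: changes the state AND returns a value
        // (the object itself), which is where it departs from the principle.
        public Basket with(String item) {
            items.add(item);
            return this;
        }
    }

    public static void main(String[] args) {
        Basket basket = new Basket();
        basket.add("a");                // command, cannot be chained
        basket.with("b").with("c");     // fluent, chains because it returns the basket
        System.out.println(basket.size()); // prints 3
    }
}
```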

Another trade-off is the complexity of designing these interfaces. A developer new to fluent interfaces might take a while until (s)he starts developing good fluent interfaces.
It may take extra effort to develop such interfaces, but - as already seen in the JMock example - it greatly improves software's design.

Fluent Interfaces: Implementation

There are various ways of implementing a fluent interface, but one thing is certain: the implementation should always be done in a non-intrusive way. Non-intrusive should be understood as keeping the "old" behaviour untouched and adding the fluent behaviour on top of it.
This can be done in different ways; I'll be discussing two different approaches: using inner classes and using proxies.

Using an Inner Class

[Figure: flowchart of the inner class' behaviour (fluent-flowchart-small)]

One of the simplest ways of creating a fluent interface is to create an inner class and then have methods to start (in the object) and finish (in the "fluent object") the fluent process. The inner class can live directly inside the class that receives the fluent interface or, if that class can't be modified, inside a subclass of it.

The idea is that at a given moment the developer decides to use the fluent interface by calling the start method. After that call, methods can be chained, and at the end of the chain either another object is returned or the same object from the beginning of the process, but in its non-fluent form.

The flowchart for the inner class' behaviour can be seen in the picture above.

The main advantage of this method is that it's quite simple to implement and doesn't necessarily need new public classes and interfaces that might contribute to an explosion of classes.
On the other hand, creating and maintaining such inner classes can be a huge and tedious task.

Probably a good scenario for such an implementation would be an environment where a domain language is used to define the application's domain classes, which are then generated by a code generator. That generator would create the classes and, for those configured to allow fluent interfaces, the inner class as well.

Source code:

public class YetAnotherPOJO {
    private String someString;
    private Integer someInteger;
    private FluentYetAnotherPOJO fluentInterface;

    public void setSomeString(String string) {
        this.someString = string;
    }

    public String getSomeString() {
        return this.someString;
    }

    public void setSomeInteger(Integer integer) {
        this.someInteger = integer;
    }

    public Integer getSomeInteger() {
        return this.someInteger;
    }

    // Entry point of the fluent process; the fluent object is created lazily
    public FluentYetAnotherPOJO start() {
        if (fluentInterface == null) {
            fluentInterface = new FluentYetAnotherPOJO();
        }
        return fluentInterface;
    }

    // Public, so that callers outside this class can chain on what start() returns
    public class FluentYetAnotherPOJO {

        public FluentYetAnotherPOJO setSomeString(String string) {
            YetAnotherPOJO.this.setSomeString(string);
            return this;
        }

        public FluentYetAnotherPOJO setSomeInteger(Integer integer) {
            YetAnotherPOJO.this.setSomeInteger(integer);
            return this;
        }

        public String getSomeString() {
            return YetAnotherPOJO.this.getSomeString();
        }

        public Integer getSomeInteger() {
            return YetAnotherPOJO.this.getSomeInteger();
        }

        // Exit point: hands back the plain, non-fluent object
        public YetAnotherPOJO finish() {
            return YetAnotherPOJO.this;
        }
    }
}
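The listing above doesn't show a call site, so here is a small self-contained sketch of how the start / chain / finish flow might look from the caller's side. The Pojo class is a trimmed-down, hypothetical variant of YetAnotherPOJO (it creates a new fluent object on every start call, for brevity).

```java
public class InnerFluentDemo {

    public static class Pojo {
        private String someString;
        private Integer someInteger;

        // Plain JavaBean setters and getters stay untouched
        public void setSomeString(String s) { this.someString = s; }
        public String getSomeString() { return someString; }
        public void setSomeInteger(Integer i) { this.someInteger = i; }
        public Integer getSomeInteger() { return someInteger; }

        // Start of the fluent process
        public Fluent start() {
            return new Fluent();
        }

        // Inner class: every setter delegates to the outer object and returns itself
        public class Fluent {
            public Fluent setSomeString(String s) {
                Pojo.this.setSomeString(s);
                return this;
            }

            public Fluent setSomeInteger(Integer i) {
                Pojo.this.setSomeInteger(i);
                return this;
            }

            // End of the fluent process: back to the non-fluent object
            public Pojo finish() {
                return Pojo.this;
            }
        }
    }

    public static void main(String[] args) {
        Pojo pojo = new Pojo().start()
                .setSomeString("abc")
                .setSomeInteger(42)
                .finish();
        System.out.println(pojo.getSomeString() + " " + pojo.getSomeInteger()); // prints "abc 42"
    }
}
```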

Using a Proxy

Another way to implement fluent interfaces is by using proxies. The basic idea is to create a custom invocation handler that takes an interface specifying the fluent interface and a certain object, linking them together through a convention. Anyone interested in reading about how Java proxies work should read >>Java Programming: Proxies and References and Java Programming: References' Package, where I explain in detail what a proxy is, what an invocation handler is and the mechanism of Java proxies.

[Figure: proxy-uml]

The idea used in this approach is simple:

  • First there is an interface that represents the fluent interface, containing every method needed, with a minor detail: the setter methods, instead of being named setSomething, are called something and return the interface itself (this avoids conflicts with the setter convention).
  • Then there is a custom invocation handler which always holds the object being proxied. When a method is called, if its name isn't finish and doesn't start with get, the handler looks in the proxied object for a method named "set" followed by the name of the invoked method and, if found, invokes it. Finally, after the invocation, it returns the proxy itself.
I could provide some example source code, but >>Stephan Schmidt already wrote an excellent post about this idea called >>Fluent Interface and Reflection for Object building in Java. Anyone interested in seeing the Java source code for this implementation should also read his post.

The main advantage of this approach is that there's only one class implementing the fluent interface behaviour, which is the custom invocation handler. However, since this approach uses proxies, one interface will be needed for each desired fluent interface. This leads to what I've been calling an interface explosion which, in my opinion, is bad design.
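For a quick feel of the convention described above, here is a minimal sketch built on java.lang.reflect.Proxy and InvocationHandler. It is not the code from Stephan Schmidt's post; the Pojo class, the FluentPojo interface and the fluent helper are hypothetical names, and error handling (for instance a fluent method with no matching setter) is left out.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class FluentProxyDemo {

    // Plain JavaBean that stays untouched.
    public static class Pojo {
        private String someString;
        private Integer someInteger;
        public void setSomeString(String s) { this.someString = s; }
        public String getSomeString() { return someString; }
        public void setSomeInteger(Integer i) { this.someInteger = i; }
        public Integer getSomeInteger() { return someInteger; }
    }

    // Fluent view: "someString" maps to setSomeString, and so on.
    public interface FluentPojo {
        FluentPojo someString(String s);
        FluentPojo someInteger(Integer i);
        String getSomeString();
        Pojo finish();
    }

    static class FluentHandler implements InvocationHandler {
        private final Object target;

        FluentHandler(Object target) {
            this.target = target;
        }

        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            String name = method.getName();
            // toString, hashCode and equals go straight to the proxied object
            if (method.getDeclaringClass() == Object.class) {
                return method.invoke(target, args);
            }
            if (name.equals("finish")) {
                return target;
            }
            if (name.startsWith("get")) {
                return target.getClass().getMethod(name).invoke(target);
            }
            // Convention: something(x) on the interface means setSomething(x) on the object
            String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            target.getClass().getMethod(setter, method.getParameterTypes()).invoke(target, args);
            return proxy;
        }
    }

    public static FluentPojo fluent(Pojo target) {
        return (FluentPojo) Proxy.newProxyInstance(
                FluentPojo.class.getClassLoader(),
                new Class<?>[] { FluentPojo.class },
                new FluentHandler(target));
    }

    public static void main(String[] args) {
        Pojo pojo = fluent(new Pojo()).someString("abc").someInteger(42).finish();
        System.out.println(pojo.getSomeString() + " " + pojo.getSomeInteger()); // prints "abc 42"
    }
}
```

The handler itself knows nothing about Pojo, so the same class could serve other objects; what has to be written per object is the interface, which is exactly the interface explosion mentioned above.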

Conclusions

Fluent Interfaces seem to be a good way to improve software design, especially if there are DSLs involved, but beware: they should be used with caution. In order to achieve a good and useful fluent interface the developer needs to already have knowledge about the application's domain and how operations can and should be chained.
Another thing that must be kept in mind while working with fluent interfaces is that they violate the Command Query Separation principle and hence may conflict with the getter/setter convention, which could lead to some problems.


Monday, 05. November 2007

Software Developing: Implementing a ShoutBox on SnipsSnip

A few months ago I noticed that the number of visits was rising although most readers don't comment. In my opinion, this behaviour is due to the fact that SnipSnap only allows logged-in users to comment and, even though the registration process is quite simple and fast, some readers probably prefer to skip the process and not comment at all.
The idea of allowing guests to comment is in the list of features to implement. However, since a comment is a snip and a snip needs to have an owner, it has implications, such as who the owner of such a snip is (some default user?) and whether the owner should be an editor and the snip be locked. I still haven't found a good solution, so that implementation keeps getting postponed.

Meanwhile, I thought that a good idea would be the existence of a shoutbox. For the ones who don't know, a shoutbox is a chat-like feature that allows people to quickly leave messages on the website, generally without any form of user registration.
It's not my intention to replace the comments with the shoutbox, but it could be a feature that allows interaction with new users, which then might lead to the registration of those users.

On Friday I developed a simple implementation of a memory-only shoutbox. I'm saying memory-only because the messages aren't persistent, there's a buffer that keeps messages in memory only (size is configurable) and after it's full it has a policy of first in, first out (FIFO). Also since it's only in memory if the server is restarted the current shoutbox is lost.
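The buffer behaviour described above can be sketched as follows. This is not the actual SnipSnap code; the ShoutBuffer class and its methods are hypothetical, and the sketch only shows the fixed capacity with first in, first out eviction.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ShoutBuffer {

    private final int capacity;
    private final LinkedList<String> shouts = new LinkedList<String>();

    // capacity: the configurable number of shouts kept in memory
    public ShoutBuffer(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
    }

    // Adds a shout; once the buffer is full the oldest shout is dropped (FIFO).
    public synchronized void add(String shout) {
        if (shouts.size() == capacity) {
            shouts.removeFirst();
        }
        shouts.addLast(shout);
    }

    // Returns a copy, oldest shout first, so callers can't alter the buffer.
    public synchronized List<String> list() {
        return new ArrayList<String>(shouts);
    }

    public static void main(String[] args) {
        ShoutBuffer buffer = new ShoutBuffer(2);
        buffer.add("a");
        buffer.add("b");
        buffer.add("c"); // "a" is thrown out of the buffer
        System.out.println(buffer.list()); // prints [b, c]
    }
}
```

Since nothing is written to disk, a restart of the server simply starts with an empty buffer, as noted above.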

The implementation is very simple and only required three new objects: the ShoutBox, the ShoutBoxServlet and finally the ShoutBoxMacro.

ShoutBox
The ShoutBox is the object that is part of the model. Basically ShoutBox is a wrapper around an application aware map that contains for each one of the application instances that SnipSnap is running a list of shouts. A shout is nothing more than a formatted string with the date, user and actual input.

ShoutBoxServlet
The ShoutBoxServlet is the controller. It receives data from the interface and after input formatting and escaping (in order to avoid XSS) it passes the information to the ShoutBox object.
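The escaping step can be illustrated with a short sketch. It is an assumption about what such escaping might look like, not the servlet's real code; the ShoutEscaper class and the escapeHtml method are hypothetical names.

```java
public class ShoutEscaper {

    // Replaces the characters that are significant in HTML with entities,
    // so user input is rendered as text instead of being interpreted as markup.
    public static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A script tag typed into the input box ends up as harmless text
        System.out.println(escapeHtml("<script>alert('x')</script>"));
    }
}
```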

ShoutBoxMacro
The ShoutBoxMacro is the tool given to the user. It is a component that renders the shouts' list for the current application and also renders an input box to enter new shouts.

A schematic can be found below.

[Figure: shoutbox-render]

This is the first version and I'm aware that improvements can be made, such as:

  • Allow editors to moderate the ShoutBox. Currently shouts are only deleted when they are thrown out of the shout buffer.
  • Using javascript to submit and update the shout panel auto-magically.
Any other suggestions, comments and criticism are welcome.

Who am I?
My name is Paulo Abrantes AKA pabrantes and I'm a software developer. I'm currently employed at >>CIIST working as a Java developer in >>FenixEDU.

This blog is mostly about Java programming, domain driven design and snipsnap bliki developing. Everything written in this blog is my personal opinion and it may not reflect the opinions of my employer and co-workers.


This is a modified version of snipsnap.org created by >>Paulo Abrantes