Java Moods

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Friday, 26 March 2010

Having Fun with Encoding!

Posted on 08:20 by Unknown

Compiler Plugin

Recently, I edited our main company root POM to upgrade some plugins to new versions. Of course, we are following best practice to lock down the plugin version, so when a new version is available we only need to adjust the parent POM. Nearly all version updates were on the last build number digit, which is the z in x.y.z version string – so I didn't expect much difficulties.

However, for the compiler plugin, it was a jump from version 2.0.2 to 2.1, and indeed it turned out that some of the test cases failed compiling with strange encoding issues when using the new compiler plugin version.

Specify Encoding

We are following the suggestion to specify a POM property for source file encoding, for not being forced to configure encoding for all relevant plugins individually. Moreover, we were exactly using what's shown in the example:

<project>
...
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
...
</properties>
...
</project>
That is, we assumed our source files were UTF-8 encoded, which is the most widely used encoding for unicode characters. But, for some of the projects, that's actually not the case since we are using Eclipse with the default setting for text file encoding which is Cp1252 (Western European) on our german Windows.

Why didn't we ever notice that? Well, it happens that both the UTF-8 as well as Cp1252 encodings are backwards compatible with ASCII. We are coding most of the stuff in english (concerning package, class, method, attribute and parameter names, and even Javadoc comments), so the resulting byte stream will never be different for both encodings. However, some of the files used german umlauts in line comments which are exactly the files that can't be compiled any more with new compiler plugin version.

When looking at the debug output of compiler plugin 2.0.2 mojo configuration, you can see that the encoding is not explicitely set, probably meaning that the platform default encoding is used (which is again Cp1252 on all build machines):

[DEBUG] Configuring mojo 'org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile' -->
[DEBUG] (f) basedir = ...
[DEBUG] (f) buildDirectory = ...
[DEBUG] (f) classpathElements = [...]
[DEBUG] (f) compileSourceRoots = [...]
[DEBUG] (f) compilerId = javac
[DEBUG] (f) debug = true
[DEBUG] (f) failOnError = true
[DEBUG] (f) fork = false
[DEBUG] (f) optimize = true
[DEBUG] (f) outputDirectory = ...
[DEBUG] (f) outputFileName = xxx-0.2.0-SNAPSHOT
[DEBUG] (f) projectArtifact = xxx:jar:0.2.0-SNAPSHOT
[DEBUG] (f) showDeprecation = false
[DEBUG] (f) showWarnings = false
[DEBUG] (f) source = 1.6
[DEBUG] (f) staleMillis = 0
[DEBUG] (f) target = 1.6
[DEBUG] (f) verbose = false
[DEBUG] -- end configuration --

The new version 2.1 of compiler plugin is now considering what has been configured in project.build.sourceEncoding property, and hence tries to compile the Cp1252 coded source file with UTF-8 encoding which doesn't work when umlauts are used.

Specify Correct Encoding

Of course, the solution is to specify the correct encoding in project.build.sourceEncoding property, matching the encoding that is used in the development environment when writing the source files.

Oh, yes, Cp1252 is quite similar to ISO 8859-1 encoding (only some special characters on positions 0x80–0x9F are different which we don't use), so in fact we are using ISO 8859-1 now to allow builds on non-Windows platforms as well.

Certainly, it would be nice if the plugins had a history on their site where you can find this type of changes for new versions, without having to search in the Jira...

Read More
Posted in Eclipse, Maven, Windows | No comments
Newer Posts Older Posts Home
Subscribe to: Posts (Atom)

Popular Posts

  • DocBook with Maven Issue
    We are using DocBook for writing technical documentation for all our projects and in-house frameworks. We are actually quite happy with thi...
  • How big is BigDecimal?
    Lately, there was a debate in our company about rounding of numbers, more specific on how, when and where to do that. One of the questions w...
  • Eclipse: User Operation is Waiting, and Waiting, ...
    I am using Eclipse since quite a long time, sometimes around 2002. That was version 2.0, if I remember correctly. Since then, I have always ...
  • Google and the Crystal Ball
    Google brought us the Web Search. They brought us the Maps. They brought us their Mail, the News, the Images, the Videos... In other words, ...
  • Spring: Use Custom Namespaces!
    Have you ever heard of custom XML namespaces for Spring? I know you love Spring (like I do), so... probably yes. They are available since Sp...
  • Jenkins: Pimp It Up!
    Some days ago, I started to review what plugins are available for Jenkins, my favorite CI server . I haven't done so for a long time, so...
  • Maven vs. Ant: Stop the Battle
    Maven? Ant? Oh boy, how this bothers me. The endless debate and religious battle about which build tool is the better build tool, no, is the...
  • Checkstyle: One and Only Configuration File?
    The Checkstyle Challenge When you are using both Eclipse and Maven, you are probably facing the same challenge like we do: you would like to...
  • The Way From Hudson To Jenkins
    Some time has gone by since the Hudson/Jenkins fork ... and there has been even more talk in the community. However, slowly the dust settles...
  • HDD / SSD Battle
    The Problem You know, the laptop I'm using for my daily work job is not the fastest one. In contrast, it's more than 5 years old and...

Categories

  • BestPractices
  • Cargo
  • Checkstyle
  • Eclipse
  • Google
  • Hudson
  • Java
  • JBoss
  • JEE
  • Jenkins
  • JUnit
  • Maven
  • Nexus
  • oAW
  • Optimization
  • OSGi
  • Performance
  • Profiles
  • QA
  • Size
  • Spring
  • Testing
  • Tools
  • WebApp
  • Windows

Blog Archive

  • ►  2011 (5)
    • ►  May (1)
    • ►  April (1)
    • ►  March (2)
    • ►  February (1)
  • ▼  2010 (11)
    • ►  October (2)
    • ►  September (1)
    • ►  April (1)
    • ▼  March (1)
      • Having Fun with Encoding!
    • ►  February (4)
    • ►  January (2)
  • ►  2009 (30)
    • ►  December (3)
    • ►  November (4)
    • ►  October (2)
    • ►  September (3)
    • ►  June (4)
    • ►  May (5)
    • ►  April (4)
    • ►  March (5)
Powered by Blogger.

About Me

Unknown
View my complete profile