Author Topic: Problem of characters encoding with AIL  (Read 7781 times)

Offline florence_c

Problem of characters encoding with AIL
« on: January 25, 2013, 12:14:07 PM »
Hello,

We are currently developing a Java agent desktop using the AIL framework v7.6.4.
We are having difficulties retrieving the application options (using the ailFactory.getApplicationInfo() method): options containing accented characters (French language) are displayed incorrectly (e.g. "D?jeuner" instead of "Déjeuner").
The 'config' database is Cp1252 (MSSQL Server 2008 R2) and our Java application is UTF8.
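
For readers who want to see how a value like "D?jeuner" can arise, here is a minimal sketch. It assumes the option value reaches Java as raw Cp1252 bytes and is decoded as UTF-8 somewhere along the way; what AIL actually does internally is not known here. The class and helper names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    // Hypothetical helper: decode raw bytes with an explicitly named charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        // "Déjeuner" as Cp1252 bytes: 'é' is the single byte 0xE9.
        byte[] raw = "D\u00e9jeuner".getBytes(cp1252);

        String right = decode(raw, cp1252);                  // decoded with the matching charset
        String wrong = decode(raw, StandardCharsets.UTF_8);  // 0xE9 alone is not valid UTF-8

        System.out.println(right.equals("D\u00e9jeuner"));   // prints "true"
        System.out.println(wrong.equals("D\uFFFDjeuner"));   // prints "true": 'é' became U+FFFD
    }
}
```

U+FFFD is the Unicode replacement character; many consoles and text widgets render it as "?" or a similar placeholder.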

Thanks in advance for your help.
Florence.
« Last Edit: January 25, 2013, 12:20:43 PM by florence_c »

Offline imaki

Re: Problem of characters encoding with AIL
« Reply #1 on: January 28, 2013, 09:43:28 AM »
Is it possible that your application is writing its output to the command prompt? That might be the problem.
Have you tried writing the results to a UTF-8 file?
Have you debugged the solution, and can you see that the characters are really returned as '?'?
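
The file test suggested above could look roughly like this. It is only a sketch: the class name, the temp-file name and the hard-coded sample value (standing in for whatever AIL returns) are assumptions. The charset is passed explicitly, so the JVM's file.encoding setting plays no part in the result.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Utf8FileDump {
    // Write the value as UTF-8 with an explicit charset, then read the raw bytes back.
    static byte[] writeAndReadBack(String value) {
        try {
            Path out = Files.createTempFile("option-dump", ".txt");
            try {
                Files.write(out, value.getBytes(StandardCharsets.UTF_8));
                return Files.readAllBytes(out);
            } finally {
                Files.deleteIfExists(out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Format bytes as space-separated two-digit hex values.
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            if (hex.length() > 0) hex.append(' ');
            hex.append(String.format("%02X", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the value returned by AIL.
        System.out.println(toHex(writeAndReadBack("D\u00e9jeuner")));
        // prints "44 C3 A9 6A 65 75 6E 65 72"
    }
}
```

An intact "é" shows up as C3 A9 in the file. A literal question mark would be 3F, and a replacement character would be EF BF BD.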

Offline florence_c

Re: Problem of characters encoding with AIL
« Reply #2 on: February 01, 2013, 10:11:37 AM »
Hi,

I have debugged by getting the bytes of my string data (data.getBytes(charset_name)), trying various charset names. The resulting bytes always correspond to an unexpected character. For example, with the UTF-8 charset I obtain 63, which corresponds to "?".

Here is a use case and a basic code snippet, if you want to reproduce it:

1/ Have a config database with collation = French_CI_AS

2/ In CME, create a new AIL application (type CLIENT or SERVER, it does not matter) and add the following section/option:
    [motifs]
    DEJEUNER=Déjeuner

3/ Create a new Java project. The main code snippet is:

public static void main(String[] args)
{
    // ... init AIL connection (ailFactory is obtained here) ...
    ApplicationInfo appInfo = ailFactory.getApplicationInfo();
    // mOptions: section name -> map of option name -> value
    Map appOpts = appInfo.mOptions;
    Map test = (Map) appOpts.get("motifs");
    // Print the option value to the console
    System.out.println(test.get("DEJEUNER"));
}

4/ Run the JVM with the argument -Dfile.encoding=UTF-8 (mandatory for us)

5/ In the console, see the result:
- expected => Déjeuner
- observed => D?jeuner
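
To take the console out of the picture, the characters of the string can be printed as code points, which are plain ASCII. A minimal sketch follows; the class name and the sample strings are illustrative, not part of the original project.

```java
final class CodePointDump {
    // Render each character as U+XXXX so the output does not depend on the console encoding.
    static String describe(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); ) {
            int cp = value.codePointAt(i);
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("D\u00e9j")); // intact value:         U+0044 U+00E9 U+006A
        System.out.println(describe("D?j"));      // literal question mark: U+0044 U+003F U+006A
        System.out.println(describe("D\uFFFDj")); // replacement character: U+0044 U+FFFD U+006A
    }
}
```

If the dump of the AIL value shows U+003F or U+FFFD where U+00E9 is expected, the string was already damaged before it was displayed.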

Offline cavagnaro

Re: Problem of characters encoding with AIL
« Reply #3 on: February 01, 2013, 03:50:20 PM »
The console can be messy... try writing the output to a file as proposed and see what happens.

Offline florence_c

Re: Problem of characters encoding with AIL
« Reply #4 on: February 01, 2013, 04:46:50 PM »
Well, the first time we saw the problem, it was not in the console but in the text boxes of our web application. So we are sure it is not simply a console issue.

I simplified the code snippet so that you can reproduce the case, but the problem can be seen however you display the data.

Offline cavagnaro

Re: Problem of characters encoding with AIL
« Reply #5 on: February 01, 2013, 05:10:41 PM »
Question: are you using Tomcat?

Offline florence_c

Re: Problem of characters encoding with AIL
« Reply #6 on: February 01, 2013, 05:19:49 PM »
Yes, we are using Tomcat 6.
If you were going to suggest setting the file encoding to cp1252, I have to say that this is not possible for us: setting the file encoding to UTF8 is a requirement, since our application is one among others in Tomcat's webapps, and we cannot afford to change this configuration option.

Offline cavagnaro

Re: Problem of characters encoding with AIL
« Reply #7 on: February 02, 2013, 12:25:19 AM »
Haha, no sir. What I was going to suggest is to force UTF8 as the default, since Tomcat's default is different.
Put this setting on the connector too, in the server.xml file:
<Connector port="8080" URIEncoding="UTF-8"/>

I know you added the parameter to the JVM options, but for some reason it sometimes doesn't work.
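
A quick way to check whether the JVM option actually took effect inside the running webapp is to log the default charset from the application itself. This is a small sketch with an illustrative class name. Note that, per the Tomcat documentation, URIEncoding governs how request URIs are decoded, so it is a separate setting from the JVM default.

```java
import java.nio.charset.Charset;

final class DefaultCharsetCheck {
    // Report what the running JVM uses when no charset is given explicitly.
    static String report() {
        return "file.encoding=" + System.getProperty("file.encoding")
             + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run with -Dfile.encoding=UTF-8 and compare against a run without the flag.
        System.out.println(report());
    }
}
```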

Offline florence_c

Re: Problem of characters encoding with AIL
« Reply #8 on: February 04, 2013, 08:25:06 AM »
Hello,

I just tried your suggestion, but it does not work.... :-[
I don't know if it's a clue, but setting the file.encoding JVM option to 'cp1252' makes it work; that's why I mentioned it in a previous post  :P

Let me set out my own analysis of the issue:
The AIL call 'ailFactory.getApplicationInfo().mOptions' returns the CME options as String (that's the result of the getClass method). But with the JVM option set to UTF8, all String objects are UTF8 objects. Since the original data is cp1252-encoded and the final String is UTF8, something has to be done about the encoding to make it work. That's why I think the AIL method does not take the encoding into account.
Do you know if there is another method to get the CME options? Or if it's possible to force the AIL encoding?
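
One detail relevant to the analysis above: a Java String is always held as UTF-16 code units, whatever file.encoding says. The option mainly selects the default charset used when bytes are turned into chars (or back) without naming a charset. The sketch below contrasts the two kinds of decoding. It assumes, without any confirmation from the AIL docs, that the library decodes the config bytes with the platform default; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;

final class ExplicitDecode {
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Depends on -Dfile.encoding: uses the JVM default charset.
    static String decodeWithDefault(byte[] raw) {
        return new String(raw);
    }

    // Independent of -Dfile.encoding: the charset is named explicitly.
    static String decodeAsCp1252(byte[] raw) {
        return new String(raw, CP1252);
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xE8, (byte) 0xE9}; // "èé" in Cp1252

        // true under any file.encoding setting
        System.out.println(decodeAsCp1252(raw).equals("\u00e8\u00e9"));
        // result varies with the JVM default charset
        System.out.println(decodeWithDefault(raw).equals("\u00e8\u00e9"));
    }
}
```

This would be consistent with the observation that everything works once the JVM default is switched to cp1252.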

Offline cavagnaro

Re: Problem of characters encoding with AIL
« Reply #9 on: February 04, 2013, 09:28:18 AM »
So you can receive it in CP1252, and so far that works, but now you want to convert it to UTF... so you can do something like this:
http://www.jguru.com/faq/view.jsp?EID=137049

Just thinking out loud
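
A conversion of this kind usually means re-encoding the String with the charset that was wrongly used and decoding the bytes again with the right one. The sketch below (illustrative names, not the code from the linked FAQ) shows that the trick only works when the wrong decoding step was lossless, for example ISO-8859-1. It cannot work after a UTF-8 decode has already replaced the invalid bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RedecodeDemo {
    // Re-encode with the charset that was wrongly used, then decode with the actual one.
    static String redecode(String garbled, Charset wronglyUsed, Charset actual) {
        return new String(garbled.getBytes(wronglyUsed), actual);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        byte[] raw = "\u00e8\u00e9".getBytes(cp1252); // "èé" as bytes E8 E9

        // ISO-8859-1 maps every byte to a char, so no information is lost.
        String viaLatin1 = new String(raw, StandardCharsets.ISO_8859_1);
        // E8 and E9 are not valid UTF-8 here, so both are replaced and the bytes are gone.
        String viaUtf8 = new String(raw, StandardCharsets.UTF_8);

        System.out.println(redecode(viaLatin1, StandardCharsets.ISO_8859_1, cp1252)
                .equals("\u00e8\u00e9")); // prints "true"
        System.out.println(redecode(viaUtf8, StandardCharsets.UTF_8, cp1252)
                .equals("\u00e8\u00e9")); // prints "false"
    }
}
```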

Offline florence_c

Re: Problem of characters encoding with AIL
« Reply #10 on: February 04, 2013, 10:22:05 AM »
Well...
I already mentioned this in a previous post, but I should have been more precise:

I have tested the getBytes() method on the string object retrieved with the AIL method, using various charset names:
1/ In my CME application options, I put the following string for testing: "èé"
2/ I retrieve the string with ailFactory.getApplicationInfo().mOptions (important: when retrieved, it is ALREADY a String)
3/ I display the bytes of every char of my string with a simple for loop:
byte[] theBytes = myString.getBytes(charsetNameToTest);
for (byte eachByte : theBytes) {
    LOGGER.debug("\nbyte = " + eachByte);
}
4/ The results for various charsets:
cp1252 => both 'è' and 'é' are equal to '63'
UTF8 => both 'è' and 'é' have 3 bytes equal to '-17'/'-65'/'-67'
UTF16 => both 'è' and 'é' have 2 bytes equal to '-1'/'-3'
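
These byte values are exactly what the Unicode replacement character U+FFFD produces, which suggests each accented letter had already been replaced by U+FFFD when the String was built. A small sketch to reproduce the numbers (the class name is illustrative; UTF-16BE is used here because Java's plain "UTF-16" charset also emits a byte-order mark first):

```java
import java.nio.charset.Charset;
import java.util.Arrays;

final class ReplacementCharBytes {
    // Encode a string with the named charset, as in the loop above.
    static byte[] encode(String s, String charsetName) {
        return s.getBytes(Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String replacement = "\uFFFD";
        // Cp1252 has no mapping for U+FFFD, so the encoder substitutes '?' (63).
        System.out.println(Arrays.toString(encode(replacement, "windows-1252"))); // [63]
        System.out.println(Arrays.toString(encode(replacement, "UTF-8")));        // [-17, -65, -67]
        System.out.println(Arrays.toString(encode(replacement, "UTF-16BE")));     // [-1, -3]
    }
}
```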

Indeed, I think that once the String has been retrieved with the AIL method, it's too late: the String has already been built with the wrong encoding, and nothing you do afterwards can reverse that.
When the JVM is cp1252, there is no problem: the start is cp1252, the end is cp1252, and the encoding is implicit. But when the start and the end do not have the same encoding, it fails.

OK, I realize that, the way I have explained it, there seems to be no solution, but I still have a little hope that my analysis is wrong...

Offline cavagnaro

Re: Problem of characters encoding with AIL
« Reply #11 on: February 04, 2013, 11:40:28 AM »
Not sure if this would help, but in GAD:

Section: multimedia
Option Name: enable-multicharset-environment
Default Value: false
Valid Values:
· true: to prevent corruption of the e-mail Subject, Genesys Desktop retrieves the Subject from UCS instead of Interaction Server.
· false: the capability is not enabled.

Offline florence_c

Re: Problem of characters encoding with AIL
« Reply #12 on: February 04, 2013, 04:27:08 PM »
Well, I have indeed looked for helpful options in the AIL PDF docs, but found nothing at all related to 'charset' or 'encoding'.
So, unless someone can point me to some hidden option I could use for my needs, I will have to look for the solution elsewhere :(